Yao Meng, Hailong Liu, Pengfei Lin, Mengrong Ding, Changming Dong
a State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China
b College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing, China
c Oceanic Modeling and Observation Laboratory, Nanjing University of Information Science and Technology, Nanjing, China
Keywords: Mesoscale eddy; Kuroshio extension; Eddy identification and tracking
ABSTRACT Some relatively mature mesoscale eddy products have been released for scientific purposes in recent decades. However, the metrics used to identify eddies, the tracking methods, and the definitions of the physical parameters all differ across the datasets, so intercomparisons and validation of these datasets are badly needed. Here, the authors intercompare the basic features of ocean mesoscale eddies in the Kuroshio extension region from four eddy datasets, namely Chelton, GEM-M, Faghmous, and Dong. In the case study, eddy numbers and locations as well as the eddy tracks identified by the four datasets are compared for a specific date. The authors find that all the datasets have different eddy numbers, but more than 50% of identified eddies coincide. GEM-M, with the so-called “segmentation” algorithm, can identify considerably more eddies than the others, while Chelton identifies fewer eddies due to tracking errors, which also lead to longer lifespans. From the analysis of the probability distribution function of eddy features, GEM-M eddies tend to have a larger amplitude and radius, and Chelton tends to have long-lived eddies. It is further found that the geographic distributions and temporal variation of normalized eddy features are highly similar among the four datasets, particularly among Chelton, Faghmous, and Dong. In addition, the mean trajectories of the four datasets generally overlap initially and then spread after 245 days. The findings help toward better understanding the uncertainties of eddy features in the Kuroshio extension region.
Oceanic mesoscale eddies are characterized by closed circulation, with a time scale of several days to hundreds of days, and a horizontal spatial scale of tens to hundreds of kilometers. Eddies are found almost everywhere in the global oceans, except in some “eddy desert” regions (Chelton et al., 2011), where almost no eddies can be found according to the datasets presently available. Eddies are an important oceanic phenomenon, as they dominate the kinetic energy of the ocean and are responsible for a significant part of the transport of heat, salt, and nutrients in the ocean (Fu et al., 2010).
Studies of eddies rely heavily on both the observational data and the detection methods involved. In recent decades, along with the rapid development of high-resolution ocean satellite remote sensing technology, methods for the automatic detection and analysis of ocean mesoscale eddies have developed quickly. Accordingly, some relatively mature data products have been released for scientific purposes, such as Chelton et al. (2011), Dong et al. (2011), Faghmous et al. (2013), Matsuoka et al. (2016), and Tian et al. (2020). Numerous studies have already employed these datasets (e.g., Gaube et al., 2013; Amores et al., 2017). However, the metrics used to identify eddies, the tracking methods, and the definitions of the physical parameters all differ across the methods, which may profoundly affect their results. Will these differences influence analyses of the spatial and temporal variability of eddies? And if so, how can we reduce the effects of using different datasets on our research? These are fundamental questions that have thus far been largely ignored. Therefore, as a first step toward answering these questions, intercomparisons and validations of these datasets are badly needed. Some studies of eddy products have compared the mean properties of eddies, such as lifespan, amplitude, displacement, and translational speed, with existing datasets to show the discrepancies (Liu et al., 2016; Sun et al., 2017; Tian et al., 2020), which is helpful for the development and progress of models. Here, we further compare four representative datasets.
The intercomparison of the four eddy datasets in this study takes place in an eddy-rich region, the Kuroshio extension (KE) region. The datasets are Chelton et al. (2011), GEM (Li et al., 2016), Faghmous et al. (2013), and Dong et al. (2011), which are all widely recognized in the research community. Chelton, GEM, and Faghmous are typical examples of methods employing a sea level anomaly (SLA) criterion. Dong, however, applies a geometric criterion to identify eddies. The results of this paper will help toward better understanding the differences among different eddy datasets in the KE region. We describe the datasets and methods in Section 2, and the results of the intercomparison among the four datasets are presented in Section 3. Section 4 provides some concluding remarks.
Table S1 shows the basic information of the four datasets. First, we employ the most popular eddy dataset, provided by Chelton et al. (2011), who were the first to publish results on global eddies. It is hereafter referred to simply as “Chelton” and is officially released on the Chelton website (http://wombat.coas.oregonstate.edu/eddies/). Chelton is derived from the daily SLA of AVISO (Archiving, Validation, and Interpretation of Satellite Oceanographic data) during the period 1993-2010, spatially high-pass filtered to remove wavelength scales larger than 20° of longitude by 10° of latitude. They utilized the closest-eddy strategy: for each eddy at time step k, the closest eddy at time step k + 1 is identified as part of the trajectory of the same eddy (Chelton et al., 2011).
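The closest-eddy strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Chelton et al. (2011) implementation: the search radius `max_dist_km`, the great-circle distance, and the greedy one-to-one matching are all hypothetical choices made here for clarity.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r_earth = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_earth * math.asin(math.sqrt(a))


def link_closest(eddies_k, eddies_k1, max_dist_km=150.0):
    """Link each eddy at step k to the closest unused eddy at step k + 1.

    eddies_k, eddies_k1: lists of (lat, lon) eddy centers in degrees.
    max_dist_km: hypothetical search radius; not a value from any dataset.
    Returns {index at step k: index at step k + 1}; unmatched eddies are omitted.
    """
    links = {}
    taken = set()
    for i, (lat, lon) in enumerate(eddies_k):
        best_j, best_d = None, max_dist_km
        for j, (lat2, lon2) in enumerate(eddies_k1):
            if j in taken:
                continue  # each eddy at step k + 1 continues at most one track
            d = haversine_km(lat, lon, lat2, lon2)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            links[i] = best_j
            taken.add(best_j)
    return links
```

In a scheme of this kind, the choice of search radius matters: if it is large, a track may continue on a neighbouring eddy when the original one is temporarily missing from the field.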
Second, the eddy dataset identified by the GEM (Genealogical Evolution Model) method of Li et al. (2016) is adopted with modified criteria (hereafter referred to as GEM-M). GEM is also a geometric method using daily SLAs in the same period as Chelton, but it can further identify eddy splitting and merging, which leads to many more eddies with relatively short lifespans and trajectories. Wang et al. (2019) studied the merging events of two eddies in terms of volume, vorticity, and total angular momentum, as well as eddy potential energy, based on GEM. Since the tracking method of Chelton performs poorly when dealing with temporarily missing eddies, GEM-M adopts an inheritance method that evaluates the similarity between two eddies and then connects them.
The third dataset is also identified from the SLA of the same 0.25°×0.25° merged satellite data, AVISO, but with a parameter-free pattern-mining method proposed by Faghmous et al. (2013), hereafter referred to simply as “Faghmous”. It differs from the traditional approach, like that used in Chelton, in that Faghmous starts with the extrema of eddies and then builds up the body. The method guarantees that there is only one extreme point in each eddy. The tracking method is the same as in Chelton but with different criteria.
The identification method of the fourth dataset, which was proposed by Dong et al. (2011), is based on the characteristics of the sea surface temperature (SST) field. We hereafter refer to this product simply as “Dong”. They deduced the velocity vector of the thermal wind from the SST field from REMSS (Remote Sensing Systems) with a spatial resolution of 9 km and temporal resolution of 1 week, and used the geometric characteristics of the velocity vector field to identify the eddy centers and boundaries. In this case, the amplitude, whose definition is based on the SLAs, cannot be obtained, so we cannot compare it with the other data in the following results. They also claimed that the method is especially suitable for analyzing the results of high-resolution numerical models, and for analyzing eddies. The tracking method is the same as in Chelton.
Eighteen years of data from 1993 to 2010 are selected for analysis of the four datasets. The additional physical variables upon which we focus are the lifespan, the amplitude, and the radius, although they have different definitions in the various datasets. Since Faghmous and Dong provide weekly data, the eddy lifetime is multiplied by seven to convert the unit from weeks to days.
Fig. 1 shows the locations of cyclonic eddies (CEs) and anticyclonic eddies (AEs) identified by the four datasets in the KE region on 4 June 2008, overlaid on the SLA from AVISO for the first three datasets and on the surface geostrophic currents for Dong. We can see that all the datasets have different numbers of CEs and AEs, ranging from 17 to 30 for CEs and from 23 to 30 for AEs. Nearly half of the eddy positions for each dataset are coincident with Chelton’s, which we consider as a reference in the present study. The exact numbers of overlapping CEs (AEs) for GEM-M, Faghmous, and Dong are 13 (11), 14 (12), and 12 (13), respectively. Among all four eddy datasets, there are 10 coincident CE and 13 coincident AE centers, accounting for about 40%-57% of the total numbers. We also find that most of the coincident eddies are typical eddies with large SLAs and obvious boundaries. The number in GEM-M is generally higher than in the other results, because weaker eddies can be detected by the so-called “segmentation” algorithm of GEM-M.
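Counting coincident eddy centers against a reference dataset can be sketched as follows. The coincidence criterion is not stated in the text, so the distance tolerance `tol_km` used here is purely an assumption for illustration, as is the equirectangular distance approximation and the use of a single 0-360° longitude convention.

```python
import math


def count_coincident(reference, other, tol_km=50.0):
    """Count reference centers with at least one center of `other` within tol_km.

    reference, other: lists of (lat, lon) in degrees, with longitudes in one
    consistent convention (e.g. 0-360, so the date line is not crossed).
    tol_km: hypothetical tolerance; the criterion is not given in the text.
    """

    def dist_km(a, b):
        # Equirectangular approximation, adequate for small separations.
        km_per_deg = 111.2
        lat_mean = math.radians(0.5 * (a[0] + b[0]))
        dy = (b[0] - a[0]) * km_per_deg
        dx = (b[1] - a[1]) * km_per_deg * math.cos(lat_mean)
        return math.hypot(dx, dy)

    return sum(
        1 for c in reference if any(dist_km(c, o) <= tol_km for o in other)
    )
```

Dividing the returned count by the number of reference centers gives the fraction of coincident eddies for a given date.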

Fig. 1. Locations of cyclonic (red diamonds) and anticyclonic (black points) eddy centers on 4 June 2008 for (a) Chelton, (b) GEM-M, (c) Faghmous, and (d) Dong. The coloring in (a-c) represents the SLA (units: m), and in (d) it is the vectors of surface geostrophic currents (units: m s⁻¹).

Fig. 2. Probability density function (PDF) of (a) lifetime (units: days), (c) radius (units: km), and (e) amplitude (units: cm) of cyclonic eddies for Chelton (blue curve), GEM-M (pink curve), Faghmous (green curve), and Dong (black curve) over the KE region (30°-40°N, 143°E-175°W) during the period 1993-2010. Panels (b, d, f) are the same as (a, c, e), respectively, but for anticyclonic eddies.
To compare the tracking algorithms of the four datasets, we selected a CE and an AE in Fig. 1 that existed in all four datasets (Fig. S1). We find that the lifetimes of the CE (AE) for Chelton, GEM-M, Faghmous, and Dong are 629 (70) days, 36 (92) days, 21 (70) days, and 28 (91) days since 4 June 2008, respectively. The lifetime of Chelton’s CE is much longer than that in the other datasets. Further investigation indicates that Chelton’s CE mistakenly jumps to another eddy on 16 July 2008, resulting in the extremely long lifetime. This situation can also be found in other cases. It is also an important reason why Chelton has, on average, longer lifetimes and fewer eddy generations. Tian et al. (2020) also indicated, through comparison with the Faghmous and AVISO17 (version 1.0) eddy products, that their method could produce more stable eddy tracks without large jumps. The judgment of eddy tracks is still a key line of research.
To investigate the statistical characteristics of the four datasets, the PDFs of lifespan, radius, and amplitude in the KE region during the period 1993-2010, for all the datasets, are shown in Fig. 2. The mean values of these variables are shown in Table 1. It seems that the differences among the datasets are larger than those between CEs and AEs for each product. Here, we focus on the former. The total number of eddies in GEM-M is the highest among the four sets of results, being about 2038 for CEs and 2027 for AEs, followed by Dong’s (1843 and 1660) and Faghmous’s (1726 and 1504). Due to the eddy tracking errors mentioned above, the eddy numbers of Chelton (1139 and 1107) are nearly half those of GEM-M. The higher number of GEM-M is also related to the splitting technique with watershed segmentation, which enables GEM-M to identify eddy neighbors that are very close to each other (Li and Sun, 2015).

Table 1 Eddy numbers and mean values of characteristics of the four eddy datasets in the KE during the period 1993-2010.
Compared with Chelton, the other three datasets tend to have a greater number of short-term eddies, particularly Faghmous. Table 1 shows that Chelton’s lifetime is about 30-50 days longer than for the other datasets. As mentioned in the case study, the tracking errors in Chelton are the possible reason behind the longer lifetime of its eddies.
Because the definitions of eddy boundary for the four datasets are different, we find large differences in radius among them, particularly between GEM-M and the other three. The PDF curves of Chelton, Faghmous, and Dong are similar, with about 65%, 80%, and 62% of eddy radii being less than 100 km, respectively. However, GEM-M tends to have the largest radii, with only 29% less than 100 km. The average radius for GEM-M is about 130 km, but around 90 km for the other three, and this is because the eddy boundary of GEM-M is determined by the outermost SLA contour. As for the other methods, Chelton mainly uses the SLA contour around which the average geostrophic speed is at a maximum; Faghmous uses a “bottom-up” method similar to Chelton to identify the eddy boundary; and Dong defines the outermost closed isothermal line around the eddy center as the boundary. All these boundary definitions generally yield smaller boundaries than that of GEM-M (Fig. S2).
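The two statistics used in this comparison, the share of eddies below a threshold and the empirical PDF, can be computed as in the sketch below. The function names, the bin width, and the sample radii in the usage note are illustrative assumptions, not values or code from any of the four datasets.

```python
def fraction_below(values, threshold):
    """Fraction of values strictly below threshold (e.g. radii < 100 km)."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v < threshold) / len(values)


def empirical_pdf(values, bin_width, start=0.0):
    """Histogram-based PDF estimate with equal-width bins.

    Bin i covers [start + i * bin_width, start + (i + 1) * bin_width).
    All values are assumed to be >= start. The result is normalized so that
    sum(pdf) * bin_width == 1.
    """
    if not values:
        return []
    indices = [int((v - start) // bin_width) for v in values]
    counts = [0] * (max(indices) + 1)
    for i in indices:
        counts[i] += 1
    norm = len(values) * bin_width
    return [c / norm for c in counts]
```

For instance, with the made-up radii `[60.0, 80.0, 95.0, 110.0, 150.0]` (in km), `fraction_below(radii, 100.0)` gives 0.6, and `empirical_pdf(radii, 50.0)` returns four bins whose densities integrate to one.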
The amplitude of GEM-M is also larger than that of the two other datasets that provide it, Chelton and Faghmous. Dong does not provide amplitude data. The most frequent amplitudes for Chelton and Faghmous range between 4 and 6 cm, while that for GEM-M is about 18 cm. The percentage of amplitudes less than 10 cm is about 43% for Chelton and 55% for Faghmous, while it is only about 5% for GEM-M. In Table 1, the mean amplitudes of Chelton and Faghmous are similar at around 14 cm, but both are much smaller than that of GEM-M, which is close to 26-27 cm. The amplitude of GEM-M eddies is defined as the difference in SLA between the eddy’s extremum and its mean perimeter; thus, the large boundary in GEM-M may be one of the important factors affecting the amplitude.
Fig. 3 shows the geographic distributions of generation number, lifetime, radius, and amplitude of the four datasets in 2°×2° grids. All variables have been rescaled by their minimum and maximum values so that they lie between 0 and 1. The black contours of absolute dynamic topography denote the axis of the jet stream in the KE. Although the original values differ greatly among the datasets, the spatial distributions of normalized eddy characteristics are similar. The generation number of CEs (AEs) to the north (south) of the KE is higher than that to the south (north) (Fig. 3(a,b)). The spatial correlation coefficients of eddy number between Chelton and the other three datasets are 0.36 for GEM-M, 0.56 for Faghmous, and 0.54 for Dong (Table 2). The value for GEM-M is relatively low because GEM-M identifies more CEs near the coast. This can also be seen in the standard deviation in Fig. 3(i).
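The rescaling to the range 0-1 and the spatial correlation between two gridded fields can be sketched as below. This is a minimal illustration that assumes a Pearson correlation over grid cells valid in both fields and represents missing cells as `None`; the function names and the data layout are hypothetical.

```python
import math


def min_max_normalize(field):
    """Rescale a 2-D field (list of rows) to [0, 1]; None marks missing cells."""
    valid = [v for row in field for v in row if v is not None]
    lo, hi = min(valid), max(valid)
    span = hi - lo
    if span == 0:
        # A constant field carries no spatial pattern; map it to zeros.
        return [[None if v is None else 0.0 for v in row] for row in field]
    return [[None if v is None else (v - lo) / span for v in row] for row in field]


def spatial_correlation(field_a, field_b):
    """Pearson correlation over cells that are valid in both fields."""
    pairs = [
        (x, y)
        for row_a, row_b in zip(field_a, field_b)
        for x, y in zip(row_a, row_b)
        if x is not None and y is not None
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)
```

Note that the Pearson coefficient is unchanged by min-max rescaling of either field, so under this assumption the normalization affects how the maps look side by side rather than the correlation values themselves.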

Fig. 3. The mean geographic distribution of the (a) number, (c) lifetime, (e) radius, and (g) amplitude of cyclonic eddies (CEs) at each 1° grid during 1993-2010 for the four datasets. Panels (b, d, f, h) are the same as (a, c, e, g) but for anticyclonic eddies (AEs). The standard deviation of the (i) number, (k) lifetime, (m) radius, and (o) amplitude of CEs at each grid between the four datasets. Panels (j, l, n, p) are the same as (i, k, m, o) but for AEs. The black box is the study area.

Table 2 Spatial correlation coefficients between the mean state of Chelton and the other datasets during the period 1993-2010.
The geographic distributions of lifetime are also highly consistent among the four products (Fig. 3(c,d)). Long-lived CEs (AEs) are mainly located in the south (north). The spatial correlation coefficients of lifetime between Chelton and the other three datasets reach up to 0.76 (Table 2). The standard deviation is small in the KE region (Fig. 3(k,l)). Both results indicate that the four datasets share a similar spatial pattern of lifetime, although there are large differences in their original data.
The radius distributions of the four datasets are also similar (Fig. 3(e,f)). All datasets show that the CEs (AEs) with large radii are mainly concentrated on the south side (north side), with spatial correlations exceeding 0.65. Moreover, the standard deviation is approximately 0 in the KE region (Fig. 3(m,n)). Even though the mean radius of GEM-M is higher than that of the other three, this does not affect the spatial pattern of the radius in the KE.
The spatial correlation coefficients of amplitude are all over 0.95, while the standard deviation of CE amplitude is relatively high along the jet stream axis (Fig. 3(o)). This is because the large amplitude (over 0.5 cm) of CEs for Chelton, GEM-M, and Faghmous extends eastward to 170°E, 175°E, and 150°E, respectively (Fig. S3).
The annual mean and climatological monthly mean area-averaged eddy number, radius, and amplitude over the KE region during the period 1993-2010 are shown in Fig. 4. All variables have been rescaled by their minimum and maximum values so that they lie between 0 and 1. Here, we only show the results for total eddies, not for the different polarities, because the results for CEs and AEs are similar to those for the total. The results for GEM-M are quite different from those of the other datasets. It is clear that the radius and amplitude of GEM-M have low values during 1999-2000, whereas the values of the other three are high. The correlation coefficients between GEM-M and the others are 0.59, 0.33, and 0.21 in terms of annual mean numbers, but the coefficients among the other three are much higher (0.6-0.81). Li et al. (2016) found that the eddies in the KE region have a high frequency of merging and splitting events. GEM-M uses a splitting technique based on watershed segmentation to divide multinuclear eddies into mononuclear ones. The resulting shrinkage of the eddy boundary reduces the values of the radius and amplitude. It is highly possible that eddies split more frequently during 1999-2000.
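The annual mean and the monthly climatology of an area-averaged quantity can be derived from a monthly series as sketched below. The record layout `(year, month, value)` and the function names are assumptions made for illustration; the datasets themselves are distributed in their own formats.

```python
from collections import defaultdict


def _mean_by(records, key_index):
    """Average the value (third field) of each record, grouped by one field."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = rec[key_index]
        sums[key] += rec[2]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


def annual_mean(records):
    """Mean over all months of each year: {year: mean value}."""
    return _mean_by(records, 0)


def monthly_climatology(records):
    """Mean over all years of each calendar month: {month: mean value}."""
    return _mean_by(records, 1)
```

Each resulting series can then be rescaled to 0-1 and correlated between datasets in the same way as the spatial fields.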
The seasonal variation of eddies is an important issue, and the four datasets show similar variability to some extent. Recent studies have shown that eddy numbers are associated with background potential energy, which is highly correlated with the product of mixed-layer depth and mesoscale strain rate (Wang et al., 2019; Zhang et al., 2020). The monthly climatology also shows a high correlation between Chelton and Faghmous. However, the eddy number and amplitude from GEM-M show different results from February to June. As in the annual mean, the correlation coefficients between GEM-M and the other curves are low. This is also related to the eddy splitting and/or merging process in GEM. The splitting and/or merging frequency for eddies is larger from May to July in the KE region, but smaller from February to April.
The x- and y-axes represent longitude and latitude, respectively. The trajectories of CEs and AEs in the four datasets overlap in the first 4°, lasting about 245 days. All mean trajectories move westward, with maximum westward displacements of 16°, 14°, 12°, and 13° for Chelton, GEM-M, Faghmous, and Dong, respectively. The meridional movement of the eddies differs among the four datasets, particularly after about 245 days. The northward deflection of AEs for Chelton, GEM-M, Faghmous, and Dong is 3°, 0°, 2°, and 1°, respectively. The mean trajectory for GEM-M moves south for a while, then turns north. The southward deflection of CEs for Chelton, GEM-M, Faghmous, and Dong is 4°, 2°, 1°, and 2°, respectively. Chelton et al. (2011) found that CEs (AEs) show a poleward (equatorward) propagation, but their analysis was on the global scale. The propagation characteristics of CEs and AEs differ between regions, especially for western boundaries (Cheng et al., 2014). Das et al. (2019) also reported similar southward (northward) propagation of CEs (AEs).
Fig. 4. Normalized annual mean area-averaged (a) total eddy number, (b) radius, and (c) amplitude in the KE (box in Fig. 3) during the period 1993-2010 for Chelton (blue curve), GEM-M (pink curve), Faghmous (green curve), and Dong (black curve). (d-f) as in (a-c) but for the monthly climatology.

Fig. 5. Period mean of (a) anticyclonic and (b) cyclonic eddy trajectories in the KE (box in Fig. 3) during the period 1993-2010 for Chelton (blue curve), GEM-M (pink curve), Faghmous (green curve), and Dong (black curve).
Because the tracking algorithms of the four datasets are different, the mean trajectories ultimately move apart after about 5° meridionally. Although Chelton, Faghmous, and Dong apply the same tracking theory, there are some differences among them, such as the predefined search space and the treatment of temporarily missing eddies.
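One way to form a mean trajectory from many tracks is to average the displacement from each track's starting point at every age step. The text does not state how the period-mean trajectories were computed, so the sketch below is an assumption for illustration only; the function name and the `min_tracks` cutoff are hypothetical.

```python
def mean_trajectory(tracks, min_tracks=1):
    """Average displacement from the origin as a function of eddy age.

    tracks: list of tracks, each a list of (lon, lat) positions in degrees at
        successive, equally spaced time steps.
    min_tracks: hypothetical cutoff; averaging stops once fewer tracks than
        this are still alive.
    Returns a list of (mean_dlon, mean_dlat), one entry per age step.
    """
    max_len = max((len(t) for t in tracks), default=0)
    result = []
    for age in range(max_len):
        alive = [t for t in tracks if len(t) > age]
        if len(alive) < min_tracks:
            break
        dlon = sum(t[age][0] - t[0][0] for t in alive) / len(alive)
        dlat = sum(t[age][1] - t[0][1] for t in alive) / len(alive)
        result.append((dlon, dlat))
    return result
```

In an average of this kind, only the longest tracks contribute at large ages, so the mean there rests on few eddies and is less well constrained than at small ages.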
In the present study, the basic features of ocean mesoscale eddies in the KE region from four eddy datasets have been intercompared, including Chelton et al. (2011), GEM-M (Li et al., 2016), Faghmous et al. (2013), and Dong et al. (2011). All the datasets have different numbers of CEs and AEs, but more than 50% of identified eddies are coincident in our selected case. During the period 1993-2010, GEM-M identifies the highest eddy number, which is nearly twice as large as that of Chelton, which has the lowest numbers.
According to the PDF analysis, there are larger differences in eddy features between GEM-M and the other three datasets. Owing to its definitions of radius and amplitude, the mean radius and amplitude for GEM-M are also larger than for the other datasets. In addition, Chelton tends to have long-lived eddies, whereas Faghmous, by contrast, has many short-lived eddies.
Although the characteristics of eddies are different among the datasets, the geographic distributions and temporal variation of normalized eddy features are similar among the four datasets, particularly among Chelton, Faghmous, and Dong. Because its identification method is unique, the spatial and temporal correlations for GEM-M with the other datasets are relatively low. This also suggests that most of the datasets can capture the basic features of the mesoscale eddies in the KE region.
We also investigated the mean trajectories of eddies in the four datasets. Eddies generally move westward, with relatively smaller meridional deflection, and with CEs (AEs) exhibiting an equatorward (poleward) deflection. The trajectories of CEs and AEs in the four datasets overlap in the first 4°, which lasts about 245 days. After that, the trajectories spread completely, indicating large uncertainties in the trajectories of long-lived eddies.
Based on the above results, we find that the products using different identification and tracking methods are comparable in their spatial patterns, temporal variations, and trajectories, particularly after normalizing. This indicates that all the methods are capable of capturing the variability of oceanic mesoscale eddies in the eddy-rich region. However, there are also large discrepancies in eddy characteristics across the datasets, which will lead to difficulties for direct comparisons. Therefore, the definitions in different products should be unified in the future. In addition, the tracking method also needs improving, especially for long-lived eddies.
However, there are two reasons why we cannot determine which is the most accurate dataset. First, the results of eddy identification and tracking depend greatly on the definitions and methods for eddies and their properties. Second, there is a lack of data that can be used as the “truth”. Thus, we can only perform an intercomparison, without saying which dataset is better than the others. In the future, the effect of the eddy detection method needs deeper and more systematic and comprehensive analysis.
Funding
This study was supported by the National Key R&D Program for Developing Basic Sciences [grant number 2018YFA0605703], the Strategic Priority Research Program of the Chinese Academy of Sciences [grant number XDB42010404], and the National Natural Science Foundation of China [grant numbers 41976026, 41776030, 41931183, and 41931182].
Acknowledgments
We thank Prof. Liang Sun for kindly providing the source code of the GEM method, as well as his comments on the manuscript. The authors acknowledge the technical support from the National Key Scientific and Technological Infrastructure project “Earth System Science Numerical Simulator Facility” (EarthLab).
Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.aosl.2020.100011.
Atmospheric and Oceanic Science Letters, 2021, Issue 1