Mengqi Liu, Jinqiang Zhang c, Xiangao Xia c
a Key Laboratory of Middle Atmosphere and Global Environment Observation, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China
b University of Chinese Academy of Sciences, Beijing, China
c Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science & Technology, Nanjing, China
Keywords: Clear-sky detection methods; Surface irradiance; Pollution; Total-sky imager
ABSTRACT Surface irradiance measurements with high temporal resolution can be used to detect clear skies, which is a critical step for further study of, for example, aerosol and cloud radiative effects. Twenty-one clear-sky detection (CSD) methods are assessed based on five years of 1-min surface irradiance data at Xianghe, a heavily polluted station on the North China Plain. Total-sky imager (TSI) discrimination results corrected by manual checks are used as the benchmark for the evaluation. The performance relies heavily on the criteria adopted by the CSD methods. Those with higher cloudy-sky detection accuracy rates produce lower clear-sky accuracy rates, and vice versa. A tendency common to all CSD methods is that the detection accuracy deteriorates as aerosol loading increases. Nearly all criteria adopted in CSD methods are too strict to detect clear skies under polluted conditions, and the problem is more severe if clear-sky irradiance is not properly estimated. The mean true positive rate (CSD method correctly detects clear sky) decreases from 45% for aerosol optical depth (AOD) ≤ 0.2 to 6% for AOD > 0.5. The results clearly indicate that CSD methods in a highly polluted region still need further improvement.
Surface irradiance is highly desirable for the quantification of the global energy balance and the assessment of solar energy (Xie and Liu, 2013; Garcia et al., 2014). Instantaneous surface irradiance is strongly affected by cloud variations, and this sensitivity is the basis on which clear-sky detection (CSD) methods are developed. These CSD methods can be separated into two broad categories according to the data analyzed (Gueymard et al., 2019 and references therein): (1) using global horizontal irradiance (GHI) and/or diffuse horizontal irradiance (DHI) to detect cloudless skies (CSD sky), i.e., the absence of visible clouds across the whole sky dome that GHI and DHI are sensitive to; and (2) using direct normal irradiance (DNI) to detect clear sun (CSD sun), i.e., the absence of clouds along the line of sight to the sun. The criterion of CSD sky is obviously stricter than that of CSD sun.
The separation between clear and cloudy skies is mainly accomplished according to the marked difference in the smoothness of the temporal variation of surface irradiance (Perez et al., 1990; Shen et al., 2018). This assumption generally works well but occasionally fails. For instance, the magnitude and temporal variability of GHI in the presence of haze and of thin clouds may look similar, because the attenuation of the direct component of GHI by aerosols and thin clouds is likely offset by an increase in the diffuse component, leading to cloudy GHI resembling its clear-sky counterpart (Li et al., 2007). The performance of the CSD methods is highly dependent on thresholds that are likely site-dependent as a result of diverse climates (Long and Ackerman, 2000). The CSD methods should thus be carefully evaluated under different climatic conditions characterized by diverse cloud systems and aerosol loadings. The total-sky imager (TSI), with a field of view of 160°, is essentially sensitive to the whole sky dome that impacts the magnitude and temporal evolution of GHI and DHI; therefore, the TSI is a good option for the performance assessment of CSD methods (Huo and Lu, 2012).
Surface irradiance stations equipped with high-quality radiometers have been widely established in China, and are expected to play a critical role in the observation and forecasting of renewable energy and climate. There are two decisive issues in CSD that need further investigation. One is to produce a good clear-sky reference, and the other is to evaluate CSD methods in polluted regions (Gueymard et al., 2019; Bright et al., 2020). Given that heavy aerosol loading in China can lead to a reduction in surface irradiance comparable to that caused by clouds (Li et al., 2007), the goal of this study is to evaluate the performance of CSD methods at a heavily polluted station. Attention is paid to how the CSD methods are affected by pollution. To this end, clear skies automatically determined from TSI images are manually screened, which results in a more precise benchmark for the evaluation of CSD methods. Recommendations for further study on CSD methods are also discussed.

Fig. 1. (a) Performance of the TSI algorithm via comparison against manual checks of TSI images, and (b) the performance of the TSI algorithm under different AOD 550 ranges. TP represents samples determined as clear by both the TSI algorithm and visual inspection, whereas TN represents both detected as cloudy skies. FP and FN are when a time step is identified as clear (cloudy) by the manual check but cloudy (clear) by the TSI algorithm, respectively. Unidentified represents samples that cannot be identified unambiguously by the manual check.
The measurements were taken at Xianghe (39.75°N, 116.95°E), a suburban site on the North China Plain characterized by frequent episodes of thick haze interspersed with periods of background aerosol levels. The annual mean aerosol optical depth (AOD) at 550 nm (AOD 550) is 0.63, which is accompanied by a large day-to-day variation (standard deviation of 0.56) (Xia et al., 2016). Furthermore, AOD generally increases during daytime, with a maximum increase of 40% in winter (Song et al., 2018).
GHI between January 2005 and December 2009 was measured by a Kipp & Zonen CM21 pyranometer. An Eppley Normal Incident Pyrheliometer and a Black & White pyranometer were installed on a sun tracker to measure DNI and DHI, respectively. The measurements were quality-controlled using the BSRN-recommended procedures and were uploaded to the BSRN data archive (Driemel et al., 2018).
The TSI-440, manufactured by Yankee Environmental Systems, is a full-color sky camera. The solid-state charge-coupled imager of the TSI looks downwards onto a heated and rotating hemispherical mirror, and sky images are taken every minute. A shadow band on the mirror blocks the intense direct-normal light from the sun to prevent flares and protect the imager optics. A red-to-blue threshold is used to distinguish between clear and cloudy pixels in 24-bit JPEG images of 352 × 288 pixels.
AOD 550 (calculated from AOD 440 and the Ångström exponent) and water vapor (WV) data from the Aerosol Robotic Network (AERONET: https://aeronet.gsfc.nasa.gov/) were used to drive the REST2 model (Gueymard, 2008) to calculate clear-sky GHI (GHI cs). AOD 550 and WV were derived from spectral direct solar irradiance measurements by a sunphotometer with accuracies of 0.02 and 10%, respectively (Holben et al., 1998). Additionally, NASA's MERRA-2 aerosol and WV products (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/), at 0.5° × 0.625° spatial resolution, are used to discuss how GHI cs impacts the CSD methods that depend on GHI cs in cloud-screening.
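The conversion from AOD 440 to AOD 550 mentioned above can be sketched with the standard Ångström power law. This is only an illustration: the function name, the input values, and the assumption that a single Ångström exponent applies between 440 and 550 nm are ours, and the text does not state which wavelength pair was used to derive the exponent.

```python
def aod_at_wavelength(aod_ref, alpha, wl_ref_nm=440.0, wl_target_nm=550.0):
    """Angstrom power law: AOD(wl) = AOD(wl_ref) * (wl / wl_ref) ** (-alpha).

    aod_ref : AOD at the reference wavelength (here 440 nm)
    alpha   : Angstrom exponent (assumed constant over the interval)
    """
    return aod_ref * (wl_target_nm / wl_ref_nm) ** (-alpha)


# Hypothetical inputs, not values from the Xianghe record
aod_550 = aod_at_wavelength(aod_ref=0.80, alpha=1.2)
print(f"AOD 550 = {aod_550:.2f}")
```

For a positive Ångström exponent the AOD at 550 nm is smaller than at 440 nm; with an exponent of zero the two are equal.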
Twenty-one CSD methods are introduced in Gueymard et al. (2019). These CSD methods differ in the quantities and thresholds they adopt. GHI, DHI, DNI, and/or their derivations are used. The derived quantities include the clearness index and its modification, the clear-sky index, the diffuse fraction (K d), the line length difference between measurements and clear-sky counterparts (ΔL), and the sky clearness.
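Three of the derived quantities listed above can be sketched as follows. The definitions used here (K d as DHI/GHI, the clear-sky index as GHI/GHI cs, and the line length as the summed segment length of the irradiance time series) are common ones and are assumptions on our part; the exact formulations used by each CSD method are given in the supplementary material. All function names and sample values are hypothetical.

```python
import math


def diffuse_fraction(ghi, dhi):
    """K d = DHI / GHI (night-time and zero-GHI samples must be filtered first)."""
    return dhi / ghi


def clear_sky_index(ghi, ghi_cs):
    """Measured GHI normalized by the modeled clear-sky GHI."""
    return ghi / ghi_cs


def line_length(series, dt_min=1.0):
    """Summed length of the segments joining consecutive samples.

    The irradiance step (W m-2) and the time step (min) are combined directly,
    so the result is a smoothness measure, not a physical length.
    """
    return sum(math.hypot(b - a, dt_min) for a, b in zip(series, series[1:]))


def line_length_difference(ghi, ghi_cs, dt_min=1.0):
    """Delta L: line length of the measurements minus that of the clear-sky curve."""
    return line_length(ghi, dt_min) - line_length(ghi_cs, dt_min)


# Hypothetical 1-min samples (W m-2): a smooth clear-sky curve and a noisy record
ghi_cs = [500.0, 505.0, 510.0, 515.0]
ghi_obs = [500.0, 430.0, 520.0, 450.0]
delta_l = line_length_difference(ghi_obs, ghi_cs)
print(f"Delta L = {delta_l:.1f}")
```

A record that follows the clear-sky curve closely gives a ΔL near zero, whereas cloud-induced fluctuations lengthen the measured line and increase ΔL.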
A detailed description of these parameters and CSD methods can be found in the supplementary material.
Despite being an important instrument for detecting the sky condition, the TSI is affected by issues such as a dirty camera, as well as birds and bugs on or obscuring the camera lens. Furthermore, the TSI fails to discriminate clear skies during heavy pollution episodes and, conversely, misses thin cirrus clouds (Long et al., 2006). Therefore, the discrimination results of the TSI algorithm are checked by visual inspection of raw TSI images. Although this is labor-intensive owing to the huge number of TSI images needing inspection, it helps to provide a valuable and robust clear-sky reference. A confusion matrix is useful when evaluating the performance of the TSI since the sky state is either clear or cloudy. The matrix consists of five classes: true positive (TP), true negative (TN), false positive (FP), false negative (FN), and unidentified. TP means that clear sky is determined by both the TSI algorithm and the human observer, whereas TN means that both detect cloudy skies. FP and FN are when a time step is identified as clear (cloudy) by the manual check but cloudy (clear) by the TSI algorithm, respectively. Unidentified means a time step that cannot be identified unambiguously by the manual check.
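The five-class bookkeeping described above can be sketched as follows. The labels follow the definitions given in the text, in which FP denotes a time step that the manual check identifies as clear but the algorithm identifies as cloudy, and FN the reverse. The function names, the string encoding of the sky state, and the use of None for ambiguous images are assumptions made for illustration.

```python
from collections import Counter

CLASSES = ("TP", "TN", "FP", "FN", "unidentified")


def classify(manual, algorithm):
    """Assign one time step to a class.

    manual    : 'clear', 'cloudy', or None when the image is ambiguous
    algorithm : 'clear' or 'cloudy'
    """
    if manual is None:
        return "unidentified"
    if manual == "clear":
        # Manual check says clear: the algorithm either agrees (TP) or not (FP)
        return "TP" if algorithm == "clear" else "FP"
    # Manual check says cloudy: the algorithm either agrees (TN) or not (FN)
    return "TN" if algorithm == "cloudy" else "FN"


def class_fractions(pairs):
    """Fraction of time steps in each class for (manual, algorithm) pairs."""
    counts = Counter(classify(m, a) for m, a in pairs)
    total = sum(counts.values())
    return {name: counts[name] / total for name in CLASSES}


# Hypothetical labels for five time steps
pairs = [
    ("clear", "clear"),
    ("clear", "cloudy"),
    ("cloudy", "cloudy"),
    ("cloudy", "clear"),
    (None, "clear"),
]
fractions = class_fractions(pairs)
print(fractions)
```

The same routine applies unchanged when a CSD method, rather than the TSI algorithm, is compared against the manually checked reference.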

Table 1. The fractions of TP, TN, FP, and FN associated with 21 CSD methods under conditions with two AOD 550 ranges (clean and polluted, respectively). The results for the CSD sky and CSD sun methods are divided by the horizontal solid line.

Fig. 2. Sky images of clean (AOD 550 = 0.06) examples on 20 January 2005 ((a) 0800 LST, (b) 1100 LST, (c) 1400 LST, and (d) 1700 LST), and polluted (AOD 550 = 0.61) examples on 4 November 2009 ((e) 0800 LST, (f) 1100 LST, (g) 1400 LST, and (h) 1605 LST).

Fig. 3. Time series of solar radiation and derived parameters from these raw measurements for the CSD methods under two distinct cases with low (AOD 550 = 0.06 on 20 January 2005) and high (AOD 550 = 0.61 on 4 November 2009) aerosol loading. The temporal evolution (in 1-min time steps) of the measured GHI, DNI, DHI, and their clear-sky counterparts calculated by using REST2 are shown in (a) and (b), respectively. Panels (c) and (d) show the concomitant evolution of K d and the threshold adopted by Lefevre. Panels (e) and (f) indicate the variation of GHI, and the blue shading represents the clear-sky range according to Long. The ΔL in 10-min intervals and the threshold for clear sky adopted by Reno are shown in (g) and (h).
Fig. 1(a) provides the proportions of the five classes. Cloudy skies account for 58.6% (TN + FN), the overwhelming majority of which (97%) are correctly determined by the TSI algorithm. The most striking feature is that the TSI algorithm performs poorly under clear-sky conditions. TP and FP are 12.4% and 28.4%, respectively, which means only 30% (TP/(TP + FP)) of clear skies are correctly discriminated by the algorithm. Fig. 1(b) provides results under four AOD 550 ranges. The TSI algorithm generally fails to identify clear skies when AOD 550 is larger than 0.5. Moreover, it cannot effectively discriminate thin cloud from clear skies, leading to a large proportion of FN. Both factors work together to produce a low TP rate, which is rectified by manual inspection to the best of our ability.
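The arithmetic behind the 30% figure can be checked directly from the class proportions quoted above. The variable names are ours; the percentages are those given in the text for Fig. 1(a).

```python
# Class proportions quoted in the text, in % of all samples
tp = 12.4       # clear by both the manual check and the TSI algorithm
fp = 28.4       # clear by the manual check, cloudy by the TSI algorithm
cloudy = 58.6   # TN + FN: cloudy by the manual check

# Clear skies according to the manual check, and the share the algorithm recovers
clear = tp + fp
clear_hit_rate = tp / clear

print(f"clear skies (manual check): {clear:.1f}% of samples")
print(f"correctly flagged as clear: {clear_hit_rate:.1%}")
```

The ratio evaluates to about 30.4%, consistent with the rounded value of 30% stated in the text.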
Given that aerosol loading is the key issue in CSD, the performances of the 21 CSD methods under different AOD 550 conditions are evaluated separately. Table 1 provides the TP and TN rates under four AOD 550 ranges for all the CSD methods that are validated against the visually inspected TSI reference. A few interesting features merit discussion.
First, a high TP rate is generally associated with a low TN rate, and vice versa. This implies that the CSD methods with high TN rates are also likely to misclassify clear skies as cloudy skies, i.e., to have a high FP rate. Second, the performance is highly dependent on the CSD method. Regarding the TP rate, three CSD methods, i.e., Batlles (Batlles et al., 2000), Ineichen09 (Ineichen et al., 2009), and Quesada (Quesada-Ruiz et al., 2015), perform relatively well, with the TP rate approaching 79% for AOD 550 ≤ 0.2. This is mostly because loose criteria for detecting clear skies are adopted, which conversely produce a relatively low TN rate (5.23%, 6.57%, and 5.69%, respectively, for AOD 550 ≤ 0.2). On the contrary, those CSD sky methods adopting strict clear-sky criteria, for example, Perez (Perez et al., 1990), Long (Long and Ackerman, 2000), Garcia (Garcia et al., 2014), Ineichen06 (Ineichen, 2006), Ineichen16 (Ineichen, 2016), and Zhang (Zhang et al., 2018), naturally result in a low TP rate (< 20% for AOD 550 ≤ 0.2, and they nearly fail to detect clear skies for AOD 550 > 0.5). The results here agree with previous studies; for example, Gueymard et al. (2019) reported that Batlles produces the highest TP and the lowest TN rate. Third, TP is highly correlated with AOD. The TN rates for AOD 550 > 0.5 are substantially higher than those for AOD 550 ≤ 0.2. The criteria of these methods are too strict under polluted conditions. CSD sun methods show similar features to those of CSD sky methods. This is indeed the case for Inman (Inman et al., 2015) and Larraneta (Larraneta et al., 2017), for example. The TP rates of these methods are about 60% for AOD 550 ≤ 0.2 and 12% for AOD 550 > 0.5, whereas the TN rates are about 19% for AOD 550 ≤ 0.2 and 65% for AOD 550 > 0.5. Fourth, some methods are very sensitive to AOD values, especially those not using clear-sky irradiance estimation. For instance, the TP rate of Batlles is about 100% of all clear-sky samples for AOD 550 ≤ 0.2, but decreases to 40% for AOD 550 > 0.5, whereas its TN rate increases from 25% to 83% of all cloudy-sky samples. Methods using clear-sky irradiance estimation are not very sensitive to AOD values; examples are AliaMartinez, Polo, and Reno, which combine relatively high TP rates with high TN rates (Alia-Martinez et al., 2016; Polo et al., 2009; Reno and Hansen, 2016). The TP rates of these methods are over 44% for AOD 550 ≤ 0.2 and over 4% for AOD 550 > 0.5 (55% and 25%, respectively, of all clear-sky samples). Meanwhile, their TN rates are relatively high (over 17% and 76% for AOD 550 ≤ 0.2 and AOD 550 > 0.5, respectively). The clear-sky criteria of some CSD methods, for example Ineichen16, exclude nearly all high-AOD samples, and hence detect clear skies ineffectively.
In order to demonstrate aerosol effects on CSD performance in more detail, two examples, 20 January 2005 (AOD 550 = 0.06) and 4 November 2009 (AOD 550 = 0.61), are examined to show how CSD methods perform (Fig. 2). Time series of GHI, DNI, and DHI and their derivatives are shown in Fig. 3. Note that the clear-sky radiation was calculated by REST2 with AERONET AOD and WV products. For 20 January 2005, the thresholds adopted by these CSD methods work well in most cases; however, the criteria are too strict to effectively detect clear skies on 4 November 2009. High AOD results in smaller GHI but larger DHI values, and therefore a larger K d that always exceeds the specified clear-sky threshold (0.3 adopted by Lefevre (Lefevre et al., 2013)). Owing to the occasional non-uniformity of aerosol plumes, the temporal fluctuation of GHI is also very likely to exceed the clear-sky threshold, especially when very strict criteria are adopted; for example, the clear-sky threshold adopted by Long is too strict even under low AOD (Fig. 3(e)). Although the ΔL threshold (Reno and Hansen, 2016) overcomes this shortcoming when AOD is low, it fails in the presence of high AOD.
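The effect of aerosol loading on a fixed diffuse-fraction threshold can be illustrated with a minimal sketch. Only the threshold value of 0.3 comes from the text; the function name, the strict inequality, and the irradiance values are hypothetical, and the sketch covers the K d test alone rather than the full set of criteria of any published method.

```python
KD_CLEAR_MAX = 0.3  # clear-sky threshold on K d quoted in the text for Lefevre


def is_clear_by_kd(ghi, dhi, kd_max=KD_CLEAR_MAX):
    """Flag a time step as clear when the diffuse fraction is below kd_max.

    Whether the published test uses a strict inequality is an assumption here.
    """
    if ghi <= 0:
        return False  # no usable daytime signal
    return dhi / ghi < kd_max


# Hypothetical cloud-free values (W m-2), not measurements from Xianghe
clean = {"ghi": 520.0, "dhi": 60.0}      # low aerosol loading
polluted = {"ghi": 430.0, "dhi": 190.0}  # high aerosol loading

for name, sample in (("clean", clean), ("polluted", polluted)):
    kd = sample["dhi"] / sample["ghi"]
    print(name, f"K d = {kd:.2f}", "clear" if is_clear_by_kd(**sample) else "cloudy")
```

In this illustration both samples are cloud-free, yet the polluted one is rejected because scattering by aerosols raises K d above the fixed threshold, which is the behavior described for 4 November 2009.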
A precise clear-sky irradiance is required by some CSD methods as the reference to normalize surface irradiance. This is generally achieved if instantaneous AERONET or hourly MERRA-2 data are used to drive the REST2 model. The resulting TP rates are higher than those obtained on the basis of climatological AERONET data. A higher clear-sky accuracy rate is accompanied by a lower cloudy-sky accuracy rate, regardless of the reference clear-sky irradiance used in the CSD method. The method of Gueymard et al. (2019) produces a 41% TP rate using instantaneous AERONET data (over 77% of all clear-sky samples), and the mean cloudy-sky accuracy rate is only about 52% of all cloudy-sky samples. On the contrary, methods with high cloudy-sky accuracy rates always have low clear-sky accuracy rates. Shen's method (Shen et al., 2018) produces the highest TN rate (over 44%); however, its TP rate is less than 15% regardless of the reference clear-sky irradiance. Obtaining a good balance between these two opposite scenarios appears to be difficult for all CSD methods, even when clear-sky irradiance is estimated with good accuracy.
Using TSI and surface irradiance data during 2004–09 at Xianghe, a heavily polluted station on the North China Plain, 21 CSD methods were evaluated. The major results can be summarized as follows:
The CSD performance generally becomes worse as aerosol loading increases. A significant decrease can be seen in the clear-sky accuracy rate between AOD 550 ≤ 0.2 and AOD 550 > 0.5; the maximum precision decreases by about 80%. In contrast, the cloudy-sky accuracy rate slightly increases along with the AOD. AliaMartinez, Polo, and Reno produce relatively higher TP and TN rates under both clean and polluted conditions (over 44% and 17% for clean conditions, and over 4% and 76% for polluted conditions), and are therefore the methods recommended in this paper.
The thresholds of the CSD methods seem to be improper under polluted conditions. Compared with clean clear skies, polluted clear skies are characterized by higher K d, GHI variance, and ΔL, which should be carefully considered. All these CSD methods adopt specified separation thresholds that are linear in nature to detect clear skies. Machine learning has demonstrated potential in pattern recognition and classification, and is expected to improve the skill of CSD methods.
Funding
This work was supported by the National Key R&D Program of China [grant number 2017YFA0603504], the Strategic Priority Research Program of the Chinese Academy of Sciences [grant number XDA17010101], and the National Natural Science Foundation of China [grant number 41875183].
Acknowledgments
For the coded versions of all the CSD methods mentioned in this paper, the reader is referred to the CSD Library at https://jamiembright.github.io/csd-library/.
Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.aosl.2020.100016.
Atmospheric and Oceanic Science Letters, 2021, Issue 2