,,,
1.College of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing 211106,P.R.China;
2.College of Civil Aviation,Nanjing University of Aeronautics and Astronautics,Nanjing 211106,P.R.China;
3.Collaborative Innovation Center of Novel Software Technology and Industrialization,Nanjing 210023,P.R.China
Abstract:T he rapid growth of air traffic has continuously increased the workload of controllers,which has become an important factor restricting sector capacity.If similar traffic scenes can be identified,the historical decision-making experience may be used to help controllers decide control strategies quickly.Considering that there are many traffic scenes and it is hard to label them all,in this paper,we propose an active SVM metric learning(ASVM 2L)algorithm to measure and identify the similar traffic scenes.First of all,we obtain some traffic scene samples correctly labeled by experienced air traffic controllers.We design an active sampling strategy based on voting difference to choose the most valuable unlabeled samples and label them.T hen the metric matrix of all the labeled samples is learned and used to complete the classification of traffic scenes.We verify the effectiveness of ASVM 2L on standard data sets,and then use it to measure and classify the traffic scenes on the historical air traffic data set of the Central South Sector of China.The experimental results show that,compared with other existing methods,the proposed method can use the information of traffic scene samples more thoroughly and achieve better classification performance under limited labeled samples.
Key words:air traffic;similar scene;active learning;metric learning;SVM
Air traffic system is highly dynamic and complex due to the influence of weather and traffic flow.Air traffic controllers often need to judge emergencies and release control strategies quickly to avoid conflicts,congestion,delays and other situations.With the continuous growth of airspace traffic,controllers often fall into overloaded work conditions,which brings potential dangers to air traffic.Senior controllers can usually rely on their experience to make quick decision under historically similar weather and traffic flow conditions.For novice controllers,they cannot easily make control strategies due to the lack of experience in handling similar scenes.Therefore,the automatic identification of similar air traffic scenes and the automatic extraction of corresponding traffic control strategies can assist the controllers to make quick decisions and reduce their workload.In 2013,Grabbe et al.[1]proposed a similar scene-based strategy-assisted decision-making idea for air traffic control.It first finds out the historical days that are most similar to the reference day by measuring a set of given air traffic characteristics,and then analyzes the actual control strategies in the historical days(such as ground delay waiting procedures,TMIs,etc.)and performance evaluation indicators after implementation(such as average delay time,number of cancelled flights,etc.).Based on the analysis,recommendations of control strate-gies on the reference day are given.Grabbe’s idea is widely accepted by scholars,and a lot of research work has been carried out,mainly focusing on the feature selection,traffic scenes extraction and clustering of similar scenes[2-3].These methods eliminate the influence of correlation and redundancy between features by feature selection or extraction and then cluster the traffic scenes by measuring their distance.So that some different types of similar traffic scenes can be obtained.However,because the traffic scene data do not have labels and cluster algorithms cannot give the accurate number of categories,the clustering results only give a rough category division that is also hard to be explained.This obviously cannot bring direct help to the controllers in making current decision.
In order to classify and identify similar traffic scenes more accurately,we adopt the idea of semisupervised learning to propose a method for identifying similar traffic scenes in sectors based on active SVM metric learning(ASVM2L).In this method,the senior controllers(Oracle)first mark some traffic scene samples.For a large number of unlabeled traffic scene samples,an active learning strategy is used to pick up the most difficult samples for metric learner and submit them to Oracle to get labels.Then the labeled samples are used to train a SVM metric learner.The process will be repeated until the best metric matrix is obtained,thereby obtaining the best classification of similar traffic scenes.At last,we verify the effectiveness of the proposed ASVM2L algorithm on standard data sets,and the performance of the proposed similar traffic scene identification method based on ASVM2L on the real traffic scene data set.
To identify similar traffic scenes,a lot of researches have been conducted in feature selection and extraction,similarity measurement and clustering.Kuhn et al.[4]selected some features based on expert knowledge,such as the distance from the airport to the center of heavy rainfall,the strength of the crosswind on the runway,the number of scheduled flights,etc.,to calculate the similarity between the traffic scenes.Christien et al.[5]used features such as the number and type of aircraft,the number of aircraft conflicts,and static sector indicators as input for traffic scene classification.In 2016,Grabbe et al.[6]used the Yakkad coefficient to calculate the similarity of traffic scenes,and then applied it as the input of similar scene clustering and classification.Hu et al.[7]extracted discrete features and time series features of traffic scenes by reducing the 382-D traffic features,and used these features to cluster similar traffic scenes.
It can be seen that the current related researches mostly use clustering to obtain the categories similar traffic scenes.By calculating the Euclidean distance between the samples,the similar scenes are grouped into a cluster and regarded as the same category.Through these studies,we can initially obtain several categories of traffic scenes consistent with the controller’s cognition.This provides an opportunity for semi-supervised metric learning algorithms to be applied to measure the similarity of traffic scenes.Metric learning is an important research aera in machine learning.Finding appropriate metrics between samples by metric learning can help subsequent learning algorithms achieve better performance.As the most common metric learning method,the mahalanobis distance metric uses the original label information to generate side information,which is constructed into pairwise constraints or triple constraints acting on the loss function to implement metric learning.Xing et al.[8]firstly proposed a convex optimization method using pairwise constraints to learn Mahalanobis distance metric,and then proposed the large margin nearest neighbor algorithm(LMNN)[9]based on theK-nearest neighbor(K-NN)idea.The training goal of LMNN is to keep theKnearest neighbor samples of each sample in the same class,and the samples of different categories far apart.Neighborhood component analysis[10](NCA)obtains the transformation matrix in the Mahalanobis distance by randomly selecting thecross-validation results of the nearest neighbor optimization leave-one-out method on the training set.Unlike other methods,the NCA method is nonparametric and does not make any assumptions about the shape and boundary of the class distribution.It is both a metric learning method and a linear dimensionality reduction method.Information theory metric learning(ITML)[11]obtains the metric matrix by minimizing the differential relative entropy under the pairwise constraint of inequality.The maximally collapsing metric learning[12](MCML)constructs a convex optimization problem,which generates a metric matrix by folding samples of the same class to a point and pushing away samples of other classes infinitely.Wang et al.[13]proposed a kernel classification framework for learning the Mahalanobis distance metrics in the original feature space by combining with SVM. SVM metric learning(SVM2L)supports both pairwise constraints and triple constraints.As a kernel metric learning algorithm,SVM2L is not only more efficient than other metric learning algorithms,but also more suitable for applications with a small number of useful samples.
In some research work,metric learning algorithms have been combined with active learning to solve semi-supervised metric problems.The key idea of active learning is to select the sample with the maximum information from the unlabeled samples and submit it to Oracle for labeling and then use them for algorithm training.It can achieve better accuracy with few labeled samples.Pasolli et al.[14]proposed the active metric learning algorithm for remote sensing spectral image classification.By combining LMNN metric learning and 1-NN clustering,unlabeled samples were evaluated,and the most uncertain samples were selected,labeled and added to training set.Repeat the training process until a better metric result is obtained.Priyadarshini et al.[15]proposed the batch decorrelation for active metric learning for object recognition.This algorithm uses the metric matrix calculated by deep neural network to evaluate the set of constraints and select a batch of triplets with large amount of information to train better model.
Inspired by these works,we propose a metric learning algorithm that combines the idea of active learning with SVM2L to solve the problem of air traffic scenes.
In order to facilitate the introduction of the algorithm proposed in this article,the following four definitions are given.
Definition 1The definition of labeled sample set is

wherex i∈Rdindicates that the feature number of thei-th sample isdandy iis the label of theith sample.
Definition 2The definition of triple constraint set is

wherenis number of triple constraints.For each triple constraint(x1,x2,x3)l,ensuring that samplehas the same label with sample,and sampleis different.
Definition 3Committee set is

whereVrepresents the committee set andV iis theith committee member which can be machine learning algorithm,evaluation indicator,etc.
Definition 4Given a positive semidefinite matrixM,the definition of distance between two samples is

2.2.1 Metric learning based on SVM
SVM2L uses triple constraintsTgenerated by the train set to learn a metric matrixM,which can satisfy most of the triple constraints in the setT.The problem can be expressed as the following optimization problem

Eq.(1)can be rewritten by Lagrange remainder as

whereαandβare Lagrange multipliers and satisfy?l,αl≥0,βl≥0.In order to find the minimum,let the partial derivatives ofMandξto be 0,then we can get

Putting Eqs(.3,4)into Eq(.2),we can get the dual problem as

wheretiis thei-th triple constraint.K T(ti,tj)is the kernel function for triple constraints.

From Eq(.3),we can get the formula relationship between metric matrixMand Lagrange multiplierα,shown as

2.2.2 Query strategy for active learning
Semi-supervised learning has gradually become a hot issue in machine learning.In semi-supervised learning,unlabeled data are large and easy to obtain,but it is difficult to get their labels.The main idea of active learning[16]is that if the algorithm is allowed to select samples to train the model,it can achieve better accuracy with fewer labelled samples.Usually,the active learner selects samples from unlabeled samples and submit them to Oracle for labelling.Therefore,the key problem of active learning is how to select the samples to be labeled,that is,the query strategy.There are six kinds of query strategies:Uncertainty sampling,query-by-committee,expected model change,expected error reduction,variance reduction,and density-weighted methods[17].
In this paper,we design a query-by-committee strategy for active learning.According to Definition 3,we use threeK-NN classifiers as committee membersV'={V1(1-NN),V2(3-NN),V3(5-NN)}.For unlabeled samplex u,these three classifiers will give three labels.Then there will three cases of the three labels:
According to these three cases,active learning selects the samples with the largest difference value and submits them to Oracle.Oracle gives true labels of the samples,and adds them to training set to learn a new matrix.It should be noted that samples with the same difference value have the same amount of information,and the value for metric learning is also the same.Therefore,if there are many samples with the largest difference value,we can randomly select a batch size of samples and submit them to Oracle.
2.2.3 ASVM2L algorithm
Combining SVM2L algorithm with active query strategy,we propose an active metric learning based support vector machines(ASVM2L).The process of ASVM2L is described as Algorithm 1.In Step 2,the algorithm selects the nearest neighbor samples of the same label and ones of different la-bels for each sample to construct the triple constraints.Steps 3—5 solve SVM metric learning.Step 6 is the termination condition of the algorithm.Steps 7—8 use the active learning algorithm to select the samples with the largest difference value for labeling.
Algorithm 1 ASVM 2L
Initial set:Committee setV',OracleO
Input:Unlabeled sample setX,the batch of selection batch,constraint parameterC,and initial parametersn
Output:metric matrixM
(1)Selectnsamples to OracleOfor labeling.This labeled set is denoted asL0,and the rest of unlabeled set is named asU0.
(2)Construct a triple constraint set TCibased on the current labeled setL i.
(3)Solve Eq.(2)with the triple constraint TCi,to get Lagrange multiplierα.
(4)SolveM iaccording to Eq.(7).
(5)ProjectM iinto the positive semi-definite space to obtain the metric matrix.
(6)IfU i=?,the algorithm terminates.
(7)The unlabeled sample setSiwith the largest difference value in batch size is selected according to the designed active selection strategy.
(8)LetoracleOlabelforSi,L i+1=L i+Si,U i+1=U i-S i,i=i+1,back to Step 2.
Fig.1 is the framework of identifying similar traffic scenes based on ASVM2L.The left part represents the data pre-processing process.We extract 382 traffic features from original data and use principal component algorithm(PCA)to reduce the dimensionality to 60-D principal components.The middle part is the process of ASVM2L.According to the interaction between the metric learning model and Oracle,the optimal metric matrix is learned from the fewest labelled samples.The right part is the process of identifying similar traffic scenes.The new traffic scenes are classified by the existing labeled scenes and metric matrix,and the samples of the same class are similar scenes.

Fig.1 Framework of identifying similar traffic scenes based on ASVM 2L
To verify the effectiveness of ASVM2L algorithm,we compare ASVM2L with other metric learning algorithms,LMNN,NCA,ITML,MCML,and SVM2L,on five UCI standard data sets shown in Table 1.
For each data set,70%of the data are used formetric matrix learning,and the remaining 30%are used forK-NN classification to verify the effect of metric learning.We calculate the 3-NN classification accuracy after metric learning for all the six algorithms on each data set.The experimental results are shown in Table 2.It can be seen that ASVM2L achieved the highest accuracy on two data sets and the second highest accuracy on three data sets.It can be concluded that ASVM2L has more stable and superior metric performance.

Table 1 Standard data sets

Table 2 Performance comparison of metric learning algorithms %
To verify the effectiveness of active learning query strategy,we compare ASVM2L with SVM2L using random sampling strategy on the five standard data sets.The learning curves of the two algorithms are shown in Fig.2.The red curve represents the 3-NN classification accuracy after ASVM2L,and the blue curve represents the 3-NN classification accuracy after random-sampling SVM2L.It can be seen that by introducing the active learning strategy,ASVM2L can achieve the better classification accuracy than the random-sampling SVM2L.The results on the five data sets all show that when the number of training samples reaches half of the size of all samples,the ASVM2L algorithm has obtained the learning performance similar to supervised metric learning,which means ASVM2L can greatly take advantage of the label information of limited labelled samples,or it can use fewer labelled samples to get better performance.

Fig.2 Learning curves of ASVM 2L and random-sampling SVM 2L on five data sets
To verify the performance of ASVM2L for air traffic scenes classification,we use the real air traffic data including fly log data and fly plan data from of the No.5 regional sector in Central South of China.The data include 382 features,such as sector traffic flow,variance of heading,flight distance,flight time,average flight speed,and the other features of each altitude.According to the actual operation of the sector,the data are divided into two classes.The method of feature calculation is described in detail in Ref[.7].
Since the original air traffic data are large,and have high dimensionality and strong redundancy,we use PCA to reduce the dimensionality of the data by calculating the cumulative variance contribution rate.Fig.3 shows the change curve of cumulative variance contribution rate with the number of principal components.When we set the threshold to 0.85,we can reduce the dimensions of the original data from 382 to 60.That means the extracted 60 principal components can represent most of the information of the original data.

Fig.3 Change curve of cumulative variance
The feature correlation heat map after dimensionality reduction is shown in Fig.4.It can be seen that the extracted 60 principal components can be considered as having no redundancy.

Fig.4 Feature correlation heat map after dimensionality reduction
After the data preprocessing,we use the first 70%of the data for metric matrix learning,and the last 30%for 3-NN to verify the ability of ASVM2L for improving the classification accuracy.We also apply six supervised metric learning algorithms to conduct the same experiment for comparisons.The classification accuracies of the seven algorithms are listed in Table 3.

Table 3 Classification accuracy of traffic scenes
From Table 3,we can get that the 3-NN classification accuracy of all metric learning algorithms is better than the Euclidean distance.ASVM2L as a semi-supervised learning algorithm performs the best in classify traffic scenes compared with other supervised algorithms.What’s more noteworthy is that ASVM2L only uses 30%of the labelled samples to achieve the best accuracy,which means that the proposed method can make better use of the label information of traffic scene samples when solving this problem.
Fig.5 shows the change of classification accura-cy of ASVM2L with iterations on the whole air traffic scene data set.It can be seen from Fig.5 that when the ASVM2L runs to its 12th iteration,the classification accuracy reaches the highest 84.34%,and the number of labeled samples used for calculation at this time is 1 861.After that,the classification accuracy begins to decline slowly and gradually converges to 83.25%,which is exactly the classification accuracy of SVM2L on this data set.In other words,when we use the whole data set to learn ASVM2L,we just degenerate it into the supervised algorithm SVM2L,which completely fails to reflect the advantages of active learning in semi-supervised learning.

Fig.5 Learning curve of ASVM 2L on air traffic data
In order to classify air traffic scenes more accurately,in this paper,we propose a support vector machine-based active metric learning algorithm ASVM2L to measure the samples before classification.Considering that we currently do not have accurate labeled traffic scene samples,ASVM2L can combine knowledge of domain experts and machine learning results by using the active learning method.Through active query sampling,the most uncertain unlabeled samples with the most information are selected and labeled by domain experts,and then they can be used to train a better metric matrix,which will greatly help the subsequent classifier to improve the classification performance.Comparative experiments on five standard data sets prove that the ASVM2L algorithm proposed in this paper can make full use of the information of the limited labelled samples,and quickly obtain the best metric matrix.Comparative experiments on data of air traffic scenes also prove that as a semi-supervised metric learning algorithm,ASVM2L can achieve the best classification accuracy by much less labelled traffic scenes samples.Relying on the obtained classification model,more traffic scenes can be classified,and more similar scenes can be identified.Furthermore,control strategies in similar scene can be acquired and analyzed,and similar decision-making rules can be found to help controllers make decisions quickly in the current scene.
AcknowledgementsThis work was supported by the National Natural Science Foundation of China(No.61501229)and the Fundamental Research Funds for the Central Universities(Nos.2019054,2020045).
AuthorDr.CHEN Haiyan received her M.S.and Ph.D.degrees in computer science from Nanjing University of Aeronautics and Astronautics(NUAA),Nanjing,China,in 2005 and 2012,respectively.She is currently a lecturer at the College of Computer Science and Technology,NUAA.Her research interests include machine learning and artificial intelligence.
Author contributionsDr.CHEN Haiyan designed the study,analyzed the result and revised the manuscript.Mr.HOU Xiaye designed the algorithm,implemented the model and wrote the manuscript.Dr.YUAN Ligang contributed to the background and data of this study.Mr.ZHANG Bing contributed to the data preprocessing.All authors commented on the manuscript draft and approved the submission.
Competing inter estsT he authors declare no competing interests.
Transactions of Nanjing University of Aeronautics and Astronautics2021年4期