Border Gateway Protocol Anomaly Detection and Time Series Classification
Student thesis: Doctoral Thesis
The Border Gateway Protocol (BGP) is the default inter-domain routing protocol, through which Autonomous Systems exchange routing and reachability information. As the Internet grows in complexity and capacity, anomalies such as misconfigurations, worms, prefix hijacks and blackouts degrade the performance of the global network on a large scale. It is therefore important to detect such events in order to keep the Internet stable and prevent further loss. In this thesis, the characteristics of anomalous BGP traffic are analyzed from several aspects, and several models are proposed to address the problems that arise when identifying it. The analysis reveals several main challenges in identifying anomalous traffic, and our main contributions are intended to address them. The first challenge is class imbalance in the data distribution: during model training, anomalous traffic samples are far fewer than regular traffic samples. For imbalanced classification, we propose an improved ensemble model that makes full use of the minority-class data and obtains desirable and more robust results. We extract new aggregated features from the raw traffic and group them into three modalities according to data type: continuous, binary and categorical. Traditional machine learning methods usually apply feature selection as a separate step before the classifier; in our work, feature selection is integrated into the ensemble model, which selects feature subsets to reduce model variance and improve classification performance. Second, the temporal attributes of the traffic need to be analyzed, yet previous work often neglects them and traditional classifiers usually fail to take temporal information into account.
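To make the class-imbalance idea above concrete, the following is a minimal sketch of one common way to build an ensemble that keeps every minority-class sample: each base learner is trained on all anomalous samples plus an equally sized random draw of regular samples, and the learners' predictions are combined by majority vote. This is a generic balanced-bagging illustration, not the ensemble model proposed in the thesis, whose details are not given here; the function names `balanced_bags` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def balanced_bags(labels, n_bags, seed=0):
    """Build index lists for training base learners (hypothetical helper).

    Each bag keeps every minority-class sample and adds an equally sized
    random draw (without replacement) from the remaining samples.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_label = min(counts, key=counts.get)  # rarest class
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]

    bags = []
    for _ in range(n_bags):
        sampled = rng.sample(majority, len(minority))
        bags.append(sorted(minority + sampled))
    return bags


def majority_vote(predictions):
    """Combine the base learners' predictions for a single sample."""
    return Counter(predictions).most_common(1)[0][0]


# Toy labels: 20 regular samples (0) and 4 anomalous samples (1).
labels = [0] * 20 + [1] * 4
bags = balanced_bags(labels, n_bags=3)
```

Each bag here holds 8 indices, 4 from each class, so no anomalous sample is discarded while every base learner still sees balanced training data.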
In our work, a sliding window is employed to convert the original input data into sequences. To model the generated sequences, a novel recurrent neural network (RNN) is proposed to capture dependencies at multiple resolutions. Specifically, we propose a Multi-Scale Long Short-Term Memory model (MSLSTM) to learn the temporal dependency of traffic data. In the proposed MSLSTM model, the multi-resolution sequences are generated with the Discrete Wavelet Transform (DWT). The first layer of MSLSTM is an attention-based network that learns the importance of the different time scales independently; the integrated representations then serve as the input to the second layer of MSLSTM, which captures the long- and short-term temporal dependencies in the sequence. To verify its feasibility, extensive experiments have been conducted to compare the performance of the proposed model with baseline methods such as RNN, LSTM, stacked LSTM, Support Vector Machine (SVM), Naive Bayes (NB), 1-Nearest Neighbor (1NN), AdaBoost and Random Forest (RF), as well as other popular classifiers. Last but not least, we extend our work from anomaly classification to general time series classification. The characteristics of time series are analyzed from different aspects, and a hybrid learning framework that jointly models the global trend and local patterns is proposed. The global shape learning is based on DWT and LSTM, and the local features are modeled with the shapelet transform (ST). The two kinds of information are subsequently integrated by a neural network fusion layer with a softmax. The experimental results demonstrate that the proposed framework achieves considerable improvements in both classification performance and computational complexity.
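The preprocessing described above (a sliding window followed by a DWT-based multi-resolution view) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not name the wavelet, window width or step, so the Haar wavelet and the parameters below are placeholders, and the helper names `sliding_windows`, `haar_step` and `multi_scale` are hypothetical. The attention and LSTM layers of MSLSTM are not shown.

```python
import math


def sliding_windows(series, width, step=1):
    """Cut a 1-D series into overlapping windows of fixed width."""
    return [series[i:i + width]
            for i in range(0, len(series) - width + 1, step)]


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail).

    A trailing element is dropped when the signal length is odd.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def multi_scale(window, levels):
    """Return the window followed by successively coarser approximations."""
    scales = [list(window)]
    current = list(window)
    for _ in range(levels):
        if len(current) < 2:
            break  # nothing left to decompose
        current, _detail = haar_step(current)
        scales.append(current)
    return scales


# Toy example: a short series, windows of width 4 with step 2.
windows = sliding_windows([0, 1, 2, 3, 4, 5], width=4, step=2)
scales = multi_scale([1.0, 1.0, 1.0, 1.0], levels=2)
```

Each coarser scale halves the sequence length, so a model that consumes all the scales sees the same window at several temporal resolutions.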
- Border Gateway Protocol (BGP), Anomaly Classification, Ensemble Learning, Neural Network, Discrete Wavelet Transform (DWT), Long Short-Term Memory (LSTM), Multi-scale, Time Series Classification, Shapelet Transform