Network traffic performance analysis from passive measurements using gradient boosting machine learning

Published Online: pp 13-30

Effective monitoring and analysis of network traffic are vital for scientific computing, since scientific applications often require moving massive amounts of data from one site to another. A body of statistical and machine learning techniques has been introduced for network traffic monitoring and analysis, but the task remains highly challenging for several reasons, such as the unavailability of label information, the complexity of real-time analysis, and the limited generalisation of machine learning models. In this paper, we present a novel method to identify continuous time windows of low throughput for the purpose of network performance analysis and anomaly detection, in order to facilitate data transfers for high-performance scientific computing. The presented method is based on supervised learning techniques with an adaptive labelling function that automatically determines whether a time window is 'slow' or 'normal'. The method is validated on large real-world datasets collected from several data transfer nodes (DTNs) located in a Science DMZ. Our experimental results show that the proposed method is able to quickly predict time windows of low-performing network transfers that require attention from network engineers, and also to identify the most important features for the classification.
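The abstract does not specify the form of the adaptive labelling function, so the following is only a minimal sketch under stated assumptions. It labels a time window 'slow' when its throughput falls below a fraction of the median of the preceding windows, then groups consecutive 'slow' windows into continuous runs. The function names `label_windows` and `slow_runs`, the parameters `history` and `ratio`, and the sample throughput values are all hypothetical and are not taken from the paper.

```python
from statistics import median


def label_windows(throughputs, history=4, ratio=0.5):
    """Label each time window 'slow' or 'normal' (assumed rule, not the paper's).

    A window is 'slow' when its throughput is below `ratio` times the median
    of up to `history` immediately preceding windows. The first window has no
    baseline and is labelled 'normal'.
    """
    labels = []
    for i, value in enumerate(throughputs):
        past = throughputs[max(0, i - history):i]
        if past and value < ratio * median(past):
            labels.append("slow")
        else:
            labels.append("normal")
    return labels


def slow_runs(labels):
    """Return (start, end) index pairs, inclusive, of consecutive 'slow' windows."""
    runs = []
    start = None
    for i, label in enumerate(labels):
        if label == "slow" and start is None:
            start = i
        elif label != "slow" and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(labels) - 1))
    return runs


# Hypothetical per-window throughput values (arbitrary units).
sample = [100, 110, 90, 105, 30, 25, 95, 100]
sample_labels = label_windows(sample)
sample_runs = slow_runs(sample_labels)
```

In a supervised setting such as the one described above, labels produced this way could serve as training targets for a classifier (the title names gradient boosting), with per-window traffic statistics as features. Because the baseline adapts to recent history, the threshold tracks each path's typical throughput rather than relying on a fixed cut-off.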


network traffic, TCP performance, UMAP, classification, Tstat