An approach for detecting encrypted malware traffic via fully convolutional masked autoencoders

28/04/2026 HEP Journals

Cybersecurity has long been a central topic of Internet research. Malware is software intentionally designed to harm computer systems, networks, or users by stealing or corrupting data, disrupting operations, or gaining unauthorized access. Existing malware traffic detection techniques depend on large amounts of labeled data being readily available for model training, which limits their ability to transfer to newly emerging malware.

To address this problem, a research team led by Meng SHEN published their new research on 15 April 2026 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposed Malcom, an adaptive encrypted malware traffic detection method based on fully convolutional masked autoencoders, to detect malware traffic hidden in encrypted traffic. Extensive experiments on real-world datasets demonstrate that Malcom outperforms the state-of-the-art (SOTA) methods in two typical scenarios, i.e., a few-shot learning scenario and an imbalanced dataset scenario.

In the research, the team analyzes the Open Systems Interconnection (OSI) reference model to extract discriminative features that differentiate malware traffic from benign traffic. To achieve adaptive encrypted traffic detection, they propose Malcom, which consists of two stages: a pre-training stage and a fine-tuning stage. In the pre-training stage, they pre-train a sparse-convolution-based ConvNeXt traffic encoder and a lightweight traffic decoder on unlabeled traffic samples. In the fine-tuning stage, only a few labeled samples of new malware traffic are needed to achieve high accuracy in detecting the new malware.

The researchers first propose a novel traffic representation, the Header-Payload Matrix (HPM), to extract discriminative features that differentiate malware traffic from benign traffic. They then develop a hierarchical ConvNeXt traffic encoder and a lightweight ConvNeXt traffic decoder to learn high-level features from a large amount of unlabeled data. The masked autoencoder framework lets the model adapt to new malware by fine-tuning on only a few labeled samples. Experiments on real-world datasets are performed in two typical scenarios, i.e., a few-shot learning scenario and an imbalanced dataset scenario. The experimental results show that Malcom outperforms the state-of-the-art (SOTA) methods in both scenarios.
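
The masked pre-training idea described above can be illustrated with a small sketch. This is not Malcom's implementation: the matrix layout (header bytes in the top rows, payload bytes in the bottom rows), the function names, the patch size, and the mask ratio are all assumptions chosen for illustration, and the ConvNeXt encoder and decoder are omitted entirely. The sketch only shows the general masked-autoencoder pattern of hiding random patches of a traffic matrix and scoring a reconstruction on the hidden cells alone.

```python
import random


def build_byte_matrix(header: bytes, payload: bytes, width: int = 8, rows_each: int = 2):
    """Hypothetical layout, NOT the paper's HPM: header bytes fill the top
    rows and payload bytes the bottom rows, truncated or zero-padded."""
    def to_rows(data: bytes):
        size = width * rows_each
        padded = data[:size].ljust(size, b"\x00")
        return [list(padded[i * width:(i + 1) * width]) for i in range(rows_each)]
    return to_rows(header) + to_rows(payload)


def mask_patches(matrix, patch: int = 2, mask_ratio: float = 0.5, seed: int = 0):
    """Zero out a random subset of patch x patch blocks.
    Returns the masked copy and the set of masked top-left corners."""
    rows, cols = len(matrix), len(matrix[0])
    corners = [(r, c) for r in range(0, rows, patch) for c in range(0, cols, patch)]
    rng = random.Random(seed)
    masked = set(rng.sample(corners, int(len(corners) * mask_ratio)))
    out = [row[:] for row in matrix]
    for r, c in masked:
        for i in range(r, min(r + patch, rows)):
            for j in range(c, min(c + patch, cols)):
                out[i][j] = 0
    return out, masked


def masked_mse(original, reconstructed, masked, patch: int = 2):
    """Mean squared error computed over the masked cells only, the usual
    reconstruction objective in masked-autoencoder pre-training."""
    rows, cols = len(original), len(original[0])
    total, count = 0.0, 0
    for r, c in masked:
        for i in range(r, min(r + patch, rows)):
            for j in range(c, min(c + patch, cols)):
                total += (original[i][j] - reconstructed[i][j]) ** 2
                count += 1
    return total / count if count else 0.0


# Toy usage: the masked input stands in for a (missing) decoder output,
# so the loss here simply measures how much information was hidden.
toy = build_byte_matrix(b"\x16\x03\x03\x00\x50", b"example-encrypted-bytes")
visible, hidden = mask_patches(toy)
loss = masked_mse(toy, visible, hidden)
```

In a real pipeline, an encoder would see only the visible patches and a decoder would predict the hidden ones; the pre-trained encoder would then be fine-tuned on the few labeled malware samples available.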

Future work can draw on more traffic datasets from other downstream traffic classification tasks, e.g., encrypted application classification, website fingerprinting attacks on Tor, and IoT traffic classification, to verify the generality of Malcom. Furthermore, the team will explore the model's robustness against malware traffic that uses obfuscation strategies.
DOI:10.1007/s11704-025-41273-9

https://dx.doi.org/10.1007/s11704-025-41273-9

Attachments

System Overview of Malcom
The Construction Process of the Traffic Representation HPM

Regions: Asia, China

Keywords: Applied science, Computing
