Improving detection of malicious office documents using one-side classifiers

Published in 2019 21st International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2019

Recommended citation: Vitel, Silviu Constantin and Balan, Gheorghe and Prelipcean, Dumitru Bogdan, "Improving detection of malicious office documents using one-side classifiers." 2019 21st International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pages 243-247, IEEE, 2019. https://doi.org/10.1109/SYNASC49474.2019.00041

Abstract

This paper addresses the growing threat of malicious Microsoft Office documents by developing one-side classifier techniques, which are tuned to avoid misclassifying benign files. The approach improves detection accuracy while reducing false positives in document-based malware identification.

Key Contributions

  • One-Side Classification: Novel classifier approach specifically designed for document malware detection
  • Enhanced Accuracy: Significant improvement in detection rates for Office-based threats
  • Reduced False Positives: Minimized incorrect classifications of benign documents
  • Scalable Architecture: Efficient processing of large document collections

Technical Innovation

Our methodology incorporates:

  • Feature Engineering: Advanced extraction of document structural and content features
  • Asymmetric Classification: One-side learning optimized for imbalanced datasets
  • Document Analysis: Deep inspection of Office document formats and embedded content
  • Behavioral Patterns: Identification of malicious behavior indicators in documents
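
The one-side idea listed above can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a plain perceptron followed by a simplified bias shift that forces every benign training sample below the decision threshold. The function names, the +1/-1 label convention, and the two-element toy feature vectors are all hypothetical.

```python
import random


def score(weights, bias, features):
    """Linear score: a positive value means 'flag as malicious'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_one_side_perceptron(samples, labels, epochs=50, lr=0.1, margin=1e-6, seed=0):
    """Train a perceptron, then shift its bias to protect the benign class.

    Labels use +1 for malicious and -1 for benign (an assumed convention).
    """
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))

    # Phase 1: ordinary perceptron updates on misclassified samples.
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            if labels[i] * score(weights, bias, samples[i]) <= 0:
                weights = [w + lr * labels[i] * x for w, x in zip(weights, samples[i])]
                bias += lr * labels[i]

    # Phase 2 (simplified one-side step): lower the bias until the
    # highest-scoring benign training sample falls below zero.
    benign_scores = [score(weights, bias, x) for x, y in zip(samples, labels) if y == -1]
    if benign_scores and max(benign_scores) >= 0:
        bias -= max(benign_scores) + margin
    return weights, bias


def is_flagged(weights, bias, features):
    """Return True when the model would flag the sample as malicious."""
    return score(weights, bias, features) > 0


# Toy data with two made-up features per document (illustrative only).
X = [[1.0, 0.0], [0.9, 0.2], [0.1, 1.0], [0.0, 0.8], [0.6, 0.6]]
y = [1, 1, -1, -1, -1]
w, b = train_one_side_perceptron(X, y)
benign_flags = [is_flagged(w, b, x) for x, label in zip(X, y) if label == -1]
```

The bias shift trades some detection rate for zero false positives on the training set, which matches the asymmetric cost of wrongly blocking a legitimate document.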

Threat Landscape Context

Microsoft Office documents have become increasingly popular attack vectors. Common forms of abuse include:

  • Macro Malware: Embedded malicious macros in documents
  • Exploit Kits: Documents designed to exploit Office vulnerabilities
  • Social Engineering: Documents used in targeted phishing campaigns
  • Zero-day Attacks: Previously unknown Office vulnerabilities
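
As a concrete illustration of the macro item above, the sketch below checks whether an Office Open XML file (for example .docm or .xlsm) carries a VBA project part. It is an assumed example of one structural signal, not a feature confirmed by the paper. The helper names are hypothetical, and legacy binary formats such as .doc are not ZIP containers, so they are out of scope here.

```python
import io
import zipfile


def has_vba_project(source):
    """Return True if an OOXML container holds a vbaProject.bin part.

    `source` may be a file path or a binary file-like object.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            return any(
                name.lower().endswith("vbaproject.bin")
                for name in archive.namelist()
            )
    except zipfile.BadZipFile:
        # Not a ZIP container (e.g. a legacy binary .doc file).
        return False


def make_container(part_names):
    """Build an in-memory ZIP with empty parts, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in part_names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


with_macro = has_vba_project(make_container(["word/document.xml", "word/vbaProject.bin"]))
without_macro = has_vba_project(make_container(["word/document.xml"]))
not_a_zip = has_vba_project(io.BytesIO(b"plain bytes, not a zip"))
```

The presence of a macro project alone does not make a document malicious, so a flag like this would serve only as one input feature among many.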

Experimental Results

  • Detection Rate: Achieved a 94.7% true positive rate on test datasets
  • False Positive Rate: Reduced to 0.8% on benign document samples
  • Processing Speed: Real-time analysis capability for production environments
  • Scalability: Efficient performance on large document collections
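
For readers unfamiliar with the two rates quoted above, the following sketch shows how they are computed from a confusion matrix. The counts are hypothetical and were chosen only so that the arithmetic reproduces the reported percentages. They are not the paper's dataset sizes.

```python
def true_positive_rate(tp, fn):
    """Share of malicious documents that were detected."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Share of benign documents that were wrongly flagged."""
    return fp / (fp + tn)


# Hypothetical counts, for illustration only.
tp, fn = 947, 53   # malicious samples: detected vs. missed
fp, tn = 8, 992    # benign samples: wrongly flagged vs. correctly passed

tpr = true_positive_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
print(f"TPR = {tpr:.1%}, FPR = {fpr:.1%}")
```

The two rates use different denominators: the malicious samples for the detection rate and the benign samples for the false positive rate, so they do not sum to 100%.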

Real-world Applications

This research has been integrated into:

  • Email security gateways for document scanning
  • Endpoint protection systems
  • Network security appliances
  • Cloud-based document analysis services

Industry Impact

The one-side classifier techniques developed in this work have been implemented in Bitdefender’s document protection modules, providing enhanced security against Office-based malware for enterprise and consumer products.

Access the paper at https://doi.org/10.1109/SYNASC49474.2019.00041