A machine learning approach to dataset imputation for software vulnerabilities

Authors: Rostami, S., Kleszcz, A., Dimanov, D. and Katos, V.

Journal: Communications in Computer and Information Science

Volume: 1284 CCIS

Pages: 25-36

eISSN: 1865-0937

ISSN: 1865-0929

DOI: 10.1007/978-3-030-59000-0_3

Abstract:

This paper proposes a supervised machine learning approach for the imputation of missing categorical values in a dataset where the majority of samples are incomplete. Twelve models have been designed that can predict nine of the twelve Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) tactic categories using only the Common Attack Pattern Enumeration and Classification (CAPEC). The proposed method has been evaluated on a test dataset consisting of 867 unseen samples, with the classification accuracy ranging from 99.88% to 100%. These models were employed to generate a more complete dataset with no missing ATT&CK tactic features.
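
The imputation idea in the abstract can be illustrated with a minimal sketch: train one binary classifier per ATT&CK tactic on the samples where that tactic is known, then use it to fill in the samples where it is missing. The sketch is not the authors' implementation. The abstract does not name the model family, so a small Bernoulli naive Bayes classifier is swapped in here purely for illustration. The record layout, the helper names (`BernoulliNB`, `impute_tactics`), the placeholder identifiers such as `CAPEC-A`, and the toy labels are all assumptions invented for this example.

```python
import copy
import math
from collections import defaultdict


class BernoulliNB:
    """Tiny Bernoulli naive Bayes over sets of categorical tokens (illustrative only)."""

    def fit(self, samples, labels):
        # Vocabulary: every token seen in the training samples.
        self.vocab = sorted(set().union(*samples))
        self.n = {0: 0, 1: 0}
        self.count = {0: defaultdict(int), 1: defaultdict(int)}
        for feats, y in zip(samples, labels):
            self.n[y] += 1
            for f in feats:
                self.count[y][f] += 1
        return self

    def _log_score(self, feats, y):
        total = self.n[0] + self.n[1]
        # Laplace-smoothed class prior.
        score = math.log((self.n[y] + 1) / (total + 2))
        for f in self.vocab:
            # Laplace-smoothed probability that token f is present in class y.
            p = (self.count[y][f] + 1) / (self.n[y] + 2)
            score += math.log(p if f in feats else 1 - p)
        return score

    def predict(self, feats):
        return int(self._log_score(feats, 1) > self._log_score(feats, 0))


def impute_tactics(records, tactics):
    """Return a copy of records with missing (None) tactic labels predicted.

    Each record is assumed to look like:
        {"capec": {"CAPEC-A", ...}, "tactics": {"initial-access": 0 | 1 | None}}
    """
    completed = copy.deepcopy(records)
    for tactic in tactics:
        # Train only on samples where this tactic label is known.
        known = [r for r in completed if r["tactics"].get(tactic) is not None]
        if not known:
            continue  # nothing to learn from; leave the gaps as they are
        model = BernoulliNB().fit(
            [r["capec"] for r in known],
            [r["tactics"][tactic] for r in known],
        )
        for r in completed:
            if r["tactics"].get(tactic) is None:
                r["tactics"][tactic] = model.predict(r["capec"])
    return completed


# Toy data: identifiers and labels are made up, not taken from CAPEC or ATT&CK.
records = [
    {"capec": {"CAPEC-A"}, "tactics": {"initial-access": 1}},
    {"capec": {"CAPEC-A", "CAPEC-B"}, "tactics": {"initial-access": 1}},
    {"capec": {"CAPEC-C"}, "tactics": {"initial-access": 0}},
    {"capec": {"CAPEC-C", "CAPEC-D"}, "tactics": {"initial-access": 0}},
    {"capec": {"CAPEC-A"}, "tactics": {"initial-access": None}},
    {"capec": {"CAPEC-C"}, "tactics": {"initial-access": None}},
]
completed = impute_tactics(records, ["initial-access"])
```

Training a separate model per tactic keeps each prediction a simple yes/no decision and lets every model use all samples in which its own label happens to be present, which matters when most samples are incomplete. In practice the models would be evaluated on held-out samples, as the abstract describes, before being used to fill the gaps.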

https://eprints.bournemouth.ac.uk/34258/

Source: Scopus