Computer Science Faculty Research

Text Mining for Security Threat Detection Discovering Hidden Information in Unstructured Log Messages

Candace Suh-Lee, University of Nevada, Las Vegas
Ju-yeon Jo, University of Nevada, Las Vegas
Yoohwan Kim, University of Nevada, Las Vegas

Document Type

Conference Proceeding

Publication Date

10-19-2016

Publication Title

IEEE Conference on Communications and Network Security (CNS)

Volume

2016

Abstract

The exponential growth of unstructured messages generated by computer systems and applications in modern computing environments poses a significant challenge in managing and using the information contained in those messages. Although these data contain a wealth of information that is useful for advanced threat detection, their sheer volume, variety, and complexity make them difficult to analyze, even for well-trained security analysts. While conventional Security Information and Event Management (SIEM) systems provide some capability to collect, correlate, and detect certain events from structured messages, their rule-based correlation and detection algorithms fall short in utilizing the information within unstructured messages. Our study explores the possibility of using techniques from data mining, text classification, natural language processing, and machine learning to detect security threats by extracting relevant information from various unstructured log messages collected from distributed, non-homogeneous systems. The extracted features are used to run a number of experiments on the Packet Clearing House SKAION 2006 IARPA Dataset, and their prediction capability is evaluated. In comparison with the base case without feature extraction, an average performance gain of 16.73% and a time reduction of 84% were achieved using extracted features only, and a 23.48% performance gain was attained using both unstructured free-text messages and extracted features. The results also show a strong potential for further increases in performance by increasing the size of the training datasets and extracting more features from the unstructured log messages.
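To make the idea of extracting features from unstructured log messages concrete, the following is a minimal illustrative sketch in Python. It is not the authors' method: the abstract does not list the features or tools used in the study, so the regular expressions, the keyword list, the function names, and the sample log lines below are all hypothetical assumptions chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical patterns and keywords (assumptions, not taken from the paper).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
USER = re.compile(r"\buser[ =](\w+)", re.IGNORECASE)
KEYWORDS = ("failed", "denied", "invalid", "error")


def extract_features(message: str) -> dict:
    """Turn one free-text log line into a small structured feature dict."""
    lowered = message.lower()
    user_match = USER.search(message)
    return {
        # Every dotted-quad address that appears in the message.
        "ips": IPV4.findall(message),
        # The first token following "user " or "user=", if any.
        "user": user_match.group(1) if user_match else None,
        # Which of the illustrative keywords occur in the message.
        "keywords": [k for k in KEYWORDS if k in lowered],
    }


def keyword_counts(messages) -> Counter:
    """Count keyword occurrences across a collection of log lines."""
    counts = Counter()
    for message in messages:
        counts.update(extract_features(message)["keywords"])
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOGS = [
    "sshd[2211]: Failed password for invalid user admin from 10.0.0.5 port 4022",
    "httpd: GET /index.html 200 from 192.168.1.20",
]

if __name__ == "__main__":
    for line in SAMPLE_LOGS:
        print(extract_features(line))
```

Structured fields like these could then be passed to a classifier in place of, or alongside, the raw message text, which is the kind of comparison the abstract reports.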

File Format

pdf

File Size

632 KB

Language

English

Repository Citation

Suh-Lee, C., Jo, J., Kim, Y. (2016). Text Mining for Security Threat Detection Discovering Hidden Information in Unstructured Log Messages. IEEE Conference on Communications and Network Security (CNS), 2016
http://dx.doi.org/10.1109/CNS.2016.7860492
