Detecting DGA Activity in Network Data with Elastic ML - Oct 1, 2020 Elastic Stockholm Meetup

Elastic

Oct 1, 2020

After infecting a target machine, many malicious programs need to communicate with a command and control (C&C) server that is controlled by the malware author. To avoid detection and subvert defensive measures, malware authors employ domain generation algorithms (DGAs), which enable the malware to generate hundreds or thousands of new domains. The malware author then registers one of these domains as the location of the C&C server.
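To make the idea concrete, here is a minimal toy sketch of how a DGA can work. It is not taken from any real malware family: the function name `generate_domains`, the seed string, the date-based input, and the reserved `.example` suffix are all assumptions chosen for illustration. The point it shows is that both the malware and its author can derive the same list of candidate domains from a shared seed and the current date.

```python
import hashlib
from datetime import date
from typing import List


def generate_domains(seed: str, day: date, count: int = 5,
                     tld: str = ".example") -> List[str]:
    """Toy DGA: derive `count` pseudo-random domain names from a seed and a date.

    Hypothetical illustration only. Real DGAs differ widely in how they
    derive and encode their domains.
    """
    domains = []
    for i in range(count):
        # Hash the shared seed, the date, and a counter so that each
        # candidate is deterministic but looks random.
        material = f"{seed}-{day.isoformat()}-{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a through p.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for domain in generate_domains("toy-seed", date(2020, 10, 1)):
        print(domain)
```

Because the output depends only on the seed and the date, a defender cannot block the C&C location with a single static rule: the list changes every day, and only one of the candidates needs to resolve.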

Because this problem involves large volumes of data (think thousands of domains generated by the malware) and is not amenable to rule writing (most domains follow random-looking patterns), it is a great problem for machine learning to solve! In this talk, we will look at how to train a supervised classification model in the Elastic Stack to detect DGA domains, and at how to use inference processors and ingest pipelines to deploy this model so that it classifies network data at ingest time.
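As a rough sketch of the two pieces mentioned above, the snippet below shows one plausible way to turn a domain into character n-gram features, and the general shape of an ingest pipeline body that contains an inference processor. Several things here are assumptions rather than details from the talk: the choice of n-grams as features, the helper names `char_ngrams` and `build_inference_pipeline`, the model ID `dga-classifier`, the source field `dns.question.registered_domain`, the model input name `domain`, and the target field `ml_is_dga`. The pipeline is only built as a Python dictionary; sending it to a cluster is left out.

```python
import json
from typing import Dict, List


def char_ngrams(domain: str, n: int) -> List[str]:
    """Return the overlapping character n-grams of a domain label."""
    return [domain[i:i + n] for i in range(len(domain) - n + 1)]


def build_inference_pipeline(model_id: str, source_field: str,
                             model_input: str, target_field: str) -> Dict:
    """Build an ingest pipeline body with a single inference processor.

    All argument values are placeholders supplied by the caller; the
    trained model is assumed to already exist in the cluster.
    """
    return {
        "description": "Classify DNS query domains at ingest time (sketch)",
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    # Where the prediction is written on the document.
                    "target_field": target_field,
                    # Maps a document field to the field name the model expects.
                    "field_map": {source_field: model_input},
                    "inference_config": {
                        "classification": {"num_top_classes": 1}
                    },
                }
            }
        ],
    }


if __name__ == "__main__":
    print(char_ngrams("elastic", 2))
    pipeline = build_inference_pipeline(
        model_id="dga-classifier",                      # hypothetical model ID
        source_field="dns.question.registered_domain",  # assumed source field
        model_input="domain",                           # hypothetical model input
        target_field="ml_is_dga",                       # hypothetical output field
    )
    print(json.dumps(pipeline, indent=2))
```

If n-gram features are used, the same extraction has to run inside the pipeline before the inference processor so that documents reach the model in the same form as the training data. That extra step is omitted here.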

These two blog posts are useful background reading for this talk:

https://www.elastic.co/blog/machine-learning-in-cybersecurity-training-supervised-models-to-detect-dga-activity

https://www.elastic.co/blog/machine-learning-in-cybersecurity-detecting-dga-activity-in-network-data