In the project, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
In the project, we develop criteria for comparing and evaluating explanation methods in the context of computer security. These cover general properties, such as the accuracy of explanations, as well as security- focused aspects, such as the completeness, efficiency, and robustness. We observe significant differences between explanation methods and build on these to derive general recommendations for selecting and applying them in computer security.
This project studies image-scaling attacks, a new form of attacks that allow an adversary to manipulate images, such that they change their content during downscaling. Image-scaling attacks are a considerable threat, as scaling is omnipresent in computer vision. Moreover, these attacks are agnostic to the learning model and training data, affecting any learning-based system operating on images.
In this research project we explore similarities between machine learning and digital watermarking under attack. As part of the project, we have developed a unified view on attacks in both domains and created a framework for modeling evasion and poisoning attacks. The code and datasets of our case studies are publicly available.
In this project, we attack methods for authorship attribution of source code using adversarial learning. We exploit that these methods rest on machine learning and thus can be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead the attribution but appear plausible to a developer. Our attack and the datasets are publicly available.
Joern is a platform for robust analysis of C/C++ code. It generates code property graphs, a novel graph representation of code that exposes the code’s syntax, control-flow, data-flow and type information. Code property graphs are stored in a graph database. This allows code to be mined using search queries formulated in the graph traversal language Gremlin. Joern forms the basis for assisted vulnerability discovery using machine learning techniques.
Pulsar is a network fuzzer with automatic protocol learning and simulation capabilites. The tool allows to model a protocol through machine learning techniques, such as clustering and hidden Markov models. These models can be used to simulate communication between Pulsar and a real client or server thanks to semantically correct messages which, in combination with a series of fuzzing primitives, allow to test the implementation of an unknown protocol for errors in deeper states of its protocol state machine.
The Drebin dataset consists of roughly 5,000 malicious Android applications that have been collected as part of the Mobile Sandbox project between 2010 and 2012. The dataset can be used to experiment with Android malware and compare different detection approaches.
Adagio is a collection of Python modules for analyzing and detecting Android malware. These modules allow to extract labeled call graphs from Android APKs or DEX files and apply an explicit feature map that captures their structural relationships. Additional modules provide classes for designing binary or multiclass classification experiments and applying machine learning for detection of malicious structure.
Malheur is a tool for the automatic analysis of program behavior recorded from malware. It has been designed to support the regular analysis of malware and the development of detection and defense measures. Malheur allows for identifying novel classes of malware with similar behavior and assigning unknown malware to discovered classes using machine learning.
Harry is a tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some excotic similarity measures. The focus lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein and Jaro-Winkler distance.
Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining for analysis of string data. Sally can applied to several types of string data, such as text documents, DNA sequences or log files, where it can handle common formats such as directories, archives and text files.
Letter Salad, or Salad for short, is an efficient and flexible implementation of the anomaly detection method Anagram. The method uses n-grams (substrings of length n) maintained in a Bloom filter for efficiently detecting anomalies in large sets of string data. Salad extends the original method by supporting n-grams of bytes as well n-grams of words and tokens.