Salad

A Content Anomaly Detector based on n-Grams

Manual Page

NAME

salad − A Content Anomaly Detector Based on n-Grams

SYNOPSIS

salad [<mode>] [options]

DESCRIPTION

Letter Salad, or Salad for short, is an efficient and flexible implementation of the well-known anomaly detection method Anagram by Wang et al. (RAID 2006).

Salad enables detecting anomalies in large-scale string data. The tool is based on the concept of n-grams: strings are compared using all of their substrings of length n. During training, cf. salad-train(1), these n-grams are extracted from a collection of strings and stored in a Bloom filter. This enables the detector to represent a large number of n-grams in very little memory. During anomaly detection, the n-grams of unknown strings are matched against the Bloom filter, and strings containing several n-grams not seen during training are flagged as anomalous.
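The training and detection steps described above can be illustrated with a minimal Python sketch. It is not Salad's implementation: the class BloomFilter, the functions byte_ngrams, train and anomaly_score, the filter size, the SHA-256 based hashing, and the use of the fraction of unseen n-grams as the score are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: k hash positions per item over an m-bit array.

    Hypothetical helper for illustration; sizes and hashing are assumptions.
    """

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k positions by salting a single hash function with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, but never false negatives.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def byte_ngrams(data, n):
    """Return all substrings of length n (sliding window, step 1)."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def train(strings, n):
    """Store the n-grams of all training strings in a Bloom filter."""
    bloom = BloomFilter()
    for s in strings:
        for gram in byte_ngrams(s, n):
            bloom.add(gram)
    return bloom


def anomaly_score(bloom, data, n):
    """Fraction of the n-grams in data that were not seen during training."""
    grams = byte_ngrams(data, n)
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in bloom)
    return unseen / len(grams)


if __name__ == "__main__":
    model = train([b"GET /index.html", b"GET /about.html"], n=3)
    print(anomaly_score(model, b"GET /index.html", n=3))
    print(anomaly_score(model, b"\x90\x90\x90\x90\x90\x90", n=3))
```

A string that was part of the training data always scores 0.0 in this sketch, because a Bloom filter never forgets an inserted item. Strings made of unseen n-grams score close to 1.0, up to the filter's small false-positive rate.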

Salad extends the original method Anagram in several ways: First, the tool does not only operate on n-grams of bytes, but is also capable of comparing n-grams over bits or words and tokens. Second, Salad implements a 2-class version of the detector that enables discriminating strings of two types, cf. salad-predict(1) for more details. Finally, the tool features built-in inspection and statistics modes that can help to analyze the learned Bloom filter and its predictions, cf. salad-inspect(1) and salad-stats(1) respectively.
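To make the difference between the n-gram types concrete, the following Python sketch extracts n-grams over bits and over words from the same kind of input. The function names bit_ngrams, tokenize and word_ngrams and the whitespace delimiter set are hypothetical; how Salad itself splits input into tokens is described in the mode-specific man pages and is not reproduced here.

```python
def bit_ngrams(data, n):
    """n-grams over the bit string of data, e.g. b"\\x0f" -> "00001111"."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return [bits[i:i + n] for i in range(len(bits) - n + 1)]


def tokenize(text, delimiters=" \t\r\n"):
    """Split text into words at any delimiter character (assumed set)."""
    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def word_ngrams(text, n, delimiters=" \t\r\n"):
    """n-grams over words: each n-gram is a tuple of n consecutive tokens."""
    tokens = tokenize(text, delimiters)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    print(bit_ngrams(b"\x0f", 4))
    print(word_ngrams("GET /index.html HTTP/1.1", 2))
```

Word n-grams capture the order of whole tokens rather than of individual bytes, which can suit structured text such as protocol requests, while bit n-grams work at a finer granularity than bytes.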

The tool can be utilized in different fields of application. For example, the concept underlying Salad has been prominently used for intrusion detection, but is not limited to this scenario.

OPTIONS

The options depend on the provided mode of operation. If no mode is specified, the following generic options are available:

--help

Print the help screen.

--version

Print version and copyright.

MODES

There are two ways of accessing salad’s modes: (1) as a command-line option to the main executable, or (2) as a stand-alone executable prefixed with salad-.

The following list contains the names of the stand-alone executables, for which individual man pages are available:

salad-train(1)
Trains the anomaly detector.

salad-predict(1)
Predicts the anomaly score of the specified data.

salad-stats(1)
Provides statistical information about a trained anomaly detector.

salad-inspect(1)
Analyzes the specified data with respect to the n-gram model used by the detector.

COPYRIGHT

Copyright (c) 2012-2015, Christian Wressnegger
All rights reserved.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.