Pre-processink-osquery

TL;DR: pre-processink-osquery is a preprocessing toolkit that helps transform osquery events stored on a fleet manager into a machine learning training data set.

pre-processink-osquery transforms data on a fleet manager into a training set in multiple stages, allowing for both automated rule-based labeling and manual human labeling. Both kinds of labeling are necessary to cover different use cases without making the labeling overwhelming for administrators.

The pre-processing workflow is as follows:

STAGE 0:

Currently, it queries the Elasticsearch backend of any fleet manager that stores osquery events. (The design is extensible, so it can be customized to query any data store containing any event source.)
It queries the events of one probe at a time.
It labels each event as RED/YELLOW/GREEN based on its features, their values, and the rules configured for those values.
If no rule is specified for an event's particular combination of features and values, the event is not labeled.
All events, both labeled and unlabeled, are written in textual form to an output file named <probe name>.csv.
One .csv file is created per probe in each run.
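
The labeling and CSV output described above can be sketched roughly as follows. This is a minimal illustration, not the toolkit's actual code: the `Rule` and `Event` types, the "first matching rule wins" policy, and the function names are all assumptions made for the example.

```go
package main

import (
	"encoding/csv"
	"log"
	"os"
)

// Rule is a hypothetical rule shape: when an event's Feature equals Value,
// the event receives Label ("RED", "YELLOW" or "GREEN").
type Rule struct {
	Feature string
	Value   string
	Label   string
}

// Event is a flattened osquery event: feature name -> textual value.
type Event map[string]string

// labelEvent returns the label of the first matching rule,
// or "" when no rule matches (the event stays unlabeled).
func labelEvent(ev Event, rules []Rule) string {
	for _, r := range rules {
		if v, ok := ev[r.Feature]; ok && v == r.Value {
			return r.Label
		}
	}
	return ""
}

// buildRows produces a header row followed by one row per event,
// with the label (possibly empty) as the last column.
func buildRows(events []Event, features []string, rules []Rule) [][]string {
	rows := [][]string{append(append([]string{}, features...), "label")}
	for _, ev := range events {
		row := make([]string, 0, len(features)+1)
		for _, f := range features {
			row = append(row, ev[f])
		}
		rows = append(rows, append(row, labelEvent(ev, rules)))
	}
	return rows
}

// writeProbeCSV writes the rows to "<probe>.csv" in the current directory.
func writeProbeCSV(probe string, rows [][]string) error {
	f, err := os.Create(probe + ".csv")
	if err != nil {
		return err
	}
	defer f.Close()
	return csv.NewWriter(f).WriteAll(rows) // WriteAll flushes before returning
}

func main() {
	// Example data only; real events would come from the data store query.
	rules := []Rule{{Feature: "path", Value: "/tmp/nc", Label: "RED"}}
	events := []Event{
		{"path": "/tmp/nc", "user": "root"},
		{"path": "/bin/ls", "user": "alice"}, // no rule matches: unlabeled
	}
	rows := buildRows(events, []string{"path", "user"}, rules)
	if err := writeProbeCSV("process_events", rows); err != nil {
		log.Fatal(err)
	}
}
```

Keeping unlabeled events in the file with an empty label column leaves them available for the manual labeling step later in the workflow.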

STAGE 1:

Merge all the probe-specific .csv files into a single merged CSV file called stage_1.csv.
Data can also be appended to a pre-existing stage_1.csv.
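
A rough sketch of the merge step is shown below. It assumes, purely for illustration, that every probe CSV starts with a header row and that all files share the same columns; the helper names `mergeRows` and `readCSV` are hypothetical and not taken from the repository.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"log"
	"os"
)

// mergeRows appends the data rows of each probe table to merged.
// When merged is empty, the first probe's header row becomes the header;
// otherwise probe headers are dropped (append mode).
func mergeRows(merged [][]string, probes ...[][]string) [][]string {
	for _, rows := range probes {
		if len(rows) == 0 {
			continue
		}
		if len(merged) == 0 {
			merged = append(merged, rows[0])
		}
		merged = append(merged, rows[1:]...)
	}
	return merged
}

// readCSV loads all rows of a CSV file; a missing file yields no rows.
func readCSV(path string) ([][]string, error) {
	f, err := os.Open(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return csv.NewReader(f).ReadAll()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: merge <probe1.csv> [probe2.csv ...]")
		return
	}
	// Start from a pre-existing stage_1.csv, if there is one.
	merged, err := readCSV("stage_1.csv")
	if err != nil {
		log.Fatal(err)
	}
	for _, path := range os.Args[1:] {
		rows, err := readCSV(path)
		if err != nil {
			log.Fatal(err)
		}
		merged = mergeRows(merged, rows)
	}
	out, err := os.Create("stage_1.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()
	if err := csv.NewWriter(out).WriteAll(merged); err != nil {
		log.Fatal(err)
	}
}
```

Separating the pure merge logic from file I/O makes the append case simple: the existing stage_1.csv is just the starting value of `merged`.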

MANUAL LABELING:

At this stage, a human administrator can manually label events that were left unlabeled or were incorrectly labeled by the automated rules.

STAGE 2:

In this stage, the labels and features are transformed into numeric values. The result is the training set, ready to be fed into an ML toolkit such as TensorFlow.
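
One possible way to do this transformation is sketched below. The encoding scheme is an assumption for the sake of the example, since the exact scheme is not specified here: labels map to fixed integers (GREEN=0, YELLOW=1, RED=2), each feature column gets its own vocabulary that assigns integer IDs in order of first appearance, and rows that are still unlabeled are skipped.

```go
package main

import (
	"fmt"
	"log"
)

// labelIDs is an assumed numeric mapping for the three labels.
var labelIDs = map[string]int{"GREEN": 0, "YELLOW": 1, "RED": 2}

// Vocab assigns a stable integer ID to each distinct textual value.
type Vocab map[string]int

// ID returns the ID for value, assigning the next free ID on first sight.
func (v Vocab) ID(value string) int {
	if id, ok := v[value]; ok {
		return id
	}
	id := len(v)
	v[value] = id
	return id
}

// encode converts rows (header first, label in the last column) into
// integer feature vectors and integer labels. Unlabeled rows are skipped.
func encode(rows [][]string) ([][]int, []int, error) {
	if len(rows) == 0 || len(rows[0]) == 0 {
		return nil, nil, nil
	}
	nFeat := len(rows[0]) - 1
	vocabs := make([]Vocab, nFeat)
	for i := range vocabs {
		vocabs[i] = Vocab{}
	}
	var features [][]int
	var labels []int
	for _, row := range rows[1:] {
		if len(row) != nFeat+1 {
			return nil, nil, fmt.Errorf("row has %d columns, want %d", len(row), nFeat+1)
		}
		label := row[nFeat]
		if label == "" {
			continue // still unlabeled: not usable for supervised training
		}
		id, ok := labelIDs[label]
		if !ok {
			return nil, nil, fmt.Errorf("unknown label %q", label)
		}
		vec := make([]int, nFeat)
		for i := 0; i < nFeat; i++ {
			vec[i] = vocabs[i].ID(row[i])
		}
		features = append(features, vec)
		labels = append(labels, id)
	}
	return features, labels, nil
}

func main() {
	// Example rows only; real input would be read from stage_1.csv.
	rows := [][]string{
		{"path", "user", "label"},
		{"/tmp/nc", "root", "RED"},
		{"/bin/ls", "alice", "GREEN"},
		{"/bin/ls", "root", ""},
		{"/tmp/nc", "alice", "YELLOW"},
	}
	features, labels, err := encode(rows)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(features, labels)
}
```

Integer IDs are only one option; depending on the model, one-hot or embedding-based encodings of the same vocabularies may be a better fit.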
