data_hacking

Examples of using IPython, pandas, and scikit-learn to get the most out of your security data.

Data Hacking Project

A collection of projects and exercises related to data hacking.

Projects

Hacking Clustering: A project that uses clustering algorithms to group similar binary files.
- Includes two notebooks: one for PE (Portable Executable) files, and another for Mach-O (Mach Object) files.
SWF Classification: A project that classifies SWF (Shockwave Flash) files using various classification techniques.
- Includes links to the notebook viewer and the GitHub repository.
Java Class File Classification: A project that classifies Java class files using various classification techniques.
- Includes links to the notebook viewer and the GitHub repository.

Related Notebooks

PE File Similarity Graph using Workbench: A notebook that creates a graph of similar PE (Portable Executable) files.
Windows Executable Clustering by Image Similarity: A notebook that clusters Windows executables based on their image similarity.
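
To make the idea of a similarity graph concrete, here is a minimal sketch in plain Python 3. It is not the method used in the notebooks: it assumes each file has already been reduced to a set of string features, scores pairs with Jaccard similarity, and groups files by the connected components of the resulting graph. The file names, feature sets, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(features, threshold=0.5):
    """Return (name1, name2, score) for every pair scoring at or above threshold."""
    edges = []
    for n1, n2 in combinations(sorted(features), 2):
        score = jaccard(features[n1], features[n2])
        if score >= threshold:
            edges.append((n1, n2, score))
    return edges


def connected_groups(names, edges):
    """Group names into the connected components of the similarity graph."""
    neighbors = {name: set() for name in names}
    for n1, n2, _ in edges:
        neighbors[n1].add(n2)
        neighbors[n2].add(n1)
    seen, groups = set(), []
    for name in sorted(names):
        if name in seen:
            continue
        # Depth-first walk over everything reachable from this file.
        stack, group = [name], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical feature sets (for example, imported symbol names); not real data.
features = {
    "a.exe": {"CreateFileA", "ReadFile", "WriteFile", "CloseHandle"},
    "b.exe": {"CreateFileA", "ReadFile", "WriteFile", "Sleep"},
    "c.exe": {"socket", "connect", "send"},
}
edges = similarity_edges(features, threshold=0.5)
groups = connected_groups(features, edges)
print(groups)
```

With these made-up inputs, a.exe and b.exe share three of five distinct features and end up in one group, while c.exe stays on its own. A real workflow would likely extract features with a tool such as pefile and draw the graph with networkx.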

Setup

Required packages:

System packages (install with brew or apt-get): graphviz, freetype, and zmq
Python: ipython, pygraphviz, pandas, matplotlib, networkx, pyzmq, jinja2, scipy, patsy, statsmodels, and pefile
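
As a quick sanity check, the sketch below reports which of the Python packages listed above cannot be found in the current environment. It assumes Python 3.4 or later (for importlib.util.find_spec); the helper name missing_modules is made up for this example and is not part of data_hacking.

```python
import importlib.util

# Import names for the Python packages listed above. Two differ from the
# package name: ipython is imported as IPython, and pyzmq as zmq.
REQUIRED_MODULES = [
    "IPython", "pygraphviz", "pandas", "matplotlib", "networkx", "zmq",
    "jinja2", "scipy", "patsy", "statsmodels", "pefile",
]


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All required Python packages were found.")
```

This only checks that each module can be located; it does not verify versions or that the system libraries (graphviz, freetype, zmq) are installed.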

Some exercises use packages from the data_hacking repository. To install them, run the following from the directory that contains setup.py:

$ sudo python setup.py install

To uninstall, run:

$ sudo pip uninstall data_hacking

Install IPython

Install IPython the usual way for your platform, for example with pip or your system package manager.

Running Notebooks

Most notebooks use relative paths to resources, data files, or images. To run a notebook, change into its project directory and start IPython through this alias, which sets the notebook directory to the current directory:

alias ipython='ipython notebook --FileNotebookManager.notebook_dir=`pwd`'

Then, run:

$ cd data_hacking/fun_with_syslog
$ ipython    # aliased as above
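
The reason the starting directory matters is that a relative path is resolved against the directory the notebook runs in. The sketch below illustrates this with a hypothetical resource name (data/example.log is not a file from the repository) and a made-up helper, resolve_resource.

```python
import os


def resolve_resource(project_dir, relative_path):
    """Resolve a notebook's relative resource path against a starting directory."""
    return os.path.normpath(os.path.join(project_dir, relative_path))


# Hypothetical file name, for illustration only.
relative = os.path.join("data", "example.log")

# Path the notebook would open when started from the project directory...
from_project = resolve_resource(
    os.path.join("data_hacking", "fun_with_syslog"), relative)

# ...versus when started one level up, where the file would not be found.
from_root = resolve_resource("data_hacking", relative)

print(from_project)
print(from_root)
```

The two results differ, which is why a notebook started from the wrong directory may fail to load its data files or images.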
