Scalable Binary Data Extraction in Hadoop. Malware Processing and Analytics over Pig, Exploration through Django, Twitter Bootstrap, and Elasticsearch.
BinaryPig
A malware processing and analytics platform built on top of Pig, with exploration capabilities through Django, Twitter Bootstrap, and Elasticsearch.
Pig Libraries and Scripts
- Build BinaryPig JAR:
./build.sh
- Install dependencies for binarypig modules (see installation docs in "docs" directory)
- Use binarypig: run the Pig scripts in the "examples" directory
Webapp
- Install the binarypig webapp:
cd webapp
Then create a local settings file, set up the MySQL database, install the dependencies with pip, initialize the database, and run the server.
- Run the binarypig webapp:
./manage.py runserver 0.0.0.0:8000
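The install and run steps above might be sketched as the shell function below. This is only an illustration: the settings template name (`local_settings.py.example`), the database name (`binarypig`), and the `requirements.txt` file are assumptions, not names confirmed by this README, and `syncdb` applies to older Django releases (newer ones use `migrate`). By default the sketch only prints the commands; check the "docs" directory for the real steps before running anything.

```shell
#!/bin/sh
# Sketch only: file names and the database name are assumptions,
# not taken from the binarypig repository.

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_webapp() {
    run cd webapp || return 1
    # Hypothetical template name for the local settings file.
    run cp local_settings.py.example local_settings.py || return 1
    # Hypothetical database name; -p prompts for the MySQL root password.
    run mysql -u root -p -e 'CREATE DATABASE binarypig' || return 1
    # Assumes the dependencies are listed in requirements.txt.
    run pip install -r requirements.txt || return 1
    # syncdb is the pre-1.9 Django command; newer versions use migrate.
    run ./manage.py syncdb || return 1
    # The run command given in this README.
    run ./manage.py runserver 0.0.0.0:8000
}
```

Calling `setup_webapp` prints the six commands for review; `DRY_RUN=0 setup_webapp` would execute them in order and stop at the first failure.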
Issues
- Python-based binarypig Pig jobs have run into problems on CentOS
- SELinux must be disabled to keep Python processes from hanging
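A quick check of the SELinux state before launching jobs could look like the sketch below. It relies on the standard `getenforce` utility; the `selinux_mode` helper name is made up for this example. Note that `setenforce 0` only switches to permissive mode until the next reboot, and this README does not say whether permissive mode is enough, so the persistent route is to set `SELINUX=disabled` in `/etc/selinux/config` and reboot.

```shell
#!/bin/sh
# Report the SELinux mode, or "Unavailable" when getenforce is not installed.
# selinux_mode is a hypothetical helper name used only in this sketch.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "Unavailable"
    fi
}

mode="$(selinux_mode)"
case "$mode" in
    Enforcing)
        echo "SELinux is enforcing; Python-based binarypig jobs may hang." ;;
    *)
        echo "SELinux mode: $mode" ;;
esac

# To disable SELinux persistently, set SELINUX=disabled in
# /etc/selinux/config and reboot. 'sudo setenforce 0' only switches
# to permissive mode until the next reboot.
```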
Getting Up and Running with Vagrant
- A mini how-to for setting up BinaryPig on an Ubuntu 14.04 VM with Hadoop, Pig, Elasticsearch, and Django
- Clone the repository, create a local settings file, install the dependencies, initialize the database, and run the server
- SSH into the VM, then set up Oracle Java, MySQL, and a Django admin user
License
Licensed under the Apache 2.0 license, Copyright 2013 Endgame, Inc.
Contributors
- Jason Trost
- Telvis Calhoun
- Zach Hanif