binarypig

Scalable binary data extraction in Hadoop.

BinaryPig is a malware processing and analytics platform built on top of Pig. Results can be explored through a web application built with Django, Twitter Bootstrap, and Elasticsearch.


Pig Libraries and Scripts

  • Build the BinaryPig JAR: ./build.sh
  • Install the dependencies for the binarypig modules (see the installation docs in the "docs" directory)
  • Use binarypig: run the Pig scripts in the "examples" directory

Webapp

  • Install the binarypig webapp:
    • cd webapp and create a local settings file
    • Set up the MySQL database
    • Install the dependencies with pip
    • Initialize the database
  • Run the binarypig webapp: ./manage.py runserver 0.0.0.0:8000
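
The local settings file mentioned above is not reproduced here. As a rough sketch only, a Django settings override pointing at a MySQL database could look like the following. The file name, the database name, and the BINARYPIG_* environment variable names are assumptions made for illustration and are not taken from the project; check the webapp directory for the names it actually expects.

```python
# local_settings.py (hypothetical file name, not confirmed by the project)
import os

# Assumption: development defaults; turn DEBUG off for anything public.
DEBUG = True

DATABASES = {
    "default": {
        # Django's built-in MySQL backend.
        "ENGINE": "django.db.backends.mysql",
        # The values below are placeholders read from hypothetical
        # environment variables, with illustrative fallbacks.
        "NAME": os.environ.get("BINARYPIG_DB_NAME", "binarypig"),
        "USER": os.environ.get("BINARYPIG_DB_USER", "binarypig"),
        "PASSWORD": os.environ.get("BINARYPIG_DB_PASSWORD", ""),
        "HOST": os.environ.get("BINARYPIG_DB_HOST", "127.0.0.1"),
        "PORT": os.environ.get("BINARYPIG_DB_PORT", "3306"),
    }
}
```

Reading credentials from the environment keeps them out of version control; hard-coded values would work just as well for a throwaway development VM.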

Issues

  • Some issues were encountered when running Python-based binarypig Pig jobs on CentOS:
    • SELinux must be disabled to keep Python processes from hanging
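
Before launching jobs, it may help to confirm the SELinux state on each node. The sketch below is not part of binarypig; it is a small, hypothetical helper that reads the kernel's SELinux status file. It assumes selinuxfs is mounted at /sys/fs/selinux (older CentOS releases mount it at /selinux instead) and treats a missing file as "disabled".

```python
import os


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'disabled', 'permissive', or 'enforcing'.

    Assumption: a missing status file means SELinux is disabled or the
    selinuxfs mount point differs from the default path used here.
    """
    if not os.path.exists(enforce_path):
        return "disabled"
    with open(enforce_path) as handle:
        value = handle.read().strip()
    # The kernel writes "1" when enforcing and "0" when permissive.
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print("SELinux mode:", selinux_mode())
```

If this reports anything other than "disabled" on a CentOS node, the hanging behavior described above is a plausible suspect.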

Getting Up and Running with Vagrant

  • A mini how-to for setting up BinaryPig on an Ubuntu 14.04 VM with Hadoop, Pig, Elasticsearch, and Django:
    • Clone the repository, create a local settings file, install the dependencies, initialize the database, and run the server
    • SSH into the VM and set up Oracle Java, MySQL, and the Django admin user

License

Licensed under the Apache 2.0 license. Copyright 2013 Endgame, Inc.


Contributors

  • Jason Trost
  • Telvis Calhoun
  • Zach Hanif



