Agile Data the Book

You can buy the book here, or read it now on O'Reilly OFPS. Work through the chapter code examples as you go, and don't forget to initialize your Python environment first. If any of the requirements fail to install in your virtualenv, try the corresponding Linux (apt-get, yum) or OS X (brew, port) packages.

Agile Data Code Examples

Set Up Your Python Virtual Environment

# From project root

# Set up the Python virtualenv
virtualenv -p `which python2.7` venv --distribute
source venv/bin/activate
pip install -r requirements.txt
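
Before working the chapter examples, it can help to confirm that the interpreter you are running is actually the one inside `venv`. The following is a minimal sketch, not part of the book's code; the function name `in_virtualenv` is hypothetical. It assumes that older virtualenv releases (typical with Python 2.7) set `sys.real_prefix`, while `venv` and newer virtualenv releases make `sys.prefix` differ from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter appears to run inside a virtual environment.

    Hypothetical helper for illustration; it is not part of the book's code.
    """
    # Older virtualenv releases (common with Python 2.7) set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases point sys.base_prefix at the
    # system interpreter, so it differs from sys.prefix when active.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active: %s" % in_virtualenv())
```

If this reports `False` after `source venv/bin/activate`, the shell is probably still picking up a system-wide Python.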

Download your Gmail Inbox!

# From ch3

# Download your gmail inbox
cd gmail
./gmail.py -m automatic -u [email protected] -p 'my_password_' -s ./email.avro.schema -f '[Gmail]/All Mail' -o /tmp/test_mbox 2>&1 &
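
The same invocation can be assembled from Python, which makes it possible to prompt for the password instead of typing it on the command line. This is a sketch under stated assumptions: the helper names `build_gmail_command` and `download_inbox` are hypothetical, the flags are copied from the command above without further knowledge of how `gmail.py` parses them, and the script is assumed to be run from the `gmail` directory. Unlike the shell command, this version runs in the foreground and does not redirect stderr.

```python
import getpass
import subprocess


def build_gmail_command(user, password,
                        schema="./email.avro.schema",
                        folder="[Gmail]/All Mail",
                        output="/tmp/test_mbox"):
    """Build the argument list for gmail.py using the flags shown above.

    Hypothetical helper; defaults mirror the example command.
    """
    return [
        "./gmail.py",
        "-m", "automatic",
        "-u", user,
        "-p", password,
        "-s", schema,
        "-f", folder,
        "-o", output,
    ]


def download_inbox(user):
    """Prompt for the password, then run gmail.py and return its exit code."""
    password = getpass.getpass("Gmail password: ")
    # Passing a list (no shell) avoids quoting problems with '[Gmail]/All Mail'.
    return subprocess.call(build_gmail_command(user, password))
```

Calling `download_inbox` with your own address from inside the `gmail` directory would then run the download; note that the password still appears in the child process's argument list.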

Chapter 2: Data

An example spreadsheet is available at ch02/Email Analysis.xlsb. Example Pig code is available at ch02/probability.pig.

Chapter 3: Agile Tools

The full tutorial is in the Chapter 3 README.

Chapter 4: To the Cloud!

Chapter 4 tutorial

Chapter 7: Collecting and Displaying Atomic Records

Chapter 7 tutorial

Chapter 8: Creating Charts

Chapter 8 tutorial

Chapter 9: Building Interactive Reports

Chapter 9 tutorial

Chapter 10: Making Predictions

Chapter 10 tutorial

Chapter 11: Driving Actions

Chapter 11 tutorial