Installation Guide (v0.2beta) old

Florian Schoppmann edited this page Jul 8, 2011 · 1 revision

Prerequisites for MADlib:

  • Greenplum Database >= 3.3 or PostgreSQL >= 8.4
  • Python >= 2.6 and PyGreSQL module:
    • Greenplum: both come with the DB package
    • PostgreSQL: must be installed manually
  • LAPACK library and header files (http://www.netlib.org/lapack/):
    • Mac OS X: LAPACK comes preinstalled (as part of the Accelerate framework)
    • RedHat/CentOS: LAPACK comes with the lapack-devel package
    • Solaris: LAPACK comes with the Oracle Performance Studio
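
The Python part of the prerequisites above can be checked from a shell before building. The following is only a sketch: `version_ge` and `check_min` are illustrative helpers (not part of MADlib), the comparison handles purely numeric dotted versions, and the PyGreSQL test assumes the classic `pg` module name.

```shell
#!/bin/sh
# Sketch of a prerequisite check. version_ge and check_min are illustrative
# helpers, not part of MADlib. Only numeric dotted versions are handled.

# version_ge HAVE WANT: succeed if dotted version HAVE >= WANT.
version_ge() {
    have=$1 want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        # Take the leading field of each version string.
        h=${have%%.*} w=${want%%.*}
        # Drop the field just read; an exhausted string becomes empty.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
        # Missing fields count as 0, so 2.6 compares like 2.6.0.
        h=${h:-0} w=${w:-0}
        if [ "$h" -gt "$w" ]; then return 0; fi
        if [ "$h" -lt "$w" ]; then return 1; fi
    done
    return 0
}

# check_min NAME FOUND_VERSION MIN_VERSION: print a one-line verdict.
check_min() {
    if version_ge "$2" "$3"; then
        echo "ok: $1 $2 (>= $3)"
    else
        echo "too old: $1 $2 (need >= $3)"
        return 1
    fi
}

# Live checks, run only if a python interpreter is on the PATH.
if command -v python >/dev/null 2>&1; then
    check_min python "$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')" 2.6
    if python -c 'import pg' 2>/dev/null; then
        echo "ok: PyGreSQL"
    else
        echo "missing: PyGreSQL (module pg)"
    fi
else
    echo "missing: python"
fi
```

On Greenplum, source the database environment first so that the Python shipped with the DB package is the one being tested.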

Installing from Binaries:

Coming soon!

Installing from Source:

Requires: cmake 2.6.3 or higher

  1. Source your database environment:

    * Greenplum: Log in to the master node using the gpadmin account and set up the Greenplum environment:

   . $GPHOME/greenplum_path.sh

    * PostgreSQL: Log in to the database server and make sure the right pg_config is accessible, e.g.:

   export PATH=$PATH:/Library/PostgreSQL/9.0/bin

  2. Download the latest MADlib source code from the Beta or Master branch: https://github.com/madlib/madlib/zipball/master. Unzip it to any location (let's call it $MADLIB_SOURCE).

  3. Build the MADlib package.

    NOTE: By default, MADlib installs into /usr/local/madlib, so if you run this as a non-root user you should choose a different directory (let's call it $MADLIB_TARGET):

   cd $MADLIB_SOURCE
   # ./configure is a wrapper around cmake. As such, you can supply CMake arguments, e.g.:
   # a custom location for your database:
   # ./configure -DCMAKE_PREFIX_PATH=/Library/PostgreSQL/9.0/
   # a custom installation directory:
   # ./configure -DCMAKE_INSTALL_PREFIX=$MADLIB_TARGET
   ./configure
   cd build/
   make install
  4. Greenplum multi-node cluster only: Push the MADlib package to the segment nodes using the Greenplum database utilities:
   cd $HOME
   tar -cvf ./madlib.tar ./madlib
   gpscp -f seg_host_file ./madlib.tar =:$HOME
   gpssh -f seg_host_file << EOF
      cd $HOME
      tar -xvf ./madlib.tar
      rm -f ./madlib.tar
   EOF
  5. To install MADlib objects into a database schema (for example: host=localhost:5432, db=testdb, user=gpadmin), run:

    * Greenplum:

   $MADLIB_TARGET/madlib/bin/madpack -p greenplum -c gpadmin@localhost:5432/testdb install

    * PostgreSQL:

   $MADLIB_TARGET/madlib/bin/madpack -p postgres -c postgres@localhost:5432/testdb install
  6. To confirm that your installation is correct, run the install check command, e.g. for Greenplum:

   $MADLIB_TARGET/madlib/bin/madpack -p greenplum -c gpadmin@localhost:5432/testdb install-check

Done. Visit [MADlib User Documentation](http://doc.madlib.net) for more information about MADlib functions.
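
The build, install, and install-check steps above can be strung together in one script, roughly as follows. This is a sketch under assumptions: `DRY_RUN`, `run`, `conn_string`, `PLATFORM`, `DB_USER`, and the default paths are made up for illustration. Only the `./configure`, `make install`, and `madpack` invocations are taken from the guide. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the build, install and install-check steps as one script.
# DRY_RUN, run(), conn_string() and the defaults below are illustrative,
# not part of MADlib.
DRY_RUN=${DRY_RUN:-1}            # 1 = only print the commands

# Print a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# conn_string USER HOST PORT DB: print USER@HOST:PORT/DB,
# the form passed to madpack -c in this guide.
conn_string() {
    printf '%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

MADLIB_SOURCE=${MADLIB_SOURCE:-$HOME/madlib-src}      # hypothetical location
MADLIB_TARGET=${MADLIB_TARGET:-$HOME/madlib-install}  # hypothetical location
PLATFORM=${PLATFORM:-postgres}                        # or: greenplum
CONN=$(conn_string "${DB_USER:-postgres}" localhost 5432 testdb)
MADPACK=$MADLIB_TARGET/madlib/bin/madpack

# Each step runs only if the previous one succeeded.
run cd "$MADLIB_SOURCE" &&
run ./configure "-DCMAKE_INSTALL_PREFIX=$MADLIB_TARGET" &&
run cd build &&
run make install &&
run "$MADPACK" -p "$PLATFORM" -c "$CONN" install &&
run "$MADPACK" -p "$PLATFORM" -c "$CONN" install-check
```

Running it with `DRY_RUN=0` executes the commands instead of printing them. For Greenplum, set `PLATFORM=greenplum` and `DB_USER=gpadmin`, and push the package to the segment nodes before the `madpack` steps.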

## Running Multiple Versions:

To install another MADlib package (of any version) on a system with an existing MADlib installation, follow these hints:

1. When building from source, set the `CMAKE_INSTALL_PREFIX` variable to the desired new location, e.g.:

   `./configure -DCMAKE_INSTALL_PREFIX=$NEW_MADLIB_TARGET`

2. When installing from binaries, e.g. from an RPM, add the `--relocate` parameter:

   `rpm -i MADlib-ver_XYZ.rpm --relocate /usr/local=/new/madlib/target`

3. When installing database objects from any MADlib package:
   * Source the proper database environment.
   * Call the `madpack` installer from `$NEW_MADLIB_TARGET` and point the connection string to the desired database/schema.
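
One way to keep several versions apart is to give each its own install prefix and always call the `madpack` that lives under that prefix. The sketch below only prints the resulting commands. `MADLIB_BASE`, `prefix_for`, `madpack_for`, and the version label are hypothetical names chosen for illustration, and the `madpack` path layout follows the one used earlier in this guide.

```shell
#!/bin/sh
# Sketch: one install prefix per MADlib version.
# MADLIB_BASE, prefix_for and madpack_for are hypothetical helpers.
MADLIB_BASE=${MADLIB_BASE:-$HOME/madlib-versions}

# prefix_for VERSION: print the install prefix for that version.
prefix_for() {
    printf '%s/%s\n' "$MADLIB_BASE" "$1"
}

# madpack_for VERSION: print the path of that version's madpack installer.
madpack_for() {
    printf '%s/madlib/bin/madpack\n' "$(prefix_for "$1")"
}

# Print the commands for a second installation labelled v0.2beta.
NEW_MADLIB_TARGET=$(prefix_for v0.2beta)
echo "./configure -DCMAKE_INSTALL_PREFIX=$NEW_MADLIB_TARGET"
echo "$(madpack_for v0.2beta) -p postgres -c postgres@localhost:5432/testdb install"
```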