Dockerize cluster #23

Open: wants to merge 1 commit into base: master
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
.python-version
generated/*
build-all.sh
build-conf.sh
deploy.sh
provision.sh
87 changes: 40 additions & 47 deletions README.md
@@ -1,47 +1,40 @@
# Druid Docker Image

## Run a simple Druid cluster

[Install Docker](docker-install.md)

Download and launch the Docker image:

```sh
docker pull druidio/example-cluster
docker run --rm -i -p 3000:8082 -p 3001:8081 -p 3090:8090 druidio/example-cluster
```

Wait a minute or so for Druid to start up and download the sample.

On OS X

- List datasources

```
curl http://$(docker-machine ip default):3000/druid/v2/datasources
```

- Access the coordinator console

```
open http://$(docker-machine ip default):3001/
```

On Linux

- List datasources

```
curl http://localhost:3000/druid/v2/datasources
```

- Access the coordinator console at http://localhost:3001/

## Build Druid Docker Image

To build the Docker image yourself:

```sh
git clone https://github.com/druid-io/docker-druid.git
docker build -t example-cluster docker-druid
```
# Deploy Druid Cluster with Docker

For an all-in-one Docker container (all Druid services running in a single container), see the [all-in-one README](all-in-one/README.md).

### Prerequisites:
1. Install Docker and docker-machine on your development machine
1. Register a Docker Hub account if you don't have one; you will need it to push and pull your Docker images
1. Install Jinja2 using pip
`$ pip install jinja2`
1. For Mac OS X users only: install VirtualBox

### Directories:
1. `templates/scripts` has all the templates for docker management scripts:
- `build-all.sh.template`: the script to build all images and push to docker hub
- `build-conf.sh.template`: the script to build only the image that contains conf files
- `provision.sh.template`: the script to provision nodes
- `deploy.sh.template`: the script to deploy all nodes
1. `templates/dockerfiles` has all the Dockerfile templates that define images
1. `templates/conf` has all the configuration templates for druid nodes
1. Running `pre_build.py` reads `config.json` and renders the Dockerfile templates and conf templates into `generated/`.
**Always edit the template files rather than the files under `generated/`, as the latter are overwritten every time you run `pre_build.py`.**
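The render step described above can be sketched as follows. This is an illustrative example, not code from the repo: the template string and the config value stand in for the real files under `templates/` and `config.json`.

```python
# Illustrative sketch of the pre_build.py render step: a Jinja2 template
# plus keys from config.json produce a concrete configuration line.
import json
from jinja2 import Environment

config = json.loads('{"broker_node_port": 8082}')  # stands in for config.json
env = Environment()
template = env.from_string("druid.port={{ broker_node_port }}")
print(template.render(**config))  # prints: druid.port=8082
```

`pre_build.py` does the same thing with `FileSystemLoader` so that every `*.template` file under `templates/` is rendered with the full `config.json` dictionary.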

### Usage:
1. For Mac OS X users only: create a docker-machine dedicated to building images and setting up the environment
`$ docker-machine create --driver virtualbox local && eval $(docker-machine env local)`
1. Change `config.json` accordingly
1. Prebuild
`$ python pre_build.py`
1. Run `$ ./provision.sh` to provision nodes
1. Build images accordingly by running either `build-all.sh` or `build-conf.sh`
1. Run `$ ./deploy.sh`

### Container Lifecycle management:
1. To see the status of all containers, run `$ eval $(docker-machine env --swarm d-druid-swarm-master) && docker ps -a`

### Notes:
1. To switch docker-machine, run `$ eval $(docker-machine env <the machine name>)`.
To switch to the swarm master, run `$ eval $(docker-machine env --swarm <the swarm master machine name>)`.
1. To attach a new session to a running container, run `docker exec -it <container_name> /bin/bash`
1. To view the logs for a druid node, run `docker logs <container_name>`, or attach a new session and go to the actual log path to view the logs.
File renamed without changes.
47 changes: 47 additions & 0 deletions all-in-one/README.md
@@ -0,0 +1,47 @@
# All-In-One-Container Druid Docker Image

## Run a simple Druid cluster

[Install Docker](../docker-install.md)

Download and launch the Docker image:

```sh
docker pull druidio/example-cluster
docker run --rm -i -p 3000:8082 -p 3001:8081 -p 3090:8090 druidio/example-cluster
```

Wait a minute or so for Druid to start up and download the sample.

On OS X

- List datasources

```
curl http://$(docker-machine ip default):3000/druid/v2/datasources
```

- Access the coordinator console

```
open http://$(docker-machine ip default):3001/
```

On Linux

- List datasources

```
curl http://localhost:3000/druid/v2/datasources
```

- Access the coordinator console at http://localhost:3001/

## Build Druid Docker Image

To build the Docker image yourself:

```sh
git clone https://github.com/druid-io/docker-druid.git
docker build -t example-cluster docker-druid/all-in-one
```
File renamed without changes.
File renamed without changes.
25 changes: 25 additions & 0 deletions config.json
@@ -0,0 +1,25 @@
{
"docker_hub_username": "xiaoyao1991",
"image_tag": "latest",
"overlay_net": "d-my-net",
"common_conf_dir": "/usr/local/conf",
"deep_storage_dir": "/var/druid-segments",

"coordinator_node_port": 8080,
"coordinator_node_aliases": ["coordinator"],

"historical_node_port": 8081,
"historical_node_aliases": ["historical"],

"broker_node_port": 8082,
"broker_node_aliases": ["broker"],

"realtime_node_port": 8083,
"realtime_node_aliases": ["realtime"],

"overlord_node_port": 8084,
"overlord_node_aliases": ["overlord"],

"middle_manager_node_port": 8085,
"middle_manager_node_aliases": ["middlemanager"]
}
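As an illustration of how these keys are consumed, each `*_node_port` entry becomes the `druid.port` value of the matching node's `runtime.properties`. The snippet below is a hypothetical sketch, not code from `pre_build.py`:

```python
# Hypothetical illustration: derive each node's druid.port line from the
# *_node_port keys in config.json. Not part of the repo's scripts.
import json

config = json.loads('{"broker_node_port": 8082, "historical_node_port": 8081}')

for key in sorted(config):
    if key.endswith("_node_port"):
        node = key[: -len("_node_port")]          # e.g. "broker"
        print(f"{node}: druid.port={config[key]}")
```

The `*_node_aliases` keys play the analogous role for the container network aliases on the overlay network.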
55 changes: 55 additions & 0 deletions pre_build.py
@@ -0,0 +1,55 @@
import json
from jinja2 import Environment, FileSystemLoader
import os
import stat
import errno

TEMPLATE_DIR = "./templates/"
GENERATED_DIR = "./generated/"
CONFIG_FILE = "./config.json"

def prebuild():
    # prepare jinja env
    env = Environment(loader=FileSystemLoader('./'))

    # prepare user defined configs
    with open(CONFIG_FILE, 'r') as fp:
        config = json.load(fp)

    # mirror the template directory tree under generated/
    if not os.path.exists(GENERATED_DIR):
        os.makedirs(GENERATED_DIR)

    os.chdir(TEMPLATE_DIR)
    for dp, _, _ in os.walk('./'):
        if not os.path.exists(os.path.join('../', GENERATED_DIR, dp)):
            os.makedirs(os.path.join('../', GENERATED_DIR, dp))

    # render all templates; plain files are copied through unchanged
    templates = [os.path.join(dp, f) for dp, dn, filenames in os.walk('./') for f in filenames]
    os.chdir('../')

    for template in templates:
        generated_filename = GENERATED_DIR + template.rsplit(".template")[0]
        with open(generated_filename, 'w') as fp:
            if template.endswith('.template'):
                fp.write(env.get_template(TEMPLATE_DIR + template).render(**config))
            else:
                with open(TEMPLATE_DIR + template, 'r') as rofp:
                    fp.write(rofp.read())

    # symlink the generated management scripts into the repo root and
    # make them executable
    scripts = ['build-all.sh', 'build-conf.sh', 'deploy.sh', 'provision.sh']
    for script in scripts:
        try:
            os.symlink(GENERATED_DIR + 'scripts/' + script, script)
        except OSError as e:
            if e.errno == errno.EEXIST:
                # replace a stale symlink from a previous run
                os.remove(script)
                os.symlink(GENERATED_DIR + 'scripts/' + script, script)
            else:
                raise

        st = os.stat(script)
        os.chmod(script, st.st_mode | stat.S_IEXEC)

if __name__ == '__main__':
    prebuild()
1 change: 1 addition & 0 deletions requirements.txt
@@ -0,0 +1 @@
jinja2
114 changes: 114 additions & 0 deletions templates/conf/druid/_common/common.runtime.properties.template
@@ -0,0 +1,114 @@
#
# Licensed to Metamarkets Group Inc. (Metamarkets) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. Metamarkets licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

#
# Extensions
#

# This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
# based on your particular setup.
druid.extensions.loadList=["druid-kafka-eight", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]

# If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
# and uncomment the line below to point to your directory.
#druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

#
# Logging
#

# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#

druid.zk.service.host=zookeeper
druid.zk.paths.base=/druid

#
# Metadata storage
#

# druid.metadata.storage.type=derby
# druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
# druid.metadata.storage.connector.host=metadata.store.ip
# druid.metadata.storage.connector.port=1527

# For MySQL:
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysql:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd

# For PostgreSQL (make sure to additionally include the Postgres extension):
#druid.metadata.storage.type=postgresql
#druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
#druid.metadata.storage.connector.user=...
#druid.metadata.storage.connector.password=...

#
# Deep storage
#

druid.storage.type=local
druid.storage.storageDirectory={{ deep_storage_dir }}

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
#druid.storage.type=hdfs
#druid.storage.storageDirectory=/druid/segments

# For S3:
#druid.storage.type=s3
#druid.storage.bucket=your-bucket
#druid.storage.baseKey=druid/segments
#druid.s3.accessKey=...
#druid.s3.secretKey=...

#
# Indexing service logs
#

druid.indexer.logs.type=file
druid.indexer.logs.directory=var/druid/indexing-logs

# For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
#druid.indexer.logs.type=hdfs
#druid.indexer.logs.directory=hdfs://namenode.example.com:9000/druid/indexing-logs

# For S3:
#druid.indexer.logs.type=s3
#druid.indexer.logs.s3Bucket=your-bucket
#druid.indexer.logs.s3Prefix=druid/indexing-logs

#
# Service discovery
#

druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=info
19 changes: 19 additions & 0 deletions templates/conf/druid/_common/log4j2.xml.template
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="UTF-8" ?>
<Configuration status="WARN">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
</Console>
<File name="File" fileName="/var/log/druid/${sys:logfilename}.log">
<PatternLayout>
<Pattern>%d %p %c{1.} [%t] %m%n</Pattern>
</PatternLayout>
</File>
</Appenders>
<Loggers>
<Root level="info">
<AppenderRef ref="Console"/>
<AppenderRef ref="File" />
</Root>
</Loggers>
</Configuration>
8 changes: 8 additions & 0 deletions templates/conf/druid/broker/jvm.config
@@ -0,0 +1,8 @@
-server
-Xms24g
-Xmx24g
-XX:MaxDirectMemorySize=4096m
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
16 changes: 16 additions & 0 deletions templates/conf/druid/broker/runtime.properties.template
@@ -0,0 +1,16 @@
druid.service=druid/broker
druid.port={{ broker_node_port }}

# HTTP server threads
druid.broker.http.numConnections=5
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7

# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=2000000000
8 changes: 8 additions & 0 deletions templates/conf/druid/coordinator/jvm.config
@@ -0,0 +1,8 @@
-server
-Xms3g
-Xmx3g
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
-Dderby.stream.error.file=var/druid/derby.log