A repo demonstrating the capabilities of a number of HashiCorp's products:
- Packer - Used to build specific AMIs for the project
- Terraform - Used to deploy infrastructure to AWS
- Consul - Used as a service discovery layer for the cluster
- Nomad - Used to schedule batch jobs across the cluster
The project delivers a Nomad cluster and demonstrates its use as a batch job scheduler, anonymising email texts using the [scrubadub](https://github.com/datascopeanalytics/scrubadub) library.
Consul requires a quorum to operate correctly, and HashiCorp's recommended configuration is either 3 or 5 Consul servers per datacenter. It is possible to run 7 Consul servers provided network latency is low; however, you should never go above this number. The HashiCorp deployment table provides additional information if required.
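For illustration, a minimal Consul server configuration for a 3-server quorum might look like the following. This is a sketch only; the datacenter name, data directory, and auto-join tags are assumptions, not taken from this repo:

```hcl
# consul-server.hcl: minimal sketch of a 3-server quorum (values are assumptions)
server           = true
bootstrap_expect = 3        # wait for 3 servers before electing a leader
datacenter       = "dc1"    # hypothetical datacenter name
data_dir         = "/opt/consul/data"

# Cloud auto-join: discover the other servers by EC2 tag
retry_join = ["provider=aws tag_key=consul tag_value=server"]
```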
For testing purposes the default instance type of `t2.micro` is more than adequate. Additional testing is required to establish a sensible production setup with workload estimates.
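In Terraform terms, that default would typically be expressed as a variable along these lines (the variable name here is hypothetical, not necessarily what this repo uses):

```hcl
variable "instance_type" {
  description = "EC2 instance type for the cluster nodes"
  default     = "t2.micro" # adequate for testing; production sizing is untested
}
```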
The Terraform code in this repo is most definitely **NOT SAFE FOR PRODUCTION**:
- There is no supporting infrastructure such as an ELB/ALB
- There is no TLS on anything
- The Nomad worker node has an IAM role attached that allows it to read from and write to the S3 bucket. This is not secure, as every job on the node inherits that role (roughly the pattern sketched below); a better approach would be to provide API keys via Vault.
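For context, the insecure pattern being described is roughly the following instance-profile attachment. All resource and bucket names are hypothetical, not the repo's actual code:

```hcl
# Every job scheduled on a worker carrying this profile inherits S3 access.
data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "nomad_worker" {
  name               = "nomad-worker" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

resource "aws_iam_role_policy" "s3_rw" {
  role = aws_iam_role.nomad_worker.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject"]
      Resource = "arn:aws:s3:::my-scrub-bucket/*" # hypothetical bucket
    }]
  })
}

resource "aws_iam_instance_profile" "nomad_worker" {
  name = "nomad-worker"
  role = aws_iam_role.nomad_worker.name
}
```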
This module makes the following assumptions:
- You have a functioning VPC in AWS
- You have the following environment variables set to allow AWS access (or use another authentication method); see the note after this list:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_DEFAULT_REGION
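With those variables exported, no credentials need to appear in the Terraform code itself; the AWS provider reads them from the environment:

```hcl
provider "aws" {
  # Intentionally empty: credentials and region are picked up from
  # AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION.
}
```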
Automatic Job Dispatch - The jobs should be triggered by a further piece of infrastructure (an AWS Lambda function) that fires on an S3 event when new files are added to the source folder.
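One way to wire this up is a parameterized Nomad batch job that the Lambda function dispatches once per new object. A minimal sketch follows; the job name `scrub`, the meta key `s3_key`, and the wrapper script path are all hypothetical:

```hcl
# Lambda would run the equivalent of:
#   nomad job dispatch -meta s3_key=<object-key> scrub
job "scrub" {
  datacenters = ["dc1"]
  type        = "batch"

  parameterized {
    meta_required = ["s3_key"] # the S3 object to anonymise
  }

  group "scrub" {
    task "anonymise" {
      driver = "exec"
      config {
        command = "/usr/local/bin/scrub.sh" # hypothetical wrapper around scrubadub
        args    = ["${NOMAD_META_s3_key}"]
      }
    }
  }
}
```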
Vault Integration - As mentioned in the security section, the S3 permissions would benefit from using Vault to issue API keys, rather than attaching a role to the worker node, so that access is limited to only the jobs that need it.
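A sketch of what that could look like, assuming Vault's AWS secrets engine is mounted at `aws/` with a role named `s3-scrub` and a matching Vault policy (all hypothetical names):

```hcl
task "anonymise" {
  vault {
    policies = ["s3-scrub"] # grants read on aws/creds/s3-scrub only
  }

  # Render short-lived credentials into the task's environment;
  # no IAM role on the worker node is needed.
  template {
    data        = <<EOT
{{ with secret "aws/creds/s3-scrub" }}
AWS_ACCESS_KEY_ID={{ .Data.access_key }}
AWS_SECRET_ACCESS_KEY={{ .Data.secret_key }}
{{ end }}
EOT
    destination = "secrets/aws.env"
    env         = true
  }
}
```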
Cluster Scaling - Currently in this POC the cluster does not scale with the number of jobs. An improvement would be to use Replicator to scale the cluster dynamically in response to metrics.
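For example, the worker nodes could be placed in an autoscaling group whose desired capacity Replicator adjusts as load changes. The resource and variable names below are hypothetical:

```hcl
resource "aws_autoscaling_group" "nomad_worker" {
  min_size             = 1
  max_size             = 10
  desired_capacity     = 3 # Replicator would raise or lower this dynamically
  launch_configuration = aws_launch_configuration.nomad_worker.name
  vpc_zone_identifier  = var.private_subnet_ids
}
```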