Merge pull request #10 from dominodatalab/more_ng
PLAT-5931: Support node group taints, labels, and tags
Michael Fraenkel authored Nov 16, 2022
2 parents 83651a2 + 75958a6 commit 64b67c2
Showing 20 changed files with 201 additions and 108 deletions.
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -6,7 +6,7 @@ orbs:
jobs:
build:
docker:
- image: cimg/aws:2022.09
- image: cimg/aws:2022.11
parameters:
workspace:
type: string
@@ -18,7 +18,7 @@ jobs:
- checkout

- terraform/install:
terraform_version: '1.3.3'
terraform_version: '1.3.4'

- terraform/fmt:
path: .
10 changes: 4 additions & 6 deletions README.md
@@ -70,15 +70,13 @@ aws s3 rb s3://"${AWS_TERRAFORM_REMOTE_STATE_BUCKET}" --force
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.3.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 4.0 |
| <a name="requirement_local"></a> [local](#requirement\_local) | >= 2.2.0 |
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 3.4.3 |
| <a name="requirement_tls"></a> [tls](#requirement\_tls) | >= 3.4.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 4.36.1 |
| <a name="provider_random"></a> [random](#provider\_random) | 3.4.3 |
| <a name="provider_tls"></a> [tls](#provider\_tls) | 4.0.3 |

## Modules
@@ -98,25 +96,25 @@ aws s3 rb s3://"${AWS_TERRAFORM_REMOTE_STATE_BUCKET}" --force
| [aws_iam_policy.route53](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_role_policy_attachment.route53](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [aws_key_pair.domino](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/key_pair) | resource |
| [random_shuffle.azs](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/shuffle) | resource |
| [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |
| [aws_ec2_instance_type_offerings.nodes](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ec2_instance_type_offerings) | data source |
| [aws_iam_policy_document.route53](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_role.eks_master_roles](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_role) | data source |
| [aws_route53_zone.hosted](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/route53_zone) | data source |
| [aws_subnet.specified](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/subnet) | data source |
| [aws_subnet.private](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/subnet) | data source |
| [aws_subnet.public](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/subnet) | data source |
| [tls_public_key.domino](https://registry.terraform.io/providers/hashicorp/tls/latest/docs/data-sources/public_key) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_additional_node_groups"></a> [additional\_node\_groups](#input\_additional\_node\_groups) | Additional EKS managed node groups definition. | <pre>map(object({<br> ami = optional(string)<br> instance_type = string<br> min_per_az = number<br> max_per_az = number<br> desired_per_az = number<br> labels = map(string)<br> volume = object({<br> size = string<br> type = string<br> })<br> }))</pre> | `{}` | no |
| <a name="input_additional_node_groups"></a> [additional\_node\_groups](#input\_additional\_node\_groups) | Additional EKS managed node groups definition. | <pre>map(object({<br> ami = optional(string)<br> instance_types = list(string)<br> spot = optional(bool, false)<br> min_per_az = number<br> max_per_az = number<br> desired_per_az = number<br> labels = map(string)<br> taints = optional(list(object({ key = string, value = optional(string), effect = string })), [])<br> tags = optional(map(string), {})<br> volume = object({<br> size = string<br> type = string<br> })<br> }))</pre> | `{}` | no |
| <a name="input_availability_zones"></a> [availability\_zones](#input\_availability\_zones) | List of Availibility zones to distribute the deployment, EKS needs at least 2,https://docs.aws.amazon.com/eks/latest/userguide/network_reqs.html.<br> Note that setting this variable bypasses validation of the status of the zones data 'aws\_availability\_zones' 'available'.<br> Caller is responsible for validating status of these zones. | `list(string)` | `[]` | no |
| <a name="input_bastion_ami_id"></a> [bastion\_ami\_id](#input\_bastion\_ami\_id) | AMI ID for the bastion EC2 instance, otherwise we will use the latest 'amazon\_linux\_2' ami | `string` | `""` | no |
| <a name="input_cidr"></a> [cidr](#input\_cidr) | The IPv4 CIDR block for the VPC. | `string` | `"10.0.0.0/16"` | no |
| <a name="input_create_bastion"></a> [create\_bastion](#input\_create\_bastion) | Create bastion toggle. | `bool` | `false` | no |
| <a name="input_default_node_groups"></a> [default\_node\_groups](#input\_default\_node\_groups) | EKS managed node groups definition. | <pre>object(<br> {<br> compute = object(<br> {<br> ami = optional(string)<br> instance_type = optional(string, "m5.2xlarge")<br> min_per_az = optional(number, 0)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 1)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "default"<br> })<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> }),<br> platform = object(<br> {<br> ami = optional(string)<br> instance_type = optional(string, "m5.4xlarge")<br> min_per_az = optional(number, 1)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 1)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "platform"<br> })<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> }),<br> gpu = object(<br> {<br> ami = optional(string)<br> instance_type = optional(string, "g4dn.xlarge")<br> min_per_az = optional(number, 0)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 0)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "default-gpu"<br> "nvidia.com/gpu" = true<br> })<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> })<br> })</pre> | <pre>{<br> "compute": {},<br> "gpu": {},<br> "platform": {}<br>}</pre> | no |
| <a name="input_default_node_groups"></a> [default\_node\_groups](#input\_default\_node\_groups) | EKS managed node groups definition. | <pre>object(<br> {<br> compute = object(<br> {<br> ami = optional(string)<br> instance_types = optional(list(string), ["m5.2xlarge"])<br> spot = optional(bool, false)<br> min_per_az = optional(number, 0)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 1)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "default"<br> })<br> taints = optional(list(object({ key = string, value = optional(string), effect = string })), [])<br> tags = optional(map(string), {})<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> }),<br> platform = object(<br> {<br> ami = optional(string)<br> instance_types = optional(list(string), ["m5.4xlarge"])<br> spot = optional(bool, false)<br> min_per_az = optional(number, 1)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 1)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "platform"<br> })<br> taints = optional(list(object({ key = string, value = optional(string), effect = string })), [])<br> tags = optional(map(string), {})<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> }),<br> gpu = object(<br> {<br> ami = optional(string)<br> instance_types = optional(list(string), ["g4dn.xlarge"])<br> spot = optional(bool, false)<br> min_per_az = optional(number, 0)<br> max_per_az = optional(number, 10)<br> desired_per_az = optional(number, 0)<br> labels = optional(map(string), {<br> "dominodatalab.com/node-pool" = "default-gpu"<br> "nvidia.com/gpu" = true<br> })<br> taints = optional(list(object({ key = string, value = optional(string), effect = string })), [<br> { key = "nvidia.com/gpu", value = "true", effect = "NO_SCHEDULE" }<br> ])<br> tags = optional(map(string), {})<br> volume = optional(object(<br> {<br> size = optional(number, 100)<br> type = optional(string, "gp3")<br> }),<br> {<br> size = 100<br> type = "gp3"<br> }<br> )<br> })<br> })</pre> | <pre>{<br> "compute": {},<br> "gpu": {},<br> "platform": {}<br>}</pre> | no |
| <a name="input_deploy_id"></a> [deploy\_id](#input\_deploy\_id) | Domino Deployment ID. | `string` | `"domino-eks"` | no |
| <a name="input_efs_access_point_path"></a> [efs\_access\_point\_path](#input\_efs\_access\_point\_path) | Filesystem path for efs. | `string` | `"/domino"` | no |
| <a name="input_eks_master_role_names"></a> [eks\_master\_role\_names](#input\_eks\_master\_role\_names) | IAM role names to be added as masters in eks. | `list(string)` | `[]` | no |
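
Reviewer note: the new `additional_node_groups` type above adds `instance_types`, `spot`, `taints`, and `tags`. As a rough illustration of how a caller might populate it, here is a minimal sketch of a tfvars fragment. The group name, instance types, taint key, and tag values are hypothetical and do not come from this commit; only the attribute names and types follow the updated README.

```hcl
# Hypothetical terraform.tfvars fragment. The group name, instance types,
# taint key, and tag values are illustrative assumptions.
additional_node_groups = {
  "spot-workers" = {
    instance_types = ["m5.2xlarge", "m5a.2xlarge"] # list(string), replaces instance_type
    spot           = true                          # optional, defaults to false
    min_per_az     = 0
    max_per_az     = 5
    desired_per_az = 0

    labels = {
      "dominodatalab.com/node-pool" = "spot-workers"
    }

    # Optional; defaults to []. "value" may be omitted.
    taints = [
      { key = "example.com/spot", value = "true", effect = "NO_SCHEDULE" }
    ]

    # Optional; defaults to {}.
    tags = {
      "cost-center" = "research"
    }

    volume = {
      size = "100" # typed as string for additional node groups
      type = "gp3"
    }
  }
}
```

The `NO_SCHEDULE` spelling mirrors the default GPU taint shown in `default_node_groups`; other effect values are not shown in this diff.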
42 changes: 19 additions & 23 deletions main.tf
@@ -2,11 +2,11 @@

# Check the zones where the instance types are being offered
data "aws_ec2_instance_type_offerings" "nodes" {
for_each = toset([for ng in merge(var.default_node_groups, var.additional_node_groups) : ng.instance_type])
for_each = { for name, ng in merge(var.default_node_groups, var.additional_node_groups) : name => ng.instance_types }

filter {
name = "instance-type"
values = [each.value]
values = each.value
}

location_type = "availability-zone"
@@ -29,30 +29,26 @@ data "aws_availability_zones" "available" {
}
}

data "aws_subnet" "specified" {
data "aws_subnet" "public" {
count = var.vpc_id != null ? length(var.public_subnets) : 0
id = var.public_subnets[count.index]
}

data "aws_subnet" "private" {
count = var.vpc_id != null ? length(var.private_subnets) : 0
id = element(var.private_subnets, count.index)
id = var.private_subnets[count.index]
}

locals {
# Get zones where ALL instance types are offered(intersection).
zone_intersection_instance_offerings = setintersection([for k, v in data.aws_ec2_instance_type_offerings.nodes : toset(v.locations)]...)
# Get the zones that are available and offered in the region for the instance types.
az_names = var.vpc_id != null ? distinct(data.aws_subnet.specified[*].availability_zone) : length(var.availability_zones) > 0 ? var.availability_zones : data.aws_availability_zones.available.names
az_names = var.vpc_id != null ? distinct(data.aws_subnet.private[*].availability_zone) : length(var.availability_zones) > 0 ? var.availability_zones : data.aws_availability_zones.available.names
offered_azs = setintersection(local.zone_intersection_instance_offerings, toset(local.az_names))
num_of_azs = var.vpc_id != null ? length(local.az_names) : var.number_of_azs
}

resource "random_shuffle" "azs" {
input = local.offered_azs
result_count = local.num_of_azs

lifecycle {
precondition {
condition = length(local.offered_azs) >= local.num_of_azs
error_message = "Availability of the instance types does not satisfy the desired number of zones, or the desired number of zones is higher than the available/offered zones"
}
}
# error -> "Availability of the instance types does not satisfy the desired number of zones, or the desired number of zones is higher than the available/offered zones"
azs_to_use = slice(tolist(local.offered_azs), 0, local.num_of_azs)
}

locals {
@@ -78,7 +74,7 @@ module "storage" {
efs_access_point_path = var.efs_access_point_path
s3_force_destroy_on_deletion = var.s3_force_destroy_on_deletion
vpc_id = local.vpc_id
subnets = local.private_subnets
subnet_ids = [for s in local.private_subnets : s.subnet_id]
}

locals {
@@ -102,16 +98,16 @@ module "network" {
deploy_id = var.deploy_id
region = var.region
cidr = var.cidr
availability_zones = random_shuffle.azs.result
public_subnets = local.public_cidr_blocks
private_subnets = local.private_cidr_blocks
availability_zones = local.azs_to_use
public_cidrs = local.public_cidr_blocks
private_cidrs = local.private_cidr_blocks
flow_log_bucket_arn = { arn = module.storage.s3_buckets["monitoring"].arn }
}

locals {
vpc_id = var.vpc_id != null ? var.vpc_id : module.network[0].vpc_id
public_subnets = var.vpc_id != null ? var.public_subnets : module.network[0].public_subnets
private_subnets = var.vpc_id != null ? var.private_subnets : module.network[0].private_subnets
public_subnets = var.vpc_id != null ? [for s in data.aws_subnet.public : { subnet_id = s.id, az = s.availability_zone }] : module.network[0].public_subnets
private_subnets = var.vpc_id != null ? [for s in data.aws_subnet.private : { subnet_id = s.id, az = s.availability_zone }] : module.network[0].private_subnets
}

module "bastion" {
@@ -122,7 +118,7 @@ module "bastion" {
region = var.region
vpc_id = local.vpc_id
ssh_pvt_key_path = aws_key_pair.domino.key_name
bastion_public_subnet_id = local.public_subnets[0]
bastion_public_subnet_id = local.public_subnets[0].subnet_id
bastion_ami_id = var.bastion_ami_id
}
