The Table Capacity Mode Optimization Tool analyzes DynamoDB table usage metrics, simulates autoscaling behavior based on the specified parameters, and generates an analysis summary for optimizing capacity mode and provisioning.
- Retrieves DynamoDB table information and autoscaling settings
- Retrieves the following metrics from CloudWatch:
  - ProvisionedReadCapacityUnits (1 hour period)
  - ProvisionedWriteCapacityUnits (1 hour period)
  - ConsumedReadCapacityUnits (1 minute period)
  - ConsumedWriteCapacityUnits (1 minute period)
- Estimates provisioned capacity units by simulating autoscaling behavior based on the above metrics, the specified read/write utilization targets, and the minimum/maximum unit parameters.
- Generates a capacity mode analysis based on specified parameters.
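As an illustration of the metric retrieval step, the sketch below builds request parameters for the CloudWatch `GetMetricStatistics` API. It is not the tool's actual code: the helper name `metric_requests` and the choice of statistic are assumptions. It splits the look-back window into slices because a single call returns at most 1,440 data points, which is one day of 1-minute data.

```python
from datetime import datetime, timedelta, timezone

MAX_DATAPOINTS = 1440  # upper limit of data points returned by one call


def metric_requests(table_name, metric_name, period_seconds, days, end_time,
                    statistic="Sum"):
    """Return one kwargs dict per time slice (hypothetical helper)."""
    start = end_time - timedelta(days=days)
    # Longest span a single call can cover at this period.
    window = timedelta(seconds=period_seconds * MAX_DATAPOINTS)
    requests = []
    while start < end_time:
        stop = min(start + window, end_time)
        requests.append({
            "Namespace": "AWS/DynamoDB",
            "MetricName": metric_name,
            "Dimensions": [{"Name": "TableName", "Value": table_name}],
            "StartTime": start,
            "EndTime": stop,
            "Period": period_seconds,
            "Statistics": [statistic],
        })
        start = stop
    return requests


# "my-table" is a placeholder table name.
end = datetime(2024, 1, 15, tzinfo=timezone.utc)
consumed = metric_requests("my-table", "ConsumedReadCapacityUnits", 60, 14, end)
provisioned = metric_requests("my-table", "ProvisionedReadCapacityUnits",
                              3600, 14, end, statistic="Average")
```

Each dictionary could then be passed to boto3, for example `boto3.client("cloudwatch").get_metric_statistics(**request)`. With a 1-minute period, 14 days become 14 requests; with a 1-hour period, a single request covers the whole window.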
Scaling simulation behavior:
- Scale-out: When consumed reads or writes exceed the target utilization for two consecutive data points (two minutes).
- Scale-in: When consumed reads or writes fall 20% below the target utilization for 15 consecutive data points (15 minutes). The simulation scales in up to four times as needed, after which it scales in at most once per one-hour window within a day. This simulates DynamoDB's scale-in limits.
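The rules above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's implementation. It assumes that "20% below" means 20 percentage points below the target, that capacity starts at the minimum, that a day is 1,440 consecutive one-minute samples, and that the new capacity is the smallest value that brings utilization back to the target. The function name `simulate_autoscaling` is hypothetical.

```python
import math


def simulate_autoscaling(consumed, target_pct=70, min_units=1, max_units=80000):
    """Return simulated provisioned units for each one-minute sample."""
    provisioned = min_units
    out_streak = in_streak = 0
    scale_ins_today = 0
    last_scale_in = None
    result = []
    for minute, used in enumerate(consumed):
        if minute % 1440 == 0:
            scale_ins_today = 0  # a new day resets the scale-in budget
        # Compare in integer percent terms to avoid floating-point surprises.
        above = used * 100 > target_pct * provisioned
        below = used * 100 < (target_pct - 20) * provisioned
        out_streak = out_streak + 1 if above else 0
        in_streak = in_streak + 1 if below else 0
        # Capacity that would put utilization at the target, within bounds.
        desired = math.ceil(used * 100 / target_pct)
        desired = min(max_units, max(min_units, desired))
        if out_streak >= 2 and desired > provisioned:
            provisioned = desired
            out_streak = 0
        elif in_streak >= 15 and desired < provisioned:
            # First four scale-ins are free; afterwards one per hour.
            if scale_ins_today < 4 or minute - last_scale_in >= 60:
                provisioned = desired
                scale_ins_today += 1
                last_scale_in = minute
                in_streak = 0
        result.append(provisioned)
    return result


# Three busy minutes followed by twenty quiet ones.
trace = simulate_autoscaling([50] * 3 + [5] * 20)
```

In this trace, capacity rises on the second busy minute and drops only after 15 consecutive quiet minutes, which mirrors the asymmetry between scale-out and scale-in described above.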
Limitations:
- The analysis generated by the tool is based on the above metrics for the number of days specified by `--number-of-days-look-back`. CloudWatch retains metrics with a period of 60 seconds (1 minute) for only 15 days, so the tool limits the look-back to 14 days.
- The tool cannot account for scaling delays (the time it takes the DynamoDB autoscaling service to perform a scaling operation) when simulating autoscaling behavior. Refer to the AWS DynamoDB autoscaling documentation for more information on how it works.
Be aware that changing the capacity mode can have an impact on the applications, so be sure to test any changes thoroughly before making them in production.
DynamoDB capacity modes determine how your tables will scale to handle varying levels of read and write traffic. It's important to understand the cost implications of each mode to make informed decisions about which mode to use.
There are two capacity modes available in DynamoDB:
- Provisioned Mode: In this mode, you specify the read and write capacity for your table and DynamoDB reserves that capacity for your use. You pay a predictable hourly rate for the amount of capacity you provision.
- On-Demand Mode: In this mode, DynamoDB automatically scales the read and write capacity for your table in response to traffic. You pay for only the requests that you make.
- Provisioned mode is ideal for workloads that have consistent traffic patterns or predictable spikes in traffic.
- Provisioned mode provides better performance predictability and enables users to fine-tune capacity to their needs.
- Provisioned mode is also less expensive than on-demand mode for predictable workloads.
Note: Use auto-scaling to ensure that your tables can handle capacity changes automatically without manual intervention.
- On-demand mode is ideal for workloads with unpredictable or infrequent traffic patterns.
- On-demand mode allows users to pay only for the capacity they consume without having to manually adjust capacity.
- On-demand mode can be more expensive than provisioned mode for predictable workloads.
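The cost trade-off between the two modes comes down to simple arithmetic: provisioned mode bills for capacity units per hour whether or not they are used, while on-demand mode bills per request. The sketch below shows that comparison. The function name `compare_modes` is hypothetical, the prices in the usage example are placeholders rather than actual AWS rates, and the tool's own cost formula may differ.

```python
def compare_modes(provisioned_units_per_hour, consumed_request_units,
                  price_per_unit_hour, price_per_million_requests):
    """Return (cheaper_mode, provisioned_cost, ondemand_cost, savings_pct)."""
    # Provisioned: pay for every provisioned unit in every hour.
    provisioned_cost = sum(provisioned_units_per_hour) * price_per_unit_hour
    # On-demand: pay only for the request units actually consumed.
    ondemand_cost = consumed_request_units / 1_000_000 * price_per_million_requests
    cheaper, dearer = sorted([provisioned_cost, ondemand_cost])
    savings_pct = 0.0 if dearer == 0 else (dearer - cheaper) / dearer * 100
    mode = "provisioned" if provisioned_cost <= ondemand_cost else "on-demand"
    return mode, provisioned_cost, ondemand_cost, savings_pct


# Placeholder inputs: 100 units held for 24 hours versus 4 million requests.
mode, prov_cost, od_cost, savings = compare_modes(
    provisioned_units_per_hour=[100] * 24,
    consumed_request_units=4_000_000,
    price_per_unit_hour=0.001,        # placeholder price, not an AWS rate
    price_per_million_requests=1.0,   # placeholder price, not an AWS rate
)
```

With these placeholder numbers the steady workload is cheaper in provisioned mode. Lowering `consumed_request_units` while keeping the same provisioned capacity eventually tips the comparison toward on-demand.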
- Python 3.10 or 3.11
- AWS CLI configured with appropriate credentials and region
- Required Python packages listed in `requirements.txt`
- Clone the repository.
- Create a Python virtual environment for a clean install:
python3 -m venv .venv
source .venv/bin/activate
- Install the required Python packages:
pip3 install -r requirements.txt
- Set up the AWS CLI with appropriate credentials and region if you haven't done so already.
- Run the `capacity_reco.py` script. The script accepts the following parameters, all of which are optional:
  - `--dynamodb-tablename`: DynamoDB table name (if not provided, the script processes all tables in the specified region)
  - `--dynamodb-read-utilization`: DynamoDB read utilization percentage (default: 70)
  - `--dynamodb-write-utilization`: DynamoDB write utilization percentage (default: 70)
  - `--dynamodb-minimum-write-unit`: DynamoDB minimum write unit (default: 1)
  - `--dynamodb-maximum-write-unit`: DynamoDB maximum write unit (default: 80000)
  - `--dynamodb-minimum-read-unit`: DynamoDB minimum read unit (default: 1)
  - `--dynamodb-maximum-read-unit`: DynamoDB maximum read unit (default: 80000)
  - `--number-of-days-look-back`: Number of days (1-14) to look back for CloudWatch metrics (default: 14)
  - `--max-concurrent-tasks`: Maximum number of tasks to run concurrently (default: 5)
- With default values:
python3 capacity_reco.py
- With the desired values:
python3 capacity_reco.py --dynamodb-tablename <table_name> --dynamodb-read-utilization <read_utilization> --dynamodb-write-utilization <write_utilization> --dynamodb-minimum-write-unit <minimum_write_unit> --dynamodb-maximum-write-unit <maximum_write_unit> --dynamodb-minimum-read-unit <minimum_read_unit> --dynamodb-maximum-read-unit <maximum_read_unit> --number-of-days-look-back <number_of_days_look_back> --max-concurrent-tasks <max_concurrent_tasks> [--debug]
Add the `--debug` flag to save metrics and estimates as CSV files in the `output` folder.
- The output files will be saved in the `output` folder.
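For readers who want to see how the parameters and defaults listed above fit together, here is a minimal `argparse` sketch. It mirrors the documented flags and defaults but is not taken from `capacity_reco.py`. The helper names `look_back_days` and `build_parser` are hypothetical, and the 1-14 range check is an assumption about how the script validates the look-back value.

```python
import argparse


def look_back_days(value):
    """Accept only whole days in the documented 1-14 range."""
    days = int(value)
    if not 1 <= days <= 14:
        raise argparse.ArgumentTypeError("must be between 1 and 14")
    return days


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch of the capacity_reco.py command line")
    # Omitting the table name means "process all tables in the region".
    parser.add_argument("--dynamodb-tablename", default=None)
    parser.add_argument("--dynamodb-read-utilization", type=int, default=70)
    parser.add_argument("--dynamodb-write-utilization", type=int, default=70)
    parser.add_argument("--dynamodb-minimum-write-unit", type=int, default=1)
    parser.add_argument("--dynamodb-maximum-write-unit", type=int, default=80000)
    parser.add_argument("--dynamodb-minimum-read-unit", type=int, default=1)
    parser.add_argument("--dynamodb-maximum-read-unit", type=int, default=80000)
    parser.add_argument("--number-of-days-look-back", type=look_back_days,
                        default=14)
    parser.add_argument("--max-concurrent-tasks", type=int, default=5)
    parser.add_argument("--debug", action="store_true")
    return parser


defaults = build_parser().parse_args([])
custom = build_parser().parse_args(
    ["--number-of-days-look-back", "7", "--debug"])
```

Parsing an empty argument list yields the documented defaults, which corresponds to running the script with no flags.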
Check the generated files in the `output` folder. Here is an example output.
Note: The analysis generated is applicable only to the period specified by `--number-of-days-look-back` and is provided on separate lines for the table and each relevant index, for both `ReadCapacityUnits` and `WriteCapacityUnits`.
The `analysis_summary.csv` file contains the following columns:
- `index_name`: The name of the index associated with the table in the analysis.
- `base_table_name`: The name of the table associated with the analysis.
- `class`: The storage class of the table associated with the analysis.
- `metric_name`: The name of the metric associated with the analysis.
- `est_provisioned_cost`: The estimated cost of provisioned mode based on the table's simulated usage.
- `current_provisioned_cost`: If the table's capacity mode is provisioned, the table's current provisioned cost.
- `ondemand_cost`: The cost of using on-demand mode for the table.
- `recommended_mode`: The recommended capacity mode for the table.
- `current_mode`: The table's current capacity mode.
- `status`: Whether the table or index is `Optimized` or `Not Optimized` based on the analysis.
- `savings_pct`: The estimated percentage of cost savings with the recommended capacity mode.
- `number_of_days`: The number of days in the look-back period for the analysis.
- `current_min_capacity`: If the table's capacity mode is provisioned and autoscaling is enabled, the table's current minimum capacity units.
- `simulated_min_capacity`: The minimum capacity units used in the provisioned capacity mode simulation.
- `current_max_capacity`: If the table's capacity mode is provisioned and autoscaling is enabled, the table's current maximum capacity units.
- `simulated_max_capacity`: The maximum capacity units used in the provisioned capacity mode simulation.
- `current_target_utilization`: If the table's capacity mode is provisioned and autoscaling is enabled, the table's current target utilization.
- `simulated_target_utilization`: The target utilization used in the provisioned capacity mode simulation.
- `current_cost`: The table's current cost for the period analyzed.
- `recommended_cost`: The table's estimated cost for the period analyzed with the recommended capacity mode.
- `autoscaling_enabled`: The table's current autoscaling status.

Note: The analysis provided by this script compares your table consumption and simulates cost using different parameters. This tool does not have access to your contextual information, business requirements, or organizational best practices. When changing your capacity mode from on-demand to provisioned based on the results, remember that some assumptions were made: the analysis window is at most 14 days, and autoscaling responds instantaneously. (In reality, the autoscaling service might take 4 minutes to provision new table capacity, depending on your increase conditions.)
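As a starting point for working with the summary, the sketch below reads rows in the `analysis_summary.csv` layout with Python's `csv` module and lists the `Not Optimized` entries, largest estimated savings first. The sample rows, including the mode strings, are made up for illustration and show only a subset of the columns. The sketch also assumes `savings_pct` is stored as a plain number, and the helper name `not_optimized_rows` is hypothetical.

```python
import csv
import io

# Made-up sample data using a subset of the documented columns.
SAMPLE = """base_table_name,index_name,metric_name,current_mode,recommended_mode,status,savings_pct
orders,,ReadCapacityUnits,Ondemand,Provisioned,Not Optimized,35.5
orders,,WriteCapacityUnits,Ondemand,Ondemand,Optimized,0
users,by_email,ReadCapacityUnits,Provisioned,Ondemand,Not Optimized,62.0
"""


def not_optimized_rows(csv_file):
    """Return rows flagged Not Optimized, highest savings_pct first."""
    reader = csv.DictReader(csv_file)
    rows = [row for row in reader if row["status"] == "Not Optimized"]
    # Treat an empty savings_pct cell as zero (assumption).
    return sorted(rows, key=lambda row: float(row["savings_pct"] or 0),
                  reverse=True)


candidates = not_optimized_rows(io.StringIO(SAMPLE))
for row in candidates:
    target = row["index_name"] or row["base_table_name"]
    print(target, row["metric_name"], row["recommended_mode"],
          row["savings_pct"])
```

To run this against real output, replace the `io.StringIO(SAMPLE)` argument with a file opened as `open("output/analysis_summary.csv", newline="")`.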