Aggressive Auto Scaling in AWS - 1 - #4721

Closed
mekya opened this issue Jan 3, 2023 · 3 comments

mekya commented Jan 3, 2023

It's known that auto scaling does not respond quickly when demand spikes. Here is a typical basic scenario:

Steps

  • Have a single 2-core server in an auto scaling ORIGIN group
  • Have a single 2-core server in an auto scaling EDGE group
  • Publish a live stream
  • Play the stream with 2K viewers

Of course, a 2-core CPU cannot handle 2K viewers. The expectation is a solution that scales the instances within 3 minutes to support 2K viewers without a bad user experience. Here are more details:

  • Confirm that the servers do not accept more load than they can serve.
  • Users always have good audio/video quality: no robotic audio and no video freezes caused by load
  • The server may send 'highResourceUsage', but the viewer side should handle it gracefully
  • Users can wait up to 3 minutes in total for playback to start after the live stream has started
  • It would be better to have solutions both with and without warm pools, but we can start with warm pools
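
To make the 'highResourceUsage' and 3-minute points concrete, here is a minimal sketch of a viewer-side retry loop with capped exponential backoff. Only the 'highResourceUsage' message and the 3-minute budget come from the list above; `try_play`, the delay values, and the way the status is returned are hypothetical placeholders, not the player SDK's API.

```python
import time

HIGH_RESOURCE_USAGE = "highResourceUsage"  # message named in the issue


def retry_delays(budget_s=180.0, first_s=2.0, factor=2.0, cap_s=30.0):
    """Return backoff delays whose sum never exceeds the waiting budget."""
    delays = []
    elapsed = 0.0
    delay = first_s
    while elapsed + delay <= budget_s:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * factor, cap_s)
    return delays


def play_with_retry(try_play, sleep=time.sleep, delays=None):
    """Call try_play() until it stops reporting high resource usage.

    try_play is a hypothetical callable that returns a status string.
    The first attempt is immediate; later attempts wait for the next delay.
    """
    schedule = retry_delays() if delays is None else list(delays)
    for delay in [0.0] + schedule:
        if delay:
            sleep(delay)
        status = try_play()
        if status != HIGH_RESOURCE_USAGE:
            return status
    # Budget exhausted: report the last known state to the caller.
    return HIGH_RESOURCE_USAGE
```

With the default values the viewer retries for at most 180 seconds in total, so a viewer that keeps getting 'highResourceUsage' gives up at the same 3-minute mark the issue sets as the limit.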

Thank you so much for pushing us on this issue, @ashraf

@muratugureminoglu

Hi,

Since AutoScale is triggered by CloudWatch metrics, it takes some time for it to react to an aggressive 2K viewers. A 2-core instance handles about 110-120 viewers, so 2K viewers means about 17+1 instances. I tried some tuning below, hope it helps.
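
As a quick check of the instance count, here is a small sketch of the arithmetic. The per-instance capacities of 110 and 120 viewers are the rough figures above, not measured limits, and `instances_needed` is just an illustrative helper name.

```python
import math


def instances_needed(viewers: int, viewers_per_instance: int) -> int:
    """Smallest number of instances whose combined capacity covers the viewers."""
    return math.ceil(viewers / viewers_per_instance)


# Rough capacity range for a 2-core instance, taken from the comment above.
for capacity in (110, 120):
    print(f"{capacity} viewers/instance -> {instances_needed(2000, capacity)} instances")
```

This gives 19 instances at 110 viewers each and 17 at 120, which brackets the 17+1 estimate.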

1. If you are using basic monitoring, CloudWatch metrics for the EC2 instance are published every 5 minutes. This delays the triggering of a new instance in AutoScale accordingly. If you use detailed monitoring, you can reduce this period to 1 minute [1]. You can check whether it's active in EC2 > Launch Configuration > Your Launch Configuration > Advance Configuration.
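
For instances that are already running, detailed monitoring can also be switched on from a script. The sketch below assumes boto3 and its EC2 `monitor_instances` call; the instance ID in the usage comment is a made-up placeholder, and instances launched later by the Auto Scaling group still take their monitoring setting from the launch configuration.

```python
def enable_detailed_monitoring(instance_ids, ec2_client=None):
    """Turn on 1-minute CloudWatch metrics for the given running instances.

    ec2_client is optional so the function can be tried with a stub;
    when omitted, a real boto3 EC2 client is created (needs AWS credentials).
    """
    if ec2_client is None:
        import boto3  # imported lazily so the sketch loads without boto3

        ec2_client = boto3.client("ec2")
    return ec2_client.monitor_instances(InstanceIds=list(instance_ids))


# Usage (placeholder ID, requires AWS credentials):
# enable_detailed_monitoring(["i-0123456789abcdef0"])
```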

2. You can create a CloudWatch alarm and then add it to the Auto Scaling group as a scaling policy (the old policy should be deleted). In this alarm, if you set the datapoints to 1, keep the period at 1 minute, and reduce the warm-up time from 300 seconds to 10 seconds, a new instance will launch in about 2 minutes.

CloudWatch Alarm: [screenshot: aws-cloudwatch]

Scaling Policy: [screenshot: simple-scaling]
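
The same alarm and policy settings could be scripted roughly as follows. This is a sketch assuming boto3's `put_scaling_policy` and `put_metric_alarm`. The 1 datapoint, 1-minute period, and 10-second warm-up come from step 2; the CPU metric, the 60% threshold, the names, and the one-instance step are assumptions for illustration. It also uses a step scaling policy, because the warm-up parameter applies to step scaling rather than simple scaling.

```python
def scale_out_policy_kwargs(group_name, warmup_s=10):
    """Arguments for autoscaling.put_scaling_policy (step scaling, +1 instance)."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "aggressive-scale-out",  # hypothetical name
        "PolicyType": "StepScaling",
        "AdjustmentType": "ChangeInCapacity",
        "StepAdjustments": [
            {"MetricIntervalLowerBound": 0.0, "ScalingAdjustment": 1}
        ],
        "EstimatedInstanceWarmup": warmup_s,  # reduced from 300 s to 10 s
    }


def cpu_alarm_kwargs(group_name, policy_arn, threshold=60.0):
    """Arguments for cloudwatch.put_metric_alarm: 1 datapoint over 1 minute."""
    return {
        "AlarmName": f"{group_name}-cpu-high",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",  # assumed metric
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": threshold,  # assumed threshold, in percent
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [policy_arn],
    }


def apply_aggressive_scaling(group_name):
    """Create the policy, then the alarm that triggers it (needs AWS credentials)."""
    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")
    policy = autoscaling.put_scaling_policy(**scale_out_policy_kwargs(group_name))
    cloudwatch.put_metric_alarm(**cpu_alarm_kwargs(group_name, policy["PolicyARN"]))
```

Raising the `ScalingAdjustment` value adds more instances per alarm, which may matter when the gap to the target capacity is large.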

3. You can keep servers ready using a Warm Pool, which shortens the boot time.

EC2 > Auto Scaling groups > Your Group > Instance Management > Warm Pool

[screenshot: warm-pool]
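
A warm pool can also be attached from a script instead of the console. This is a minimal sketch assuming boto3's `put_warm_pool`; the pool size of 2 and the "Stopped" state are illustrative defaults, not recommendations from this thread.

```python
VALID_POOL_STATES = {"Stopped", "Running", "Hibernated"}


def warm_pool_kwargs(group_name, min_size=2, pool_state="Stopped"):
    """Arguments for autoscaling.put_warm_pool."""
    if pool_state not in VALID_POOL_STATES:
        raise ValueError(f"unsupported pool state: {pool_state}")
    if min_size < 0:
        raise ValueError("min_size must not be negative")
    return {
        "AutoScalingGroupName": group_name,
        "MinSize": min_size,  # instances kept pre-initialized in the pool
        "PoolState": pool_state,
    }


def attach_warm_pool(group_name, **options):
    """Create or update the warm pool for the group (needs AWS credentials)."""
    import boto3

    autoscaling = boto3.client("autoscaling")
    return autoscaling.put_warm_pool(**warm_pool_kwargs(group_name, **options))
```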

4. Finally, "maybe" we can come up with a solution for this with the help of the AWS CLI or Lambda. I need to do some research on this.

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html

@muratugureminoglu

Here is a Lambda script: ant-media/Scripts#260

mekya commented Sep 25, 2023

Thank you @muratugureminoglu,

Could you please let us know what this script does? It would help to add some info to the PR and to the top of the script.
