Project author: vrivellino

Project description:
Spoptimize: Replace AWS AutoScaling instances with spot instances
Language: Python
Repository: git://github.com/vrivellino/spoptimize.git
Created: 2018-02-06T14:55:04Z
Project community: https://github.com/vrivellino/spoptimize

License: Mozilla Public License 2.0


SPOPTIMIZE

Spoptimize is a tool that automates use of Amazon EC2 spot instances in
your AutoScaling Groups.

About Spoptimize

Spoptimize was inspired by AutoSpotting and performs very similar
actions, but its implementation is entirely its own.

But why reinvent the wheel and not use AutoSpotting?

I had been noodling on ways to use spot instances in AutoScaling groups for quite a while, and I had
brainstormed a few different ideas before I came across AutoSpotting. I thought the idea was ingenious, but I
also thought it might be fun to build a similar system that was event-driven rather than polling-based. I
had never used AWS Step Functions before, so I took the opportunity to build my own tool using Step Functions
whose executions are initiated by AutoScaling Launch Notifications.

How it works

Each launch notification is processed by a Lambda, which in turn begins an execution of Spoptimize’s Step
Functions.
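
As a rough illustration of that first hop, here is a minimal sketch of pulling the group name and instance ID out of an SNS-delivered launch notification. It is not Spoptimize's actual handler: the function names are hypothetical, and the message fields (Event, AutoScalingGroupName, EC2InstanceId) are assumed from the standard AutoScaling notification format.

```python
import json


def extract_launches(event):
    """Return (group_name, instance_id) pairs for launch notifications
    found in an SNS-triggered Lambda event."""
    launches = []
    for record in event.get("Records", []):
        # SNS delivers the AutoScaling notification as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        # Skip test notifications and any non-launch events.
        if message.get("Event") != "autoscaling:EC2_INSTANCE_LAUNCH":
            continue
        launches.append((message["AutoScalingGroupName"], message["EC2InstanceId"]))
    return launches


def handler(event, context):
    # Hypothetical entry point: a real handler would start one Step Functions
    # execution per launch instead of just returning the parsed values.
    return [
        {"group": group, "instance_id": instance_id}
        for group, instance_id in extract_launches(event)
    ]
```

Keeping the parsing separate from the entry point makes it possible to exercise the logic with a hand-built event, without any AWS access.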

The Step Function execution manages the execution of Lambda functions which perform these actions:

  1. Wait following new instance launch. (See spoptimize:init_sleep_interval below)
  2. Verify that the new on-demand instance is healthy according to autoscaling.
  3. Request Spot Instance using specifications defined in autoscaling group’s launch configuration.
  4. Wait for Spot Request to be fulfilled and for spot instance to be online. (See
    spoptimize:spot_req_sleep_interval below)
  5. Acquire an exclusive lock on the autoscaling group. This step prevents multiple executions from attaching &
    terminating instances simultaneously.
  6. Attach spot instance to autoscaling group and terminate original on-demand instance.
  7. Wait for spot instance to be healthy according to autoscaling. (See spoptimize:spot_attach_sleep_interval
    below)
  8. Verify health of spot instance and release exclusive lock.
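
The exclusive lock in steps 5 and 8 can be pictured with the sketch below. It is an in-memory stand-in, not Spoptimize's code: the class name, the owner/expiry fields, and the TTL behaviour are all assumptions chosen for illustration. A real deployment would need an atomic conditional write in a shared store for the same check-and-set.

```python
import time


class GroupLockTable:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def acquire(self, group_name, owner, ttl_seconds, now=None):
        """Take the lock if it is free, expired, or already held by owner."""
        now = time.time() if now is None else now
        current = self._items.get(group_name)
        if current and current["owner"] != owner and current["expires"] > now:
            return False  # another execution holds a live lock
        self._items[group_name] = {"owner": owner, "expires": now + ttl_seconds}
        return True

    def release(self, group_name, owner):
        """Release the lock only if owner still holds it."""
        current = self._items.get(group_name)
        if current and current["owner"] == owner:
            del self._items[group_name]
            return True
        return False
```

An execution that fails to acquire the lock would simply wait and retry, which is what keeps two executions from attaching and terminating instances in the same group at once.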

Screenshot of a successful execution:
AWS Step Function execution

Deploying

Here’s a breakdown of the privileges required for deployment. Deployment requires the ability to:

  • create/update/delete:
    • CloudFormation stacks
    • IAM Managed Policy
    • IAM Roles
    • CloudWatch Alarms
    • DynamoDb tables whose table names begin with spoptimize
    • Lambda functions whose function names begin with spoptimize
    • Step Functions whose names begin with spoptimize
  • create an SNS topic named spoptimize-init
  • create an S3 bucket named spoptimize-artifacts-YOUR_AWS_ACCOUNT_ID
  • read/write to aforementioned S3 bucket with a prefix of spoptimize

Note: many of the names and prefixes can be overridden via setting environment variables prior to running the
deployment script.

Quick Launch

You can deploy Spoptimize via the CloudFormation console using the following launch button. It will deploy the
latest build:

Launch

Deployment Script

If you wish to deploy Spoptimize via a shell or an automated process, you can utilize the included deploy
script.

Prerequisites:

  • Bash
  • AWS CLI
  • API access to an AWS account

First clone this repo, or download a tar.gz or zip from Releases.

Deploy both the IAM stack and the Step Functions & Lambdas:

    $ ./deploy.sh

Deploy just the IAM stack:

    $ ./deploy.sh iam

Deploy just the Step Functions and Lambdas:

    $ ./deploy.sh cfn

Configuration

After Spoptimize is deployed, configure your autoscaling groups to send launch notifications to the
spoptimize-init SNS topic.

Set via CloudFormation (see NotificationConfigurations):

    LaunchGroup:
      Type: AWS::AutoScaling::AutoScalingGroup
      Properties:
        LaunchConfigurationName: !Ref LaunchConfig
        DesiredCapacity: 0
        MinSize: 0
        MaxSize: 12
        VPCZoneIdentifier:
          - !Select [ 0, !Ref SubnetIds ]
          - !Select [ 1, !Ref SubnetIds ]
        MetricsCollection:
          - Granularity: 1Minute
        HealthCheckGracePeriod: 120
        Cooldown: 180
        HealthCheckType: ELB
        TargetGroupARNs:
          - !Ref ElbTargetGroup
        Tags:
          - Key: Name
            Value: !Ref AWS::StackName
            PropagateAtLaunch: true
        NotificationConfigurations:
          - TopicARN: !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:spoptimize-init"
            NotificationTypes:
              - autoscaling:EC2_INSTANCE_LAUNCH

And in the console:
EC2 AutoScaling console showing notifications tab
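
For groups that are not managed through CloudFormation, the same notification could probably be attached with boto3. The sketch below is not part of Spoptimize. The helper names are hypothetical, and it assumes the default spoptimize-init topic name and the standard aws partition.

```python
def notification_request(group_name, region, account_id, topic_name="spoptimize-init"):
    """Build keyword arguments for AutoScaling's PutNotificationConfiguration call."""
    return {
        "AutoScalingGroupName": group_name,
        "TopicARN": f"arn:aws:sns:{region}:{account_id}:{topic_name}",
        "NotificationTypes": ["autoscaling:EC2_INSTANCE_LAUNCH"],
    }


def enable_launch_notifications(group_name, region, account_id):
    # boto3 is imported lazily so the request builder can be used without it.
    import boto3

    client = boto3.client("autoscaling", region_name=region)
    client.put_notification_configuration(
        **notification_request(group_name, region, account_id)
    )
```

Calling enable_launch_notifications("my-group", "us-east-1", "123456789012") would require credentials allowed to modify the group. The group name and account ID here are placeholders.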

Newly launched instances will (eventually) be replaced by spot instances.

Configuration Overrides

Spoptimize’s wait intervals may be overridden per AutoScaling group via the use of tags.

  • spoptimize:min_protected_instances: Set a minimum number of on-demand instances for the autoscaling group.
    Defaults to 0. This prevents Spoptimize from replacing all on-demand instances with spot instances.
    NOTE: Spoptimize leverages Instance Protection to achieve this.
  • spoptimize:init_sleep_interval: Initial wait interval after launch notification is received. Spoptimize
    won’t do anything during this wait period. Defaults to approximately the group’s Health Check Grace
    Period times the Desired Capacity plus 30-90s. The default scales with the group’s capacity so that
    rolling updates can complete before any instances are replaced.
  • spoptimize:spot_req_sleep_interval: Wait interval following spot instance request. Default is 30s.
  • spoptimize:spot_attach_sleep_interval: Wait interval following attachment of spot instance to
    autoscaling group. Defaults to the group’s Health Check Grace Period plus 30s.
  • spoptimize:spot_failure_sleep_interval: Wait interval between iterations following a spot instance
    failure. Defaults to 1 hour. A spot failure may be a failed spot instance request or a failure of the
    spot instance after it comes online.
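
To make the defaults concrete, here is a small sketch that merges spoptimize:* tags over the defaults described above. It is an illustration only, not Spoptimize's own logic: the function name is hypothetical, and modelling the "plus 30-90s" as a uniform random integer is an assumption.

```python
import random

TAG_PREFIX = "spoptimize:"


def resolve_settings(tags, grace_period, desired_capacity, rng=random):
    """Merge spoptimize:* tag overrides over the documented defaults (seconds)."""
    defaults = {
        "min_protected_instances": 0,
        # Grace period times desired capacity, plus 30-90s (jitter is assumed).
        "init_sleep_interval": grace_period * desired_capacity + rng.randint(30, 90),
        "spot_req_sleep_interval": 30,
        "spot_attach_sleep_interval": grace_period + 30,
        "spot_failure_sleep_interval": 3600,
    }
    settings = dict(defaults)
    for key, value in tags.items():
        # Only tags carrying the spoptimize: prefix and a known name apply.
        name = key[len(TAG_PREFIX):] if key.startswith(TAG_PREFIX) else None
        if name in defaults:
            settings[name] = int(value)
    return settings
```

For a group with a 120s grace period and a desired capacity of 2, the default initial wait lands between 270s and 330s, while a spoptimize:init_sleep_interval tag of 45 would replace it outright.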

Below are override tags I used during development. (Note: these are very aggressive so that I could watch
Spoptimize in action.)

Set via CloudFormation:

    Tags:
      - Key: Name
        Value: !Ref AWS::StackName
        PropagateAtLaunch: true
      - Key: spoptimize:min_protected_instances
        Value: 1
        PropagateAtLaunch: false
      - Key: spoptimize:init_sleep_interval
        Value: 45
        PropagateAtLaunch: false
      - Key: spoptimize:spot_req_sleep_interval
        Value: 10
        PropagateAtLaunch: false
      - Key: spoptimize:spot_attach_sleep_interval
        Value: 125
        PropagateAtLaunch: false
      - Key: spoptimize:spot_failure_sleep_interval
        Value: 900
        PropagateAtLaunch: false

And in the console:
EC2 AutoScaling console showing tags tab
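
The override tags could also be applied from a script. This sketch is not part of Spoptimize and its helper names are hypothetical. It builds the tag list for the AutoScaling CreateOrUpdateTags API and sets PropagateAtLaunch to false, as in the CloudFormation example above.

```python
def override_tags(group_name, overrides):
    """Build the Tags list for AutoScaling's CreateOrUpdateTags call."""
    return [
        {
            "ResourceId": group_name,
            "ResourceType": "auto-scaling-group",
            "Key": f"spoptimize:{name}",
            "Value": str(value),  # tag values are always strings
            "PropagateAtLaunch": False,
        }
        for name, value in sorted(overrides.items())
    ]


def apply_overrides(group_name, overrides):
    import boto3  # imported lazily; only needed for the real API call

    boto3.client("autoscaling").create_or_update_tags(
        Tags=override_tags(group_name, overrides)
    )
```

For example, apply_overrides("my-group", {"init_sleep_interval": 45, "spot_req_sleep_interval": 10}) would set two of the aggressive development values shown above on a placeholder group named my-group.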

Notes

  • Auto-Scaling groups that deploy EC2 instances to VPCs are tested. Auto-Scaling groups in EC2-Classic should
    work, but are not tested.