Spoptimize: Replace AWS AutoScaling instances with spot instances
Spoptimize is a tool that automates the use of Amazon EC2 spot instances in
your AutoScaling groups.
Spoptimize was inspired by AutoSpotting and performs very similar
actions, but has its own, completely independent implementation.
But why reinvent the wheel and not use AutoSpotting?
I had been noodling on ways to use spot instances in AutoScaling groups for quite a while, and had brainstormed
a few different ideas before I came across AutoSpotting. I thought the idea was
ingenious, but I thought it might be fun to build a similar system that was event-driven rather than polling-based. I
had never used AWS Step Functions before, so I took the opportunity
to build my own tool using Step Functions whose executions are initiated by AutoScaling launch notifications.
Each launch notification is processed by a Lambda, which in turn begins an execution of Spoptimize’s Step
Functions.
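The event-driven entry point described above could look roughly like the following. This is a minimal sketch, not Spoptimize's actual handler: the function names, the execution input fields, and the injected `start_execution` callable are assumptions made for illustration. The SNS envelope (`Records[].Sns.Message`) and the `Event`, `AutoScalingGroupName`, and `EC2InstanceId` fields follow the format of AutoScaling notifications delivered through SNS.

```python
import json

LAUNCH_EVENT = "autoscaling:EC2_INSTANCE_LAUNCH"


def extract_launches(sns_event):
    """Return one dict per instance-launch notification found in an SNS event."""
    launches = []
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        # Skip test notifications, terminations, and anything else.
        if message.get("Event") != LAUNCH_EVENT:
            continue
        launches.append({
            # Field names below are hypothetical execution-input keys.
            "autoscaling_group": message["AutoScalingGroupName"],
            "ondemand_instance_id": message["EC2InstanceId"],
        })
    return launches


def handler(sns_event, context=None, start_execution=None):
    """Hypothetical Lambda entry point: start one execution per launch.

    `start_execution` is injected so the sketch runs without AWS access. In a
    real Lambda it could wrap boto3's
    stepfunctions.start_execution(stateMachineArn=..., input=...).
    """
    launches = extract_launches(sns_event)
    for launch in launches:
        if start_execution is not None:
            start_execution(json.dumps(launch))
    return launches
```

Injecting the callable keeps the parsing logic testable on its own; only launch events lead to an execution, so other notification types sent to the same topic are ignored.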
The Step Function execution manages the execution of Lambda functions which perform these actions:
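Wait after the launch notification is received (see spoptimize:init_sleep_interval below)
Wait following the spot instance request (see spoptimize:spot_req_sleep_interval below)
Wait following attachment of the spot instance (see spoptimize:spot_attach_sleep_interval below)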
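One way to express waits like these in Step Functions is a `Wait` state that reads its duration from the execution input through `SecondsPath`. The sketch below builds such a definition as a Python dict. The state names, the input field names, and the Lambda ARN parameters are hypothetical, and the sequence is simplified; this is not Spoptimize's actual state machine.

```python
import json


def wait_state(seconds_field, next_state=None):
    """A Wait state whose duration (in seconds) comes from the execution input."""
    state = {"Type": "Wait", "SecondsPath": "$." + seconds_field}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def task_state(lambda_arn, next_state):
    """A Task state that invokes a Lambda and passes its input through.

    ResultPath is null so the Lambda result is discarded and the interval
    fields stay available to later Wait states.
    """
    return {
        "Type": "Task",
        "Resource": lambda_arn,
        "ResultPath": None,
        "Next": next_state,
    }


def build_definition(request_spot_arn, attach_spot_arn):
    """Assemble a simplified wait/act/wait sequence (hypothetical state names)."""
    return {
        "StartAt": "InitWait",
        "States": {
            "InitWait": wait_state("init_sleep_interval", "RequestSpot"),
            "RequestSpot": task_state(request_spot_arn, "SpotRequestWait"),
            "SpotRequestWait": wait_state("spot_req_sleep_interval", "AttachSpot"),
            "AttachSpot": task_state(attach_spot_arn, "AttachWait"),
            "AttachWait": wait_state("spot_attach_sleep_interval"),
        },
    }
```

Serializing the result with `json.dumps(build_definition(...))` yields a definition string in Amazon States Language. Reading durations from the input is what would let per-group values differ between executions.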
Screenshot of a successful execution:
Here’s a breakdown of the privileges required for deployment. Deployment requires the ability to:
spoptimize
spoptimize
spoptimize
spoptimize-init
spoptimize-artifacts-YOUR_AWS_ACCOUNT_ID
spoptimize
Note: many of the names and prefixes can be overridden by setting environment variables before running the
deployment script.
You can deploy Spoptimize via the CloudFormation console using the following launch button. It will deploy the
latest build:
If you wish to deploy Spoptimize via a shell or an automated process, you can utilize the included deploy
script.
Prerequisites:
First clone this repo, or download a tar.gz or zip from Releases.
Deploy both the IAM stack and the Step Functions & Lambdas:
$ ./deploy.sh
Deploy just the IAM stack:
$ ./deploy.sh iam
Deploy just the Step Functions and Lambdas:
$ ./deploy.sh cfn
After Spoptimize is deployed, configure your AutoScaling groups to send launch notifications to the spoptimize-init
SNS topic.
Set via CloudFormation (see NotificationConfigurations):
LaunchGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    LaunchConfigurationName: !Ref LaunchConfig
    DesiredCapacity: 0
    MinSize: 0
    MaxSize: 12
    VPCZoneIdentifier:
      - !Select [ 0, !Ref SubnetIds ]
      - !Select [ 1, !Ref SubnetIds ]
    MetricsCollection:
      - Granularity: 1Minute
    HealthCheckGracePeriod: 120
    Cooldown: 180
    HealthCheckType: ELB
    TargetGroupARNs:
      - !Ref ElbTargetGroup
    Tags:
      - Key: Name
        Value: !Ref AWS::StackName
        PropagateAtLaunch: true
    NotificationConfigurations:
      - TopicARN: !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:spoptimize-init"
        NotificationTypes:
          - autoscaling:EC2_INSTANCE_LAUNCH
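For groups that are not managed by CloudFormation, the same notification could be attached through the AutoScaling API. The sketch below only builds the arguments for boto3's `put_notification_configuration`; the helper name and the example group name, region, and account ID are placeholders, and the topic name assumes the default `spoptimize-init`.

```python
def launch_notification_kwargs(group_name, region, account_id,
                               topic_name="spoptimize-init"):
    """Build keyword arguments for autoscaling.put_notification_configuration."""
    return {
        "AutoScalingGroupName": group_name,
        "TopicARN": "arn:aws:sns:{}:{}:{}".format(region, account_id, topic_name),
        # Only launch events are needed to trigger a replacement.
        "NotificationTypes": ["autoscaling:EC2_INSTANCE_LAUNCH"],
    }


# Usage with boto3 (not executed here; needs AWS credentials):
#   import boto3
#   boto3.client("autoscaling").put_notification_configuration(
#       **launch_notification_kwargs("my-asg", "us-east-1", "123456789012"))
```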
And in the console:
Newly launched instances will (eventually) be replaced by spot instances.
Spoptimize’s wait intervals may be overridden per AutoScaling group using tags.
spoptimize:min_protected_instances: Set a minimum number of on-demand instances for the autoscaling group.
spoptimize:init_sleep_interval: Initial wait interval after launch notification is received. Spoptimize …
spoptimize:spot_req_sleep_interval: Wait interval following spot instance request. Default is 30s.
spoptimize:spot_attach_sleep_interval: Wait interval following attachment of spot instance to …
spoptimize:spot_failure_sleep_interval: Wait interval between iterations following a spot instance …
Below are override tags I used during development. (Note: these are very aggressive so that I could watch
Spoptimize in action.)
Set via CloudFormation:
Tags:
  - Key: Name
    Value: !Ref AWS::StackName
    PropagateAtLaunch: true
  - Key: spoptimize:min_protected_instances
    Value: 1
    PropagateAtLaunch: false
  - Key: spoptimize:init_sleep_interval
    Value: 45
    PropagateAtLaunch: false
  - Key: spoptimize:spot_req_sleep_interval
    Value: 10
    PropagateAtLaunch: false
  - Key: spoptimize:spot_attach_sleep_interval
    Value: 125
    PropagateAtLaunch: false
  - Key: spoptimize:spot_failure_sleep_interval
    Value: 900
    PropagateAtLaunch: false
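The same override tags could also be applied to an existing group through the AutoScaling API. The sketch below builds the `Tags` list accepted by boto3's `create_or_update_tags`; the helper name and the example group name are hypothetical, and the tag names and values are the ones shown above.

```python
def override_tags(group_name, overrides):
    """Build the Tags list for autoscaling.create_or_update_tags.

    `overrides` maps a tag suffix (e.g. "init_sleep_interval") to its value.
    """
    return [
        {
            "ResourceId": group_name,
            "ResourceType": "auto-scaling-group",
            "Key": "spoptimize:" + name,
            "Value": str(value),  # tag values are always strings
            "PropagateAtLaunch": False,
        }
        for name, value in sorted(overrides.items())
    ]


# Usage with boto3 (not executed here; needs AWS credentials):
#   import boto3
#   boto3.client("autoscaling").create_or_update_tags(
#       Tags=override_tags("my-asg", {"min_protected_instances": 1,
#                                     "init_sleep_interval": 45}))
```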
And in the console: