A generic {@link CloudAdapter} that is built to work against any cloud provider. A cloud-specific {@link ScalingGroup} is used to provide primitives for managing the scaling group according to the API/protocol provided by the particular cloud provider. A {@link ScalingGroup} for the targeted cloud needs to be supplied at construction-time.
The configuration ({@link BaseCloudAdapterConfig}) for this adapter specifies how the {@link BaseCloudAdapter}:
- configures its {@link ScalingGroup} so that it can communicate with its cloud API ({@code scalingGroup}).
- provisions new instances when the scaling group needs to grow ({@code scaleUpConfig}).
- decommissions instances when the scaling group needs to shrink ({@code scaleDownConfig}).
- performs boot-time liveness checks when starting new group members ({@code liveness/bootTimeCheck}).
- performs periodic run-time liveness checks on existing group members ({@code liveness/runTimeCheck}).
- alerts system administrators (via email) when resize operations, liveness checks, etc. fail ({@code alerts}).
A configuration document may look as follows:
{
  "scalingGroup": {
    "name": "MyScalingGroup",
    "config": {
      "awsAccessKeyId": "ABC...XYZ",
      "awsSecretAccessKey": "abc...xyz",
      "region": "us-east-1"
    }
  },
  "scaleUpConfig": {
    "size": "m1.small",
    "image": "ami-018c9568",
    "keyPair": "instancekey",
    "securityGroups": ["webserver"],
    "bootScript": [
      "#!/bin/bash",
      "sudo apt-get update -qy",
      "sudo apt-get install -qy apache2"
    ]
  },
  "scaleDownConfig": {
    "victimSelectionPolicy": "CLOSEST_TO_INSTANCE_HOUR",
    "instanceHourMargin": 0
  },
  "liveness": {
    "loginUser": "ubuntu",
    "loginKey": "/path/to/instancekey.pem",
    "bootTimeCheck": {
      "command": "sudo service apache2 status | grep 'is running'",
      "retryDelay": 20,
      "maxRetries": 15
    },
    "runTimeCheck": {
      "command": "sudo service apache2 status | grep 'is running'",
      "period": 60,
      "maxRetries": 3,
      "retryDelay": 10
    }
  },
  "alerts": {
    "subject": "[elastisys:scale] scaling group alert for MyScalingGroup",
    "recipients": ["receiver@destination.com"],
    "sender": "noreply@elastisys.com",
    "mailServer": {
      "smtpHost": "mail.server.com",
      "smtpPort": 465,
      "authentication": {"userName": "john", "password": "secret"},
      "useSsl": true
    }
  },
  "poolUpdatePeriod": 60
}
The {@link BaseCloudAdapter} operates according to the {@link CloudAdapter} contract. Some details on how the {@link BaseCloudAdapter} satisfies the contract are summarized below.
Configuration:
When {@link #configure} is called, the {@link BaseCloudAdapter} expects a JSON document that validates against its JSON Schema (as returned by {@link #getConfigurationSchema()}). The entire configuration document is passed on to the {@link ScalingGroup} via a call to {@link ScalingGroup#configure}. The parts of the configuration that are of special interest to the {@link ScalingGroup}, such as cloud login details and scaling group name, are located under the {@code scalingGroup} key. The {@code scalingGroup/config} configuration key holds implementation-specific settings for the particular {@link ScalingGroup} implementation. An example configuration is given above.
Identifying group members:
When {@link #getMachinePool} is called, the scaling group members are identified via a call to {@link ScalingGroup#listMachines()}.
Handling resize requests:
When {@link #resizeMachinePool} is called, the actions taken depend on whether the resize request requires growing or shrinking the scaling group.
- scale up: start by sparing machines from termination if the termination queue is non-empty. For any remaining instances: request them to be started by the {@link ScalingGroup} via {@link ScalingGroup#startMachines}. The {@code scaleUpConfig} is passed to the {@link ScalingGroup}.
- scale down: start by terminating any machines in {@link MachineState#REQUESTED} state, since these are unlikely to have incurred cost yet. Any such machines are terminated immediately. If additional capacity is to be removed, select a victim according to the configured {@code victimSelectionPolicy} and schedule it for termination according to the configured {@code instanceHourMargin}. Each instance termination is delegated to {@link ScalingGroup#terminateMachine(String)}.
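
The resize handling described above can be illustrated with a minimal, self-contained sketch. It is not the actual {@link BaseCloudAdapter} code: the class {@code ResizePlanSketch}, the records {@code Member} and {@code Plan}, and the method {@code plan} are hypothetical names introduced only for illustration. The sketch assumes Java 16 or later (for records), assumes that machines already in the termination queue are not counted as active members, and interprets {@code CLOSEST_TO_INSTANCE_HOUR} as "pick the machine that is furthest into its current billing hour". The {@code instanceHourMargin} is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of resize planning; not the real adapter implementation. */
class ResizePlanSketch {
    /** Simplified stand-in for a group member (assumption, not the real machine type). */
    record Member(String id, boolean requested, long secondsIntoBillingHour) {}

    /** What a resize would do: spare, start, terminate immediately, or schedule for termination. */
    record Plan(int spared, int toStart, List<String> terminateNow, List<String> scheduled) {}

    static Plan plan(List<Member> active, int queuedForTermination, int desiredSize) {
        if (desiredSize < 0) {
            throw new IllegalArgumentException("desiredSize must be non-negative");
        }
        int delta = desiredSize - active.size();
        List<String> terminateNow = new ArrayList<>();
        List<String> scheduled = new ArrayList<>();
        if (delta >= 0) {
            // Scale up: first spare machines waiting in the termination queue,
            // then request new machines for whatever capacity is still missing.
            int spared = Math.min(delta, queuedForTermination);
            return new Plan(spared, delta - spared, terminateNow, scheduled);
        }
        // Scale down: machines that are still only REQUESTED go first.
        int excess = -delta;
        List<Member> candidates = new ArrayList<>();
        for (Member m : active) {
            if (m.requested() && terminateNow.size() < excess) {
                terminateNow.add(m.id());
            } else {
                candidates.add(m);
            }
        }
        // Remaining victims: furthest into the billing hour first.
        Comparator<Member> byElapsed = Comparator.comparingLong(Member::secondsIntoBillingHour);
        candidates.sort(byElapsed.reversed());
        int remaining = excess - terminateNow.size();
        for (int i = 0; i < remaining; i++) {
            scheduled.add(candidates.get(i).id());
        }
        return new Plan(0, 0, terminateNow, scheduled);
    }

    public static void main(String[] args) {
        List<Member> pool = List.of(
                new Member("i-1", false, 600),
                new Member("i-2", true, 0),
                new Member("i-3", false, 3300));
        System.out.println(plan(pool, 1, 6)); // grow from 3 to 6 with one queued machine
        System.out.println(plan(pool, 0, 1)); // shrink from 3 to 1
    }
}
```

In a real adapter, the entries in {@code terminateNow} and {@code scheduled} would presumably each result in a call to {@link ScalingGroup#terminateMachine(String)}, and {@code toStart} in a call to {@link ScalingGroup#startMachines}; the sketch only computes the plan.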
Tracking group member liveness:
The {@link BaseCloudAdapter} uses two tests to monitor the {@link LivenessState} of machines in the {@link ScalingGroup}:
- boot-time liveness test, which waits for a server to come live when a new server is provisioned in the scaling group.
- run-time liveness test, which is performed periodically to verify that scaling group members are still operational.
Both tests work by attempting to execute an SSH command against the machine (a limited number of times). If the exit code is zero, the machine is considered {@link LivenessState#LIVE}; otherwise it is considered {@link LivenessState#UNHEALTHY}.
Alerts:
If an {@code alerts} attribute is present in the configuration, the {@link BaseCloudAdapter} will send alert emails to notify the selected recipients of interesting events (such as errors, scale-ups/scale-downs, liveness state changes, etc.).
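
The retry behavior of the liveness tests described under "Tracking group member liveness" can be sketched as follows. This is an illustration under stated assumptions, not the adapter's actual code: {@code LivenessCheckSketch}, its {@code State} enum and the {@code check} method are hypothetical names, the SSH command is replaced by an {@code IntSupplier} that returns an exit code, {@code maxRetries} is assumed to mean "retries after the first attempt", and the delay is given in milliseconds to keep the example quick to run (the unit of {@code retryDelay} in the configuration is not stated here).

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of a retrying liveness test; not the real adapter implementation. */
class LivenessCheckSketch {
    /** Simplified stand-in for the two liveness outcomes named in the text. */
    enum State { LIVE, UNHEALTHY }

    /**
     * Runs the command until it exits with zero or the retries are used up.
     * Assumption: one initial attempt plus at most maxRetries further attempts.
     */
    static State check(IntSupplier command, int maxRetries, long retryDelayMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (command.getAsInt() == 0) {
                return State.LIVE; // exit code zero: the machine is considered live
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(retryDelayMillis); // wait before the next attempt
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return State.UNHEALTHY;
                }
            }
        }
        return State.UNHEALTHY; // every attempt returned a non-zero exit code
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake command: fails twice, then succeeds on the third attempt.
        IntSupplier flaky = () -> ++calls[0] < 3 ? 1 : 0;
        System.out.println(check(flaky, 3, 10L) + " after " + calls[0] + " attempts");
    }
}
```

Passing the command as an {@code IntSupplier} keeps the retry logic separate from the SSH transport, so the same loop could serve both the boot-time and the run-time test with different retry settings.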