

Cost Optimization through EC2 Automation: Leveraging AWS Lambda and EventBridge

Authored by Robb Lee


Are you keeping your development EC2 instance running all day?

If you could turn it on only when needed, you’d save costs and make management much easier. Surprisingly, many teams face this issue.

Our team faced the same challenge.

Our EC2 instances kept running on nights and weekends, even when nobody was working overtime, so we were being charged for hours in which they weren't actually in use. Over time, unnecessary operating costs piled up, and resource management became inefficient.

To solve this, we implemented automation using AWS Lambda and Amazon EventBridge. As a result, we were able to reduce our EC2 costs by 45% and significantly decrease management overhead.

How did we achieve this? Let’s walk through our approach.

The Reason We Decided to Implement Automation

After analyzing our AWS costs over the past month, we found that our development EC2 instances were incurring a significant amount of unnecessary expense.

  • EC2 instances for development were idle for more than 14 hours on average each day.
  • Instances continued running over the weekend even when not in use, causing unnecessary charges.
  • The manual process of starting and stopping EC2 instances was cumbersome and inefficient.

To address these issues, we decided to implement an automated EC2 scheduling system. With automation in place, we expected to run instances only when needed, and automatically shut them down when not in use, achieving both cost savings and operational efficiency.

Architecture for EC2 Automation

To efficiently manage EC2 instances for development, we built an automation system using AWS serverless services. This ensures that instances run only when needed and automatically shut down when not in use.

Key Components and Their Roles

Component | Role
Amazon EventBridge | Schedules the start and stop of EC2 instances.
AWS Lambda | Executes logic to control EC2 instances based on EventBridge triggers.
IAM | Manages permissions for Lambda to control EC2 instances.

Automation Workflow

  1. Start of Workday (10 AM) → EventBridge triggers Lambda to automatically start EC2 instances.
  2. End of Workday (7 PM) → EventBridge triggers Lambda to automatically stop EC2 instances.
  3. Exception Handling → If a developer needs to keep an instance running for overtime or weekends:
    • An 'Override' tag can be added to the EC2 instance to prevent automatic shutdown.
    • Lambda checks for this tag and skips the shutdown if the override is set.

With this architecture, EC2 instances can be managed automatically without manual intervention, reducing unnecessary costs while providing flexible exception handling.
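As a rough check on what this schedule can save, the arithmetic below compares weekly running hours under the 10 AM to 7 PM weekday schedule with an always-on instance. It is a back-of-the-envelope sketch: the function name is ours, and it counts instance hours only, so storage such as attached EBS volumes (which is still billed while an instance is stopped) and any instances left outside the schedule are not reflected.

```python
HOURS_PER_WEEK = 24 * 7


def scheduled_hours_per_week(start_hour, stop_hour, workdays=5):
    """Hours per week an instance runs on a fixed weekday schedule."""
    return (stop_hour - start_hour) * workdays


running = scheduled_hours_per_week(start_hour=10, stop_hour=19)
reduction = 1 - running / HOURS_PER_WEEK

print(f"Running hours per week: {running} of {HOURS_PER_WEEK}")
print(f"Reduction in instance hours: {reduction:.1%}")
```

The result is an upper bound for a single scheduled instance. The saving that shows up on a bill depends on how much of the total spend those instances represent.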

EC2 Automation Implementation Process

To automatically start and stop EC2 instances for development, the first step is to set up tags that identify which instances should be automated.

1. Configuring EC2 Tags

To distinguish between instances that should be automated and those that should not, specific tags need to be added to the target EC2 instances. The following AWS CLI command adds a Scheduled=True tag to a specific EC2 instance:

aws ec2 create-tags --resources i-0abcd1234efgh5678 --tags Key=Scheduled,Value=True

Only instances with this tag will follow the automated schedule for start and stop events. Instances without this tag are excluded from automation and must be managed manually. Additionally, to exclude a specific instance from automatic shutdown, an Override=True tag can be added. Before stopping an instance, the Lambda function checks for this tag and skips the shutdown if it is set.
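The tag rules above can be summarized as a small predicate. This is a minimal sketch rather than our production code: `should_act` is a hypothetical helper name, and it assumes tags arrive in the `[{"Key": ..., "Value": ...}]` shape that the EC2 API returns.

```python
def should_act(action, tags):
    """Decide whether the scheduler should start or stop an instance.

    action: "start" or "stop".
    tags:   list of {"Key": ..., "Value": ...} dicts (or None when untagged).
    """
    tag_map = {tag["Key"]: tag["Value"] for tag in tags or []}

    # Instances without Scheduled=True are never touched.
    if tag_map.get("Scheduled") != "True":
        return False

    # Override=True only protects an instance from being stopped.
    if action == "stop" and tag_map.get("Override") == "True":
        return False

    return action in ("start", "stop")


scheduled = [{"Key": "Scheduled", "Value": "True"}]
overridden = scheduled + [{"Key": "Override", "Value": "True"}]

print(should_act("stop", scheduled))    # True
print(should_act("stop", overridden))   # False
print(should_act("start", overridden))  # True
```

Tag keys and values are case-sensitive, so a tag such as Scheduled=true would not match this check.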

2. Configuring IAM Policies

For AWS Lambda to control EC2 instances, an appropriate IAM policy must be configured. This grants Lambda the necessary permissions to start and stop instances.

The following policy allows Lambda to start and stop EC2 instances:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:StartInstances",
                "ec2:StopInstances"
            ],
            "Resource": "*"
        }
    ]
}

To enhance security, you can modify this policy to restrict actions to only EC2 instances with the Scheduled=True tag.
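One way to apply that restriction is sketched below as a Python dictionary that can be serialized to JSON. It is an assumption about how such a policy might look, not the policy we deployed: it uses the `aws:ResourceTag` condition key on the start and stop actions and leaves `ec2:DescribeInstances` on `"*"`, because Describe actions do not support resource-level restrictions.

```python
import json

restricted_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Describe* calls cannot be scoped to individual resources.
            "Effect": "Allow",
            "Action": "ec2:DescribeInstances",
            "Resource": "*",
        },
        {
            # Start/stop is allowed only on instances tagged Scheduled=True.
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringEquals": {"aws:ResourceTag/Scheduled": "True"}
            },
        },
    ],
}

policy_json = json.dumps(restricted_policy, indent=4)
print(policy_json)
```

The printed JSON can be pasted into the IAM console in place of the broader policy.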


Creating an IAM Role and Assigning It to Lambda

  1. Go to the AWS IAM Console and create a new IAM role.
  2. Select AWS Service → Lambda as the trusted entity.
  3. Attach the above IAM policy to the role.
  4. Assign the created IAM role to your Lambda function.

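With these permissions in place, Lambda can start and stop EC2 instances. Note that the policy above uses "Resource": "*", so it applies to every instance in the account; Lambda's access is limited to designated instances only once the tag-based restriction is added.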
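For teams that prefer scripting over the console, the same four steps could look roughly like this with boto3. The role name, inline policy name, and function name are placeholders, and the calls need IAM and Lambda permissions, so nothing runs until `create_scheduler_role` is called explicitly.

```python
import json

# Trust policy that lets the Lambda service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_scheduler_role(ec2_policy, role_name="ec2-scheduler-role",
                          function_name="ec2-scheduler"):
    """Create the role, attach the EC2 policy inline, assign it to Lambda.

    role_name and function_name are hypothetical; ec2_policy is the policy
    document (as a dict) shown earlier in this section.
    """
    import boto3  # imported here so the module loads without AWS access

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="ec2-start-stop",  # hypothetical inline policy name
        PolicyDocument=json.dumps(ec2_policy),
    )
    # AWS managed policy that lets the function write CloudWatch Logs.
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
    )
    boto3.client("lambda").update_function_configuration(
        FunctionName=function_name,
        Role=role["Role"]["Arn"],
    )
    return role["Role"]["Arn"]
```

A newly created role can take a few seconds to propagate, so the final call may need a short retry if Lambda reports that it cannot assume the role yet.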

3. Implementing the Lambda Function

You can write a Lambda function to automatically start and stop EC2 instances based on their tags. It only targets instances with the Scheduled=True tag.

Lambda Code (Supports Both Start & Stop Operations)

The following code is a Lambda function that starts or stops EC2 instances, depending on the action value in the incoming event.

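import logging

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

REGION = 'ap-northeast-2'
ec2 = boto3.resource('ec2', region_name=REGION)


def get_instances(skip_override=False):
    """Return IDs of Scheduled=True instances; optionally skip Override=True ones."""
    scheduled = ec2.instances.filter(
        Filters=[{'Name': 'tag:Scheduled', 'Values': ['True']}]
    )
    ids = []
    for instance in scheduled:
        tags = {tag['Key']: tag['Value'] for tag in instance.tags or []}
        if skip_override and tags.get('Override') == 'True':
            continue
        ids.append(instance.id)
    return ids


def lambda_handler(event, context):
    action = event.get('action')
    if action not in ('start', 'stop'):
        logger.warning(f"Invalid action: {action!r}")
        return
    try:
        # Only the stop path honors the Override tag.
        instances = get_instances(skip_override=(action == 'stop'))
        if not instances:
            logger.info("No scheduled instances found")
            return
        if action == 'start':
            ec2.meta.client.start_instances(InstanceIds=instances)
        else:
            ec2.meta.client.stop_instances(InstanceIds=instances)
        logger.info(f"Successfully ran '{action}' on instances: {instances}")
    except Exception as e:
        logger.error(f"Failed to {action} instances: {e}")
        raise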

Key Features of the Lambda Function

  1. get_instances()
    • Retrieves all EC2 instances with the Scheduled=True tag.
  2. lambda_handler(event, context)
    • Checks the event['action'] value to determine whether to start or stop instances.
    • Executes start_instances() if action is "start".
    • Executes stop_instances() if action is "stop".
    • Logs a warning if an invalid action value is received.

Integrating EventBridge with Lambda

  • EC2 Start Event (Triggered at 10 AM)
    • EventBridge sends the event { "action": "start" } to Lambda.
  • EC2 Stop Event (Triggered at 7 PM)
    • EventBridge sends the event { "action": "stop" } to Lambda.

With this setup, EC2 instances automatically start at the beginning of the workday and shut down at the end, ensuring seamless automation. 🚀
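A rule delivers this payload only if the Lambda function is registered as its target with a constant JSON input, and EventBridge is allowed to invoke the function. A minimal boto3 sketch of that wiring follows. The rule names match the scheduling commands in the next section, while the function ARN, target IDs, and statement IDs are placeholders, and the rules must already exist when `wire_rules` is called.

```python
import json

RULE_ACTIONS = {
    "ec2-start-schedule": "start",
    "ec2-stop-schedule": "stop",
}


def build_target(function_arn, action):
    """Target entry that makes EventBridge send {"action": ...} to Lambda."""
    return {
        "Id": f"ec2-scheduler-{action}",          # hypothetical target ID
        "Arn": function_arn,
        "Input": json.dumps({"action": action}),  # constant event payload
    }


def wire_rules(function_arn, region="ap-northeast-2"):
    """Attach the Lambda function to both rules and allow invocation."""
    import boto3  # imported here so build_target can be used offline

    events = boto3.client("events", region_name=region)
    lambda_client = boto3.client("lambda", region_name=region)
    for rule_name, action in RULE_ACTIONS.items():
        rule_arn = events.describe_rule(Name=rule_name)["Arn"]
        events.put_targets(
            Rule=rule_name,
            Targets=[build_target(function_arn, action)],
        )
        # Resource-based permission so this rule may invoke the function.
        lambda_client.add_permission(
            FunctionName=function_arn,
            StatementId=f"allow-{rule_name}",
            Action="lambda:InvokeFunction",
            Principal="events.amazonaws.com",
            SourceArn=rule_arn,
        )
```

Keeping the payload in `build_target` means the same Lambda function serves both rules, with the action decided entirely by the rule that fired.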

4. Configuring EventBridge Scheduling

Amazon EventBridge allows you to schedule EC2 start and stop times, reducing the need for manual intervention and helping to lower operational costs.


Creating EventBridge Rules for EC2 Scheduling

Use the following AWS CLI commands to configure EventBridge to automatically start and stop EC2 instances:

# Start EC2 at 10:00 AM (UTC: 1:00 AM)
aws events put-rule \
    --name ec2-start-schedule \
    --schedule-expression "cron(0 1 ? * MON-FRI *)"
# Stop EC2 at 7:00 PM (UTC: 10:00 AM)
aws events put-rule \
    --name ec2-stop-schedule \
    --schedule-expression "cron(0 10 ? * MON-FRI *)"

Once the Lambda function is attached to each rule as a target, EC2 instances will automatically start during work hours and shut down after work, reducing unnecessary costs!
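EventBridge rule cron expressions are evaluated in UTC, which is why the commands above use 1 and 10 rather than 10 and 19. The helper below is an illustrative sketch (the function name and the default UTC+9 offset are our assumptions) that derives the expression from a local hour and makes the weekday shift explicit when the conversion crosses midnight.

```python
DAYS = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]


def weekday_cron_utc(local_hour, utc_offset_hours=9, minute=0):
    """Build an EventBridge cron expression (UTC) for a Mon-Fri local time."""
    shifted = local_hour - utc_offset_hours
    utc_hour = shifted % 24
    day_shift = shifted // 24  # -1: previous day in UTC, +1: next day
    first = DAYS[(1 + day_shift) % 7]  # local Monday expressed in UTC
    last = DAYS[(5 + day_shift) % 7]   # local Friday expressed in UTC
    return f"cron({minute} {utc_hour} ? * {first}-{last} *)"


print(weekday_cron_utc(10))  # cron(0 1 ? * MON-FRI *)
print(weekday_cron_utc(19))  # cron(0 10 ? * MON-FRI *)
print(weekday_cron_utc(8))   # cron(0 23 ? * SUN-THU *)
```

The last line shows the case to watch for: an 8 AM local start falls on the previous day in UTC, so the weekday range has to move with it.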

Challenges and Solutions Found During Operation

After implementing automation, we encountered a few issues and addressed them as follows:

Issue | Solution
Need to prevent EC2 from starting on holidays | Store holiday data in DynamoDB and check it in Lambda.
Occasionally need to start EC2 urgently | Add an Override tag to exempt instances from automatic shutdown.
Need to control the startup sequence | Use Step Functions to manage instance dependencies.

By implementing these solutions, we were able to effectively handle operational challenges without disruptions.
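As an illustration of the holiday check, the sketch below skips the start action when today's date (in UTC+9, matching the schedule above) is found in a holiday table. The table name `ec2-scheduler-holidays` and its `date` partition key are hypothetical; all we state above is that the holiday data lives in DynamoDB.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # fixed offset; no daylight saving time


def dynamodb_holiday_lookup(table_name="ec2-scheduler-holidays"):
    """Return a function that checks a date string against DynamoDB.

    Assumes a table whose partition key is a string attribute named "date"
    holding values such as "2025-01-01" (hypothetical schema).
    """
    import boto3  # imported here so the rest can be tested without AWS

    table = boto3.resource("dynamodb").Table(table_name)

    def is_holiday(date_str):
        return "Item" in table.get_item(Key={"date": date_str})

    return is_holiday


def should_skip_start(action, is_holiday, now=None):
    """True when a start event arrives on a holiday in local time."""
    if action != "start":
        return False
    now = now or datetime.now(KST)
    return is_holiday(now.astimezone(KST).strftime("%Y-%m-%d"))
```

Inside the handler this could be used as an early return, for example `if should_skip_start(event.get("action"), dynamodb_holiday_lookup()): return`. Stop events are never skipped, so an instance started manually on a holiday is still shut down in the evening.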

Benefits of Automation

After implementing EC2 scheduling automation, we achieved significant improvements:


45% Reduction in EC2 Costs

  • By automatically shutting down idle EC2 instances, we eliminated unnecessary expenses.

Automatic EC2 Shutdown Outside Work Hours

  • Instances no longer remained on during weekends or late hours when not needed.
  • Resources are now utilized efficiently, running only when necessary.

Reduced Operational Overhead

  • Developers no longer need to manually manage EC2 instances, reducing workload.
  • IAM policies and tag-based management enhanced both security and operational efficiency.

Most importantly, after implementing automation, we could finally leave work stress-free, even at night. 😊

Future Enhancements to Consider

To further improve our EC2 scheduling automation system, we are exploring the following additional features:

Slack Notification Integration

  • Send alerts to the team channel when EC2 instances start or stop automatically.
  • Improves operational transparency by allowing developers to monitor instance status in real time.
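A notification like this could be sent through a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below uses only the standard library; the webhook URL is a placeholder that would be loaded from configuration, and the message format is our own.

```python
import json
import urllib.request


def build_message(action, instance_ids):
    """Format the Slack message body for a start/stop run."""
    verb = "started" if action == "start" else "stopped"
    return {"text": f"EC2 scheduler {verb}: {', '.join(instance_ids)}"}


def notify_slack(webhook_url, action, instance_ids):
    """POST the message to a Slack incoming webhook (URL is caller-supplied)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(build_message(action, instance_ids)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `notify_slack` right after the start or stop call in the Lambda function would keep the channel in step with what the scheduler actually did.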

Auto-Scaling with CloudWatch

  • Analyze EC2 instance metrics such as CPU usage and network traffic.
  • Implement auto-scaling to shut down instances when usage is low and launch additional instances when demand increases.

Automated Cost Reports

  • Generate and email a monthly EC2 cost report.
  • Track cost savings and identify further optimization opportunities.

By implementing these enhancements, we expect to achieve smarter infrastructure management, maximize operational efficiency, and further reduce costs. 🚀

Closing

We introduced an EC2 automation approach using AWS Lambda and EventBridge. By implementing this system, our team successfully achieved both cost optimization and reduced operational overhead.

If you’re facing similar challenges, we highly recommend trying this automation for yourself. With efficient resource management and cost optimization, you can create a more streamlined and effective operational environment. 😊

Robb Lee
Technical Project Manager

Robb is a skilled Technical Project Manager specializing in data security, governance, and cloud-based solutions. At QueryPie, he supports the development and operation of solutions, ensuring organizations can efficiently manage and protect their data with scalable, secure, and high-performance technologies. Additionally, he helps clients seamlessly and safely manage their data through QueryPie.
