Detecting Infrastructure Drift with Terraform and Python

The Problem

Infrastructure drift is one of those problems that sneaks up on you. Someone makes a manual change in the AWS console, a script modifies a security group, or an auto-scaling event creates resources that Terraform doesn't know about. Before you know it, your Terraform state and actual infrastructure are out of sync.

What is Infrastructure Drift?

Drift occurs when the actual state of your cloud resources differs from what your Infrastructure as Code (IaC) tool believes the state to be. This can lead to:

Security vulnerabilities — ports opened manually that bypass your code review process
Deployment failures — Terraform plans that fail because reality doesn't match expectations
Cost overruns — orphaned resources that nobody knows about

Building AWSDriftGuard

I built AWSDriftGuard as a Python CLI tool that compares AWS resources against Terraform state files. The core architecture is straightforward:

def detect_drift(terraform_state, aws_resources):
    """Compare resource IDs per type and report DELETED and UNMANAGED drift."""
    drift_report = []

    # Only resource types present in the Terraform state are checked
    for resource_type, tf_resources in terraform_state.items():
        aws_actual = aws_resources.get(resource_type, [])

        # Build each ID set once instead of rebuilding a list per comparison
        tf_ids = {t["id"] for t in tf_resources}
        aws_ids = {a["id"] for a in aws_actual}

        # Find resources in Terraform but not in AWS (deleted)
        for tf_res in tf_resources:
            if tf_res["id"] not in aws_ids:
                drift_report.append({
                    "type": "DELETED",
                    "resource": tf_res,
                })

        # Find resources in AWS but not in Terraform (unmanaged)
        for aws_res in aws_actual:
            if aws_res["id"] not in tf_ids:
                drift_report.append({
                    "type": "UNMANAGED",
                    "resource": aws_res,
                })

    return drift_report
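
The function above expects the Terraform side as a dictionary keyed by resource type. A minimal sketch of how that dictionary might be built from a state file follows. It assumes the version 4 state layout (a top-level "resources" array whose entries carry "mode", "type", "name", and "instances"), and the name load_terraform_state is hypothetical, not necessarily how AWSDriftGuard does it. The address it builds is simplified and ignores modules and count/for_each indexes.

```python
import json
from collections import defaultdict


def load_terraform_state(path):
    """Group managed resources from a Terraform state file by resource type."""
    with open(path, encoding="utf-8") as handle:
        state = json.load(handle)

    resources_by_type = defaultdict(list)
    for resource in state.get("resources", []):
        # Skip data sources; only "managed" resources are owned by Terraform.
        if resource.get("mode") != "managed":
            continue
        # count/for_each produce several instances under one resource block.
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            if "id" not in attributes:
                continue
            resources_by_type[resource["type"]].append({
                "id": attributes["id"],
                # Simplified address: no module path or instance index.
                "address": f'{resource["type"]}.{resource["name"]}',
                "attributes": attributes,
            })
    return dict(resources_by_type)
```

The keys here are Terraform type names such as aws_instance, so the AWS-side dictionary would need to use the same keys for the comparison to line up.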

Supported Resources

The tool currently supports drift detection for:

EC2 instances — including tags, security groups, and instance types
S3 buckets — policies, versioning, and encryption settings
RDS instances — parameter groups and backup configurations
IAM roles — attached policies and trust relationships
Security groups — inbound and outbound rules
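
Checking settings like tags or instance types means comparing attributes, not just IDs. Here is a rough sketch of one way to do that for a resource that exists on both sides. The helper diff_attributes and its tracked_keys parameter are hypothetical, and treating every list as unordered is an assumption that will not hold for all attributes.

```python
def diff_attributes(tf_attributes, aws_attributes, tracked_keys):
    """Return {key: {"expected": ..., "actual": ...}} for tracked keys that differ."""
    changes = {}
    for key in tracked_keys:
        expected = tf_attributes.get(key)
        actual = aws_attributes.get(key)
        # Assumption: list order is not meaningful (e.g. security group IDs).
        if isinstance(expected, list) and isinstance(actual, list):
            if sorted(expected, key=repr) == sorted(actual, key=repr):
                continue
        elif expected == actual:
            continue
        changes[key] = {"expected": expected, "actual": actual}
    return changes


# Example: only the instance type differs; the reordered list is not drift.
tf = {"instance_type": "t3.micro", "vpc_security_group_ids": ["sg-1", "sg-2"]}
aws = {"instance_type": "t3.large", "vpc_security_group_ids": ["sg-2", "sg-1"]}
print(diff_attributes(tf, aws, ["instance_type", "vpc_security_group_ids"]))
# {'instance_type': {'expected': 't3.micro', 'actual': 't3.large'}}
```

A non-empty result could be reported as a third drift type alongside DELETED and UNMANAGED.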

Slack Integration

Detection is only useful if the right people know about it. I integrated the Slack API to send drift reports directly to the team's infrastructure channel:

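import os
from slack_sdk import WebClient

# Assumes a bot token with the chat:write scope in SLACK_BOT_TOKEN
client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

def send_slack_notification(drift_report, channel):
    blocks = format_drift_as_blocks(drift_report)
    client.chat_postMessage(
        channel=channel,
        blocks=blocks,
        # Fallback text for notifications and clients that cannot render blocks
        text=f"Infrastructure drift detected: {len(drift_report)} issues found",
    )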
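
The format_drift_as_blocks helper is not shown here, so the following is only a sketch of what it might look like, using Slack's Block Kit header and section blocks. The layout and the MAX_BLOCKS constant are assumptions. The truncation is there because Slack limits a message to 50 blocks.

```python
MAX_BLOCKS = 50  # Slack rejects messages with more than 50 blocks.


def format_drift_as_blocks(drift_report):
    """Build Block Kit blocks: one header, then one section per drift item."""
    blocks = [{
        "type": "header",
        "text": {
            "type": "plain_text",
            "text": f"Infrastructure drift: {len(drift_report)} issues",
        },
    }]
    # Reserve one slot for the header and one for a possible overflow note.
    visible = drift_report[:MAX_BLOCKS - 2]
    for item in visible:
        resource_id = item["resource"].get("id", "unknown")
        blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*{item['type']}*: `{resource_id}`",
            },
        })
    hidden = len(drift_report) - len(visible)
    if hidden > 0:
        blocks.append({
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"_...and {hidden} more_"},
        })
    return blocks
```

For large reports it may be more useful to group items by resource type or attach the full report as a file, but that is beyond this sketch.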

Running Modes

AWSDriftGuard supports two modes:

Detect Mode

Outputs drift results to the console. Perfect for local development and CI pipelines.

awsdriftguard detect --state ./terraform.tfstate --region us-east-1

Report Mode

Sends a formatted drift report to Slack, ideal for scheduled runs via cron or CloudWatch Events (now Amazon EventBridge).

awsdriftguard report --state ./terraform.tfstate --slack-channel "#infrastructure"
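
The two commands could be wired up with argparse subcommands. This is a sketch under assumptions, not the tool's actual entry point: the flag names come from the examples above, while build_parser, exit_code_for, the us-east-1 default, and the non-zero exit code on drift are all hypothetical choices.

```python
import argparse


def build_parser():
    """Create a parser with one subcommand per running mode."""
    parser = argparse.ArgumentParser(prog="awsdriftguard")
    subcommands = parser.add_subparsers(dest="mode", required=True)

    detect = subcommands.add_parser("detect", help="print drift to the console")
    detect.add_argument("--state", required=True, help="path to terraform.tfstate")
    # Assumed default; the real tool may require this flag.
    detect.add_argument("--region", default="us-east-1")

    report = subcommands.add_parser("report", help="send drift to Slack")
    report.add_argument("--state", required=True, help="path to terraform.tfstate")
    report.add_argument("--slack-channel", required=True)
    return parser


def exit_code_for(drift_report):
    """Return 1 when drift exists so a CI job fails, else 0 (assumed convention)."""
    return 1 if drift_report else 0


args = build_parser().parse_args(
    ["report", "--state", "./terraform.tfstate", "--slack-channel", "#infrastructure"]
)
print(args.mode, args.slack_channel)  # report #infrastructure
```

The channel name needs quoting on the command line because an unquoted # starts a shell comment, which would silently drop the argument.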
