Tutorial 04: Remote State and Locking
Prevent state corruption from concurrent modifications and enable team collaboration
Brutal Truth Up Front
Local state files are development-only toys. In production, they guarantee eventual state corruption from concurrent applies or lost laptops.
Remote state with locking isn’t optional for teams - it’s the minimum viable configuration. Without it, two engineers can simultaneously destroy each other’s changes, and you won’t know until production breaks.
Prerequisites
- Completed Tutorials 01-03
- AWS account with permissions to create S3 buckets and DynamoDB tables
- Understanding of state file purpose
What You’ll Build
An S3 backend for state storage with DynamoDB table for state locking. Then you’ll intentionally trigger lock conflicts to see the protection in action.
The Exercise
Step 1: Create Backend Infrastructure
We need to bootstrap remote state, which creates a chicken-and-egg problem: How do you use Terraform to create infrastructure for Terraform state?
Answer: Create the S3 bucket and DynamoDB table with local state first, then migrate.
Create bootstrap/main.tf:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

data "aws_caller_identity" "current" {}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-${data.aws_caller_identity.current.account_id}"

  tags = {
    Name      = "Terraform State Bucket"
    ManagedBy = "Terraform"
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  tags = {
    Name      = "Terraform State Locks"
    ManagedBy = "Terraform"
  }
}

output "s3_bucket_name" {
  value = aws_s3_bucket.terraform_state.id
}

output "dynamodb_table_name" {
  value = aws_dynamodb_table.terraform_locks.name
}
Apply this:
cd bootstrap
terraform init
terraform apply
Note the bucket and table names from the outputs - you'll need them in the next step.
Step 2: Configure Remote Backend
Create your main project in a separate directory:
cd ..
mkdir my-project
cd my-project
Create backend.tf:
terraform {
  backend "s3" {
    bucket         = "terraform-state-YOUR-ACCOUNT-ID" # Replace with your bucket
    key            = "my-project/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"
  }
}
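One gotcha worth knowing: the backend block cannot interpolate variables or data sources, so account-specific values must be hard-coded or supplied at init time. A common workaround is Terraform's partial backend configuration - the backend.hcl filename below is just a convention:

```hcl
# backend.hcl - values supplied at init time instead of in backend.tf
bucket         = "terraform-state-123456789012"
dynamodb_table = "terraform-state-locks"
region         = "us-east-1"
```

Initialize with terraform init -backend-config=backend.hcl, leaving only key and encrypt in the backend block. This keeps account IDs out of version control and lets the same code target multiple environments.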
Create simple main.tf:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "aws_s3_bucket" "example" {
  bucket = "example-bucket-${random_id.suffix.hex}"
}

output "bucket_name" {
  value = aws_s3_bucket.example.bucket
}
Step 3: Initialize with Remote Backend
terraform init
Terraform configures the S3 backend. Since this is a fresh project with no existing state, nothing is written to the bucket yet - the state object appears after the first operation that writes state.
Apply to create resources:
terraform apply
Now run ls -la in your directory: no local terraform.tfstate file. Check S3 instead - you'll see my-project/terraform.tfstate stored there.
Step 4: Test State Locking
Open two terminal windows to the same directory.
Terminal 1:
terraform apply
When it prompts for confirmation, don't type yes yet. Terraform acquired the state lock when the operation started and holds it through the confirmation prompt.
Terminal 2:
terraform plan
You’ll see:
Error: Error acquiring the state lock

Lock Info:
  ID:        abc123-def456-...
  Path:      terraform-state-YOUR-ACCOUNT-ID/my-project/terraform.tfstate
  Operation: OperationTypeApply
  Who:       your-user@hostname
  Version:   1.5.0
  Created:   2025-02-05 10:30:00 UTC
The lock prevents concurrent modifications. Check DynamoDB table terraform-state-locks - you’ll see an entry while Terminal 1 holds the lock.
Type yes in Terminal 1 to complete. The lock releases. Now Terminal 2 can acquire it.
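Under the hood, the DynamoDB lock is just an atomic conditional write: create an item keyed by LockID only if no item with that key exists. Here's a minimal in-memory sketch of the idea - not Terraform's actual code; StateLock and its methods are illustrative names:

```python
import uuid

class StateLock:
    """Toy model of conditional-write locking, analogous to a DynamoDB
    PutItem guarded by an attribute_not_exists(LockID) condition."""

    def __init__(self):
        self._locks = {}  # stands in for the DynamoDB table

    def acquire(self, lock_id, who):
        # Conditional put: fails if an item with this key already exists.
        if lock_id in self._locks:
            holder = self._locks[lock_id]
            raise RuntimeError(f"state locked by {holder['who']} (ID {holder['id']})")
        self._locks[lock_id] = {"id": str(uuid.uuid4()), "who": who}
        return self._locks[lock_id]["id"]

    def release(self, lock_id, lock_token):
        # Delete only if we still hold the lock (token matches).
        if self._locks.get(lock_id, {}).get("id") == lock_token:
            del self._locks[lock_id]

table = StateLock()
key = "terraform-state-bucket/my-project/terraform.tfstate"

token = table.acquire(key, "engineer-a@laptop")   # Terminal 1 gets the lock
try:
    table.acquire(key, "engineer-b@laptop")       # Terminal 2 fails
except RuntimeError as err:
    print(err)

table.release(key, token)                         # apply finishes, lock released
table.acquire(key, "engineer-b@laptop")           # now Terminal 2 succeeds
```

Because DynamoDB performs the existence check and the write as one atomic operation, two engineers can never both observe "no lock" and proceed - which is exactly the race local state can't prevent.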
Step 5: Force-Unlock (Emergency Only)
If an apply crashes and leaves a dangling lock:
terraform force-unlock LOCK-ID
Replace LOCK-ID with the ID from the error message.
Warning: Only do this if you’re certain no other process is actually running. Force-unlocking during an active apply causes state corruption.
The Break (Intentional Failure Scenarios)
Scenario 1: Lost Lock Detection
Simulate a crashed process:
# Terminal 1
terraform apply
# Type 'yes'
# While it's running, kill the process hard: close the terminal window or
# run kill -9 <pid> from another shell. (A single Ctrl+C lets Terraform
# shut down gracefully and release the lock; Ctrl+Z only suspends the
# process, which still holds the lock.)
# Terminal 2
terraform plan
You’ll see a lock error. The lock persists even though the process died. This is intentional - Terraform doesn’t know if the process is dead or just slow.
Force unlock:
terraform force-unlock LOCK-ID
Scenario 2: Backend Configuration Mismatch
Change backend.tf to wrong bucket:
terraform {
  backend "s3" {
    bucket         = "nonexistent-bucket-name"
    key            = "my-project/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"
  }
}
Try to init:
terraform init -reconfigure
Terraform errors because the bucket doesn't exist, so you're stopped early. The more dangerous variant is pointing at a wrong bucket that does exist: init succeeds, and you quietly start a new, empty state in the wrong location.
The Recovery
Migrating Local State to Remote
If you have an existing project with local state:
- Add backend configuration to existing project
- Run init with migrate flag:
terraform init -migrate-state
Terraform detects local state, asks to copy to remote backend, and transfers it.
Recovering from Lost State Lock
If you’re certain no other process is running:
terraform force-unlock LOCK-ID
Check who created the lock before force-unlocking:
# The error message shows:
# Who: larue@laptop
# Verify that's not an active session
State File Versioning Recovery
If someone corrupts state, S3 versioning saves you:
aws s3api list-object-versions \
  --bucket terraform-state-YOUR-ACCOUNT-ID \
  --prefix my-project/terraform.tfstate

aws s3api get-object \
  --bucket terraform-state-YOUR-ACCOUNT-ID \
  --key my-project/terraform.tfstate \
  --version-id VERSION-ID \
  terraform.tfstate.recovered
Replace the current state with the recovered version:
terraform state push terraform.tfstate.recovered
If the recovered file has an older serial than the current state, Terraform refuses the push; add -force only once you're certain it's the version you want. Then run terraform plan to confirm the recovered state matches your real infrastructure.
Exit Criteria
You understand this tutorial if you can:
- Configure S3 backend with DynamoDB locking
- Explain why DynamoDB locking prevents state corruption
- Identify when force-unlock is appropriate (almost never)
- Migrate existing local state to remote backend safely
- Recover previous state versions using S3 versioning
Key Lessons
- Remote state is required for teams - local state doesn’t scale
- State locking prevents corruption from concurrent operations
- S3 versioning is your safety net when mistakes happen
- Force-unlock is dangerous - only use when certain no process is running
- Backend configuration lives in code - treat it like infrastructure
Why This Matters in Production
On DoD programs with multiple engineers:
Without remote state:
- Engineer A applies changes
- Engineer B applies changes 5 minutes later
- Engineer B’s state file overwrites A’s
- Engineer A’s resources exist but aren’t in state
- Next apply destroys A’s resources because Terraform doesn’t know about them
With remote state + locking:
- Engineer A applies, acquires lock
- Engineer B’s apply waits for lock
- Sequential operations, no corruption
- State versioning provides audit trail for compliance
Real incident: Developer forgot to configure backend, deployed with local state, lost laptop. $12K in AWS resources running with no Terraform management. Had to manually import everything.
FedRAMP High Configuration
Production backend for classified environments:
terraform {
  backend "s3" {
    bucket         = "terraform-state-fedramp-prod"
    key            = "infrastructure/prod/terraform.tfstate"
    region         = "us-gov-west-1"
    encrypt        = true
    kms_key_id     = "arn:aws-us-gov:kms:us-gov-west-1:123456789012:key/abc-def"
    dynamodb_table = "terraform-state-locks-fedramp"

    # GovCloud requires role assumption
    role_arn = "arn:aws-us-gov:iam::123456789012:role/TerraformBackendAccess"
  }
}
Additional requirements:
- KMS encryption with GovCloud key
- IAM role assumption (not access keys)
- MFA for state modification
- CloudTrail logging of all state access
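The backend role needs only a narrow set of permissions. Here's a least-privilege sketch based on the permissions Terraform's S3 backend documentation lists - bucket, key, and account values mirror the example above, so adjust them to your environment:

```hcl
data "aws_iam_policy_document" "terraform_backend" {
  # List the state bucket
  statement {
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws-us-gov:s3:::terraform-state-fedramp-prod"]
  }

  # Read and write only this project's state object
  statement {
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws-us-gov:s3:::terraform-state-fedramp-prod/infrastructure/prod/terraform.tfstate"]
  }

  # Acquire and release locks in the lock table
  statement {
    actions   = ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]
    resources = ["arn:aws-us-gov:dynamodb:us-gov-west-1:123456789012:table/terraform-state-locks-fedramp"]
  }
}
```

With kms_key_id set, the role also needs kms:Encrypt, kms:Decrypt, and kms:GenerateDataKey on that key.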
Next Steps
Tutorial 05: Importing Existing Infrastructure - Learn to adopt unmanaged resources into Terraform without recreation.
Cleanup
terraform destroy
# Optionally destroy bootstrap resources
cd ../bootstrap
terraform destroy
Note: Destroying the S3 bucket requires it to be empty. If state files exist, remove them first:
aws s3 rm s3://terraform-state-YOUR-ACCOUNT-ID --recursive
Because versioning is enabled, this removes only the current versions. Delete the old object versions as well (or set force_destroy = true on the bucket resource) before terraform destroy will succeed.
Need Help Implementing This?
I help government contractors and defense organizations modernize their infrastructure using Terraform and AWS GovCloud. With 15+ years managing DoD systems and active Secret clearance, I understand compliance requirements that commercial consultants miss.