This is a collection of notes covering Infrastructure as Code (IaC) principles, Terraform fundamentals, and automation best practices.
Infrastructure as Code Fundamentals
What is Infrastructure as Code?
Infrastructure as Code (IaC) is the practice of managing and provisioning computing infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.
Key Benefits:
- Version Control: Track infrastructure changes over time
- Reproducibility: Create identical environments consistently
- Automation: Reduce manual errors and deployment time
- Documentation: Infrastructure becomes self-documenting
- Cost Management: Better visibility into resource usage
IaC Principles
- Declarative: Describe the desired state, not the steps to achieve it
- Idempotent: Running the same configuration multiple times produces the same result
- Immutable: Replace infrastructure components rather than modifying them
- Version Controlled: All infrastructure definitions should be in source control
- Testable: Infrastructure should be validated before deployment
Terraform
Terraform is HashiCorp's IaC tool. You describe the desired infrastructure in configuration files, and Terraform calls the relevant provider APIs to create, update, or delete resources until the real infrastructure matches that description.
Core Concepts
Configuration Language
Terraform uses HashiCorp Configuration Language (HCL), which is declarative and describes an intended goal rather than the steps to reach that goal.
Basic Syntax:
# Example
resource "aws_vpc" "main" {
  cidr_block = var.base_cidr_block
}
# General form
<BLOCK TYPE> "<BLOCK LABEL>" "<BLOCK LABEL>" {
  # Block body
  <IDENTIFIER> = <EXPRESSION> # Argument
}
Blocks
Blocks are containers for other content and usually represent the configuration of some kind of object like a resource.
Common Block Types:
- resource: Infrastructure objects
- data: Read-only information
- provider: Plugin configurations
- variable: Input parameters
- output: Return values
- module: Reusable configurations
Getting Started with Terraform
1. Provider Configuration
Specify the cloud provider and authentication:
# Configure the AWS Provider
provider "aws" {
region = "us-west-2"
# Authentication via environment variables:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
# Or use AWS profiles/IAM roles
}
# Alternative provider examples
provider "google" {
project = "my-project-id"
region = "us-central1"
}
provider "azurerm" {
features {}
}
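The provider blocks above do not pin a Terraform or provider version, so different machines may resolve different releases. A minimal sketch of a terraform block that adds such constraints; the version numbers are illustrative assumptions, not recommendations taken from these notes:

```hcl
terraform {
  # Assumed minimum CLI version; adjust to whatever your team runs
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint, for illustration only
    }
  }
}
```

Running terraform init records the selected provider versions in .terraform.lock.hcl, which is normally committed alongside the configuration.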
2. Resource Declaration
Define the infrastructure components you want to create:
# VPC
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "main-vpc"
Environment = "production"
}
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "main-igw"
}
}
# Subnet
resource "aws_subnet" "public" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-west-2a"
map_public_ip_on_launch = true
tags = {
Name = "public-subnet"
Type = "public"
}
}
# Security Group
resource "aws_security_group" "web" {
name_prefix = "web-"
vpc_id = aws_vpc.main.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "web-security-group"
}
}
# EC2 Instance
resource "aws_instance" "web" {
ami = "ami-0c02fb55956c7d316" # Amazon Linux 2; AMI IDs are region-specific, so confirm this ID exists in us-west-2
instance_type = "t3.micro"
subnet_id = aws_subnet.public.id
vpc_security_group_ids = [aws_security_group.web.id]
user_data = <<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from Terraform!</h1>" > /var/www/html/index.html
EOF
tags = {
Name = "web-server"
}
}
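The subnet above is tagged public and its VPC has an internet gateway, but nothing in the configuration routes traffic to that gateway, so the web server would likely be unreachable from the internet. A sketch of the missing pieces, reusing the resource names above (the "public-rt" tag value is an assumption):

```hcl
# Route table that sends all non-local traffic to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "public-rt"
  }
}

# Attach the route table to the public subnet
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```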
Variables and Outputs
Input Variables
Make your configurations flexible and reusable:
# variables.tf
variable "environment" {
description = "Environment name"
type = string
default = "dev"
}
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.micro"
validation {
condition = contains([
"t3.micro", "t3.small", "t3.medium"
], var.instance_type)
error_message = "Instance type must be t3.micro, t3.small, or t3.medium."
}
}
variable "allowed_cidr_blocks" {
description = "CIDR blocks allowed to access the instance"
type = list(string)
default = ["0.0.0.0/0"]
}
variable "tags" {
description = "Default tags to apply to resources"
type = map(string)
default = {
Terraform = "true"
Owner = "infrastructure-team"
}
}
Using Variables:
resource "aws_instance" "web" {
instance_type = var.instance_type
tags = merge(var.tags, {
Name = "${var.environment}-web-server"
Environment = var.environment
})
}
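Values for these variables can come from a terraform.tfvars file, from -var or -var-file on the command line, or from TF_VAR_<name> environment variables. A sketch of a terraform.tfvars file for the variables declared above; all values are illustrative, and 203.0.113.0/24 is a documentation-only range standing in for a real office network:

```hcl
# terraform.tfvars (loaded automatically by terraform plan and apply)
environment   = "staging"
instance_type = "t3.small"

# Hypothetical restricted range replacing the open 0.0.0.0/0 default
allowed_cidr_blocks = ["203.0.113.0/24"]

tags = {
  Terraform = "true"
  Owner     = "infrastructure-team"
}
```

Setting instance_type to a value outside the validation list (for example "m5.large") makes terraform plan fail with the error_message defined in the variable block.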
Output Values
Return information about your infrastructure:
# outputs.tf
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
output "instance_public_ip" {
description = "Public IP of the web server"
value = aws_instance.web.public_ip
}
output "instance_dns" {
description = "Public DNS name of the web server"
value = aws_instance.web.public_dns
sensitive = false
}
output "database_endpoint" {
description = "Database endpoint"
value = aws_db_instance.main.endpoint
sensitive = true
}
Data Sources
Query existing infrastructure or external data:
# Get latest Amazon Linux AMI
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
# Get current AWS region
data "aws_region" "current" {}
# Get current AWS account ID
data "aws_caller_identity" "current" {}
# Use data sources in resources
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t3.micro"
tags = {
Name = "web-server"
Region = data.aws_region.current.name
AccountId = data.aws_caller_identity.current.account_id
}
}
Modules
Create reusable, composable infrastructure components:
Module Structure
modules/
└── vpc/
    ├── main.tf
    ├── variables.tf
    ├── outputs.tf
    └── README.md
Module Definition (modules/vpc/main.tf)
resource "aws_vpc" "this" {
cidr_block = var.cidr_block
enable_dns_hostnames = var.enable_dns_hostnames
enable_dns_support = var.enable_dns_support
tags = merge(var.tags, {
Name = var.name
})
}
resource "aws_internet_gateway" "this" {
vpc_id = aws_vpc.this.id
tags = merge(var.tags, {
Name = "${var.name}-igw"
})
}
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.this.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.name}-public-${count.index + 1}"
Type = "public"
})
}
Module Variables (modules/vpc/variables.tf)
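variable "name" {
  description = "Name prefix for VPC resources"
  type        = string
}
variable "cidr_block" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}
# main.tf references the two DNS variables below, so they must be declared here
variable "enable_dns_hostnames" {
  description = "Enable DNS hostnames in the VPC"
  type        = bool
  default     = true
}
variable "enable_dns_support" {
  description = "Enable DNS support in the VPC"
  type        = bool
  default     = true
}
variable "public_subnet_cidrs" {
  description = "CIDR blocks for public subnets"
  type        = list(string)
  default     = ["10.0.1.0/24", "10.0.2.0/24"]
}
variable "availability_zones" {
  description = "Availability zones for subnets"
  type        = list(string)
}
variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}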
Module Outputs (modules/vpc/outputs.tf)
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.this.id
}
output "public_subnet_ids" {
description = "IDs of the public subnets"
value = aws_subnet.public[*].id
}
output "internet_gateway_id" {
description = "ID of the Internet Gateway"
value = aws_internet_gateway.this.id
}
Using Modules
module "vpc" {
source = "./modules/vpc"
name = "production"
cidr_block = "10.0.0.0/16"
availability_zones = ["us-west-2a", "us-west-2b"]
public_subnet_cidrs = [
"10.0.1.0/24",
"10.0.2.0/24"
]
tags = {
Environment = "production"
Project = "web-app"
}
}
# Use module outputs
resource "aws_instance" "web" {
subnet_id = module.vpc.public_subnet_ids[0]
# ... other configuration
}
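The module call above hard-codes availability zone names, which ties it to one region. As a variation, the zones could be looked up with a data source; this is a sketch under the assumption that the region has at least two available zones:

```hcl
data "aws_availability_zones" "available" {
  state = "available"
}

module "vpc" {
  source = "./modules/vpc"

  name       = "production"
  cidr_block = "10.0.0.0/16"

  # Take the first two zones so the list length matches public_subnet_cidrs
  availability_zones  = slice(data.aws_availability_zones.available.names, 0, 2)
  public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
}
```

The module indexes availability_zones with count.index, so this list must be at least as long as public_subnet_cidrs.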
State Management
Local State
By default, Terraform stores state locally in terraform.tfstate. The core commands that create, update, and inspect that state:
# Initialize Terraform
terraform init
# Plan changes
terraform plan
# Apply changes
terraform apply
# Show current state
terraform show
# List resources in state
terraform state list
Remote State
For team environments, use remote state storage:
# Configure S3 backend
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "production/terraform.tfstate"
region = "us-west-2"
# DynamoDB table for state locking
dynamodb_table = "terraform-locks"
encrypt = true
}
}
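The S3 backend does not create its own bucket or lock table; both must exist before terraform init runs. A sketch of how they might be provisioned, typically from a separate bootstrap configuration. The names match the backend block above, and the billing mode is an assumption:

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state"
}

# Versioning keeps earlier state files recoverable
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST" # assumed; provisioned capacity also works
  hash_key     = "LockID"          # the S3 backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```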
State Commands
# Import existing resource
terraform import aws_instance.web i-1234567890abcdef0
# Remove resource from state (doesn't destroy)
terraform state rm aws_instance.web
# Move resource in state
terraform state mv aws_instance.web aws_instance.app
# Reconcile state with real infrastructure
# (the standalone terraform refresh command is deprecated in favor of this mode)
terraform apply -refresh-only
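Recent Terraform versions also offer declarative counterparts to some of these commands, which can be reviewed in a plan before they take effect. A sketch, assuming Terraform 1.5 or later; the two blocks are independent examples, and the aws_subnet.public_a address is hypothetical:

```hcl
# Counterpart of `terraform import` (Terraform 1.5+).
# A matching resource "aws_instance" "web" block must exist in the configuration.
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}

# Counterpart of `terraform state mv` (Terraform 1.1+).
# Records that a resource block was renamed, so it is not destroyed and recreated.
moved {
  from = aws_subnet.public
  to   = aws_subnet.public_a
}
```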
Terraform Workflow
1. Development Workflow
# 1. Initialize working directory
terraform init
# 2. Format and validate configuration
terraform fmt
terraform validate
# 3. Plan changes
terraform plan -out=tfplan
# 4. Apply changes
terraform apply tfplan
# 5. Clean up (when needed)
terraform destroy
2. CI/CD Integration
# GitHub Actions example
name: Terraform
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.0.0"
      - name: Terraform Init
        run: terraform init
      - name: Terraform Format Check
        run: terraform fmt -check
      - name: Terraform Validate
        run: terraform validate
      - name: Terraform Plan
        run: terraform plan
      - name: Terraform Apply
        # Apply only on pushes to main, never on pull requests
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve
Advanced Terraform Concepts
Workspaces
Manage multiple environments with the same configuration:
# Create and switch to workspace
terraform workspace new staging
terraform workspace new production
# List workspaces
terraform workspace list
# Switch workspace
terraform workspace select production
# Use workspace in configuration
resource "aws_instance" "web" {
instance_type = terraform.workspace == "production" ? "t3.medium" : "t3.micro"
tags = {
Environment = terraform.workspace
}
}
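The conditional above handles two cases; with more workspaces, a lookup table tends to read better. A sketch, where the map contents and the fallback size are assumptions:

```hcl
locals {
  # Hypothetical per-workspace sizes; "default" covers the built-in workspace
  instance_type_by_workspace = {
    default    = "t3.micro"
    staging    = "t3.small"
    production = "t3.medium"
  }

  # Fall back to t3.micro for any workspace not listed above
  instance_type = lookup(local.instance_type_by_workspace, terraform.workspace, "t3.micro")
}

resource "aws_instance" "web" {
  instance_type = local.instance_type
  # ... other configuration

  tags = {
    Environment = terraform.workspace
  }
}
```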
Provisioners
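Execute scripts on resources. HashiCorp's documentation describes provisioners as a last resort; prefer user_data or pre-built images where they are sufficient: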
resource "aws_instance" "web" {
# ... other configuration
# Remote exec provisioner
provisioner "remote-exec" {
inline = [
"sudo yum update -y",
"sudo yum install -y httpd",
"sudo systemctl start httpd"
]
connection {
type = "ssh"
user = "ec2-user"
private_key = file("~/.ssh/id_rsa")
host = self.public_ip
}
}
# Local exec provisioner
provisioner "local-exec" {
command = "echo Instance ${self.id} created at ${timestamp()}"
}
}
Dynamic Blocks
Generate repeated nested blocks:
resource "aws_security_group" "web" {
name_prefix = "web-"
dynamic "ingress" {
for_each = var.ingress_ports
content {
from_port = ingress.value
to_port = ingress.value
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
}
variable "ingress_ports" {
type = list(number)
default = [80, 443, 22]
}
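The example above opens every listed port, including SSH on 22, to 0.0.0.0/0. One possible refinement is to iterate over objects so each port carries its own source ranges. The variable name, resource name, and CIDR values below are hypothetical; 203.0.113.0/24 is a documentation range standing in for an admin network:

```hcl
variable "ingress_rules" {
  description = "Per-port ingress rules (illustrative)"
  type = list(object({
    port        = number
    cidr_blocks = list(string)
  }))
  default = [
    { port = 80, cidr_blocks = ["0.0.0.0/0"] },
    { port = 443, cidr_blocks = ["0.0.0.0/0"] },
    { port = 22, cidr_blocks = ["203.0.113.0/24"] },
  ]
}

resource "aws_security_group" "web_restricted" {
  name_prefix = "web-"
  vpc_id      = aws_vpc.main.id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
```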
Best Practices
1. Code Organization
terraform/
├── environments/
│   ├── dev/
│   ├── staging/
│   └── production/
├── modules/
│   ├── vpc/
│   ├── security/
│   └── compute/
├── shared/
│   ├── variables.tf
│   └── outputs.tf
└── scripts/
    ├── deploy.sh
    └── validate.sh
2. Security Best Practices
Sensitive Data Management:
# Supply secrets through environment variables (e.g. TF_VAR_database_password), not in .tf or .tfvars files
variable "database_password" {
description = "Database password"
type = string
sensitive = true
}
# Mark outputs as sensitive
output "database_password" {
value = random_password.db_password.result
sensitive = true
}
# Use AWS Secrets Manager
resource "aws_secretsmanager_secret" "db_password" {
name = "database-password"
}
resource "aws_secretsmanager_secret_version" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
secret_string = random_password.db_password.result
}
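The snippets above reference random_password.db_password without defining it. A minimal definition might look like the following; the length is an assumption, and the resource comes from the hashicorp/random provider:

```hcl
# Generates the password referenced as random_password.db_password.result
resource "random_password" "db_password" {
  length  = 24 # assumed length
  special = true
}
```

Marking a value as sensitive only redacts it from CLI output. The generated password is still written to the state file, so the state itself needs protection, for example an encrypted remote backend with restricted access.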
Resource Tagging:
locals {
common_tags = {
Environment = var.environment
Project = var.project_name
ManagedBy = "terraform"
Owner = var.team_email
CostCenter = var.cost_center
}
}
resource "aws_instance" "web" {
# ... other configuration
tags = local.common_tags
}
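Rather than repeating tags = local.common_tags on every resource, the AWS provider can apply a tag set to all resources it manages through its default_tags block. A sketch, assuming the common_tags local defined above:

```hcl
provider "aws" {
  region = "us-west-2"

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = local.common_tags
  }
}
```

Tags set directly on a resource are merged with the defaults, and the resource-level value wins when the same key appears in both.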
3. Performance Optimization
Use Data Sources Efficiently:
# Name a data source result once in locals and reuse it, instead of repeating the expression
locals {
availability_zones = data.aws_availability_zones.available.names
}
data "aws_availability_zones" "available" {
state = "available"
}
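Avoid Unnecessary Resource Churn:
# Prefer for_each over count for collections: instances are tracked by key,
# so removing one entry does not shift the others and force them to be recreated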
resource "aws_subnet" "private" {
for_each = var.private_subnets
vpc_id = aws_vpc.main.id
cidr_block = each.value.cidr
availability_zone = each.value.az
tags = {
Name = each.key
}
}
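The for_each example above reads each.value.cidr and each.value.az from var.private_subnets, which is not declared in these notes. One declaration that would fit that usage, with hypothetical subnet names and CIDR ranges:

```hcl
variable "private_subnets" {
  description = "Private subnets keyed by name (illustrative values)"
  type = map(object({
    cidr = string
    az   = string
  }))
  default = {
    "private-a" = { cidr = "10.0.11.0/24", az = "us-west-2a" }
    "private-b" = { cidr = "10.0.12.0/24", az = "us-west-2b" }
  }
}
```

With this shape, each subnet is addressed as aws_subnet.private["private-a"], so deleting one map entry affects only that subnet.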
Troubleshooting
Common Issues
State Lock:
# Force unlock (use with caution)
terraform force-unlock LOCK_ID
Import Existing Resources:
# Import resource to state
terraform import aws_instance.web i-1234567890abcdef0
Debug Mode:
# Enable detailed logging
export TF_LOG=DEBUG
terraform plan
Validate Configuration:
# Check formatting, validate the configuration, then plan
terraform fmt -check
terraform validate
# -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present
terraform plan -detailed-exitcode
Infrastructure Testing
1. Terraform Validate
terraform validate
2. Integration Tests with Terratest
// test/terraform_test.go
package test
import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)
func TestTerraformVPC(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../",
        Vars: map[string]interface{}{
            "environment": "test",
        },
    }
    // Destroy the real resources created by this test, even if an assertion fails
    defer terraform.Destroy(t, terraformOptions)
    // Runs terraform init and terraform apply against a real account
    terraform.InitAndApply(t, terraformOptions)
    vpcId := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcId)
}
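The Terratest example above applies real infrastructure. For cheaper checks, Terraform 1.6 and later include a native test framework that can assert against a plan. A sketch, assuming the root configuration contains the aws_vpc.main resource shown earlier; the file name and run label are hypothetical:

```hcl
# tests/vpc.tftest.hcl
variables {
  environment = "test"
}

run "vpc_uses_expected_cidr" {
  # plan only: nothing is created, but provider credentials are still needed
  command = plan

  assert {
    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR block should be 10.0.0.0/16."
  }
}
```

Running terraform test from the configuration directory executes every run block in the tests directory. Assertions made with command = plan can only use values known at plan time, which is why this checks a configured argument rather than the generated VPC ID.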
3. Policy as Code
# Using Sentinel (Terraform Cloud/Enterprise)
import "tfplan/v2" as tfplan
main = rule {
    all tfplan.resource_changes as _, rc {
        rc.type is not "aws_instance" or
        rc.change.after.instance_type in ["t3.micro", "t3.small"]
    }
}
Cost Management
1. Resource Tagging for Cost Allocation
locals {
cost_tags = {
CostCenter = var.cost_center
Project = var.project
Environment = var.environment
Owner = var.owner
}
}
2. Right-sizing Resources
variable "instance_types" {
type = map(string)
default = {
dev = "t3.micro"
staging = "t3.small"
production = "t3.medium"
}
}
resource "aws_instance" "web" {
instance_type = var.instance_types[var.environment]
}
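Indexing the map with an environment name that has no entry fails at plan time with a generic lookup error. A validation block on the variable can reject bad values earlier and with a clearer message. A sketch, assuming the three environments listed in the map above:

```hcl
variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}
```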
3. Scheduled Resources
# Auto-scaling schedule for non-production
resource "aws_autoscaling_schedule" "scale_down" {
count = var.environment != "production" ? 1 : 0
scheduled_action_name = "scale-down"
min_size = 0
max_size = 0
desired_capacity = 0
recurrence = "0 18 * * MON-FRI" # cron syntax, evaluated in UTC unless time_zone is set
autoscaling_group_name = aws_autoscaling_group.web.name
}
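A scale-down schedule on its own leaves the group at zero instances. A matching scale-up action could restore capacity at the start of the working day; the sizes and the 08:00 start time below are assumptions:

```hcl
resource "aws_autoscaling_schedule" "scale_up" {
  count                  = var.environment != "production" ? 1 : 0
  scheduled_action_name  = "scale-up"
  min_size               = 1 # assumed working-hours sizes
  max_size               = 2
  desired_capacity       = 1
  recurrence             = "0 8 * * MON-FRI"
  autoscaling_group_name = aws_autoscaling_group.web.name
}
```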
These notes cover the core concepts and practices of Infrastructure as Code with Terraform and are intended as a foundation for managing infrastructure programmatically.