Scalable Web App with ALB, ASG and WAF

Overview

I built a scalable web application architecture on AWS using Terraform. It features a custom VPC, public subnets, an Application Load Balancer (ALB), an Auto Scaling Group (ASG), and EC2 instances. I also integrated WAF protection and security groups to control access while supporting high availability and traffic spikes.

You can find the project in this GITHUB REPO. Here's what you need to run it:

Your AWS credentials
Your own key pair for logging into the EC2 instances
Your public IP address (knowing how to check it helps)

Check out the project's README file for instructions on how to download it and run it on your machine.

Cloud Architecture

I used these resources to build the architecture in the diagram above:

VPC built with 2 public subnets, each in its own Availability Zone
An internet gateway
EC2 instances, inside a Target Group
ASG pointing to the Target Group, with a minimum, maximum and desired number of EC2 instances
1 Route Table
Web Application Firewall pointing to the ALB
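
The networking pieces in the list above are not shown in the snippets below, so here is a minimal sketch of how they might be wired together. The subnet names public_subnet1a and public_subnet1b match the ones referenced later; the CIDR ranges and the names "main", "igw" and "public_rt" are assumptions for illustration, not necessarily what the repo uses.

```hcl
# Sketch only: CIDR ranges and the names "main", "igw", "public_rt" are assumed.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# Two public subnets, each in its own Availability Zone
resource "aws_subnet" "public_subnet1a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true
}

resource "aws_subnet" "public_subnet1b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.2.0/24"
  availability_zone       = data.aws_availability_zones.available.names[1]
  map_public_ip_on_launch = true
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

# One route table sending all non-local traffic to the internet gateway
resource "aws_route_table" "public_rt" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public_1a" {
  subnet_id      = aws_subnet.public_subnet1a.id
  route_table_id = aws_route_table.public_rt.id
}

resource "aws_route_table_association" "public_1b" {
  subnet_id      = aws_subnet.public_subnet1b.id
  route_table_id = aws_route_table.public_rt.id
}
```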

Code Snippets

Web Application Firewall & Association

									
# Rule for WAF
resource "aws_wafv2_web_acl" "web-acl" {
  name        = "app-WAF"
  scope       = "REGIONAL"
  description = "WAF with IPSet rule"
  
  default_action {
    allow {}
  }

  rule {
    name     = "allow-my-ip"
    priority = 1

    action {
      allow {}
    }

    statement {
      ip_set_reference_statement {
        arn = aws_wafv2_ip_set.my_ipSet.arn
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      sampled_requests_enabled   = true
      metric_name                = "allow-my-ip"
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    sampled_requests_enabled   = true
    metric_name                = "app-waf"
  }
}

# WAF-ALB association 

resource "aws_wafv2_web_acl_association" "waf-alb-association" {
  resource_arn = aws_lb.app_alb.arn
  web_acl_arn  = aws_wafv2_web_acl.web-acl.arn
}
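
The web ACL above references aws_wafv2_ip_set.my_ipSet, which is not shown. A minimal sketch of what that IP set could look like follows; the variable my_ip and the set name "my-ip-set" are hypothetical and only here for illustration.

```hcl
# Hypothetical variable: your public IPv4 address, without a CIDR suffix
variable "my_ip" {
  description = "Public IPv4 address to match in the WAF rule"
  type        = string
}

# IP set referenced by the "allow-my-ip" rule
resource "aws_wafv2_ip_set" "my_ipSet" {
  name               = "my-ip-set" # assumed name
  scope              = "REGIONAL"  # must match the scope of the web ACL
  ip_address_version = "IPV4"
  addresses          = ["${var.my_ip}/32"]
}
```

Note that with both the default action and the rule action set to allow, every request passes. To see the rule take effect, either the rule action or the default action would need to be block {}.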

Launch Template & Auto Scaling Group

									
# Create Launch Template for Auto-Scaling Group
resource "aws_launch_template" "launch_template_for_asg" {
  name_prefix   = "my-template-"
  image_id      = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  key_name      = var.key_name

  network_interfaces {
    security_groups = [aws_security_group.ec2-sg.id]
    associate_public_ip_address = true
  }

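  # Render the bootstrap script. templatefile() treats ${...} in the script as
  # Terraform interpolation, so shell variables must be written as $${...};
  # file("user_data.sh") is the simpler alternative when no values are passed in.
  user_data = base64encode(templatefile("${path.module}/user_data.sh", {}))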
}

# Create Auto Scaling Group
resource "aws_autoscaling_group" "asg_for_main_vpc" {
  desired_capacity     = 2
  max_size             = 3
  min_size             = 1
  vpc_zone_identifier  = [aws_subnet.public_subnet1a.id, aws_subnet.public_subnet1b.id]
  target_group_arns    = [aws_lb_target_group.alb-tg.arn]

  launch_template {
    id      = aws_launch_template.launch_template_for_asg.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "ASG-Instance"
    propagate_at_launch = true
  }

}
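
The launch template above depends on data.aws_ami.ubuntu and var.key_name, neither of which appears in the snippet. Here is a sketch of how they might be declared. The Ubuntu release in the name filter is an assumption; the repo may pin a different one.

```hcl
# Look up the most recent Ubuntu AMI published by Canonical.
# The 22.04 (jammy) release used in the filter is an assumption.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Name of an existing EC2 key pair, used for SSH access
variable "key_name" {
  description = "Name of the EC2 key pair to attach to instances"
  type        = string
}
```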

Application Load Balancer & Listener

	
# Create ALB
resource "aws_lb" "app_alb" {
  name               = "app-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [
    aws_subnet.public_subnet1a.id,
    aws_subnet.public_subnet1b.id
  ]

  tags = {
    Name = "app-alb"
  }
}

# Create ALB listener

resource "aws_lb_listener" "http_listener" {
  load_balancer_arn = aws_lb.app_alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb-tg.arn
  }
}
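
Both the listener and the ASG refer to aws_lb_target_group.alb-tg. A minimal sketch of that target group might look like this; the VPC reference aws_vpc.main and the health check values are assumptions rather than the repo's actual settings.

```hcl
# Target group that the listener forwards to and the ASG registers instances in.
# aws_vpc.main and the health check values are assumed for illustration.
resource "aws_lb_target_group" "alb-tg" {
  name     = "alb-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/"
    protocol            = "HTTP"
    matcher             = "200"
    interval            = 30
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}
```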

Software used

Terraform for provisioning platform and application infrastructure
Git version control
Bash for storing environment variables, managing key pairs and SSH-ing into EC2 instances
VSCode editor

Challenges I faced and lessons learned

This is a great project because I learned a lot!

Application vs transport layer protocols: I learned a lot about the differences between protocols at the various layers of the OSI model. When running terraform apply I would get errors like this: Error: updating Security Group (sg-0fc824fe31b5bbc9d) ingress rules [..] Invalid value 'http' for IP protocol. Unknown protocol. I couldn't work it out until I checked my security groups and saw that I had set protocol = "http". HTTP is an application-layer protocol (OSI layer 7), while a security group rule expects an IP protocol such as tcp, udp or icmp (or -1 for all protocols). TCP and UDP sit at the transport layer (layer 4), so HTTP traffic is allowed with protocol = "tcp" on port 80.
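
For illustration, a corrected ALB security group might look like the sketch below. The resource name alb_sg comes from the ALB snippet above; the VPC reference aws_vpc.main and the open 0.0.0.0/0 ingress are assumptions.

```hcl
# Security group for the ALB: HTTP is permitted by allowing TCP on port 80.
# aws_vpc.main and the open CIDR range are assumed for illustration.
resource "aws_security_group" "alb_sg" {
  name   = "alb-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp" # not "http": security groups take IP protocols
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # -1 means all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```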

Terraform provisioning can be slow: It's so cool that you can just type "terraform apply" and your whole infrastructure starts being created, but it isn't instantaneous and can take a few minutes. You will need to learn to be patient. Get used to hitting terraform apply, going to get some tea, and coming back to see the results.

Pointing: In the cloud, many resources point to another resource. For instance, the EC2 instances' security group references the ALB's security group, letting in only traffic that comes from the ALB, and the ALB has its own security group. The WAF also points to the ALB. You need to focus on the flow of data and see the order in which it arrives at each resource. Understanding where data is coming from will often dictate how a resource is created or how the resource uses it.
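
The security group "pointing" described here can be sketched as follows. The names ec2-sg and alb_sg appear in the snippets above; aws_vpc.main and the rule details are assumptions, and the repo may also include an SSH rule that is not shown.

```hcl
# Security group for the EC2 instances: the only HTTP source allowed is the
# ALB's security group, so instances cannot be reached directly on port 80.
# aws_vpc.main is an assumed name.
resource "aws_security_group" "ec2-sg" {
  name   = "ec2-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description     = "HTTP from the ALB only"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Because the ingress rule references a security group ID rather than a CIDR range, Terraform also infers that alb_sg must be created before ec2-sg.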

The order of what you build: In theory, the order in which you write your resources doesn't matter, because Terraform works out the dependency order itself. However, I find that understanding the connections between the resources is crucial. An engineer must ultimately ensure that the resources work not just individually but in unison. It therefore makes sense to me to create a Target Group before an ASG, given that the ASG will point to it. It also makes sense to create a Launch Template before the ASG, because the ASG requires one. After this, you can move on to the ALB, checking that the "desired" number of EC2 instances is in fact being provisioned and that requests to the ALB's DNS name are load balanced between them. Once that all works, you can move on to the firewall, first blocking your own IP and then allowing it, and confirming that it picks up your IP and returns a 403 Forbidden error when a request is blocked. It's similar to building a house: start with the foundations, then do the walls, and finish with the roof. I think it makes sense to build in the order of the data flow.
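
One way to make the ALB check easier is to expose its DNS name as an output. This is a small sketch, and the output name alb_dns_name is hypothetical.

```hcl
# Print the ALB's public DNS name after apply (output name is assumed)
output "alb_dns_name" {
  description = "Public DNS name of the Application Load Balancer"
  value       = aws_lb.app_alb.dns_name
}
```

After an apply, running curl "http://$(terraform output -raw alb_dns_name)" a few times should show responses coming from different instances, provided the user data script serves something that identifies the host.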