Multi-VPC Transit Gateway with Site-to-Site VPN
Overview
I built an automated hybrid cloud architecture connecting three AWS VPCs via Transit Gateway, with a StrongSwan IPsec VPN running on an EC2 instance that simulates an on-premises device, all deployed using Terraform.
You can find the code files in this GITHUB REPO. To run it you will first need:
- The AWS region you are deploying to
- Your own SSH key pair for logging into the EC2 instances
- Your public IP address (for SSH access)
Check out the project's README file for instructions on how to download it and run it on your machine.
Cloud Architecture
I used these resources to build the architecture shown in the diagram above:
- 3 VPCs in the cloud (VPC-A: 10.0.0.0/16, VPC-B: 10.1.0.0/16, VPC-C: 10.2.0.0/16)
- 1 VPC simulating on-premises network (10.100.0.0/16)
- AWS Transit Gateway providing hub-and-spoke connectivity between cloud VPCs
- Site-to-Site VPN connection between Transit Gateway and on-premises VPN endpoint
- StrongSwan IPsec VPN software on EC2 instance acting as Customer Gateway
Code Snippets
Transit Gateway with Route Table association
resource "aws_ec2_transit_gateway" "tgw" {
  description                     = "main-tgw"
  amazon_side_asn                 = 64512
  auto_accept_shared_attachments  = "disable"
  default_route_table_association = "disable"
  default_route_table_propagation = "disable"
  dns_support                     = "enable"

  tags = {
    Name        = "main-tgw"
    Environment = "prod"
  }
}

resource "aws_ec2_transit_gateway_route_table" "tgw_rt" {
  transit_gateway_id = aws_ec2_transit_gateway.tgw.id
}

resource "aws_ec2_transit_gateway_route_table_association" "tgw-rt-assoc-vpc-a" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.tgw-at-vpc-a.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.tgw_rt.id
}
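The association above references a VPC attachment resource that isn't shown in the snippets. A minimal sketch of what it could look like (the VPC and subnet names are assumptions based on the naming used elsewhere in this post):

```hcl
# Hypothetical sketch: attaches VPC-A to the Transit Gateway.
# vpc_id and subnet_ids reference names assumed from the repo's conventions.
resource "aws_ec2_transit_gateway_vpc_attachment" "tgw-at-vpc-a" {
  transit_gateway_id = aws_ec2_transit_gateway.tgw.id
  vpc_id             = aws_vpc.vpc-a.id
  subnet_ids         = [aws_subnet.subnet-vpc-a.id]

  tags = {
    Name = "tgw-at-vpc-a"
  }
}
```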
VPN Connection
resource "aws_vpn_connection" "vpn_on_prem_1a" {
  customer_gateway_id = aws_customer_gateway.cgw_on_prem_1a.id
  transit_gateway_id  = aws_ec2_transit_gateway.tgw.id
  type                = "ipsec.1"
  static_routes_only  = true

  tags = {
    Name = "vpn-on-prem-1a"
  }
}

resource "aws_ec2_transit_gateway_route" "route_to_onprem" {
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.tgw_rt.id
  destination_cidr_block         = "10.100.0.0/16"
  transit_gateway_attachment_id  = aws_vpn_connection.vpn_on_prem_1a.transit_gateway_attachment_id
}
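The VPN connection references a customer gateway that isn't shown above. A hedged sketch of that resource (the Elastic IP reference and the ASN value are assumptions; `bgp_asn` is required even when using static routing):

```hcl
# Hypothetical sketch: the Customer Gateway tells AWS the public IP of the
# on-prem StrongSwan instance (here, its Elastic IP).
# bgp_asn 65000 is an assumed private ASN; static routing ignores it.
resource "aws_customer_gateway" "cgw_on_prem_1a" {
  bgp_asn    = 65000
  ip_address = aws_eip.eip-on-prem-1a.public_ip
  type       = "ipsec.1"

  tags = {
    Name = "cgw-on-prem-1a"
  }
}
```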
EC2 with Source/Dest Check Disabled
resource "aws_instance" "ec2-on-prem-1a" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t3.micro"
  subnet_id              = aws_subnet.subnet-on-prem-1a.id
  key_name               = "MY_EC2_INSTANCE_KEYPAIR"
  vpc_security_group_ids = [aws_security_group.ec2-on-prem-1a-sg.id]
  user_data_base64       = base64encode(templatefile("user_data.sh", {}))
  source_dest_check      = false

  tags = {
    Name = "ec2-on-prem-1a"
  }
}

resource "aws_eip" "eip-on-prem-1a" {
  instance = aws_instance.ec2-on-prem-1a.id
  domain   = "vpc"

  tags = {
    Name = "eip-on-prem-1a"
  }
}
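For traffic to flow end to end, each VPC's own route table also needs routes pointing the other networks at the Transit Gateway. A sketch for VPC-A (the route table name is an assumption; the CIDRs match the architecture above):

```hcl
# Hypothetical sketch: from VPC-A, send traffic for VPC-B and the
# on-prem range to the Transit Gateway. Route table name is assumed.
resource "aws_route" "vpc_a_to_vpc_b" {
  route_table_id         = aws_route_table.rt-vpc-a.id
  destination_cidr_block = "10.1.0.0/16"
  transit_gateway_id     = aws_ec2_transit_gateway.tgw.id
}

resource "aws_route" "vpc_a_to_on_prem" {
  route_table_id         = aws_route_table.rt-vpc-a.id
  destination_cidr_block = "10.100.0.0/16"
  transit_gateway_id     = aws_ec2_transit_gateway.tgw.id
}
```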
Technologies used
- Infrastructure as Code: Terraform
- Version control: GitHub
- Scripting: Bash for environment variables, key pair management, and SSH-ing into EC2 instances
- VPN: StrongSwan (IPsec)
- Operating System: Ubuntu 22.04
- AWS components:
- VPC
- EC2
- Transit Gateway
- Site-to-Site VPN
- Elastic IP
- Security Groups
- Route Tables
Challenges & Key Takeaways
This is a fun project for anyone starting out: it seems easy at first, but you quickly have to make many moving parts work together.
HIGH AVAILABILITY: Understanding *Availability Zones (AZs)* and their relationship to VPCs and subnets was crucial for designing resilient architectures. A key learning: while a VPC spans all Availability Zones in a region, each subnet exists in only one AZ. In this project, I deployed each VPC with a single subnet for simplicity, but this creates a single point of failure. If that specific AZ experiences an outage, the entire VPC becomes unavailable. In production, you would deploy multiple subnets across different AZs within the same VPC - if one AZ goes down, resources in other AZs continue operating, ensuring high availability. This architectural decision highlighted the trade-off between learning project simplicity and production-grade resilience.
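As a sketch of the production-grade alternative described above, subnets can be spread across AZs with `count` (the CIDR layout and names are assumptions, not the repo's actual code):

```hcl
# Hypothetical sketch: one subnet per AZ inside VPC-A for high availability.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "vpc_a" {
  count             = 2
  vpc_id            = aws_vpc.vpc-a.id
  cidr_block        = cidrsubnet("10.0.0.0/16", 8, count.index)  # 10.0.0.0/24, 10.0.1.0/24
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "subnet-vpc-a-${count.index}"
  }
}
```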
VPN & SECURITY: Configuring the *Site-to-Site VPN* gave me hands-on experience with the IPsec protocol and establishing the IPsec tunnel. I learned about the two-phase handshake: IKE Phase 1 establishes the secure control channel and authenticates both sides using the pre-shared key, while IKE Phase 2 negotiates the actual data tunnel parameters. Watching the tunnel status change from "connecting" to "ESTABLISHED" in the StrongSwan logs made these abstract concepts concrete. Understanding Dead Peer Detection (DPD) was also important - it's how both sides monitor tunnel health and automatically re-establish connections if one side becomes unreachable.
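To make those phases concrete, a minimal strongSwan `ipsec.conf` connection for one tunnel might look like the following. All addresses, proposals, and the connection name are placeholders; the real values come from the VPN configuration file AWS generates for the connection.

```
# Hypothetical sketch of /etc/ipsec.conf for one AWS VPN tunnel.
conn aws-tunnel-1
    auto=start
    type=tunnel
    authby=secret              # authenticate with the pre-shared key from AWS
    keyexchange=ikev1          # IKE Phase 1: secure control channel
    ike=aes128-sha1-modp1024   # Phase 1 proposal (assumed; use AWS-provided values)
    esp=aes128-sha1            # Phase 2 (ESP) proposal for the data tunnel
    left=%defaultroute         # the on-prem StrongSwan instance
    leftsubnet=10.100.0.0/16
    right=203.0.113.10         # AWS tunnel outside IP (placeholder)
    rightsubnet=10.0.0.0/14    # cloud VPC CIDRs reachable through the tunnel
    dpdaction=restart          # DPD: restart the tunnel if the peer goes silent
    dpddelay=10s
```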
NETWORKING: Transit Gateway as a routing hub. Transit Gateway transformed what would have been a complex mesh of VPC peering connections into a clean hub-and-spoke architecture. Instead of creating individual peering connections between each VPC pair (which would have required three connections for three VPCs), Transit Gateway serves as a central router. All VPCs connect to the Transit Gateway, and it handles routing between them using a single route table. This design is far more scalable - adding a fourth or fifth VPC simply means attaching it to the existing Transit Gateway rather than creating multiple new peering connections.
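Note that because `default_route_table_propagation` is set to `"disable"` on the Transit Gateway above, each attachment's routes must be propagated (or added statically) into the TGW route table explicitly. A sketch for VPC-A's attachment (the attachment name matches the association snippet earlier):

```hcl
# Hypothetical sketch: propagate VPC-A's CIDR into the TGW route table,
# needed because default propagation is disabled on the TGW.
resource "aws_ec2_transit_gateway_route_table_propagation" "vpc_a" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.tgw-at-vpc-a.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.tgw_rt.id
}
```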
SOURCE/DESTINATION CHECKS: This was a critical learning moment. By default, AWS performs source/destination checks on EC2 instances, dropping any packets that aren't explicitly from or to that instance's IP address. This security feature prevented my on-premises EC2 from routing VPN traffic - packets destined for the cloud VPCs were being dropped because they weren't addressed to the EC2 itself. Setting source_dest_check = false in Terraform allowed the instance to act as a router, forwarding traffic between the VPN tunnel and the local subnet. This is essential for any EC2 instance acting as a NAT device, VPN gateway, or router.