AWS Network Firewall Routing

Recently at work we started using AWS Network Firewall to meet a compliance objective, and in doing so I learned some of the tricks and gotchas around how to set up your network routing in various network architectures. AWS has some of this documented pretty well and some of it documented pretty terribly. In this post I’ll explain what I learned and hopefully the next person who Googles it will end up here.

Before you started using AWS Network Firewall

Prior to using AWS Network Firewall, you probably had one or more private subnets which run your workloads, and one or more public subnets which run your NAT Gateways. Your private subnet routes 0.0.0.0/0 to the NAT Gateway, and your public subnet routes 0.0.0.0/0 to an IGW. It’s a pretty common architecture, like this:

architecture 1
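For reference, here is a minimal Terraform sketch of that starting point. The resource names (aws_route_table.private, aws_route_table.public, aws_nat_gateway.main) are placeholders I made up for illustration, and it assumes your VPC, NAT Gateway and IGW are already defined elsewhere. Subnet associations are left out to keep it short.

```hcl
# Route table for the private (workload) subnets.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.default.id
}

# Private subnets send internet-bound traffic to the NAT Gateway.
resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id # assumed to exist
}

# Route table for the public subnets (where the NAT Gateway lives).
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.default.id
}

# Public subnets send internet-bound traffic straight to the IGW.
resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.foo.id # assumed to exist
}
```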

Then you add a subnet for AWS Network Firewall

Then when you add AWS Network Firewall, it needs its own subnet. This is the architecture that AWS documents here.

Your private subnet still routes 0.0.0.0/0 to a NAT Gateway in your public subnet, exactly as it did before.

But now your public subnet routes 0.0.0.0/0 to the Firewall endpoint interface, and your firewall subnet routes 0.0.0.0/0 to your IGW. Like this (and this is the same idea as seen in the AWS doc linked above):

architecture 2
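In Terraform, the two changed routes might look roughly like this. This is a sketch, not a drop-in config: the route table names and the firewall_endpoint_id variable are hypothetical, and you would supply the real endpoint ID from your own firewall.

```hcl
# Hypothetical input: the VPC endpoint ID of the firewall endpoint.
variable "firewall_endpoint_id" {
  type = string
}

# The public subnet's default route now points at the firewall endpoint
# instead of the IGW.
resource "aws_route" "public_to_firewall" {
  route_table_id         = aws_route_table.public.id # assumed to exist
  destination_cidr_block = "0.0.0.0/0"
  vpc_endpoint_id        = var.firewall_endpoint_id
}

# The firewall subnet is now the one that talks to the IGW.
resource "aws_route" "firewall_to_igw" {
  route_table_id         = aws_route_table.firewall.id # assumed to exist
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.foo.id
}
```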

The first gotcha, reverse routing from the IGW

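Here’s the first gotcha that I ran into. In my opinion AWS documents this pretty terribly, and it’s a concept that most engineers have probably never heard of. Your IGW needs its own routing table to handle reverse routing (e.g. internet traffic coming back into your VPC, which needs to route through the network firewall). Without it, the IGW delivers return traffic straight to the public subnet over the VPC’s local route, so the firewall only ever sees the outbound half of each connection.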

You need an edge routing table, which is a separate routing table that is not attached to any subnets. It gets attached to your IGW. If you do this in the AWS Console, create a new routing table called production-edge (or whatever). Look in the Edge Associations tab of the routing table, and attach the routing table to your IGW. Then create a route within this routing table which routes your public subnet CIDR (10.0.10.0/24 in my diagram above) to your network firewall.

Or if you use Terraform, you can do it like this:

resource "aws_route_table" "edge" {
  vpc_id = aws_vpc.default.id

  tags = {
    Name = "production-edge"
  }
}

resource "aws_route" "first_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.10.0/24"
  # Placeholder: replace with a reference to your firewall's VPC endpoint ID.
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL
}

resource "aws_route_table_association" "edge" {
  gateway_id     = aws_internet_gateway.foo.id
  route_table_id = aws_route_table.edge.id
}

Referencing the ENDPOINT_ID_OF_YOUR_FIREWALL can be kind of tricky in Terraform. This article isn’t intended as a Terraform article per se, but this GitHub comment was helpful for me.
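As a rough illustration of one way to do it (not necessarily what the linked comment suggests): the AWS provider exposes the endpoint IDs on the firewall resource’s firewall_status attribute, so you can build a map of AZ name to endpoint ID. The resource name aws_networkfirewall_firewall.example and the AZ name us-east-1a are assumptions for this sketch.

```hcl
locals {
  # Map of availability zone name => firewall endpoint ID, built from the
  # sync_states that the firewall resource reports for each subnet.
  firewall_endpoint_by_az = {
    for state in aws_networkfirewall_firewall.example.firewall_status[0].sync_states :
    state.availability_zone => state.attachment[0].endpoint_id
  }
}

resource "aws_route" "edge_first_az" {
  route_table_id         = aws_route_table.edge.id
  destination_cidr_block = "10.0.10.0/24"
  # "us-east-1a" is a hypothetical AZ; use whichever AZ holds this subnet.
  vpc_endpoint_id        = local.firewall_endpoint_by_az["us-east-1a"]
}
```

Keying the map by AZ makes it harder to accidentally point a subnet at an endpoint in a different AZ.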

And if you only use 1 AZ, that’s all there is to it. You should now be done.

If you use multiple AZs with one NAT Gateway in each AZ

If you use multiple AZs with one NAT Gateway in each AZ and one firewall endpoint in each AZ, then the architecture and the routing are pretty straightforward. You just do the same thing as seen above, but in more AZs, with each public subnet routing to the firewall endpoint in its own AZ. Like this:

architecture 3
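For the forward direction, that means one public route table per AZ, each with a default route to the firewall endpoint in the same AZ. A possible sketch, assuming a hypothetical firewall_endpoint_ids variable that maps AZ names to endpoint IDs (subnet associations omitted):

```hcl
# Hypothetical input: AZ name => firewall endpoint ID in that AZ.
variable "firewall_endpoint_ids" {
  type = map(string)
}

# One public route table per AZ.
resource "aws_route_table" "public_per_az" {
  for_each = var.firewall_endpoint_ids
  vpc_id   = aws_vpc.default.id

  tags = {
    Name = "production-public-${each.key}"
  }
}

# Each public route table sends 0.0.0.0/0 to the endpoint in its own AZ.
resource "aws_route" "public_to_firewall_per_az" {
  for_each               = var.firewall_endpoint_ids
  route_table_id         = aws_route_table.public_per_az[each.key].id
  destination_cidr_block = "0.0.0.0/0"
  vpc_endpoint_id        = each.value
}
```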

As for the reverse / edge routing in this scenario, you still only have one production-edge routing table, but now you add more routes to it for each additional subnet / each additional firewall endpoint ID.

resource "aws_route" "first_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.10.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ1
}

resource "aws_route" "second_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.11.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ2
}

resource "aws_route" "third_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.12.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ3
}
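If you would rather not copy and paste a block per AZ, the same three routes can be generated with for_each. This is only a sketch: the AZ names and the firewall_endpoint_ids variable are made up for illustration, and the CIDRs are the ones from my diagram.

```hcl
# Hypothetical input: AZ name => firewall endpoint ID in that AZ.
variable "firewall_endpoint_ids" {
  type = map(string)
}

locals {
  # AZ name => public subnet CIDR in that AZ (AZ names are placeholders).
  public_cidr_by_az = {
    "us-east-1a" = "10.0.10.0/24"
    "us-east-1b" = "10.0.11.0/24"
    "us-east-1c" = "10.0.12.0/24"
  }
}

# One edge route per AZ: traffic returning to a public subnet goes through
# the firewall endpoint in that same AZ.
resource "aws_route" "edge_per_az" {
  for_each               = local.public_cidr_by_az
  route_table_id         = aws_route_table.edge.id
  destination_cidr_block = each.value
  vpc_endpoint_id        = var.firewall_endpoint_ids[each.key]
}
```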

One other architecture example

What if you have subnets across multiple AZs, but you only deploy NAT Gateways and Firewall endpoints in one AZ, presumably because you’re working with a dev/test/staging VPC and high availability isn’t important? Like this:

architecture 4

In that case, you still need to set up the reverse routing for all three public subnet CIDRs, but they would all route to the endpoint in AZ 1. Like this:

resource "aws_route" "first_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.10.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ1
}

resource "aws_route" "second_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.11.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ1
}

resource "aws_route" "third_az" {
  route_table_id            = aws_route_table.edge.id
  destination_cidr_block    = "10.0.12.0/24"
  vpc_endpoint_id           = ENDPOINT_ID_OF_YOUR_FIREWALL_IN_AZ1
}
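Since all three routes share one target here, a more compact version is possible. A sketch, assuming a hypothetical az1_firewall_endpoint_id variable:

```hcl
# Hypothetical input: the ID of the only firewall endpoint (in AZ 1).
variable "az1_firewall_endpoint_id" {
  type = string
}

# Every public subnet CIDR returns through the single endpoint in AZ 1.
resource "aws_route" "edge_single_az" {
  for_each               = toset(["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"])
  route_table_id         = aws_route_table.edge.id
  destination_cidr_block = each.value
  vpc_endpoint_id        = var.az1_firewall_endpoint_id
}
```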

It would be cool if a trillion-dollar company like Amazon had documented this more thoroughly themselves, instead of leaving it to random people on the internet to do it for them for free, but here we are. Hopefully this helps you.