In this post I decided to continue exploring AWS VPC connectivity and talk about how to connect VPCs. If your VPCs are in the same region, you could simply use VPC peering and be done with it. But if your VPCs are located in different regions, you'll need to explore your options.
I decided to test and document one of the more inexpensive and simple options I could think of: full mesh connectivity between VPCs using IPsec site-to-site tunnels. The inexpensive part is taken care of by using StrongSwan 5.4.0 on CentOS 7 to implement this.
Basically, the scenario here is that I want to connect two VPCs in different regions:
- us-east-1 VPC with IP addresses in 172.16.0.0/16;
- us-east-2 VPC with IP addresses in 172.31.0.0/16.
It is a simple exercise to extrapolate this configuration to have additional VPCs connected to these two via full mesh, so I won't get into the specifics of this here. Consider that as homework.
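As a starting point for that homework, here is a minimal Python sketch of how a full mesh grows: one tunnel per pair of VPCs, plus a check that no two address ranges overlap (overlapping ranges could not be routed between VPCs). The third VPC and the function names are hypothetical and only there for illustration.

```python
from ipaddress import ip_network
from itertools import combinations

# The first two entries come from this post; the third is a hypothetical
# extra VPC added only to show how the mesh grows.
VPCS = {
    "us-east-1": "172.16.0.0/16",
    "us-east-2": "172.31.0.0/16",
    "us-west-2": "10.50.0.0/16",  # hypothetical
}


def mesh_tunnels(vpcs):
    """Return one (region_a, region_b) pair per tunnel in a full mesh."""
    return list(combinations(sorted(vpcs), 2))


def overlapping_ranges(vpcs):
    """Return the region pairs whose CIDR blocks overlap."""
    return [
        (a, b)
        for a, b in combinations(sorted(vpcs), 2)
        if ip_network(vpcs[a]).overlaps(ip_network(vpcs[b]))
    ]


if __name__ == "__main__":
    for a, b in mesh_tunnels(VPCS):
        print(f"tunnel: {a} <-> {b}")
    print("overlapping ranges:", overlapping_ranges(VPCS))
```

With n VPCs this yields n(n-1)/2 tunnels, so every StrongSwan instance ends up with one connection per remote VPC.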
IP Addresses and Security Groups
First, create one Elastic IP Address for each StrongSwan instance. Optionally, create a hostname for each in Route53 if you think that will help you later on.
Then, create one security group for each of the StrongSwan instances. Leave all outbound traffic as allowed, and create the following inbound rules:
- An SSH rule to allow you to log into the box later on;
- All traffic from all of the VPC IP address ranges. In our example, this means allowing all traffic from 172.16.0.0/16 and 172.31.0.0/16 on the security group. This is necessary because when an instance acts as a router, the security group can't differentiate between traffic directed to the instance's own IP address and traffic directed to one of the remote networks it can route to. Any such differentiation will unfortunately need to be implemented internally in iptables;
- For each of the other StrongSwan instances' elastic IP addresses that it will need to connect to, the following rules:
|Type|Protocol|Port Range|Source|
|---|---|---|---|
|Custom ICMP Rule - IPv4|Time Exceeded|All|elastic IP/32|
|Custom ICMP Rule - IPv4|Destination Unreachable|All|elastic IP/32|
|Custom ICMP Rule - IPv4|Echo Reply|N/A|elastic IP/32|
|Custom ICMP Rule - IPv4|Echo Request|N/A|elastic IP/32|
|Custom ICMP Rule - IPv4|Traceroute|N/A|elastic IP/32|
|Custom Protocol|AH (51)|All|elastic IP/32|
|Custom UDP Rule|UDP|4500|elastic IP/32|
|Custom UDP Rule|UDP|500|elastic IP/32|
The ICMP rules above serve two purposes. Firstly, the traceroute and echo reply/request ones will make it easier for you to troubleshoot the connectivity between the StrongSwan instances. Most importantly, though, the time exceeded and destination unreachable entries are there to allow path MTU discovery to happen properly between StrongSwan instances communicating over the Internet.
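If you have more than a couple of peers, writing these rules by hand gets tedious. The sketch below builds the same eight rules for one peer elastic IP as plain Python dictionaries. The layout mirrors the IpPermissions structure used by the EC2 API (for ICMP, FromPort carries the ICMP type and ToPort the code, with -1 meaning all), but nothing here talks to AWS. The function name and the example address are assumptions for illustration.

```python
from ipaddress import ip_address

# ICMP type numbers for the messages listed in the table above.
ICMP_TYPES = {
    "echo-reply": 0,
    "destination-unreachable": 3,
    "echo-request": 8,
    "time-exceeded": 11,
    "traceroute": 30,
}


def peer_rules(peer_eip):
    """Build the inbound rules for one remote StrongSwan elastic IP."""
    ip_address(peer_eip)  # raises ValueError if this is not a valid address
    cidr = f"{peer_eip}/32"
    rules = []
    for icmp_type in ICMP_TYPES.values():
        # FromPort is the ICMP type, ToPort is the ICMP code (-1 = all codes).
        rules.append({"IpProtocol": "icmp", "FromPort": icmp_type,
                      "ToPort": -1, "IpRanges": [{"CidrIp": cidr}]})
    # Protocol 51 is AH; port fields do not apply to it.
    rules.append({"IpProtocol": "51", "IpRanges": [{"CidrIp": cidr}]})
    for port in (500, 4500):  # IKE and NAT traversal
        rules.append({"IpProtocol": "udp", "FromPort": port,
                      "ToPort": port, "IpRanges": [{"CidrIp": cidr}]})
    return rules


if __name__ == "__main__":
    for rule in peer_rules("18.104.22.168"):  # example address only
        print(rule)
```

A list like this could then be handed to whatever tooling you use to manage security groups, once per remote peer.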
Next, update all of the existing security groups in each VPC to ensure these same ICMP messages are accepted from all of the VPC IP address ranges (172.16.0.0/16 and 172.31.0.0/16 in our example). The objective here is similar: to allow troubleshooting and proper path MTU discovery to happen on the end-to-end communications between machines in different VPCs through the VPN.
Create StrongSwan Instances and Configure Linux
This is what you need to keep in mind when creating the instances:
- Use the latest CentOS 7 AMI to create a new instance in a public subnet of the chosen region with the security group we just created;
- Associate the elastic IP address with the instance;
- Disable the source/destination check on the instance since it will act as a router.
Then, SSH into the machine (keep in mind the default username for the AMI is centos) so we can configure the operating system properly. Make sure you become root for the following configuration steps.
Make sure /etc/sysctl.conf contains the following lines, and then force them to be loaded by running sysctl -p /etc/sysctl.conf or by rebooting:
    net.ipv4.ip_forward = 1
    net.ipv4.conf.all.send_redirects = 0
    net.ipv4.conf.default.send_redirects = 0
    net.ipv4.tcp_max_syn_backlog = 1280
    net.ipv4.icmp_echo_ignore_broadcasts = 1
    net.ipv4.conf.all.accept_source_route = 0
    net.ipv4.conf.all.accept_redirects = 0
    net.ipv4.conf.all.secure_redirects = 0
    net.ipv4.conf.all.log_martians = 1
    net.ipv4.conf.default.accept_source_route = 0
    net.ipv4.conf.default.accept_redirects = 0
    net.ipv4.conf.default.secure_redirects = 0
    net.ipv4.icmp_ignore_bogus_error_responses = 1
    net.ipv4.tcp_syncookies = 1
    net.ipv4.conf.all.rp_filter = 1
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.tcp_mtu_probing = 1
As a side note, it is strongly recommended that you include net.ipv4.tcp_mtu_probing = 1 in the sysctl.conf of all of your Linux EC2 instances, since they use jumbo frames by default.
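It is easy to miss a line when copying settings like these between machines. The following is a small Python sketch that parses sysctl.conf-style text and reports which of a chosen set of settings are missing or set differently. The REQUIRED subset (forwarding, redirects and MTU probing) is my own pick of the routing-related entries, and the function names are hypothetical.

```python
# Subset of the settings above that matter most for a routing instance.
# This selection is an assumption; extend it to match your own baseline.
REQUIRED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.conf.all.send_redirects": "0",
    "net.ipv4.conf.default.send_redirects": "0",
    "net.ipv4.tcp_mtu_probing": "1",
}


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(text, required=REQUIRED):
    """Return the required settings that are absent or have another value."""
    found = parse_sysctl(text)
    return {k: v for k, v in required.items() if found.get(k) != v}


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as handle:
        print(missing_settings(handle.read()))
```

An empty dictionary means every setting in REQUIRED is present with the expected value.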
Let's make sure the machine is fully patched, that we can use EPEL and that we install StrongSwan by issuing the following commands:
    yum install epel-release
    yum repolist
    yum update
    yum install strongswan
    systemctl enable strongswan
In order to ensure the cryptography and logging work properly, the system needs to have proper time synchronization. Make sure NTP is installed and configured to run on system start:
    yum install ntp
    systemctl enable ntpd
Edit the server configuration entries in /etc/ntp.conf so the AWS recommended NTP server pool is used:
    server 0.amazon.pool.ntp.org iburst
    server 1.amazon.pool.ntp.org iburst
    server 2.amazon.pool.ntp.org iburst
    server 3.amazon.pool.ntp.org iburst
Finally, restart the NTP service with systemctl restart ntpd and check that it is working properly.
We'll configure StrongSwan to use RSA keys for authentication, so the first step is to create those keys and associate them with the servers in the StrongSwan configuration.
On each StrongSwan instance, create its own RSA key. This is how you would do it on the us-east-1 StrongSwan instance:
    cd /etc/strongswan/ipsec.d/private/
    openssl genrsa -out us-east-1.key 4096
    chmod og-r us-east-1.key
    openssl rsa -in us-east-1.key -pubout > ../certs/us-east-1.pub
Once you do that, you need to edit /etc/strongswan/ipsec.secrets to let StrongSwan know what to do with the private key. Add a line to that file that associates the instance's own elastic IP address with the key file. Assuming the elastic IP address of the us-east-1 StrongSwan instance is 18.104.22.168, this is what that line would look like:
    18.104.22.168 : RSA us-east-1.key
Then, you copy each StrongSwan instance's .pub file to the /etc/strongswan/ipsec.d/certs directory of each of the other StrongSwan instances. In our example, if you were on the us-east-1 instance you would see something like this:
    $ find /etc/strongswan/ipsec.d/ -name '*.key'
    /etc/strongswan/ipsec.d/private/us-east-1.key
    $ find /etc/strongswan/ipsec.d/ -name '*.pub'
    /etc/strongswan/ipsec.d/certs/us-east-1.pub
    /etc/strongswan/ipsec.d/certs/us-east-2.pub
Finally, you configure /etc/strongswan/ipsec.conf to tie it all together. This is what the configuration file would look like on the us-east-1 instance if the elastic IPs for the us-east-1 and us-east-2 instances were 18.104.22.168 and 220.127.116.11, respectively:
    config setup
        # strictcrlpolicy=yes
        # uniqueids = no

    conn %default
        fragmentation=force
        dpdaction=restart
        ike=aes192gcm16-aes128gcm16-aes192-prfsha256-ecp256-ecp521,aes192-sha256-modp3072
        esp=aes192gcm16-aes128gcm16-aes192-ecp256,aes192-sha256-modp3072
        # keyingtries=%forever
        keyexchange=ikev2
        authby=rsasig
        forceencaps=yes
        leftid=18.104.22.168
        leftrsasigkey=us-east-1.pub
        leftsubnet=172.16.0.0/16

    # Add connections here.

    conn us-east-2
        right=220.127.116.11
        rightsubnet=172.31.0.0/16
        rightrsasigkey=us-east-2.pub
        auto=start
Keep in mind that left in StrongSwan parlance means the side of the VPN that is local to the instance you are configuring, and right is the remote side. So the configuration file on the us-east-2 instance would look like this:
    config setup
        # strictcrlpolicy=yes
        # uniqueids = no

    conn %default
        fragmentation=yes
        dpdaction=restart
        ike=aes192gcm16-aes128gcm16-aes192-prfsha256-ecp256-ecp521,aes192-sha256-modp3072
        esp=aes192gcm16-aes128gcm16-aes192-ecp256,aes192-sha256-modp3072
        # keyingtries=%forever
        keyexchange=ikev2
        authby=rsasig
        forceencaps=yes
        leftid=220.127.116.11
        leftrsasigkey=us-east-2.pub
        leftsubnet=172.31.0.0/16

    # Add connections here.

    conn us-east-1
        right=18.104.22.168
        rightsubnet=172.16.0.0/16
        rightrsasigkey=us-east-1.pub
        auto=start
Please review the StrongSwan documentation on ipsec.conf to better understand some of the choices I've made there, and tweak the setup to meet your needs. I wouldn't change the configuration on the forceencaps options, though, since I had problems if they were not set as above.
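Since the two files differ only in which side is left and which is right, they lend themselves to being generated. Below is a minimal Python sketch that renders an ipsec.conf for any node from a small inventory. The inventory layout and function name are hypothetical, the key file naming (region name plus .pub) follows the convention used above, and fragmentation=yes is an assumption taken from one of the two example files.

```python
# Example inventory for the two-VPC scenario; the addresses are the
# example elastic IPs used in this post, not real endpoints.
NODES = {
    "us-east-1": {"eip": "18.104.22.168", "subnet": "172.16.0.0/16"},
    "us-east-2": {"eip": "220.127.116.11", "subnet": "172.31.0.0/16"},
}

IKE = ("aes192gcm16-aes128gcm16-aes192-prfsha256-ecp256-ecp521,"
       "aes192-sha256-modp3072")
ESP = "aes192gcm16-aes128gcm16-aes192-ecp256,aes192-sha256-modp3072"


def render_ipsec_conf(local, nodes):
    """Render ipsec.conf text for the node named `local`."""
    me = nodes[local]
    lines = [
        "config setup",
        "",
        "conn %default",
        "    fragmentation=yes",  # assumption: one example file uses 'force'
        "    dpdaction=restart",
        f"    ike={IKE}",
        f"    esp={ESP}",
        "    keyexchange=ikev2",
        "    authby=rsasig",
        "    forceencaps=yes",
        f"    leftid={me['eip']}",
        f"    leftrsasigkey={local}.pub",
        f"    leftsubnet={me['subnet']}",
    ]
    # One connection per remote node: the local side is always "left".
    for name in sorted(nodes):
        if name == local:
            continue
        peer = nodes[name]
        lines += [
            "",
            f"conn {name}",
            f"    right={peer['eip']}",
            f"    rightsubnet={peer['subnet']}",
            f"    rightrsasigkey={name}.pub",
            "    auto=start",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ipsec_conf("us-east-1", NODES))
```

Adding a third VPC to the inventory would add one more conn section to every node's file, which is exactly the full mesh extrapolation mentioned earlier.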
Once you've set all of this up, run systemctl restart strongswan and monitor the logs with tail -f /var/log/messages | grep charon for log entries related to the IPsec tunnel negotiations and authentication.
Hopefully by now you will be able to ping us-east-2's StrongSwan instance internal (172.31.0.x) IP address from the SSH session on us-east-1's StrongSwan instance.
Finally, in order to allow machines on one region to talk to machines and services on the other, we'll need to update the route tables.
What you need to do is add a new route telling machines in a region that, in order to talk to addresses in the other regions, they must go through the StrongSwan instance.
So in our example, you should add a new route to all routing tables in us-east-1 that has a Destination of 172.31.0.0/16 and a Target that is the instance ID of the us-east-1 StrongSwan instance.
Conversely, you should add a new route to all routing tables in us-east-2 that has a Destination of 172.16.0.0/16 and a Target that is the instance ID of the us-east-2 StrongSwan instance.
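The same rule generalizes to any number of regions: every routing table gets one route per remote VPC, always targeting the local StrongSwan instance. Here is a short Python sketch of that logic; the instance IDs are placeholders and the function name is hypothetical.

```python
# Placeholder instance IDs; substitute the IDs of your StrongSwan instances.
VPCS = {
    "us-east-1": {"cidr": "172.16.0.0/16", "vpn_instance": "i-0aaaaaaaaaaaaaaaa"},
    "us-east-2": {"cidr": "172.31.0.0/16", "vpn_instance": "i-0bbbbbbbbbbbbbbbb"},
}


def vpn_routes(vpcs):
    """List the routes each region's routing tables need for the VPN."""
    routes = []
    for local in sorted(vpcs):
        for remote in sorted(vpcs):
            if remote == local:
                continue
            routes.append({
                "region": local,
                "destination": vpcs[remote]["cidr"],   # remote VPC range
                "target": vpcs[local]["vpn_instance"],  # local StrongSwan box
            })
    return routes


if __name__ == "__main__":
    for route in vpn_routes(VPCS):
        print(route)
```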
Finally, make sure that the security groups of services that need to be accessed across the VPN will now allow the IP addresses of the remote machines in. Once you do that, you can then test the communication between regions successfully. Of course, if you enabled ICMP as recommended above, you should be able to ping any instance in us-east-2 from any instance in us-east-1 and vice-versa by now.
You could achieve some level of redundancy and distribution of load by increasing the number of VPN concentrator instances you stand up.
One idea would be to create one VPN concentrator per availability zone instead of just one per region. In this scenario, even if one availability zone (or its StrongSwan instance) becomes unavailable, the rest of the availability zones will remain connected.
This is a high level guide of what that would entail in addition to what was discussed above:
- Create the additional StrongSwan instances as per the instructions above;
- Separate the routing tables per availability zone and assign each one to its corresponding subnets;
- Update ipsec.conf on all machines to have one connection for each VPN concentrator. Also update each one's rightsubnet definitions so that each server is only responsible for the IP address ranges of the subnets in its availability zone.
I have not covered implementing HA on StrongSwan, though apparently that is supported as well. If you get this working let me know.
A few security-minded tips that I would recommend you implement:
Ensure you close off SSH access to the StrongSwan instances after you're done configuring them, by removing the applicable Security Group inbound rule. You can always allow it temporarily again on the Security Group if and when you need it.
Install the CloudWatch Logs Agent on the machine (remember, we already covered this in an earlier post). Make sure you collect at least the following files:
Harden the operating system and make sure to keep installing security updates as they become available.