Fan-in from multiple routes to a single VM with iptables, for FIX on Quickfix

Recently we were trying to set up a network configuration to route FIX traffic from a single Docker container on a VM instance through four of our VPN gateways, then two of the other side’s VPN gateways, and on to two destination machines on the other side. It’s somewhat complicated, but necessary because of constraints outside our control.

The ultimate goal is to have failover redundancy in the system so that a different route can be used if one route is down.

We discovered that Quickfix can handle connection cycling itself if its peers are configured like this:

SocketConnectPort=1234
SocketConnectHost=0.0.0.0
SocketConnectPort1=1234
SocketConnectHost1=0.0.0.1
SocketConnectPort2=1234
SocketConnectHost2=0.0.0.2
SocketConnectPort3=1234
SocketConnectHost3=0.0.0.3

LogonTimeout=10

http://www.quickfixengine.org/quickfix/doc/html/configuration.html

(Those are just example IPs and ports for the different hosts).

Quickfix will automatically cycle to the next host if it fails to log on within the LogonTimeout, in this case 10 seconds.

This would have solved the problem on its own, except that this network architecture has only two final destination IPs. That is, there are two destinations, but four routes through to them for redundancy.

The other side is expecting a set of four source IPs from us that they can route traffic back to over the internet. On our side, we route packets addressed to any of those four IPs back to our single VM instance.

The first step is using dummy IP addresses in the Quickfix config so that it can see four distinct peer IP addresses. We use the 192.0.2.0/24 IP range designated for documentation purposes in RFC 5737.

SocketConnectPort=1234
SocketConnectHost=192.0.2.0
SocketConnectPort1=1234
SocketConnectHost1=192.0.2.1
SocketConnectPort2=1234
SocketConnectHost2=192.0.2.2
SocketConnectPort3=1234
SocketConnectHost3=192.0.2.3

As far as Quickfix is concerned, there are now four distinct hosts to cycle.
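
If the list of dummy peers is likely to change, the settings can be generated rather than typed by hand. The following is only a sketch: quickfix_peers is a hypothetical helper (not part of Quickfix), and it assumes every peer shares the same port.

```shell
# Print SocketConnectPort/SocketConnectHost settings for a list of peers.
# Usage: quickfix_peers PORT HOST [HOST...]
# (hypothetical helper for illustration, not part of Quickfix)
quickfix_peers () {
    port=$1
    shift
    i=0
    for host in "$@"; do
        # The first peer uses the unsuffixed keys; later peers use 1, 2, 3, ...
        if [ "$i" -eq 0 ]; then suffix=""; else suffix=$i; fi
        printf 'SocketConnectPort%s=%s\n' "$suffix" "$port"
        printf 'SocketConnectHost%s=%s\n' "$suffix" "$host"
        i=$((i + 1))
    done
}

# The four dummy addresses from the config above
quickfix_peers 1234 192.0.2.0 192.0.2.1 192.0.2.2 192.0.2.3
```

The output can be pasted into the session section of the Quickfix config file.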

However, we want the VPNs to see those packets as coming from four different source IP addresses.

This is where iptables first comes in:

sudo iptables -t nat -A POSTROUTING --destination 192.0.2.0 -j SNAT --to-source ${SOURCE_IP_0}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.1 -j SNAT --to-source ${SOURCE_IP_1}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.2 -j SNAT --to-source ${SOURCE_IP_2}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.3 -j SNAT --to-source ${SOURCE_IP_3}

This uses the POSTROUTING chain of the nat (network address translation) table to alter the apparent source IP address of these packets to be one of four IP addresses. Specifically, the SNAT target is used: “source-NAT”. Now the rest of the network sees these packets as coming from four distinct sources (even though they really come from one).

The VPNs consider the source IP as part of their routing, so they know how to route those packets further.

This won’t work yet, though. The rest of the network doesn’t know what to do with packets heading for a destination like 192.0.2.0, as it’s a dummy IP address.

The natural thing to do might be to use iptables again to also change the destination of those packets to one of the two final destinations, e.g.:

# (note: these iptables commands are not valid, as we will see below)
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.0 -j DNAT --to-destination ${FINAL_DESTINATION_0}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.1 -j DNAT --to-destination ${FINAL_DESTINATION_1}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.2 -j DNAT --to-destination ${FINAL_DESTINATION_0}
sudo iptables -t nat -A POSTROUTING --destination 192.0.2.3 -j DNAT --to-destination ${FINAL_DESTINATION_1}

DNAT is “destination-NAT”, the counterpart to SNAT.

Unfortunately, if you try that, you will find that iptables rejects it with an error like this:

DNAT target: used from hooks POSTROUTING, but only usable from PREROUTING/OUTPUT

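It turns out you can only use DNAT (destination adjustment) in PREROUTING or OUTPUT, and you can only use SNAT (source adjustment) in POSTROUTING (or INPUT, which does not help here). A packet passes through either PREROUTING (if it arrives from another interface) or OUTPUT (if it is generated locally), and in both cases POSTROUTING comes afterwards:

  1. PREROUTING or OUTPUT
  2. POSTROUTING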

If we try to use DNAT to change the destinations, it has to happen first, but then the SNAT adjustment cannot see the dummy IP to adjust the apparent source IP.

We have to use one final approach to alter both the source and destination of those packets: the --ctorigdst option of the iptables conntrack match. The name is shorthand for “match against original destination address”. It lets you identify packets by the destination IP their connection had before any NAT rule altered it, which is perfect for this use case.

We can combine a DNAT for the destination with ctorigdst to subsequently alter the source:

sudo iptables -t nat -A OUTPUT --dst 192.0.2.0 -j DNAT --to ${FINAL_DESTINATION_0}
sudo iptables -t nat -A POSTROUTING -m conntrack --ctstate DNAT --ctorigdst 192.0.2.0 -j SNAT --to ${SOURCE_IP_0}

We can wrap that in a helper function to make it easy to apply for each of the four routes:

# Usage: fan_dummy_destination_source DUMMY_IP FINAL_DESTINATION SOURCE_IP
fan_dummy_destination_source () {
    # Rewrite the dummy destination to the real one
    sudo iptables -t nat -A OUTPUT --dst "$1" -j DNAT --to "$2"
    # Choose the source IP from the destination the connection had before DNAT
    sudo iptables -t nat -A POSTROUTING -m conntrack --ctstate DNAT --ctorigdst "$1" -j SNAT --to "$3"
}

fan_dummy_destination_source 192.0.2.0 "${FINAL_DESTINATION_0}" "${SOURCE_IP_0}"
fan_dummy_destination_source 192.0.2.1 "${FINAL_DESTINATION_1}" "${SOURCE_IP_1}"
fan_dummy_destination_source 192.0.2.2 "${FINAL_DESTINATION_0}" "${SOURCE_IP_2}"
fan_dummy_destination_source 192.0.2.3 "${FINAL_DESTINATION_1}" "${SOURCE_IP_3}"
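
Because -A appends unconditionally, running the set-up twice would leave duplicate rules in both chains. One way to guard against that is iptables’ -C (check) option, which exits successfully only when an identical rule already exists. The following is a minimal sketch under some assumptions: add_nat_rule_once, fan_dummy_destination_source_once and the DRY_RUN switch are hypothetical names, and 198.51.100.10 and 203.0.113.10 are placeholder addresses from the RFC 5737 documentation ranges standing in for a real destination and source.

```shell
# Append a rule to a nat chain only if an identical rule is not already there.
# $1 is the chain; the remaining arguments are the rule specification.
# With DRY_RUN=1 the command is printed instead of executed (assumed convention).
add_nat_rule_once () {
    chain=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "iptables -t nat -A $chain $*"
        return 0
    fi
    # -C exits 0 when the rule exists, so -A only runs when it is missing
    sudo iptables -t nat -C "$chain" "$@" 2>/dev/null ||
        sudo iptables -t nat -A "$chain" "$@"
}

# Same two rules as the helper above, but safe to run repeatedly.
# Usage: fan_dummy_destination_source_once DUMMY_IP FINAL_DESTINATION SOURCE_IP
fan_dummy_destination_source_once () {
    add_nat_rule_once OUTPUT --dst "$1" -j DNAT --to-destination "$2"
    add_nat_rule_once POSTROUTING -m conntrack --ctstate DNAT --ctorigdst "$1" \
        -j SNAT --to-source "$3"
}

# Dry run with placeholder addresses: prints the two commands, changes nothing
DRY_RUN=1
fan_dummy_destination_source_once 192.0.2.0 198.51.100.10 203.0.113.10
```

Unsetting DRY_RUN and passing the real destination and source addresses applies the rules for one route.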

Note that IP forwarding must be enabled for this to work:

sudo sysctl -w net.ipv4.ip_forward=1
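
Once the rules are in place, a quick check of the set-up might look like the sketch below. It reads the kernel’s forwarding switch, lists the nat rules, and asks conntrack for connections whose original destination was a dummy address. The helper names and the DRY_RUN switch are hypothetical, and the last command assumes the conntrack tool (from conntrack-tools) is installed.

```shell
# Print a command instead of running it when DRY_RUN=1 (hypothetical helper).
run () {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Succeed if the given file holds "1"; defaults to the IPv4 forwarding switch.
forwarding_enabled () {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Show the nat rules and any tracked connections for one dummy address.
inspect_route () {
    # -S prints rules in the same syntax that was used to add them
    run sudo iptables -t nat -S OUTPUT
    run sudo iptables -t nat -S POSTROUTING
    # Filter tracked connections on the destination they had before DNAT
    run sudo conntrack -L --orig-dst "$1"
}

if forwarding_enabled; then
    echo "IPv4 forwarding: enabled"
else
    echo "IPv4 forwarding: disabled"
fi

# Dry run: prints the inspection commands for the first dummy address
DRY_RUN=1
inspect_route 192.0.2.0
```

Note that sysctl -w only lasts until the next reboot; on many distributions the setting can be kept by placing net.ipv4.ip_forward = 1 in a file under /etc/sysctl.d/.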

Many thanks to StackExchange user Anton Danilov for helping with this.

