Happy eyeballs makes me unhappy…
Happy eyeballs, defined in RFC 6555, is a technique that enables dual-stack hosts to automatically select between IPv6 and IPv4 based on their respective performance. When a dual-stack host tries to contact a web server that is reachable over both IPv6 and IPv4, it:
- it first tries to establish a TCP connection towards one of the addresses (IPv6 or IPv4) and starts a short timer, say 300 msec
- if the connection is established over the chosen address family, it continues
- if the timer expires before the connection has been established, a second connection is tried over the other address family (see the sketch after this list)
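To make the mechanism concrete, here is a minimal sketch in Python. It relies on asyncio, whose connection helpers accept a happy_eyeballs_delay parameter (Python 3.8 and later) implementing this kind of staggered attempt; the host name and the 0.3 second delay are just placeholders:

```python
import asyncio

async def fetch_head(host: str, port: int = 80) -> bytes:
    # asyncio staggers the connection attempts for us: the first address
    # gets a ~300 ms head start before the next one is tried.
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=0.3)
    writer.write(b"HEAD / HTTP/1.1\r\nHost: " + host.encode()
                 + b"\r\nConnection: close\r\n\r\n")
    await writer.drain()
    response = await reader.read()
    writer.close()
    await writer.wait_closed()
    return response

# e.g. asyncio.run(fetch_head("www.example.org"))
```

Whether IPv6 or IPv4 gets the head start depends on the address ordering returned by the resolver, so two successive connections from the same host can easily end up on different address families.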
Happy eyeballs works well when one of the two address families provides bad performance or is broken. In this case, a host using happy eyeballs will automatically avoid the broken address family. However, when both IPv6 and IPv4 work correctly, happy eyeballs may cause frequent switches between the two address families.
As an example, here is a summary of a packet trace that I collected while contacting a dual-stack web server from my laptop running the latest version of macOS.
First connection, over IPv6:
09:40:47.504618 IP6 client6.65148 > server6.80: Flags [S], cksum 0xe3c1 (correct), seq 2500114810, win 65535, options [mss 1440,nop,wscale 4,nop,nop,TS val 1009628701 ecr 0,sackOK,eol], length 0
09:40:47.505886 IP6 server6.80 > client6.65148: Flags [S.], cksum 0x1abd (correct), seq 193439890, ack 2500114811, win 14280, options [mss 1440,sackOK,TS val 229630052 ecr 1009628701,nop,wscale 7], length 0
The interesting information in these packets is the TCP timestamps. Defined in RFC 1323, these timestamps are taken from a local clock on each host. The server returns its current timestamp in the TS val of the SYN+ACK segment.
Thanks to happy eyeballs, the next TCP connection is sent over IPv4 (it might be faster than IPv6, who knows). IPv4 works well and the server answers immediately:
09:40:49.512112 IP client4.65149 > server4.80: Flags [S], cksum 0xee77 (incorrect -> 0xb4bd), seq 321947613, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 1009630706 ecr 0,sackOK,eol], length 0
09:40:49.513399 IP (tos 0x0, ttl 61, id 0, offset 0, flags [DF], proto TCP (6), length 60) server4.80 > client4.65149: Flags [S.], cksum 0xd86f (correct), seq 873275860, ack 321947614, win 5792, options [mss 1380,sackOK,TS val 585326122 ecr 1009630706,nop,wscale 7], length 0
Note the TS val in the returning SYN+ACK. The value over IPv4 is much larger than over IPv6. This does not mean that one address family is faster than the other: the two timestamps come from different clocks, which indicates that a load balancer spreads the TCP connections over (at least) two different servers.
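This kind of check can be automated: send a bare SYN and read the TS val that comes back in the SYN+ACK. Below is a rough sketch using Scapy (it needs root privileges, the kernel will reset the half-open handshake, which does not matter here, and the addresses in the usage comment are documentation placeholders):

```python
from scapy.all import IP, IPv6, TCP, RandShort, sr1

def synack_tsval(dst, dport=80):
    """Send a bare SYN and return the TSval carried in the SYN+ACK, if any."""
    ip = IPv6(dst=dst) if ":" in dst else IP(dst=dst)
    syn = ip / TCP(sport=RandShort(), dport=dport, flags="S",
                   options=[("MSS", 1440), ("SAckOK", b""), ("Timestamp", (0, 0))])
    synack = sr1(syn, timeout=2, verbose=False)
    if synack is None or not synack.haslayer(TCP):
        return None
    for name, value in synack[TCP].options:
        if name == "Timestamp":
            return value[0]   # the server's own clock, as in the traces above
    return None

# Widely different TSvals for the same service over IPv6 and IPv4 suggest that
# the connections were terminated by different machines behind a load balancer.
# e.g. synack_tsval("2001:db8::1"), synack_tsval("192.0.2.1")
```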
Shortly after, I authenticated myself over an SSL connection that was established over IPv4:
09:41:26.566362 IP client4.65152 > server4.443: Flags [S], cksum 0xee77 (incorrect -> 0x420d), seq 3856569828, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 1009667710 ecr 0,sackOK,eol], length 0
09:41:26.567586 IP server4.443 > client4.65152: Flags [S.], cksum 0x933e (correct), seq 3461360247, ack 3856569829, win 14480, options [mss 1380,sackOK,TS val 229212430 ecr 1009667710,nop,wscale 7], length 0
Again, a closer look at the TCP timestamps reveals that there is a third server that terminated the TCP connection. Apparently, in this case it was the load balancer itself that terminated the connection and forwarded the data extracted from it to one of the servers.
Thanks to happy eyeballs, my TCP connections reach different servers behind the load balancer. This is annoying because the web servers maintain per-user session state, and every time I open a new TCP connection I might be directed to a different server. In my experience, this happens randomly with this server, possibly as a function of the IP addresses that I’m using and the server load. As a user, I experience difficulties logging on to the server or random logouts, while the real problem lies in unexpected interactions between happy eyeballs and a load balancer. The load balancer would like to stick all the TCP connections from one host to the same server, but due to the frequent switchovers between IPv6 and IPv4 it cannot stick each client to a server.
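To see why the switchovers defeat source-address affinity, here is a toy model of source-IP based balancing (not the actual load balancer; the backend names and client addresses are invented). The client’s IPv4 and IPv6 addresses hash independently, so nothing keeps them on the same server:

```python
import hashlib

BACKENDS = ["server-a", "server-b", "server-c"]   # hypothetical pool

def pick_backend(client_ip: str) -> str:
    # Classic source-IP affinity: hash the client address, pick a backend.
    h = int(hashlib.sha1(client_ip.encode()).hexdigest(), 16)
    return BACKENDS[h % len(BACKENDS)]

# The same laptop shows up under two unrelated addresses, so the two
# lookups have no reason to land on the same backend:
print(pick_backend("192.0.2.45"))        # IPv4 source address (placeholder)
print(pick_backend("2001:db8:1::45"))    # IPv6 source address (placeholder)
```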
I’d be interested in any suggestions on how to improve this load balancing scheme without changing the web servers…