The extended version of Cheetah: “A High-Speed Programmable Load-Balancer Framework With Guaranteed Per-Connection-Consistency” has been published in ACM/IEEE ToN

In this journal version, we extended our conference paper with additional, peer-reviewed material:

  • We implemented our system on QUIC using P4 and Picoquic. This demonstrates that our approach does not depend solely on TCP timestamps. The code in ‘bmv2’ and ‘p4-tofino’ has been made publicly available.  All of our code is available at https://github.com/cheetahlb/
  • We added an experiment using the Tofino implementation and the QUIC implementation of Cheetah for an HTTP webserver.
  • We added an experiment to verify whether today’s OSes support TCP timestamp, have them enabled by default, and correctly echo the TCP timestamp set by a server.
  • We added an experiment to verify the granularity of the TCP timestamp units used by some of the largest Alexa top 100 websites. 
  • We added a proof sketch on the size of the cookies given a number of servers. 
  • We added an implementation in bmv2 of the “TCP timestamp”-based system. We have also rewritten and published the P4- tofino code of the system. The implementation of the stateful LB is non-trivial as it requires the insertions/lookups/deletions operations to be applied in constant time (and more restrictions apply). We describe our implementation of a stack-based data structure for the Tofino in Section 4.3. 
  • We added a micro-benchmark of the performance of the Cheetah LB, e.g., compared SYN insertions with cuckoo, normal packets, 
  • We broke down the benefits of SSE parsing of TCP options instructions.
  • We evaluated the packet processing latency overheads of realizing Cheetah on a Tofino for both the TCP timestamp and QUIC implementation.
  • We clarified the design challenges in the introduction.

Check out the paper in open access !

A poster of our latest work, CrossRSS, a Stateless CPU-Aware Datacenter Load-Balancer

Today we will present a poster of our latest work, at CoNEXT’20 : CrossRSS! CrossRSS is a load-balancer that spreads the load uniformly even inside the servers. It uses knowledge of the dispatching done inside the servers, RSS, to purposely select less-loaded cores without any server modification, or inter-core communications on the server. Learn more by watching the short video!

The poster session will be held on the 4th of December, 2:30 CET on the Mozilla VR Hub

Extended Abstract ; Hub ; Video ; Poster-As-Slides

Our latest paper “Cheetah”, a load balancer that guarantees per-connection-consistency

Cheetah is a new load balancer that solves the challenge of remembering which connection was sent to which server without the traditional trade off between uniform load balancing and efficiency. Cheetah is up to 5 times faster than stateful load balancers and can support advanced balancing mechanisms that reduce the flow completion time by a factor of 2 to 3x without breaking connections, even while adding and removing servers.

More information at https://www.usenix.org/conference/nsdi20/presentation/barbette.