Computer Networking : starting from the principles

In 2009, I took my first sabbatical and decided to spend a large fraction of my time writing the open-source Computer Networking : Principles, Protocols and Practice ebook. This ebook was well received by the community and it received a $20,000 award from the Saylor Foundation, which published it on iTunes.

There are two approaches to teaching standard computer networking classes. Most textbooks are structured on the basis of the OSI or TCP/IP layered reference models. The most popular organisation is the bottom-up approach. Students start by learning about the physical layer, then move to the datalink layer, … This approach is used by Computer Networks among others. Almost a decade ago, Kurose and Ross took the opposite approach and started from the application layer. I liked this approach and adopted a similar one for the first edition of Computer Networking : Principles, Protocols and Practice.

However, after a few years of experience using the textbook with students and discussions with several colleagues who were using parts of the text, I’ve decided to revise it. This major revision will include the following modifications.

  • the second edition of the ebook will use a hybrid approach. One half of the ebook will be devoted to the key principles that any CS student must know. To explain these principles, the ebook will start from the physical layer and go up to the application layer. The main objective of this first part is to give the students a broad picture of the operation of computer networks without entering into any protocol detail. Several existing books discuss this briefly in their introduction or first chapter, but one chapter is not sufficient to grasp this entirely.

  • the second edition will discuss at least two different protocols in each layer to allow the students to compare different designs.

    • the application layer will continue to cover DNS, HTTP but will also include different types of remote procedure calls
    • the transport layer will continue to explain UDP and TCP, but will also cover SCTP. SCTP is cleaner than TCP and provides a different design for the students.
    • the network layer will continue to cover the data and control planes. In the control plane, RIP, OSPF and BGP remain, except that iBGP will probably not be covered due to time constraints. Concerning the data plane, given the same time constraints, we can only cover two protocols. The first edition covered IPv4 and IPv6. The second edition will cover IPv6 and MPLS. Describing MPLS (the basics, not all details about LDP and RSVP-TE, more on this in a few weeks) is important to show the students a design that differs from IP. Once this choice has been made, one needs to select between IPv4 and IPv6. Covering both protocols is a waste of the students’ time and the second edition will only discuss IPv6. At this point, it appears that IPv6 is more future-proof than IPv4. The description of IPv4 can still be found in the first edition of the ebook.
    • the datalink layer will continue to cover Ethernet and WiFi. Zigbee or other techniques could appear in future minor revisions
  • Practice remains an important skill that networking students need to learn. The second edition will include IPv6 labs built on top of netkit to allow the students to learn how to perform basic network configuration and management tasks on Linux.

The second edition of the book will be tested by the students who follow INGI2141 at UCL. The source code is available from https://github.com/CNP3/ebook and drafts will be posted on http://cnp3bis.info.ucl.ac.be/ every Wednesday during this semester.

Sources of networking information

Students who start their Master thesis in networking sometimes have difficulty locating scientific information related to their thesis topic. Many of them start by googling a few keywords and find random documents and Wikipedia pages. To aid them, I list below some relevant sources of scientific information about networking in general. The list is far from complete and is biased by my own research interests, which do not cover the entire networking domain.

Digital Libraries

During the last decade, publishers of scientific journals and conference organizers have created large digital libraries that are accessible through a web portal. Many of them are protected by a paywall that provides full access only to paid subscribers, but many universities have (costly) subscriptions to (some of) these libraries. Most of these digital libraries provide free access to tables of contents and abstracts.

Magazines

Conferences

Journals

Standardisation bodies

Is your network ready for iOS7 and Multipath TCP ?

During the last few days, millions of users have installed iOS7 on their iPhones and iPads. Estimates published by The Guardian reveal that more than one third of the users have already upgraded their devices to the new release. As I still don’t use a smartphone, I usually don’t check these new software releases. From a networking viewpoint, this iOS update is different because it is the first step towards a wide deployment of Multipath TCP [RFC 6824]. Until now, Multipath TCP has mainly been used by researchers. With iOS7, the situation changes since millions of devices are capable of using Multipath TCP.

From a networking viewpoint, the deployment of Multipath TCP is an important change that will affect many network operators. In the 20th century, networks were only composed of routers and switches. These devices are completely transparent to TCP and never change any field of the TCP header or payload. Today’s networks, mainly enterprise and cellular networks, are much more complex. They include various types of middleboxes that process the IP header but also analyze the TCP headers and payload and sometimes modify them for various reasons. Michio Honda and his colleagues presented at IMC2011 a paper that reveals the impact of these middleboxes on TCP and its extensibility. In a nutshell, this paper revealed the following behaviors :

  • some middleboxes drop TCP options that they do not understand
  • some middleboxes replace TCP options by dummy options
  • some middleboxes change fields of the TCP header (source and destination ports for NAT, but also sequence/acknowledgement numbers, window fields, …)
  • some middleboxes inspect the payload of TCP segments, reject out-of-sequence segments and sometimes modify the TCP payload (e.g. ALG for ftp on NAT)

These results had a huge influence on the design of Multipath TCP, which includes various mechanisms that enable it to work around most of these middleboxes and to fall back to regular TCP in case of problems (e.g. payload modifications) to preserve connectivity.

Of course, Multipath TCP will achieve the best performance when running in a network which is fully transparent and does not include middleboxes that interfere with it. Network operators might find it difficult to check the possible interference between their devices and TCP extensions like Multipath TCP. While implementing Multipath TCP in the Linux kernel, we spent a lot of time understanding the interference caused by our standard firewall, which randomizes TCP sequence numbers.

To support network operators who want to check the transparency of their network, we have recently released a new open-source tool called tracebox. tracebox is described in a forthcoming paper that will be presented at IMC2013.

In a nutshell, tracebox can be considered as an extension to traceroute. Like traceroute, it allows operators to discover devices in a network. However, while traceroute only detects IP routers, tracebox is able to detect any type of middlebox that modifies some fields of the network or transport header. tracebox can be used as a command-line tool but also includes a scripting language that allows operators to develop more complex tests.

For example, tracebox can be used to verify that a path is transparent for Multipath TCP as shown below

# tracebox -n -p IP/TCP/MSS/MPCAPABLE/WSCALE bahn.de
tracebox to 81.200.198.6 (bahn.de): 64 hops max
1: 130.104.228.126 IP::CheckSum
2: 130.104.254.229 IP::TTL IP::CheckSum
3: 193.191.3.85 IP::TTL IP::CheckSum
4: 193.191.16.21 IP::TTL IP::CheckSum
5: 195.69.144.123 IP::TTL IP::CheckSum
6: 145.254.5.158 IP::TTL IP::CheckSum
7: 88.79.13.62 IP::TTL IP::CheckSum
8: 81.200.194.234 IP::TTL IP::CheckSum
9: 81.200.197.9 IP::TTL IP::CheckSum
10: 81.200.198.6 TCP::CheckSum IP::TTL IP::CheckSum TCPOptionMaxSegSize::MaxSegSize -TCPOptionMPTCPCapable -TCPOptionWindowScale

At each hop, tracebox verifies which fields of the IP/TCP headers have been modified. In the trace above, tracebox sends a TCP SYN segment on port 80 that contains the MSS, MP_CAPABLE and WSCALE options. The last hop corresponds to a middlebox that changes the MSS option and removes the MP_CAPABLE and WSCALE options. Thanks to the flexibility of tracebox, it is possible to use it to detect almost any type of middlebox interference.
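Reading such a trace by eye works for ten hops, but the same check can be scripted. The sketch below is a minimal Python parser for output in the format shown above. It rests on two assumptions drawn from this single trace: a leading "-" marks an option that was removed, and changes to IP::TTL and IP::CheckSum are the normal work of an IP router. The names suspicious_hops and SAMPLE are hypothetical helpers and are not part of tracebox.

```python
import re

# Fields that every IP router rewrites: not a sign of middlebox interference.
EXPECTED = {"IP::TTL", "IP::CheckSum"}

# A hop line looks like "10: 81.200.198.6 TCP::CheckSum IP::TTL ..."
HOP_RE = re.compile(r"^(\d+):\s+(\S+)\s*(.*)$")

# Two hop lines copied from the trace above, plus its header line.
SAMPLE = """\
tracebox to 81.200.198.6 (bahn.de): 64 hops max
9: 81.200.197.9 IP::TTL IP::CheckSum
10: 81.200.198.6 TCP::CheckSum IP::TTL IP::CheckSum TCPOptionMaxSegSize::MaxSegSize -TCPOptionMPTCPCapable -TCPOptionWindowScale
"""


def suspicious_hops(trace_text):
    """Return (hop, address, changed, removed) for each hop that did more
    than update the TTL and the IP checksum."""
    findings = []
    for line in trace_text.splitlines():
        match = HOP_RE.match(line.strip())
        if match is None:
            continue  # header line, command line or blank line
        hop, address, rest = match.groups()
        tokens = rest.split()
        # Assumption: a leading "-" means that the option was removed.
        removed = [t[1:] for t in tokens if t.startswith("-")]
        changed = [t for t in tokens
                   if not t.startswith("-") and t not in EXPECTED]
        if changed or removed:
            findings.append((int(hop), address, changed, removed))
    return findings


if __name__ == "__main__":
    for hop, address, changed, removed in suspicious_hops(SAMPLE):
        print(hop, address, "changed:", changed, "removed:", removed)
```

On the sample, only hop 10 is reported, which matches the manual reading of the trace.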

You can use it on Linux and MacOS to verify whether the network that you use is fully transparent to TCP. If not, tracebox will point you to the offending middlebox.

Apple seems to also believe in Multipath TCP

Multipath TCP is a TCP extension that allows a TCP connection to send/receive packets over different interfaces. Multipath TCP has various use cases, including :

Designing such a major TCP extension has been a difficult problem and took a lot of effort within several research projects. The work started within the FP7 Trilogy project funded by the European Commission. It continues within the CHANGE and Trilogy 2 projects.

After five years of effort, we are getting close to a wide adoption of Multipath TCP.

  • In January 2013, the IETF published the Multipath TCP specification as an Experimental standard in RFC 6824
  • In July 2013, the MPTCP working group reported three independent implementations of Multipath TCP, including our implementation in the Linux kernel. To my knowledge, this is the first time that a large TCP extension has been implemented so quickly.
  • On September 18th, 2013, Apple released iOS7, which includes the first large scale commercial deployment of Multipath TCP. Given the marketing buzz around new iOS releases, we can expect tens of millions of users to be using a Multipath TCP enabled device.

Packet traces collected on an iPad running iOS7 reveal that it uses Multipath TCP to reach some destinations that seem to be directly controlled by Apple. You won’t see Multipath TCP for regular TCP connections from applications like Safari, but if you use Siri, you might see that the connection with one of the Apple servers uses Multipath TCP. The screenshot below shows the third ACK of a three-way handshake sent by an iPad running iOS7.

../../../_images/siri.png

At this stage, the actual usage of Multipath TCP by iOS7 is unclear to me. If you have any hint on the type of information exchanged over this SSL connection, let me know.

The next step will, of course, be the utilisation of Multipath TCP by default for all applications running over iOS7.

Quickly producing time-sequence diagrams

Networking researchers and teachers often need to draw time-sequence diagrams that represent the exchange of packets through a network. Any drawing tool can be used to draw these diagrams, which contain mainly lines, arrows and text. However, while writing an article or a textbook, switching from the text to the drawing tool can be cumbersome.

A better approach would be to write a description of the diagrams directly in the text as a set of commands in a simple language. LaTeX hackers can probably manage this easily, but I’m far from a LaTeX guru. Thanks to Benjamin Hesmans, I recently found an interesting piece of software called MSCGen. MSCGen was designed to write Message Sequence Chart descriptions. It produces SVG and PNG images and is integrated with sphinx thanks to the mscgen extension. This integration is very useful since it allows both images and text to be written directly in ASCII.

The language supported by mscgen is similar to the DOT language used by graphviz and is very easy to use. For example, the code below

.. msc::

    a [label="", linecolour=white],
    b [label="Host A", linecolour=black],
    z [label="Physical link", linecolour=white],
    c [label="Host B", linecolour=black],
    d [label="", linecolour=white];

    a=>b [ label = "DATA.req(0)" ] ,
    b>>c [ label = "", arcskip=1];
    c=>d [ label = "DATA.ind(1)" ];

produces the following image.

../../../_images/msc.png

The only drawback of MSCGen is that it is currently difficult to write a diagram that contains a window of packets that are exchanged and the opposite flow of the acknowledgements. Besides that, I’m planning to use it to produce all time-sequence diagrams in the planned revision of Computer Networking : Principles, Protocols and Practice.
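When a textbook contains many such diagrams, the descriptions themselves can be generated. The sketch below is a small Python helper that emits a directive in the same shape as the example above. It only uses the constructs that appear in that example (label, linecolour and the => arrow), and it places every arc on its own row terminated by a semicolon. The function name msc_directive and the labels in the usage example are illustrative assumptions.

```python
def msc_directive(entities, arcs, indent="    "):
    """Build the text of a sphinx ``.. msc::`` directive.

    entities: list of (name, label, linecolour) tuples
    arcs:     list of (source, arrow, destination, label) tuples
    """
    body = []
    # Entities are separated by commas; the last one ends with a semicolon.
    for index, (name, label, colour) in enumerate(entities):
        end = ";" if index == len(entities) - 1 else ","
        body.append(f'{name} [label="{label}", linecolour={colour}]{end}')
    body.append("")
    # One arc per row, each row terminated by a semicolon.
    for source, arrow, destination, label in arcs:
        body.append(f'{source}{arrow}{destination} [ label = "{label}" ];')
    lines = [".. msc::", ""]
    lines += [indent + line if line else "" for line in body]
    return "\n".join(lines)


if __name__ == "__main__":
    print(msc_directive(
        entities=[("a", "Host A", "black"), ("b", "Host B", "black")],
        arcs=[("a", "=>", "b", "DATA.req(0)")],
    ))
```

Generating the text from a list of arcs makes it easier to produce a long exchange in a loop, although it does not remove the limitation on overlapping flows mentioned above.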

Adding bibliographic information to pdf files

Researchers often distribute pdf files of their articles on their homepages or through institutional repositories like DIAL. Researchers are encouraged to distribute their scientific papers electronically and measurements have shown that distributing papers online improves the impact of the papers. Still, there is often one important piece of information which is missing when a paper is posted on a website : the precise bibliographic information which is needed to cite the paper. Without this bibliographic information, readers of a paper may print or save it without knowing where it has been published and are more likely to ignore it when preparing the bibliography of their own papers.

A better approach is to add the bibliographic information directly inside the pdf file. This is what the default ACM LaTeX style provides for accepted papers. For the SIGCOMM ebook on Recent Advances in Networking, we opted for a simple note on each paper.


How quickly can we scan the entire Internet

A random host on the Internet receives a large number of unsolicited packets. Some of these packets are caused by transmission errors that modify the destination address of the packets or by bugs and implementation errors. Still, most of the background Internet noise observed by network telescopes comes from worms that try to propagate or from researchers, security experts or attackers trying to find characteristics of remote hosts.

When researchers try to map the Internet, they usually operate slowly. For example, CAIDA takes a few days to send traceroute probes towards all reachable class C networks. The 2012 anonymous Internet Census, which exploited a large number of vulnerable routers to serve as probes, took months. nmap, the default tool to probe open services on a remote host or network, also uses a slow mode of operation. These slow modes of operation are mainly chosen to avoid triggering alarms on the remote sites. A few packets can easily go unnoticed on an enterprise network, but not millions of them.
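To get a feeling for why the probing rate matters, a back-of-the-envelope computation helps. The sketch below assumes one probe per IPv4 address and ignores reserved or unrouted address blocks. The two rates in the usage example are hypothetical values chosen for illustration and are not measurements from any of the tools mentioned above.

```python
IPV4_ADDRESSES = 2 ** 32  # size of the whole IPv4 address space


def scan_duration_seconds(probes_per_second, probes_per_address=1):
    """Time needed to send probes to every IPv4 address at a fixed rate."""
    if probes_per_second <= 0:
        raise ValueError("the probing rate must be positive")
    return IPV4_ADDRESSES * probes_per_address / probes_per_second


if __name__ == "__main__":
    slow = scan_duration_seconds(100)        # hypothetical cautious scanner
    fast = scan_duration_seconds(1_000_000)  # hypothetical aggressive scanner
    print(f"100 probes/s: {slow / 86400:.0f} days")
    print(f"1,000,000 probes/s: {fast / 60:.0f} minutes")
```

At 100 probes per second a full sweep takes roughly 497 days, while at one million probes per second it takes about 72 minutes.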

A recent paper presented at the USENIX 2013 Security symposium takes a completely different approach.


Adding hyperlinks to our Latex articles

When they write papers, scientists spend a lot of time preparing their bibliography and correctly citing all their references. However, bibliographies and the corresponding bibtex styles were designed when everyone read scientific papers on paper. This is rarely the case today and most scientific papers are read online. Still, we insist on placing volume numbers, page numbers and other information from the paper era in each paper but rarely URLs or DOIs. This is probably a mistake…

When developing the first edition of Computer Networking : Principles, Protocols and Practice, I quickly found that students read references provided that these references were easily accessible through hyperlinks. Today’s students, and I guess a growing number of researchers, are used to browsing the web but rarely go to their library to read articles on paper. For the recently published SIGCOMM ebook on Recent Advances in Networking, we did a small experiment in adding hyperlinks directly to each chapter in pdf format. Adding these hyperlinks was surprisingly easy and, I hope, useful for the readers.


TCP over UDP : a new hack to pass through (some) middleboxes

Extending TCP in the presence of middleboxes is a difficult but not impossible task, as shown by Multipath TCP RFC 6824. A recent IETF draft proposed by Apple suggests encapsulating TCP segments inside UDP to prevent modifications performed by middleboxes. Apparently, some measurements indicate that UDP passes better through some types of NAT boxes than regular TCP segments. The proposed encapsulation technique is a bit unusual. A classical encapsulation would put the entire TCP segment after the UDP header. Instead, the TCP-over-UDP draft proposes to rewrite the TCP header as follows :

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           | |A|P|R|S|F|                               |
| Offset| Reserved  |0|C|S|S|Y|I|            Window             |
|       |           | |K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      (Optional) Options                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
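To make the layout concrete, the sketch below packs the fields in the order shown in the diagram using Python's struct module. Several points are assumptions, since the excerpt above does not specify them: the Length field is taken to cover the header plus the payload as in regular UDP, the Data Offset is counted in 32-bit words as in regular TCP, and the Checksum is left at zero as a placeholder. The function name pack_tcp_over_udp_header is hypothetical.

```python
import struct

# Flag bits, at the positions they occupy in a regular TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def pack_tcp_over_udp_header(src_port, dst_port, seq, ack, flags, window,
                             options=b"", payload_len=0, checksum=0):
    """Pack the rewritten header: ports, length, checksum, offset/flags,
    window, sequence number, acknowledgment number, then options."""
    if len(options) % 4:
        raise ValueError("options must be padded to a multiple of 4 bytes")
    header_len = 20 + len(options)
    data_offset = header_len // 4  # assumption: 32-bit words, as in TCP
    if data_offset > 15:
        raise ValueError("header too long for a 4-bit data offset")
    # 4 bits of data offset, 6 reserved bits, one bit fixed to 0
    # (the position of URG in regular TCP), then the 5 flag bits.
    offset_and_flags = (data_offset << 12) | (flags & 0x1F)
    length = header_len + payload_len  # assumption: UDP-style total length
    fixed = struct.pack("!HHHHHHII", src_port, dst_port, length, checksum,
                        offset_and_flags, window, seq, ack)
    return fixed + options


if __name__ == "__main__":
    syn = pack_tcp_over_udp_header(1234, 80, seq=1, ack=0,
                                   flags=SYN, window=65535)
    print(len(syn), syn.hex())
```

Because the first eight bytes have the shape of a UDP header, a device that only parses UDP sees an ordinary datagram, while the sequence and acknowledgment numbers move after the window field.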


Should we completely deprecate IP fragmentation ?

Fragmentation and reassembly have been part of the IPv4 specification since the beginning. One of the main motivations for including such mechanisms in the network layer is of course to allow IP packets to be exchanged over subnetworks that support different packet sizes. IPv4 fragmentation forced routers to be able to fragment packets that are too large. When routers were entirely software based, doing fragmentation on the router was a viable solution. However, with the advent of hardware assisted routers, performing fragmentation on the routers quickly became too expensive. In a seminal paper, Christopher Kent and Jeff Mogul argued that fragmentation should be considered harmful. This encouraged endhosts to avoid in-network packet fragmentation and most TCP implementations now include Path MTU discovery RFC 1191.
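As a reminder of what a router has to compute for each oversized packet, the sketch below splits a payload according to the IPv4 rules: the fragment offset is expressed in units of 8 bytes, so every fragment except the last must carry a multiple of 8 bytes. It assumes a 20-byte IPv4 header without options and only returns the offsets and lengths, not real packets. The function name fragment is a hypothetical helper.

```python
IPV4_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment(payload_len, mtu):
    """Return (offset_in_8_byte_units, data_len, more_fragments) tuples
    describing how a payload would be split to fit the given MTU."""
    # Largest amount of data per fragment, rounded down to a multiple of 8.
    max_data = (mtu - IPV4_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    position = 0
    while position < payload_len:
        data_len = min(max_data, payload_len - position)
        more_fragments = position + data_len < payload_len
        fragments.append((position // 8, data_len, more_fragments))
        position += data_len
    return fragments


if __name__ == "__main__":
    for offset, data_len, more in fragment(4000, 1500):
        print(f"offset={offset} length={data_len} MF={int(more)}")
```

For a 4000-byte payload and a 1500-byte MTU, this yields three fragments carrying 1480, 1480 and 1040 bytes at offsets 0, 185 and 370.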

When IPv6 was designed, in-network fragmentation was quickly left out. However, the designers of IPv6 still believed in the benefits of fragmentation. IPv6 supports a fragmentation header that can be used by endhosts to fragment packets that are too large for a given path. One of the motivations for host based fragmentation is that some packets need to be transmitted over subnets that only support small packet sizes (IPv6 mandates a minimum MTU of 1280 bytes).
