Linux

Segmentation and Checksum Offloading: Turning Off with ethtool

When introducing data communications concepts and protocols to students I think it is beneficial to demonstrate, and more importantly, allow students to play with real protocols. In the lab I teach (ITS332), as well as assignments for some lecture courses, we use Wireshark to capture traffic generated by several Internet applications (e.g. ping, Secure Shell, web browsing, iperf). This allows students to see the actual packets being sent across a network, and start to understand the protocol rules and formats used.

Unfortunately, what we see in Wireshark is sometimes not what we expect. One case in which this occurs is when TCP/IP operations are offloaded by the operating system to the Network Interface Card (NIC). Common operations for offloading are segmentation and checksum calculations. That is, instead of the OS using the CPU to segment TCP data, it lets the NIC use its own processor to perform the segmentation. This saves CPU cycles and, importantly, cuts down on the bus communications to/from the NIC. However, offloading doesn't change what is sent over the network. In other words, offloading to the NIC can produce performance gains inside your computer, but not across the network.

How does this affect what Wireshark captures? Consider the figure below illustrating the normal flow of data through a TCP/IP stack without offloading. Let's assume the application data is 7,300 Bytes. TCP breaks this into five segments. Why five? The Maximum Transmission Unit (MTU) of Ethernet is 1500 Bytes. If we subtract the 20 Byte IP header and 20 Byte TCP header, there are 1460 Bytes remaining for data in a TCP segment (this is the TCP Maximum Segment Size (MSS)). 7,300 Bytes can be conveniently segmented into five maximum-sized TCP segments.
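The arithmetic can be checked with a few lines of shell. This is only a sketch: the function names mss and segments are made up for this example, and it assumes 20 Byte IP and TCP headers with no options.

```shell
#!/bin/sh
# Worked example of the segmentation arithmetic.
# Assumption: 20 Byte IP header and 20 Byte TCP header, no options.

# mss MTU: print the maximum TCP payload per segment
mss() {
    echo $(( $1 - 20 - 20 ))
}

# segments DATA_BYTES MTU: print the number of segments needed (rounded up)
segments() {
    m=$(mss "$2")
    echo $(( ($1 + m - 1) / m ))
}

echo "MSS for a 1500 Byte MTU: $(mss 1500)"
echo "Segments for 7300 Bytes: $(segments 7300 1500)"
```

Changing the first argument of segments shows how many full-sized segments other amounts of application data would need.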

Wireshark capturing in the stack

After IP adds a header to the TCP segments, the resulting IP datagrams are sent one-by-one to the "Ethernet layer". Note that TCP/IP are part of the operating system, while most functionality of Ethernet is implemented on the NIC. However, network drivers (let's consider them part of the OS) also perform some of the Ethernet functionality. The network driver creates/receives Ethernet frames. So in the above example, assuming segmentation offloading is not used, the 7,300 Bytes of application data are segmented into 5 TCP/IP packets containing 1460 Bytes of data each. The network driver encapsulates each IP datagram in an Ethernet frame and sends the frames to the NIC. It is these Ethernet frames that Wireshark (and other packet capture software, like tcpdump) captures. The NIC then sends the frames, one-by-one, over the network.

Now consider when segmentation offloading is used (as in the figure below). The OS does not segment the application data, but instead creates one large TCP/IP packet and sends that to the driver. The TCP and IP headers are in fact template headers. The driver creates a single Ethernet frame (which is captured by Wireshark) and sends it to the NIC. Now the NIC performs the segmentation. It uses the template headers to create 5 Ethernet frames with real TCP/IP/Ethernet headers. The 5 frames are then sent over the network.

Wireshark capturing in the stack

The result: although the same 5 Ethernet frames are sent over the network, Wireshark captures different data depending on the use of segmentation offloading. When not used, the 5 Ethernet frames are captured. When offloading is used, Wireshark only captures the single, large frame (containing 7,300 bytes of data).

To further illustrate segmentation offloading, and how to control it in Linux, consider the following tests performed on two Ubuntu computers, basil and ginger, connected on an Ethernet LAN. On basil (which has IP address 10.10.1.22) netcat in server mode is used to receive data:

sgordon@basil$ nc -l 5001

On ginger netcat in client mode is used to send 10,000 Bytes of data (stored in a file) to the server.

sgordon@ginger$ nc -p 5002 10.10.1.22 5001 < 10000bytes.txt 

tcpdump is used to see the captured IP packets, and in particular the size of the TCP segments. I could have used Wireshark, but the text output of tcpdump is easier to include in this page. ethtool is used to view and change the status of segmentation offloading (in this example, generic segmentation offload or GSO).

First note that ethtool shows us that generic segmentation offload is on.

sgordon@ginger$ sudo ethtool -k eth0
Offload parameters for eth0:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off
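If you want to check a single setting from a script, the listing can be filtered with awk. The following is a minimal sketch: offload_state is a made-up helper name, and it assumes the "name: value" layout shown above (the label text differs between ethtool versions, so adjust the feature name to match your own output).

```shell
#!/bin/sh
# offload_state FEATURE: read `ethtool -k` output on stdin and print the
# value reported for the named feature (for example "on" or "off").
# Assumption: lines look like "feature name: value", as in the listing above.
offload_state() {
    awk -F': ' -v feature="$1" '$1 == feature { print $2 }'
}

# Typical use (needs ethtool and a real interface):
#   sudo ethtool -k eth0 | offload_state "generic segmentation offload"

# Self-contained demonstration using the output from this post.
offload_state "generic segmentation offload" <<'EOF'
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off
EOF
```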

Now, after running the netcat client, let's see the output from tcpdump (for clarity I have omitted the option fields from selected TCP segments):

sgordon@ginger$ sudo tcpdump -i eth0 -n 'not port 22'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
18:30:24.899687 IP 192.168.1.2.5002 > 10.10.1.22.5001: S 679249855:679249855(0) win 5840 
18:30:24.900583 IP 10.10.1.22.5001 > 192.168.1.2.5002: S 1420594303:1420594303(0) ack 679249856 win 5792 
18:30:24.900612 IP 192.168.1.2.5002 > 10.10.1.22.5001: . ack 1 win 92
18:30:24.900713 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1:2897(2896) ack 1 win 92
18:30:24.900735 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 2897:4345(1448) ack 1 win 92 
18:30:24.902575 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 1449 win 68 
18:30:24.902591 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 4345:7241(2896) ack 1 win 92 
18:30:24.903597 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 2897 win 91 
18:30:24.903607 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 7241:8689(1448) ack 1 win 92 
18:30:24.903613 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 8689:10001(1312) ack 1 win 92 
18:30:24.903617 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 4345 win 114 
18:30:24.905573 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 5793 win 136 
18:30:24.905587 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 7241 win 159 
18:30:24.906628 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 8689 win 181 
18:30:24.906637 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 10001 win 204 

Each line shows a captured packet. The TCP segments containing data can be identified by the sequence numbers (for example 1:2897). The number in parentheses indicates the number of bytes in that TCP segment. We can see from the capture that our 10,000 Bytes of data is broken into 5 segments containing 2896, 1448, 2896, 1448 and 1312 Bytes. But wait ... 2896 Bytes in a TCP segment when the MSS is 1460? (In fact, because TCP header options such as timestamps take up space in each segment, the largest payload in this capture is 1448 Bytes.) This is Generic Segmentation Offloading at work: the OS is passing large segments down the stack, as captured above, and the real segmentation happens later, below the point where tcpdump captures (in the NIC with hardware offload or, in the case of GSO, just before the frames are handed to the driver).
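Spotting the oversized segments by eye gets tedious in longer captures, so here is a rough awk filter that does it. It is a sketch only: oversized is a made-up function name, and it assumes the older tcpdump line format used in this post, where the payload size appears in parentheses after the sequence numbers.

```shell
#!/bin/sh
# oversized MSS: read tcpdump lines on stdin, print the payload size of
# every data segment and mark those larger than MSS.
# Assumption: old-style tcpdump output such as "1:2897(2896)".
oversized() {
    awk -v mss="$1" '
        match($0, /\([0-9]+\)/) {
            len = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (len == 0) next            # SYN segments carry no data
            note = (len > mss) ? "  <- larger than MSS, segmented later" : ""
            print len note
            total += len
        }
        END { print "total " total }
    '
}

# Data lines (and one ACK) taken from the capture in this post.
oversized 1448 <<'EOF'
18:30:24.900713 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1:2897(2896) ack 1 win 92
18:30:24.900735 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 2897:4345(1448) ack 1 win 92
18:30:24.902575 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 1449 win 68
18:30:24.902591 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 4345:7241(2896) ack 1 win 92
18:30:24.903607 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 7241:8689(1448) ack 1 win 92
18:30:24.903613 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 8689:10001(1312) ack 1 win 92
EOF
```

The two 2896 Byte segments are flagged, and the total confirms that all 10,000 Bytes are accounted for.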

So now let's turn Generic Segmentation Offloading off using ethtool:

sgordon@ginger$ sudo ethtool -K eth0 gso off

And run the netcat transfer again and look at the tcpdump output this time:

sgordon@ginger$ sudo tcpdump -i eth0 -n 'not port 22'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
18:33:02.644356 IP 192.168.1.2.5002 > 10.10.1.22.5001: S 3144010294:3144010294(0) win 5840 
18:33:02.645427 IP 10.10.1.22.5001 > 192.168.1.2.5002: S 3901655238:3901655238(0) ack 3144010295 win 5792 
18:33:02.645471 IP 192.168.1.2.5002 > 10.10.1.22.5001: . ack 1 win 92 
18:33:02.645542 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1:1449(1448) ack 1 win 92 
18:33:02.645558 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1449:2897(1448) ack 1 win 92 
18:33:02.645567 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 2897:4345(1448) ack 1 win 92 
18:33:02.647415 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 1449 win 68 
18:33:02.647433 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 4345:5793(1448) ack 1 win 92 
18:33:02.647439 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 5793:7241(1448) ack 1 win 92 
18:33:02.648437 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 2897 win 91 
18:33:02.648446 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 7241:8689(1448) ack 1 win 92 
18:33:02.648451 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 8689:10001(1312) ack 1 win 92 
18:33:02.648460 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 4345 win 114 
18:33:02.650414 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 5793 win 136 
18:33:02.650428 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 7241 win 159 
18:33:02.651469 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 8689 win 181 
18:33:02.651476 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 10001 win 204

Now this is what we expect to see: 7 TCP segments, each no larger than 1448 Bytes.

What's the conclusion of all this? What is taught in lectures and textbooks is not always what you see in practice. I suggest turning offloading optimisations off to demonstrate the basic concepts, and then turning them back on again to illustrate the practical performance optimisations applied at the expense of theoretical layering principles.
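For a lab session, the off/on steps could be wrapped in a small script along these lines. Treat it as a sketch under a few assumptions: set_offload and the DRY_RUN variable are made up for this example, the interface is eth0 as in the tests above, and only gso is toggled (add other ethtool feature names, such as tso, if your NIC uses them).

```shell
#!/bin/sh
# set_offload IFACE STATE FEATURE...: switch the listed offload features
# to STATE ("on" or "off") with ethtool -K.
# With DRY_RUN=1 the command is printed instead of executed, which is
# handy for checking the script on a machine without root access.
set_offload() {
    iface=$1
    state=$2
    shift 2
    cmd="ethtool -K $iface"
    for feature in "$@"; do
        cmd="$cmd $feature $state"
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        sudo $cmd    # left unquoted on purpose so it splits into words
    fi
}

# Lab workflow: offloading off for the capture, back on afterwards.
DRY_RUN=1            # remove this line to really change the settings
set_offload eth0 off gso
echo "# ... run tcpdump and the netcat transfer here ..."
set_offload eth0 on gso
```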

Mostly Unix

It's always nice to see and deal with other people using Unix-based operating systems. If a student comes to me for help with a software problem, although I'll try to help no matter what OS, I am much more interested in spending time with them if they are using a Unix-based OS (Ubuntu, MacOS, etc). This is mainly because that is what I know best. Recently, more students have Ubuntu installed, either as dual-boot or in a virtual machine, on their personal laptops. And they are starting to learn that the Windows GUI way is not the only way. I am encouraging the further exploration of Linux in my networking lab (ITS332), where 90% of the tasks are completed on the command line in Ubuntu (the other 10% using Wireshark in Ubuntu). In other courses, although I don't require the students to use any specific operating system, I demonstrate how easy it is to complete networking tasks in Unix-based OSes, such as measuring the throughput of TCP in a network with varying packet loss rates (three command line operations using iperf and tc).

One of the local Unix gurus at SIIT is Yoichi. As well as doing his research, he has been active in labs and teaching introductory computing to SIIT/Thammasat students. He has started to blog about some of the simple, yet very powerful things that can be done via the command line in Ubuntu. Yoichi and other teaching assistants and graduate students are spreading the word in SIIT that Windows is not the only operating system available. And their efforts are starting to show as more students are coming to me for help on a network or programming assignment with their laptop booted into Ubuntu.

Ubuntu, Ubuntu, Ubuntu

Since returning from a holiday in Australia in April not much has been happening other than work. However over the past semester I finally made the switch to Ubuntu on all my computers.

In fact I have been running Ubuntu on all my computers since moving to Thailand. Both my home PC and work PC were running dual boot Windows XP and Ubuntu Desktop, and my home Pentium III web server was running Ubuntu Server. I was using Windows for most of my work. However, when my own server PC died I shifted to running Ubuntu full-time on my main home PC (acting as my web server and everyday PC), and a month ago I deleted the Windows partition (but have XP installed under VirtualBox). At work I haven't got around to deleting XP, but I only use it a few hours per week.

Ubuntu supports almost everything I need for my everyday computing activities at work and home: browsing, office applications, graphics, watching movies, occasional basic games. Of course it is also perfect for hosting my web and email server. The only things I've been needing Windows for are: running some Windows-only simulation software, opening/editing old documents in MS Office (most of my old lecture slides were developed in PowerPoint, and OpenOffice Presentation doesn't always handle the animations and fonts), and accessing web sites of selected financial institutions (IE only). Now all new work documents are created in OpenOffice. Slowly I am converting old documents, especially teaching material, to OpenOffice formats. It is a nice feeling knowing that my thousands of pages of teaching material will eventually be in open formats, no longer dependent on closed, commercial applications and operating systems.

As Wan has been getting used to Ubuntu on my home PC, when she recently got her new Samsung NC10 netbook I immediately installed Ubuntu Desktop 9.04, overwriting the Windows XP Home install. So now it's Ubuntu on multiple PCs at work, Ubuntu at home and Ubuntu on the road.

Fixing a Grub Error 15 from a Partition Resize

Wow, that was close! I decided to delete an old FAT32 partition that I used to share data between Linux (first Fedora, now Ubuntu) and Windows on my dual boot PC. Since Ubuntu supports NTFS partitions, I hardly ever used it. I used Partition Manager Professional in Windows to delete the partition. No worries.

However, I also decided to resize my /boot partition under Ubuntu. When I recently upgraded to Ubuntu 8.04, the process stalled several times because I didn't have enough space on the /boot partition (initially 100MB). This was very annoying because I had to move some files from /boot and restart the entire Ubuntu upgrade (several minutes wasted each time). So I used Partition Manager to also increase the /boot partition to 200MB. No worries.

Templates for Right Click Menu in Ubuntu Linux

When you right-click on a folder in Ubuntu Linux you are given an option to Create Document. Initially no templates are installed. To install a template, simply create a blank document in the ~/Templates directory and it will appear in the right-click menu.
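From the command line this could look something like the sketch below. The helper add_template and the TEMPLATES_DIR override are made up for this example; by default it uses the ~/Templates directory mentioned above.

```shell
#!/bin/sh
# add_template NAME: create an empty file called NAME in the Templates
# directory so that it is offered under Create Document.
# TEMPLATES_DIR is a hypothetical override, mainly useful for testing.
add_template() {
    dir="${TEMPLATES_DIR:-$HOME/Templates}"
    mkdir -p "$dir"
    touch "$dir/$1"    # touch leaves an existing template untouched
}
```

For example, add_template "Text File.txt" should make a blank "Text File.txt" entry available in the menu.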

Postfix

Postfix is a mail daemon for sending email (an alternative to sendmail).

Ubuntu

A Linux distribution that is relatively easy to install and use. I have tried Ubuntu and it does provide a user-friendly Linux distro, detecting hardware automatically so that things like sound and networking work straight away.

Installing Ubuntu Linux

Background

HP PhotoSmart C3180 Multifunction Colour Printer

After getting a new router yesterday I returned to Zeer Rangsit and got my next IT purchase today: an HP PhotoSmart C3180 colour printer and scanner for 4000 Baht.

Gnome

Linux desktop management system

Creating DVD Slideshows in Linux

  1. Install libquicktime 0.9.8:
    1. tar xzvf libquicktime-0.9.8.tar.gz
    2. cd libquicktime-0.9.8
    3. ./configure

Installing ns2 on Fedora Core 4

ns2 is a popular, free package for simulating computer networks and protocols, e.g. TCP, satellite links, ad hoc networks, routing protocols.

  1. Download the following packages from the ns2 website and extract them into your working directory. For example in my case /home/sgordon/ns2 is my install directory, therefore all the .tar.gz files are saved in this directory and apply tar xzvf filename.tar.gz on each:

Configuring Fedora on Dell Latitude D410 Laptop

Here are some notes from my installation and usage of Fedora Core 4 on a Dell Latitude D410 laptop.

Installation

My Dell Latitude D410 laptop already had Windows XP Professional installed on one partition, and a second partition was already created (when ordered through Dell).

Computing

Here you can find a brief description of my PC in Bangkok, my Web Server in Bangkok and the computer setup I used to have in Adelaide. You can also find some notes on installation and configuration issues using Linux (Ubuntu, Fedora) and other software (e.g. NS2).

PC in Bangkok

A couple of days after I arrived in Bangkok and got my room I purchased the following PC which also acts as my TV and sound system. Total price: 38000 Baht (approx $AU1350).
