ThinkingLinux ’06

ThinkingLinux ’06 was held in Melbourne a few days ago. It was organised by Synergy Plus with sponsorship by Red Hat, Novell and a few others.

I gave a talk on Open Source in the Data Centre. Luckily this talk was after lunch so I got to do some editing in the morning sessions to tweak it more towards a business rather than technical audience. 🙂

The conference was pretty awesome with interesting talks, ranging from Xen to how wotif.com was started.

Copies of the slides for all the talks should eventually make it onto the conference’s website.

Open Source in the Data Centre

Next Tuesday (17th Oct) I’ll be giving a presentation at Thinking Linux ’06 in Melbourne.

The talk is entitled Open Source in the Data Centre and I’ll be covering things like

  • Load Balancing “Stuff” (IPVS, keepalived, heartbeat)
  • Monitoring using Nagios and MRTG/rrdtool
  • Authentication with OpenLDAP and FreeRADIUS

and a whole lot of other random things I can fit into 40 minutes.

I choose to blame Pia for putting me in a position to give this talk but only because it’s Jeff’s fault and there isn’t a justblamejdub.com 🙂

If anyone wants to catch up on the Monday night down in Melbourne then let me know.

I’ll put slides up after the event.

Build your own ISP

I’ve finally gotten around to putting up the slides for the Build your own ISP talk I gave at Software Freedom Day and DEBSIG. You can find them on my Presentations page or via the direct link to the PDF here.

The slides are fairly sparse; the talk was a bit of a brain dump about random things to do with ISPs. I’m sure someone is going to ask me to give it at SLUG again at some stage 🙂

TCP Window Scaling and kernel 2.6.17+

So I was tearing my hair out today. I’d installed Ubuntu onto a new Sun X4200 so that I could migrate Bulletproof’s monitoring system to it. (Note you need to use edgy knot-1 for the SAS drives to be supported). Anyway as I was installing packages I was getting speeds like 10kB/s. Normally I would expect 800-1000kB/s.

I did the usual sort of debugging: were there any errors on the switch, was it affecting other servers on the same network, and so on. Everything looked fine. Our friend tcpdump showed a dump that looked something like this.


root@oldlace:~# tcpdump -ni bond0 port 80
tcpdump: listening on bond0
1.2.3.4.42501 > 203.16.234.85.80: S 0:0 win 5840 <mss 1460,sackOK,timestamp 94318 0,nop,wscale 6> (DF)
203.16.234.85.80 > 1.2.3.4.42501: S 0:0(0) ack 1 win 5840<mss 1460,nop,wscale 2> (DF)
1.2.3.4.42501 > 203.16.234.85.80: . ack 1 win 92 (DF)
1.2.3.4.42501 > 203.16.234.85.80: P 1:352(351) ack 1 win 92 (DF)
203.16.234.85.80 > 1.2.3.4.42501: . ack 352 win 1608 (DF)

You’ll notice that the server initially advertises a window size of 5840, then suddenly in the first ACK it is advertising a size of 92. This means that the other side can only send 92 bytes before waiting for an ACK!!! Not very conducive to quick WAN transfer speeds.

After a lot of Google searching I discovered these threads on LKML.

Of course what I was missing was the wscale 6, which means that the window was actually 92*2^6 = 5888. That is pretty close to 5840, so why bother with the scaling? Because towards the end of the connection we get 16022*2^6 = 1025408, which doesn’t fit into the 16-bit window field of a TCP header.

So why aren’t things screaming along with this massive window? Well, something in the middle doesn’t like a window scaling factor of 6 and is resetting it to zero, which means the other end thinks the window size really is 92.
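
The window arithmetic above is easy to check by hand, but here is a minimal Python sketch of it. The function name effective_window is made up for illustration; the numbers are the ones from the tcpdump output.

```python
# The window field in the TCP header is 16 bits, so 65535 is the largest
# value it can carry without scaling.
MAX_UNSCALED_WINDOW = 0xFFFF


def effective_window(advertised, wscale):
    """Return the real window in bytes: the advertised value shifted left by wscale."""
    return advertised << wscale


# What the sender intended with wscale 6.
print(effective_window(92, 6))       # 5888
print(effective_window(16022, 6))    # 1025408

# What the far end believes if something in the middle resets the scale to 0.
print(effective_window(92, 0))       # 92

# The large window only works because of scaling.
print(effective_window(16022, 6) > MAX_UNSCALED_WINDOW)  # True
```

With the scale factor stripped, the far end may only have 92 bytes in flight per round trip, which is why the transfer crawls.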

There are 2 quick fixes. First, you can simply turn off window scaling altogether by doing

echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

but that limits your window to 64k. Or you can limit the size of your TCP buffers back to pre-2.6.17 kernel values, which means a wscale value of about 2 is used, and that is acceptable to most broken routers.

echo "4096 16384 131072" > /proc/sys/net/ipv4/tcp_wmem
echo "4096 87380 174760" > /proc/sys/net/ipv4/tcp_rmem

The original values would have had 4MB in the last column above which is what was allowing these massive windows.
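
To see why a smaller buffer maximum leads to a smaller scale factor, here is a rough Python sketch. It assumes the stack picks the smallest shift that lets the maximum receive buffer fit into the 16-bit window field. That is a simplification for illustration, not a copy of the kernel code, and wscale_for is a made-up name.

```python
MAX_UNSCALED_WINDOW = 0xFFFF  # largest value the 16-bit window field can hold
MAX_WSCALE = 14               # upper limit on the shift allowed by the window scale option


def wscale_for(max_buffer_bytes):
    """Smallest shift that makes max_buffer_bytes fit in 16 bits (assumed rule)."""
    shift = 0
    space = max_buffer_bytes
    while space > MAX_UNSCALED_WINDOW and shift < MAX_WSCALE:
        space >>= 1
        shift += 1
    return shift


# Last column of the reduced tcp_rmem setting above.
print(wscale_for(174760))  # 2
```

Under this assumed rule, a larger maximum needs a larger shift, so the exact factor you see on the wire depends on the buffer limits actually configured on the host.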

In a thread somewhere which I can’t find anymore Dave Miller had a great quote along the lines of

“I refuse to workaround it, window scaling has been part of the protocol since 1999, deal with it.”