Puppet, Facts and Certificates

I’m currently setting up Puppet at Vquence so that, among other things, we can deploy hosts into Amazon EC2 more easily.

To ensure a minimum setup time on a new server I wanted the setup to be as simple as:

  • echo 'DAEMON_OPTS="-w 120 --fqdn newserver.vquence.com --server puppetmaster.vquence.com"' > /etc/default/puppet
  • aptitude install puppet

This means that the puppet client will use newserver.vquence.com as the common name in the SSL certificate it creates for itself. On the puppet master the SSL cert name is then used to pick a node rather than the hostname reported by facter.

This also means I don’t need to worry about setting up /etc/hostname; even better, /etc/hostname itself can be managed by puppet.

You can control this functionality on the puppet master by using the node_name option. From the docs:

    # How the puppetmaster determines the client's identity 
    # and sets the 'hostname' fact for use in the manifest, in particular 
    # for determining which 'node' statement applies to the client. 
    # Possible values are 'cert' (use the subject's CN in the client's 
    # certificate) and 'facter' (use the hostname that the client 
    # reported in its facts)
    # The default value is 'cert'.
    # node_name = cert

The problem was that the ‘hostname’ fact wasn’t being set. It looks like there was a regression in SVN#1673 when some refactoring was performed.
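
To make the intended behaviour concrete, here is a rough Ruby sketch of the idea only; it is not Puppet’s actual code, and the method and variable names are mine. With node_name set to cert, the master should take the client’s identity from the certificate’s CN and overwrite the ‘hostname’ fact to match; with facter, it should just trust the hostname the client reported.

# Sketch only -- not Puppet's implementation, just the behaviour described above.
def client_identity(node_name_setting, cert_cn, facts)
    if node_name_setting == 'cert'
        facts['hostname'] = cert_cn   # this is the step the regression skipped
        cert_cn
    else
        facts['hostname']
    end
end

facts = { 'hostname' => 'ip-10-251-1-1' }                # what facter reported
client_identity('cert', 'newserver.vquence.com', facts)
puts facts['hostname']                                   # => newserver.vquence.com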

I’ve filed bug #1133 and you can clone my git repository.

I haven’t included any tests in the patch as I’m not sure how to. The master.rb test already tests this functionality but doesn’t test that the facts object has actually been changed. I think a test on getconfig is probably required but I’m not sure how you would access the facts after calling it.

Update: This patch is now in puppet as of 0.24.3.

Amazon EC2 ruby gem and large user_data

When you create an instance in EC2 you can send Amazon some user data that is accessible by your instance. At Vquence we use this to send a script that gets executed at boot up. This script contains some OpenVPN and Puppet RSA keys, so it’s approaching about 10k in size.
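
To give a sense of scale, here is a rough sketch of how the payload gets that big once you embed keys in it. The paths and script contents below are made up for illustration; this is not our actual boot script.

#!/usr/bin/env ruby
# Sketch only: build a boot-time user_data payload and see how big it ends up.
require 'base64'

# Hypothetical key locations -- substitute whatever your boot script needs.
openvpn_key = File.read('keys/client.key')
puppet_key  = File.read('keys/puppet_private.pem')

user_data = <<SCRIPT
#!/bin/sh
cat > /etc/openvpn/client.key <<'EOF'
#{openvpn_key}
EOF
cat > /var/lib/puppet/ssl/private_keys/client.pem <<'EOF'
#{puppet_key}
EOF
SCRIPT

# EC2 expects the user data base64 encoded, and the encoded form is what
# counts towards any request size limit.
encoded = Base64.encode64(user_data)
puts "raw: #{user_data.size} bytes, base64: #{encoded.size} bytes"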

This works without any problems when using the Java-based command line tools. However, I was getting the following error when using the EC2 Ruby gem.

/usr/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Connection reset by peer (Errno::ECONNRESET)
	from /usr/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
	from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
	from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
	from /usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
	from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
	from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
	from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
	from /usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
	 ... 6 levels...
	from ./lib/ec2helpers.rb:43:in `start_instance'
	from ./ec2-puppet:107
	from ./ec2-puppet:89:in `each_pair'
	from ./ec2-puppet:89

Some tcpdumping indicated that after receiving the request Amazon waits for a while and then sends a TCP reset. Not very nice at all. My next step was to use ngrep to compare the output of the command line tools and the Ruby gem. This got nowhere fast, since the command line tools use the SOAP API while the Ruby gem uses the Query API.

What I did notice, however, is that while the command line tools performed a POST, the Ruby library performed a GET. At this stage I decided to test how much data I could actually send, so I started trying different user data sizes. The offending amount was around 7.8k, suspiciously close to 8k.

The HTTP/1.1 spec doesn’t place an actual limit on the length of a URI but leaves it up to the server:

The HTTP protocol does not place any a priori limit on the length of
a URI. Servers MUST be able to handle the URI of any resource they
serve, and SHOULD be able to handle URIs of unbounded length if they
provide GET-based forms that could generate such URIs. A server
SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section 10.4.15).


Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.

Apache, for example, limits the request line to 8190 bytes by default, including the method and the protocol. You can change this using the LimitRequestLine directive.

I created a patch to modify the EC2 gem to use a POST instead of a GET, which has no such limitation. You can find the git tree for it at http://inodes.org/~johnf/git/amazon-ec2
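
For the curious, the change boils down to something like the following. This is a simplified sketch of the GET versus POST difference, not the gem’s actual code, which also has to sign the request.

require 'net/http'
require 'cgi'

# A stand-in for a request carrying a large UserData value.
params = { 'Action' => 'RunInstances', 'UserData' => 'x' * 10_000 }
query  = params.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v)}" }.join('&')

# GET: everything lands in the request line, which is where the ~8k limit bites.
get_request_line = "GET /?#{query} HTTP/1.1"
puts "GET request line is #{get_request_line.size} bytes"

# POST: the parameters travel in the request body, so the request line stays tiny.
post = Net::HTTP::Post.new('/')
post.set_form_data(params)
puts "POST request line is #{'POST / HTTP/1.1'.size} bytes, body is #{post.body.size} bytes"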

EC2UI extension for Firefox 3

I’ve been doing some work with Amazon EC2 the last few days. An invaluable tool is the EC2UI Firefox extension that Amazon have written. It provides a simple GUI inside the Firefox chrome which makes it really easy to manipulate your EC2 instances.

A few weeks ago Hardy moved to Firefox 3. This meant, amongst other things, that the Amazon extension stopped working. The Firefox developers have a webpage up that explains how to update extensions for Firefox 3.

The main problem was with changes to the password manager. You can find my changes on my bzr branch, along with a packaged-up version of the extension, EC2UI for Firefox 3.0b4.

Update: See comments below for new versions

Vim and spell checking

I just discovered Vim has spell checking. No more having to manually spell check in mutt with ispell when writing emails. Hurray!!

In your .vimrc file simply add:

setlocal spell spelllang=en_au

Note: By default Vim only installs en_us spell files. If you are running Debian there is a vim-spellfiles package. There is an Ubuntu bug to do something about this as well. Since I’m using Ubuntu I simply grabbed the en directory from ftp://ftp.vim.org/pub/vim/runtime/spell/ and dumped it in /usr/share/vim/vim71/spell.

Vim will now highlight words it thinks are misspelled. The magic incantations you will need are:

z= – Suggest alternatives for the word
zg – Add word to dictionary
zw – Mark word as misspelled

Squid and Rails caching

At Vquence our Rails setup looks something like this.

------------     ---------     ------------ 
| Internet |---->| Squid |---->| Mongrels | 
------------     ---------     ------------ 

(Who needs Inkscape when you have ASCII art)

This infrastructure is hosted in the US, and up until recently squid hadn’t really been doing much of anything except sitting there.

A few months ago, when we signed a contract with an Australian customer, we decided we needed to place a squid cache in Australia which would actually cache content, for two reasons: firstly, the US is a long way away and the 300ms latency is really noticeable; secondly, some of our pages involving graphs have long statistical calculations which can take minutes to render. (OK, it’s really because no one has had a chance to optimise them yet, but let’s pretend that’s not the case.) So for the Australian customers we changed the above setup to look like the following.

------------     ------------     ------------     ------------
| Internet |---->| Squid AU |---->| Squid US |---->| Mongrels |
------------     ------------     ------------     ------------

We hand out URLs like http://www.client.b2b.vquence.com/widget to Australian customers, and the Rails backend is smart enough to make sure all the URLs look similar (I’ll blog about how I did that another time).

Without much time to look into things properly, I did some really nasty things on the AU squid cache to make sure it cached the pages.

refresh_pattern /client/graph  1440    0%    1440    ignore-no-cache ignore-reload
refresh_pattern /client/static 1440    0%    1440    ignore-no-cache ignore-reload
refresh_pattern /client/video  1440    0%    1440    ignore-no-cache ignore-reload

This is evil and breaks a whole heap of RFCs, but it did the trick and got us out of a bind quickly.

A few weeks ago I moved the production site to Rails 2.0, and I noticed around this time that the caching had stopped working. The client was no longer using our services as their campaign had finished, so it wasn’t an urgent concern.

It seems that Rails 2.0 goes one step further to ensure that caches don’t cache content; instead of just sending

Cache-Control: no-cache

it now sends

Cache-Control: private, max-age=0, must-revalidate

I tried adding ignore-private, since if you’re breaking some aspects of the RFC you may as well break a couple more, but squid still refused to cache the content. After struggling with this for a bit I decided that the universe was trying to tell me I should actually do things properly.

So with squid set back to its defaults, I went exploring how to accomplish this properly. Google wasn’t all that helpful at first, since most Rails caching articles talk about caching to static files; most sites don’t put a reverse proxy in front for caching. It turns out, however, that it’s fairly simple. In the appropriate actions in your controllers simply do the following.

class VideoController < ApplicationController

    def vquence
        expires_in 8.hours, :private => false
        render :template => "videos/vquence"
    end

end

This will send the following header and cache the page for 8 hours.

Cache-Control: max-age=28800

Now everything is much faster!!