John’s Tidbits

Moo - Development, Trouble-shooting and Random thoughts…


Puppet, Facts and Certificates

I’m currently setting up Puppet at Vquence so that, among other things, we can deploy hosts into Amazon EC2 more easily.

To ensure a minimum setup time on a new server I wanted the setup to be as simple as

  • echo 'DAEMON_OPTS="-w 120 --fqdn newserver.vquence.com --server puppetmaster.vquence.com"' > /etc/default/puppet
  • aptitude install puppet

This means that the puppet client will use newserver.vquence.com as the common name in the SSL certificate it creates for itself. On the puppet master the SSL cert name is then used to pick a node rather than the hostname reported by facter.

This means that I don’t need to worry about setting up /etc/hostname. Even better, /etc/hostname can itself be managed by puppet.

You can control this functionality on the puppet master by using the node_name option. From the docs:

    # How the puppetmaster determines the client's identity
    # and sets the 'hostname' fact for use in the manifest, in particular
    # for determining which 'node' statement applies to the client.
    # Possible values are 'cert' (use the subject's CN in the client's
    # certificate) and 'facter' (use the hostname that the client
    # reported in its facts)
    # The default value is 'cert'.
    # node_name = cert
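The behaviour those docs describe can be sketched in a few lines of plain Ruby. This is only an illustration and not Puppet's actual code: the resolve_node helper, its arguments and the example EC2-style hostname are all made up here, and I'm assuming the 'hostname' fact is simply overwritten with the certificate CN.

```ruby
# Hypothetical helper, NOT taken from Puppet: picks the node name the way the
# node_name docs describe and returns it together with the (possibly updated) facts.
def resolve_node(node_name_setting, cert_cn, facts)
  case node_name_setting
  when 'cert'
    # Trust the CN from the client's SSL certificate and expose it as the
    # 'hostname' fact (assumption: the fact is overwritten with the full CN).
    [cert_cn, facts.merge('hostname' => cert_cn)]
  when 'facter'
    # Trust whatever hostname the client reported in its facts.
    [facts['hostname'], facts]
  else
    raise ArgumentError, "unknown node_name setting: #{node_name_setting}"
  end
end

# Made-up example: an EC2 host that still has its default hostname.
reported = { 'hostname' => 'domU-12-31-39-00-00-01' }
name, facts = resolve_node('cert', 'newserver.vquence.com', reported)
puts name
puts facts['hostname']
```

With 'cert' both the node name and the 'hostname' fact come from the certificate, which is what makes it safe to leave /etc/hostname alone until puppet has run.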

The problem was that the ‘hostname’ fact wasn’t being set. It looks like there was a regression in SVN#1673 when some refactoring was performed.

I’ve filed bug #1133 and you can clone my git repository.

I haven’t included any tests in the patch as I’m not sure how to. The master.rb test already tests this functionality but doesn’t test that the facts object has actually been changed. I think a test on getconfig is probably required but I’m not sure how you would access the facts after calling it.

Update: This patch is now in puppet as of 0.24.3.

Amazon EC2 ruby gem and large user_data

When you create an instance in EC2 you can send Amazon some user data that is accessible by your instance. At Vquence we use this to send a script that gets executed at boot up. This script contains some openvpn and puppet RSA keys, so it’s approaching 10k in size.

This works without any problems when using the Java-based command line tools. However, I was getting the following error when using the EC2 Ruby gem.

/usr/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Connection reset by peer (Errno::ECONNRESET)
	from /usr/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
	from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
	from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
	from /usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
	from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
	from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
	from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
	from /usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
	 ... 6 levels...
	from ./lib/ec2helpers.rb:43:in `start_instance'
	from ./ec2-puppet:107
	from ./ec2-puppet:89:in `each_pair'
	from ./ec2-puppet:89

Doing some tcpdumping indicated that after receiving the request Amazon waits for a while and then sends a TCP RESET. Not very nice at all. My next step was to use ngrep to compare the output from the command line tools and the ruby gem. This got nowhere fast since the command line tools use the SOAP API while the ruby gem uses the Query API.

What I did notice, however, is that while the command line tools performed a POST, the ruby library performed a GET. At this stage I decided to test how much data I could send, so I started trying different user data sizes. The offending amount was around 7.8k, suspiciously close to 8k.

The HTTP/1.1 spec doesn’t place an actual limit on the length of a URI but leaves it up to the server.

The HTTP protocol does not place any a priori limit on the length of
a URI. Servers MUST be able to handle the URI of any resource they
serve, and SHOULD be able to handle URIs of unbounded length if they
provide GET-based forms that could generate such URIs. A server
SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section 10.4.15).


Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.

Apache, for example, limits the request line to 8190 bytes by default, including the method and the protocol. You can change this using the LimitRequestLine directive.

I created a patch to modify the EC2 Gem to use a POST instead of a GET which has no such limitations. You can find the git tree for it at http://inodes.org/~johnf/git/amazon-ec2
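To see why the switch matters, here is a minimal sketch (not the actual patch) that builds the same parameters both ways without sending anything. The parameter names and the 6300-byte payload are only for illustration, and 8190 is the Apache default mentioned above; I'm not claiming that is the limit Amazon uses.

```ruby
require 'net/http'
require 'uri'

# Apache's default LimitRequestLine, used here purely as a yardstick.
REQUEST_LINE_LIMIT = 8190

# Roughly what a GET puts on the wire as its first line.
def get_request_line(path, params)
  "GET #{path}?#{URI.encode_www_form(params)} HTTP/1.1"
end

# The same parameters as a form-encoded POST: they travel in the body,
# so the request line stays short.
def build_post(path, params)
  request = Net::HTTP::Post.new(path)
  request.set_form_data(params)
  request
end

# Illustrative payload: 6300 bytes of script, Base64 encoded without newlines.
user_data = ['x' * 6300].pack('m0')
params = { 'Action' => 'RunInstances', 'UserData' => user_data }

line = get_request_line('/', params)
post = build_post('/', params)

puts "GET request line: #{line.bytesize} bytes (limit #{REQUEST_LINE_LIMIT})"
puts "POST request line: #{"POST #{post.path} HTTP/1.1".bytesize} bytes, body: #{post.body.bytesize} bytes"
```

The GET version blows past the limit as soon as the encoded user data gets near 8k, while the POST request line stays a handful of bytes no matter how big the body is.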

Rails, ActiveRecord, MySQL, GUIDs and the rename_column bug

Since I wasted over 4 hours of my life today working my way through this problem I feel the need to share.

Since it seems to be the in thing in the Web 2.0 space, just to be cool, we use GUIDs to identify different objects in our URLs at Vquence. For example, my randomly created vquence on Rails has a GUID of

cDuIhGWb8r3lDxaby-aaea

Andy Singleton has written a rails plugin called, funnily enough, guid. This allows you to do the following in your model.

class Vquence < ActiveRecord::Base
  usesguid :column => 'guid'
end

Once you do this you will automatically get GUID looking identifiers in the db and your application. The guid column in the DB gets mapped to Vquence.id so you can do things like

Vquence.find('cDuIhGWb8r3lDxaby-aaea');

We used to use Lucene as our search index; we now use Sphinx. Sphinx requires that you have an integer id for each document in your index. This is to make your SQL queries much faster. The dumb way to create your index is to use queries like the following.

SELECT * FROM videos LIMIT 0,10000
SELECT * FROM videos LIMIT 10000,10000
...
SELECT * FROM videos LIMIT 990000,10000

I know this as it’s what we originally used with Lucene. This works fine until you reach about 1,000,000 rows. The problem is that since there is no implicit ordering or range in the above queries, for the final query MySQL needs to work out what the first 990,000 rows are and then return you the last 10,000.

A much better way to do it is the following

SELECT * FROM videos WHERE integer_id >= 1 and integer_id <= 10000
SELECT * FROM videos WHERE integer_id >= 10001 and integer_id <= 20000
...
SELECT * FROM videos WHERE integer_id >= 990001 and integer_id <= 1000000

This is fast as long as integer_id is indexed.
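If you want to generate those range queries rather than type them out, a small sketch like the following would do. The id_range_queries helper is just something made up for this post, not part of Sphinx or Rails, and it assumes the ids start at 1.

```ruby
# Hypothetical helper: builds one range query per batch of ids.
# Assumes integer_id starts at 1 and max_id is the highest id in the table.
def id_range_queries(table, max_id, batch_size)
  (1..max_id).step(batch_size).map do |first|
    # Clamp the final batch so it never runs past max_id.
    last = [first + batch_size - 1, max_id].min
    "SELECT * FROM #{table} WHERE integer_id >= #{first} AND integer_id <= #{last}"
  end
end

queries = id_range_queries('videos', 1_000_000, 10_000)
puts queries.first
puts queries.last
```

Each query only touches the rows in its own id range, so the last batch costs about the same as the first.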

So to accommodate this in Rails we began using migrations like the following.

class Videos < ActiveRecord::Migration
  def self.up
    create_table :videos do |t|
      t.column :uuid, :string, :limit =>22, :null => false
      …

      t.timestamps
    end
    add_index :videos, :uuid, :unique => true
    rename_column :videos, :id, :integer_id
  end

  def self.down
    drop_table :videos
  end
end

This was all done months ago and the repercussions didn’t rear their ugly head until today. Previously everything in the videos table had been created by our external crawler and Rails never needed to insert into the table. Today I wrote some code that inserted into the videos table and everything broke horribly.

The problem is that ActiveRecord can still see the integer_id field and tries to insert a 0 value into it. It isn’t clever enough to realise that it is an auto increment field and to leave it alone. After some help from bitsweat on #RoR I implemented a dirty hack to hide the integer_id column from ActiveRecord. Thanks to Ruby, overriding the ActiveRecord internals is really easy, and I added the following to our guid plugin.

  # HACK (JF) - This is too evil to even blog about
  # When we use guid as a primary key we usually rename the original 'id'
  # field to 'integer_id'. We need to hide this from rails so it doesn't
  # mess with it. WARNING: This means once you use usesguid anywhere you can
  # never access a column in any table anywhere called 'integer_id'

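class ActiveRecord::Base
  private
    # Keep a handle on the original implementation so we can wrap it.
    alias :original_attributes_with_quotes :attributes_with_quotes

    def attributes_with_quotes(include_primary_key = true, include_readonly_attributes = true)
      # Pass the caller's arguments through rather than forcing them to true.
      quoted = original_attributes_with_quotes(include_primary_key, include_readonly_attributes)
      # Hide integer_id so ActiveRecord never writes to it.
      quoted.delete('integer_id')
      quoted
    end
end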
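The alias-and-wrap trick is easier to see outside of Rails. Here is a minimal sketch using a stand-in class; SketchRecord and its attributes are invented for illustration and have nothing to do with the real ActiveRecord internals. The important detail is that the wrapper hands the caller's arguments on to the original method unchanged.

```ruby
# Hypothetical stand-in for a model class, NOT ActiveRecord itself.
class SketchRecord
  def initialize(attrs)
    @attrs = attrs
  end

  # Returns the attributes that would end up in an INSERT or UPDATE.
  def attributes_with_quotes(include_primary_key = true)
    quoted = @attrs.dup
    quoted.delete('id') unless include_primary_key
    quoted
  end
end

# Reopen the class, keep the old method under a new name and wrap it.
class SketchRecord
  alias original_attributes_with_quotes attributes_with_quotes

  def attributes_with_quotes(include_primary_key = true)
    # Forward the caller's argument as-is, then hide the extra column.
    quoted = original_attributes_with_quotes(include_primary_key)
    quoted.delete('integer_id')
    quoted
  end
end

record = SketchRecord.new('id' => 'abc', 'integer_id' => 0, 'title' => 'moo')
visible = record.attributes_with_quotes
without_pk = record.attributes_with_quotes(false)
p visible
p without_pk
```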

So this worked like a charm and after 4 hours I thought my pain was over, but then I tried to add a second row to my test database. This resulted in the following.

 Mysql::Error: Duplicate entry '0' for key 1: INSERT INTO `videos` (`updated_at`, `sort_order`, `guid`, `description`,
 `user_id`, `created_at`) VALUES('2008-01-11 16:45:05', NULL, 'bcOMPqWaGr3k5CabxfFyeK', '', 5, '2008-01-11 16:44:28');

I ran the same SQL with the MySQL client and got the same error. I then looked at the table and saw the following

mysql> show columns from moo;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| integer_id | int(11)     | NO   | PRI | 0       |       |
| guid       | varchar(22) | NO   | UNI |         |       |
+------------+-------------+------+-----+---------+-------+

What I expected to see was

mysql> show columns from moo;
+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| integer_id | int(11)     | NO   | PRI | NULL    | auto_increment |
| guid       | varchar(22) | NO   | UNI |         |                |
+------------+-------------+------+-----+---------+----------------+

The difference is that when the column was renamed it seems to have lost its auto increment and NOT NULL properties. Some investigation showed that the SQL being used to rename the column was

ALTER TABLE `videos` CHANGE `id` `integer_id` int(11)

when it should be

ALTER TABLE `videos` CHANGE `id` `integer_id` int(11) NOT NULL AUTO_INCREMENT

It seems that this is already filed as a bug on the rails site, including a patch.

Funnily enough that bug is owned by bitsweat. It seems he’s managed to help me out twice in one day :) It doesn’t seem that the fix made it into Rails 2.0 though, so until it does be careful about renaming columns using migrations.
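One way to be careful in the meantime is to skip rename_column for auto increment columns and write the ALTER TABLE yourself. The sketch below only builds the statement; the helper name is made up, it is MySQL specific and it assumes an int(11) key like the one above. In a migration you would hand the resulting string to execute.

```ruby
# Hypothetical helper: builds a MySQL rename statement that keeps the
# NOT NULL and AUTO_INCREMENT properties of an int(11) key column.
def rename_auto_increment_sql(table, old_name, new_name)
  "ALTER TABLE `#{table}` CHANGE `#{old_name}` `#{new_name}` int(11) NOT NULL AUTO_INCREMENT"
end

sql = rename_auto_increment_sql('videos', 'id', 'integer_id')
puts sql
```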

CeBIT - Open Source and Business Communities

At CeBIT last week I participated in a panel discussion on Open Source and Business Communities as part of OpenCeBIT. Also on the panel were Simon Phipps and Jon Oxer.

Simon Phipps created a podcast of the event which you can find here:

You can find the slides I used for reference during my presentation here.

Ubuntu, VLANs and Bridges

Bridge and VLAN support has improved dramatically under Ubuntu, and probably Debian as well, since I last looked into it. Once upon a time, to create a bridge linked to a VLAN interface you would have to do horrible things like:

auto eth0
iface eth0 inet manual
    pre-up /sbin/vconfig set_name_type VLAN_PLUS_VID_NO_PAD || true

auto vlan7
iface vlan7 inet manual
    pre-up /sbin/vconfig add eth0 7 || true
    post-down /sbin/vconfig rem vlan7 || true

auto br0
iface br0 inet static
    pre-up brctl addbr br0
    pre-up brctl addif br0 vlan7
    post-down brctl delbr br0
    address 10.38.38.1
    netmask 255.255.255.0
    network 10.38.38.0
    broadcast 10.38.38.255

Now the bridge-utils and vlan packages provide hooks into the ifup and ifdown commands so you can simply do

auto br-vlan4
iface br-vlan4 inet static
    address 10.38.38.1
    netmask 255.255.255.0
    network 10.38.38.0
    broadcast 10.38.38.255
    vlan-raw-device eth1
    bridge_ports vlan4
    bridge_maxwait 0
    bridge_fd 0
    bridge_stp off

Which will automagically

  • Bring up eth1
  • Create vlan4 bound to the eth1 interface
  • Bring up vlan4
  • Create the br-vlan4 bridge with vlan4 attached
  • Give eth1 the same HW address as br-vlan4
  • Bring up br-vlan4 with the IP address

Nifty!

Mongrel, rails and the theory of relativity

Summary (E = mc²)

When using mongrel for rails, if you want to deploy an app under /other_url then use

    ActionController::AbstractRequest.relative_url_root = "/other_url"

in config/environments/production.rb instead of

    ENV['RAILS_RELATIVE_URL_ROOT'] = "/other_url"

Proof (From first principles)

At Vquence we have a pretty standard rails setup

  • Apache with mod_proxy
  • pen
  • mongrel

Silvia recently wrote an application to allow us to edit the news articles posted to our corporate website. I wanted to do something I thought would be pretty simple: have the application appear at /news on our admin web server.

Step one was the obvious change to mod_proxy

    ProxyPass /news http://localhost:8000
    ProxyPassReverse /news http://localhost:8000

Of course the problem is that the rails app still thinks it is living on / so it returns URLs like /stylesheets/moo.css instead of /news/stylesheets/moo.css.

A bit of googling found a few email threads with a common solution. In your environment.rb set

    ENV['RAILS_RELATIVE_URL_ROOT'] = "/other_url"

This is where things fell apart fairly quickly. I could not get this to work no matter what I tried. After a few hours of following an HTTP request through the whole Mongrel and rails stack I discovered the following.

Setting RAILS_RELATIVE_URL_ROOT will work fine if you are running rails using CGI, for the simple reason, which should have been obvious to me sooner, that CGIs use environment variables to access their parameters. This can be seen in the ruby CGI class

/usr/lib/ruby/1.8/cgi.rb:


class CGI

def env_table
    ENV
end

However, mongrel overrides env_table and does the following instead

/usr/lib/ruby/1.8/mongrel/cgi.rb:


class CGIWrapper < ::CGI

    # Used to wrap the normal env_table variable used inside CGI.
    def env_table
        @request.params
    end

This makes sense since the rails code is now running inside the web server, so environment variables aren’t necessary. Upon investigation I found that the URL morphing magic is performed within rails as follows.

/usr/share/rails/actionpack/lib/action_controller/request.rb:


  class AbstractRequest
    cattr_accessor :relative_url_root

    # Returns the path minus the web server relative installation directory.
    # This can be set with the environment variable RAILS_RELATIVE_URL_ROOT.
    # It can be automatically extracted for Apache setups. If the server is not
    # Apache, this method returns an empty string.
    def relative_url_root
      @@relative_url_root ||= case
        when @env["RAILS_RELATIVE_URL_ROOT"]
          @env["RAILS_RELATIVE_URL_ROOT"]
        when server_software == 'apache'
          @env["SCRIPT_NAME"].to_s.sub(/\/dispatch\.(fcgi|rb|cgi)$/, '')
        else
          ''
      end
    end
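The whole chain can be reproduced in isolation with a stand-in class. SketchRequest below is invented for this post and only mimics the lookup order shown above (without the apache branch); it is not the real AbstractRequest. The hash passed in the second and third cases plays the role of mongrel's @request.params.

```ruby
# Hypothetical stand-in that mimics the lookup order above, NOT the rails class.
class SketchRequest
  @@relative_url_root = nil

  # Class-level setter, playing the role of cattr_accessor's writer.
  def self.relative_url_root=(value)
    @@relative_url_root = value
  end

  def initialize(env)
    @env = env
  end

  # Uses the class-level value if set, else whatever the env table holds.
  def relative_url_root
    @@relative_url_root ||= (@env['RAILS_RELATIVE_URL_ROOT'] || '')
  end
end

ENV['RAILS_RELATIVE_URL_ROOT'] = '/news'

# CGI style: the env table is the process environment, so the variable is seen.
cgi_root = SketchRequest.new(ENV).relative_url_root

# Mongrel style: the env table is the request params, so ENV is never consulted.
SketchRequest.relative_url_root = nil
mongrel_root = SketchRequest.new('REQUEST_METHOD' => 'GET').relative_url_root

# Setting the class-level value works no matter what the env table contains.
SketchRequest.relative_url_root = '/news'
fixed_root = SketchRequest.new('REQUEST_METHOD' => 'GET').relative_url_root

p cgi_root
p mongrel_root
p fixed_root
```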

What this all means is that you can solve the whole problem by placing the following in your config/environments/production.rb

    ActionController::AbstractRequest.relative_url_root = "/other_url"

Now if only Einstein had put his theories to good use and invented a time machine then maybe I could get the last 4 hours of my life back :)

Update: Make sure /other_url isn’t the same name as one of your controllers or bad things happen.

linux.conf.au brings about another change

Being Technical Guru for linux.conf.au 2007 was one of the most amazing experiences I’ve had in recent years. It was a lot of hard work but it was totally worth it. Having a room burst into applause at the penguin dinner when you say you’re the network guy is pretty unbelievable.

I went up to the Hunter for a week to recover from the conference and as usual after linux.conf.au I did a lot of thinking as to whether it was time to try something new. This time change won out at the end of the day and after 6 years at Bulletproof I decided it was time to move on.

At the beginning of March I started as Director of Engineering at Vquence. Since we are a video company it was decided that we each needed to have our own video on the web.

The past three weeks have been so hectic that Bulletproof already seems a lifetime ago. I’ve been involved in everything from setting up the new office and the corporate infrastructure to product development.

Joining a startup right at the beginning is always an amazing experience. With just a few people on the ground you always get pulled in a few million directions and there is always a new challenge just another five minutes away. I’d definitely recommend that anyone jump at the opportunity if it ever presents itself.