Amazon EC2 Ruby gem and large user_data

When you create an instance in EC2 you can send Amazon some user data that is accessible by your instance. At Vquence we use this to send a script that gets executed at boot. This script contains some OpenVPN and Puppet RSA keys, so it's approaching about 10k in size.
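
For reference, here is a minimal sketch of launching an instance with user data via the amazon-ec2 gem. The credentials, AMI ID and script name are placeholders, and as far as I can tell the gem Base64-encodes the user data for you:

require 'rubygems'
require 'EC2'

ec2 = EC2::Base.new(:access_key_id     => 'AKIA...',
                    :secret_access_key => 'secret')

# The boot script containing the openvpn and puppet keys.
user_data = File.read('bootstrap.sh')

ec2.run_instances(:image_id  => 'ami-12345678',
                  :min_count => 1,
                  :max_count => 1,
                  :user_data => user_data)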

This works without any problems when using the Java-based command line tools. However, I was getting the following error when using the EC2 Ruby gem.

/usr/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Connection reset by peer (Errno::ECONNRESET)
	from /usr/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
	from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
	from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
	from /usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
	from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
	from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
	from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
	from /usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
	 ... 6 levels...
	from ./lib/ec2helpers.rb:43:in `start_instance'
	from ./ec2-puppet:107
	from ./ec2-puppet:89:in `each_pair'
	from ./ec2-puppet:89

Some tcpdumping showed that after receiving the request, Amazon waits for a while and then sends a TCP reset. Not very nice at all. My next step was to use ngrep to compare the output of the command line tools with that of the Ruby gem. That got nowhere fast, since the command line tools use the SOAP API while the Ruby gem uses the Query API.

What I did notice, however, is that while the command line tools performed a POST, the Ruby library performed a GET. At this stage I decided to test how much data I could send, so I started trying different user data sizes. The offending amount was around 7.8k, suspiciously close to exactly 8k.
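
The probing was nothing fancy, just something along these lines, reusing the ec2 object from the earlier sketch (sizes and AMI ID are illustrative, and note that each successful call really does launch an instance, so terminate them afterwards):

# Probe increasing user_data sizes until EC2 resets the connection.
(1..16).each do |kb|
  data = 'x' * (kb * 1024)
  begin
    ec2.run_instances(:image_id => 'ami-12345678', :user_data => data)
    puts "#{kb}k: OK"
  rescue Errno::ECONNRESET
    puts "#{kb}k: connection reset by peer"
    break
  end
end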

The HTTP/1.1 spec (RFC 2616, section 3.2.1) doesn't place a hard limit on the length of a URI but leaves it up to the server:

The HTTP protocol does not place any a priori limit on the length of
a URI. Servers MUST be able to handle the URI of any resource they
serve, and SHOULD be able to handle URIs of unbounded length if they
provide GET-based forms that could generate such URIs. A server
SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section 10.4.15).

Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.

Apache, for example, limits the request line to 8190 bytes by default, including the method and the protocol version. You can change this with the LimitRequestLine directive.
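
On a server you do control (EC2's frontend obviously isn't one of them), raising it looks like this:

# httpd.conf: raise the request line limit from the 8190-byte default
LimitRequestLine 16380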

I created a patch that modifies the EC2 gem to use a POST instead of a GET, which avoids the limit entirely. You can find the git tree for it at http://inodes.org/~johnf/git/amazon-ec2
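
The gist of the change, as a simplified sketch rather than the patch itself (the real gem also signs the Query API parameters, which 'params' stands in for here), is to move the parameters out of the request line and into the request body:

require 'net/https'
require 'uri'

uri  = URI.parse('https://ec2.amazonaws.com/')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

params = { 'Action' => 'RunInstances', 'UserData' => '...' }

# Before: everything rides on the request line and hits the ~8k cap
# (URL-encoding omitted for brevity):
#   http.get('/?' + params.map { |k, v| "#{k}=#{v}" }.join('&'))

# After: the parameters travel in the body; the request line stays tiny.
request = Net::HTTP::Post.new(uri.path)
request.set_form_data(params)
response = http.request(request)

Since the Query API accepts the same parameters as form-encoded POST data, nothing else should need to change.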
