I was in the office the other day, doing the usual: drinking coffee and unleashing my Carry On-esque humour on those unfortunate enough to be listening, when the conversation turned to file upload strategies and progress reporting solutions in Rails.
For various reasons that other people have covered before, Rails was instantly rejected as a solution for uploading large files. The usual answer to this problem is to use a bank of Merb instances and feed progress back to the client. It was during this conversation that Mr Tanner brought up the Nginx upload progress module.
We’re big fans of Nginx because it’s fast, lightweight and easy on resources, so I immediately looked into it. The module basically counts the number of bytes it has received and lets the client retrieve the progress of the upload via AJAX and JSON. Once the file upload is complete, Nginx pushes the payload to the upstream server. This gets around the problems people have had using Nginx with Merb upload progress. Progress reporting is handled by Nginx instead, which for me allows more flexibility in choosing the upstream server or application. Merb’s great. It’s fast and lightweight, but if you’re using it just for file uploads then it could be a bit of a waste. It would be simple to write a Mongrel handler to deal with the file uploads, or to use the soon-to-be-released Wisteria, a Ruby micro framework by Kirk Haines.
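To make the byte-counting idea concrete, here is a minimal sketch in plain Ruby. It is not how the Nginx module is implemented, and the class name, method names and JSON fields are all hypothetical; it only shows the shape of "count bytes as they arrive, report progress as JSON when the client polls".

```ruby
require "json"

# Hypothetical in-memory tracker. Illustrates the idea only; this is NOT
# the Nginx upload progress module's implementation or API.
class UploadProgressTracker
  def initialize
    @uploads = {}
    @lock = Mutex.new
  end

  # Register an upload under a client-chosen id with its expected size in bytes.
  def start(id, total_bytes)
    @lock.synchronize { @uploads[id] = { received: 0, size: total_bytes } }
  end

  # Count a chunk of bytes as it arrives.
  def receive(id, chunk)
    @lock.synchronize { @uploads.fetch(id)[:received] += chunk.bytesize }
  end

  # JSON report that a client could poll for via AJAX.
  def report(id)
    upload = @lock.synchronize { @uploads[id]&.dup }
    return JSON.generate(state: "unknown") if upload.nil?

    state = upload[:received] >= upload[:size] ? "done" : "uploading"
    JSON.generate(state: state, received: upload[:received], size: upload[:size])
  end
end

tracker = UploadProgressTracker.new
tracker.start("abc123", 10)
tracker.receive("abc123", "hello")
puts tracker.report("abc123") # {"state":"uploading","received":5,"size":10}
```

The point of keeping the counter outside the application is that whatever sits upstream never needs to know about progress at all.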
Wisteria is a micro framework that doesn’t try to do anywhere near as much as Rails or Merb, and because of this it has managed approx. 2000 req/s for a simple hello world application. These figures make it a framework to definitely consider in the future if you’re doing large file uploads or dealing with requests that need high performance but none of the polish that Merb or Rails provide.
Of course there is the problem of validation if you want to verify both the upload payload and any other fields such as descriptions, tags etc. In my architectures I always push off as much as I can to be handled asynchronously. Take a popular video sharing site that has to accept uploaded videos with a description and some tags, then transcode each video into the appropriate format. I would probably validate the incoming data, then push it onto a message queue (without the actual file payload, of course) to be dealt with by another process when resources are available.
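As a rough sketch of that flow: validate the metadata, then enqueue a small message that points at the stored file instead of carrying it. Ruby's built-in Queue stands in for a real message queue here, and the field rules, method names and file path are all assumptions made for illustration.

```ruby
require "thread"

# Stand-in for a real message queue: Ruby's built-in thread-safe Queue.
JOBS = Queue.new

# Hypothetical field rules, for illustration only.
def validation_errors(description:, tags:)
  errors = []
  errors << "description can't be blank" if description.to_s.strip.empty?
  errors << "at least one tag is required" if tags.nil? || tags.empty?
  errors
end

# Validate the metadata, then enqueue a message that references the stored
# file by path rather than carrying the payload itself.
# Returns the list of errors (empty when the job was queued).
def accept_upload(path:, description:, tags:)
  errors = validation_errors(description: description, tags: tags)
  return errors unless errors.empty?

  JOBS << { path: path, description: description, tags: tags }
  []
end

# A separate worker picks jobs up when resources are available.
worker = Thread.new do
  while (job = JOBS.pop)
    break if job == :shutdown
    puts "transcoding #{job[:path]} (#{job[:tags].join(', ')})"
  end
end

accept_upload(path: "/tmp/upload-1.mov", description: "Holiday", tags: ["family"])
JOBS << :shutdown
worker.join
```

In a real deployment the worker would be a separate process reading from a proper queue, but the message shape stays the same: metadata plus a pointer to the file.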
Now the conundrum for me is to do with validation. In the lightweight process that’s dealing with the file upload I probably don’t want to load the entire model layer of the ORM I’m using, such as ActiveRecord or DataMapper. However, I do want to use the validations that are part of the model. Wouldn’t it be great if I could just mix my validations into my ORM model and also share them with a lightweight DAO that I’m just using as a temporary container? This way I wouldn’t be duplicating business logic.
To my knowledge the AR validations are too tightly coupled to AR for me to do this in a lightweight manner. However, I’m sure that I could do this with DataMapper because of the way it was developed using BDD. I’ll experiment more tonight to see if this is indeed feasible.
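The shared-validation idea can be sketched in plain Ruby without any ORM at all. This uses neither ActiveRecord's nor DataMapper's validation API; the module, the rules and both classes are hypothetical, and the Video class is only a stand-in for a full ORM-backed model.

```ruby
# Hypothetical shared rules: any object that exposes #description and #tags
# can include this module.
module VideoValidations
  def errors
    found = []
    found << "description can't be blank" if description.to_s.strip.empty?
    found << "tags can't be empty" if Array(tags).empty?
    found
  end

  def valid?
    errors.empty?
  end
end

# Lightweight temporary container used by the upload handler.
UploadForm = Struct.new(:description, :tags) do
  include VideoValidations
end

# Stand-in for the full ORM-backed model (no real ORM is loaded here).
class Video
  include VideoValidations
  attr_accessor :description, :tags
end

form = UploadForm.new("", ["cats"])
puts form.valid?          # false
puts form.errors.inspect  # ["description can't be blank"]
```

Both classes share one definition of what a valid video looks like, so the upload handler can reject bad input without pulling in the model layer.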
Online journal of Jonathan Conway, a twenty-something technologist, entrepreneur, husband, daddy of two, oh, and lead architect at vzaar. Currently residing in London, UK.

Excellent points. Validation should be a separate, non-coupled module in Rails. You might not want to use AR, or even a database for that matter, but still need user input validation. Looking forward to your findings!
November 19th, 2007 at 01:59 PM
I cannot find much info (or source code) about Wisteria. Do you know any more about it? Have you seen any of its code?
December 11th, 2007 at 12:07 AM
I'm about to start a project that involves uploading files all the time, and I really don't want the file upload to block the user from using the rest of the site. By block I mean having to wait for each 2-3 MB file to upload at 50 KB/sec. The user should be able to go: add file; browse; select file; upload file. Then straight away get on with something else, not have to wait for the browser to send the file. On the client end, I was planning to fire off some sort of AJAX request and show some sort of progress indicator in the bottom right. Files can be added to the queue and just keep being uploaded to where they need to be. These files don't have any extra user inputs requiring validation. The files are simply "belongs_to :people". My question: would using something separate like Nginx be ideal, or would it just create complexity that isn't really required, so I should stay in Merb/Mongrel? Thanks :)
January 15th, 2008 at 12:57 PM