Scaling: Using MogileFS for Storing Uploaded Images

October 31, 2008 · 2 min read

As you might have guessed from several of my previous posts, the team I've been working in has recently been scaling an application. I've learned a bunch of things along the way, and I've got half-written articles about several of them that I'll totally finish one day.

One of the most useful technologies I've started using is [MogileFS](http://www.danga.com/mogilefs/), a distributed BLOB store. In our application we use it to store user-generated assets like uploaded images and syndication feeds. Rather than go into the pros and cons here, I'd like to share some code that's been genuinely useful: a `MogileFilesystemBackend` for [AttachmentFu](http://github.com/technoweenie/attachment_fu/tree/master).

Why do you need a shared filestore for uploads? Once your application cluster scales beyond a single box, uploaded images land on different disks depending on which server handled the request. Without a shared store, there's no guarantee a particular image will be available to a subsequent request that hits a different server.

#### Getting stuck in

I've done some admittedly ugly preparation here and monkey-patched `Kernel` to provide an accessor called `filestore` (via ActiveSupport's `mattr_accessor`). It holds an instance of `MogileFS::MogileFS` from the excellent [MogileFS client](http://seattlerb.rubyforge.org/mogilefs-client/) by the folks at [Seattle RB](http://seattlerb.rubyforge.org/). The patch, which will probably make experienced Rubyists wince, looks like this:

```ruby
module Kernel
  # Oh noes, I'm screwing with Kernel.
  #
  mattr_accessor :filestore
end
```

During Rails initialisation, the filestore is set up using configuration values pulled from a YAML file in `RAILS_ROOT/config/`:

```ruby
Kernel.filestore = MogileFS::MogileFS.new(
  :domain => "APPNAME-#{RAILS_ENV}",
  :hosts  => array_of_hosts_from_yaml_file
)
```

(What I actually do is quite a bit different from this because I've done evil things to the MogileFS client library, which I'll probably share in the future. For now, believe the magic.)

With the setup complete, getting AttachmentFu to work with MogileFS is straightforward:

```ruby
class Image < ActiveRecord::Base
  has_attachment :content_type => :image,
                 :storage      => :mogile_filesystem,
                 :max_size     => 5.megabytes,
                 :thumbnails   => { :canonical => '1024x' },
                 :processor    => "MiniMagick"

  validates_as_attachment
end
```

#### The backend

Without the actual backend code, none of the above does anything. The implementation was heavily influenced by the existing Amazon S3 backend, since the concepts behind S3 and MogileFS are quite similar:

```ruby
module MogileFilesystemBackend
  # The MogileFS key for this attachment, e.g. "image:42:original".
  def full_filename(thumbnail = nil)
    "#{class_prefix}:#{filestore_tag(thumbnail)}"
  end

  # Thumbnails are keyed under their parent's id.
  def filestore_tag(thumbnail = nil)
    "#{parent_id || id}:#{thumbnail || :original}"
  end

  def current_content
    temp_path ? File.read(temp_path) : temp_data
  end

  # The URL path the image is served from (not a MogileFS key).
  def public_filename(thumbnail = nil)
    [
      editorial_object_type.demodulize.tableize,
      editorial_object_id,
      "#{class_prefix}.#{file_extension}#{thumbnail && "?size=#{thumbnail}"}"
    ].join("/")
  end

  def file_extension
    Mime::Type.lookup(content_type).to_sym
  end

  def filestore_paths(thumbnail = nil)
    filestore.get_paths(full_filename(thumbnail))
  end

  def file_data(thumbnail = nil)
    filestore.get_file_data(full_filename(thumbnail))
  end

  protected

  def current_content_location
    temp_path ? :temp_path : :temp_data
  end

  def destroy_file
    filestore.delete full_filename
  end

  def rename_file
    filestore.rename @old_filename, full_filename
  end

  def save_to_storage
    logger.info "Storing #{self.class.name}\##{id} as #{full_filename(thumbnail)} (class: #{replication_policy}) from #{current_content_location == :temp_path ? temp_path : :memory}"
    filestore.store_content full_filename(thumbnail), replication_policy, current_content
  end

  def class_prefix
    self.class.name.demodulize.underscore.downcase
  end

  # The MogileFS replication class shares its name with the key prefix.
  alias_method :replication_policy, :class_prefix
end

Technoweenie::AttachmentFu::Backends::MogileFilesystemBackend = ::MogileFilesystemBackend
```

#### Serving images

Getting images *into* MogileFS is only half the story. You also need to serve them to visitors. Here's a controller that reads from the `filestore` instead of the local filesystem (and if you're storing files in the database, we need to have a talk):

```ruby
class ImageController < ApplicationController
  before_filter :load_image

  def show
    respond_to do |format|
      format.html
      format.any(:png, :jpg, :gif) do
        # params[:size] selects a thumbnail; nil falls back to the original.
        send_data @image.file_data(params[:size]),
                  :type        => @image.content_type,
                  :disposition => 'inline'
      end
    end
  end

  protected

  def load_image
    @image = Image.find(params[:id])
  end
end
```

And there you have it. Images go into MogileFS on upload, get replicated across your storage nodes, and are served back to visitors through a simple controller action. No more worrying about which app server has which file.
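
If you'd like to poke at the key scheme without a MogileFS tracker running, here's a rough sketch in plain Ruby. `FakeFilestore` and `mogile_key` are names I've made up for illustration: the first is an in-memory stand-in that mimics only the handful of client calls the backend uses (`store_content`, `get_file_data`, `rename`, `delete`), and the second mirrors what `full_filename` and `filestore_tag` do. It is a simplification, not the real client, so things like error handling for missing keys are an assumption.

```ruby
# Hypothetical in-memory stand-in for the MogileFS client. It mimics only the
# calls the backend relies on and makes no attempt to match the real library's
# return values or error behaviour.
class FakeFilestore
  def initialize
    @blobs = {}
  end

  # Same argument order the backend uses: key, replication class, content.
  def store_content(key, klass, content)
    @blobs[key] = { :klass => klass, :content => content }
  end

  # Returns nil for an unknown key (an assumption made for this sketch).
  def get_file_data(key)
    blob = @blobs[key]
    blob && blob[:content]
  end

  def rename(from, to)
    @blobs[to] = @blobs.delete(from) if @blobs.key?(from)
  end

  def delete(key)
    @blobs.delete(key)
  end
end

# Mirrors full_filename/filestore_tag from the backend: "prefix:id:variant".
# A thumbnail is keyed under its parent's id, so every size shares one id.
def mogile_key(class_prefix, id, thumbnail = nil, parent_id = nil)
  "#{class_prefix}:#{parent_id || id}:#{thumbnail || :original}"
end

store = FakeFilestore.new
original_key  = mogile_key("image", 42)
canonical_key = mogile_key("image", 43, :canonical, 42)

store.store_content(original_key, "image", "full-size bytes")
store.store_content(canonical_key, "image", "resized bytes")

puts original_key   # image:42:original
puts canonical_key  # image:42:canonical
```

Because the key depends only on the class name, the parent's id and the thumbnail name, any app server can work out where an image lives without asking the server that handled the upload.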

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.