Scaling: Using MogileFS for Storing Uploaded Images

October 31, 2008 · 4 min read

As you might have guessed from several of my previous posts, the team I’ve been working in has recently been scaling an application. I’ve learned a bunch of things along the way, and I’ve got half-written articles about several of them that I’ll totally finish one day.

One of the most useful technologies I’ve started using is MogileFS, a distributed BLOB store. In our application we use it to store user-generated assets like uploaded images and syndication feeds. Rather than go into the pros and cons here, I’d like to share some code that’s been genuinely useful: a MogileFilesystemBackend for AttachmentFu.

Why do you need a shared filestore for uploads? Once your application cluster scales beyond a single box, uploaded images land on different disks depending on which server handled the request. Without a shared store, there’s no guarantee a particular image will be available to a subsequent request that hits a different server.

Getting stuck in

I’ve done some admittedly ugly preparation here and monkey-patched Kernel to provide a module-level accessor called filestore, defined with ActiveSupport’s mattr_accessor. It holds an instance of MogileFS::MogileFS from the excellent MogileFS client by the folks at Seattle RB. The patch, which will probably make experienced Rubyists wince, looks like this:

module Kernel
  # Oh noes, I'm screwing with Kernel.
  # mattr_accessor comes from ActiveSupport, so this needs Rails loaded.
  mattr_accessor :filestore
end
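If patching Kernel makes you uneasy, the same effect can be had with a dedicated module. This is only a sketch in plain Ruby and not what the application described here does; SharedFilestore, client and Uploader are hypothetical names invented for illustration, and a symbol stands in for the real MogileFS client.

```ruby
# Hypothetical alternative to patching Kernel: a small module that holds the
# shared client and exposes it to any class that includes it.
module SharedFilestore
  class << self
    # Set once during initialisation, read everywhere else.
    attr_accessor :client
  end

  # Instance-level reader, so including classes can simply call `filestore`.
  def filestore
    SharedFilestore.client
  end
end

# Any class that needs the store opts in explicitly.
class Uploader
  include SharedFilestore
end

# A symbol stands in for a MogileFS::MogileFS instance in this sketch.
SharedFilestore.client = :stand_in_for_a_mogilefs_client
shared = Uploader.new.filestore
```

The trade-off is that every class wanting filestore has to include the module, whereas the Kernel patch makes it available everywhere for free.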

During Rails initialisation, the filestore is set up using configuration values pulled from a YAML file in RAILS_ROOT/config/:

Kernel.filestore = MogileFS::MogileFS.new(
  :domain => "APPNAME-#{RAILS_ENV}",
  :hosts => array_of_hosts_from_yaml_file
)

(What I actually do is quite a bit different from this because I’ve done evil things to the MogileFS client library, which I’ll probably share in the future. For now, believe the magic.)
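For the curious, here is a rough sketch of how the host list could be pulled out of a YAML file and turned into the options hash above. The file layout, the mogilefs_options helper and the host addresses are all assumptions made up for this example; the real configuration may look nothing like it.

```ruby
require "yaml"

# Hypothetical contents of a file such as config/mogilefs.yml. The file name,
# keys and addresses are assumptions for illustration only.
EXAMPLE_YAML = <<~YAML
  development:
    hosts:
      - "127.0.0.1:7001"
  production:
    hosts:
      - "10.0.0.1:7001"
      - "10.0.0.2:7001"
YAML

# Builds the options hash that would be handed to MogileFS::MogileFS.new.
# Raises if the requested environment has no entry in the YAML.
def mogilefs_options(yaml_text, app_name, env)
  config = YAML.load(yaml_text)
  settings = config.fetch(env) do
    raise ArgumentError, "no MogileFS settings for #{env}"
  end

  {
    :domain => "#{app_name}-#{env}",
    :hosts  => Array(settings["hosts"])
  }
end

options = mogilefs_options(EXAMPLE_YAML, "APPNAME", "production")
```

In a Rails initialiser the YAML text would come from File.read on the config file and env from RAILS_ENV, with the resulting hash passed straight to MogileFS::MogileFS.new.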

With the setup complete, getting AttachmentFu to work with MogileFS is straightforward:

class Image < ActiveRecord::Base
  has_attachment :content_type => :image,
    :storage => :mogile_filesystem,
    :max_size => 5.megabytes,
    :thumbnails => {
      :canonical => '1024x'
    },
    :processor => "MiniMagick"

  validates_as_attachment
end

The backend

Without the actual backend code, none of the above does anything. The implementation was heavily influenced by the existing Amazon S3 backend, since the concepts behind S3 and MogileFS are quite similar:

module MogileFilesystemBackend
  def full_filename(thumbnail = nil)
    "#{class_prefix}:#{filestore_tag(thumbnail)}"
  end

  def filestore_tag(thumbnail = nil)
    "#{parent_id || id}:#{thumbnail || :original}"
  end

  def current_content
    temp_path ? File.read(temp_path) : temp_data
  end

  def public_filename(thumbnail = nil)
    [
      editorial_object_type.demodulize.tableize,
      editorial_object_id,
      "#{class_prefix}.#{file_extension}#{thumbnail && "?size=#{thumbnail}"}"
    ].join("/")
  end

  def file_extension
    Mime::Type.lookup(content_type).to_sym
  end

  def filestore_paths(thumbnail = nil)
    filestore.get_paths(full_filename(thumbnail))
  end

  def file_data(thumbnail = nil)
    filestore.get_file_data(full_filename(thumbnail))
  end

  protected
  def current_content_location
    temp_path ? :temp_path : :temp_data
  end

  def destroy_file
    filestore.delete full_filename
  end

  def rename_file
    filestore.rename @old_filename, full_filename
  end

  def save_to_storage
    logger.info "Storing #{self.class.name}\##{id} as #{full_filename(thumbnail)} (class: #{replication_policy}) from #{current_content_location == :temp_path ? temp_path : :memory}"
    filestore.store_content full_filename(thumbnail), replication_policy, current_content
  end

  def class_prefix
    self.class.name.demodulize.underscore.downcase
  end
  alias_method :replication_policy, :class_prefix
end

Technoweenie::AttachmentFu::Backends::MogileFilesystemBackend = ::MogileFilesystemBackend
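To make the key scheme concrete, here is a stripped-down sketch of the two naming methods running against a stand-in object. FakeImage is hypothetical and exists only for this example; it copies full_filename and filestore_tag from the backend and hard-codes class_prefix, which the real backend derives from the class name.

```ruby
# Stand-in for an AttachmentFu model, carrying only the fields the key
# scheme reads. FakeImage is a hypothetical name for illustration.
FakeImage = Struct.new(:id, :parent_id) do
  # Hard-coded here; the real backend derives this from the class name.
  def class_prefix
    "image"
  end

  # Thumbnails share their parent's id, so every size of one upload is
  # grouped under the same numeric tag.
  def filestore_tag(thumbnail = nil)
    "#{parent_id || id}:#{thumbnail || :original}"
  end

  def full_filename(thumbnail = nil)
    "#{class_prefix}:#{filestore_tag(thumbnail)}"
  end
end

original = FakeImage.new(42, nil) # the uploaded image
thumb    = FakeImage.new(43, 42)  # a thumbnail record pointing at its parent

original_key   = original.full_filename               # "image:42:original"
thumb_key      = thumb.full_filename(:canonical)      # "image:42:canonical"
via_parent_key = original.full_filename("canonical")  # "image:42:canonical"
```

Because the thumbnail record and the parent resolve to the same key, a reader can fetch any size knowing only the parent's id and the thumbnail name.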

Serving images

Getting images into MogileFS is only half the story. You also need to serve them to visitors. Here’s a controller that reads from the filestore instead of the local filesystem (and if you’re storing files in the database, we need to have a talk):

class ImageController < ApplicationController
  before_filter :load_image

  def show
    respond_to do |format|
      format.html
      format.any(:png, :jpg, :gif) do
        # Read the bytes from MogileFS; params[:size] selects a thumbnail.
        send_data @image.file_data(params[:size]),
          :type => @image.content_type,
          :disposition => 'inline'
      end
    end
  end

  protected
  def load_image
    @image = Image.find(params[:id])
  end
end

And there you have it. Images go into MogileFS on upload, get replicated across your storage nodes, and are served back to visitors through a simple controller action. No more worrying about which app server has which file.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.