
A story about building an application

I was recently working on a Ruby on Rails application that had a section for sending messages. This sounds pretty easy, right? I started with a User model and a Message model and some basic associations:

 class User < ActiveRecord::Base
    has_many :messages
 end

 class Message < ActiveRecord::Base
    belongs_to :user
 end  

Fig.1 - initial User and Message models

But when it came time to actually start building the application, I found this simple model code was not enough. The devil is in the details, as they say. There was a lot of functionality I needed to add beyond just a list of messages connected to a User.

1) Filtered views

I needed different views of the messages such as sent messages, drafts, and deleted messages.

How do I determine 'draft' status? Well one way is to fill in a delivered_at date whenever a message is sent. Then a draft is just a Message with no delivered_at date.

So after adding that field (and sender_id and recipient_id columns) to the database, I went to my messages_controller.rb file and added a few methods that looked something like this:

 def index
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and recipient_id = ?', 
       user.id])
 end
 
 def sent_mail
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and sender_id = ?', 
       user.id])
 end
 
 def drafts 
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is NULL and sender_id = ?', 
       user.id])
 end

Fig.2 - initial fragment from messages_controller.rb

2) Pagination

Nobody wants to load a page of 1000 messages at a time, so I needed to be able to break up that list into fixed-size chunks. I used the excellent will_paginate plugin for that purpose. Then my controller methods got a little more verbose:

 
 def index
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and recipient_id = ?', 
       user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

 def sent_mail
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and sender_id = ?',
       user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end
 
 def drafts 
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is NULL and sender_id = ?', 
       user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

Fig.3 - fragment from messages_controller.rb with pagination

3) The ability to flag content (i.e. spam, objectionable content etc...)

What if someone gets spam in the message system - or something objectionable in some other way? Well, I need to filter that stuff out. I added a Flag model and connected that to messages like so:

  class Message < ActiveRecord::Base
    has_many :flags
    belongs_to :user
  end

Fig.4 - Message model with flags added

However, at this point my controller methods are starting to look like this:

 def index
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and 
         flags.flagged_item_id is NULL and 
         recipient_id = ?', user.id], 
       :include => :flags
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

 def sent_mail
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is not NULL and 
         flags.flagged_item_id is NULL and 
         sender_id = ?', user.id], 
       :include => :flags
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end
 
 def drafts 
   @messages = user.messages.find(:all, 
     :conditions => ['delivered_at is NULL and
         flags.flagged_item_id is NULL and 
         sender_id = ?', user.id], 
       :include => :flags
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

Fig.5 - fragment from messages_controller.rb with flags

I'm looking at some ugly code - with a lot of repetition. How do I pare this down?

Begin Pruning

My first thought is that if anything in my application can be flagged, I should be able to do a little meta-programming to create a find method that will give me only un-flagged items. Ideally I could even send in all the rest of the find arguments exactly the same.

There is the named_scope addition to Rails 2.x that does just that - but I also want something I can add to any class as a Mixin. That way I can write code like this:

  Message.unflagged_items.find(:all, 
    :conditions => ['delivered_at is not NULL'])
  SomeOtherThing.unflagged_items.find(:all, :conditions => ...)

Fig.6 - call to imagined method unflagged_items

The method with_scope is a good candidate for sending in some pre-determined find conditions - but leaving it open to add more later. I'm wanting to add the following method to all my classes that need to be flagged:

  def unflagged_items(*args)  
    self.with_scope(:find => {
        :conditions => 'flags.flagged_item_id is NULL', 
        :include => :flags}) do  
      self.find(*args)
    end  
  end

Fig.7 - code for imaginary unflagged_items method

How do I do that? Well, I can turn that code into a Module and add it to any class automatically using a little metaprogramming:

  module Flaggable

    def self.included(base)
      base.class_eval do
        has_many :flags, :as => :flagged_item, :dependent => :destroy
      end
      base.extend(ClassMethods)
    end

    module ClassMethods

      def unflagged_items(*args)  
        self.with_scope(:find => {
            :conditions => 'flags.flagged_item_id is NULL', 
            :include => :flags}) do  
          self.find(*args)
        end  
      end  
    end

  end

Fig.8 - Flaggable module

Any model that contains the line include Flaggable will have that method available. So if I include it in the Message class, I get a method user.messages.unflagged_items, which behaves like an incomplete version of the find function: all the logic needed to limit the list to unflagged items is already filled in. I still have to supply :all or :first and any other :conditions I want, but the function is, in a sense, half-called. Getting half-called functions is a useful thing. In functional programming the closely related ideas are called partial application and currying. I'll come back to that in a moment.
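
To make the "half-called function" idea concrete outside of Rails, here is a minimal plain-Ruby sketch using Proc#curry (available since Ruby 1.9). The select_messages lambda and the hash-based messages are hypothetical stand-ins, not part of the application described here.

```ruby
# Hypothetical stand-in for a finder: keeps the messages whose
# :flagged value matches the first argument and whose delivered
# state matches the second.
select_messages = lambda do |flagged, delivered, messages|
  messages.select do |m|
    m[:flagged] == flagged && (!m[:delivered_at].nil?) == delivered
  end
end

# Fix only the first argument. The result is a lambda that is
# still waiting for the remaining two.
unflagged = select_messages.curry[false]

messages = [
  { :id => 1, :flagged => false, :delivered_at => '2008-01-01' },
  { :id => 2, :flagged => true,  :delivered_at => '2008-01-02' },
  { :id => 3, :flagged => false, :delivered_at => nil }
]

delivered_ids = unflagged[true, messages].map { |m| m[:id] }   # => [1]
draft_ids     = unflagged[false, messages].map { |m| m[:id] }  # => [3]
```

The unflagged lambda plays the same role here as unflagged_items does on the model: the "unflagged" part of the question is answered once, and each caller fills in the rest.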

Anyway, now my controller methods look like this:

 def index
    @messages = user.messages.unflagged_items(:all,
      :conditions => ['delivered_at is not NULL and recipient_id = ?', 
        user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

 def sent_mail
    @messages = user.messages.unflagged_items(:all, 
      :conditions => ['delivered_at is not NULL and sender_id = ?', 
        user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end
 
 def drafts 
   @messages = user.messages.unflagged_items(:all, 
     :conditions => ['delivered_at is NULL and sender_id = ?', 
       user.id]
    ).paginate(:page => (params[:page] == "" ? 1 : params[:page]))
 end

Fig.9 - fragment from new messages_controller.rb

Continue Pruning

It's getting better, but isn't there some way I can pare it down even more? Now I'll go to the User model. Instead of simply using has_many :messages - since has_many supports blocks - I can add some more convenience methods to the User class:

  class User < ActiveRecord::Base

    has_many :received_messages, :foreign_key => 'recipient_id', 
          :class_name => 'Message' do
      def delivered_and_unflagged(page=1)
          unflagged_items(:all, :conditions => 'delivered_at IS NOT NULL'
        ).paginate(:page => page, :per_page => @messages_per_page)
      end
    end

    has_many :sent_messages, :foreign_key => 'sender_id', 
          :class_name => 'Message' do
      def delivered_and_unflagged(page=1)
          unflagged_items(:all, :conditions => 'delivered_at IS NOT NULL'
        ).paginate(:page => page, :per_page => @messages_per_page) 
      end
    end

    has_many :draft_messages, :foreign_key => 'sender_id', 
          :class_name => 'Message', 
          :conditions => 'delivered_at IS NULL' do
      def paginated(page=1)
        paginate(:page => page, :per_page => @messages_per_page)
      end
    end

    def inbox(page=1)
      self.received_messages.delivered_and_unflagged(page)
    end

    def sent_mail(page=1)
      self.sent_messages.delivered_and_unflagged(page)
    end

    def drafts(page=1)
      self.draft_messages.paginated(page)
    end
  end

Fig.10 - more developed User model

I'm doing pretty well with reducing the code in my controller now. The only ugly bit left over is the params[:page]... expression - but I can make that slightly better too by factoring it out. I would like to use params[:page] || 1, but params[:page] can come back as an empty string (for example, when the URL carries page= with no value), and will_paginate interprets an empty string as a request for page 0 and returns an error. So I have to use the longer statement with the ternary operator. Now my controller code looks like this:

 def index
   @messages = user.inbox(figure_page)
 end
 
 def sent_mail
   @messages = user.sent_mail(figure_page)
 end
 
 def drafts 
   @messages = user.drafts(figure_page)
 end

 def figure_page
   params[:page] == "" ? 1 : params[:page]
 end

Fig.11 - pruned fragment from messages_controller.rb
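
The page handling in figure_page could be made a little more defensive by normalizing whatever arrives before handing it to the paginator. The following is a plain-Ruby sketch rather than code from the application; normalize_page is a hypothetical helper name, and it assumes that any missing, blank, non-numeric, or non-positive value should fall back to page 1.

```ruby
# Hypothetical helper: turns a raw request parameter into a usable
# page number, falling back to 1 when the input is nil, blank,
# non-numeric, or less than 1.
def normalize_page(raw)
  page = raw.to_s.strip.to_i   # nil.to_s is "", and "abc".to_i is 0
  page < 1 ? 1 : page
end

normalize_page(nil)    # => 1
normalize_page("")     # => 1
normalize_page("3")    # => 3
normalize_page("abc")  # => 1
```

Whether silently falling back to page 1 is the right behavior for bad input is a design choice; an application might prefer to redirect or return an error instead.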

I'm happy enough with that. I've made different lists of messages for the currently logged on User that automatically paginate and filter out flagged items with just one line of code per method.

4) Next and Previous Message

I'm not done yet though - because the view page of a message needs a next and previous link. So if the user is looking at a draft - next should be the next draft - not the next sent message - and previous should be the previous draft - not the previous sent message. Make sense?

One way I could do this is to have a show_draft method, a show_sent_item method etc... and just call the correct link from the correct listing page (i.e. the list of all drafts page has links to show_draft, the sent items page has links to show_sent_item etc...).

There are 2 problems with this though. 1) That is creating several methods for basically one 'show' action. So they will all be virtually the same code over and over again. 2) I'm using a partial to render the list of messages - so I'd have to send in some way to create a different link based on the type of filter ('drafts', 'sent mail' etc...) but I'd rather just call render :partial => "message", :collection => @messages. I don't want the partial to have to worry about what particular filtered list of messages it happens to be rendering.

I'm sure there are a lot of ways to solve this. What I came up with was to add a 'from' value as a parameter for each link_to :action => 'show' in the partial. That way I could just append params[:action] to every url and by the time the controller gets the request, it knows where the request is coming from. This gives me the information I need to respond differently to the show action depending on that parameter. And leaves that logic out of the view.

In order to get the next and previous messages though, I needed to be able to identify and generate a list of messages based on the value of a string (i.e. value of params[:from]).

The code I wrote at first looked something like this and was in the controller:

  def show
    @message = Message.find(params[:id])
    # need @messages for previous, next
    case params[:from]
    when 'sent_mail'
      @messages = user.sent_mail(figure_page)
    when 'drafts'
      @messages = user.drafts(figure_page)
    #...
    end
  end

  def bulk_action
    # ... do bulk action

    # need @messages for previous, next
    case params[:from]
    when 'sent_mail'
      @messages = user.sent_mail(figure_page)
    when 'drafts'
      @messages = user.drafts(figure_page)
    #...
    end
  end

Fig.12 - fragment of messages_controller.rb with new code

So I've lost some of my simplicity, I'm repeating myself again and my code is in need of pruning.

What I need is a function that returns a function waiting to receive arguments. This is similar to the with_scope method I mentioned earlier, and to the idea of function currying. I need a function that's partially filled out - but not called yet - waiting for some parameters. This is a good place to use the fact that a Method is just another object in Ruby - and create a method to return whichever User method I want.

A method that returns a method

  def get_messages_function(param)
    # special case of 'index' action
    if param == 'index'
       self.method('inbox')
    else
       self.method(param)
    end
  end

Fig.13 - fragment from User model

This returns a method as an object waiting for arguments. So I can put that code in my User class and call it like this in my controller:

 def index
  @messages = user.get_messages_function(params[:action]).call(figure_page)
 end

 def sent_mail
  @messages = user.get_messages_function(params[:action]).call(figure_page)
 end
 
 def drafts 
  @messages = user.get_messages_function(params[:action]).call(figure_page)
 end

 def show
  @message = Message.find(params[:id])
  @messages = user.get_messages_function(params[:from]).call(figure_page)
  # ...
 end

Fig.14 - fragment from new messages_controller.rb
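
Method objects are plain Ruby, so the idea can be tried without Rails. The sketch below uses a hypothetical Mailbox class in place of the User model. It also adds one thing the fragment in Fig.13 does not have: an explicit table of allowed names, on the assumption that the string comes from a request parameter and should not be able to name arbitrary methods on the object.

```ruby
# Hypothetical stand-in for the User model.
class Mailbox
  # Only these names may be looked up; 'index' maps to the inbox.
  MESSAGE_LISTS = {
    'index'     => :inbox,
    'inbox'     => :inbox,
    'sent_mail' => :sent_mail,
    'drafts'    => :drafts
  }

  def inbox(page = 1)
    "inbox page #{page}"
  end

  def sent_mail(page = 1)
    "sent mail page #{page}"
  end

  def drafts(page = 1)
    "drafts page #{page}"
  end

  # Returns a Method object bound to this instance, still waiting
  # for its page argument. Unknown names raise ArgumentError.
  def get_messages_function(name)
    method_name = MESSAGE_LISTS[name.to_s]
    raise ArgumentError, "unknown message list: #{name}" unless method_name
    method(method_name)
  end
end

mailbox = Mailbox.new
lister  = mailbox.get_messages_function('drafts')
lister.call(2)                                  # => "drafts page 2"
mailbox.get_messages_function('index').call(1)  # => "inbox page 1"
```

The lookup table costs a few extra lines, but it keeps the set of callable methods visible in one place.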

One last trick

I'm almost done. But I can go one step further in minimizing the code by taking advantage of the fact that a method object can be converted to a block by putting an & in front of it. In the controller, since all the returned methods take that same figure_page parameter, I can factor that out into a method that accepts a block and do something like this:

  def index
    @messages = find_messages(&user.get_messages_function(:inbox))
  end

  def sent_mail
    @messages = find_messages(&user.get_messages_function(:sent_mail))
  end

  def show
    @message = Message.find(params[:id])
    @messages = find_messages(&user.get_messages_function(params[:from])) 
    # ...
  end

  def figure_page
    params[:page] == "" ? 1 : params[:page]
  end

  private
  def find_messages(&func)
    yield(figure_page)
  end

Fig.15 - fragment from another revision to messages_controller.rb

It's odd looking, I admit. I've lost a little readability for the sake of density. But I've left myself very little code in the controller and nothing specific about controllers in the model. That much I like.
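
The conversion that the & performs can be seen in isolation. In this plain-Ruby sketch, Lists, with_page, and the fixed page number are hypothetical; the point is only that a Method object passed with & arrives as the block that yield invokes.

```ruby
# Hypothetical object with two list methods.
class Lists
  def inbox(page)
    "inbox page #{page}"
  end

  def drafts(page)
    "drafts page #{page}"
  end
end

# Stand-in for find_messages: supplies the page number to whatever
# block it is given. The page is hard-coded for illustration.
def with_page
  yield(3)
end

lists = Lists.new

# & calls to_proc on the Method object and passes the result as
# the block.
from_method = with_page(&lists.method(:drafts))

# The equivalent call written with an explicit block.
from_block = with_page { |page| lists.drafts(page) }
```

Both calls produce the same string, "drafts page 3"; the first simply avoids spelling out the block.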

Conclusion

So if you are ever writing a Ruby on Rails application with messages that need to be filtered and paginated, and that includes a detail view with previous and next links, you might be able to glean some code from this article to help get started. Also, today's lesson is that it's sometimes handy to pass around functions as objects.

NOTE: I've included a zip file of various items related to this article. It includes some Ruby code as a demonstration which requires a sqlite3 installation. Also, I used Python to generate this document with all the color-coded sections. I've included that in case it is of interest to anyone. It requires the Mako and Pygments packages.




A while ago, I was given the task of converting a working Java web application to Ruby on Rails. The word on the street was that Ruby on Rails was great for a 'green' project, i.e. starting from scratch - but not so good for a legacy application.

However, since the database I was using already had an auto-increment primary-key column 'id' for every table, I didn't run into that many problems. There was, however, one bit of functionality I puzzled over for a while.

The application was a project tracking application: projects with attached assignments, and each assignment was associated with a document of some sort. The document types weren't really related to each other all that much, except for a few common attributes and the fact that they could be assigned. To protect the names of the innocent, I will pretend the types of items were articles, books and blog posts. If you consider those, they don't have a lot in common. No article or blog post needs an isbn, and no book needs a date posted field. They are all pretty distinct conceptually; the commonality is that each is something an editor might need to work on - i.e. they are all coherent bodies of text.

I looked at the existing Java code. The application had classes Article, Book and Post all derived from Document. But there was no 'documents' table - only articles, books and posts.

Technically this is Multi-table inheritance since each type of object gets its own table. This is not something supported by Rails. Rails supports the idea of Single Table Inheritance (STI): put every record in one table. This often works out very well because it is efficient and simple.

However, I didn't want to force all these records into one table for a variety of reasons. One such reason is I wanted to make sure no Book was created without an isbn - but if I put them all in one table, the isbn field would have to allow for NULL (for articles for instance). Some would argue this is an application level concern, but I think if you can guarantee data validity at the database level - and bolster that reliability on the application side - you are much better off than merely relying on the application.

Another reason was because I've found, when rewriting an application in Rails, the longer I can push forward keeping the original data structure intact the better. That way I can run both applications at the same time as I figure out the functionality of the first. In the ideal world, the functionality of the original app would be fully detailed in a spec somewhere. This project, unfortunately, was deeply rooted in the actual world.

The currently working Java application was using an ORM, just like Rails does. The ORM for this project was something from the Apache group called the ObjectRelationalBridge or OJB for short. I looked into the code to see how it was achieving this multi-table inheritance. Basically it was using a simple table to keep track of what the next unique id should be for a given 'type'. Some databases support sequences that do this, but not MySQL 3.

The name of the table was sequences and it looked roughly like this:

sequences

  id   type_name   value
  1    document    5825

Whenever it was time to create a new Article, Book or Post, the ORM went to this table, looked up the 'value' in the row with 'type_name' = 'document', incremented it, and used the new value as the id in the table corresponding to the object. So the next Article would get an id of 5826, which would be stored in the id column of articles, the next Book would get an id of 5827 in the id column of books, and so on. Pretty simple.
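
The allocation scheme is easy to model in memory. The sketch below is plain Ruby with hypothetical names (SequenceTable, next_id); it is not OJB's implementation. A Mutex stands in for whatever locking the real table would need so that two writers cannot read the same value.

```ruby
# In-memory model of the sequences table: one counter per
# type_name, shared by every kind of record that uses that name.
class SequenceTable
  def initialize(values)
    @values = values.dup
    @lock = Mutex.new
  end

  # Increments the counter for type_name and returns the new value.
  # Raises KeyError if there is no row for that type_name.
  def next_id(type_name)
    @lock.synchronize do
      @values[type_name] = @values.fetch(type_name) + 1
    end
  end
end

sequences  = SequenceTable.new('document' => 5825)
article_id = sequences.next_id('document')  # => 5826
book_id    = sequences.next_id('document')  # => 5827
```

Because articles and books draw from the same counter, an id is never reused across the two tables, which is what lets an assignment refer to a document by id and type alone.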

The other tables were structured something along these lines (this is a simplification):

projects

  • id
  • subject
  • start_date
  • end_date
  • status

assignments

  • id
  • document_id
  • document_type
  • start_date
  • end_date
  • status
  • project_id

articles

  • id
  • title
  • author_id
  • published

books

  • id
  • title
  • author_id
  • isbn

posts

  • id
  • title
  • author_id
  • slug

So how do I go about duplicating the functionality in Rails? Ideally, I'd like not to have to normalize all data into one big table, and ideally, I'd like to do something that doesn't mess with the internals of Rails, simply using the conventions that are already there. Over the years, I've found meddling with the internals of a framework can come back to bite you. Call me paranoid.

First Try

So, working in the most basic Rails idioms, I start off with something like this:

class Project  < ActiveRecord::Base
  has_many :assignments
end

class Sequence  < ActiveRecord::Base
end

class Article  < ActiveRecord::Base
  before_create :generate_key

  def generate_key
    type_name = "document"
    key = Sequence.find(:first, :conditions => [ "type_name = ?", type_name])
    new_id = key.value + 1
    key.value = new_id
    key.save
    self.id = new_id
  end
end

class Book  < ActiveRecord::Base
  before_create :generate_key

  def generate_key
    type_name = "document"
    key = Sequence.find(:first, :conditions => [ "type_name = ?", type_name])
    new_id = key.value + 1
    key.value = new_id
    key.save
    self.id = new_id
  end
end

class Post  < ActiveRecord::Base
  before_create :generate_key

  def generate_key
    type_name = "document"
    key = Sequence.find(:first, :conditions => [ "type_name = ?", type_name])
    new_id = key.value + 1
    key.value = new_id
    key.save
    self.id = new_id
  end
end

That will take care of those Document objects getting their correct id. Then I just have to make the Assignment reference them. This will work fine as a belongs_to since there can only be one match per row.

class Assignment  < ActiveRecord::Base
  belongs_to :project

  belongs_to :article, :foreign_key => 'document_id'
  belongs_to :book, :foreign_key => 'document_id'
  belongs_to :post, :foreign_key => 'document_id'
end

That's all fine and good, but I wanted to clean up my code for 3 reasons:

  1. I've had to copy and paste the generate_key and before_create :generate_key code 3 times already. Not very DRY.
  2. There is nothing in the code itself to indicate that there is a relationship between a Book, an Article and a Post.
  3. I have to type in belongs_to for each type of object. I figure there must be a way to make this more succinct.

Second Try

First order of business: fix all those redundant generate_key methods. Well, how do you go about adding both a class method call (before_create) and an instance method to a class? Answer: a Module.

The Rails library likes to use the base.extend(SomeModule) convention by defining ClassMethods and InstanceMethods submodules. This works great and is a good convention to follow. If there are not many instance methods, though, I've found calling module_eval within ClassMethods will work just as well (see below).

module GeneratedKey

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods

    def has_generated_key(object_name)
      object_name = object_name

      # add the before_create hook and make sure
      # we can access the object_name in instance
      # methods
      class_eval <<-CODE
           before_create :generate_key

           cattr_accessor :object_name
           @@object_name = object_name
      CODE

      # add the generate_key instance method
      # NOTE: i've skimmed over worrying about Symbol
      # vs. String for clarity
      module_eval <<-CODE
        def generate_key
          type_name = object_name.to_s
          key = Sequence.find(:first, :conditions => [ "type_name = ?", type_name])
          new_id = key.value + 1
          key.value = new_id
          key.save
          self.id = new_id
        end
      CODE

    end
  end

end

This means when I put has_generated_key :some_value in the body of a class definition, the class will call the generate_key method before it runs create, generating the id value from the sequences table. Now I can set up my classes using inheritance. The only odd thing is that I have to set the table name on each subclass (otherwise it will always look for a 'documents' table).

class Document < ActiveRecord::Base
  include GeneratedKey
  has_generated_key :document
end

class Article < Document
  set_table_name "articles"
end

class Book < Document
  set_table_name "books"
end

class Post < Document
  set_table_name "posts"
end
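
The class-level macro pattern itself does not depend on Rails. Here is a stripped-down sketch in plain Ruby, under some assumptions: a hash constant (FAKE_SEQUENCES) replaces the sequences table, a hypothetical save method plays the role of the before_create hook, and define_method is used instead of evaluating strings. All of the names ending in Sketch are made up for illustration.

```ruby
# In-memory stand-in for the sequences table (hypothetical).
FAKE_SEQUENCES = { 'document' => 5825 }

module GeneratedKeySketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Class-level macro: remembers the sequence name and defines
    # the instance method that draws the next id from it.
    def has_generated_key(sequence_name)
      name = sequence_name.to_s
      define_method(:generate_key) do
        FAKE_SEQUENCES[name] += 1
        self.id = FAKE_SEQUENCES[name]
      end
    end
  end
end

class SketchDocument
  include GeneratedKeySketch
  has_generated_key :document

  attr_accessor :id

  # Stands in for the create path, which would run the
  # before_create callback in ActiveRecord.
  def save
    generate_key if id.nil?
    self
  end
end

class SketchArticle < SketchDocument; end
class SketchBook < SketchDocument; end

article = SketchArticle.new.save
book    = SketchBook.new.save
```

The subclasses inherit generate_key from SketchDocument, so the article and the book draw consecutive ids (5826 and 5827) from the same counter.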

The only thing left is to clean up that belongs_to :book, belongs_to :article etc. stuff in the Assignment class. Once again, I can just factor it out into a module.

module GenericAssociations

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods

    def belongs_to_type(object_name, implementations=[])

      object_name = object_name

      # take the array of implementations and do belongs_to for
      # each one
      implementations.each do |implementation|
        code = <<-CODE
          belongs_to  :#{implementation},
                      :foreign_key => '#{object_name.to_s}_id'
        CODE

        class_eval code
      end

      # build up an if for each type of object i.e.
      # if article
      #   return self.article
      # end
      # if book
      #   return book
      # end
      # etc...

      if_loop = ""
      implementations.each do |implementation|
        if_loop += <<-CODE
            if self.#{object_name.to_s}_type == "#{implementation}"
               return self.#{implementation.to_s}
            end
        CODE
      end

      # add the 'document' method
      code = <<-CODE
         def #{object_name.to_s}
            #{if_loop}
         end
      CODE


      module_eval(code)

    end

  end

end

This creates a method belongs_to_type that is called during class definition, taking the type (as a Symbol) and an Array of the allowable subclasses as parameters. It also adds a method whose name is derived from the type (in this case document) that checks the *_type field value against each allowed kind and returns the matching associated object. Now I can just write this code for my Assignment class:

class Assignment < ActiveRecord::Base
  include GenericAssociations

  belongs_to :project, :foreign_key => 'project_id'
  belongs_to_type :document, [:book, :article, :post]
end
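
The dispatch that belongs_to_type generates can also be sketched without string evaluation. The following plain-Ruby version is an illustration only: it uses define_method, a hypothetical AssignmentSketch class built on plain attributes rather than ActiveRecord associations, and public_send to pick the right attribute from the *_type value.

```ruby
module GenericAssociationsSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def belongs_to_type(object_name, implementations = [])
      allowed = implementations.map { |i| i.to_s }

      # One plain attribute per implementation stands in for
      # belongs_to, plus the *_type attribute itself.
      attr_accessor(*implementations)
      attr_accessor "#{object_name}_type"

      # e.g. #document returns #book when document_type is "book",
      # and nil when the type is not one of the allowed kinds.
      define_method(object_name) do
        type = public_send("#{object_name}_type").to_s
        allowed.include?(type) ? public_send(type) : nil
      end
    end
  end
end

class AssignmentSketch
  include GenericAssociationsSketch
  belongs_to_type :document, [:book, :article, :post]
end

assignment = AssignmentSketch.new
assignment.book = 'a Book record'
assignment.article = 'an Article record'
assignment.document_type = 'book'
assignment.document  # => "a Book record"
```

Checking the type against the allowed list before calling public_send means an unexpected document_type value yields nil rather than invoking some unrelated method.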

So the only major surgery I might have had to do on the original database design was to add a document_type field to the assignments table. But that's it and, in my case, there was already a field by that name anyway because it was still necessary for the UI. Now I can write code like this to get the Document of an Assignment (and because of duck-typing - any field I need I can just reference when I need it):

project = Project.find(1)

project.assignments.each do |assignment|
  print assignment.document
end

I can still retrieve only books, or only articles:

articles = Article.find(:all)
articles.each do |article|
  print article
end

books = Book.find(:all)
books.each do |book|
  print book
end

And I can still create an Article, Book or Post the same way I would any other object:

article = Article.new
article.save

Download files here: multi_table_article.zip

Note: The download includes a Rakefile in which you need to set up the db, username and password that matches your own. Then run rake. The default task runs some tests using a Ruby version of doctest, something from the Python world I'd like to see used in the Ruby world. The basic idea of doctest is that you can copy the results of an interactive interpreter session (in Ruby's case IRB) into comments in your Ruby code and rerun those transcripts as a form of regression testing. For example:

=begin
#doctest Check that 1 + 1 = 2
>> 1 + 1
=> 2
>> 2 + 3
=> 5
=end

I included a slightly modified version so that I could run the test from Rake. For a more official version see http://code.google.com/p/ruby-roger-useful-functions/wiki/DocTest
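
To give a feel for how such a tool can work, here is a toy transcript checker in plain Ruby. It is a sketch of the general idea only, not the code from the download or from the linked project; run_transcript is a hypothetical name, and it assumes every '>>' line is a single-line expression followed by exactly one '=>' line.

```ruby
# Evaluates each ">> expression" line and compares the inspected
# result with the following "=> expected" line. Returns a list of
# [expression, expected, actual] triples for every mismatch.
# Uses eval, so it should only be fed trusted transcripts.
def run_transcript(transcript)
  failures = []
  expression = nil

  transcript.each_line do |line|
    line = line.strip
    if line.start_with?('>> ')
      expression = line[3..-1]
    elsif line.start_with?('=> ') && expression
      expected = line[3..-1]
      actual = eval(expression).inspect
      failures << [expression, expected, actual] unless actual == expected
      expression = nil
    end
  end

  failures
end

passing = run_transcript(<<-TRANSCRIPT)
  >> 1 + 1
  => 2
  >> 2 + 3
  => 5
TRANSCRIPT

failing = run_transcript(">> 1 + 1\n=> 3\n")
```

Here passing is an empty list, while failing holds one entry recording the expression, the expected text, and what the expression actually produced.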



