Perspectives

Controls

In my previous article we took a Behavior Driven Development approach to testing our data layer, in which our models were tested using RSpec. In this article I will showcase how RSpec can be used for controller testing. I will not go into detail on basic RSpec syntax such as should and it, so if you are new to RSpec, please read my previous article, TDD, BDD and Using RSpec, which goes over the basics to get you started.

Before we dive into controller testing, let's quickly create our app that will help birdkeepers find information on birds. Run the following commands:

rails birdkeeper -d mysql
cd birdkeeper
script/generate scaffold Bird title:string species_id:integer notes:text

Create birdkeeper_development and birdkeeper_test databases, add your database credentials to config/database.yml and migrate:

rake db:migrate
rake db:migrate RAILS_ENV=test

Then install the RSpec gem (if it is not already installed), add the RSpec plugins (through git) and generate the spec directories:

sudo gem install rspec
script/plugin install git://github.com/dchelimsky/rspec.git
script/plugin install git://github.com/dchelimsky/rspec-rails.git
script/generate rspec

Open up app/controllers/birds_controller.rb. It should contain the 7 CRUD actions created by the Rails scaffold generator. If there are no actions, your version of Rails may need to be updated. Create birds_controller_spec.rb inside of spec/controllers and add the following code:

require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')

describe BirdsController do

end

The first line loads the spec_helper file, which contains common code that can be shared between specs. Next we have a describe block, which helps keep our tests organized. This one will hold all of our tests relating to the BirdsController, and it will also contain inner describe blocks for further organization. Let's test the update method, since in addition to performing a find call like most CRUD actions, it also updates the object. The update method we will be using is as follows:

# PUT /birds/1
# PUT /birds/1.xml
def update
  @bird = Bird.find(params[:id])

  respond_to do |format|
    if @bird.update_attributes(params[:bird])
      flash[:notice] = 'Bird was successfully updated.'
      format.html { redirect_to(@bird) }
      format.xml  { head :ok }
    else
      format.html { render :action => "edit" }
      format.xml  { render :xml => @bird.errors, :status => :unprocessable_entity }
    end
  end
end

Go back to the BirdsController spec and add the following inside of the BirdsController describe block:

# UPDATE
describe "PUT birds/:id" do
    describe "with valid params" do
    
    end
    
    describe "with invalid params" do
    
    end
end

Above we have added 3 describe blocks. One wrapper describe block will contain all tests relating to the update method. Inside there are two describe blocks, one with tests if valid params are given and the other if invalid params are given. Let's start with valid parameters. Before writing the actual tests, we need to set expectations. This is done through mocking and stubbing within a before block. Add the following inside the "with valid params" describe block:

before(:each) do
    @bird = mock_model(Bird)
    Bird.stub!(:find).with("1").and_return(@bird)
end

The before block will run before each test. This will DRY up our tests since we won't have to rewrite the same mocks and stubs for each test. Mocks and stubs allow us to test the controller functionality without relying on ActiveRecord. With mock_model we are imitating a Bird object. Stubs are used to fake method calls, so we don't need to know the details of the actual method. We just know that the Bird class will receive a find call with an argument of "1" and that it should return a @bird object, which will be our mock. Later in the same action, the @bird mock will receive an update_attributes call, so we also need to stub this call out. We can stub it out as follows:

@bird.stub!(:update_attributes).and_return(true)

But there is a DRYer way to accomplish this. mock_model accepts an optional hash of method names and their return values. We can change our @bird mock object to:

@bird = mock_model(Bird, :update_attributes => true)

With the before block set up we can start writing tests. Each test is contained in an it block, which takes a string argument describing what is being tested. Let's first test the find call, which is the first thing to happen after update is called.

it "should find bird and return object" do
    Bird.should_receive(:find).with("1").and_return(@bird)
    put :update, :id => "1", :bird => {}
end

Here we are testing if the Bird class received a find call, with the should_receive syntax. The rest is very similar to the stub method since we are checking if it received "1" as an argument and returned a @bird object.
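If the difference between a stub and an expectation still feels fuzzy, here is a rough plain-Ruby sketch of the idea. TinyDouble and its methods are made up for illustration and are not part of RSpec, whose real implementation does far more (argument matching, call counts, automatic verification at the end of each test).

```ruby
# A tiny hand-rolled double. Hypothetical: not RSpec's API.
class TinyDouble
  def initialize
    @stubs = {}
    @expected = []
    @received = []
  end

  # Like stub!: a canned return value with no verification.
  def stub(name, value)
    @stubs[name] = value
    self
  end

  # Like should_receive: a canned return value plus a record of
  # the fact that this message is required.
  def expect(name, value)
    @expected << name
    stub(name, value)
  end

  # Answer any stubbed message and remember that it was received.
  def method_missing(name, *args)
    return super unless @stubs.key?(name)
    @received << name
    @stubs[name]
  end

  def respond_to_missing?(name, include_private = false)
    @stubs.key?(name) || super
  end

  # Messages that were expected but never received.
  def unmet_expectations
    @expected - @received
  end
end

bird = TinyDouble.new
bird.stub(:title, "Robin")             # fine whether or not it is ever called
bird.expect(:update_attributes, true)  # must be called for the test to pass

bird.update_attributes({})
unmet = bird.unmet_expectations
```

A stubbed message that is never sent goes unnoticed, while an expected message that is never sent shows up in unmet_expectations. That is roughly why the before block uses stubs and each individual test uses should_receive for the one call it cares about.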

After a Bird object is found, its attributes are updated. Testing this call is very similar to the find call:

it "should update the bird object's attributes" do
    @bird.should_receive(:update_attributes).and_return(true)
    put :update, :id => "1", :bird => {}
end

Next we make sure a flash notice is set:

it "should have a flash notice" do
    put :update, :id => "1", :bird => {}
    flash[:notice].should_not be_blank
end

If the controller can have one of many flash notices, we can also be more specific:

it "should have a successful flash notice" do
    put :update, :id => "1", :bird => {}
    flash[:notice].should eql 'Bird was successfully updated.'
end

After the flash notice is set, the user should get redirected to the bird's show page. We can test the redirect by accessing the response object like so:

it "should redirect to the bird's show page" do
    put :update, :id => "1", :bird => {}
    response.should redirect_to(bird_url(@bird))
end

As for testing if there was invalid data, we would do this:

before(:each) do
    @bird = mock_model(Bird, :update_attributes => false)
    Bird.stub!(:find).with("1").and_return(@bird)
end

it "should find bird and return object" do
    Bird.should_receive(:find).with("1").and_return(@bird)
    put :update, :id => "1", :bird => {}
end

it "should update the bird object's attributes" do
    @bird.should_receive(:update_attributes).and_return(false)
    put :update, :id => "1", :bird => {}
end

it "should render the edit form" do
    put :update, :id => "1", :bird => {}
    response.should render_template('edit')
end

Most of this is similar to the valid data version, but there are a few differences. We are stubbing the object's update_attributes call to return false, which will cause the conditional to take the else route. Since there are errors, the action needs to render the edit page. We test this with response.should render_template('edit').

This is how the complete BirdsController spec looks:

require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')

describe BirdsController do

  # UPDATE
  describe "PUT birds/:id" do

    describe "with valid params" do
      before(:each) do
        @bird = mock_model(Bird, :update_attributes => true)
        Bird.stub!(:find).with("1").and_return(@bird)
      end
      
      it "should find bird and return object" do
        Bird.should_receive(:find).with("1").and_return(@bird)
        put :update, :id => "1", :bird => {}
      end
      
      it "should update the bird object's attributes" do
        @bird.should_receive(:update_attributes).and_return(true)
        put :update, :id => "1", :bird => {}
      end
      
      it "should have a flash notice" do
        put :update, :id => "1", :bird => {}
        flash[:notice].should_not be_blank
      end
      
      it "should redirect to the bird's show page" do
        put :update, :id => "1", :bird => {}
        response.should redirect_to(bird_url(@bird))
      end
    end

    describe "with invalid params" do
      before(:each) do
        @bird = mock_model(Bird, :update_attributes => false)
        Bird.stub!(:find).with("1").and_return(@bird)
      end

      it "should find bird and return object" do
        Bird.should_receive(:find).with("1").and_return(@bird)
        put :update, :id => "1", :bird => {}
      end

      it "should update the bird object's attributes" do
        @bird.should_receive(:update_attributes).and_return(false)
        put :update, :id => "1", :bird => {}
      end

      it "should render the edit form" do
        put :update, :id => "1", :bird => {}
        response.should render_template('edit')
      end
    end
    
  end
end

Run this spec from the root of your application with the following command. All specs should pass:

ruby spec/controllers/birds_controller_spec.rb

The information learned here can be applied to writing tests for the other 6 CRUD actions. Since these actions were generated through the Rails scaffold generator, they should work and tests may not seem necessary. On the other hand, changes in functionality may cause new bugs to pop up, so it is recommended to write tests and cover as much as possible. It may become tedious to write tests for basic CRUD actions for each generated controller. Luckily, RSpec has a scaffold generator that will generate the same files complete with RSpec tests. The RSpec version of the scaffold can be used as follows:

script/generate rspec_scaffold Bird title:string species_id:integer notes:text

As a bonus, let's say we created a Species scaffold. A bird will belong to a Species, and we have the Birds controller nested within the Species controller. On the Species show page, we would run the following find methods to get a list of birds that belong to that Species. Just how should the following be mocked and stubbed?

@species = Species.find("1")
@birds_in_species = @species.birds.find(:all)

Take a few minutes to think about it. It is a bit more complicated, but like any complicated matter, it can be simplified by breaking it down. I start off by mocking all objects involved. @species and @birds_in_species are a given, but we also can't forget the birds that are going to be returned from @species.birds. That needs to be mocked as well.

@birds_in_species = mock_model(Bird)
@species = mock_model(Species)
@birds = mock_model(Bird)

Next we stub out the method calls. There are three in total, since @species.birds.find counts as two.

Species.stub!(:find).with("1").and_return(@species)
@species.stub!(:birds).and_return(@birds)
@birds.stub!(:find).and_return(@birds_in_species)

Then in our tests, we would do the following:

Species.should_receive(:find).with("1").and_return(@species)
@species.birds.should_receive(:find).and_return(@birds_in_species)

I hope you have found this article on testing controllers with RSpec helpful. If you have checked out the specs generated by the rspec_scaffold generator, you will have noticed that there are fewer tests for the update method. I prefer to have many smaller tests, each testing a small portion of the controller. That way, when a single line is changed, the error from the test will be more helpful since it is more specific. I learned this approach from Mike Mangino, though ultimately how your specs are organized is a matter of personal preference.



Birds

Migrating an existing database into a new application can be a daunting task, especially if it's coming from another development team or platform. The database may contain multitudes of data, ensuring a lengthy migration - one that can't be repeated over and over if deadlines loom. It may have poor data integrity, or none at all. Your new, shiny Rails app may have a dozen validations on each table, but those same validations might exclude half the contents of the legacy database. And let's not even get into broken associations.

Fortunately, ActiveRecord makes it incredibly easy to map a second set of database models, and migrating data from one ActiveRecord source to another is faster than you might think - as long as you follow some best practices, of course. Even if you have an extremely large data set, don't assume that raw SQL is the only way to go. As with Rails usage in general, the saved development time (not to mention accurate validations and data conversions) will more than make up for any speed loss.

Migration Schedule

Throughout your migration design, keep in mind that you'll more than likely be running multiple migrations. You will need to test individual model migrations many times (especially if you have data integrity issues), and the final migration may span at least two executions if you're attempting a seamless transition from old to new. If people are still adding new data while you're running the final migration, it probably won't be "final" after all - you'll need to capture those last additions once you've switched over, and that may be a lot of data if your migrations take a long time to process.

Map Legacy Tables With ActiveRecord

To set up your legacy tables in ActiveRecord, you'll want to create a legacy base class to DRY up your database connection and any other common migration logic. It's a good idea to throw all of this into a directory outside of app/models, too, since it really doesn't have to do with your application logic (it can go anywhere, since you'll be requiring it directly from your migration task).

In your legacy base class, add an establish_connection call with your legacy database specifics, and you'll be ready to roll:

class LegacyBase < ActiveRecord::Base
  establish_connection(
    :adapter => "mysql",
    :host => "localhost",
    :username => "username",
    :password => "password",
    :database => "app_legacy"
  )
end

Once you have your base class, just create a new class for each legacy table that you need to migrate, preferably with "Legacy" or something similar prepended to the class name (it will help avoid class name collisions and help keep things organized). You'll need to use the set_table_name method if your class names are different than your table names:

class LegacyUser < LegacyBase
  set_table_name "users"
end

Migrate Self

The most important part of the migration, of course, is the actual code that turns old into new. With Ruby at your disposal, it's easy to contain integrity issues and transform values as you need. Take some time to examine your validations and even pore through a bit of the old data to see what kind of issues you'll run into (and prepare for the worst!). In all likelihood, you'll have to do a lot of string processing and nil checking in order to get things transferred correctly. For example, if you have uniqueness validations, you might need to append random numbers to any values marked as non-unique.

It's best to put the actual migration code for each model in its corresponding legacy model class, if possible:

class LegacyUser < LegacyBase
  set_table_name "users"

  def migrate_me!
    user = User.create(
      :name => self.name,   # Some fields can be directly ported
      :state => (self.state || '').downcase,   # Some fields may need string processing, but watch out for nils! A bare self.state.downcase would blow up on a nil state
      :permalink => self.permalink || "user#{self.id}"   # Your app probably validates_presence_of, but don't assume that the old app did!
    )
    # Compensate for uniqueness issues - it's better to have a generic row than no row at all (your associations think so, at least)
    if user.errors.on(:permalink)
      user.permalink = "user#{self.id}"
      user.save
    end
  end
end
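The cleanup rules above are easier to test when they are pulled out of the model into small plain-Ruby helpers. The following is only a sketch: the LegacyCleanup module and its method names are made up for illustration, and it appends the legacy ID to duplicate values rather than a random number so that repeated runs give the same result.

```ruby
require 'set'

# Hypothetical helpers. None of these names come from Rails.
module LegacyCleanup
  # Downcase a value that may be nil in the legacy data.
  def self.safe_downcase(value)
    (value || '').downcase
  end

  # Fall back to a generated permalink when the legacy one is nil or blank.
  def self.permalink_for(legacy_permalink, legacy_id)
    candidate = legacy_permalink.to_s.strip
    candidate.empty? ? "user#{legacy_id}" : candidate
  end

  # Append the legacy id when a value has already been taken,
  # and remember the value that was finally used.
  def self.make_unique(value, legacy_id, taken)
    unique = taken.include?(value) ? "#{value}-#{legacy_id}" : value
    taken << unique
    unique
  end
end

taken  = Set.new
first  = LegacyCleanup.make_unique(LegacyCleanup.permalink_for("zaphod", 1), 1, taken)
second = LegacyCleanup.make_unique(LegacyCleanup.permalink_for("zaphod", 2), 2, taken)
third  = LegacyCleanup.make_unique(LegacyCleanup.permalink_for(nil, 3), 3, taken)
```

Keeping a set of taken values in memory is an assumption that only holds for modest table sizes. For larger tables you would probably lean on the uniqueness validation and a database index instead.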

Also, check out the validates_existence_of plugin. It will ensure you know about broken associations, which you can expect a lot of if your predecessors weren't careful with dependency deletions - or if they saved non-validated rows to the database (it happens way more than you want to believe).

Keep Your IDs (And Your Sanity)

If it's possible to maintain IDs from the old database to the new, you should consider doing so. Some old databases will have funky primary keys spanning multiple columns and such, but hopefully yours uses more Rails-friendly numeric IDs. There are a lot of benefits to maintaining them: you won't have to create a lookup hash of old to new every time you create an associated model, you'll be able to easily cross-check old rows to new rows while you're debugging, and so forth. Certainly your migration will blaze along quicker if you don't have to do several association finds for every row.

If you do persist your IDs, be sure to set each ID after instantiating the model (but before saving) - Rails will not accept an id parameter in an ActiveRecord new/create hash:

# Bad
User.create :id => 42, :name => "Zaphod"

# Still bad
user = User.new :id => 42, :name => "Zaphod"
user.save

# Good
user = User.new :name => "Zaphod"
user.id = 42
user.save

Also, you may want to manually alter the auto-increment of your new tables if you're maintaining IDs. It's not a bad idea to give the increment value some extra headroom, especially if you'll need to do a follow-up migration once your new application is live. That way, you won't have conflicting IDs if data has been added to the new application before you've imported the stragglers from the old database.

It's simple enough to run an alter command in your Rails script:

new_increment_value = LegacyUser.find(:first, :order => 'id DESC').id + 10000
ActiveRecord::Base.connection.execute("ALTER TABLE users AUTO_INCREMENT = #{new_increment_value}")

Avoid Validation Slowness

Weeding out and/or correcting bad legacy data with Rails validations is one of the benefits of using ActiveRecord to migrate, but it won't feel like much of a benefit if it takes five seconds to check each row. Before you hit "go" on your million-row migration, look over your validations and ensure that you have properly set up corresponding indexes in your destination database. Methods like validates_uniqueness_of can be especially time-consuming if MySQL has to filesort a million rows by permalink, email address, etc. for each single migrated row.

Iterate Wisely

Collection processing is usually pretty simple in Rails: select your models with ActiveRecord and then use #each or a for loop to step through. If your legacy tables are small enough, you can most likely get away with this in your migration. But if your rows number in the hundreds of thousands or greater, you might find that keeping all of those model objects in memory makes Ruby rather unhappy.

Fortunately, there are ways to iterate more efficiently, no matter how large the table. For starters, we can dramatically reduce our memory cost by selecting only one row at a time. The only real issue becomes how to select each row in turn.

ActiveRecord's offset parameter provides a simple way to get the nth row of a table, which lends itself easily to a loop:

Model.count.times do |row|
  Model.find(:first, :offset => row).migrate_me!
end

The offset parameter translates to a SQL LIMIT clause:

SELECT * FROM models LIMIT 3, 1

Unfortunately, using LIMIT in MySQL slows dramatically as row counts increase. You might not notice it at first - the delay is due to parsing up to the limit point, so LIMIT 100, 1 is much quicker than LIMIT 100000, 1 - but eventually it will happen (probably a short while after you've gone to sleep dreaming of a finished migration the next morning).

To speed things up, we can select by ID instead of offset - all we need is a way to iterate through IDs. The most obvious solution is to select all of the IDs first, which is actually not a bad idea if your data is merely very large (as opposed to tremendous, gargantuan, or ludicrous). An array of a million integers may occupy a hundred megabytes of memory or so, but that's small potatoes on a production server.

We can use ActiveRecord::Base's connection object to get an array of integers without dealing with any modeling:

ActiveRecord::Base.connection.select_values('SELECT id FROM models').each do |id|
  Model.find(id).migrate_me!
end

For performance geeks and folks with truly incomprehensibly large data sets, there is a better solution yet: select your first ID and use simple SQL to get successive results (if your legacy data doesn't have unique integer IDs, you still might be able to use this method on a timestamp column or something similar, but otherwise you'll have to get more creative).

last_row_id = -1   # Or one less than the ID you want to start with
end_id = Model.last.id   # Or whichever ID you want to end with

while last_row_id < end_id do
  # Order by ID so that :first really is the next row after last_row_id
  current_record = Model.find(:first, :conditions => ['id > ?', last_row_id], :order => 'id')
  current_record.migrate_me!
  last_row_id = current_record.id
end
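To see why this walks every row exactly once even when the IDs have gaps, here is a small plain-Ruby simulation. The in-memory rows array and the next_row_after lambda are stand-ins I made up for the legacy table and the "id greater than the last one, ordered by id, first row only" query.

```ruby
# In-memory stand-in for a legacy table. The ids have gaps on purpose.
rows = [15, 3, 42, 8, 7].map { |id| { :id => id, :name => "row #{id}" } }

# Stand-in for: SELECT * FROM models WHERE id > ? ORDER BY id LIMIT 1
next_row_after = lambda do |last_id|
  rows.select { |row| row[:id] > last_id }.min_by { |row| row[:id] }
end

migrated = []
last_row_id = -1

# Stop when no row is left, so no separate end id lookup is needed.
while (row = next_row_after.call(last_row_id))
  migrated << row[:id]   # a real script would call migrate_me! here
  last_row_id = row[:id]
end
```

Looping until the query returns nothing is a slight variation on the loop above. It avoids a nil error if rows are deleted while the migration runs, at the cost of one extra query at the end.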

Rake Is Your Friend

It's best to wrap your migration script in a rake task, which can be called for any environment. If you haven't done so before, simply add a .rake file to lib/tasks, and add something like this:

namespace :legacy do
  desc "Migrate legacy data"
  task :migrate => :environment do   # The => :environment does the work of loading your Rails environment so you can use ActiveRecord, etc.
    require 'db/legacy/migrater'   # Require your migration class

    # You can pass values from the command line rake call (e.g. rake legacy:migrate MODEL=LegacyUser START_ROW=500) 
    # as a hash to your migration script using the ENV variable
    migrater = Migrater.new(ENV)
    migrater.go!
  end
end

Thereafter, you can merely call RAILS_ENV=test rake legacy:migrate and so forth from the command line in order to process your migrations. Add some conditions for selecting start/end IDs and limiting by model and you'll be all set!
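What the Migrater class does with those values is up to you. As a rough sketch, assuming MODEL and START_ROW are the only options, it might begin like this. Everything inside the class is hypothetical, and go! only returns a description here where a real version would run the migration.

```ruby
# Hypothetical Migrater. Only the class name and go! appear in the rake task.
class Migrater
  attr_reader :model_name, :start_row

  # env is anything hash-like with string keys, such as ENV.
  def initialize(env)
    @model_name = env['MODEL']                  # nil means "migrate every model"
    @start_row  = (env['START_ROW'] || 0).to_i  # ENV values are always strings
  end

  # A real implementation would iterate and migrate rows here.
  def go!
    target = model_name || 'all models'
    "Migrating #{target} from row #{start_row}"
  end
end

migrater = Migrater.new('MODEL' => 'LegacyUser', 'START_ROW' => '500')
message  = migrater.go!
```

Passing a plain hash instead of ENV, as the last two lines do, also makes the class easy to exercise from a test or from script/console.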

Some Status, Please

Nothing is worse than running a migration and not having any idea when it will finish - or if it will finish. Do yourself a favor and add some informative output to your script, ideally with error-per-model output as well (so you can track down conversion issues). A nice trick is to print '.' for each model migrated, with the occasional numeric status update (#500 of 100000, etc.). That way, your screen won't scroll to infinity and beyond.
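A minimal sketch of that kind of output, assuming one dot per row and a numeric line every few hundred rows. The ProgressReporter class is made up for illustration, and it writes to whatever IO object it is given so it can be pointed at a log file or a StringIO.

```ruby
require 'stringio'

# Hypothetical reporter: one dot per row, a numeric line every `every` rows.
class ProgressReporter
  def initialize(total, every = 500, out = $stdout)
    @total = total
    @every = every
    @out   = out
    @done  = 0
  end

  # Call once per migrated row.
  def tick
    @done += 1
    @out.print '.'
    @out.puts "\n#%d of %d" % [@done, @total] if (@done % @every).zero?
  end
end

# Small numbers and a StringIO keep the demonstration short.
out = StringIO.new
reporter = ProgressReporter.new(5, 2, out)
5.times { reporter.tick }
```

In a real migration you would construct the reporter with the row count and the default $stdout, then call tick right after each migrate_me! call.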

Hey, That Was Easy!

I hope these tips save you some time the next time you migrate a legacy database. Migrations aren't as fun and flashy as the latest AJAX trick, but it's definitely cathartic to watch those status updates march towards completion once you're done!



Honeycomb

This is the third article in a series about best practices for creating Rails forms. Be sure to check out Pretty Data, Pretty Code and Modeling All Form Data for more related techniques.

Revisiting Our Form

Continuing with the example from my last article, I have this simple form:

<% form_for :product, :url => products_path do |f| %>
  <div class="form_item text_field">
    <label for="product_name">Name:</label>
    <%= f.text_field :name %>
  </div>
  <div class="form_item text_field">
    <label for="product_price">Price:</label>
    <%= f.text_field :price %>
  </div>
  <div class="form_item text_area">
    <label for="product_features">Features (1 per line):</label>
    <%= f.text_area :features %>
  </div>
  <div class="form_item submit_button">
    <%= f.submit "Create Product" %>
  </div>
<% end %>

Note: I changed the original form's wrapper tags from <p> to <div>. The reason is that the built-in Rails error message helpers wrap erroneous fields in <div> tags by default. Since <div> tags can't nest within <p> tags, this can cause serious layout-breaking problems.

There's a lot of duplication here in terms of the wrapper code. Each field is wrapped in a <div> with a class of form_item as well as a descriptor of what kind of field it contains. I use this information for CSS styling of specific kinds of inputs. The labels are fairly repetitive as well, and these repetitions would become more obnoxious were this a longer form.

Removing Repetition

What if we could generate this same form code while removing the repetition? It's possible, and Rails gives us a great hook for just this scenario by allowing us to build custom Form Builders. A custom Form Builder is simply a subclass of ActionView::Helpers::FormBuilder that can alter and extend the abilities of the regular Form Builder. In our current form, the f block variable is an instance of FormBuilder, so methods like text_field and submit are all instance methods of the FormBuilder class. Let's override these methods to not only output the field markup, but to also output the wrapper div:

# in /helpers/application_helper.rb
class WrapperFormBuilder < ActionView::Helpers::FormBuilder
  METHODS_TO_OVERRIDE = %w{text_field text_area password_field file_field date_select datetime_select submit}

  METHODS_TO_OVERRIDE.each do |method_name|
    src = <<-END_SRC   # the dash lets the closing END_SRC be indented
      def #{method_name}_with_wrapper(field, options={})
        # allow explicit setting of label text with options[:label]
        field_label = if '#{method_name}' == 'submit'
          '' # no label for submit inputs
        elsif options[:label]
          label(field, options.delete(:label))
        else
          label(field) + ":" # Adds colon as default
        end

        # get unwrapped field
        field_markup = #{method_name}_without_wrapper(field, options)

        # return wrapped field (@template gives us access to helper methods in this class)
        @template.content_tag(:div, field_label + field_markup, :class => "form_item #{method_name}")
      end
    END_SRC
    class_eval src, __FILE__, __LINE__
    alias_method_chain method_name.to_sym, :wrapper
  end

end

For our form, we can now implement our new class like this:

<% form_for :product, :url => products_path, :builder => WrapperFormBuilder do |f| %>

Another possibility is to create a new method to replace form_for:

# in helpers/application_helper.rb
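def wrapper_form_for(name, *args, &proc)
  # Pull the trailing options hash (if any) off the arguments so the builder can be merged into it
  options = args.extract_options!
  form_for(name, *(args << options.merge(:builder => WrapperFormBuilder)), &proc)
end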

Using this new helper method, our form now looks like this:

<% wrapper_form_for :product, :url => products_path do |f| %>
  <%= f.text_field :name %>
  <%= f.text_field :price %>
  <%= f.text_area :features, :label => "Features (1 per line):" %>
  <%= f.submit "Create Product" %>
<% end %>

If you need to create a form field without a wrapper (perhaps to use a non-conforming wrapper), you can still access the original methods like f.text_field_without_wrapper or f.submit_without_wrapper.
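If you are curious where those _without_wrapper methods come from, here is roughly what alias_method_chain does, sketched in plain Ruby with no Rails involved. PlainBuilder and WrappedBuilder are made-up stand-ins for FormBuilder and WrapperFormBuilder, and the markup they return is simplified.

```ruby
# Plain-Ruby stand-in for a form builder. Hypothetical, not a Rails class.
class PlainBuilder
  def text_field(field)
    %(<input type="text" name="#{field}" />)
  end

  def text_area(field)
    %(<textarea name="#{field}"></textarea>)
  end
end

class WrappedBuilder < PlainBuilder
  %w[text_field text_area].each do |method_name|
    # Define the wrapping version under a new name.
    define_method("#{method_name}_with_wrapper") do |field|
      inner = send("#{method_name}_without_wrapper", field)
      %(<div class="form_item #{method_name}">#{inner}</div>)
    end

    # By hand, what alias_method_chain does: keep the original under
    # a new name, then point the old name at the wrapper.
    alias_method "#{method_name}_without_wrapper", method_name
    alias_method method_name, "#{method_name}_with_wrapper"
  end
end

builder = WrappedBuilder.new
wrapped = builder.text_field(:name)
plain   = builder.text_field_without_wrapper(:name)
```

Calling text_field now returns the wrapped markup, while text_field_without_wrapper still reaches the original method. That is the same escape hatch described above.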

I have found custom Form Builders to be powerful tools for speeding up form development, DRYing up code and keeping consistency between forms and developers. This is a fairly basic example, but the sky is the limit for what you can implement using these techniques.



