DataMapper is an object-relational mapper for Ruby with an interface somewhat similar to ActiveRecord's. More than sprinkles on top of a generic SQL adapter, DataMapper is a design pattern for defining repositories and the models that love them. DataMapper differs from ActiveRecord and ActiveResource in that models are isolated from repositories, while queries and collections communicate between them. One of the most significant advantages of this approach is the ability to develop models separately from their repository.
Over the next several articles we will use the Twitter API to explore how DataMapper expects custom adapters to work, executes CRUD requests, handles associations and works with multiple repositories. For now we will concentrate on the basics of DataMapper and querying users from the Twitter API service. Those comfortable with Ruby or ActiveRecord should be able to follow along; however, I strongly recommend spending time with DataMapper's fantastic documentation if you have not already.
Modeling Our Models
First things first, we should install DataMapper and define a model to represent a user account from the Twitter API. To do that we need to define a class that includes the DataMapper::Resource module and describes the properties Twitter provides.
gem install datamapper
    require 'dm-core'

    class User
      include DataMapper::Resource

      property :id,                Integer, :field => 'user_id'
      property :name,              String
      property :screen_name,       String
      property :email,             String
      property :location,          String
      property :description,       Text,    :lazy => false
      property :profile_image_url, String
      property :url,               String
      property :protected,         Boolean
      property :followers_count,   Integer
    end
DataMapper differs from the ActiveRecord family in that fields are defined in your model rather than being created from the repository's schema. This allows models to be built without an adapter making a connection and avoids the headaches of ActiveRecord-style migrations. Additionally you may specify the :field name to be used for each property, allowing antiquated or confusing field names to be user friendly. Defining a model's properties also allows DataMapper to intelligently calculate by type which fields should be lazily loaded, which may also be customized by passing the :lazy option a true or false value. In our User model we have set the lazy option to false as Twitter provides a user's description by default, and there is no use in wasting an API hit if we do not need to (Twitter limits API calls by the hour). A more in-depth description of DataMapper's lazy fields can be found in the documentation.
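To make the idea of model-defined properties concrete, here is a minimal plain-Ruby sketch of a property macro that records a custom :field name and a :lazy flag without touching any repository. TinyResource, SketchUser and the rule that only text properties default to lazy are assumptions made for illustration; this is not how dm-core itself is implemented.

```ruby
# Hypothetical stand-in for DataMapper::Resource, for illustration only.
module TinyResource
  Property = Struct.new(:name, :type, :field, :lazy)

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Every property declared on the model, in declaration order.
    def properties
      @properties ||= []
    end

    # Records the property and defines a plain accessor for it.
    def property(name, type, options = {})
      field = options.fetch(:field, name.to_s)
      # Assumption: only text properties are lazy unless told otherwise.
      lazy = options.fetch(:lazy, type == :text)
      properties << Property.new(name, type, field, lazy)
      attr_accessor name
    end
  end
end

class SketchUser
  include TinyResource

  property :id,          :integer, :field => 'user_id'
  property :screen_name, :string
  property :description, :text,    :lazy => false
end

SketchUser.properties.each do |prop|
  puts "#{prop.name} -> #{prop.field} (lazy: #{prop.lazy})"
end
```

Nothing here opens a connection: the model can describe itself completely before any adapter exists, which is the point the paragraph above is making.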
The Base Adapter
For the foundation of our Twitter adapter we need to catch any authentication options passed to the initialization as well as write a method for communicating with Twitter.
    require 'cgi'
    require 'open-uri'
    require 'rubygems'
    require 'dm-core'
    require 'xmlsimple'

    module DataMapper
      module Adapters
        # Inheriting from AbstractAdapter gives us the default #read_one and
        # #read_many, which raise NotImplementedError until we override them.
        class TwitterAdapter < AbstractAdapter
          # Clients can provide DataMapper with a URI string or hash of options when
          # initializing an adapter. We can store these values and use them for each
          # request to the Twitter service if the client provides them. Depending on
          # your repository you may wish to verify authentication here rather than
          # waiting for the initial request.
          #
          # name:: Name of the adapter
          # uri_or_options:: A uri string, or hash of options used to initialize the adapter
          #
          def initialize(name, uri_or_options)
            # don't forget to phone home!
            super(name, uri_or_options)

            case uri_or_options
            when Hash
              user = uri_or_options[:user] || ''
              pass = uri_or_options[:pass] || ''
              @auth = user.blank? || pass.blank? ? nil : [user, pass]
            end
          end

          private

          # Requests a resource from the Twitter API. If the adapter was initialized
          # with the :user and :pass options, they will be used to authenticate the request.
          #
          # method:: Path to follow the base URI 'http://twitter.com'
          # params:: Hash of key/value pairs to be used as the query string
          # returns:: XmlSimple representation of the response from Twitter
          #
          def request(method, params = {})
            uri = "http://twitter.com/#{method}"
            options = {:http_basic_authentication => @auth}

            unless params.blank?
              # keys and values may be symbols or integers, so stringify before escaping
              query = params.map { |k, v| "%s=%s" % [CGI.escape(k.to_s), CGI.escape(v.to_s)] }
              uri << "?#{query.join('&')}"
            end

            result = open(uri, options)
            return XmlSimple.xml_in(result.read, {'ForceArray' => false})
          end
        end
      end
    end
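The query-string step of #request is easy to try in isolation. The sketch below pulls it out into a standalone helper so the escaping can be checked without hitting the network; build_uri is a hypothetical name used only here, and the example address is made up.

```ruby
require 'cgi'

# Hypothetical helper mirroring the URI-building portion of #request.
def build_uri(path, params = {})
  uri = "http://twitter.com/#{path}"
  return uri if params.empty?

  # Stringify keys and values first so symbols and integers can be escaped.
  query = params.map { |k, v| "#{CGI.escape(k.to_s)}=#{CGI.escape(v.to_s)}" }
  "#{uri}?#{query.join('&')}"
end

puts build_uri('users/show.xml', :screen_name => 'kscollective')
puts build_uri('users/show.xml', 'email' => 'a b@example.com')
```

Calling to_s before CGI.escape matters because CGI.escape only accepts strings, while DataMapper conditions frequently carry symbols or integers.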
Now we can go ahead and see if we are on the right path. We should get a NotImplementedError when we try to perform any action with our repository.
    DataMapper.setup(:default, {
      :adapter => 'twitter',
      :user    => 'kscollective',
      :pass    => 'snark snark'
    })

    User.first(:screen_name => 'kscollective') # => NotImplementedError
Fetching Heffalumps And Woozles
In order to continue building our adapter we have to understand the queries and collections DataMapper uses to mediate between the models and adapters. DataMapper::Adapters::AbstractAdapter defines #read_one and #read_many, both of which accept the query as the single parameter. The query object allows us to determine which model and fields to query along with any possible conditions to limit our results by. Queries also tell our adapter about any possible offsets, limits or ordering, but we will come back to that another day.
Query#model
Each query belongs to a model which we can use to load and return instances fetched from our repository, and may also be used to customize repository requests by type. We can even use the model to DRY up our adapter and work with a single #read method.
    # Reopen the adapter class to add instance methods
    class DataMapper::Adapters::TwitterAdapter
      def read_one(query)
        read(query, query.model, false)
      end

      def read_many(query)
        DataMapper::Collection.new(query) do |set|
          read(query, set, true)
        end
      end

      private

      # Each read has a query and returns a set, #read_one and #read_many should provide
      # the set to load the results into. When called by #read_many nothing needs to be returned
      # as the collection is filling itself, however we must return whatever object #read_one
      # should return back to the client code.
      #
      def read(query, set, many = true)
        raise NotImplementedError # to be filled in later
      end
    end
Query#fields
Each query contains a subset of the model's properties specifying which fields to query as well as the order of our values when instantiating each result. Twitter does not provide a means to filter out fields so we will use #fields only to order the values we pass to the model builders.
    class DataMapper::Adapters::TwitterAdapter
      private

      # Map the values from item into an array of the same size and order as Query#fields
      # [id, name, something_else, title] => [1, 'Hello World', nil, 'Cats!']
      #
      def parse_user_values(query, item)
        return query.fields.map { |f| item[f.field.to_s] }
      end
    end
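The ordering behaviour is easier to see with stand-in objects. In this sketch FieldStub and QueryStub are hypothetical replacements for DataMapper's property and query objects, which expose more than the two attributes used here; only the mapping logic is taken from the adapter.

```ruby
# Hypothetical stand-ins; the real Query#fields returns DataMapper properties.
FieldStub = Struct.new(:name, :field)
QueryStub = Struct.new(:fields)

# Same logic as the adapter method: one value per field, in field order.
def parse_user_values(query, item)
  query.fields.map { |f| item[f.field.to_s] }
end

query = QueryStub.new([
  FieldStub.new(:id,    'user_id'),
  FieldStub.new(:name,  'name'),
  FieldStub.new(:email, 'email')
])

# A parsed response: keys arrive in any order, some are missing, some are extra.
item = { 'name' => 'Hello World', 'user_id' => '1', 'extra' => 'ignored' }

p parse_user_values(query, item) # => ["1", "Hello World", nil]
```

Fields missing from the response become nil, and anything the query did not ask for is dropped, so the resulting array always lines up with Query#fields.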
Query#conditions
The meat of most queries, conditions is an array of tuples containing the operator, property and value to be considered when executing the request. Each operator may be any of the standard 'SQL' operators (:eql, :in, :gt[e], :lt[e]) and should be used to match the property with one or more values. Because Twitter only provides a method to query individual users by id, email or screen name we can write a method to create an array of key/value pair queries.
    class DataMapper::Adapters::TwitterAdapter
      private

      def generate_users_query(query)
        result = Array.new
        fields = ['user_id', 'email', 'screen_name']

        # keep only the equality conditions on fields Twitter can look up
        conditions = query.conditions.select do |condition|
          condition[0] == :eql and fields.include?(condition[1].field)
        end

        # each item in conditions is a [operator, property, value] tuple
        for operator, property, value in conditions
          # if an array, each value must be queried individually
          [value].flatten.each { |v| result << [property.field, v] }
        end

        return result
      end
    end
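As a rough illustration of the filtering described above, the following sketch runs the same logic over hand-built condition tuples. PropertyStub and users_query_pairs are hypothetical names; a real query hands the adapter DataMapper property objects rather than structs.

```ruby
# Hypothetical stand-in for a DataMapper property; only #field is needed here.
PropertyStub = Struct.new(:field)

QUERYABLE_FIELDS = ['user_id', 'email', 'screen_name']

# Turns [operator, property, value] tuples into [field, value] pairs,
# keeping only equality checks on fields Twitter can look up.
def users_query_pairs(conditions)
  pairs = []
  conditions.each do |operator, property, value|
    next unless operator == :eql && QUERYABLE_FIELDS.include?(property.field)
    # An array of values becomes one lookup per value.
    [value].flatten.each { |v| pairs << [property.field, v] }
  end
  pairs
end

conditions = [
  [:eql, PropertyStub.new('screen_name'),     ['alice', 'bob']],
  [:gt,  PropertyStub.new('followers_count'), 10],
  [:eql, PropertyStub.new('location'),        'Kansas City'],
  [:eql, PropertyStub.new('user_id'),         42]
]

p users_query_pairs(conditions)
# => [["screen_name", "alice"], ["screen_name", "bob"], ["user_id", 42]]
```

The :gt condition and the location condition are silently dropped, which is worth keeping in mind: an adapter built this way ignores anything the remote service cannot express.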
Mind The Gap!
What used to be the hardest part in using third party resources has now become a matter of building the request, parsing the response and loading the model.
    def read(query, set, many = true)
      queries = generate_users_query(query)

      for key, value in queries
        twitter_user = request("users/show.xml", {key => value})
        next if twitter_user.blank? or twitter_user['screen_name'].blank?

        user_values = parse_user_values(query, twitter_user)

        # #read_one wants the first loaded model handed back to the caller
        return set.load(user_values, query) unless many

        # #read_many just keeps filling the collection
        set.load(user_values)
      end

      return nil
    end
Recess!
At this point you should be able to query users from Twitter without parsing a single query string or XML response.
    DataMapper.setup(:default, {:adapter => 'twitter'})

    user = User.first(:screen_name => 'KSCollective')
    puts user.url # => http://www.killswitchcollective.com
As you can see DataMapper can make working with third party repositories as automagical as ActiveRecord. DataMapper itself comes with adapters for the major databases with additional adapters available for CouchDB, Google Video and many others. In a few weeks we will continue building our Twitter adapter, leveraging the power of associations. Until then I highly recommend reviewing DataMapper's documentation, the adapters included with the dm-core gem and the base adapter we have just built.
Testing with Selenium
Anyone who has worked on Javascript-centric web applications knows how much of a hassle testing them can be. Either you're stuck manually testing endless possible combinations of actions, or you're writing those tests for your Selenium plugin. Things get even worse when your client wants to support browsers like IE6. RSpec has encapsulated the behavior driven development of models, controllers and views, and user stories have integrated testing between the layers, but unfortunately neither has done much for the development of interactive web applications.
Working on Javascript applications after learning RSpec can be a painful experience. Your fingers want to write simple tests, whereas Javascript wants you to painfully point and click until everything works together. For years I have been nagging myself to find a headless browser that could be integrated into my development environment, but I simply never had the time, energy or justification to take action. Tired of pointing and clicking repeatedly after every change, I recently became filled with angst and decided it was time to do something about it.
I was first exposed to Selenium a few semesters back, and while I was unimpressed with the end product, it was the first and only project of its kind I was aware of. My first impression of Selenium was based on personal preferences and not scientific merits, so I thought a second impression was due. To my dismay, Selenium still has an interface that appears to have been designed only for Windows, and still exists as a glorified macro editor. While Selenium continues to be a bust, I did come across SeleniumRC which is built to test multiple browsers on multiple platforms.
A Better Selenium?
SeleniumRC exists as a server that acts as a proxy between an HTTP client and a browser. Clients send commands to the SeleniumRC server, which passes those actions on to the SeleniumCore inside of the browser window with the matching session id. SeleniumRC returns each request with the result of the command, allowing the client to control and test pages just as a user would. Therefore SeleniumRC can be scripted using any language that can send an HTTP request, like Ruby, JRuby, or Intel Assembly.
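Since every command is just an HTTP request, the client side can be sketched in a few lines. The example below only builds the request URLs and sends nothing. The host, port, driver path and the cmd / numbered-argument / sessionId parameter names reflect my recollection of the SeleniumRC wire protocol and should be treated as assumptions to check against your server; selenium_command_url is a hypothetical helper.

```ruby
require 'cgi'

# Assumed default location of a locally running SeleniumRC server.
SELENIUM_DRIVER = 'http://localhost:4444/selenium-server/driver/'

# Builds the URL for a single command. Arguments are numbered from 1 and
# the session id, once known, accompanies every later command.
def selenium_command_url(command, args = [], session_id = nil)
  params = [['cmd', command]]
  args.each_with_index { |arg, i| params << [(i + 1).to_s, arg] }
  params << ['sessionId', session_id] if session_id

  query = params.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v.to_s)}" }.join('&')
  "#{SELENIUM_DRIVER}?#{query}"
end

# Start a browser session, then click an element within that session.
puts selenium_command_url('getNewBrowserSession', ['*firefox', 'http://example.com/'])
puts selenium_command_url('click', ['id=submit'], 'abc123')
```

A real client would issue these with Net::HTTP and parse the plain-text reply; that part is omitted because it requires a running server.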
Immediately seeing the possibilities, I set out to plug the SeleniumRC Ruby client into RSpec, which would allow me to write user interaction specs in Ruby. While plugging the SeleniumRC client into RSpec proved to be almost as easy as drag and drop, the honeymoon quickly faded. Getting the tests to run and pass turned into a seemingly endless adventure, where sometimes XPaths wouldn't work, sometimes browsers wouldn't work... a big mess. A good portion of my time was spent testing if my behavior tests would work rather than writing the tests and functional code.
I tried two different approaches in overcoming the issues with XPath that I experienced. The first involved passing Prototype strings to be evaluated to avoid SeleniumCore altogether. Unfortunately, for some reason I was unable to discover, none of the Prototype strings were returning values. The second approach used Hpricot to assert the presence of elements or values, as well as generating the XPaths for those elements, which could then be passed to SeleniumCore. Alas, XPath selectors were still not working when they were generated by Hpricot.
In addition to having difficulties in getting the tests to work, the syntax provided by the Ruby client is not very pleasing to the Rubyist's eye. I never expect a 'bonus' piece of code, packaged with a free product still in beta, to be the cat's meow. Still, it is always nice when the syntax is clean and all of the pieces work correctly. Does Ruby written to test Javascript have to look like Javascript?
Another issue I've found with SeleniumRC is that it is as slow as my grandma driving a Cadillac down the expressway on Sunday. Even when running the server and client from your local machine, the tests take an extraordinary amount of time to run. I believe this is due to the manner in which SeleniumRC makes it all possible, and while it may be tolerable when doing full scale testing, it simply is not when practicing BDD.
SeleniumRC is still a beta product, listed as version 1.0 beta. While I agree with the term beta, I feel that in this day and age software should be roughly usable at version 0.1. Typically I would have no problem working around these issues for something I really want; however, active development then becomes more of a requirement than a wish. There have only been two releases of SeleniumRC since 2006, with the most work being done in the first half of 2006.
A New Hope
Although SeleniumRC hacked my enthusiasm into pieces, it did manage to further motivate my quest for a headless browser. Taking the ideas I got from my time spent with SeleniumRC and RSpec, I set out to create a class that would allow me to control a virtual browser instance from a Ruby object. I decided to look back at my ObjectiveC/Cocoa experience and poke at the WebKit API for a bit and see what sort of trouble I could get myself into.
With a little bit of elbow grease and Google, I have been able to get a working instance of a WebKit browser neatly bundled as a Ruby object. Currently Javascript strings can be passed to the browser to be evaluated, with their string result being returned.
While there are a few technical details to be worked out, with any luck the power of WebKit in Ruby combined with the magic of RSpec should free the masses from the infinite loop of edit-reload-edit. Of course visual aspects will still need manual testing, as well as user interaction on other browsers. With a little finesse, the same tests written to be tested locally with WebKit could also be used to test remote browsers using SeleniumRC.
Next time I hope to have a working demonstration and sample code. In the meantime, here is some eye candy you can feast on!
Recently I was working with a unique relationship in which each child object was a specific instance of its parent while also inheriting multiple attributes from that parent. In this particular situation the traditional method of accessing the parent's attribute through the child is both tedious and verbose. Rather than calling the attribute on the parent, I wanted to clean things up and call the child directly, leaving the child to fetch the attribute from the parent. Not only would this reduce the amount of code needed, but it would also help separate the concerns of the model from the views. I decided to concentrate on imitating the inheritance rather than the relationship since the inheritance evolved from normalizing the data.
At first I thought it would be easiest to override Child#method_missing, an obvious choice for dynamically monkey patching any object. Unfortunately I was unable to find a working pattern that did not rely on exceptions, the time consuming respond_to? method or a white list of method names to forward. The voices in my head kept insisting there had to be a better, more Ruby-esque way to achieve the same result while automating the grunt work.
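For reference, the method_missing approach I abandoned looks roughly like the sketch below. It is plain Ruby with hypothetical ParentStub and ChildStub classes rather than ActiveRecord models, and it leans on exactly the respond_to? check complained about above.

```ruby
# Hypothetical stand-in for the parent record.
ParentStub = Struct.new(:name, :color)

class ChildStub
  attr_reader :parent, :nickname

  def initialize(parent, nickname)
    @parent = parent
    @nickname = nickname
  end

  # Forward anything the child lacks but the parent understands.
  def method_missing(name, *args, &block)
    if parent.respond_to?(name)
      parent.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? honest about the forwarded methods.
  def respond_to_missing?(name, include_private = false)
    parent.respond_to?(name) || super
  end
end

child = ChildStub.new(ParentStub.new('Widget', 'red'), 'Widgy')
puts child.nickname # answered by the child itself
puts child.name     # forwarded to the parent
```

It works, but every forwarded call pays for a failed method lookup plus a respond_to? check, and the child ends up forwarding every public method of the parent rather than just its attributes.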
With a bit of help from Google and caffeine I stumbled over Rails Ticket 4133 incorporating the delegate method, which creates a method that delegates attribute calls onto related objects.
    class Parent < ActiveRecord::Base
      has_many :children
    end

    class Child < ActiveRecord::Base
      belongs_to :parent
      delegate :name, :to => :parent
    end
However there are a few drawbacks to using the delegate method, such as still relying on a white-list of attributes, and the helper costing more keystrokes than the original code. Even more disturbing is that the delegate helper does not generate the attribute query method. I adore uniformity in software, and if a model method Child#name exists, so should the query accessor Child#name? (at least in Rails).
Fortunately we can inject a bit of voodoo into ActiveRecord itself to extend and customize how the delegate method operates. Although I have taken the low road and created a monkey patch, with a few slight modifications and a well rounded test suite a Rails plugin could be born.
    module ActsAsDelegatableTo
      def acts_as_delegatable_to(table, *exceptions)
        # exceptions may be given as symbols, column names are strings
        exceptions    = exceptions.map { |attr| attr.to_s }
        local_columns = self.columns.map { |attr| attr.name }
        table_columns = table.to_s.classify.constantize.columns.map { |attr| attr.name }

        table_columns.reject! { |attr|
          attr == 'id' or local_columns.include?(attr) or exceptions.include?(attr)
        }

        table_columns.each { |attr|
          class_eval(%{delegate :#{attr},  :to => :#{table}})
          class_eval(%{delegate :#{attr}?, :to => :#{table}})
        }
      end
    end

    # extend rather than include: acts_as_delegatable_to is called at the class level
    ActiveRecord::Base.extend ActsAsDelegatableTo
Rather than explicitly programming or including the delegate methods using the white-list approach, acts_as_delegatable_to creates delegate methods to each of the fields defined in the parent's data model, excluding any the child may have redefined. But wait, there's more! For only a few keystrokes more we also gave ourselves the standard Rails attribute query methods, giving each child access to not only Parent#attribute but Parent#attribute? as well.
    class Parent < ActiveRecord::Base
      has_many :children
    end

    class Child < ActiveRecord::Base
      belongs_to :parent
      acts_as_delegatable_to :parent
    end
By calling acts_as_delegatable_to :parent, each Child object instance will have attribute accessors and query methods for each field belonging to its parent. Before blindly inheriting all of the attributes of related objects, just remember that this sort of relationship is not common. This method could cause nasty headaches and sleepless nights if it is not used carefully. There are no protections for naming conflicts or duplicated class evaluations, and the inheritance is generated based on the data model, not the business logic. If you decide to take this approach, just proceed with caution!
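For anyone who wants to experiment with the idea outside of Rails, here is a rough plain-Ruby sketch of the same technique: generate a reader and a query method for every parent attribute the child does not already define. DelegatesAll, PlainParent and PlainChild are hypothetical names, the attribute list is passed in explicitly instead of being read from a table, and the query method only approximates what Rails considers present.

```ruby
module DelegatesAll
  # attrs:  attribute names to forward
  # to:     name of the method returning the target object
  # except: attribute names to leave alone
  def delegate_all(attrs, to:, except: [])
    (attrs.map(&:to_s) - except.map(&:to_s)).each do |attr|
      next if method_defined?(attr) # never clobber the child's own methods

      define_method(attr) { public_send(to).public_send(attr) }

      # Approximation of a Rails-style query method: nil, false and
      # empty values all count as "not set".
      define_method("#{attr}?") do
        value = public_send(to).public_send(attr)
        !(value.nil? || value == false || (value.respond_to?(:empty?) && value.empty?))
      end
    end
  end
end

PlainParent = Struct.new(:name, :notes, :id)

class PlainChild
  extend DelegatesAll
  attr_reader :parent, :id

  def initialize(parent, id)
    @parent = parent
    @id = id
  end

  delegate_all PlainParent.members, to: :parent
end

child = PlainChild.new(PlainParent.new('Widget', '', 7), 99)
puts child.name   # forwarded to the parent
puts child.name?  # true, the parent has a name
puts child.notes? # false, the notes are empty
puts child.id     # the child's own id, not the parent's
```

Unlike the monkey patch above, this version at least skips names the child already defines, which removes one of the naming conflicts mentioned earlier.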




