Perspectives

Map

DataMapper is an object-relational mapper for Ruby with an interface somewhat similar to ActiveRecord's. More than sprinkles on top of a generic SQL adapter, DataMapper is a design pattern for defining repositories and the models that love them. DataMapper differs from ActiveRecord and ActiveResource in that models are kept separate from repositories, while queries and collections communicate between them. One of the most significant advantages of this approach is the ability to develop models separately from their repository.

Over the next several articles we will use the Twitter API to explore how DataMapper expects custom adapters to work, executes CRUD requests, handles associations and works with multiple repositories. For now we will concentrate on the basics of DataMapper and querying users from the Twitter API service. Those comfortable with Ruby or ActiveRecord should be able to follow along; however, I strongly recommend spending time with DataMapper's fantastic documentation if you have not already.

Modeling Our Models

First things first, we should install DataMapper and define a model to represent a user account from the Twitter API. To do that we need to define a class that includes the DataMapper::Resource module and describe the properties Twitter provides.

# from the shell: gem install datamapper
require 'dm-core'

class User
  include DataMapper::Resource

  property :id, Integer, :field => 'user_id'
  property :name, String
  property :screen_name, String
  property :email, String
  property :location, String
  property :description, Text, :lazy => false
  property :profile_image_url, String
  property :url, String
  property :protected, Boolean
  property :followers_count, Integer
end

DataMapper differs from the ActiveRecord family in that fields are defined in your model rather than being created from the repository's schema. This allows models to be built without an adapter making a connection and avoids the headaches of ActiveRecord-style migrations. Additionally you may specify the :field name to be used for each property, allowing antiquated or confusing field names to be user friendly. Defining a model's properties also allows DataMapper to intelligently calculate by type which fields should be lazily loaded, which may also be customized by passing the :lazy option a true or false value. In our User model we have set the lazy option to false as Twitter provides a user's description by default, and there is no use in wasting an API hit if we do not need to (Twitter limits API calls by the hour). A more in-depth description of DataMapper's lazy fields can be found in the documentation.
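
To make the :field and :lazy options concrete, here is a toy sketch in plain Ruby of a class that records its own property declarations. ToyResource, ToyProperty and ToyUser are hypothetical names invented for this illustration and are not DataMapper's implementation; the rule that only text properties are lazy by default is an assumption based on the description above.

```ruby
# Toy stand-in for property declarations; NOT DataMapper's implementation.
ToyProperty = Struct.new(:name, :type, :field, :lazy)

module ToyResource
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= []
    end

    # :field defaults to the property name, :lazy defaults by type
    # (assumed here: only :text is lazy unless told otherwise).
    def property(name, type, options = {})
      field = options.fetch(:field, name.to_s)
      lazy  = options.fetch(:lazy, type == :text)
      properties << ToyProperty.new(name, type, field, lazy)
    end
  end
end

class ToyUser
  include ToyResource

  property :id, :integer, :field => 'user_id'
  property :name, :string
  property :description, :text, :lazy => false
  property :bio, :text
end

# Field names as the repository would see them, and the lazy properties
fields = ToyUser.properties.map { |p| p.field }
lazy   = ToyUser.properties.select { |p| p.lazy }.map { |p| p.name }
```

The point of the sketch is that everything the adapter needs to know (field names, types, laziness) is available from the model alone, without any connection to a repository.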

The Base Adapter

For the foundation of our Twitter adapter we need to catch any authentication options passed to the initialization as well as write a method for communicating with Twitter.

require 'cgi'
require 'open-uri'
require 'rubygems'
require 'dm-core'
require 'xmlsimple'

module DataMapper
  module Adapters
    class TwitterAdapter < AbstractAdapter

      # Clients can provide DataMapper with a URI string or hash of options when
      # initializing an adapter. We can store these values and use them for each
      # request to the Twitter service if the client provides them. Depending on
      # your repository you may wish to verify authentication here rather than 
      # waiting for the initial request.
      #
      # name:: Name of the adapter
      # uri_or_options:: A uri string, or hash of options used to initialize the adapter
      #
      def initialize(name, uri_or_options)
        # don't forget to phone home!
        super(name, uri_or_options)

        case uri_or_options
        when Hash
          user = uri_or_options[:user] || ''
          pass = uri_or_options[:pass] || ''
          @auth = user.blank? || pass.blank? ? nil : [user, pass]
        end
      end

      private

      # Requests a resource from the Twitter API. If the adapter was initialized
      # with the :user and :pass options, they will be used to authenticate the request.
      #
      # method:: Path to follow the base URI 'http://twitter.com'
      # params:: Hash of key/value pairs to be used as the query string
      # returns:: XmlSimple representation of the response from Twitter
      #
      def request(method, params = {})
        uri = "http://twitter.com/#{method}"
        options = {:http_basic_authentication => @auth}

        unless params.blank?
          # to_s allows symbol keys and non-string values
          query = params.map { |k,v| "%s=%s" % [CGI.escape(k.to_s), CGI.escape(v.to_s)] }
          uri << "?#{query.join('&')}"
        end

        result = open(uri, options)
        return XmlSimple.xml_in(result.read, {'ForceArray' => false})
      end

    end
  end
end
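
The URI-building half of #request can be tried on its own, without touching the network. The following is a minimal sketch using only the standard library; build_query_string and build_uri are hypothetical helper names, not part of DataMapper or the adapter above.

```ruby
require 'cgi'

# Build "key=value&key=value" from a hash, escaping both sides.
# to_s lets callers pass symbols or integers, since CGI.escape expects strings.
def build_query_string(params)
  params.map { |k, v| "#{CGI.escape(k.to_s)}=#{CGI.escape(v.to_s)}" }.join('&')
end

# Append the query string to the base URI only when there are parameters.
def build_uri(path, params = {})
  uri = "http://twitter.com/#{path}"
  uri += "?#{build_query_string(params)}" unless params.empty?
  uri
end

plain   = build_uri('users/show.xml')
example = build_uri('users/show.xml', { :screen_name => 'ks collective', :user_id => 42 })
```

Keeping this step separate from the HTTP call makes it easy to check the escaping before spending any of the hourly API allowance.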

Now we can check that we are on the right path: we should get a NotImplementedError when we try to perform any action with our repository.

DataMapper.setup(:default, {
  :adapter => 'twitter',
  :user => 'kscollective',
  :pass => 'snark snark'
})

User.first(:screen_name => 'kscollective') # => NotImplementedError

Fetching Heffalumps And Woozles

In order to continue building our adapter we have to be able to understand the queries and collections DataMapper uses to mediate between the models and adapters. DataMapper::AbstractAdapter defines #read_one and #read_many, both of which accept the query as the single parameter. The query object allows us to determine which model and fields to query along with any possible conditions to limit our results by. Queries also tell our adapter about any possible offsets, limits or ordering, but we will come back to that another day.

Query#model

Each query belongs to a model which we can use to load and return instances fetched from our repository, and may also be used to customize repository requests by type. We can even use the model to DRY up our adapter and work with a single #read method.

# reopen the adapter class to add instance methods
class DataMapper::Adapters::TwitterAdapter

  def read_one(query)
    read(query, query.model, false)
  end

  def read_many(query)
    DataMapper::Collection.new(query) do |set|
      read(query, set, true)
    end
  end

  private

  # Each read has a query and returns a set; #read_one and #read_many provide
  # the set to load the results into. When called by #read_many nothing needs to be
  # returned, as the collection is filling itself. When called by #read_one we must
  # return whatever object #read_one should hand back to the client code.
  #
  def read(query, set, many = true)
    raise NotImplementedError # to be filled in later
  end

end

Query#fields

Each query contains a subset of the model's properties specifying which fields to query as well as the order of our values when instantiating each result. Twitter does not provide a means to filter out fields so we will use #fields only to order the values we pass to the model builders.

class DataMapper::Adapters::TwitterAdapter
  
  # Map the values from item into an array of the same size and order as Query#fields
  # [id, name, something_else, title] => [1, 'Hello World', nil, 'Cats!']
  #
  def parse_user_values(query, item)
    return query.fields.map { |f| item[f.field.to_s] }
  end
end
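
The ordering step is easy to see in isolation. This is a small sketch in plain Ruby; FieldStub is a hypothetical stand-in for the property objects in Query#fields, and the hash imitates a parsed response.

```ruby
# Hypothetical stand-in for a property: it only knows its field name
FieldStub = Struct.new(:field)

# Pick values out of the parsed item in the order the query asked for them;
# fields missing from the item become nil.
def values_in_field_order(fields, item)
  fields.map { |f| item[f.field.to_s] }
end

fields = [FieldStub.new('user_id'), FieldStub.new('name'), FieldStub.new('url')]
item   = { 'name' => 'Hello World', 'user_id' => '1', 'location' => 'Chicago' }

values = values_in_field_order(fields, item)
# values is ['1', 'Hello World', nil]
```

Extra keys in the response (here 'location') are simply ignored, which is how the adapter copes with a service that cannot filter fields.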

Query#conditions

The meat of most queries, conditions is an array of tuples containing the operator, property and value to be considered when executing the request. Each operator may be any of the standard 'SQL' operators (:eql, :in, :gt[e], :lt[e]) and should be used to match the property with one or more values. Because Twitter only provides a method to query individual users by id, email or screen name we can write a method to create an array of key/value pair queries.

class DataMapper::Adapters::TwitterAdapter

  def generate_users_query(query)
    result = Array.new
    fields = ['user_id', 'email', 'screen_name']
    
    conditions = query.conditions.select do |condition|
      condition[0] == :eql and fields.include?(condition[1].field)
    end
    
    # each item in conditions is a [operator, property, value] tuple
    for operator, property, value in conditions
      # if an array, each value must be queried individually
      [value].flatten.each { |v| result << [property.field, v] }
    end
    
    return result
  end

end
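
To see what this filtering produces, here is a standalone sketch that runs the same logic over hand-built tuples. PropertyStub and pairs_from_conditions are hypothetical names for this illustration, and the tuples are written by hand rather than taken from a real query object.

```ruby
# Hypothetical stand-in for a property: it only knows its field name
PropertyStub = Struct.new(:field)

# Keep only :eql conditions on fields the service can look up, then
# expand array values into one [field, value] pair per lookup.
def pairs_from_conditions(conditions, allowed = ['user_id', 'email', 'screen_name'])
  pairs = []
  conditions.each do |operator, property, value|
    next unless operator == :eql && allowed.include?(property.field)
    [value].flatten.each { |v| pairs << [property.field, v] }
  end
  pairs
end

conditions = [
  [:eql, PropertyStub.new('screen_name'), ['kscollective', 'twitter']],
  [:gt,  PropertyStub.new('followers_count'), 100],
  [:eql, PropertyStub.new('location'), 'Chicago'],
  [:eql, PropertyStub.new('user_id'), 12]
]

pairs = pairs_from_conditions(conditions)
# pairs is [['screen_name', 'kscollective'], ['screen_name', 'twitter'], ['user_id', 12]]
```

Conditions the service cannot answer are silently dropped in this sketch, so a query on location alone would produce no requests at all.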

Mind The Gap!

What used to be the hardest part of using third-party resources has now become a matter of building the request, parsing the response and loading the model.

def read(query, set, many = true)
  queries = generate_users_query(query)

  for key, value in queries
    twitter_user = request("users/show.xml", {key => value})
    next if twitter_user.blank? or twitter_user['screen_name'].blank?
    user_values = parse_user_values(query, twitter_user)
    # #read_one needs the loaded instance back; #read_many only fills the collection
    return set.load(user_values, query) unless many
    set.load(user_values)
  end

  return
end
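
The control flow of #read can be exercised without DataMapper or the network. The sketch below swaps in a stubbed service and a plain array for the collection; FAKE_USERS, fake_request and fake_read are hypothetical names, and the response hash is only assumed to resemble what a parsed response looks like.

```ruby
# Stubbed "service": screen name => parsed response (shape assumed for illustration)
FAKE_USERS = {
  'kscollective' => {
    'screen_name' => 'kscollective',
    'url'         => 'http://www.killswitchcollective.com'
  }
}

# Stands in for the HTTP request; only screen_name lookups are answered
def fake_request(key, value)
  key == 'screen_name' ? FAKE_USERS[value] : nil
end

# pairs: [[key, value], ...]; set: any object responding to <<
def fake_read(pairs, set, many = true)
  pairs.each do |key, value|
    user = fake_request(key, value)
    next if user.nil? || user['screen_name'].to_s.empty?
    return user unless many   # single read: hand back the first hit
    set << user               # collection read: keep filling the set
  end
  nil
end

pairs = [['screen_name', 'nobody'], ['screen_name', 'kscollective']]

one  = fake_read(pairs, nil, false)
many = []
fake_read(pairs, many, true)
```

A single read stops at the first usable response and returns it, while a collection read returns nothing and leaves its results in the set it was given.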

Recess!

At this point you should be able to query users from Twitter without parsing a single query string or XML response.

DataMapper.setup(:default, {:adapter => 'twitter'})

user = User.first(:screen_name => 'KSCollective')
puts user.url # => http://www.killswitchcollective.com

As you can see, DataMapper can make working with third-party repositories as automagical as ActiveRecord. DataMapper itself comes with adapters for the major databases, with additional adapters available for CouchDB, Google Video and many others. In a few weeks we will continue building our Twitter adapter, leveraging the power of associations. Until then I highly recommend reviewing DataMapper's documentation, the adapters included with the dm-core gem and the base adapter we have just built.
