Perspectives

Spring

Get FlexibleCSV from GitHub

A Challenge in Flexibility

As part of a contact management system we are building for a client, I ran into an unusual challenge: letting users upload and import their contacts from CSV files. Usually this would not be a problem, but in this case there was no standard for what the headers would be named or what order the columns would appear in. Because the FasterCSV gem uses header names as access keys, the import suddenly became quite complicated.

One solution would be to build a user interface that displays our database fields alongside the user's CSV columns and lets the user pair them up. For example, my database column is 'email' but their CSV column is 'Email Address', so they could mark those as equivalent. But what would I do for users who have a "Full Name" column when I use 'first_name' and 'last_name' database columns? The user interface could quickly become complicated and confusing.

Introducing FlexibleCSV

Instead, I developed FlexibleCSV, a gem that lets you parse a CSV file without knowing exactly how its headers are named. You provide a list of possible header names for each column, and every CSV column becomes accessible through a uniform interface.

require 'flexible_csv'

# Arbitrary CSV data
csv_data1 = %Q{Full Name, Email Address\nJohn Doe, john@doe.com}
csv_data2 = %Q{Email, Name\njohn@doe.com, John Doe}

parser = FlexibleCsv.new do |csv|
  csv.column :full_name, "Name", "Full Name", "Client Name"
  csv.column :email, "Email", "Email Address"
end

parser.parse(csv_data1).each do |row|
  puts row.full_name #=> 'John Doe'
  puts row.email     #=> 'john@doe.com'
end

parser.parse(csv_data2).each do |row|
  puts row.full_name #=> 'John Doe'
  puts row.email     #=> 'john@doe.com'
end

Both data sets can now be accessed using the uniform #full_name and #email accessors.
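To illustrate the idea behind alias-based header matching, here is a minimal sketch built on Ruby's standard CSV library. It is not FlexibleCSV's actual implementation; the names ALIASES, ALIAS_LOOKUP, normalize_header and parse_with_aliases are hypothetical and exist only for this example. It assumes headers should match regardless of case and surrounding whitespace.

```ruby
require 'csv'

# Hypothetical alias table (not part of FlexibleCSV): each canonical key
# lists the header names it may appear under.
ALIASES = {
  full_name: ["Name", "Full Name", "Client Name"],
  email:     ["Email", "Email Address"]
}.freeze

# Normalize a header so that case and surrounding whitespace do not matter.
def normalize_header(header)
  header.to_s.strip.downcase
end

# Lookup from normalized alias to canonical key.
ALIAS_LOOKUP = ALIASES.each_with_object({}) do |(key, names), lookup|
  names.each { |name| lookup[normalize_header(name)] = key }
end.freeze

# Parse CSV text into an array of hashes keyed by canonical column names.
# Columns whose headers match no alias are dropped.
def parse_with_aliases(text)
  rows = CSV.parse(text)
  return [] if rows.empty?

  headers, *records = rows
  keys = headers.map { |header| ALIAS_LOOKUP[normalize_header(header)] }

  records.map do |record|
    keys.zip(record).each_with_object({}) do |(key, value), out|
      out[key] = value.to_s.strip if key
    end
  end
end

rows = parse_with_aliases("Email, Name\njohn@doe.com, John Doe")
puts rows.first[:full_name] #=> 'John Doe'
puts rows.first[:email]     #=> 'john@doe.com'
```

Resolving the header row once and reusing the resulting key list for every record keeps the per-row work to a simple zip, and the column order in the file no longer matters.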

Handling Complexity with Adapters

Going back to my original example, how would we handle CSV files that separate first and last names when my database uses the full name? Or vice versa? Though I considered adding this kind of functionality to the FlexibleCSV gem, I ultimately thought it best to keep that logic in a separate adapter class. For example:

require 'flexible_csv'

# Arbitrary CSV data
csv_data1 = %Q{Full Name\nJohn Doe}
csv_data2 = %Q{First Name, Last Name\nJohn,Doe}

parser = FlexibleCsv.new do |csv|
  csv.column :full_name, "Name", "Full Name", "Client Name"
  csv.column :first_name, "First Name", "First"
  csv.column :last_name, "Last Name", "Last", "Surname"
end

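class CsvAdapter
  def initialize(row)
    @row = row
  end

  def full_name
    row.full_name || "#{row.first_name} #{row.last_name}"
  end

  def last_name
    row.last_name || row.full_name.split(' ').last
  end

  def first_name
    row.first_name || row.full_name.split(' ').first
  end

  # Forward everything else (e.g. #email) to the wrapped row.
  def method_missing(method_name, *args, &block)
    row.send(method_name, *args, &block)
  end

  def respond_to_missing?(method_name, include_private = false)
    row.respond_to?(method_name, include_private) || super
  end

  private

  # The wrapped row. Defined explicitly so that `row` is a real method
  # rather than a call routed through method_missing.
  attr_reader :row
end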

parser.parse(csv_data1).each do |row|
  ad_row = CsvAdapter.new(row)
  puts ad_row.full_name  #=> 'John Doe'
  puts ad_row.first_name #=> 'John'
  puts ad_row.last_name  #=> 'Doe'
end

parser.parse(csv_data2).each do |row|
  ad_row = CsvAdapter.new(row)
  puts ad_row.full_name  #=> 'John Doe'
  puts ad_row.first_name #=> 'John'
  puts ad_row.last_name  #=> 'Doe'
end

Using the adapter class, we can once again access each row of data from any CSV file with a uniform interface.
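Because the adapter only needs the row to respond to #full_name, #first_name and #last_name, it can be tried out without a CSV file at all. The sketch below is a variation on the adapter above, not part of FlexibleCSV: it uses SimpleDelegator from Ruby's standard library in place of a hand-written method_missing, and a Struct named FakeRow as a stand-in for a parsed row. Both FakeRow and DelegatingCsvAdapter are hypothetical names, and the assumption that a missing column comes back as nil or a blank string may not match how FlexibleCSV's real rows behave.

```ruby
require 'delegate'

# Stand-in for a parsed row (hypothetical, not FlexibleCSV's row class).
FakeRow = Struct.new(:full_name, :first_name, :last_name, :email,
                     keyword_init: true)

# Wraps a row; any method not defined here is delegated to the row.
class DelegatingCsvAdapter < SimpleDelegator
  def full_name
    present(__getobj__.full_name) ||
      [__getobj__.first_name, __getobj__.last_name].compact.join(' ')
  end

  def first_name
    present(__getobj__.first_name) || full_name.split(' ').first
  end

  def last_name
    present(__getobj__.last_name) || full_name.split(' ').last
  end

  private

  # Treat nil and whitespace-only strings as missing values.
  def present(value)
    value.nil? || value.strip.empty? ? nil : value
  end
end

split_row = DelegatingCsvAdapter.new(FakeRow.new(first_name: "John", last_name: "Doe"))
puts split_row.full_name #=> 'John Doe'

full_row = DelegatingCsvAdapter.new(FakeRow.new(full_name: "John Doe", email: "john@doe.com"))
puts full_row.last_name  #=> 'Doe'
puts full_row.email      #=> 'john@doe.com'
```

Note that splitting on whitespace is only a heuristic: a single-word name yields the same value for both first and last name, and middle names are dropped from both.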

Go Get It!

To use the FlexibleCSV gem, you can follow or fork the project on GitHub or just install the gem:

sudo gem install chrisjpowers-flexible_csv



