I was working on a large community-driven website recently, which had always had the requirement to synchronize its user database with another (3rd party) site that we didn't control. The most we could get out of the other site was for them to send us regular XML dumps of the changes (additions, updates, deletions). Predictably, we wrote that as a rake task and added it to cron.
@hourly cd /apps/product/current && export RAILS_ENV=production && rake product:synchronize_database
It worked great, until the client threw in ONE additional little snag; not only should it check for updates every hour, but an admin should be able to log in to the site and request an immediate synchronization. Unfortunately, this task can take anywhere from 10 minutes to 10 hours, depending on the size of the XML file and the number of employees involved. Clearly not a job for a simple backtick or %x{}.
There are a number of different options for running background processes in Rails, but since I had pretty simple requirements (and needed quick results) the best choice for me was clearly Spawn.
You can install the plugin from rubyforge:
script/plugin install http://spawn.rubyforge.org/svn/spawn/
Then implementing the client's request was as simple as adding a button with a remote_function and a controller action with:
spawn(:nice => 7) do
  exec("cd #{RAILS_ROOT} && export RAILS_ENV=#{RAILS_ENV} && rake product:synchronize_database")
end
The :nice option (vital, in my case!) will make sure that your process doesn't monopolize the CPU (just like its shell counterpart). There are only a couple of other options; you can choose to fork or thread your process (fork is the default), and you can wait for it to finish. Neither was necessary in my case. Problem solved and, in typical Rails fashion, it only took a few lines of code!
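If you're curious what "fork, renice, and don't wait" boils down to, here is a minimal sketch using nothing but Ruby's standard library. To be clear, this is not Spawn's actual implementation: run_in_background is a hypothetical helper, and treating the nice value as an increment on top of the current priority (the way the shell's nice command does) is my assumption.

```ruby
# Hypothetical helper, NOT part of the Spawn plugin: fork a child, lower
# its priority, run the block, and let the parent carry on without waiting.
def run_in_background(nice: 7)
  pid = Process.fork do
    current = Process.getpriority(Process::PRIO_PROCESS, 0)
    # Higher numbers mean lower priority; 19 is the usual ceiling.
    Process.setpriority(Process::PRIO_PROCESS, 0, [current + nice, 19].min)
    yield
  end
  # Detach so the finished child gets reaped instead of lingering as a zombie.
  # The returned thread can be joined if you do want to wait after all.
  Process.detach(pid)
end

# Usage (returns immediately):
# run_in_background { exec("rake product:synchronize_database") }
```

A forked child gets its own copy of the process, which is why a 10-hour job can outlive the request that started it. The price is that the child needs its own database connection, something the plugin takes care of and this sketch ignores.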
Just give me the globalize_with_google plugin now!
Anybody who's looked into localizing or internationalizing a Rails app has probably come across the "Globalize" plugin. It's a bit of an 800 lb. gorilla in the sense that it supports potentially hundreds of languages, automatic generation of validation messages, and even multiple pluralization cases based on the exact number of objects being counted. (There's a story about a people whose language only had three numbers- 1, 2, and 'many'. Globalize can handle that!) But as long as installation is as easy as "script/plugin install ...", who cares how much the gorilla weighs?
On a related note, Google recently released a series of AJAX APIs that are dead-simple to plug in to any web app, including one that does automatic translation. Can you guess where I'm going with this?
As soon as I saw Google's announcement that they were offering a free translation API, I started thinking about how to write a plugin that used it to initialize a Globalize database.
My solution, as sketched on the back of a napkin, had two pieces: The first would override Globalize's "String.translate" method. The other one would cache the translations so we still had a checklist of phrases for professional translators to go over, if necessary, and so we weren't dependent on the uptime of Google's servers for the functionality of our application. (Not that Google has lousy uptime; but if by chance they ever take down the service or start charging for translations, we can't have our translations just turn off).
The Actual Translation
This part was the easiest. We just modify Globalize's ".t" method to use Google's translation service:
module String
  def self.included(base)
    base.send :alias_method_chain, :translate, :google
    base.send :alias_method, :t, :translate
  end

  def translate_with_google(default = nil, arg = nil)
    local_base_language = defined?(BASE_LANGUAGE) ? BASE_LANGUAGE : 'en'
    # don't translate this if it's already written in the target language
    return self if Locale.language.iso_639_1 == local_base_language
    result = Locale.translate(self, '__translate__', arg)
    return result unless result == '__translate__'
    return %Q{<span id="translation_#{self.object_id}">#{self}</span> <script type="text/javascript"> ......}
  end
end
The only flaw is that you can't use this on the labels of buttons or in javascript alert()s. Instead of showing a translated string, it would display a huge mess of javascript. I don't think there's a simple workaround for this, though, since the ".t" method can't know what context it is being called in. So in your views, make sure all of your translated buttons use something like
<input type="submit" value="<%= "Submit".translate_without_google %>" />
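In case alias_method_chain looks like magic, here is roughly what it does, written out by hand in plain Ruby with no ActiveSupport. Phrase, its CACHE table, and MachineFallback are made-up names for illustration only; the real plugin chains onto Globalize's String extension as shown above.

```ruby
# Made-up class standing in for a string that knows how to translate itself.
class Phrase
  CACHE = { "Hello" => "Bonjour" }.freeze

  def initialize(text)
    @text = text
  end

  # The "original" method: a stored translation, or nil if there is none.
  def translate
    CACHE[@text]
  end
end

module MachineFallback
  def self.included(base)
    base.class_eval do
      # By hand, what alias_method_chain :translate, :fallback would do:
      alias_method :translate_without_fallback, :translate
      alias_method :translate, :translate_with_fallback
    end
  end

  # New behaviour wrapped around the original method.
  def translate_with_fallback
    translate_without_fallback || %Q{<span class="untranslated">#{@text}</span>}
  end
end

Phrase.send :include, MachineFallback
```

After the include, Phrase.new("Hello").translate still comes from the table, anything else gets the placeholder markup, and the untouched original stays reachable as translate_without_fallback. That last part is exactly why the submit button above can call translate_without_google.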
The Caching
This part nearly killed me. How do you cache the result of a Google translation? It never goes through our server! The solution was a little convoluted, but very educational to a guy who had never written a plugin before.
First, we need to make the Javascript report the result of each translation back to our server. Fortunately, Google's "translate" function offers a callback once the translation is complete. So I just told it to execute the following:
new Ajax.Request('/cache_google_translation',{method: 'post', parameters: "phrase=#{self}&translation="+result.translation});
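One caveat with that line: the phrase is interpolated raw, so a phrase containing a double quote, an ampersand, or an equals sign would break either the JavaScript or the parameter string. Here is a sketch of a safer way to build the snippet. cache_callback_js is a hypothetical helper, not something in the plugin; it leans on the fact that Prototype's Ajax.Request also accepts a plain object for parameters and URL-encodes it for you.

```ruby
require 'json'

# Hypothetical helper: builds the JavaScript callback for one phrase.
def cache_callback_js(phrase)
  # String#to_json yields a quoted, escaped JavaScript string literal.
  # Rewriting "</" stops a phrase from closing the surrounding <script> tag.
  literal = phrase.to_json.gsub('</') { '<\/' }
  "new Ajax.Request('/cache_google_translation', " \
    "{method: 'post', parameters: {phrase: #{literal}, translation: result.translation}});"
end

puts cache_callback_js(%q{Save & "Continue"})
```

Handing Prototype an object instead of a hand-built string means neither the phrase nor the translation needs manual URL encoding.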
Next, we need a way for our Rails app to recognize the request for caching. But how can a plugin respond to a request like a controller does? It takes two steps. First you need to make a pseudo-controller that will do the caching:
class TricksController < ActionController::Base
  def cache_google_translation
    bound_vars = [params[:translation], params[:phrase]]
    ActiveRecord::Base.connection.execute("UPDATE globalize_translations SET built_in = 2, text = ? WHERE tr_key = ? AND language_id = #{Locale.language.id}".gsub('?'){ActiveRecord::Base.connection.quote(bound_vars.shift)})
    Locale.translator.put_in_cache(params[:phrase],Locale.language.iso_639_1,params[:translation])
    render :text => ''
  end
end
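The gsub('?') trick in that UPDATE is worth a second look: each question mark is replaced, in order, by the next quoted value. Here it is in isolation. The quote method below is a deliberately simplified stand-in for ActiveRecord::Base.connection.quote (it only doubles single quotes), and bind is a hypothetical name.

```ruby
# Simplified stand-in for the connection's quote method: wraps the value
# in single quotes and doubles any single quotes inside it.
def quote(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Replace each "?" with the next value, quoted, left to right.
def bind(sql, values)
  remaining = values.dup   # don't mutate the caller's array
  sql.gsub('?') { quote(remaining.shift) }
end

puts bind("UPDATE globalize_translations SET text = ? WHERE tr_key = ?",
          ["l'eau", "water"])
```

Note that the values have to be listed in the order the placeholders appear, which is why the controller puts the translation before the phrase.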
And then you need to extend Rails' route parser to attach a URL to your controller. (alias_method_chain to the rescue!)
module MapperExtensions
  def self.included(base)
    base.send :alias_method_chain, :initialize, :google_caching
  end

  def initialize_with_google_caching(set)
    # we have to add ours FIRST, otherwise the final line of the regular routes.rb
    # is usually a catchall that would intercept OUR route
    set.add_route('/cache_google_translation',{:controller => 'google/tricks', :action => 'cache_google_translation'})
    initialize_without_google_caching(set)
  end
end
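Why does the order matter so much? Routes are tried top to bottom and the first match wins. This toy router (ToyRouter is entirely made up and far simpler than Rails' real route set) shows what happens when the catch-all gets registered first.

```ruby
# Toy first-match-wins router, for illustration only.
class ToyRouter
  def initialize
    @routes = []
  end

  def add_route(pattern, target)
    @routes << [pattern, target]
    self
  end

  # Returns the target of the first pattern that matches, or nil.
  def recognize(path)
    _, target = @routes.find { |pattern, _| pattern.match?(path) }
    target
  end
end

CACHE_ROUTE = %r{\A/cache_google_translation\z}
CATCH_ALL   = %r{\A/.*\z}

ours_first = ToyRouter.new
ours_first.add_route(CACHE_ROUTE, 'google/tricks').add_route(CATCH_ALL, 'catch_all')

ours_last = ToyRouter.new
ours_last.add_route(CATCH_ALL, 'catch_all').add_route(CACHE_ROUTE, 'google/tricks')

puts ours_first.recognize('/cache_google_translation')
puts ours_last.recognize('/cache_google_translation')
```

With our route added first the request reaches the caching action; with it added last the catch-all swallows it.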
Finally, in your plugin's init file you just attach these classes into Rails:
ActionController::Routing::RouteSet::Mapper.send :include, Google::MapperExtensions
ActionView::Helpers::AssetTagHelper.send :include, Google::Javascript
And that's it! Well, not quite. Did you notice the reference to Locale.translator.put_in_cache? If you want to make sure that the auto-translations in your database are easily distinguishable so that you can have them manually translated later (machine translation isn't quite there yet!) then you have to add an extra step. It was easy enough to use a manual update statement instead of Locale.set_translation, which allowed me to set "built_in = 2" (that's how you recognize the auto-translations). But then the 800 lb. gorilla gets in the way. Globalize maintains a separate cache of translations in memory to avoid wear and tear on the database, but if you don't update the copy in memory as well, Globalize will never actually USE your cached version! It's a protected variable, so one more module extension:
module LocalizeCacheAccess
  def put_in_cache(key,language,translation)
    @cache["#{key}:#{language}:1"] = translation
  end
end
and then include it in your app with
Globalize::DbViewTranslator.send :include, Google::LocalizeCacheAccess
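To see why updating the database alone isn't enough, here is a toy model of a translator with an in-memory cache sitting in front of its storage. ToyTranslator is invented for this example and is not Globalize's code. The key format, including the trailing ":1", is copied from the snippet above without my claiming to know exactly what that suffix means inside Globalize.

```ruby
# Toy translator: an in-memory cache in front of a "database" hash.
class ToyTranslator
  def initialize(database)
    @database = database   # stands in for the globalize_translations table
    @cache = {}            # consulted first on every lookup
  end

  # Returns the cached value if present; otherwise reads the database
  # (falling back to the key itself) and remembers the answer.
  def fetch(key, language)
    cache_key = "#{key}:#{language}:1"
    return @cache[cache_key] if @cache.key?(cache_key)
    @cache[cache_key] = @database.fetch([key, language], key)
  end

  def put_in_cache(key, language, translation)
    @cache["#{key}:#{language}:1"] = translation
  end
end

db = {}
translator = ToyTranslator.new(db)
translator.fetch("Hello", "fr")           # nothing stored yet, caches "Hello"
db[["Hello", "fr"]] = "Bonjour"           # update the database only
stale = translator.fetch("Hello", "fr")   # still the cached "Hello"
translator.put_in_cache("Hello", "fr", "Bonjour")
fresh = translator.fetch("Hello", "fr")   # now "Bonjour"
```

The first lookup poisons the cache with the untranslated phrase, so every later lookup keeps returning it until the cache entry itself is replaced.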
And that's it! Now you're REALLY done! To get all of this code in a simple Rails plugin, download globalize_with_google.zip and unpack it in #{RAILS_ROOT}/vendor/plugins/.
Installing and configuring the Globalize plugin can be difficult, but it usually goes easily if you follow the instructions. There's nothing I can say about that that's not thoroughly covered elsewhere. Adapting your coding style to use it is almost as easy as advertised.
But once you have it up and running there are a lot of subtle things that can still give you headaches down the line. For many of our international sites, the workflow is this:
- The client goes back and forth on designs, and we make a prototype like usual.
- Before getting started, we install the Globalize plugin. By the time we have the site roughed out and the design is approved, Globalize has automatically generated a list of the unique phrases in our app. (You might want to clear the translations table and re-test every page so your list doesn't include obsolete or phased-out phrases).
- Then we hand the list to our client, who sends it to the team of translators, who send it back to us, and we slurp it into the database with a simple script (just loop through the languages and Locale.set, then Locale.set_translation)
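For the curious, the "simple script" in that last step looks something like the sketch below. Everything about the file layout is an assumption for the example: a pipe-delimited file whose header row holds the locale codes and whose first column holds the English phrase. import_translations is a hypothetical helper, and the two-argument call to set_translation is my assumption about Globalize's API, so check it against the plugin before relying on it.

```ruby
# Hypothetical import helper. `locale` is any object that responds to
# `set` and `set_translation`; in the app that would be Globalize's Locale.
def import_translations(pipe_delimited, locale)
  lines = pipe_delimited.lines.map(&:chomp).reject(&:empty?)
  header, *rows = lines.map { |line| line.split('|', -1) }
  languages = header.drop(1)   # first column holds the English phrase

  languages.each_with_index do |code, index|
    locale.set(code)
    rows.each do |row|
      key = row[0]
      translation = row[index + 1]
      next if translation.nil? || translation.empty?   # not translated yet
      locale.set_translation(key, translation)
    end
  end
end

# Usage inside the Rails app (file name is made up):
# import_translations(File.read("translations.csv"), Locale)
```

Skipping empty cells means a half-finished file from the translators can be loaded without overwriting anything with blanks.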
That's technically all you need to do to globalize your app. But, if you want your application to be as scalable as possible—or you'd like to keep your developers from pulling their hair out—here are some tips.
- If at all possible, never translate phrases longer than 255 characters (the default limit of a "string" column). If you do, you'll have to change globalize_translations.tr_key and globalize_translations.text to a "text" field instead of a "string" (or varchar). That will work just fine, but, if you can avoid having to do that, it will really pay off to be able to add an index to tr_key and language_id.
- Make sure your validation messages are each translated individually, not en masse in your view, or your translators will have extra work. That is, you want to avoid adding this to your translations table: "Errors: Name cannot be blank. Email address is not valid.", and instead have several separate rows like this: "Errors:", "Name cannot be blank", "Email address is not valid" and so forth.
- You may have to turn off view caching in your environments/production.rb if you're running a mongrel cluster, or you might get spontaneous, random language changes as your users navigate the site! (change config.action_controller.perform_caching and config.cache_classes to "false")
- Don't stress too much about translating the word "Browse..." or "Choose File" on your file upload buttons- you can't change that! That's controlled by the user's browser, and it probably looks fine to them already.
- Lastly, here's a handy query for generating statistics on how many translations you've added to the system for each language:
SELECT english_name, COUNT(gt.language_id) AS num_phrases
FROM globalize_languages gl
LEFT JOIN globalize_translations gt ON gl.id = gt.language_id
GROUP BY english_name
ORDER BY english_name
- And this rake task will generate a spreadsheet-checklist of phrases that still require translation.
translation_checklist.rake
Copy it to your lib/tasks folder, and run it to, in this case, make a pipe-delimited checklist, without a column for "English".
rake skip_languages=English delimiter="|" db:translation_checklist > whatever_file.csv

