Whether you run an online store, community or news portal, one day you might want to add a search option for your users. And they will expect it to work fast and to provide relevant results.
Problems & Solutions
A Rails application backed by MySQL can run into a problem here. The default storage engine for its tables is InnoDB, and InnoDB (prior to MySQL 5.6) doesn’t support full-text search, so searching certain tables by several text fields at once is not practical.
What can be a possible solution for this situation? There are a few:
- You can convert some tables to MyISAM, which supports full-text search, but it’s not a good idea to keep InnoDB and MyISAM tables on a single server due to possible memory issues.
- You can use MyISAM slaves, but that increases the architectural complexity.
- You can use an external full-text search engine such as Apache Lucene/Solr, Sphinx or Xapian.
At Sphere we tend to use Sphinx. It’s an open source, super fast and reliable solution.
Sphinx Overview
Sphinx is an open-source full-text search server, designed from the ground up with performance, relevance and integration simplicity in mind.
Sphinx lets you either batch index and search data stored in an SQL database, NoSQL storage, or just files quickly and easily — or index and search data on the fly, working with Sphinx pretty much as a database server.
Let’s look at Sphinx’s features:
- High indexing speed (up to 10 MB/sec on modern CPUs)
- High search speed (average query is under 0.1 sec on 2-4 GB text collections)
- High scalability (up to 100 GB of text, up to 100 M documents on a single CPU)
- Supports distributed searching (since v.0.9.6)
- Supports MySQL natively (MyISAM and InnoDB tables are both supported)
- Supports phrase searching
- Supports phrase proximity ranking, providing good relevance
- Supports English and Russian stemming
- Supports any number of document fields (weights can be changed on the fly)
- Supports document groups
- Supports stop words
- Supports different search modes (“match all”, “match phrase” and “match any” as of v.0.9.5)
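To make the difference between the three search modes concrete, here is a toy sketch in plain Ruby. It is only an illustration of the semantics: the matches? helper is hypothetical, and Sphinx evaluates these modes against its own index rather than with Ruby string operations.

```ruby
# Toy model of the "match all", "match any" and "match phrase" modes.
# Hypothetical helper for illustration; not part of Sphinx or any gem.
def matches?(text, query, mode)
  words = text.downcase.scan(/\w+/)   # words of the document
  terms = query.downcase.scan(/\w+/)  # words of the query
  return false if terms.empty?

  case mode
  when :all    then terms.all? { |term| words.include?(term) }
  when :any    then terms.any? { |term| words.include?(term) }
  when :phrase then words.each_cons(terms.size).any? { |slice| slice == terms }
  else raise ArgumentError, "unknown mode: #{mode}"
  end
end

doc = "Senior Ruby developer based in Chicago"

all_hit     = matches?(doc, "chicago ruby", :all)      # both words occur
phrase_miss = matches?(doc, "chicago ruby", :phrase)   # not adjacent in this order
any_hit     = matches?(doc, "ruby python", :any)       # one word is enough
phrase_hit  = matches?(doc, "ruby developer", :phrase) # exact word sequence
```

In this model "match all" ignores word order, while "match phrase" requires the query words to appear next to each other in the same order.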
Sphinx Installation
Let’s get the latest version from the official website and untar it:
wget http://www.sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz
tar -xzf sphinx-0.9.9.tar.gz
After that, we should compile Sphinx from the source:
cd sphinx-0.9.9/
./configure
make
sudo make install
That’s it.
Thinking Sphinx Installation
Now we need to install the Thinking Sphinx gem, written by Pat Allan, to work with Sphinx from our Rails applications. There are some other gems, such as acts_as_sphinx and Ultrasphinx, but they seem to be abandoned.
If you use Rails 2.x, run from the application’s root directory:
script/plugin install git://github.com/freelancing-god/thinking-sphinx.git
If you are already on Rails 3, open Gemfile in the root directory and add the line below:
gem 'thinking-sphinx', :git => 'http://github.com/freelancing-god/thinking-sphinx.git', :require => 'thinking_sphinx', :branch => 'rails3'
And run the following command:
bundle install
The Thinking Sphinx gem adds a few rake tasks to your application. The most important ones are:
rake thinking_sphinx:index – Create the index
rake thinking_sphinx:reindex – Reindex Sphinx without regenerating the configuration file
rake thinking_sphinx:start – Start up Sphinx's daemon
rake thinking_sphinx:stop – Shut down the daemon
Usage
Let’s imagine we have a website with a database of potential ready-to-work candidates.
Every candidate has several documents attached, such as a resume, a cover letter and certificates. We want to search by a candidate’s name, location and the information inside the documents.
We need to set up indexes in our Candidate model:
class Candidate < ActiveRecord::Base
  has_many :documents, :dependent => :destroy

  define_index do
    indexes location
    indexes [first_name, last_name], :as => :name, :sortable => true
    indexes documents.content, :as => :document_content
  end
end
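To picture what this index definition makes searchable, the plain-Ruby sketch below mirrors the three fields for a single candidate. It is only an approximation: Thinking Sphinx assembles the fields itself when it prepares data for Sphinx, and the CandidateRow and DocumentRow structs and the searchable_fields helper are hypothetical names used for illustration.

```ruby
# Hypothetical stand-ins for the Candidate and Document records.
DocumentRow  = Struct.new(:content)
CandidateRow = Struct.new(:first_name, :last_name, :location, :documents)

# Mirrors the shape of the three indexed fields from define_index:
# location, the combined name, and the text of all attached documents.
def searchable_fields(candidate)
  {
    :location         => candidate.location,
    :name             => [candidate.first_name, candidate.last_name].join(" "),
    :document_content => candidate.documents.map(&:content).join(" ")
  }
end

candidate = CandidateRow.new(
  "John", "Smith", "Boston",
  [DocumentRow.new("Resume: Ruby and Rails"), DocumentRow.new("Cover letter")]
)

fields = searchable_fields(candidate)
```

A query such as "smith boston ruby rails" can match this candidate because its words are spread across the name, location and document_content fields.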
Now we can perform a search by calling the search method:
Candidate.search "chicago ruby"
or
Candidate.search "smith boston ruby rails"
As you can see, we added the :sortable parameter to the name field, which allows us to order the search results by it:
Candidate.search("chicago rails", :order => :name)
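The search call accepts further options next to :order. The sketch below collects a few of them in a hash; it assumes the option names :sort_mode, :match_mode, :page and :per_page as used by Thinking Sphinx 1.x/2.x, so check them against the version you install. The candidate_search_options helper is a hypothetical name, not part of the gem.

```ruby
# Hypothetical helper that builds an options hash for Candidate.search.
# Option names are assumed from Thinking Sphinx 1.x/2.x.
def candidate_search_options(page = 1)
  {
    :order      => :name,  # sort by the sortable name field
    :sort_mode  => :desc,  # descending instead of the default ascending
    :match_mode => :any,   # a result needs only one of the query words
    :page       => page,   # pagination: which page of results
    :per_page   => 20      # pagination: results per page
  }
end

options = candidate_search_options(2)

# Inside the Rails application the hash would be passed along as:
#   Candidate.search("chicago rails", options)
```

Keeping the options in one place makes it easier to reuse the same ordering and pagination settings across controllers.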
Conclusion
MySQL can become a blocker when it comes to searching large text fields, and an external full-text search engine might be a good solution. As you can see above, it’s really easy to install Sphinx and start using it with the Thinking Sphinx gem.
Resources
Sphinx’s official site (http://www.sphinxsearch.com/)
Thinking Sphinx gem (http://freelancing-god.github.com/ts/en/)
Thinking Sphinx PDF (http://peepcode.com/products/thinking-sphinx-pdf)
Apache Lucene (http://lucene.apache.org/)
Solr (http://lucene.apache.org/solr/)
Xapian (http://xapian.org/)
