Searching

Basic Searching

Once you’ve got an index set up on your model and the Sphinx daemon running, you can start searching, using a method on your model named just that: search.

Article.search 'pancakes'

Sphinx does have some reserved characters (including the @ character), so you may need to escape your query terms. Riddle (a dependency of Thinking Sphinx) has escaping methods built-in:

# For Thinking Sphinx v3 or newer:
Article.search Riddle::Query.escape(params[:query])
# For Thinking Sphinx before v3:
Article.search Riddle.escape(params[:query])
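
To give a feel for what escaping does, here is a minimal sketch in plain Ruby. The escape_query helper and its character list are assumptions made for illustration - this is not Riddle’s actual implementation, so stick with Riddle’s own methods in real code.

```ruby
# Illustrative stand-in for Riddle's escaping, NOT its actual implementation.
# escape_query is a hypothetical helper name.
def escape_query(query)
  # Assumed list of characters that carry meaning in Sphinx's query syntax.
  reserved_characters = /[()|\-!@~"\/^$\\<>&=?]/

  # Prefix each reserved character with a backslash so it is read literally.
  query.gsub(reserved_characters) { |character| "\\#{character}" }
end

puts escape_query('pat@example.com')
```

Plain words pass through untouched; only the reserved characters gain a leading backslash.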

Please note that Sphinx paginates search results, and the default page size is 20. You can find more information further down in the pagination section.

Field Conditions

To focus a query on a specific field, you can use the :conditions option - much like in ActiveRecord (back before Rails 3, anyway):

Article.search :conditions => {:subject => 'pancakes'}

You can combine both field-specific queries and generic queries too:

Article.search 'pancakes', :conditions => {:subject => 'tasty'}

Please keep in mind that Sphinx does not support SQL comparison operators - it has its own query language. The :conditions option must be a hash, with each key a field and each value a string.
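
One way to picture why each value must be a string: in Sphinx’s extended query syntax, a field-specific term is written as @field followed by the text to match. The sketch below shows how a generic query and a conditions hash could be folded into one such query string. build_query is a hypothetical helper for illustration only, and Thinking Sphinx’s own query building may well differ in the details.

```ruby
# Sketch of combining generic terms with field conditions.
# build_query is a hypothetical helper, not a Thinking Sphinx method.
def build_query(terms, conditions = {})
  # Each condition becomes a field-limited clause, e.g. "@subject tasty".
  field_clauses = conditions.map { |field, value| "@#{field} #{value}" }

  # Drop an empty generic query so we don't emit a leading space.
  [terms, *field_clauses].reject { |part| part.to_s.empty? }.join(' ')
end

puts build_query('pancakes', :subject => 'tasty')
puts build_query('', :subject => 'pancakes')
```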

Attribute Filters

Filters on attributes can be defined using a similar syntax, but using the :with option.

Article.search 'pancakes', :with => {:author_id => @pat.id}

Filters have the advantage over focusing on specific fields in that they accept arrays and ranges:

Article.search 'pancakes', :with => {
  :created_at => 1.week.ago..Time.now,
  :author_id  => @fab_four.collect { |author| author.id }
}
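
If it helps to see the matching rules spelled out, the following plain-Ruby sketch simulates them against a hash of attribute values. It is only a model of the behaviour described above (Sphinx does the real filtering), and matches_with? is a hypothetical helper name.

```ruby
# Simulates :with semantics on a plain hash; Sphinx does the real filtering.
# matches_with? is a hypothetical helper for illustration.
def matches_with?(attributes, filters)
  filters.all? do |name, expected|
    actual = attributes.fetch(name)

    case expected
    when Range then expected.cover?(actual)   # ranges match anything inside them
    when Array then expected.include?(actual) # arrays match any listed value
    else            expected == actual        # single values must match exactly
    end
  end
end

article = { :author_id => 7, :created_at => 1_700_000_000 }

puts matches_with?(article, :author_id => [3, 7, 9])
puts matches_with?(article, :created_at => 1_600_000_000..1_650_000_000)
```

Every filter in the hash has to pass, which is why adding more filters only ever narrows the results.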

And of course, you can mix and match global terms, field-specific terms, and filters:

Article.search 'pancakes',
  :conditions => {:subject => 'tasty'},
  :with       => {:created_at => 1.week.ago..Time.now}

If you wish to exclude specific attribute values, then you can specify them using :without:

Article.search 'pancakes',
  :without => {:user_id => current_user.id}

For matching multiple values in a multi-value attribute, :with doesn’t quite do what you want. Give :with_all a try instead:

Article.search 'pancakes',
  :with_all => {:tag_ids => @tags.collect(&:id)}

You can also perform combination AND and OR matches with :with_all using nested arrays:

# All pancakes belonging to tag 3 and to either tag 1 or tag 2
Article.search 'pancakes',
  :with_all => {:tag_ids => [[1,2], 3]}
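
The nested-array rule can be read as "every top-level entry must be satisfied, and an inner array is satisfied by any one of its values". Here is a small sketch of that logic in plain Ruby, assuming matches_with_all? as a hypothetical helper name - it models the semantics only, not how Sphinx evaluates the filter.

```ruby
# Models :with_all semantics for a multi-value attribute.
# matches_with_all? is a hypothetical helper for illustration.
def matches_with_all?(attribute_values, requirements)
  requirements.all? do |requirement|
    # A bare value is treated as a one-element list; an inner array means OR.
    Array(requirement).any? { |value| attribute_values.include?(value) }
  end
end

puts matches_with_all?([1, 3], [[1, 2], 3]) # tag 3 plus one of tags 1 or 2
puts matches_with_all?([1, 2], [[1, 2], 3]) # tag 3 is missing
```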

Application-Wide Search

You can use all the same syntax to search across all indexed models in your application:

ThinkingSphinx.search 'pancakes'

If you’re using a version of Thinking Sphinx prior to 1.2, you will need to use a slightly deeper namespaced method: ThinkingSphinx::Search.search.

This search will return all objects that match, no matter what model they are from, ordered by relevance (unless you specify a custom order clause, of course). Don’t expect references to attributes and fields to work perfectly if they don’t exist in all the models.

If you want to limit global searches to a few specific models, you can do so with the :classes option:

ThinkingSphinx.search 'pancakes', :classes => [Article, Comment]

Pagination

Sphinx paginates search results by default. Indeed, there’s no way to turn it off (but you can request really big pages should you wish). The parameters for pagination in Thinking Sphinx are exactly the same as Will Paginate: :page and :per_page.

Article.search 'pancakes', :page => params[:page], :per_page => 42

The output of search results can be used with Will Paginate’s view helper as well, just to keep things nice and easy.

# in the controller:
@articles = Article.search 'pancakes'

# in the view:
will_paginate @articles

Match Modes

Thinking Sphinx v3 and newer use Sphinx’s SphinxQL for querying, and that always uses the extended match mode, which is covered in detail in the Sphinx documentation.

Thinking Sphinx v1/v2

Note: If you are using an older version of Thinking Sphinx, then you have simpler match modes available, which are covered both in the Sphinx documentation and here.

Ranking Modes

Sphinx also has a few different ranking modes (again, the Sphinx documentation is the best source of information on these). They can be set using the :ranker option (or :rank_mode if you’re using Thinking Sphinx v2 or older):

Article.search "pancakes", :ranker => :bm25

Ranking modes include the following (though the definitive list is in the Sphinx documentation):

:proximity_bm25

The default ranking mode, which combines both phrase proximity and BM25 ranking (see below).

:bm25

A statistical ranking mode, similar to most other full-text search engines.

:none

No ranking - every result has a weight of 1.

:wordcount (since 0.9.9rc1)

Ranks results purely on the number of times the keywords are found in a document. Field weights are factored in.

:proximity (since 0.9.9rc1)

Ranks documents by raw proximity value.

:matchany (since 0.9.9rc1)

Returns rankings calculated in the same way as a match mode of :any.

:fieldmask (since 0.9.9rc2)

Returns rankings as a 32-bit mask with the N-th bit corresponding to the N-th field, numbering from 0. The bit will only be set when any of the keywords match the respective field. If you want to know which fields match your search for each document, this is the only way.
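
To make the :fieldmask description a little more concrete, here is a sketch of decoding such a mask back into field names. It assumes you already have the mask as an integer and a list of field names in the same order as they appear in the index; matched_fields is a hypothetical helper, not part of Thinking Sphinx.

```ruby
# Decodes a :fieldmask weight into the names of the matching fields.
# matched_fields is a hypothetical helper; field_names must be in index order.
def matched_fields(mask, field_names)
  field_names.each_with_index
             .select { |_name, index| mask[index] == 1 } # Integer#[] reads one bit
             .map(&:first)
end

fields = %w[subject tags content]

# 0b101 has bits 0 and 2 set, so the first and third fields matched.
p matched_fields(0b101, fields)
```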

Sorting

By default, Sphinx sorts by how relevant it believes the documents to be to the given search keywords. However, you can also sort by attributes (and fields flagged as sortable) or custom mathematical expressions.

Sorting expressions are much like SQL’s ORDER BY clause - an attribute followed by a direction:

Article.search 'pancakes', :order => 'created_at DESC'

If you supply an attribute as a symbol, it’s presumed you want the results in ascending order:

Article.search "pancakes", :order => :created_at
# is equivalent to
Article.search "pancakes", :order => 'created_at ASC'

If you want to use a custom expression to define your sorting order, you need to declare that as a dynamic attribute:

ThinkingSphinx.search(
  :select => '*, weight() * 10 + document_boost as custom_weight',
  :order  => 'custom_weight DESC'
)

And as shown in the above example, Sphinx’s calculated ranking is available via the weight() function. If all you want is to refer to that directly when sorting, you need to give it an alias:

ThinkingSphinx.search(
  :select => '*, weight() as w', :order  => 'w DESC'
)

Sphinx 2.0.x

Note: If you are using a version of Sphinx prior to 2.1.1, then the ranking is available instead via the internal attribute `@weight`.

Thinking Sphinx v1/v2

Note: If you are using an older version of Thinking Sphinx, then you have further sorting options available, which are covered both in the Sphinx documentation and here.

Field Weights

Sphinx has the ability to weight fields with differing levels of importance. You can set this using the :field_weights option in your searches:

Article.search "pancakes", :field_weights => {
  :subject => 10,
  :tags    => 6,
  :content => 3
}

You don’t need to specify all fields - any fields not given a value keep the default weighting of 1.
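
As a rough illustration of how weights and the default of 1 interact, the sketch below uses the arithmetic of the :wordcount ranker described earlier: keyword occurrences per field, multiplied by that field’s weight, then summed. The default ranker is more involved than this, and both weighted_wordcount and the :summary field are made up for the example.

```ruby
# Wordcount-style scoring: occurrences per field times the field's weight.
# weighted_wordcount is a hypothetical helper for illustration.
def weighted_wordcount(occurrences, field_weights = {})
  occurrences.sum do |field, count|
    count * field_weights.fetch(field, 1) # fields without a weight default to 1
  end
end

occurrences = { :subject => 1, :tags => 2, :content => 4, :summary => 1 }
weights     = { :subject => 10, :tags => 6, :content => 3 }

# 1*10 + 2*6 + 4*3 + 1*1
puts weighted_wordcount(occurrences, weights)
```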

If you’d like the same custom weightings to apply to all searches, it’s best to set these through a default Sphinx scope. If you’re using a version prior to 3.0, you can specify these defaults in your index definition (see below), but given this is something related to searching rather than indexing, a default scope is a more appropriate option.

set_property :field_weights => {
  :subject => 10,
  :tags    => 6,
  :content => 3
}

Search Results Information

If you’re building your own pagination output, then you can find out the statistics of your search using the following accessors:

@articles = Article.search 'pancakes'
# Number of matches in Sphinx
@articles.total_entries
# Number of pages available
@articles.total_pages
# Current page index
@articles.current_page
# Number of results per page
@articles.per_page
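
If you’re rolling your own pagination, the arithmetic behind these numbers is simple enough to sketch. The helpers below are hypothetical and only show the usual calculations (page count and result offset); they aren’t taken from Thinking Sphinx itself.

```ruby
# Number of pages needed to show every match.
def total_pages(total_entries, per_page)
  (total_entries / per_page.to_f).ceil
end

# Zero-based offset of the first result on a page. A missing or invalid
# page (nil, 0, negative) is treated as the first page.
def offset_for(page, per_page)
  ([page.to_i, 1].max - 1) * per_page
end

puts total_pages(85, 20)
puts offset_for(3, 20)
```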

Grouping / Clustering

Sphinx allows you to group search records that share a common attribute, which can be useful when you want to show aggregated collections. For example, if you have a set of posts and they are all part of a category and have a category_id, you could group your results by category id and show a set of all the categories matched by your search, as well as all the posts. You can read more about it in the official Sphinx documentation.

For grouping to work, you need to pass in the :group_by parameter.

Searching posts, for example:

Post.search 'syrup', :group_by => :category_id

By default, this will return your Post objects, but one per category_id. If you want to sort by how many posts each category contains, you can pass in :order_group_by:

Post.search 'syrup',
  :group_by       => :category_id,
  :order_group_by => 'count(*) desc'

Sphinx 2.0.x

Note: If you are using a version of Sphinx prior to 2.1.1, then the group count is available instead via the internal attribute `@count`.

Once you have the grouped results, you can enumerate each result along with its group value, the number of objects that matched that group value, or both, using the following methods respectively:

posts.each_with_group           { |post, group| }
posts.each_with_count           { |post, count| }
posts.each_with_group_and_count { |post, group, count| }
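
To picture what a grouped search hands back, here is a plain-Ruby simulation over an array of hashes: one representative post per category_id, the group value, and the match count, ordered by count descending. It is only a model - Sphinx does the grouping for real, and which post represents each group is simplified here to the first one seen. group_with_counts is a hypothetical helper.

```ruby
# Simulates grouped results: [representative record, group value, count],
# ordered by count descending. Hypothetical helper for illustration.
def group_with_counts(records, attribute)
  records.group_by { |record| record[attribute] }
         .map      { |group, members| [members.first, group, members.size] }
         .sort_by  { |_record, _group, count| -count }
end

posts = [
  { :id => 1, :category_id => 10 },
  { :id => 2, :category_id => 20 },
  { :id => 3, :category_id => 10 },
  { :id => 4, :category_id => 10 },
  { :id => 5, :category_id => 20 },
  { :id => 6, :category_id => 30 }
]

group_with_counts(posts, :category_id).each do |post, group, count|
  puts "post #{post[:id]} represents category #{group} (#{count} matches)"
end
```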

Sphinx’s SphinxQL syntax only allows for grouping on a single attribute - but that attribute can be generated in the SELECT part of the query itself:

ThinkingSphinx.search(
  :select   => '*, MAX(foo, bar) AS grouping',
  :group_by => 'grouping'
)

Thinking Sphinx v1/v2

Note: If you are using an older version of Thinking Sphinx, then you have further grouping options available, which are covered both in the Sphinx documentation and here.

Searching for Object Ids

If you would like just the primary key values returned, instead of instances of ActiveRecord objects, you can use all the same search options in a call to search_for_ids instead.

Article.search_for_ids 'pancakes'
ThinkingSphinx.search_for_ids 'pancakes'

Search Counts

If you just want the number of matches, instead of the matched objects themselves, then you can use the search_count method (which accepts all the same arguments as a normal search call). If you’re searching globally, then use the ThinkingSphinx.count method.

Article.search_count 'pancakes'
ThinkingSphinx.count 'pancakes'

Avoiding Nil Results

Thinking Sphinx tries its hardest to make sure Sphinx knows when records are deleted, but sometimes stale objects slip through the gaps. To get around this, Thinking Sphinx has the option of retrying searches.

To enable this, you can set :retry_stale to true, and Thinking Sphinx will make up to three tries at retrieving a full result set that has no nil values. If you want to change the number of tries, set :retry_stale to an integer.

And obviously, this can be quite an expensive call (as it instantiates objects each time), but it provides a better end result in some situations.

Article.search 'pancakes', :retry_stale => true
Article.search 'pancakes', :retry_stale => 1
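
The idea behind retrying can be sketched without Sphinx at all. In the simulation below, indexed_ids stands in for what Sphinx believes exists and database for what is actually there; each try drops the ids that turned out to be stale and fetches a fresh page. All names are hypothetical and this is not Thinking Sphinx’s implementation, just the general shape of the approach.

```ruby
# Simulates retrying a search until a page has no nil (stale) records.
# search_with_retries, indexed_ids and database are hypothetical stand-ins.
def search_with_retries(indexed_ids, database, per_page:, tries: 3)
  stale_ids = []
  records   = []

  tries.times do
    # Ask for a page of ids, skipping any already known to be stale.
    ids     = (indexed_ids - stale_ids).first(per_page)
    records = ids.map { |id| database[id] }

    break unless records.include?(nil)

    # Remember which ids had no matching record, then try again.
    stale_ids.concat(ids.reject { |id| database.key?(id) })
  end

  records
end

indexed_ids = [1, 2, 3, 4, 5]
database    = { 1 => 'Article 1', 3 => 'Article 3', 4 => 'Article 4', 5 => 'Article 5' }

p search_with_retries(indexed_ids, database, :per_page => 3)
```

Each retry repeats the fetch and the record lookup, which is where the extra cost comes from.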

Automatic Wildcards

If you’d like your search keywords to be wildcards for every search, you can use the :star option, which automatically prepends and appends wildcard stars to each word.

Article.search 'pancakes waffles', :star => true
# => becomes '*pancakes* *waffles*'
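
For a sense of what that transformation involves, here is a deliberately simplified sketch. add_stars is a hypothetical helper; the real option has to cope with operators, escaped characters and the like, so treat this as an approximation rather than the library’s behaviour.

```ruby
# Simplified approximation of the :star transformation.
# add_stars is a hypothetical helper, not a Thinking Sphinx method.
def add_stars(query)
  # Wrap every run of word characters in wildcard stars.
  query.gsub(/\w+/) { |word| "*#{word}*" }
end

puts add_stars('pancakes waffles')
```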

Errors

If you construct a query that Sphinx cannot understand, or if the connection fails, an instance of ThinkingSphinx::SphinxError will be raised.

Some specific types of errors are given specific subclasses - ThinkingSphinx::QueryError, ThinkingSphinx::SyntaxError and ThinkingSphinx::ParseError. The message in any of these errors will give you more detail on what’s gone wrong.

Thinking Sphinx v1/v2

Note: If you are using an older version of Thinking Sphinx, then errors are handled differently.

Advanced Options

Thinking Sphinx accepts the following advanced Sphinx arguments:

  • :cutoff
  • :retry_count and :retry_delay
  • :max_query_time
  • :comment

Thinking Sphinx v1/v2 also allows for the :id_range option.

If you want to set additional arguments for the underlying SQL call when translating Sphinx results into ActiveRecord objects (:include, :joins, :select, :order), you can put these within the :sql option:

Article.search :sql => {:include => :user}

And finally - to avoid lazily loading search results and make sure Thinking Sphinx processes the search query immediately, use the :populate option:

Article.search 'pancakes', :populate => true
# is equivalent to
Article.search('pancakes').populate

This is particularly useful to ensure exceptions are raised where you expect them to be.