Mostly Harmless: Lucene Optimization Tip: Reuse Searcher

Wednesday, April 11, 2007

Lucene Optimization Tip: Reuse Searcher

Lucene is a great text search engine, but some aspects of it are not clearly documented. For example, if you have multiple simultaneous requests (e.g., in a web application) searching for different queries, should each thread get its own IndexSearcher, or will one shared IndexSearcher do?

Initially we used a different IndexSearcher for each request, creating a new instance at the start of the request and destroying it soon after the search. After a lot of experimentation and a few exchanges on the Lucene mailing list, I found that the efficient approach is a single IndexSearcher shared across all requests. Multiple concurrent threads can safely invoke the search method on a single IndexSearcher object. Because the searcher reuses its cached data, the single-searcher approach is very efficient in terms of both response time and memory usage. In our case, response times decreased by a factor of 2-3 after this change.
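As a rough illustration of the shared-searcher idea, here is a minimal sketch. To keep it runnable without Lucene on the classpath, it uses a hypothetical `Searcher` interface as a stand-in for IndexSearcher, and `openSearcher`, `SHARED` and `handleRequests` are names made up for this example. In a real application, `openSearcher` would be whatever code opens your IndexSearcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Lucene's IndexSearcher (not the real API).
interface Searcher {
    int search(String query); // toy version: returns a number instead of hits
}

final class SharedSearcherDemo {
    // Counts how many searcher instances were opened.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Stand-in for the expensive step of opening a searcher on an index.
    static Searcher openSearcher() {
        CREATED.incrementAndGet();
        return query -> query.length(); // toy "search": length of the query
    }

    // Opened once and shared by every request thread.
    static final Searcher SHARED = openSearcher();

    // Simulates concurrent requests that all search through the same instance.
    static List<Integer> handleRequests(List<String> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String q : queries) {
                futures.add(pool.submit(() -> SHARED.search(q)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

However many requests run, `CREATED` stays at 1 here. The per-request approach we started with would instead call `openSearcher()` once per query and pay the opening cost every time.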

Remember, however, that the IndexSearcher object needs to be destroyed and recreated after the index is modified or updated. Until the searcher is recreated, it will not reflect the changes made to the index. A good strategy is to periodically destroy and recreate the shared IndexSearcher using a separate Timer thread.
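The periodic refresh could look roughly like the sketch below. It is only an outline under stated assumptions: `RefreshingHolder` and its methods are hypothetical names, the type parameter `T` stands in for IndexSearcher, and the `opener` is whatever code opens a searcher on your index. Only `java.util.Timer` and the other JDK classes are real APIs here.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder for the shared searcher; T stands in for IndexSearcher.
final class RefreshingHolder<T> {
    private final Supplier<T> opener;        // assumed: opens a fresh searcher
    private final AtomicReference<T> current;
    private final Timer timer = new Timer("searcher-refresh", true); // daemon thread

    RefreshingHolder(Supplier<T> opener) {
        this.opener = opener;
        this.current = new AtomicReference<>(opener.get());
    }

    // Request threads call this for each search instead of caching the instance.
    T get() {
        return current.get();
    }

    // Opens the new instance first, then publishes it atomically.
    // Returns the previous instance so the caller can close it once
    // searches still running against it have finished.
    T refresh() {
        T fresh = opener.get();
        return current.getAndSet(fresh);
    }

    // Schedules refresh() on the Timer thread at a fixed period.
    // This sketch discards the old instance; real code would close it safely.
    void refreshEvery(long periodMillis) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                refresh();
            }
        }, periodMillis, periodMillis);
    }

    void stop() {
        timer.cancel();
    }
}
```

Opening the new searcher before swapping means requests never see a missing searcher. The one thing to be careful about is closing the old instance: a search that started just before the swap may still be using it, so it should not be closed immediately.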

Labels: lucene, optimization, programming

# posted by nileshbansal @ 3:42 pm

Comments:

Thanks, Nilesh, for the good post.
I am a Java developer and have also worked with the Lucene search API for my site.
Can you give me some small tips on Lucene that might give a small gain?

Will optimizing the Lucene index help improve performance?

# posted by Nixon Rodrigues : 5:45 am, September 18, 2007

Do you know of any place that explains how to do the same with Zend_Search_Lucene?

Thanks!

# posted by Thadeu : 1:30 pm, December 19, 2007

Good Article on optimization techniques in Lucene...
Lucene Optimization

# posted by Sachin Joshi : 4:09 pm, May 14, 2008

This comment has been removed by a blog administrator.

# posted by Adi : 7:31 am, November 24, 2009

Hi Nilesh,
Rightly said: an IndexSearcher instance can be reused for better performance.
I am facing the same problem. I am using Apache threads to serve requests via CGI. These CGI scripts create an instance of the searcher and return the results. I am not sure how every Apache thread handling CGI can use the same searcher instance.
How did you use the same searcher instance across different web requests?

# posted by Unknown : 5:23 am, March 04, 2010

@Suman - I would suggest using Tomcat (if you use Java) or some app container that allows you to share objects across requests.

# posted by nileshbansal : 1:47 pm, March 04, 2010
