Wednesday, April 11, 2007
Lucene Optimization Tip: Reuse Searcher
Lucene is a great text search engine, but some aspects of it are not clearly documented. For example, if you have multiple simultaneous requests (e.g., in a web application) searching for different queries, should each thread have its own IndexSearcher, or will one shared IndexSearcher do?
Initially we used a different IndexSearcher for each request, creating a new instance at the start of the request and destroying it soon after searching. After a lot of experimentation and a few exchanges on the Lucene mailing list, I discovered that the efficient way is to use a single shared IndexSearcher across all requests. Multiple concurrent threads can safely invoke the search method on a single IndexSearcher object. Because the shared searcher reuses its cached data instead of rebuilding it for every request, this approach is very efficient in terms of both response time and memory usage. In our case, response times decreased by a factor of 2-3 after this change.
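The single-searcher idea can be sketched roughly as follows. This is only an illustration, not the code from our application: the class name SharedSearcher and the factory argument are hypothetical, and the type parameter S stands in for the searcher type (for example Lucene's IndexSearcher) so that the sketch compiles without any Lucene jar. The searcher is created lazily on first use and the same instance is returned to every calling thread.

```java
import java.util.function.Supplier;

// Holds one searcher instance that all request threads share.
// S is a placeholder for the searcher type (e.g. Lucene's IndexSearcher).
final class SharedSearcher<S> {
    // Hypothetical factory: in a real application this would open the index
    // and construct the searcher.
    private final Supplier<S> factory;

    // volatile so that a fully constructed instance is visible to all threads.
    private volatile S searcher;

    SharedSearcher(Supplier<S> factory) {
        this.factory = factory;
    }

    // Returns the shared instance, creating it only on the first call.
    S get() {
        S s = searcher;
        if (s == null) {
            synchronized (this) {
                s = searcher;
                if (s == null) {
                    s = factory.get();
                    searcher = s;
                }
            }
        }
        return s;
    }
}
```

In a web application, each request handler would call get() on one application-wide SharedSearcher and run its query on the returned object, instead of opening and closing a searcher per request.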
Remember, however, that the IndexSearcher object needs to be destroyed and recreated after the index is modified or updated. Unless the searcher is recreated, it will not reflect the changes made to the index. A good strategy is to periodically destroy and recreate the shared searcher using a separate Timer thread.
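A minimal sketch of the periodic refresh, under the same assumptions as before: RefreshingSearcher, opener and closer are hypothetical names, and S again stands in for the searcher type. The opener would open a fresh searcher on the updated index and the closer would release the old one. Only java.util.Timer and java.util.TimerTask are real library classes here.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Shares one searcher and replaces it periodically so index changes become visible.
final class RefreshingSearcher<S> {
    private final Supplier<S> opener;  // hypothetical: opens a searcher on the current index
    private final Consumer<S> closer;  // hypothetical: releases an old searcher
    private volatile S current;

    RefreshingSearcher(Supplier<S> opener, Consumer<S> closer) {
        this.opener = opener;
        this.closer = closer;
        this.current = opener.get();
    }

    // Request threads call this to obtain the searcher currently in use.
    S get() {
        return current;
    }

    // Opens a new searcher, publishes it, then hands the old one to the closer.
    synchronized void refresh() {
        S old = current;
        current = opener.get();
        closer.accept(old);
    }

    // Runs refresh() every periodMillis on a daemon Timer thread.
    Timer startTimer(long periodMillis) {
        Timer timer = new Timer("searcher-refresh", true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                refresh();
            }
        }, periodMillis, periodMillis);
        return timer;
    }
}
```

One caveat with this sketch: it hands the old searcher to the closer immediately, while a request that fetched it just before the swap may still be searching with it. A real closer would probably need to wait until such in-flight searches have finished, for example by counting active users or delaying the close.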
Labels: lucene, optimization, programming
I am a Java developer and have also worked with the Lucene search API for my site.
Can you give me some small tips on Lucene that might bring even a small gain?
Will optimizing the Lucene index help improve performance?
Rightly said: an index searcher instance can be reused for better performance.
I am facing the same problem. I am using Apache threads to serve requests via CGI. These CGI programs create an instance of the searcher and get the result back. Now, I am not sure how I can use the same searcher instance from every Apache thread handling CGI.
How did you use the same instance of the searcher across different web requests?