I have a fairly large Lucene index, and queries can hit about 5000 documents. I store my application metadata in a Lucene field (separate from the text contents) and need to read this small metadata field quickly for all 5000 hits. Currently, my code looks something like this:
// Load only the "metaData" field instead of the whole stored document.
MapFieldSelector field = new MapFieldSelector("metaData");
ScoreDoc[] hits = searcher.search(query, null, 10000).scoreDocs;
for (int i = 0; i < hits.length; i++) {
    int index_doc_id = hits[i].doc;
    // Expensive, especially with a disk-based Lucene index: one stored-field read per hit.
    Document hitDoc = searcher.doc(index_doc_id, field);
    String metadata = hitDoc.getFieldable("metaData").stringValue();
}
However, this is terribly slow because each call to searcher.doc() is expensive. Is there a way to do a "batch" fetch of the field for all the hits that would be faster? Or is there any other way to speed this up? (The only identifier inside the ScoreDoc appears to be the Lucene doc id, which I understand should not be relied upon. Otherwise I would have maintained a Lucene doc id -> metadata map on my own.) Thanks!
Update: I am now using FieldCache. When I open the index, I load the field once:
String[] metadatas = org.apache.lucene.search.FieldCache.DEFAULT.getStrings(searcher.getIndexReader(), "metaData");
and then, for each hit of a query:
int ldocId = hits[i].doc;
String metadata = metadatas[ldocId];
This is working well for me.
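For anyone reading later, the lookup pattern behind the update can be sketched without Lucene on the classpath. The class and method names below (MetadataLookupSketch, lookup) are hypothetical and not part of Lucene; the hard-coded array merely stands in for what FieldCache.DEFAULT.getStrings(...) returns. One caveat I believe applies: the array is indexed by doc id for the IndexReader it was loaded from, so it should be reloaded whenever the reader is reopened rather than kept across index changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it shows only the lookup pattern.
final class MetadataLookupSketch {

    // metadataByDocId stands in for the String[] returned by
    // FieldCache.DEFAULT.getStrings(reader, "metaData"): index = Lucene doc id.
    // hitDocIds stands in for the hits[i].doc values of one query.
    static List<String> lookup(String[] metadataByDocId, int[] hitDocIds) {
        List<String> result = new ArrayList<String>(hitDocIds.length);
        for (int docId : hitDocIds) {
            // Plain in-memory array access; no per-hit stored-field read.
            result.add(metadataByDocId[docId]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        String[] metadataByDocId = {"meta-0", "meta-1", "meta-2", "meta-3"};
        int[] hitDocIds = {3, 1};
        System.out.println(lookup(metadataByDocId, hitDocIds)); // prints [meta-3, meta-1]
    }
}
```

The trade-off is memory: the whole field is held in an array sized to the number of documents in the index, in exchange for constant-time access per hit.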