Creates a DirectIndex from an InvertedIndex. The algorithm is similar to that followed by InvertedIndexBuilder: where InvertedIndexBuilder builds an InvertedIndex from a DirectIndex, this class does the opposite.
Algorithm:
- For a selection of document ids:
  - (Scan the inverted index looking for postings with these document ids.)
  - For each term in the inverted index:
    - Select the required postings from all the postings of that term.
    - Add these to the posting objects that represent each document.
  - For each posting object:
    - Write out the postings for that document.
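The steps above can be sketched in memory as follows. This is a minimal illustration, not the class's actual implementation: the names InvertToDirectSketch, Posting and invertRange are hypothetical, the inverted index is modelled as a plain list indexed by termid, and the "write out" step simply returns the per-document postings instead of writing them to disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory sketch; these are not Terrier's classes or API. */
class InvertToDirectSketch {

    /** One inverted posting: a document id and the term's frequency in it. */
    static final class Posting {
        final int docid;
        final int tf;

        Posting(int docid, int tf) {
            this.docid = docid;
            this.tf = tf;
        }
    }

    /**
     * Builds the direct postings for docids in [firstDoc, lastDoc].
     * inverted.get(termid) holds the postings of that term, so termids
     * start at 0 and increase, as the algorithm assumes.
     * Returns one map per document in the range: termid -> term frequency.
     */
    static List<Map<Integer, Integer>> invertRange(
            List<List<Posting>> inverted, int firstDoc, int lastDoc) {
        // One posting object per document in the selected range.
        List<Map<Integer, Integer>> docs = new ArrayList<>();
        for (int d = firstDoc; d <= lastDoc; d++) {
            docs.add(new TreeMap<>());
        }
        // Scan every term, keeping only the postings that fall in the range.
        for (int termid = 0; termid < inverted.size(); termid++) {
            for (Posting p : inverted.get(termid)) {
                if (p.docid >= firstDoc && p.docid <= lastDoc) {
                    docs.get(p.docid - firstDoc).put(termid, p.tf);
                }
            }
        }
        // A real builder would now write each document's postings to disk.
        return docs;
    }

    public static void main(String[] args) {
        List<List<Posting>> inverted = List.of(
            List.of(new Posting(0, 2), new Posting(2, 1)),   // termid 0
            List.of(new Posting(1, 3)),                      // termid 1
            List.of(new Posting(0, 1), new Posting(1, 1)));  // termid 2
        // Two scans of the inverted index: docids 0-1, then docid 2.
        System.out.println(invertRange(inverted, 0, 1));
        System.out.println(invertRange(inverted, 2, 2));
    }
}
```

Each call to invertRange corresponds to one full scan of the inverted index, which is why the number of documents handled per scan matters for both running time and memory.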
Notes:
This algorithm assumes that termids start at 0 and are strictly increasing. This assumption holds true only for inverted indices generated by the single pass indexing method.
Properties:
- inverted2direct.processtokens - total number of tokens to attempt to process in each iteration. Defaults to 100000000. Memory usage is more likely linked to the number of pointers than to the number of tokens; however, because the document index does not record the number of unique terms in each document, the number of pointers cannot be calculated.
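One way such a token budget could translate into iterations is sketched below. It assumes that document lengths (token counts) are available per docid and that each iteration takes consecutive docids until the budget would be exceeded; the names TokenBudgetSketch and planIterations are hypothetical, and a java.util.Properties object stands in for however the property is actually read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of planning iterations from a token budget. */
class TokenBudgetSketch {

    /**
     * Splits docids 0..docLengths.length-1 into consecutive ranges whose
     * summed lengths stay within processTokens. Each int[] is
     * {firstDoc, lastDoc}. A document longer than the budget still gets
     * a range of its own, so progress is always made.
     */
    static List<int[]> planIterations(int[] docLengths, long processTokens) {
        List<int[]> ranges = new ArrayList<>();
        int first = 0;
        long tokens = 0;
        for (int docid = 0; docid < docLengths.length; docid++) {
            // Close the current range if adding this document would exceed the budget.
            if (docid > first && tokens + docLengths[docid] > processTokens) {
                ranges.add(new int[] {first, docid - 1});
                first = docid;
                tokens = 0;
            }
            tokens += docLengths[docid];
        }
        if (first < docLengths.length) {
            ranges.add(new int[] {first, docLengths.length - 1});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Stand-in for the real configuration; the default matches the documented one.
        Properties props = new Properties();
        long budget = Long.parseLong(
            props.getProperty("inverted2direct.processtokens", "100000000"));
        System.out.println("budget = " + budget);

        // A tiny budget, to make the batching visible.
        for (int[] r : planIterations(new int[] {40, 50, 30, 120, 10}, 100)) {
            System.out.println(r[0] + "-" + r[1]);
        }
    }
}
```

A smaller budget means more scans of the inverted index but fewer postings held in memory at once; a larger one trades memory for fewer scans.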
@author Craig Macdonald
@since 2.0