Aggregating the total term frequency is trivial, since it can simply be summed across fields, but not every {@link org.apache.lucene.search.similarities.Similarity} makes use of this statistic. The document frequency, which is used in {@link org.apache.lucene.search.similarities.DefaultSimilarity}, can only be estimated as a lower bound, since it is a document-based statistic: a single document may contain the term in several fields, so per-field counts cannot be added without counting that document more than once. For the document frequency, the maximum frequency across all fields per term is used, which is the minimum number of documents the term occurs in.
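The two aggregation rules described above can be illustrated with a minimal sketch. The class and method names below (`BlendedStatsSketch`, `blendedTotalTermFreq`, `blendedDocFreq`) are hypothetical and not part of Lucene; the arrays are assumed to hold the per-field statistics of one term.

```java
// Hypothetical helper for illustration only; not a Lucene class.
final class BlendedStatsSketch {

    private BlendedStatsSketch() {}

    // Total term frequency is additive: each occurrence belongs to exactly
    // one field, so the per-field values can be summed.
    static long blendedTotalTermFreq(long[] perFieldTotalTermFreq) {
        long sum = 0L;
        for (long ttf : perFieldTotalTermFreq) {
            sum += ttf;
        }
        return sum;
    }

    // Document frequency is not additive: one document may match in several
    // fields. The per-field maximum is a lower bound on the number of
    // documents that contain the term in at least one field.
    static int blendedDocFreq(int[] perFieldDocFreq) {
        int max = 0;
        for (int df : perFieldDocFreq) {
            max = Math.max(max, df);
        }
        return max;
    }
}
```

For example, if a term has document frequencies 3, 5 and 2 in three fields, at least 5 distinct documents must contain it, while the true count could be as high as 10 if no document matches in more than one field.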