This adapter differs from {@link FSTCompletion} in that it attempts to discretize any "weights" passed in from {@link InputIterator#weight()} to match the number of buckets. For the rationale for bucketing, see {@link FSTCompletion}.
Note: Discretization requires an additional sorting pass.
The range of weights for bucketing/discretization is determined by sorting the input by weight and then dividing it into equal ranges. All weights that fall within a given range are then assigned to the corresponding bucket.
As a result, even large differences between weights may be lost during automaton construction, but the overall distinction between "classes" of weights is preserved regardless of how the weights are distributed.
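The sort-then-divide idea described above can be illustrated with a small standalone sketch. It is not the code of this class: the class name `BucketSketch` and the method `discretize` are hypothetical, and the sketch assumes that "equal ranges" means equal-sized slices of the sorted input (rank-based bucketing) and that equal weights should share a bucket.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of the Lucene API.
class BucketSketch {

  /** Returns, for each input weight, a bucket index in [0, buckets). */
  static int[] discretize(long[] weights, int buckets) {
    int n = weights.length;
    Integer[] order = new Integer[n];
    for (int i = 0; i < n; i++) {
      order[i] = i;
    }
    // The additional sorting pass: order input positions by ascending weight.
    Arrays.sort(order, Comparator.comparingLong(i -> weights[i]));

    int[] result = new int[n];
    int previousBucket = 0;
    for (int rank = 0; rank < n; rank++) {
      int idx = order[rank];
      int bucket;
      if (rank > 0 && weights[idx] == weights[order[rank - 1]]) {
        // Assumption: identical weights always land in the same bucket.
        bucket = previousBucket;
      } else {
        // Assumption: the bucket depends on rank in sorted order, not on magnitude.
        bucket = (int) ((long) rank * buckets / n);
      }
      result[idx] = bucket;
      previousBucket = bucket;
    }
    return result;
  }

  public static void main(String[] args) {
    long[] weights = {1, 2, 3, 1000, 1_000_000, 1_000_001};
    // prints [0, 0, 1, 1, 2, 2]
    System.out.println(Arrays.toString(discretize(weights, 3)));
  }
}
```

In this sketch the weights 3 and 1000 end up in the same bucket, so a large difference in magnitude is lost, while the low, middle, and high "classes" remain distinct even though the input distribution is heavily skewed.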
For fine-grained control over which weights are assigned to which buckets, use {@link FSTCompletion} directly or {@link TSTLookup}, for example.
@see FSTCompletion
@lucene.experimental