This class embodies most of the work that must be done when answering a query. Basically, {@link #process(String,int,int,ObjectArrayList) process(query,offset,length,results)} takes {@code query}, parses it, turns it into a document iterator, scans the results, and deposits {@code length} results starting at {@code offset} into the list {@code results}.
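The offset/length windowing described above can be illustrated without any MG4J dependency. The following is a minimal sketch, not the engine's code: the class `ResultWindowSketch`, the method `window`, its return value, and the use of plain integers as document pointers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only: documents are plain integers, not MG4J objects.
class ResultWindowSketch {
    /** Skips offset documents, then appends at most length documents to results. */
    static int window(Iterator<Integer> docs, int offset, int length, List<Integer> results) {
        // Discard the documents that precede the requested window.
        for (int i = 0; i < offset && docs.hasNext(); i++) docs.next();
        int deposited = 0;
        // Deposit documents until the window is full or the iterator is exhausted.
        while (deposited < length && docs.hasNext()) {
            results.add(docs.next());
            deposited++;
        }
        return deposited;
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<>();
        int n = window(List.of(3, 8, 15, 16, 23, 42).iterator(), 2, 3, results);
        System.out.println(n + " " + results); // prints "3 [15, 16, 23]"
    }
}
```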
There are, however, several additional features available. First of all, either by separating several queries with commas, or by directly using {@link #process(Query[],int,int,ObjectArrayList)}, it is possible to resolve a series of queries with “and-then” semantics: results are added from each query, provided they did not appear before.
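The “and-then” rule can be pictured as an order-preserving, duplicate-free concatenation of the per-query result lists. The sketch below assumes each query has already been resolved to a list of document pointers; `AndThenSketch` and `andThen` are hypothetical names and do not belong to MG4J.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: each inner list stands for the results of one query.
class AndThenSketch {
    /** Concatenates the lists in order, keeping only the first occurrence of each document. */
    static List<Integer> andThen(List<List<Integer>> resultsPerQuery) {
        // A LinkedHashSet rejects duplicates while remembering insertion order.
        Set<Integer> seen = new LinkedHashSet<>();
        for (List<Integer> results : resultsPerQuery)
            for (Integer doc : results) seen.add(doc);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<Integer>> perQuery = List.of(List.of(4, 7, 9), List.of(7, 2, 4, 11));
        System.out.println(andThen(perQuery)); // prints "[4, 7, 9, 2, 11]"
    }
}
```

Documents 7 and 4 are returned by the first query, so the second query contributes only 2 and 11.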
It is possible to {@linkplain #score(Scorer[],double[]) score queries} using one or more scorers with different weights (see {@link it.unimi.dsi.mg4j.search.score}), and also to set {@linkplain #setWeights(Reference2DoubleMap) different weights for different indices} (they will be passed to the scorers). The scorers influence the order when processing each query, but results from different “and-then” queries are simply concatenated.
When using multiple scorers, {@linkplain #equalize(int) equalisation} can be used to avoid the problem associated with the potentially different value ranges of each scorer. Equalisation evaluates a settable number of sample documents and normalizes the scorers using the maximum value in the sample. See {@link it.unimi.dsi.mg4j.search.score.AbstractAggregator} for some elaboration.
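To make the normalisation step concrete, here is a minimal sketch under stated assumptions: scores are nonnegative, the scorers are combined by a weighted sum (the aggregation actually used depends on the aggregator in place), and the sample doubles as the set of documents being ranked. `EqualisationSketch` and `equalisedScores` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration only: sample[s][d] is the score scorer s gives to sample document d.
class EqualisationSketch {
    /** Divides each scorer's values by its maximum over the sample, then sums them with weights. */
    static double[] equalisedScores(double[][] sample, double[] weights) {
        int docs = sample[0].length;
        double[] combined = new double[docs];
        for (int s = 0; s < sample.length; s++) {
            // Largest value this scorer produced on the sample (scores assumed nonnegative).
            double max = 0;
            for (double v : sample[s]) max = Math.max(max, v);
            if (max == 0) continue; // an all-zero scorer contributes nothing
            for (int d = 0; d < docs; d++) combined[d] += weights[s] * sample[s][d] / max;
        }
        return combined;
    }

    public static void main(String[] args) {
        // First scorer ranges up to 10, second only up to 0.4.
        double[][] sample = { { 10, 5 }, { 0.2, 0.4 } };
        double[] weights = { 2, 1 };
        System.out.println(Arrays.toString(equalisedScores(sample, weights))); // prints "[2.5, 2.0]"
    }
}
```

Without the division by the per-scorer maximum, the first scorer would dominate simply because its raw values are larger.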
{@linkplain #multiplex Multiplexing} transforms a query {@code q} into {@code index0:q | index1:q …}. In other words, the query is multiplexed on all available indices. Note that if inside {@code q} there are selection operators that specify an index, the inner specification will override the external one, so that the semantics of the query is only amplified, but never contradicted.
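The shape of a multiplexed query can be shown with a purely textual sketch. This is an illustration of the resulting form only and makes no claim about how the engine performs the transformation internally; the class `MultiplexSketch`, the method `multiplex`, the index aliases, and the parentheses added around the query are assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only: builds the textual form index0:q | index1:q ...
class MultiplexSketch {
    /** Prefixes the query with each index alias and joins the alternatives with OR. */
    static String multiplex(String query, List<String> indexAliases) {
        return indexAliases.stream()
                // Parentheses keep a multi-term query grouped under its index selector.
                .map(alias -> alias + ":(" + query + ")")
                .collect(Collectors.joining(" | "));
    }

    public static void main(String[] args) {
        // prints "title:(foo bar) | text:(foo bar)"
        System.out.println(multiplex("foo bar", List.of("title", "text")));
    }
}
```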
The results returned are instances of {@link it.unimi.dsi.mg4j.search.score.DocumentScoreInfo}. If an {@linkplain #intervalSelector interval selector} has been set, the {@code info} field will contain a map from indices to arrays of {@linkplain it.unimi.dsi.mg4j.query.SelectedInterval selected intervals} satisfying the query (see {@link it.unimi.dsi.mg4j.search} for some elaboration on minimal-interval semantics support in MG4J).
For examples of usage of this class, please look at {@link it.unimi.dsi.mg4j.query.Query} and {@link it.unimi.dsi.mg4j.query.QueryServlet}.
Warning: This class is highly experimental. It has become definitely more decent in MG4J, but still needs some refactoring.
Warning: This class is not thread safe, but it provides {@linkplain it.unimi.dsi.lang.FlyweightPrototype flyweight copies}. The {@link #copy()} method is strengthened so as to return an object implementing this interface.
@author Sebastiano Vigna
@author Paolo Boldi
@since 1.0