This class provides a {@link Field} that enables indexing of numeric values for efficient range filtering and sorting. Here's an example usage, adding an int value:

    document.add(new NumericField(name).setIntValue(value));

For optimal performance, re-use the NumericField and {@link Document} instance for more than one document:

    NumericField field = new NumericField(name);
    Document document = new Document();
    document.add(field);

    for(all documents) {
      ...
      field.setIntValue(value);
      writer.addDocument(document);
      ...
    }
The Java native types int, long, float and double are directly supported. However, any value that can be converted into these native types can also be indexed. For example, date/time values represented by a {@link java.util.Date} can be translated into a long value using the {@link java.util.Date#getTime} method. If you don't need millisecond precision, you can quantize the value, either by dividing the result of {@link java.util.Date#getTime} or using the separate getters (for year, month, etc.) to construct an int or long value.
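The quantization described above can be sketched with the standard library alone. This is a minimal illustration, not part of this class: the class name DateQuantizer, its method names, and the choice of UTC are assumptions made for the example. The resulting int or long would then be passed to setIntValue or setLongValue.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper (not part of Lucene): reduces a Date to a coarser numeric value.
final class DateQuantizer {
    private static final long MILLIS_PER_SECOND = 1000L;
    private static final long MILLIS_PER_DAY = 24L * 60L * 60L * 1000L;

    // Divide getTime() to drop sub-second precision; floorDiv keeps pre-1970 dates ordered.
    static long toSeconds(Date date) {
        return Math.floorDiv(date.getTime(), MILLIS_PER_SECOND);
    }

    // Days since the epoch fit comfortably in an int.
    static int toDays(Date date) {
        return (int) Math.floorDiv(date.getTime(), MILLIS_PER_DAY);
    }

    // Build an int of the form yyyyMMdd from the separate calendar fields (UTC assumed).
    static int toYearMonthDay(Date date) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTime(date);
        return cal.get(Calendar.YEAR) * 10000
                + (cal.get(Calendar.MONTH) + 1) * 100   // Calendar.MONTH is zero-based
                + cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        Date date = new Date(86_401_500L); // 1970-01-02T00:00:01.500Z
        System.out.println(toSeconds(date));      // 86401
        System.out.println(toDays(date));         // 1
        System.out.println(toYearMonthDay(date)); // 19700102
    }
}
```

Whichever quantization is chosen, the same conversion has to be applied to the bounds of any range query, since the index only ever sees the quantized numbers.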
To perform range querying or filtering against a NumericField, use {@link NumericRangeQuery} or {@link NumericRangeFilter}. To sort according to a NumericField, use the normal numeric sort types, e.g. {@link SortField#INT}. NumericField values can also be loaded directly from {@link FieldCache}.
By default, a NumericField's value is not stored but is indexed for range filtering and sorting. You can use the {@link #NumericField(String,Field.Store,boolean)} constructor if you need to change these defaults.
You may add the same field name as a NumericField to the same document more than once. Range querying and filtering will be the logical OR of all values, so a range query will hit all documents that have at least one value in the range. However, sort behavior is not defined. If you need to sort, you should separately index a single-valued NumericField.
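The OR semantics for multi-valued fields can be modelled in a few lines of plain Java. This is only a conceptual sketch: the class MultiValuedRangeSketch and its method are hypothetical and do not reflect how Lucene evaluates the query internally.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of range matching over a multi-valued numeric field.
final class MultiValuedRangeSketch {
    // A document matches if at least one of its values lies in [min, max].
    static boolean matchesRange(List<Integer> values, int min, int max) {
        for (int v : values) {
            if (v >= min && v <= max) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> docValues = Arrays.asList(5, 50); // same field name added twice
        System.out.println(matchesRange(docValues, 40, 60)); // true
        System.out.println(matchesRange(docValues, 10, 20)); // false
    }
}
```

The model also shows why sorting is undefined here: with values 5 and 50 there is no single number that represents the document.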
A NumericField will consume somewhat more disk space in the index than an ordinary single-valued field. However, for a typical index that includes substantial textual content per document, this increase will likely be in the noise.
Within Lucene, each numeric value is indexed as a trie structure, where each term is logically assigned to larger and larger pre-defined brackets (which are simply lower-precision representations of the value). The step size between each successive bracket is called the precisionStep, measured in bits. Smaller precisionStep values result in a larger number of brackets, which consume more disk space in the index but may result in faster range search performance. The default value, 4, was selected for a reasonable tradeoff of disk space consumption versus performance. You can use the expert constructor {@link #NumericField(String,int,Field.Store,boolean)} if you'd like to change the value. Note that you must also specify a congruent value when creating {@link NumericRangeQuery} or {@link NumericRangeFilter}. For low cardinality fields, larger precision steps are good. If the cardinality is < 100, it is fair to use {@link Integer#MAX_VALUE}, which produces one term per value.
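To make the effect of precisionStep concrete, the following sketch counts the brackets produced for one value and lists them. It is a simplified model under stated assumptions: the class PrecisionStepSketch is hypothetical, it handles only non-negative int values, and it ignores the actual term encoding (see {@link NumericUtils} for the real format).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of trie brackets; not the Lucene implementation.
final class PrecisionStepSketch {
    // One term per shift 0, step, 2*step, ... below the value width
    // (assumed 32 bits for int/float, 64 bits for long/double).
    static int termsPerValue(int valueBits, int precisionStep) {
        int count = 0;
        // long counter avoids int overflow when precisionStep is Integer.MAX_VALUE
        for (long shift = 0; shift < valueBits; shift += precisionStep) {
            count++;
        }
        return count;
    }

    // Each bracket is the value with its lowest `shift` bits cleared.
    static List<Integer> brackets(int value, int precisionStep) {
        List<Integer> result = new ArrayList<>();
        for (long shift = 0; shift < 32; shift += precisionStep) {
            int s = (int) shift;
            result.add((value >>> s) << s);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(termsPerValue(32, 4));                 // 8
        System.out.println(termsPerValue(64, 4));                 // 16
        System.out.println(termsPerValue(32, Integer.MAX_VALUE)); // 1
        for (int bracket : brackets(0x1234, 4)) {
            System.out.println(Integer.toHexString(bracket));
        }
    }
}
```

Under this model, halving the precisionStep doubles the number of terms per value, which is the disk-space side of the tradeoff described above.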
For more information on the internals of numeric trie indexing, including the precisionStep configuration, see {@link NumericRangeQuery}. The format of indexed values is described in {@link NumericUtils}.
If you only need to sort by numeric value, and never run range querying/filtering, you can index using a precisionStep of {@link Integer#MAX_VALUE}. This will minimize disk space consumed.
More advanced users can instead use {@link NumericTokenStream} directly when indexing numbers. This class is a wrapper around this token stream type for easier, more intuitive usage.
NOTE: This class is only used during indexing. When retrieving the stored field value from a {@link Document} instance after search, you will get a conventional {@link Fieldable} instance where the numeric values are returned as {@link String}s (according to toString(value) of the used data type).
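Because stored values come back as strings, the caller has to parse them into the original type. The sketch below shows the round trip with the standard wrapper classes. The class StoredNumericValueSketch is hypothetical, and the stored string is produced here with Integer.toString and Double.toString to stand in for what a retrieved field would return.

```java
// Hypothetical illustration of parsing a stored numeric value back from its String form.
final class StoredNumericValueSketch {
    // Stand-ins for the string a stored int or double field is described as returning.
    static String storedForm(int value) {
        return Integer.toString(value);
    }

    static String storedForm(double value) {
        return Double.toString(value);
    }

    public static void main(String[] args) {
        String storedInt = storedForm(42);
        String storedDouble = storedForm(1.5);

        // The caller must know the original type in order to parse correctly.
        int intValue = Integer.parseInt(storedInt);
        double doubleValue = Double.parseDouble(storedDouble);

        System.out.println(storedInt + " -> " + intValue);       // 42 -> 42
        System.out.println(storedDouble + " -> " + doubleValue); // 1.5 -> 1.5
    }
}
```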
NOTE: This API is experimental and might change in incompatible ways in the next release.
@since 2.9