Encodes the four per-document value types (Numeric, Binary, Sorted, SortedSet) with seven basic strategies.
When all values fit in a single byte and the acceptableOverheadRatio would pack values into 8 bits per value anyway, they are written as absolute values (with no indirection or packing) for performance.
Files:
The DocValues metadata or .dvm file.
For each DocValues field, this stores metadata, such as the offset into the DocValues data (.dvd) file.
DocValues metadata (.dvm) --> Header,<FieldNumber,EntryType,Entry>^NumFields
Sorted fields have two entries: a SortedEntry with the FST metadata, and an ordinary NumericEntry for the document-to-ord metadata.
SortedSet fields have two entries: a SortedEntry with the FST metadata, and an ordinary BinaryEntry for the document-to-ord-list metadata.
FieldNumber of -1 indicates the end of metadata.
EntryType is 0 (NumericEntry), 1 (BinaryEntry), or 2 (SortedEntry).
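The metadata layout described above can be sketched as a simple loop. This is only an illustration under stated assumptions: the input is an array of already-decoded integers rather than real file bytes, the Entry bodies are omitted entirely, and a Sorted field is assumed to repeat its FieldNumber for each of its two entries. The class and method names (`MetadataWalk`, `walk`, `entryTypeName`) are hypothetical and are not part of the codec.

```java
import java.util.ArrayList;
import java.util.List;

class MetadataWalk {
    /** Maps the EntryType codes listed above to their names. */
    public static String entryTypeName(int entryType) {
        switch (entryType) {
            case 0: return "NumericEntry";
            case 1: return "BinaryEntry";
            case 2: return "SortedEntry";
            default: throw new IllegalArgumentException("unknown EntryType: " + entryType);
        }
    }

    /**
     * Walks (FieldNumber, EntryType) pairs until a FieldNumber of -1 is seen.
     * Assumption: tokens are already decoded and Entry bodies are left out.
     */
    public static List<String> walk(int[] tokens) {
        List<String> seen = new ArrayList<>();
        int pos = 0;
        while (true) {
            int fieldNumber = tokens[pos++];
            if (fieldNumber == -1) {
                break; // end-of-metadata marker
            }
            int entryType = tokens[pos++];
            seen.add(fieldNumber + ":" + entryTypeName(entryType));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Field 4 is Numeric; field 7 is Sorted (SortedEntry followed by NumericEntry).
        System.out.println(walk(new int[] {4, 0, 7, 2, 7, 0, -1}));
    }
}
```

A real reader would also consume the Entry that follows each EntryType before moving on to the next FieldNumber.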
DataOffset is the pointer to the start of the data in the DocValues data (.dvd) file.
CompressionType indicates how Numeric values will be compressed:
When the acceptableOverheadRatio parameter would upgrade the number of bits required to 8, and all values fit in a byte, these are written as absolute binary values for performance.
MinLength and MaxLength represent the min and max byte[] value lengths for Binary values. If they are equal, then all values are of a fixed size, and can be addressed as DataOffset + (docID * length). Otherwise, the binary values are of variable size, and packed integer metadata (PackedVersion,BlockSize) is written for the addresses.
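The fixed-size addressing rule, DataOffset + (docID * length), can be illustrated with a small stand-alone sketch. It operates on an in-memory byte array rather than a real .dvd file, and the class and method names (`FixedWidthBinaryAddressing`, `valueOffset`, `readValue`) are hypothetical, chosen only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FixedWidthBinaryAddressing {
    /** Start offset of the value for docID when MinLength == MaxLength == length. */
    public static long valueOffset(long dataOffset, int docID, int length) {
        // Widen to long before multiplying so large docIDs do not overflow int.
        return dataOffset + (long) docID * length;
    }

    /** Copies the value for docID out of an in-memory image of the data. */
    public static byte[] readValue(byte[] data, long dataOffset, int docID, int length) {
        int start = Math.toIntExact(valueOffset(dataOffset, docID, length));
        return Arrays.copyOfRange(data, start, start + length);
    }

    public static void main(String[] args) {
        // Two bytes standing in for earlier data, then three 3-byte values.
        byte[] data = "..abcdefghi".getBytes(StandardCharsets.US_ASCII);
        byte[] value = readValue(data, 2, 1, 3);
        System.out.println(new String(value, StandardCharsets.US_ASCII)); // prints "def"
    }
}
```

Because every value has the same length, no per-document address table is needed, which is why the packed address metadata is only written in the variable-size case.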
The DocValues data or .dvd file.
For each DocValues field, this stores the actual per-document data (the heavy lifting).
DocValues data (.dvd) --> Header,<NumericData | BinaryData | SortedData>^NumFields
SortedSet entries store the list of ordinals in their BinaryData as a sequence of increasing {@link DataOutput#writeVLong vLong}s, delta-encoded.
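As a rough illustration of delta-encoded vLongs, the sketch below encodes an increasing list of ordinals and decodes it again. It does not call Lucene: `writeVLong` here is a stand-alone reimplementation of the usual variable-length layout (7 bits per byte, low-order bits first, high bit set when more bytes follow), and the choice to encode the first ordinal as a delta from 0 is an assumption. The class name `OrdListCodec` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class OrdListCodec {
    /** Writes a non-negative long in variable-length form, 7 bits per byte. */
    public static void writeVLong(ByteArrayOutputStream out, long v) {
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80)); // high bit: more bytes follow
            v >>>= 7;
        }
        out.write((int) v);
    }

    /** Encodes increasing ordinals as deltas from the previous ordinal (assumed to start at 0). */
    public static byte[] encode(long[] ords) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long previous = 0;
        for (long ord : ords) {
            writeVLong(out, ord - previous);
            previous = ord;
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads vLong deltas and accumulates them back into ordinals. */
    public static long[] decode(byte[] bytes) {
        long[] ords = new long[bytes.length]; // upper bound: each value takes at least one byte
        int count = 0;
        int pos = 0;
        long previous = 0;
        while (pos < bytes.length) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            ords[count++] = previous;
        }
        return Arrays.copyOf(ords, count);
    }

    public static void main(String[] args) {
        long[] ords = {3L, 200L, 201L};
        byte[] encoded = encode(ords);
        System.out.println(encoded.length + " bytes, round trip ok: "
                + Arrays.equals(decode(encoded), ords));
    }
}
```

Since the ordinals are increasing, the deltas stay small and most of them fit in a single byte, which is the point of combining delta encoding with a variable-length integer.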
Limitations: