Lucene 4.0 Field Infos format.
Field names are stored in the field info file, with suffix .fnm.
FieldInfos (.fnm) --> Header, FieldsCount, <FieldName, FieldNumber, FieldBits, DocValuesBits, Attributes>^FieldsCount
Data types:
- Header --> {@link CodecUtil#checkHeader CodecHeader}
- FieldsCount --> {@link DataOutput#writeVInt VInt}
- FieldName --> {@link DataOutput#writeString String}
- FieldBits, DocValuesBits --> {@link DataOutput#writeByte Byte}
- FieldNumber --> {@link DataOutput#writeVInt VInt}
- Attributes --> {@link DataOutput#writeStringStringMap Map<String,String>}
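The layout above can be sketched as a small stand-alone reader. This is only an illustration under stated assumptions, not the codec's actual reader: it uses plain java.io instead of Lucene's DataInput, it starts after the Header (whose internal layout is not described here), and it assumes that a VInt holds 7 payload bits per byte with the high bit as a continuation flag, that a String is a VInt byte length followed by UTF-8 bytes, and that the attribute map is a 4-byte big-endian entry count followed by String pairs. The class and method names (FnmBodySketch, Entry, parse) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone reader for the part of a .fnm file that follows Header. */
final class FnmBodySketch {

  /** One decoded field entry; an illustrative holder, not Lucene's FieldInfo. */
  static final class Entry {
    final String name;
    final int number;
    final int fieldBits;     // unsigned value of the FieldBits byte
    final int docValuesBits; // unsigned value of the DocValuesBits byte
    final Map<String, String> attributes;

    Entry(String name, int number, int fieldBits, int docValuesBits,
          Map<String, String> attributes) {
      this.name = name;
      this.number = number;
      this.fieldBits = fieldBits;
      this.docValuesBits = docValuesBits;
      this.attributes = attributes;
    }
  }

  /** Assumed VInt layout: 7 payload bits per byte, low-order group first, high bit means "more". */
  static int readVInt(DataInputStream in) throws IOException {
    int b = in.readUnsignedByte();
    int value = b & 0x7F;
    for (int shift = 7; (b & 0x80) != 0; shift += 7) {
      b = in.readUnsignedByte();
      value |= (b & 0x7F) << shift;
    }
    return value;
  }

  /** Assumed String layout: VInt byte length, then that many UTF-8 bytes. */
  static String readString(DataInputStream in) throws IOException {
    byte[] utf8 = new byte[readVInt(in)];
    in.readFully(utf8);
    return new String(utf8, StandardCharsets.UTF_8);
  }

  /** Assumed map layout: 4-byte big-endian entry count, then key/value String pairs. */
  static Map<String, String> readStringStringMap(DataInputStream in) throws IOException {
    int size = in.readInt();
    Map<String, String> map = new LinkedHashMap<>();
    for (int i = 0; i < size; i++) {
      String key = readString(in);
      map.put(key, readString(in));
    }
    return map;
  }

  /** Parses FieldsCount followed by that many field entries. */
  static List<Entry> parse(byte[] body) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(body))) {
      int fieldsCount = readVInt(in);
      List<Entry> entries = new ArrayList<>();
      for (int i = 0; i < fieldsCount; i++) {
        String name = readString(in);
        int number = readVInt(in);
        int fieldBits = in.readUnsignedByte();
        int docValuesBits = in.readUnsignedByte();
        entries.add(new Entry(name, number, fieldBits, docValuesBits, readStringStringMap(in)));
      }
      return entries;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because FieldNumber is read from each entry rather than derived from the loop index, entries may appear in any order without changing the numbers assigned to the fields.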
Field Descriptions:
- FieldsCount: the number of fields in this file.
- FieldName: name of the field as a UTF-8 String.
- FieldNumber: the field's number. Unlike previous versions of Lucene, fields are numbered explicitly rather than implicitly by their order in the file.
- FieldBits: a byte containing field options.
  - The low-order bit (0x1) is one for indexed fields, and zero for non-indexed fields.
  - The second lowest-order bit (0x2) is one for fields that have term vectors stored, and zero for fields without term vectors.
  - If the third lowest-order bit is set (0x4), offsets are stored into the postings list in addition to positions.
  - The fourth lowest-order bit (0x8) is unused.
  - If the fifth lowest-order bit is set (0x10), norms are omitted for the indexed field.
  - If the sixth lowest-order bit is set (0x20), payloads are stored for the indexed field.
  - If the seventh lowest-order bit is set (0x40), term frequencies and positions are omitted for the indexed field.
  - If the eighth lowest-order bit is set (0x80), positions are omitted for the indexed field.
- DocValuesBits: a byte containing per-document value types. The type is recorded as two four-bit integers, with the high-order bits representing norms options, and the low-order bits representing {@code DocValues} options. Each four-bit integer can be decoded as follows:
  - 0: no DocValues for this field.
  - 1: variable-width signed integers. ({@code Type#VAR_INTS VAR_INTS})
  - 2: 32-bit floating point values. ({@code Type#FLOAT_32 FLOAT_32})
  - 3: 64-bit floating point values. ({@code Type#FLOAT_64 FLOAT_64})
  - 4: fixed-length byte array values. ({@code Type#BYTES_FIXED_STRAIGHT BYTES_FIXED_STRAIGHT})
  - 5: fixed-length dereferenced byte array values. ({@code Type#BYTES_FIXED_DEREF BYTES_FIXED_DEREF})
  - 6: variable-length byte array values. ({@code Type#BYTES_VAR_STRAIGHT BYTES_VAR_STRAIGHT})
  - 7: variable-length dereferenced byte array values. ({@code Type#BYTES_VAR_DEREF BYTES_VAR_DEREF})
  - 8: 16-bit signed integers. ({@code Type#FIXED_INTS_16 FIXED_INTS_16})
  - 9: 32-bit signed integers. ({@code Type#FIXED_INTS_32 FIXED_INTS_32})
  - 10: 64-bit signed integers. ({@code Type#FIXED_INTS_64 FIXED_INTS_64})
  - 11: 8-bit signed integers. ({@code Type#FIXED_INTS_8 FIXED_INTS_8})
  - 12: fixed-length sorted byte array values. ({@code Type#BYTES_FIXED_SORTED BYTES_FIXED_SORTED})
  - 13: variable-length sorted byte array values. ({@code Type#BYTES_VAR_SORTED BYTES_VAR_SORTED})
- Attributes: a key-value map of codec-private attributes.
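As a rough illustration of the FieldBits and DocValuesBits descriptions above, the following sketch decodes both bytes. The masks and the 0 to 13 type codes come from the lists above; the class name FieldFlagsSketch, its method names, the flag labels, and the handling of the unlisted codes 14 and 15 are assumptions made for this example and are not part of Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoder for the two flag bytes of a field entry; not Lucene API. */
final class FieldFlagsSketch {

  // Masks taken from the FieldBits description; 0x08 is unused.
  private static final int[] MASKS = {0x01, 0x02, 0x04, 0x10, 0x20, 0x40, 0x80};
  private static final String[] LABELS = {
    "indexed", "term vectors", "offsets in postings", "omit norms",
    "payloads", "omit term freqs and positions", "omit positions"
  };

  // Type names indexed by the four-bit code from the DocValuesBits description.
  private static final String[] TYPES = {
    "none", "VAR_INTS", "FLOAT_32", "FLOAT_64",
    "BYTES_FIXED_STRAIGHT", "BYTES_FIXED_DEREF",
    "BYTES_VAR_STRAIGHT", "BYTES_VAR_DEREF",
    "FIXED_INTS_16", "FIXED_INTS_32", "FIXED_INTS_64", "FIXED_INTS_8",
    "BYTES_FIXED_SORTED", "BYTES_VAR_SORTED"
  };

  /** Lists the options whose bit is set, in order from low bit to high bit. */
  static List<String> describeFieldBits(byte fieldBits) {
    int bits = fieldBits & 0xFF; // treat the byte as unsigned
    List<String> set = new ArrayList<>();
    for (int i = 0; i < MASKS.length; i++) {
      if ((bits & MASKS[i]) != 0) {
        set.add(LABELS[i]);
      }
    }
    return set;
  }

  /** Maps a four-bit code to its type name; codes 14 and 15 are not listed above. */
  static String typeName(int code) {
    return code < TYPES.length ? TYPES[code] : "unknown(" + code + ")";
  }

  /** High-order four bits: the norms type. */
  static String normsType(byte docValuesBits) {
    return typeName((docValuesBits & 0xFF) >>> 4);
  }

  /** Low-order four bits: the DocValues type. */
  static String docValuesType(byte docValuesBits) {
    return typeName(docValuesBits & 0x0F);
  }
}
```

For example, a FieldBits value of 0x23 describes an indexed field with term vectors and payloads, and a DocValuesBits value of 0xA1 splits into norms code 10 (FIXED_INTS_64) and DocValues code 1 (VAR_INTS). Masking with 0xFF before shifting matters because a Java byte is signed, so 0xA1 would otherwise be sign-extended.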
@lucene.experimental
@deprecated Only for reading old 4.0 and 4.1 segments