Term vectors are stored using two files: a vector data file (extension .tvd) and an index file (extension .tvx).
File formats
A vector data file (extension .tvd). This file stores terms, frequencies, positions, offsets and payloads for every document. When writing a new segment, the writer accumulates data in memory until the buffer used to store terms and payloads grows beyond 4 KB. It then flushes all metadata, terms and positions to disk, using LZ4 compression for terms and payloads and blocks of packed ints (see BlockPackedWriter) for positions.
Here is a more detailed description of the vector data file format:
An index file (extension .tvx).
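
The buffering behaviour described for the vector data file can be illustrated with a minimal sketch. This is not Lucene's writer: the class ChunkBufferSketch and all of its methods are hypothetical names introduced here, and the flush step only records how many documents went into each chunk instead of applying LZ4 compression or packed-int encoding. It assumes a flush happens as soon as the buffered term and payload bytes grow beyond the threshold.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of chunked buffering; not part of Lucene. */
class ChunkBufferSketch {
    // Flush threshold in bytes (4096 would match the 4 KB mentioned in the text).
    private final int chunkSize;
    // Term and payload bytes of documents that have not been flushed yet.
    private final List<byte[]> pendingDocs = new ArrayList<>();
    private int bufferedBytes = 0;
    // Number of documents written in each flushed chunk, in flush order.
    private final List<Integer> flushedDocCounts = new ArrayList<>();

    ChunkBufferSketch(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Buffers one document and flushes once the buffer grows beyond the threshold. */
    void addDocument(byte[] termAndPayloadBytes) {
        pendingDocs.add(termAndPayloadBytes);
        bufferedBytes += termAndPayloadBytes.length;
        if (bufferedBytes > chunkSize) {
            flush();
        }
    }

    /** Writes the pending documents as one chunk and resets the buffer. */
    void flush() {
        if (pendingDocs.isEmpty()) {
            return;
        }
        // A real writer would compress terms and payloads and pack positions here.
        flushedDocCounts.add(pendingDocs.size());
        pendingDocs.clear();
        bufferedBytes = 0;
    }

    List<Integer> flushedDocCounts() {
        return flushedDocCounts;
    }

    int bufferedBytes() {
        return bufferedBytes;
    }
}
```

With a 4096-byte threshold, three documents of 1500 bytes each end up in a single chunk, because the buffer only exceeds the threshold once the third document is added. A final explicit flush() is needed at the end of a segment to write whatever is still pending.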