This class is a semi-external {@link it.unimi.dsi.fastutil.longs.LongList} that MG4J uses by default for accessing term offsets.
When the number of terms in the index grows, storing each offset as a long in an array can consume hundreds of megabytes of memory. Most of this memory is wasted, as it is occupied by offsets of hapax legomena (terms occurring just once in the collection). Instead, this class keeps offsets in their compressed form and provides entry points for random access to each offset. At construction time, entry points are computed with a certain step, which is the number of offsets accessible from each entry point or, equivalently, the maximum number of offsets that must be read to access a given offset.
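The role of the step can be illustrated with a minimal sketch. It is not this class's implementation: the class name `SteppedOffsetList`, the byte-oriented variable-length coding of gaps, and the two entry-point arrays are all assumptions made for illustration. The sketch stores nondecreasing, nonnegative offsets as compressed gaps, records one entry point every `step` offsets, and decodes at most `step` gaps to answer a lookup.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch: offsets kept as compressed gaps, with one entry point every {@code step} offsets. */
final class SteppedOffsetList {
    private final byte[] data;       // gaps between consecutive offsets, 7 bits per byte (assumed coding)
    private final int[] entryPos;    // byte position in data where the gap of offset k * step starts
    private final long[] entryBase;  // value of the offset preceding index k * step (0 for k == 0)
    private final int step;
    private final int size;

    SteppedOffsetList(long[] offsets, int step) {
        if (step <= 0) throw new IllegalArgumentException("step must be positive");
        this.step = step;
        this.size = offsets.length;
        int entries = (size + step - 1) / step;
        entryPos = new int[entries];
        entryBase = new long[entries];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long prev = 0;
        for (int i = 0; i < size; i++) {
            if (offsets[i] < prev) throw new IllegalArgumentException("offsets must be nonnegative and nondecreasing");
            if (i % step == 0) {
                // Record an entry point: where to start decoding, and from which base value.
                entryPos[i / step] = out.size();
                entryBase[i / step] = prev;
            }
            long gap = offsets[i] - prev;
            // Emit the gap 7 bits at a time, low bits first; the high bit marks "more bytes follow".
            while (gap >= 0x80) {
                out.write((int) (gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write((int) gap);
            prev = offsets[i];
        }
        data = out.toByteArray();
    }

    /** Returns the offset at {@code index}, decoding at most {@code step} gaps. */
    long getLong(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index: " + index);
        int k = index / step;
        int pos = entryPos[k];
        long value = entryBase[k];
        for (int i = k * step; i <= index; i++) {
            long gap = 0;
            int shift = 0;
            int b;
            do {
                b = data[pos++] & 0xFF;
                gap |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            value += gap;
        }
        return value;
    }

    int size() {
        return size;
    }
}
```

In this sketch a smaller step means more entry points (more memory) and fewer gaps decoded per lookup, while a larger step trades lookup time for space.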
Warning: This class is not thread safe, and access to it must be synchronised in a multithreaded environment.
@author Fabien Campagne
@author Sebastiano Vigna