Instances of this class represent a single document. A document provides access to one or more fields, which represent units of information that should be indexed separately.
Each field is accessible by a call to {@link #content(int)}. Note, however, that unless specified otherwise, field content must be accessed in increasing order of field index. You can skip some fields, but the contract of this class does not require that you can access fields in random order (although implementations may provide this feature). Moreover, the data provided by a call to {@link #content(int)} (e.g., a {@link java.io.Reader} for {@link it.unimi.dsi.mg4j.document.DocumentFactory.FieldType#TEXT TEXT} fields) may become invalid at the next call (similarly to the behaviour of {@link it.unimi.dsi.mg4j.document.DocumentCollection#document(int)}). The same holds for {@link #wordReader(int)}.
After obtaining a document, it is your responsibility to {@linkplain java.io.Closeable#close() close} it.
It is advisable, although not strictly required, that documents have a toString() equal to their title.
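The access contract described above (increasing field order, content that may be invalidated by the next call, and the caller's duty to close the document) can be illustrated with a minimal sketch. It does not use the real interface: SimpleDocument, InMemoryDocument, drain and readFields are hypothetical names introduced only for this example, and the in-memory implementation is an assumption made so that the code runs on its own. The point is the calling pattern: consume each field's content fully before asking for the next one, and close the document when done.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class DocumentAccessSketch {

    /** Hypothetical, simplified stand-in for the document interface described above. */
    public interface SimpleDocument extends Closeable {
        Object content(int field) throws IOException;
    }

    /** Toy implementation (an assumption for this sketch) that enforces increasing field order. */
    public static final class InMemoryDocument implements SimpleDocument {
        private final String[] fields;
        private int lastField = -1;
        private boolean closed;

        public InMemoryDocument(String... fields) {
            this.fields = fields;
        }

        @Override
        public Object content(int field) throws IOException {
            if (closed) throw new IOException("document already closed");
            if (field <= lastField) {
                throw new IllegalStateException("fields must be accessed in increasing order");
            }
            lastField = field;
            return new StringReader(fields[field]);
        }

        @Override
        public void close() {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Reads a reader to exhaustion, so nothing depends on it after the next content() call. */
    public static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) sb.append(buffer, 0, n);
        return sb.toString();
    }

    /** Reads the given fields (which must be in increasing order) and always closes the document. */
    public static List<String> readFields(SimpleDocument document, int... fields) throws IOException {
        List<String> result = new ArrayList<>();
        try (SimpleDocument d = document) {
            for (int field : fields) {
                // Consume the content completely before requesting the next field.
                result.add(drain((Reader) d.content(field)));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        InMemoryDocument doc = new InMemoryDocument("A title", "Some body text", "An author");
        // Field 1 is skipped, which the contract allows; going back to it afterwards is not guaranteed to work.
        System.out.println(readFields(doc, 0, 2)); // prints [A title, An author]
        System.out.println(doc.isClosed());        // prints true
    }
}
```

Copying each field into a String before moving on is a deliberately conservative choice: it avoids holding a Reader that a later call may have invalidated. An implementation that documents random access could relax this.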