org.apache.uima.cas.impl.BinaryCasSerDes6.ReuseInfo
User callable serialization and deserialization of the CAS in a compressed Binary Format This serializes/deserializes the state of the CAS. It has the capability to map type systems, so the sending and receiving type systems do not have to be the same. - types and features are matched by name, and features must have the same range (slot kind) - types and/or features in one type system not in the other are skipped over Header specifies to reader the format, and the compression level. How to Serialize: 1) create an instance of this class a) if doing a delta serialization, pass in the mark and a ReuseInfo object that was created after deserializing this CAS initially. b) if serializaing to a target with a different type system, pass the target's type system impl object so the serialization can filter the types for the target. 2) call serialize() to serialize the CAS 3) If doing serialization to a target from which you expect to receive back a delta CAS, create a ReuseInfo object from this object and reuse it for deserializing the delta CAS. TypeSystemImpl objects are lazily augmented by customized TypeInfo instances for each type encountered in serializing or deserializing. These are preserved for future calls, so their setup / initialization is only needed the first time. TypeSystemImpl objects are also lazily augmented by typeMappers for individual different target typesystems; these too are preserved and reused on future calls. Compressed Binary CASes are designed to be "self-describing" - The format of the compressed binary CAS, including version info, is inserted at the beginning so that a proper deserialization method can be automatically chosen. Compressed Binary format implemented by this class supports type system mapping. Types in the source which are not in the target (or vice versa) are omitted. Types with "extra" features have their extra features omitted (or on deserialization, they are set to their default value - null, or 0, etc.). Feature slots which hold references to types not in the target type system are replaced with 0 (null). How to Deserialize: 1) get an appropriate CAS to deserialize into. For delta CAS, it does not have to be empty, but it must be the originating CAS from which the delta was produced. 2) If the case is one where the target type system== the CAS's, and the serialized for is not Delta, then, call aCAS.reinit(source). Otherwise, create an instance of this class -> xxx a) Assuming the object being deserialized has a different type system, set the "target" type system to the TypeSystemImpl instance of the object being deserialized. a) if delta deserializing, pass in the ReuseInfo object created when the CAS was serialized 3) call xxx.deserialize(inputStream) Compression/Decompression Works in two stages: application of Zip/Unzip to particular sub-collections of CAS data, grouped according to similar data distribution collection of like kinds of data (to make the zipping more effective) There can be up to ~20 of these collections, such as control info, float-exponents, string chars Deserialization: Read all bytes, create separate ByteArrayInputStreams for each segment create appropriate unzip data input streams for these Slow but expensive data: extra type system info - lazily created and added to shared TypeSystemImpl object set up per type actually referenced mapper for type system - lazily created and added to shared TypeSystemImpl object in identity-map cache (size limit = 10 per source type system?) - key is target typesystemimpl. Defaulting: flags: doMeasurements, compressLevel, CompressStrategy Per serialize call: cas, output, [target ts], [mark for delta] Per deserialize call: cas, input, [target ts], whether-to-save-info-for-delta-serialization CASImpl has instance method with defaulting args for serialization. CASImpl has reinit which works with compressed binary serialization objects if no type mapping If type mapping, (new BinaryCasSerDes6(cas, marker-or-null, targetTypeSystem (for stream being deserialized), reuseInfo-or-null) .deserialize(in-stream) Use Cases, filtering and delta * * (de)serialize * filter? * delta? * Use case * * serialize * N * N * Saving a Cas, * * * * sending Cas to service with identical ts * * serialize * Y * N * sending Cas to service with * * * * different ts (a guaranteed subset) * * serialize * N * Y * returning Cas to client * * * * uses info saved when deserializing * * * * (?? saving just a delta to disk??) * * serialize * Y * Y * NOT SUPPORTED (not needed) * * deserialize * N * N * reading/(receiving) CAS, identical TS * * deserialize * Y * N * reading/receiving CAS, different TS * * * * ts not guaranteed to be superset * * * * for "reading" case. * * deserialize * N * Y * receiving CAS, identical TS * * * * uses info saved when serializing * * deserialize * Y * Y * receiving CAS, different TS (tgt a feature subset) * * * * uses info saved when serializing *