A ResultSpecification is a set of desired outputs from an Analysis Engine or Annotator. Each output is a reference to either a {@link org.apache.uima.cas.Type} or a {@link org.apache.uima.cas.Feature}. Annotator implementations may, but are not required to, check the ResultSpecification passed as a parameter to their {@link AnalysisComponent#setResultSpecification(ResultSpecification)} method, and produce only those Types and Features that are part of the ResultSpecification. Annotators can call the {@link #containsType(String)} and {@link #containsFeature(String)} methods to determine which types and features belong to this ResultSpecification and should be produced.
The containsType(...) method also returns true for subtypes of types specified in the result specification. The containsFeature(...) method also returns true for all features of a type if the type's allAnnotatorFeatures flag is set. As a corner case, containsFeature(...) may then return true even if the feature is not actually part of the type system.
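The allAnnotatorFeatures rule can be illustrated with a minimal plain-Java sketch. It does not use the UIMA classes: the class name ContainsFeatureSketch, the method signature, and the example.* names are hypothetical. The only assumption about UIMA itself is that a fully qualified feature name has the form TypeName:featureName. Subtype handling is left out to keep the example short.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical model of the containsFeature rule; not the UIMA implementation.
class ContainsFeatureSketch {

    /**
     * declaredFeatures: fully qualified feature names added explicitly.
     * allAnnotatorFeatures: type name -> flag, for types added to the specification.
     */
    static boolean containsFeature(String fullFeatureName,
                                   Set<String> declaredFeatures,
                                   Map<String, Boolean> allAnnotatorFeatures) {
        if (declaredFeatures.contains(fullFeatureName)) {
            return true; // the feature was requested explicitly
        }
        int sep = fullFeatureName.indexOf(':');
        if (sep < 0) {
            return false; // not of the form TypeName:featureName
        }
        String typeName = fullFeatureName.substring(0, sep);
        // With the flag set, any feature name on that type is reported as contained,
        // even one that no type system defines (the corner case described above).
        return allAnnotatorFeatures.getOrDefault(typeName, false);
    }

    public static void main(String[] args) {
        Set<String> features = Set.of("example.Person:name");
        Map<String, Boolean> flags = Map.of("example.Entity", true, "example.Person", false);
        System.out.println(containsFeature("example.Person:name", features, flags));   // true
        System.out.println(containsFeature("example.Person:age", features, flags));    // false
        System.out.println(containsFeature("example.Entity:anything", features, flags)); // true
    }
}
```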
ResultSpecifications are language-enabled: different values can be set and returned based on an ISO language identifier. There are two styles of the get and add methods that add these specifications to an instance of a ResultSpecification: one takes an additional parameter specifying the language(s), the other does not. Using the one without the language parameter is equivalent to using the "x-unspecified" language. The methods that add ResultSpecifications can do this for multiple languages at once because the language parameter is an array of strings. The methods that retrieve a ResultSpecification specify one particular language.
Result specifications for particular types or features have an associated set of languages for which they are set. That associated set of languages can include the "x-unspecified" language; if it does, then a query for that type or feature for any language will match.
If a type or feature's set of languages does not include "x-unspecified", then a query using "x-unspecified" (either as the language passed, or by default, if no language is passed) returns false.
The language set of a result specification entry may contain languages with country codes, such as zh-cn. If an entry has only the zh-cn language specified, a query for zh-cn matches, but a query for zh does not.
Conversely, an entry whose language set contains just zh matches queries that specify zh-cn as the language, because a result specification of zh is taken to mean the language zh regardless of the country.
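The language matching rules above can be summarized in a small plain-Java sketch. This is a model of the documented behavior, not the UIMA implementation: LanguageMatchSketch and matches are hypothetical names, and the sketch assumes language identifiers are already lowercase and use a hyphen before the country code.

```java
import java.util.Set;

// Hypothetical model of the language matching rules; not the UIMA implementation.
class LanguageMatchSketch {
    static final String UNSPECIFIED = "x-unspecified";

    /** Returns true if an entry set for entryLanguages matches a query for queryLanguage. */
    static boolean matches(Set<String> entryLanguages, String queryLanguage) {
        // No language passed is treated the same as "x-unspecified".
        String query = (queryLanguage == null) ? UNSPECIFIED : queryLanguage;

        if (entryLanguages.contains(UNSPECIFIED)) {
            return true; // the entry applies to every language
        }
        if (entryLanguages.contains(query)) {
            return true; // exact match, for example zh-cn against zh-cn
        }
        // An entry for "zh" also covers "zh-cn": retry with the country code removed.
        int dash = query.indexOf('-');
        if (!query.equals(UNSPECIFIED) && dash > 0) {
            return entryLanguages.contains(query.substring(0, dash));
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches(Set.of("zh-cn"), "zh"));        // false
        System.out.println(matches(Set.of("zh"), "zh-cn"));        // true
        System.out.println(matches(Set.of("en"), null));           // false
        System.out.println(matches(Set.of(UNSPECIFIED), "fr"));    // true
    }
}
```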
Some of the methods that change the result specification replace the language(s); others merge the language(s) with any existing specification already present in the result specification for the particular type or feature.
Prior to any querying using the containsType or containsFeature methods, the type system in use for this result specification may be specified by calling setTypeSystem(typeSystem). If a type system is available, the results of the containsType and containsFeature methods will be true for subtypes of the original types. The language specification in the ResultSpecification for a derived subtype is computed as the union of all of the languages specified for its supertypes. Likewise, the allAnnotatorFeatures flag of a subtype is the logical union of this property for all of its supertypes.
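The union computation for subtypes might be sketched as follows in plain Java. The type hierarchy is modeled as a simple child-to-parent map rather than a real TypeSystem, and all names (SubtypeLanguagesSketch, effectiveLanguages, effectiveAllFeatures, the example.* types) are hypothetical. The sketch also assumes that a type's own entry, if any, takes part in the union along with its supertypes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of how subtype settings are derived; not the UIMA implementation.
class SubtypeLanguagesSketch {

    /** Union of the languages declared for typeName and for each of its supertypes. */
    static Set<String> effectiveLanguages(String typeName,
                                          Map<String, String> parentOf,
                                          Map<String, Set<String>> declaredLanguages) {
        Set<String> result = new TreeSet<>();
        // Walk up the hierarchy until a type with no recorded parent is reached.
        for (String t = typeName; t != null; t = parentOf.get(t)) {
            result.addAll(declaredLanguages.getOrDefault(t, Set.of()));
        }
        return result;
    }

    /** Logical OR of the allAnnotatorFeatures flag over typeName and its supertypes. */
    static boolean effectiveAllFeatures(String typeName,
                                        Map<String, String> parentOf,
                                        Map<String, Boolean> declaredFlag) {
        for (String t = typeName; t != null; t = parentOf.get(t)) {
            if (declaredFlag.getOrDefault(t, false)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = Map.of("example.Person", "example.Entity");
        Map<String, Set<String>> languages =
                Map.of("example.Entity", Set.of("en"), "example.Person", Set.of("de"));
        Map<String, Boolean> flags = Map.of("example.Entity", true);

        System.out.println(effectiveLanguages("example.Person", parentOf, languages)); // [de, en]
        System.out.println(effectiveAllFeatures("example.Person", parentOf, flags));   // true
    }
}
```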
The computation to enable this behavior for subtypes is called "compiling" the result specification, and is done automatically, but only when needed, and when there is an available type system specified for this result specification object, using the setTypeSystem or compile methods. The result of this computation is "cached". If the result specification is subsequently updated in such a way as to make this computation invalid, the cache is invalidated, and another compile operation will be transparently done, when and if needed.
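The compile-on-demand behavior follows a common lazy caching pattern, sketched below in plain Java. This is an illustration of the pattern only, not the UIMA implementation: LazyCompileSketch, addType, and compileCount are hypothetical, the type system is again modeled as a child-to-parent map, and the sketch assumes the type system is always supplied up front.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of lazy compilation with cache invalidation.
class LazyCompileSketch {
    private final Map<String, String> parentOf;   // stand-in for a type system
    private final Set<String> declaredTypes = new HashSet<>();
    private Set<String> compiled;                 // null means the cache is invalid
    int compileCount = 0;                         // exposed only to make the behavior visible

    LazyCompileSketch(Map<String, String> parentOf) {
        this.parentOf = parentOf;
    }

    void addType(String typeName) {
        declaredTypes.add(typeName);
        compiled = null; // any update invalidates the cached result
    }

    boolean containsType(String typeName) {
        if (compiled == null) {
            compile(); // compile only when a query actually needs it
        }
        return compiled.contains(typeName);
    }

    private void compile() {
        compileCount++;
        Set<String> result = new HashSet<>(declaredTypes);
        // A known type is covered when any of its supertypes was declared.
        for (String type : parentOf.keySet()) {
            for (String t = parentOf.get(type); t != null; t = parentOf.get(t)) {
                if (declaredTypes.contains(t)) {
                    result.add(type);
                    break;
                }
            }
        }
        compiled = result;
    }

    public static void main(String[] args) {
        LazyCompileSketch spec = new LazyCompileSketch(Map.of("example.Person", "example.Entity"));
        spec.addType("example.Entity");
        System.out.println(spec.containsType("example.Person")); // true
        System.out.println(spec.containsType("example.Person")); // true, served from the cache
        System.out.println(spec.compileCount);                   // 1
    }
}
```

Repeated queries reuse the cached result; only an update between queries triggers another compile.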