Examples of org.apache.uima.analysis_engine.AnalysisEngine

org.apache.uima.analysis_engine.AnalysisEngine
An Analysis Engine is a component responsible for analyzing unstructured information, discovering and representing semantic content. Unstructured information includes, but is not restricted to, text documents.
An AnalysisEngine operates on an "analysis structure" (implemented by {@link org.apache.uima.cas.CAS}). The CAS contains the artifact to be processed as well as semantic information already inferred from that artifact. The AnalysisEngine analyzes this information and adds new information to the CAS.
To create an instance of an Analysis Engine, an application should call {@link org.apache.uima.UIMAFramework#produceAnalysisEngine(ResourceSpecifier)}.
A typical application interacts with the Analysis Engine interface as follows:
1. Call {@link #newCAS()} to create a new Common Analysis System appropriate for thisAnalysisEngine.
2. Use the {@link CAS} interface to populate the CAS with the artifact to beanalyzed any information known about this document (e.g. the language of a text document).
3. Optionally, create a {@link org.apache.uima.analysis_engine.ResultSpecification} thatidentifies the results you would like this AnalysisEngine to generate (e.g. people, places, and dates), and call the {#link {@link #setResultSpecification(ResultSpecification)} method.
4. Call {@link #process(CAS)} - the AnalysisEngine will perform its analysis.
5. Retrieve the results from the {@link CAS}.
6. Call {@link CAS#reset()} to clear out the CAS and prepare for processing anew artifact.
7. Repeat steps 2 through 6 for each artifact to be processed.
Important: It is highly recommended that you reuse CAS objects rather than calling newCAS() prior to each analysis. This is because CAS objects may be expensive to create and may consume a significant amount of memory.
Instead of using the {@link CAS} interface, applications may wish to use the Java-object-based{@link JCas} interface. In that case, the call to newCAS from step 1 above wouldbe replaced by {@link #newJCas()}, and the {@link #process(JCas)} method would be used.
Analysis Engine implementations may or may not be capable of simultaneously processing multiple documents in a multithreaded environment. See the documentation associated with the implementation or factory method (e.g. ( {@link org.apache.uima.UIMAFramework#produceAnalysisEngine(ResourceSpecifier)}) that you are using.

      outFile.delete();
      
      //test aggregate that uses ParallelStep
      AnalysisEngineDescription desc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(
        new XMLInputSource(JUnitExtension.getFile("TextAnalysisEngineImplTest/AggregateForParallelStepTest.xml")));
      AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(desc);
      CAS cas = ae.newCAS();
      cas.setDocumentText("new test");
      ae.process(cas);
      assertEquals("new test", TestAnnotator.lastDocument);
      assertEquals("new test", TestAnnotator2.lastDocument);
      cas.reset();
      
    } catch (Exception e) {

View Full Code Here

   * 
   * @param aTaeDesc
   *          description of TextAnalysisEngine to test
   */
  protected void _testProcess(AnalysisEngineDescription aTaeDesc) throws UIMAException {
    AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(aTaeDesc);
    CAS tcas = ae.newCAS();


    // process(CAS,ResultSpecification)
    ResultSpecification resultSpec = new ResultSpecification_impl(tcas.getTypeSystem());
    resultSpec.addResultType("NamedEntity", true);

View Full Code Here


    _testProcessInner(ae, tcas, resultSpec, resultSpec);
  }
  
  protected void _testProcess(AnalysisEngineDescription aTaeDesc, String[] languages) throws UIMAException {
    AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(aTaeDesc);
    CAS tcas = ae.newCAS();


    // process(CAS,ResultSpecification)
    ResultSpecification resultSpec = new ResultSpecification_impl(tcas.getTypeSystem());
    resultSpec.addResultType("NamedEntity", true);   // includes subtypes Person, Sentence, Place, Paragraph
                                                     // sets for lang = x-unspecified

View Full Code Here

      aggAe.collectionProcessComplete(new ProcessTrace_impl());
      
      //test that fixedFlow order is used
      File descFile = JUnitExtension.getFile("TextAnalysisEngineImplTest/AggregateForCollectionProcessCompleteTest.xml");
      AnalysisEngineDescription cpcTestDesc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(new XMLInputSource(descFile));
      AnalysisEngine cpcTestAe = UIMAFramework.produceAnalysisEngine(cpcTestDesc);
      cpcTestAe.collectionProcessComplete();
      assertEquals("One", AnnotatorForCollectionProcessCompleteTest.lastValue);
    } catch (Exception e) {
      JUnitExtension.handleException(e);
    }
  }

View Full Code Here

      // primitive
      AnalysisEngineDescription segmenterDesc = UIMAFramework.getXMLParser()
              .parseAnalysisEngineDescription(
                      new XMLInputSource(JUnitExtension
                              .getFile("TextAnalysisEngineImplTest/NewlineSegmenter.xml")));
      AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(segmenterDesc);
      CAS cas = ae.newCAS();
      cas.setDocumentText("Line one\nLine two\nLine three");
      CasIterator iter = ae.processAndOutputNewCASes(cas);
      assertTrue(iter.hasNext());
      CAS outCas = iter.next();
      assertEquals("Line one", outCas.getDocumentText());
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line two", outCas.getDocumentText());
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line three", outCas.getDocumentText());
      outCas.release();
      assertFalse(iter.hasNext());


      // aggregate
      AnalysisEngineDescription aggSegDesc = UIMAFramework.getXMLParser()
              .parseAnalysisEngineDescription(
                      new XMLInputSource(JUnitExtension
                              .getFile("TextAnalysisEngineImplTest/AggregateWithSegmenter.xml")));
      ae = UIMAFramework.produceAnalysisEngine(aggSegDesc);
      cas = ae.newCAS();
      cas.setDocumentText("Line one\nLine two\nLine three");
      iter = ae.processAndOutputNewCASes(cas);
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line one", outCas.getDocumentText());
      assertEquals("Line one", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line two", outCas.getDocumentText());
      assertEquals("Line two", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line three", outCas.getDocumentText());
      assertEquals("Line three", TestAnnotator.lastDocument);
      outCas.release();
      assertFalse(iter.hasNext());
      // Annotator should NOT get the original CAS according to the default flow
      assertEquals("Line three", TestAnnotator.lastDocument);


      // nested aggregate
      AnalysisEngineDescription nestedAggSegDesc = UIMAFramework
              .getXMLParser()
              .parseAnalysisEngineDescription(
                      new XMLInputSource(
                              JUnitExtension
                                      .getFile("TextAnalysisEngineImplTest/AggregateContainingAggregateSegmenter.xml")));
      ae = UIMAFramework.produceAnalysisEngine(nestedAggSegDesc);
      cas = ae.newCAS();
      cas.setDocumentText("Line one\nLine two\nLine three");
      iter = ae.processAndOutputNewCASes(cas);
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line one", outCas.getDocumentText());
      assertEquals("Line one", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line two", outCas.getDocumentText());
      assertEquals("Line two", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line three", outCas.getDocumentText());
      assertEquals("Line three", TestAnnotator.lastDocument);
      outCas.release();
      assertFalse(iter.hasNext());
      // Annotator should NOT get the original CAS according to the default flow
      assertEquals("Line three", TestAnnotator.lastDocument);


      // two segmenters
      AnalysisEngineDescription twoSegDesc = UIMAFramework.getXMLParser()
              .parseAnalysisEngineDescription(
                      new XMLInputSource(JUnitExtension
                              .getFile("TextAnalysisEngineImplTest/AggregateWith2Segmenters.xml")));
      ae = UIMAFramework.produceAnalysisEngine(twoSegDesc);
      cas = ae.newCAS();
      cas.setDocumentText("One\tTwo\nThree\tFour");
      iter = ae.processAndOutputNewCASes(cas);
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("One", outCas.getDocumentText());
      assertEquals("One", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Two", outCas.getDocumentText());
      assertEquals("Two", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Three", outCas.getDocumentText());
      assertEquals("Three", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Four", outCas.getDocumentText());
      assertEquals("Four", TestAnnotator.lastDocument);
      outCas.release();
      assertFalse(iter.hasNext());
      // Annotator should NOT get the original CAS according to the default flow
      assertEquals("Four", TestAnnotator.lastDocument);


      // dropping segments
      aggSegDesc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(
              new XMLInputSource(JUnitExtension
                      .getFile("TextAnalysisEngineImplTest/AggregateSegmenterForDropTest.xml")));
      ae = UIMAFramework.produceAnalysisEngine(aggSegDesc);
      cas = ae.newCAS();
      cas.setDocumentText("Line one\nDROP\nLine two\nDROP\nLine three");
      // results should be the same as the first aggregate segmenter test.
      // segmetns whose text is DROP should not be output.
      iter = ae.processAndOutputNewCASes(cas);
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line one", outCas.getDocumentText());
      assertEquals("Line one", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line two", outCas.getDocumentText());
      assertEquals("Line two", TestAnnotator.lastDocument);
      outCas.release();
      assertTrue(iter.hasNext());
      outCas = iter.next();
      assertEquals("Line three", outCas.getDocumentText());
      assertEquals("Line three", TestAnnotator.lastDocument);
      outCas.release();
      assertFalse(iter.hasNext());
      // Annotator should NOT get the original CAS according to the default flow
      assertEquals("Line three", TestAnnotator.lastDocument);
      
      //with ParallelStep
      AnalysisEngineDescription desc = UIMAFramework.getXMLParser().parseAnalysisEngineDescription(
        new XMLInputSource(JUnitExtension.getFile("TextAnalysisEngineImplTest/AggregateForParallelStepCasMultiplierTest.xml")));
      ae = UIMAFramework.produceAnalysisEngine(desc);
      cas.reset();
      cas.setDocumentText("One\tTwo\nThree\tFour");
      iter = ae.processAndOutputNewCASes(cas);
      Set<String> expectedOutputs = new HashSet<String>();
      expectedOutputs.add("One");
      expectedOutputs.add("Two\nThree");
      expectedOutputs.add("Four");
      expectedOutputs.add("One\tTwo");
      expectedOutputs.add("Three\tFour");
      while (iter.hasNext()) {
        outCas = iter.next();
        assertTrue(expectedOutputs.remove(outCas.getDocumentText()));        
        outCas.release();
      }
      assertTrue(expectedOutputs.isEmpty());


      
      // test aggregate with 2 AEs sharing resource manager
      AnalysisEngineDescription aggregateSegDesc = UIMAFramework.getXMLParser()
              .parseAnalysisEngineDescription(
                      new XMLInputSource(JUnitExtension
                              .getFile("TextAnalysisEngineImplTest/AggregateWithSegmenter.xml")));
      
      ResourceManager rsrcMgr = UIMAFramework.newDefaultResourceManager();
      Map<String, Object> params = new HashMap<String, Object>();
      AnalysisEngine ae1 = UIMAFramework.produceAnalysisEngine(aggregateSegDesc, rsrcMgr, params);
      AnalysisEngine ae2 = UIMAFramework.produceAnalysisEngine(aggregateSegDesc, rsrcMgr, params);
      
      // start with testing first ae
      CAS cas1 = ae1.newCAS();
      cas1.setDocumentText("Line one\nLine two\nLine three");
      CasIterator iter1 = ae1.processAndOutputNewCASes(cas1);
      assertTrue(iter1.hasNext());
      CAS outCas1 = iter1.next();
      assertEquals("Line one", outCas1.getDocumentText());
     
      // now test second ae
      CAS cas2 = ae2.newCAS();
      cas2.setDocumentText("Line one\nLine two\nLine three");
      CasIterator iter2 = ae2.processAndOutputNewCASes(cas2);
      assertTrue(iter2.hasNext());
      CAS outCas2 = iter2.next();
      assertEquals("Line one", outCas2.getDocumentText());
      outCas2.release();
      assertTrue(iter2.hasNext());

View Full Code Here

    if (o instanceof ResourceCreationSpecifier) {
      return ((ResourceCreationSpecifier) o).getMetaData();
    }
    if (o instanceof URISpecifier) {
      URISpecifier uriSpec = ((URISpecifier) o);
      AnalysisEngine ae = null;
      try {
        setVnsHostAndPort(o);
        ae = UIMAFramework.produceAnalysisEngine(uriSpec);
      } catch (ResourceInitializationException e) {
        return null;
      }
      AnalysisEngineMetaData aemd = ae.getAnalysisEngineMetaData();
      ae.destroy();
      return aemd;
    }
    
    throw new InternalErrorCDE("invalid call");
  }

View Full Code Here

      if (mInitParams == null)
        mInitParams = new HashMap<String, Object>();
      UimaContextAdmin childContext = aParentContext.createChild(key, sofamap);
      mInitParams.put(Resource.PARAM_UIMA_CONTEXT, childContext);


      AnalysisEngine ae;


      // if running in "validation mode", don't try to connect to any services
      if (mInitParams.containsKey(AnalysisEngineImplBase.PARAM_VERIFICATION_MODE)
              && !(spec instanceof ResourceCreationSpecifier)) {
        // but we need placeholder entries in maps to satisfy later checking
        ae = new DummyAnalysisEngine();
      } else {
        // construct an AnalysisEngine - initializing it with the parameters
        // passed to this ASB's initialize method
        ae = UIMAFramework.produceAnalysisEngine(spec, mInitParams);
      }


      // add the Analysis Engine and its metadata to the appropriate lists


      // add AnlaysisEngine to maps based on key
      mComponentAnalysisEngineMap.put(key, ae);
      mComponentAnalysisEngineMetaDataMap.put(key, ae.getAnalysisEngineMetaData());
    }


    // make Maps unmodifiable
    mComponentAnalysisEngineMap = Collections.unmodifiableMap(mComponentAnalysisEngineMap);
    mComponentAnalysisEngineMetaDataMap = Collections

View Full Code Here

          // repeat until we reach a FinalStep
          while (!(nextStep instanceof FinalStep)) {
            //Simple Step
            if (nextStep instanceof SimpleStep) {
              String nextAeKey = ((SimpleStep) nextStep).getAnalysisEngineKey();
              AnalysisEngine nextAe = (AnalysisEngine) mComponentAnalysisEngineMap.get(nextAeKey);
              if (nextAe != null) {
                //check if we have to set result spec, to support capability language flow
                if (nextStep instanceof SimpleStepWithResultSpec) {
                  ResultSpecification rs = ((SimpleStepWithResultSpec)nextStep).getResultSpecification();
                  if (rs != null) {
                    nextAe.setResultSpecification(rs);
                  }
                }
                // invoke next AE in flow
                CasIterator casIter = null;
                CAS outputCas = null; //used if the AE we call outputs a new CAS
                try {
                  casIter = nextAe.processAndOutputNewCASes(cas);
                  if (casIter.hasNext()) {
                    outputCas = casIter.next();
                  }
                }
                catch(Exception e) {
                  //ask the FlowController if we should continue
                  //TODO: should this be configurable?
                  if (!flow.continueOnFailure(nextAeKey, e)) {
                    throw e;
                  }
                  else {
                    UIMAFramework.getLogger(CLASS_NAME).logrb(Level.FINE, CLASS_NAME.getName(), "processUntilNextOutputCas",
                            LOG_RESOURCE_BUNDLE, "UIMA_continuing_after_exception__FINE", e);
                  }
                }
                if (outputCas != null) // new CASes are output
                {
                  // push the CasIterator, original CAS, and Flow onto a stack so we
                  // can get the other output CASes and the original CAS later
                  casIteratorStack.push(new StackFrame(casIter, cas, flow, nextAeKey));
                  // compute Flow for the output CAS
                  flow = flow.newCasProduced(outputCas, nextAeKey);
                  // now route the output CAS through the flow
                  cas = outputCas;
                  activeCASes.add(cas);
                } else {
                  // no new CASes are output; this cas is done being processed
                  // by that AnalysisEngine so clear the componentInfo
                  cas.setCurrentComponentInfo(null);
                }
              } else {
                throw new AnalysisEngineProcessException(
                        AnalysisEngineProcessException.UNKNOWN_ID_IN_SEQUENCE,
                        new Object[] { nextAeKey });
              }
            } 
            //ParallelStep (TODO: refactor out common parts with SimpleStep?)
            else if (nextStep instanceof ParallelStep) {
              //create modifiable list of destinations 
              List<String> destinations = new LinkedList<String>(((ParallelStep)nextStep).getAnalysisEngineKeys());
              //iterate over all destinations, removing them from the list as we go
              while (!destinations.isEmpty()) {
                String nextAeKey = destinations.get(0);
                destinations.remove(0); 
                //execute this step as we would a single step
                AnalysisEngine nextAe = (AnalysisEngine) mComponentAnalysisEngineMap.get(nextAeKey);
                if (nextAe != null) {
                  // invoke next AE in flow
                  CasIterator casIter = null;
                  CAS outputCas = null; //used if the AE we call outputs a new CAS
                  try {
                    casIter = nextAe.processAndOutputNewCASes(cas);
                    if (casIter.hasNext()) {
                      outputCas = casIter.next();
                    }
                  }
                  catch(Exception e) {

View Full Code Here

    ResourceSpecifier spec = UIMAFramework.getXMLParser().parseResourceSpecifier(
            new XMLInputSource(aDescriptorFile));
    if (spec instanceof AnalysisEngineDescription) {
      return CasCreationUtils.createCas((AnalysisEngineDescription) spec);
    } else {
      AnalysisEngine currentAe = UIMAFramework.produceAnalysisEngine(spec);
      return CasCreationUtils.createCas(currentAe.getAnalysisEngineMetaData());
    }
  }

View Full Code Here

      progressMonitor.setProgress(1);


      // instantiate AE
      // keep this a local variable - so it doesn't hang on to the ae object
      //   preventing gc (some ae objects are huge)
      AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(aggDesc);
      mCPM.setAnalysisEngine(ae);


      progressMonitor.setProgress(2);


      // register callback listener
      mCPM.addStatusCallbackListener(DocumentAnalyzer.this);


      // create a CAS, including all types from all components, that
      // we'll later use for deserializing XCASes
      List descriptorList = new ArrayList();
      descriptorList.add(collectionReaderDesc);
      descriptorList.add(ae.getMetaData());
      descriptorList.add(casConsumerDesc);
      cas = CasCreationUtils.createCas(descriptorList);
      currentTypeSystem = cas.getTypeSystem();


      // save AE output types for later use in configuring viewer

View Full Code Here

0 1 2 3 4 5 6 7 8 9

TOP

Related Classes of org.apache.uima.analysis_engine.AnalysisEngine

dkpro.similarity.example.WithDKPro

net.sf.nlpshell.MainUima

org.apache.clerezza.uima.utils.AEProvider

org.apache.clerezza.uima.utils.AEProviderTest

org.apache.ctakes.assertion.eval.AssertionEvalBasedOnModifier

org.apache.ctakes.assertion.eval.AssertionEvaluation

org.apache.ctakes.constituency.parser.ae.ParserEvaluationAnnotator

org.apache.ctakes.constituency.parser.util.CommandLineParserUtil

org.apache.ctakes.core.ae.SHARPKnowtatorXMLReader

org.apache.ctakes.core.ae.SimpleSegmentAnnotatorTests

All source code are property of their respective owners. Java is a trademark of Sun Microsystems, Inc and owned by ORACLE Inc. Contact coftware#gmail.com.