Examples of Tokenizer

  • org.apache.jena.riot.tokens.Tokenizer
  • org.apache.lucene.analysis.Tokenizer
    A Tokenizer is a TokenStream whose input is a Reader.

    This is an abstract class.

    NOTE: Subclasses must override {@link #incrementToken()} if the new TokenStream API is used and {@link #next(Token)} or {@link #next()} if the old TokenStream API is used.

    NOTE: Subclasses overriding {@link #incrementToken()} must call {@link AttributeSource#clearAttributes()} before setting attributes. Subclasses overriding {@link #next(Token)} must call {@link Token#clear()} before setting Token attributes.

  • org.apache.myfaces.trinidadinternal.el.Tokenizer
    Converts an EL expression into tokens. @author The Oracle ADF Faces Team
  • org.apache.uima.lucas.indexer.Tokenizer
  • org.crsh.cli.impl.tokenizer.Tokenizer
  • org.eclipse.orion.server.cf.manifest.v2.Tokenizer
  • org.eclipse.osgi.framework.internal.core.Tokenizer
    Simple tokenizer class. Used to parse data.
  • org.exist.storage.analysis.Tokenizer
  • org.geoserver.ows.util.KvpUtils.Tokenizer
  • org.hsqldb.Tokenizer
    Provides the ability to tokenize SQL character sequences. Extensively rewritten and extended in successive versions of HSQLDB. @author Thomas Mueller (Hypersonic SQL Group) @version 1.8.0 @since Hypersonic SQL
  • org.jboss.dna.common.text.TokenStream.Tokenizer
  • org.jboss.forge.shell.command.parser.Tokenizer
    @author Lincoln Baxter, III
  • org.jstripe.tokenizer.Tokenizer
  • org.languagetool.tokenizers.Tokenizer
    Interface for classes that tokenize text into smaller units. @author Daniel Naber
  • org.modeshape.common.text.TokenStream.Tokenizer
  • org.openjena.riot.tokens.Tokenizer
  • org.radargun.utils.Tokenizer
    Tokenizer that allows string delimiters instead of char delimiters. @author Radim Vansa <rvansa@redhat.com>
  • org.sonatype.maven.polyglot.atom.parsing.Tokenizer
    Taken from the Loop programming language compiler pipeline. @author dhanji@gmail.com (Dhanji R. Prasanna)
  • org.spoofax.jsglr.client.imploder.Tokenizer
  • org.supercsv_voltpatches.tokenizer.Tokenizer
    Reads the CSV file, line by line. If you want the line-reading functionality of this class, but want to define your own implementation of {@link #readColumns(List)}, then consider writing your own Tokenizer by extending AbstractTokenizer. @author Kasper B. Graversen @author James Bassett
  • org.zkoss.selector.lang.Tokenizer
    @author simonpai
  • weka.core.tokenizers.Tokenizer
    A superclass for all tokenizer algorithms. @author FracPete (fracpete at waikato dot ac dot nz) @version $Revision: 1.3 $
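
    Several entries above describe a tokenizer only by its delimiter handling; the org.radargun.utils.Tokenizer entry, for instance, mentions string delimiters instead of char delimiters. The block below is a minimal JDK-only sketch of that idea. The class name StringDelimTokenizer and its tokenize method are hypothetical and are not the API of any library listed here; dropping empty tokens and preferring the longest matching delimiter are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the radargun class or its API.
final class StringDelimTokenizer {
    private StringDelimTokenizer() {}

    /** Splits input on any of the given multi-character delimiters, dropping empty tokens. */
    static List<String> tokenize(String input, String... delims) {
        List<String> tokens = new ArrayList<>();
        int start = 0; // start of the token currently being accumulated
        int pos = 0;   // current scan position
        while (pos < input.length()) {
            String matched = null;
            for (String d : delims) {
                // Assumption: when several delimiters match here, take the longest one.
                if (!d.isEmpty() && input.startsWith(d, pos)
                        && (matched == null || d.length() > matched.length())) {
                    matched = d;
                }
            }
            if (matched != null) {
                if (pos > start) {
                    tokens.add(input.substring(start, pos));
                }
                pos += matched.length();
                start = pos;
            } else {
                pos++;
            }
        }
        if (start < input.length()) {
            tokens.add(input.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, b, c, d]
        System.out.println(tokenize("a::b->c::->d", "::", "->"));
    }
}
```

    A character-based splitter such as java.util.StringTokenizer would treat ':' and '>' as independent delimiters, so a single ':' inside a token would also split it; matching whole delimiter strings avoids that.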

    Examples of org.apache.lucene.analysis.Tokenizer

        add("x c", "xc", keepOrig);
        final SynonymMap map = b.build();
        Analyzer a = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, new SynonymFilter(tokenizer, map, true));
          }
        };
       
        assertAnalyzesTo(a, "$",

    Examples of org.apache.lucene.analysis.Tokenizer

          if((i % 10) == 0)
            builder.append(" ");
        }
        // internal buffer size is 1024 make sure we have a surrogate pair right at the border
        builder.insert(1023, "\ud801\udc1c");
        Tokenizer tokenizer = new LowerCaseTokenizer(TEST_VERSION_CURRENT, new StringReader(builder.toString()));
        assertTokenStreamContents(tokenizer, builder.toString().toLowerCase(Locale.ROOT).split(" "));
      }
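
    The test above places the surrogate pair "\ud801\udc1c" (U+1041C) right at a buffer border. The JDK-only sketch below suggests why that position is worth testing: lowercasing by code point maps the pair to U+10444, while lowercasing each 16-bit char separately leaves both halves unchanged, which is what could happen if a buffer boundary split the pair. The class and method names are hypothetical, and nothing here is Lucene code.

```java
// Hypothetical illustration using only the JDK; not part of Lucene.
final class SurrogateLowerCase {
    private SurrogateLowerCase() {}

    /** Lowercases code point by code point, so surrogate pairs are handled as one unit. */
    static String lowerByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    /** Lowercases each 16-bit char on its own; surrogate halves have no case mapping. */
    static String lowerByChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            sb.append(Character.toLowerCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String upper = "\ud801\udc1c"; // one code point (U+1041C) stored as two chars
        System.out.println(upper.length());                           // 2
        System.out.println(upper.codePointCount(0, upper.length()));  // 1
        System.out.println(Integer.toHexString(lowerByCodePoint(upper).codePointAt(0))); // 10444
        System.out.println(lowerByChar(upper).equals(upper));         // true
    }
}
```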

    Examples of org.apache.lucene.analysis.Tokenizer

        add("zoo zoo", "zoo", keepOrig);
        final SynonymMap map = b.build();
        Analyzer a = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, new SynonymFilter(tokenizer, map, true));
          }
        };
       
        assertAnalyzesTo(a, "zoo zoo $ zoo",

    Examples of org.apache.lucene.analysis.Tokenizer

          StringBuilder builder = new StringBuilder();
          for (int j = 0; j < 1+i; j++) {
            builder.append("a");
          }
          builder.append("\ud801\udc1cabc");
          Tokenizer tokenizer = new LowerCaseTokenizer(TEST_VERSION_CURRENT, new StringReader(builder.toString()));
          assertTokenStreamContents(tokenizer, new String[] {builder.toString().toLowerCase(Locale.ROOT)});
        }
      }

    Examples of org.apache.myfaces.trinidadinternal.el.Tokenizer

        String expression,
        String var,
        String subst)
      {
        String varDot = var + ".";
        Tokenizer tokens = new Tokenizer(expression);
        StringBuffer buf = new StringBuffer(expression.length());
        while(tokens.hasNext())
        {
          Token tok = tokens.next();
          String exp = tok.getText();
          if (tok.type == Tokenizer.VAR_TYPE)
          {
            if (var.equals(exp) || exp.startsWith(varDot))
            {

    Examples of org.apache.uima.lucas.indexer.Tokenizer

      private AnnotationDescription annotationDescription;
      private TokenStream tokenStream;
     
      @Before
      public void setUp(){
        tokenizer = new Tokenizer();
        annotationDescription = new AnnotationDescription(null);
        Collection<Token> tokens = new ArrayList<Token>();
        tokens.add(new Token("token1".toCharArray(),0,6,0,6));
        tokens.add(new Token("token2".toCharArray(),0,6,7,13));
        tokens.add(new Token("token3".toCharArray(),0,6,14,20));

    Examples of org.crsh.cli.impl.tokenizer.Tokenizer

        }
        return match(tokens);
      }

      private InvocationMatch<T> match(final Iterable<Token> tokens) throws SyntaxException {
        Tokenizer tokenizer = new Tokenizer() {

          /** . */
          Iterator<Token> i = tokens.iterator();

          @Override

    Examples of org.eclipse.orion.server.cf.manifest.v2.Tokenizer

      private static String INCORRECT_MANIFEST_LOCATION = "testData/manifestTest/incorrect"; //$NON-NLS-1$

      private ManifestParseTree parse(InputStream inputStream) throws IOException, TokenizerException, ParserException {
        Preprocessor preprocessor = new ManifestPreprocessor();
        List<InputLine> contents = preprocessor.process(inputStream);
        Tokenizer tokenizer = new ManifestTokenizer(contents);

        Parser parser = new ManifestParser();
        return parser.parse(tokenizer);
      }

    Examples of org.eclipse.osgi.framework.internal.core.Tokenizer

       */
      public static ManifestElement[] parseHeader(String header, String value) throws BundleException {
        if (value == null)
          return (null);
        ArrayList headerElements = new ArrayList(10);
        Tokenizer tokenizer = new Tokenizer(value);
        parseloop: while (true) {
          String next = tokenizer.getString(";,"); //$NON-NLS-1$
          if (next == null)
            throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);
          ArrayList headerValues = new ArrayList();
          StringBuffer headerValue = new StringBuffer(next);
          headerValues.add(next);

          if (Debug.DEBUG && Debug.DEBUG_MANIFEST)
            Debug.print("parseHeader: " + next); //$NON-NLS-1$
          boolean directive = false;
          char c = tokenizer.getChar();
          // Header values may be a list of ';' separated values.  Just append them all into one value until the first '=' or ','
          while (c == ';') {
            next = tokenizer.getString(";,=:"); //$NON-NLS-1$
            if (next == null)
              throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);
            c = tokenizer.getChar();
            while (c == ':') { // may not really be a :=
              c = tokenizer.getChar();
              if (c != '=') {
                String restOfNext = tokenizer.getToken(";,=:"); //$NON-NLS-1$
                if (restOfNext == null)
                  throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);
                next += ":" + c + restOfNext; //$NON-NLS-1$
                c = tokenizer.getChar();
              } else
                directive = true;
            }
            if (c == ';' || c == ',' || c == '\0') /* more */{
              headerValues.add(next);
              headerValue.append(";").append(next); //$NON-NLS-1$
              if (Debug.DEBUG && Debug.DEBUG_MANIFEST)
                Debug.print(";" + next); //$NON-NLS-1$
            }
          }
          // found the header value create a manifestElement for it.
          ManifestElement manifestElement = new ManifestElement();
          manifestElement.value = headerValue.toString();
          manifestElement.valueComponents = (String[]) headerValues.toArray(new String[headerValues.size()]);

          // now add any attributes/directives for the manifestElement.
          while (c == '=' || c == ':') {
            while (c == ':') { // may not really be a :=
              c = tokenizer.getChar();
              if (c != '=') {
                String restOfNext = tokenizer.getToken("=:"); //$NON-NLS-1$
                if (restOfNext == null)
                  throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);
                next += ":" + c + restOfNext; //$NON-NLS-1$
                c = tokenizer.getChar();
              } else
                directive = true;
            }
            String val = tokenizer.getString(";,"); //$NON-NLS-1$
            if (val == null)
              throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);

            if (Debug.DEBUG && Debug.DEBUG_MANIFEST)
              Debug.print(";" + next + "=" + val); //$NON-NLS-1$ //$NON-NLS-2$
            try {
              if (directive)
                manifestElement.addDirective(next, val);
              else
                manifestElement.addAttribute(next, val);
              directive = false;
            } catch (Exception e) {
              throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR, e);
            }
            c = tokenizer.getChar();
            if (c == ';') /* more */{
              next = tokenizer.getToken("=:"); //$NON-NLS-1$
              if (next == null)
                throw new BundleException(NLS.bind(Msg.MANIFEST_INVALID_HEADER_EXCEPTION, header, value), BundleException.MANIFEST_ERROR);
              c = tokenizer.getChar();
            }
          }
          headerElements.add(manifestElement);
          if (Debug.DEBUG && Debug.DEBUG_MANIFEST)
            Debug.println(""); //$NON-NLS-1$
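
    The parseHeader excerpt above walks a header of the form value;value;attr=val;dir:=val,value and separates value components, attributes ('=') and directives (':='). The sketch below shows the same classification in a much simplified form using String.split. It is not the Equinox implementation: the HeaderSketch and Element names are hypothetical, and quoted strings (which may legally contain ',' or ';') are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration; not the OSGi/Equinox parser.
final class HeaderSketch {
    private HeaderSketch() {}

    /** One comma-separated element: its value components, attributes and directives. */
    static final class Element {
        final List<String> valueComponents = new ArrayList<>();
        final Map<String, String> attributes = new LinkedHashMap<>();
        final Map<String, String> directives = new LinkedHashMap<>();

        /** Value components joined with ';', mirroring how the excerpt builds headerValue. */
        String value() {
            return String.join(";", valueComponents);
        }
    }

    /** Assumption: no quoted strings, so ',' and ';' always act as separators. */
    static List<Element> parse(String header) {
        List<Element> elements = new ArrayList<>();
        for (String rawElement : header.split(",")) {
            Element e = new Element();
            for (String part : rawElement.split(";")) {
                part = part.trim();
                if (part.isEmpty()) {
                    continue;
                }
                int eq = part.indexOf('=');
                if (eq < 0) {
                    e.valueComponents.add(part);          // plain value component
                } else if (eq > 0 && part.charAt(eq - 1) == ':') {
                    e.directives.put(part.substring(0, eq - 1).trim(),
                                     part.substring(eq + 1).trim()); // name:=value
                } else {
                    e.attributes.put(part.substring(0, eq).trim(),
                                     part.substring(eq + 1).trim()); // name=value
                }
            }
            if (!e.valueComponents.isEmpty()) {
                elements.add(e);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        List<Element> els = parse("org.foo;org.bar;version=1.2;resolution:=optional,org.baz");
        for (Element e : els) {
            System.out.println(e.value() + " " + e.attributes + " " + e.directives);
        }
    }
}
```

    Splitting up front keeps the sketch short, but it is also why quoting cannot be supported; the character-by-character tokenizer loop in the excerpt is the kind of structure that can.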

    Examples of org.exist.storage.analysis.Tokenizer

            return path.getDependencies();
        }

        protected String[] getSearchTerms(String searchString) throws EXistException {
            final List<String> tokens = new ArrayList<String>();
            final Tokenizer tokenizer = context.getBroker().getTextEngine().getTokenizer();
            tokenizer.setText(searchString);
            org.exist.storage.analysis.TextToken token;
            String word;
            while (null != (token = tokenizer.nextToken(true))) {
                word = token.getText();
                tokens.add(word);
            }
            final String[] terms = new String[tokens.size()];
            return tokens.toArray(terms);
    Copyright © 2018 www.massapi.com. All rights reserved.
    All source code is the property of its respective owners. Java is a trademark of Sun Microsystems, Inc and owned by ORACLE Inc. Contact coftware#gmail.com.