Examples of Tokenizer

  • org.apache.jena.riot.tokens.Tokenizer
  • org.apache.lucene.analysis.Tokenizer
    A Tokenizer is a TokenStream whose input is a Reader.

    This is an abstract class.

    NOTE: Subclasses must override {@link #incrementToken()} if the new TokenStream API is used, and {@link #next(Token)} or {@link #next()} if the old TokenStream API is used.

    NOTE: Subclasses overriding {@link #incrementToken()} must call {@link AttributeSource#clearAttributes()} before setting attributes. Subclasses overriding {@link #next(Token)} must call {@link Token#clear()} before setting Token attributes.

  • org.apache.myfaces.trinidadinternal.el.Tokenizer
    Converts an EL expression into tokens. @author The Oracle ADF Faces Team
  • org.apache.uima.lucas.indexer.Tokenizer
  • org.crsh.cli.impl.tokenizer.Tokenizer
  • org.eclipse.orion.server.cf.manifest.v2.Tokenizer
  • org.eclipse.osgi.framework.internal.core.Tokenizer
    Simple tokenizer class. Used to parse data.
  • org.exist.storage.analysis.Tokenizer
  • org.geoserver.ows.util.KvpUtils.Tokenizer
  • org.hsqldb.Tokenizer
    Provides the ability to tokenize SQL character sequences. Extensively rewritten and extended in successive versions of HSQLDB. @author Thomas Mueller (Hypersonic SQL Group) @version 1.8.0 @since Hypersonic SQL
  • org.jboss.dna.common.text.TokenStream.Tokenizer
  • org.jboss.forge.shell.command.parser.Tokenizer
    @author Lincoln Baxter, III
  • org.jstripe.tokenizer.Tokenizer
  • org.languagetool.tokenizers.Tokenizer
    Interface for classes that tokenize text into smaller units. @author Daniel Naber
  • org.modeshape.common.text.TokenStream.Tokenizer
  • org.openjena.riot.tokens.Tokenizer
  • org.radargun.utils.Tokenizer
    Tokenizer that allows string delimiters instead of char delimiters. @author Radim Vansa <rvansa@redhat.com>
  • org.sonatype.maven.polyglot.atom.parsing.Tokenizer
    Taken from the Loop programming language compiler pipeline. @author dhanji@gmail.com (Dhanji R. Prasanna)
  • org.spoofax.jsglr.client.imploder.Tokenizer
  • org.supercsv_voltpatches.tokenizer.Tokenizer
    Reads the CSV file, line by line. If you want the line-reading functionality of this class, but want to define your own implementation of {@link #readColumns(List)}, then consider writing your own Tokenizer by extending AbstractTokenizer. @author Kasper B. Graversen @author James Bassett
  • org.zkoss.selector.lang.Tokenizer
    @author simonpai
  • weka.core.tokenizers.Tokenizer
    A superclass for all tokenizer algorithms. @author FracPete (fracpete at waikato dot ac dot nz) @version $Revision: 1.3 $
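The Lucene notes above describe a specific contract: a Tokenizer consumes a Reader, incrementToken() advances one token at a time and returns false at end of stream, and attributes must be cleared before being set. Below is a minimal stdlib-only sketch of that pattern; SketchTokenizer and WhitespaceSketchTokenizer are illustrative stand-ins that mirror the shape of the Lucene API, not actual Lucene classes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the Lucene pattern: a Tokenizer wraps a Reader
// and hands out one token per incrementToken() call, returning false at EOF.
abstract class SketchTokenizer {
    protected final Reader input;    // mirrors Lucene's protected Reader field
    protected String termAttribute;  // stand-in for a CharTermAttribute

    SketchTokenizer(Reader input) { this.input = input; }

    // Mirrors TokenStream#incrementToken(): advance to the next token,
    // return false once the stream is exhausted.
    abstract boolean incrementToken() throws IOException;

    // Stand-in for AttributeSource#clearAttributes().
    void clearAttributes() { termAttribute = null; }
}

// A whitespace-splitting subclass, in the spirit of Lucene's WhitespaceTokenizer.
class WhitespaceSketchTokenizer extends SketchTokenizer {
    private int c = -2;  // one-character lookahead; -2 means not yet read

    WhitespaceSketchTokenizer(Reader input) { super(input); }

    @Override
    boolean incrementToken() throws IOException {
        clearAttributes();  // per the NOTE: clear before setting attributes
        StringBuilder sb = new StringBuilder();
        if (c == -2) c = input.read();
        while (c != -1 && Character.isWhitespace(c)) c = input.read();   // skip separators
        while (c != -1 && !Character.isWhitespace(c)) {                  // collect one token
            sb.append((char) c);
            c = input.read();
        }
        if (sb.length() == 0) return false;
        termAttribute = sb.toString();
        return true;
    }
}

public class TokenizerSketch {
    public static void main(String[] args) throws IOException {
        SketchTokenizer t = new WhitespaceSketchTokenizer(new StringReader("a quick  test"));
        List<String> tokens = new ArrayList<>();
        while (t.incrementToken()) tokens.add(t.termAttribute);
        System.out.println(tokens);  // [a, quick, test]
    }
}
```

A real Lucene subclass would extend org.apache.lucene.analysis.Tokenizer and expose the token text through attributes obtained via addAttribute, but the advance/clear/set loop is the same.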

  • Examples of anvil.script.parser.Tokenizer

      public static final Object[] p_formatScript = { null, "source", "*handler", null };
      public static final Any formatScript(Context context, Any source, Any handler)
      {
        try {
          InputStream input = getInputStream(context, source);
          Tokenizer tokenizer = new Tokenizer(input, true);
          FormattingCallbacks calls = new FormattingCallbacks(context, handler);

          Token t = null;
          out: for (;;) {
            if ((t != null) && (t.next != null)) {
              t = t.next;
            } else {
              t = tokenizer.getNextToken();
            }
            switch(t.kind) {
            case Tokenizer.EOF:
              break out;
             

    Examples of cambridge.parser.Tokenizer


       public static void main(String[] args) {
          try {
             Tokenizer tokenizer = new TemplateTokenizer(TokenizerTest.class.getResourceAsStream("input.html"));

             Token token;
             while (tokenizer.hasMoreTokens()) {
                token = tokenizer.nextToken();
                System.out.println(token);
             }

             tokenizer.close();
          } catch (IOException e) {
             e.printStackTrace();
          }
       }

    Examples of ch.pollet.jzic.tokenizer.Tokenizer


        @Test
        public void testParse() throws Exception {
            String line;
            Tokenizer t;
            Zone z;
            DateTime dt;
           
            Database db = new Database();
            ZoneParser zp = new ZoneParser(db);

            //-- Test 1
            line = "Zone1 1:34:08 - LMT 2000";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2000);
            dt.setMonth(1);
            dt.setDay(1);
            dt.setHour(0);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.WALL);

            assertTrue(db.getZones().containsKey("Zone1"));
            z = db.getZones().get("Zone1").first();
            assertEquals(5648, z.getOffset());
            assertEquals(null, z.getRule());
            assertEquals("LMT", z.getFormat());
            assertEquals(dt, z.getUntil());

            //-- Test 2
            line = "Zone2 1:34:08 Rule LMT 2000 Feb 12 1u";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2000);
            dt.setMonth(2);
            dt.setDay(12);
            dt.setHour(1);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.UTC);

            z = db.getZones().get("Zone2").first();
            assertEquals("Rule", z.getRule());
            assertEquals(dt, z.getUntil());

            //-- Test 3
            line = "Zone3 1:34:08 Rule LMT 2000 Mar 12 1:30s";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2000);
            dt.setMonth(3);
            dt.setDay(12);
            dt.setHour(1);
            dt.setMinute(30);
            dt.setSecond(0);
            dt.setType(DateTime.Type.STD);

            z = db.getZones().get("Zone3").first();
            assertEquals(dt, z.getUntil());

            //-- Test 4
            line = "Zone4 1:34:08 Rule LMT 2000 Apr 12 1:30:44u";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2000);
            dt.setMonth(4);
            dt.setDay(12);
            dt.setHour(1);
            dt.setMinute(30);
            dt.setSecond(44);
            dt.setType(DateTime.Type.UTC);

            z = db.getZones().get("Zone4").first();
            assertEquals(dt, z.getUntil());

            //-- Test 5
            line = "Zone5 -1:34:08 Rule LMT";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(Integer.MAX_VALUE);
            dt.setMonth(12);
            dt.setDay(31);
            dt.setHour(23);
            dt.setMinute(59);
            dt.setSecond(59);
            dt.setType(DateTime.Type.WALL);

            z = db.getZones().get("Zone5").first();
            assertEquals(dt, z.getUntil());
            assertEquals(-5648, z.getOffset());

            //-- Test 6
            line = "Zone6 0 Rule LMT 2009 Sep lastSun";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2009);
            dt.setMonth(9);
            dt.setDay(27);
            dt.setHour(0);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.WALL);

            z = db.getZones().get("Zone6").first();
            assertEquals(dt, z.getUntil());

            //-- Test 7
            line = "Zone7 0 Rule LMT 2009 Oct Mon>=3";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2009);
            dt.setMonth(10);
            dt.setDay(5);
            dt.setHour(0);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.WALL);

            z = db.getZones().get("Zone7").first();
            assertEquals(dt, z.getUntil());

            //-- Test 8
            line = "Zone8 0 Rule LMT 2009 Nov Mon>=5";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2009);
            dt.setMonth(11);
            dt.setDay(9);
            dt.setHour(0);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.WALL);

            z = db.getZones().get("Zone8").first();
            assertEquals(dt, z.getUntil());

            //-- Test 9
            line = "Zone9 0 Rule LMT 2009 Dec Mon<=8";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2009);
            dt.setMonth(12);
            dt.setDay(7);
            dt.setHour(0);
            dt.setMinute(0);
            dt.setSecond(0);
            dt.setType(DateTime.Type.WALL);

            z = db.getZones().get("Zone9").first();
            assertEquals(dt, z.getUntil());

            //-- Test 10
            line = "Zone10 0 Rule LMT 2009 Dec Mon<=14";
            t = new Tokenizer(line, "\t ", null, 0);
            zp.parse(t);

            dt = new DateTime();
            dt.setYear(2009);
            dt.setMonth(12);
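The day fields exercised by these tests (lastSun, Mon>=3, Mon<=8) follow the tz database's ON-field syntax: the last given weekday in the month, the first such weekday on or after a day, or the last one on or before a day. The expected dates asserted above can be cross-checked with java.time; ZicDayRules is a hypothetical helper written for this sketch, not part of the library under test.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper resolving tz-database ON-field day rules with java.time.
public class ZicDayRules {

    // "lastSun": the last Sunday (or other weekday) of the month.
    static LocalDate lastInMonth(int year, int month, DayOfWeek dow) {
        return LocalDate.of(year, month, 1)
                .with(TemporalAdjusters.lastInMonth(dow));
    }

    // "Mon>=3": the first Monday falling on or after day 3.
    static LocalDate firstOnOrAfter(int year, int month, int day, DayOfWeek dow) {
        return LocalDate.of(year, month, day)
                .with(TemporalAdjusters.nextOrSame(dow));
    }

    // "Mon<=8": the last Monday falling on or before day 8.
    static LocalDate lastOnOrBefore(int year, int month, int day, DayOfWeek dow) {
        return LocalDate.of(year, month, day)
                .with(TemporalAdjusters.previousOrSame(dow));
    }

    public static void main(String[] args) {
        // Matches Test 6: "2009 Sep lastSun" -> day 27
        System.out.println(lastInMonth(2009, 9, DayOfWeek.SUNDAY));           // 2009-09-27
        // Matches Test 7: "2009 Oct Mon>=3" -> day 5
        System.out.println(firstOnOrAfter(2009, 10, 3, DayOfWeek.MONDAY));    // 2009-10-05
        // Matches Test 9: "2009 Dec Mon<=8" -> day 7
        System.out.println(lastOnOrBefore(2009, 12, 8, DayOfWeek.MONDAY));    // 2009-12-07
    }
}
```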

    Examples of com.aliasi.tokenizer.Tokenizer

      
       public String getPOS(String sentence, boolean allTags)
       {
        StringBuffer xmlOutput =  new StringBuffer();
        char[] cs = sentence.toCharArray();
        Tokenizer tokenizer = TOKENIZER_FACTORY.tokenizer(cs, 0, cs.length);
        String[] tokens = tokenizer.tokenize();
        String[] tags = decoder.firstBest(tokens); int len = tokens.length;
        for (int i = 0; i < len; i++)
        {
         //*-- set the adjective tags
         if (tags[i].startsWith("j") || tags[i].equals("cd") || tags[i].endsWith("od") )

    Examples of com.aliasi.tokenizer.Tokenizer

       public void buildSentences(String in)
       {
        //*-- extract the sentence boundaries
        if (in.length() > Constants.DOC_LENGTH_MAXLIMIT) in = in.substring(0, Constants.DOC_LENGTH_MAXLIMIT - 1);
        ArrayList<Token> tokenList = new ArrayList<Token>(); ArrayList<Token> whiteList = new ArrayList<Token>();
        Tokenizer tokenizer = TOKENIZER_FACTORY.tokenizer(in.toCharArray(), 0, in.length() );
        tokenizer.tokenize(tokenList, whiteList);
        tokens = new String[tokenList.size()]; tokenList.toArray(tokens);
        whites = new String[whiteList.size()]; whiteList.toArray(whites);

        sentenceBoundaries = SENTENCE_MODEL.boundaryIndices(tokens, whites);  
        int numPossibleSentences = sentenceBoundaries.length;

    Examples of com.aliasi.tokenizer.Tokenizer

       public String[] tokenizer(String in)
       {  
        if (in.length() > Constants.DOC_LENGTH_MAXLIMIT) in = in.substring(0, Constants.DOC_LENGTH_MAXLIMIT - 1);
        ArrayList<Token> tokenList = new ArrayList<Token>(); ArrayList<Token> whiteList = new ArrayList<Token>();
        Tokenizer tokenizer = new StandardBgramTokenizerFactory().tokenizer(in.toCharArray(), 0, in.length() );
        tokenizer.tokenize(tokenList, whiteList);
        String[] tokens = new String[tokenList.size()]; tokenList.toArray(tokens);
        return(tokens);
       }

    Examples of com.aliasi.tokenizer.Tokenizer

        StringBuffer normalizeQuery(CharSequence cSeq) {
      StringBuffer sb = new StringBuffer();
      sb.append(' ');
      if (mTokenizerFactory != null) {
          char[] cs = Strings.toCharArray(cSeq);
          Tokenizer tokenizer = mTokenizerFactory.tokenizer(cs,0,cs.length);
          String nextToken;
          while ((nextToken = tokenizer.nextToken()) != null) {
        mTokenCounter.increment(nextToken);
        sb.append(nextToken);
        sb.append(' ');
          }
      } else {

    Examples of com.esotericsoftware.yamlbeans.tokenizer.Tokenizer

      public Parser (Reader reader, Version defaultVersion) {
        if (reader == null) throw new IllegalArgumentException("reader cannot be null.");
        if (defaultVersion == null) throw new IllegalArgumentException("defaultVersion cannot be null.");

        tokenizer = new Tokenizer(reader);

        this.defaultVersion = defaultVersion;

        initProductionTable();

    Examples of com.googlecode.goclipse.go.lang.lexer.Tokenizer

        public void init() throws IOException {

          try {
            Lexer lexer = new Lexer();
            Tokenizer tokenizer = new Tokenizer(lexer);
            FunctionParser functionParser = new FunctionParser(true, tokenizer, file);

            BufferedReader reader = new BufferedReader(new FileReader(file));
            String temp = "";
            StringBuilder builder = new StringBuilder();

    Examples of com.googlecode.psiprobe.tokenizer.Tokenizer

        public static String getJSPEncoding(InputStream is) throws IOException {

            String encoding = null;
            String contentType = null;

            Tokenizer jspTokenizer = new Tokenizer();
            jspTokenizer.addSymbol("\n", true);
            jspTokenizer.addSymbol(" ", true);
            jspTokenizer.addSymbol("\t", true);
            jspTokenizer.addSymbol(new TokenizerSymbol("dir", "<%@", "%>", false, false, true, false));

            StringTokenizer directiveTokenizer = new StringTokenizer();
            directiveTokenizer.addSymbol("\n", true);
            directiveTokenizer.addSymbol(" ", true);
            directiveTokenizer.addSymbol("\t", true);
            directiveTokenizer.addSymbol("=");
            directiveTokenizer.addSymbol("\"", "\"", false);
            directiveTokenizer.addSymbol("'", "'", false);

            StringTokenizer contentTypeTokenizer = new StringTokenizer();
            contentTypeTokenizer.addSymbol(" ", true);
            contentTypeTokenizer.addSymbol(";", true);


            Reader reader = new InputStreamReader(is, "ISO-8859-1");
            try {
                jspTokenizer.setReader(reader);
                while (jspTokenizer.hasMore()) {
                    Token token = jspTokenizer.nextToken();
                    if ("dir".equals(token.getName())) {
                        directiveTokenizer.setString(token.getInnerText());
                        if (directiveTokenizer.hasMore() && directiveTokenizer.nextToken().getText().equals("page")) {
                            while (directiveTokenizer.hasMore()) {
                                Token dTk = directiveTokenizer.nextToken();