Examples of dk.brics.automaton.RegExp

dk.brics.automaton.RegExp

Regular Expression extension to Automaton.

Regular expressions are built from the following abstract syntax:

| Nonterminal | | Production | Description | Flag |
|---|---|---|---|---|
| regexp | ::= | unionexp | | |
| | \| | | | |
| unionexp | ::= | interexp `\|` unionexp | (union) | |
| | \| | interexp | | |
| interexp | ::= | concatexp `&` interexp | (intersection) | [OPTIONAL] |
| | \| | concatexp | | |
| concatexp | ::= | repeatexp concatexp | (concatenation) | |
| | \| | repeatexp | | |
| repeatexp | ::= | repeatexp `?` | (zero or one occurrence) | |
| | \| | repeatexp `*` | (zero or more occurrences) | |
| | \| | repeatexp `+` | (one or more occurrences) | |
| | \| | repeatexp `{n}` | (`n` occurrences) | |
| | \| | repeatexp `{n,}` | (`n` or more occurrences) | |
| | \| | repeatexp `{n,m}` | (`n` to `m` occurrences, including both) | |
| | \| | complexp | | |
| complexp | ::= | `~` complexp | (complement) | [OPTIONAL] |
| | \| | charclassexp | | |
| charclassexp | ::= | `[` charclasses `]` | (character class) | |
| | \| | `[^` charclasses `]` | (negated character class) | |
| | \| | simpleexp | | |
| charclasses | ::= | charclass charclasses | | |
| | \| | charclass | | |
| charclass | ::= | charexp `-` charexp | (character range, including end-points) | |
| | \| | charexp | | |
| simpleexp | ::= | charexp | | |
| | \| | `.` | (any single character) | |
| | \| | `#` | (the empty language) | [OPTIONAL] |
| | \| | `@` | (any string) | [OPTIONAL] |
| | \| | `"` \<Unicode string without double-quotes\> `"` | (a string) | |
| | \| | `(` `)` | (the empty string) | |
| | \| | `(` unionexp `)` | (precedence override) | |
| | \| | `<` \<identifier\> `>` | (named automaton) | [OPTIONAL] |
| | \| | `<n-m>` | (numerical interval) | [OPTIONAL] |
| charexp | ::= | \<Unicode character\> | (a single non-reserved character) | |
| | \| | `\` \<Unicode character\> | (a single character) | |
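
The last `charexp` production says that a backslash followed by any character denotes that single character. This makes it possible to embed an arbitrary literal string in a pattern by escaping conservatively. The following is a minimal sketch in plain Java; `BricsQuote` and `quote` are hypothetical names chosen for illustration and are not part of dk.brics.automaton. It assumes that every reserved token in the grammar above is ASCII punctuation, so it escapes every ASCII character that is not a letter or digit and passes all other characters through unchanged.

```java
// Hypothetical helper, not part of dk.brics.automaton.
final class BricsQuote {
    private BricsQuote() {}

    /**
     * Returns a pattern fragment intended to match exactly the characters of
     * {@code literal}, relying on the grammar rule that a backslash followed by
     * any character denotes that single character.
     */
    static String quote(String literal) {
        StringBuilder out = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            // Assumption: all reserved characters are ASCII punctuation, so
            // escaping every ASCII non-alphanumeric is safe, if over-eager.
            if (c < 128 && !asciiAlnum) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("a.b|c"));  // prints a\.b\|c
        System.out.println(quote("<1-10>")); // prints \<1\-10\>
    }
}
```

Escaping more than strictly necessary keeps the helper independent of which optional syntax flags are enabled, at the cost of slightly noisier patterns.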

The productions marked [OPTIONAL] are only allowed if enabled by the syntax flags passed to the RegExp constructor. The reserved characters used in the (enabled) syntax must be escaped with a backslash (`\`) or enclosed in double-quotes (`"..."`). Unlike in other regexp syntaxes, this is also required inside character classes. Be aware that the dash (`-`) has a special meaning in charclass expressions. An identifier is a string not containing a right angle bracket (`>`) or dash (`-`). Numerical intervals are specified by non-negative decimal integers and include both end points; if n and m have the same number of digits, the conforming strings must have that length (i.e., be prefixed by 0's).

Author: Anders Møller <amoeller@cs.au.dk>
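
The fixed-width rule for numerical intervals can be illustrated without the library. The following is a minimal sketch in plain Java that covers only the case where n and m are written with the same number of digits. `FixedWidthInterval` and `matches` are hypothetical names used for illustration, not part of dk.brics.automaton, and the check restates the rule described above rather than reproducing the library's implementation.

```java
// Hypothetical illustration, not part of dk.brics.automaton.
final class FixedWidthInterval {
    private FixedWidthInterval() {}

    /** True if s is non-empty and consists only of the decimal digits 0-9. */
    private static boolean isDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    /**
     * Checks whether candidate conforms to an interval whose bounds n and m are
     * given exactly as written in the pattern, with equal digit counts.
     */
    static boolean matches(String candidate, String n, String m) {
        if (!isDigits(n) || !isDigits(m) || n.length() != m.length()) {
            throw new IllegalArgumentException(
                    "sketch covers only decimal bounds with equal digit counts");
        }
        // The conforming string must have the same length as the bounds.
        if (candidate.length() != n.length() || !isDigits(candidate)) {
            return false;
        }
        // For equal-length digit strings, lexicographic order equals numeric order.
        return candidate.compareTo(n) >= 0 && candidate.compareTo(m) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(matches("007", "001", "100")); // prints true
        System.out.println(matches("7", "001", "100"));   // prints false
    }
}
```

With bounds written as `001` and `100`, the string `007` conforms while `7` does not, because only zero-padded strings of three digits have the required length.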

        String url = ((StringConstant)expr.getArg(0)).value;
        url_map.put(value, url);
    }
    if (expr.getMethod().getSignature().equals("<dk.brics.string.runtime.Strings: void bind(java.lang.String,java.lang.String)>")) {
        String name = getName(expr);
        RegExp re = getRegExp(expr);
        regexp_bind.put(name, re);
    }
    if (expr.getMethod().getSignature().equals("<dk.brics.string.runtime.Strings: void bind(java.lang.String,java.net.URL)>")) {
        String name = getName(expr);
        URL url = getConstantURL(expr.getArg(1));


        }
    }


    private RegExp getRegExp(InvokeExpr expr) {
        if (expr.getArg(1) instanceof StringConstant) {
            return new RegExp(((StringConstant) expr.getArg(1)).value);
        } else {
            throw new InvalidRuntimeUseException("Non-constant regexp");
        }
    }


    String type = at.getType().trim();
    if (type.equals("Ldk/brics/string/annotation/Type;") && at.getNumElems() == 1) {
        // XXX why are we trimming the regexp here?? Although rare, it is perfectly sane for a string-type
        // to end with blanks. E.g. @Type("Hello ") would become @Type("Hello").
        String pattern = ((AnnotationStringElem)at.getElemAt(0)).getValue().trim();
        Automaton a = (new RegExp(pattern)).toAutomaton(bindings);
        automatonDescriptionMap.put(a, pattern);
        return a;
    }
    if (type.equals("Ldk/brics/string/annotation/LoadType;") && at.getNumElems() == 1) {
        String path = ((AnnotationStringElem)at.getElemAt(0)).getValue().trim();


        }
    }


    RegExp getRegExp(InvokeExpr expr) {
        if (expr.getArg(1) instanceof StringConstant) {
            return new RegExp(((StringConstant) expr.getArg(1)).value);
        } else {
            throw new InvalidRuntimeUseException("Non-constant regexp");
        }
    }


     * @throws IllegalArgumentException If the regular expression is invalid.
     */
    public Xeger(String regex, Random random) {
        assert regex != null;
        assert random != null;
        this.automaton = new RegExp(regex).toAutomaton();
        this.random = random;
    }


  
    /**
     * Builds the transition table data.
     */
    public void build() {
        RegExp regexp = new RegExp(this.expression);
        Automaton automata = regexp.toAutomaton(true);
        numOfStates = automata.getNumberOfStates();
        //System.out.println("Number of states " + numOfStates);

        State[] states = new State[numOfStates];
        automata.getStates().toArray(states);



Related Classes of dk.brics.automaton.RegExp

bgu.bio.ds.automata.TransitionTable

com.google.gerrit.server.project.RefControl

com.google.gerrit.server.query.change.RegexBranchPredicate

com.google.gerrit.server.query.change.RegexFilePredicate

com.google.gerrit.server.query.change.RegexProjectPredicate

com.google.gerrit.server.query.change.RegexRefPredicate

com.google.gerrit.server.query.change.RegexTopicPredicate

dk.brics.string.annotation.AnnotationAnalyzer

dk.brics.string.BindingAutomatonProvider

dk.brics.string.RuntimeResolver
