The main idea behind this parser is that a person should be able to type whatever they want to represent a query, and the parser will do its best to interpret what to search for, no matter how poorly composed the request may be. A token is any term, phrase, or subquery for the operations described below. Tokens may be delimited by whitespace (' ', '\n', '\r', and '\t') and by certain operators: {@code ( ) + | "}
Any errors in query syntax will be ignored and the parser will attempt to decipher what it can; however, this may mean odd or unexpected results.
The {@link #setDefaultOperator default operator} is {@code OR} if no other operator is specified. For example, the following will {@code OR} {@code token1} and {@code token2} together:
token1 token2
Operators have no relative precedence; they are simply applied in order from left to right. For example, the following will evaluate {@code token1 OR token2} first, then {@code AND} the result with {@code token3}:
token1 | token2 + token3
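The grouping in the example above can be illustrated with a small standalone sketch. It is not the parser's implementation: {@code GroupingSketch} and {@code group} are hypothetical names, and the sketch only handles bare terms separated by whitespace, {@code +}, and {@code |} (no phrases, parentheses, negation, or escaping). It assumes that the tokens seen so far are grouped whenever the operator changes, and that {@code OR} is used when no operator is given.

```java
final class GroupingSketch {

    /** Renders the assumed grouping of a flat query as a parenthesized string. */
    static String group(String query) {
        String result = null;     // expression built so far
        String lastOp = null;     // operator joining the top level of result
        String pendingOp = null;  // explicit operator seen since the previous term
        StringBuilder term = new StringBuilder();

        // A trailing space is appended so that the final term is flushed.
        for (char c : (query + " ").toCharArray()) {
            boolean isOperator = c == '+' || c == '|';
            if (!isOperator && !Character.isWhitespace(c)) {
                term.append(c);
                continue;
            }
            if (term.length() > 0) {
                String t = term.toString();
                term.setLength(0);
                if (result == null) {
                    result = t;
                } else {
                    // Fall back to OR when no operator was typed between two terms.
                    String op = pendingOp != null ? pendingOp : "OR";
                    // An operator change wraps everything seen so far in one group.
                    if (lastOp != null && !lastOp.equals(op)) {
                        result = "(" + result + ")";
                    }
                    result = result + " " + op + " " + t;
                    lastOp = op;
                }
                pendingOp = null;
            }
            if (c == '+') {
                pendingOp = "AND";
            } else if (c == '|') {
                pendingOp = "OR";
            }
        }
        return result == null ? "" : result;
    }

    public static void main(String[] args) {
        System.out.println(group("token1 token2"));            // token1 OR token2
        System.out.println(group("token1 | token2 + token3")); // (token1 OR token2) AND token3
    }
}
```

When a different grouping is wanted, the '{@code (}' and '{@code )}' delimiters can be used in the real query syntax to make it explicit.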
An individual term may contain any character, with certain characters requiring escaping using a '{@code \}'. The following characters need to be escaped in terms and phrases: {@code + | " ( ) ' \}
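The '{@code -}' operator is a special case. On individual terms (not phrases), a '{@code -}' that is the first character of the term must be escaped; however, any '{@code -}' characters beyond the first character do not need to be escaped. For example, {@code -term1} specifies a {@code NOT} operation against {@code term1}, {@code \-term1} searches for the term {@code -term1}, and both {@code term-1} and {@code term\-1} search for the term {@code term-1}.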
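The '{@code *}' operator is a special case. On individual terms (not phrases), a '{@code *}' that is the last character of the term must be escaped; however, any '{@code *}' characters before the last character do not need to be escaped. For example, {@code term1*} searches for the prefix {@code term1}, {@code term1\*} searches for the term {@code term1*}, and both {@code term*1} and {@code term\*1} search for the term {@code term*1}.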
Note that the above examples consider the terms before text processing.
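The escaping rules above can be applied mechanically when a literal term comes from user input. The following is a minimal sketch, not part of the parser's API: {@code SimpleQueryEscaper} and {@code escapeTerm} are hypothetical names, and the helper assumes its input is a single term with no whitespace. It covers only the characters named in this description.

```java
final class SimpleQueryEscaper {

    // Characters listed above as always needing a backslash: + | " ( ) ' \
    private static final String ALWAYS_ESCAPED = "+|\"()'\\";

    /** Escapes a single literal term (assumed to contain no whitespace). */
    static String escapeTerm(String term) {
        StringBuilder out = new StringBuilder(term.length() + 8);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            // '-' is only special as the first character of a term.
            boolean leadingDash = c == '-' && i == 0;
            // '*' is only special as the last character of a term.
            boolean trailingStar = c == '*' && i == term.length() - 1;
            if (ALWAYS_ESCAPED.indexOf(c) >= 0 || leadingDash || trailingStar) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeTerm("-term1")); // \-term1
        System.out.println(escapeTerm("term-1")); // term-1
        System.out.println(escapeTerm("term1*")); // term1\*
        System.out.println(escapeTerm("term*1")); // term*1
    }
}
```

The helper leaves an interior '{@code -}' or '{@code *}' untouched because, as described above, escaping them is optional and the unescaped form already matches the literal term.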