Examples of AnalyzingQueryParser

org.apache.lucene.queryParser.analyzing.AnalyzingQueryParser
org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser
Overrides Lucene's default QueryParser so that fuzzy, prefix, range, and wildcard queries are also passed through the given analyzer, while wildcard characters (such as * and ?) are kept in the search terms.
Warning: This class should only be used with analyzers that neither remove stopwords nor add tokens. Several stemming analyzers are also inappropriate: for example, GermanAnalyzer turns Häuser into hau, but H?user becomes h?user under this parser, so no match is found. In such cases this parser is no improvement over QueryParser.
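
The behaviour described above (terms are analyzed, wildcard characters survive) can be illustrated without Lucene. The following is a minimal sketch, not Lucene's implementation: foldChunk is a hypothetical stand-in for an analyzer (it only lowercases and strips accents), and analyzeWildcard is an assumed helper name that applies it to the text between wildcard characters.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.function.UnaryOperator;

class WildcardAnalysisSketch {

    // Hypothetical stand-in for an analyzer: strips accents and lowercases.
    static String foldChunk(String chunk) {
        String decomposed = Normalizer.normalize(chunk, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Applies the analyzer to each run of text between wildcard characters
    // and copies '*' and '?' through unchanged.
    static String analyzeWildcard(String term, UnaryOperator<String> analyzer) {
        StringBuilder out = new StringBuilder();
        StringBuilder chunk = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (c == '*' || c == '?') {
                out.append(analyzer.apply(chunk.toString()));
                chunk.setLength(0);
                out.append(c);
            } else {
                chunk.append(c);
            }
        }
        out.append(analyzer.apply(chunk.toString()));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(analyzeWildcard("H?USER*", WildcardAnalysisSketch::foldChunk));
        System.out.println(analyzeWildcard("H\u00e4us*", WildcardAnalysisSketch::foldChunk));
    }
}
```

Because each chunk is analyzed on its own, an analyzer that rewrites whole words (a stemmer, for instance) sees only fragments such as H and user, which is the mismatch the warning describes.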

Examples of org.apache.lucene.queryParser.analyzing.AnalyzingQueryParser

                      Field.Store.YES, Field.Index.ANALYZED));
    writer.addDocument(doc);
    writer.close();
    IndexSearcher is = new IndexSearcher(ramDir, true);


    AnalyzingQueryParser aqp = new AnalyzingQueryParser(Version.LUCENE_CURRENT, "content", analyzer);
    aqp.setLowercaseExpandedTerms(false);


    // Unicode order would include U+0633 in [ U+062F - U+0698 ], but Farsi
    // orders the U+0698 character before the U+0633 character, so the single
    // index Term below should NOT be returned by a TermRangeQuery
    // with a Farsi Collator (or an Arabic one for the case when Farsi is not
    // supported).
      
    // Test TermRangeQuery
    ScoreDoc[] result
      = is.search(aqp.parse("[ \u062F TO \u0698 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should not be included.", 0, result.length);


    result = is.search(aqp.parse("[ \u0633 TO \u0638 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should be included.", 1, result.length);


    is.close();
  }
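
The comments in the excerpt above contrast plain Unicode order with a locale-specific collation order for range bounds. The sketch below shows the two kinds of comparison using only the JDK. The class and method names are illustrative assumptions, and the collated result depends on the collation data shipped with the JDK, so no expected value is given for it.

```java
import java.text.Collator;
import java.util.Locale;

class RangeOrderSketch {

    // True if low <= term <= high under String.compareTo (UTF-16 code unit order).
    static boolean inUnicodeRange(String term, String low, String high) {
        return low.compareTo(term) <= 0 && term.compareTo(high) <= 0;
    }

    // True if low <= term <= high under the order defined by the given Collator.
    static boolean inCollatedRange(Collator collator, String term, String low, String high) {
        return collator.compare(low, term) <= 0 && collator.compare(term, high) <= 0;
    }

    public static void main(String[] args) {
        String term = "\u0633";

        // Code unit order: U+062F < U+0633 < U+0698, so the term is inside the range.
        System.out.println(inUnicodeRange(term, "\u062F", "\u0698"));

        // Collation order: the answer depends on the JDK's rules for the locale.
        Collator arabic = Collator.getInstance(Locale.forLanguageTag("ar"));
        System.out.println(inCollatedRange(arabic, term, "\u062F", "\u0698"));
    }
}
```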

Examples of org.apache.lucene.queryParser.analyzing.AnalyzingQueryParser

                      Field.Store.YES, Field.Index.ANALYZED));
    writer.addDocument(doc);
    writer.close();
    IndexSearcher is = new IndexSearcher(ramDir);


    AnalyzingQueryParser aqp = new AnalyzingQueryParser("content", analyzer);
    aqp.setLowercaseExpandedTerms(false);


    // Unicode order would include U+0633 in [ U+062F - U+0698 ], but Farsi
    // orders the U+0698 character before the U+0633 character, so the single
    // index Term below should NOT be returned by a ConstantScoreRangeQuery
    // with a Farsi Collator (or an Arabic one for the case when Farsi is not
    // supported).
      
    // Test ConstantScoreRangeQuery
    aqp.setUseOldRangeQuery(false);
    ScoreDoc[] result
      = is.search(aqp.parse("[ \u062F TO \u0698 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should not be included.", 0, result.length);


    result = is.search(aqp.parse("[ \u0633 TO \u0638 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should be included.", 1, result.length);


    // Test TermRangeQuery
    aqp.setUseOldRangeQuery(true);
    result = is.search(aqp.parse("[ \u062F TO \u0698 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should not be included.", 0, result.length);


    result = is.search(aqp.parse("[ \u0633 TO \u0638 ]"), null, 1000).scoreDocs;
    assertEquals("The index Term should be included.", 1, result.length);


    is.close();
  }

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

        indexSearcher.setSimilarity(new BinarySimilarity());


        // run an initial throw-away query just to "prime the pump" for
        // the cache, so we can accurately measure performance speed
        // per: http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
        indexSearcher.search(new AnalyzingQueryParser(Version.LUCENE_4_9, INDEX_NAME.key(),
                INDEX_ANALYZER).parse("Reston"), null, DEFAULT_MAX_RESULTS, POPULATION_SORT);
        } catch (ParseException pe) {
            throw new ClavinException("Error executing priming query.", pe);
        } catch (IOException ioe) {
            throw new ClavinException("Error opening gazetteer index.", ioe);

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

     * @throws IOException if an error occurs executing the query
     */
    private List<ResolvedLocation> executeQuery(final LocationOccurrence location, final String sanitizedName, final Filter filter,
            final int maxResults, final boolean fuzzy, final boolean dedupe, final List<ResolvedLocation> previousResults)
            throws ParseException, IOException {
        Query query = new AnalyzingQueryParser(Version.LUCENE_4_9, INDEX_NAME.key(), INDEX_ANALYZER)
                .parse(String.format(fuzzy ? FUZZY_FMT : EXACT_MATCH_FMT, sanitizedName));


        List<ResolvedLocation> matches = new ArrayList<ResolvedLocation>(maxResults);


        Map<Integer, Set<GeoName>> parentMap = new HashMap<Integer, Set<GeoName>>();
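
The excerpt above picks one of two format strings to build the query text. The values of FUZZY_FMT and EXACT_MATCH_FMT are not shown, so the ones below are assumptions, modelled on the quoted-phrase and trailing-tilde queries used in the other examples on this page.

```java
class QueryStringSketch {

    // Assumed values; the real constants are not part of the excerpt.
    static final String EXACT_MATCH_FMT = "\"%s\""; // phrase query: "name"
    static final String FUZZY_FMT = "%s~";          // fuzzy query: name~

    // Builds the query text that would be handed to the parser.
    static String buildQueryString(String sanitizedName, boolean fuzzy) {
        return String.format(fuzzy ? FUZZY_FMT : EXACT_MATCH_FMT, sanitizedName);
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString("reston", false));
        System.out.println(buildQueryString("reston", true));
    }
}
```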

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

        this.maxContextWindow = maxContextWindow;
        
        // run an initial throw-away query just to "prime the pump" for
        // the cache, so we can accurately measure performance speed
        // per: http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
        indexSearcher.search(new AnalyzingQueryParser(Version.LUCENE_40,
                "indexName", indexAnalyzer).parse("Reston"), null, maxHitDepth, populationSort);
    }
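
The throw-away query above exists so that later timings are not skewed by a cold cache. A minimal, Lucene-free sketch of the same idea follows; timeAfterWarmup is a hypothetical helper, and the task it runs stands in for the search call.

```java
import java.util.function.Supplier;

class WarmupTimingSketch {

    // Runs the task `warmups` times and discards the results, then times one more run.
    // Returns the elapsed time of the measured run in nanoseconds.
    static <T> long timeAfterWarmup(Supplier<T> task, int warmups) {
        for (int i = 0; i < warmups; i++) {
            task.get(); // result intentionally ignored
        }
        long start = System.nanoTime();
        task.get();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = timeAfterWarmup(() -> "Reston".toLowerCase(), 1);
        System.out.println("measured run took " + nanos + " ns");
    }
}
```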

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

        String sanitizedLocationName = escape(locationName.name.toLowerCase());
        
        try{
            // Lucene query used to look for matches based on the
            // "indexName" field
            Query q = new AnalyzingQueryParser(Version.LUCENE_40,
                    "indexName", indexAnalyzer).parse("\"" + sanitizedLocationName + "\"");
            
            // collect all the hits up to maxHits, and sort them based
            // on Lucene match score and population for the associated
            // GeoNames record
            TopDocs results = indexSearcher.search(q, null, maxHitDepth, populationSort);
            
            // initialize the return object
            List<ResolvedLocation> candidateMatches = new ArrayList<ResolvedLocation>();
            
            // see if anything was found
            if (results.scoreDocs.length > 0) {
                // one or more exact String matches found for this location name
                for (int i = 0; i < results.scoreDocs.length; i++) {
                    // add each matching location to the list of candidates
                    ResolvedLocation location = new ResolvedLocation(indexSearcher.doc(results.scoreDocs[i].doc), locationName, false);
                    logger.debug("{}", location);
                    candidateMatches.add(location);
                }
            } else if (fuzzy) { // only if fuzzy matching is turned on
                // no exact String matches found -- fallback to fuzzy search
                
                // Using the tilde "~" makes this a fuzzy search. I compared this to FuzzyQuery
                // with TopTermsBoostOnlyBooleanQueryRewrite, I like the output better this way.
                // With the other method, we failed to match things like "Straßenhaus Airport"
                // as <Straßenhaus>, and the match scores didn't make as much sense.
                q = new AnalyzingQueryParser(Version.LUCENE_40, "indexName", indexAnalyzer).parse(sanitizedLocationName + "~");
                
                // collect all the fuzzy matches up to maxHits, and sort
                // them based on Lucene match score and population for the
                // associated GeoNames record
                results = indexSearcher.search(q, null, maxHitDepth, populationSort);
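
The control flow above (exact match first, fuzzy search only when nothing was found and fuzzy matching is enabled) can be sketched without an index. This is an illustration under stated assumptions, not the library's behaviour: a plain list stands in for the index, and a simple Levenshtein edit distance with a caller-supplied threshold stands in for Lucene's fuzzy query. The names resolve, editDistance, and maxEdits are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class FuzzyFallbackSketch {

    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns exact (case-insensitive) matches; if there are none and fuzzy is
    // enabled, returns every name within maxEdits edits of the query instead.
    static List<String> resolve(String name, List<String> indexedNames, boolean fuzzy, int maxEdits) {
        String needle = name.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String candidate : indexedNames) {
            if (candidate.toLowerCase(Locale.ROOT).equals(needle)) {
                matches.add(candidate);
            }
        }
        if (matches.isEmpty() && fuzzy) {
            for (String candidate : indexedNames) {
                if (editDistance(candidate.toLowerCase(Locale.ROOT), needle) <= maxEdits) {
                    matches.add(candidate);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Reston", "Boston", "Dallas");
        System.out.println(resolve("reston", names, true, 1));
        System.out.println(resolve("Restan", names, true, 1));
        System.out.println(resolve("Restan", names, false, 1));
    }
}
```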

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

        String sanitizedLocationName = escape(locationName.text.toLowerCase());
        
        try{
            // Lucene query used to look for matches based on the
            // "indexName" field
            Query q = new AnalyzingQueryParser(Version.LUCENE_40,
                    "indexName", indexAnalyzer).parse("\"" + sanitizedLocationName + "\"");
            
            // collect all the hits up to maxHits, and sort them based
            // on Lucene match score and population for the associated
            // GeoNames record
            TopDocs results = indexSearcher.search(q, null, maxHitDepth, populationSort);
            
            // initialize the return object
            List<ResolvedLocation> candidateMatches = new ArrayList<ResolvedLocation>();
            
            // see if anything was found
            if (results.scoreDocs.length > 0) {
                // one or more exact String matches found for this location name
                for (int i = 0; i < results.scoreDocs.length; i++) {
                    // add each matching location to the list of candidates
                    ResolvedLocation location = new ResolvedLocation(indexSearcher.doc(results.scoreDocs[i].doc), locationName, false);
                    logger.debug("{}", location);
                    candidateMatches.add(location);
                }
            } else if (fuzzy) { // only if fuzzy matching is turned on
                // no exact String matches found -- fallback to fuzzy search
                
                // Using the tilde "~" makes this a fuzzy search. I compared this to FuzzyQuery
                // with TopTermsBoostOnlyBooleanQueryRewrite, I like the output better this way.
                // With the other method, we failed to match things like "Straßenhaus Airport"
                // as <Straßenhaus>, and the match scores didn't make as much sense.
                q = new AnalyzingQueryParser(Version.LUCENE_40, "indexName", indexAnalyzer).parse(sanitizedLocationName + "~");
                
                // collect all the fuzzy matches up to maxHits, and sort
                // them based on Lucene match score and population for the
                // associated GeoNames record
                results = indexSearcher.search(q, null, maxHitDepth, populationSort);

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

    this.maxContextWindow = maxContextWindow;
    
    // run an initial throw-away query just to "prime the pump" for
    // the cache, so we can accurately measure performance speed
    // per: http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
    indexSearcher.search(new AnalyzingQueryParser(Version.LUCENE_40,
        "indexName", indexAnalyzer).parse("Reston"), null, maxHitDepth, populationSort);
  }

Examples of org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser

    String sanitizedLocationName = escape(locationName.name.toLowerCase());
    
    try{
      // Lucene query used to look for matches based on the
      // "indexName" field
      Query q = new AnalyzingQueryParser(Version.LUCENE_40,
          "indexName", indexAnalyzer).parse("\"" + sanitizedLocationName + "\"");

      // collect all the hits up to maxHits, and sort them based
      // on Lucene match score and population for the associated
      // GeoNames record
      TopDocs results = indexSearcher.search(q, null, maxHitDepth, populationSort);

      // initialize the return object
      List<ResolvedLocation> candidateMatches = new ArrayList<ResolvedLocation>();

      // see if anything was found
      if (results.scoreDocs.length > 0) {
        // one or more exact String matches found for this location name
        for (int i = 0; i < results.scoreDocs.length; i++) {
          // add each matching location to the list of candidates
          ResolvedLocation location = new ResolvedLocation(indexSearcher.doc(results.scoreDocs[i].doc), locationName, false);
          logger.debug("{}", location);
          candidateMatches.add(location);
        }
      } else if (fuzzy) { // only if fuzzy matching is turned on
        // no exact String matches found -- fallback to fuzzy search

        // Using the tilde "~" makes this a fuzzy search. I compared this to FuzzyQuery
        // with TopTermsBoostOnlyBooleanQueryRewrite, I like the output better this way.
        // With the other method, we failed to match things like "Straßenhaus Airport"
        // as <Straßenhaus>, and the match scores didn't make as much sense.
        q = new AnalyzingQueryParser(Version.LUCENE_40, "indexName", indexAnalyzer).parse(sanitizedLocationName + "~");

        // collect all the fuzzy matches up to maxHits, and sort
        // them based on Lucene match score and population for the
        // associated GeoNames record
        results = indexSearcher.search(q, null, maxHitDepth, populationSort);

