Examples of org.apache.lucene.index.memory.SynonymMap

org.apache.lucene.index.memory.SynonymMap
Loads the WordNet prolog file wn_s.pl into a thread-safe main-memory hash map that can be used for fast high-frequency lookups of synonyms for any given (lowercase) word string.
The synonym relation is symmetric: if B is a synonym for A (A -> B), then A is also a synonym for B (B -> A). It is not necessarily transitive: A -> B and B -> C do not imply A -> C.
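The symmetric-but-not-transitive behaviour can be illustrated with a small self-contained sketch. It does not use Lucene's SynonymMap: the class `SynsetSketch` and its two hard-coded word groups are hypothetical stand-ins, under the assumption that words sharing a group are mutual synonyms and that one word may belong to several groups.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in, NOT Lucene's SynonymMap: builds a word -> synonyms
// table from groups of mutually synonymous words.
final class SynsetSketch {
    private final Map<String, TreeSet<String>> table = new TreeMap<>();

    SynsetSketch(List<List<String>> groups) {
        for (List<String> group : groups) {
            for (String word : group) {
                TreeSet<String> synonyms = table.computeIfAbsent(word, k -> new TreeSet<>());
                for (String other : group) {
                    if (!other.equals(word)) {
                        synonyms.add(other); // every other member of the group is a synonym
                    }
                }
            }
        }
    }

    // Returns a sorted array of synonyms, or an empty array for unknown words.
    String[] getSynonyms(String word) {
        TreeSet<String> synonyms = table.get(word);
        return synonyms == null ? new String[0] : synonyms.toArray(new String[0]);
    }

    boolean isSynonym(String a, String b) {
        return Arrays.asList(getSynonyms(a)).contains(b);
    }

    public static void main(String[] args) {
        // Illustrative groups only; "forest" belongs to both.
        SynsetSketch map = new SynsetSketch(Arrays.asList(
            Arrays.asList("woods", "forest", "wood"),
            Arrays.asList("forest", "timber", "timberland", "woodland")));

        System.out.println(map.isSynonym("woods", "forest"));  // true  (A -> B)
        System.out.println(map.isSynonym("forest", "woods"));  // true  (B -> A, symmetric)
        System.out.println(map.isSynonym("forest", "timber")); // true  (B -> C)
        System.out.println(map.isSynonym("woods", "timber"));  // false (no A -> C)
    }
}
```

Because "forest" sits in both groups, it links them without merging them, which is why "woods" and "timber" are not reported as synonyms of each other.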
Loading typically takes about 1.5 seconds, so it should be done only once per (server) program execution, using a singleton pattern. Once loaded, a synonym lookup via {@link #getSynonyms(String)} takes constant time O(1). A loaded default synonym map consumes about 10 MB of main memory. An instance is immutable, hence thread-safe.
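One way to follow the load-once advice is the initialization-on-demand holder idiom, sketched below. The names `SynonymsHolder`, `Lazy`, `load()` and `LOAD_COUNT` are hypothetical, and `load()` is a stand-in that returns a tiny unmodifiable map so the example runs without Lucene; in a real program it would presumably construct the SynonymMap from the wn_s.pl stream instead.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder: loads the synonym table at most once per JVM.
final class SynonymsHolder {
    // Counts how often the expensive load ran (for demonstration only).
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    private SynonymsHolder() {}

    // Stand-in for the expensive load; assumption: replace the body with
    // the real construction of the synonym map.
    private static Map<String, String[]> load() {
        LOAD_COUNT.incrementAndGet();
        Map<String, String[]> m = new HashMap<>();
        m.put("woods", new String[] {"forest", "wood"});
        return Collections.unmodifiableMap(m);
    }

    // The JVM initializes this nested class lazily and exactly once,
    // in a thread-safe way, on the first call to get().
    private static final class Lazy {
        static final Map<String, String[]> INSTANCE = load();
    }

    static Map<String, String[]> get() {
        return Lazy.INSTANCE;
    }

    public static void main(String[] args) {
        Map<String, String[]> first = SynonymsHolder.get();
        Map<String, String[]> second = SynonymsHolder.get();
        System.out.println(first == second);  // true: same shared instance
        System.out.println(LOAD_COUNT.get()); // 1: loaded only once
    }
}
```

The holder idiom needs no explicit locking, which suits a read-only table that many threads query after startup.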
This implementation borrows some ideas from the Lucene Syns2Index demo that Dave Spencer originally contributed to Lucene. Dave's approach involved a persistent Lucene index, which is suitable for occasional lookups or very large synonym tables but is considered unsuitable for high-frequency lookups of medium-sized synonym tables.
Example Usage:
```
String[] words = new String[] { "hard", "woods", "forest", "wolfish", "xxxx" };
SynonymMap map = new SynonymMap(new FileInputStream("samples/fulltext/wn_s.pl"));
for (int i = 0; i < words.length; i++) {
    String[] synonyms = map.getSynonyms(words[i]);
    System.out.println(words[i] + ":" + java.util.Arrays.asList(synonyms).toString());
}

Example output:
hard:[arduous, backbreaking, difficult, fermented, firmly, grueling, gruelling, heavily, heavy, intemperately, knockout, laborious, punishing, severe, severely, strong, toilsome, tough]
woods:[forest, wood]
forest:[afforest, timber, timberland, wood, woodland, woods]
wolfish:[edacious, esurient, rapacious, ravening, ravenous, voracious, wolflike]
xxxx:[]
```
@author whoschek.AT.lbl.DOT.gov
@see prologdb man page
@see Dave's synonym demo site

   /**
    * {@inheritDoc}
    */
   public void initialize(InputStream configuration) throws IOException
   {
      SynonymMap sm = null;
      try
      {
         sm = new SynonymMap(configuration);
      }
      catch (IOException e)
      {
         if (LOG.isTraceEnabled())
         {

   /**
    * {@inheritDoc}
    */
   public void initialize(InputStream configuration) throws IOException
   {
      SynonymMap sm = null;
      try
      {
         sm = new SynonymMap(configuration);
      }
      catch (IOException e)
      {
         // ignore
      }


                Log initLog = LogFactory.getLog(SynonymMapFactory.class);
                initLog.debug("Initializing SynonymMap.");
                try
                {
                    // NOTE: the synonym stream is closed inside SynonymMap
                    synonyms = new SynonymMap(thesaurusInputStream);
                }
                catch (Exception e)
                {
                    throw new RuntimeException(e);
                }


Related Classes of org.apache.lucene.index.memory.SynonymMap

org.eurekastreams.commons.search.analysis.SynonymMapFactory

org.exoplatform.services.jcr.impl.core.query.lucene.synonym.WordNetSynonyms

java.util.TreeSet

java.util.TreeMap

java.nio.charset.Charset

java.util.ArrayList

java.util.HashMap

java.util.Iterator
