Examples of org.apache.lucene.index.IndexWriter

org.apache.lucene.index.IndexWriter
An IndexWriter creates and maintains an index.
The create argument to the {@link #IndexWriter(Directory,Analyzer,boolean,MaxFieldLength) constructor} determines whether a new index is created or an existing index is opened. Note that you can open an index with create=true even while readers are using the index. The old readers will continue to search the "point in time" snapshot they had opened, and won't see the newly created index until they re-open. There are also {@link #IndexWriter(Directory,Analyzer,MaxFieldLength) constructors} with no create argument, which create a new index if there is not already one at the provided path and otherwise open the existing index.

In either case, documents are added with {@link #addDocument(Document) addDocument} and removed with {@link #deleteDocuments(Term)} or {@link #deleteDocuments(Query)}. A document can be updated with {@link #updateDocument(Term,Document) updateDocument} (which just deletes and then adds the entire document). When finished adding, deleting and updating documents, {@link #close() close} should be called.

These changes are buffered in memory and periodically flushed to the {@link Directory} (during the above method calls). A flush is triggered when there are enough buffered deletes (see {@link #setMaxBufferedDeleteTerms}) or enough added documents since the last flush, whichever is sooner. For the added documents, flushing is triggered either by RAM usage of the documents (see {@link #setRAMBufferSizeMB}) or the number of added documents. The default is to flush when RAM usage hits 16 MB. For best indexing speed you should flush by RAM usage with a large RAM buffer. Note that flushing just moves the internal buffered state in IndexWriter into the index, but these changes are not visible to IndexReader until either {@link #commit()} or {@link #close} is called. A flush may also trigger one or more segment merges, which by default run in a background thread so as not to block the addDocument calls (see below for changing the {@link MergeScheduler}).
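
The "whichever is sooner" rule above can be illustrated with a toy model in plain Java. This is only a sketch of the decision logic as described in the text, not Lucene's implementation: the class FlushTriggerSketch, its fields, and the convention that a limit of 0 means "disabled" are all assumptions made for illustration.

```java
/** Toy model of the flush decision described above; NOT Lucene's code. */
final class FlushTriggerSketch {
    // Hypothetical stand-ins for the limits mentioned in the text.
    private final double ramBufferMB;         // cf. setRAMBufferSizeMB
    private final int maxBufferedDocs;        // 0 means "disabled" (assumption)
    private final int maxBufferedDeleteTerms; // 0 means "disabled" (assumption)

    private long bufferedBytes;
    private int bufferedDocs;
    private int bufferedDeleteTerms;
    private int flushCount;

    FlushTriggerSketch(double ramBufferMB, int maxBufferedDocs, int maxBufferedDeleteTerms) {
        this.ramBufferMB = ramBufferMB;
        this.maxBufferedDocs = maxBufferedDocs;
        this.maxBufferedDeleteTerms = maxBufferedDeleteTerms;
    }

    /** Buffers one document of the given size and flushes if a limit is hit. */
    void addDocument(long docBytes) {
        bufferedBytes += docBytes;
        bufferedDocs++;
        flushIfNeeded();
    }

    /** Buffers one delete term and flushes if a limit is hit. */
    void deleteTerm() {
        bufferedDeleteTerms++;
        flushIfNeeded();
    }

    // Whichever limit is reached first triggers the flush.
    private void flushIfNeeded() {
        boolean ramFull = bufferedBytes >= (long) (ramBufferMB * 1024 * 1024);
        boolean docsFull = maxBufferedDocs > 0 && bufferedDocs >= maxBufferedDocs;
        boolean deletesFull = maxBufferedDeleteTerms > 0
                && bufferedDeleteTerms >= maxBufferedDeleteTerms;
        if (ramFull || docsFull || deletesFull) {
            bufferedBytes = 0;
            bufferedDocs = 0;
            bufferedDeleteTerms = 0;
            flushCount++;
        }
    }

    int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        // 16 MB buffer, document-count and delete-term limits disabled.
        FlushTriggerSketch w = new FlushTriggerSketch(16.0, 0, 0);
        for (int i = 0; i < 40; i++) {
            w.addDocument(1024L * 1024L); // 1 MB per document
        }
        System.out.println("flushes = " + w.flushCount());
    }
}
```

In this model a larger RAM buffer simply means fewer flushes for the same stream of documents, which is the intuition behind the advice to flush by RAM usage with a large buffer.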

If an index will not have more documents added for a while and optimal search performance is desired, then either the full {@link #optimize() optimize} method or the partial {@link #optimize(int)} method should be called before the index is closed.

Opening an IndexWriter creates a lock file for the directory in use. Trying to open another IndexWriter on the same directory will lead to a {@link LockObtainFailedException}. The {@link LockObtainFailedException} is also thrown if an IndexReader on the same directory is used to delete documents from the index.
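
To make the lock-file idea concrete, here is a minimal plain-Java sketch of one common way such a lock can work: the first holder atomically creates a marker file, and a second attempt fails while that file exists. The class WriteLockSketch and the file name used here are illustrative assumptions; Lucene's own lock implementations are pluggable and may behave differently.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Illustrative lock-file sketch; NOT Lucene's lock implementation. */
final class WriteLockSketch {
    private final File lockFile;
    private boolean held;

    WriteLockSketch(File indexDir) {
        // File name chosen for illustration only.
        this.lockFile = new File(indexDir, "write.lock");
    }

    /** Returns true if this instance created the lock file, false if it already existed. */
    boolean obtain() throws IOException {
        held = lockFile.createNewFile(); // atomic create-if-absent
        return held;
    }

    /** Deletes the lock file, but only if this instance holds the lock. */
    void release() {
        if (held && lockFile.delete()) {
            held = false;
        }
    }

    /** Returns {first obtained, second obtained while held, second obtained after release}. */
    static boolean[] demo() {
        try {
            File dir = Files.createTempDirectory("index-lock-sketch").toFile();
            WriteLockSketch first = new WriteLockSketch(dir);
            WriteLockSketch second = new WriteLockSketch(dir);
            boolean firstGot = first.obtain();
            boolean secondGot = second.obtain();   // fails: lock is held
            first.release();
            boolean secondRetry = second.obtain(); // succeeds after release
            second.release();
            dir.delete();
            return new boolean[] { firstGot, secondGot, secondRetry };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first=" + r[0] + " second=" + r[1] + " retry=" + r[2]);
    }
}
```

In the real API the failed second attempt surfaces as the {@link LockObtainFailedException} described above rather than as a boolean.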

Expert: IndexWriter allows an optional {@link IndexDeletionPolicy} implementation to be specified. You can use this to control when prior commits are deleted from the index. The default policy is {@link KeepOnlyLastCommitDeletionPolicy}, which removes all prior commits as soon as a new commit is done (this matches behavior before 2.2). Creating your own policy can allow you to explicitly keep previous "point in time" commits alive in the index for some time, to allow readers to refresh to the new commit without having the old commit deleted out from under them. This is necessary on filesystems like NFS that do not support "delete on last close" semantics, which Lucene's "point in time" search normally relies on.

Expert: IndexWriter allows you to separately change the {@link MergePolicy} and the {@link MergeScheduler}. The {@link MergePolicy} is invoked whenever there are changes to the segments in the index. Its role is to select which merges to do, if any, and return a {@link MergePolicy.MergeSpecification} describing the merges. It also selects merges to do for optimize(). (The default is {@link LogByteSizeMergePolicy}.) Then, the {@link MergeScheduler} is invoked with the requested merges and it decides when and how to run the merges. The default is {@link ConcurrentMergeScheduler}.

NOTE: if you hit an OutOfMemoryError then IndexWriter will quietly record this fact and block all future segment commits. This is a defensive measure in case any internal state (buffered documents and deletions) was corrupted. Any subsequent calls to {@link #commit()} will throw an IllegalStateException. The only course of action is to call {@link #close()}, which internally will call {@link #rollback()}, to undo any changes to the index since the last commit. You can also just call {@link #rollback()} directly.

NOTE: {@link IndexWriter} instances are completely thread-safe, meaning multiple threads can call any of its methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexWriter instance, as this may cause deadlock; use your own (non-Lucene) objects instead.
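
The advice about external synchronization can be sketched in plain Java as follows. The DocWriter interface is a hypothetical stand-in for a thread-safe writer such as IndexWriter, and ExternalLockSketch is an invented name; the point is only that the application locks on its own private object and never on the writer itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch: guard a multi-call operation with an application-owned lock. */
final class ExternalLockSketch {
    /** Hypothetical stand-in for a thread-safe writer such as IndexWriter. */
    interface DocWriter {
        void addDocument(String doc);
    }

    private final DocWriter writer;
    // Application-owned monitor. The code never uses synchronized (writer).
    private final Object batchLock = new Object();

    ExternalLockSketch(DocWriter writer) {
        this.writer = writer;
    }

    /** Adds a whole batch so that batches from different threads do not interleave. */
    void addBatch(List<String> docs) {
        synchronized (batchLock) {
            for (String doc : docs) {
                writer.addDocument(doc);
            }
        }
    }

    /** Runs two competing batches and returns the order in which documents arrived. */
    static List<String> demo() {
        List<String> sink = Collections.synchronizedList(new ArrayList<String>());
        ExternalLockSketch service = new ExternalLockSketch(sink::add);
        Thread a = new Thread(() -> service.addBatch(Arrays.asList("a1", "a2", "a3")));
        Thread b = new Thread(() -> service.addBatch(Arrays.asList("b1", "b2", "b3")));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(sink);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Individual calls on the writer need no lock at all; the private monitor is only there for the application-level invariant (here, keeping each batch contiguous).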

NOTE: If you call Thread.interrupt() on a thread that's within IndexWriter, IndexWriter will try to catch this (e.g., if it's in a wait() or Thread.sleep()), and will then throw the unchecked exception {@link ThreadInterruptedException} and clear the interrupt status on the thread.
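
Because the interrupt status is cleared when the unchecked exception is thrown, a caller that still wants the interrupt to be visible further up the stack has to set it again. The following plain-Java sketch shows that pattern under stated assumptions: UncheckedInterrupt is a hypothetical stand-in for {@link ThreadInterruptedException}, and blockingCall() merely imitates a library method that sleeps internally.

```java
/** Sketch of handling an interrupt that a library converts to an unchecked exception. */
final class InterruptSketch {
    /** Hypothetical stand-in for Lucene's unchecked ThreadInterruptedException. */
    static final class UncheckedInterrupt extends RuntimeException {
        UncheckedInterrupt(InterruptedException cause) {
            super(cause);
        }
    }

    /** Imitates a library call that sleeps internally and converts the interrupt. */
    static void blockingCall() {
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            // Thread.sleep has already cleared the interrupt status at this point.
            throw new UncheckedInterrupt(e);
        }
    }

    /** Returns {status was clear after the exception, status was restored by the caller}. */
    static boolean[] demo() {
        Thread.currentThread().interrupt(); // simulate an interrupt arriving
        boolean clearedAfterThrow = false;
        try {
            blockingCall();
        } catch (UncheckedInterrupt e) {
            clearedAfterThrow = !Thread.currentThread().isInterrupted();
            Thread.currentThread().interrupt(); // restore the status for outer code
        }
        // Thread.interrupted() reads and clears, so the demo leaves no flag behind.
        boolean restored = Thread.interrupted();
        return new boolean[] { clearedAfterThrow, restored };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("cleared=" + r[0] + " restored=" + r[1]);
    }
}
```

Whether to restore the status is an application decision; the sketch only shows that it is lost unless the caller does so.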

    Document doc = new Document();
    doc.add(new Field("id", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("author", "IBM OSI", Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("time", Long.toString(System.currentTimeMillis()), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("content", "...", Field.Store.NO, Field.Index.TOKENIZED)); // original non-ASCII sample text was lost to an encoding error
    IndexWriter writer = getWriter();
    try {
        writer.addDocument(doc);
        writer.optimize();
        // Read the count while the writer is still open, not after close().
        System.out.println("doc count = " + writer.docCount());
    } finally {
        writer.close();
    }
        return 1;
    }


      }
      wc++;
        }
        File segments = new File(lucenePath + File.separator + SEGMENTS);
        boolean bCreate = !segments.exists();
        return new IndexWriter(lucenePath,new StandardAnalyzer(),bCreate);
    }


    for(int i=0;indexFields!=null && i<indexFields.length;i++){
      String propertyValue = getField(doc, indexFields[i]);
      lucene_doc.add(UnStored(indexFields[i], propertyValue));
    }
    //Write document
    IndexWriter writer = getWriter(doc.name());
    try {
        writer.addDocument(lucene_doc);
        writer.optimize();
    }finally {
      try{
        writer.close();
      }catch(Exception e){
        log.error("Error occur when closing IndexWriter", e);
      }finally{
        writer = null;
      }


        path.append(File.separator);
        path.append(SEGMENTS);
        File segments = new File(path.toString());
        try{
          boolean bCreate = !segments.exists();
          return new IndexWriter(index_path,new StandardAnalyzer(),bCreate);
        }finally{
          path = null;
          segments = null;
          rp = null;
        }


    try {
      indexName = indexFile.getCanonicalPath();
      directory = FSDirectory.getDirectory(indexName);
      if (create) {
        log.debug("Initialize index: '" + indexFile + "'");
        IndexWriter iw = new IndexWriter(directory,
            new StandardAnalyzer(), create);
        iw.close();
      }
    } catch (IOException e) {
      throw new HibernateException("Unable to initialize index: "
          + indexFile, e);
    }


                    //## LUCENE3 begin ##
                    File f = new File(path);
                    Directory indexDir = FSDirectory.open(f);
                    boolean recreate = !IndexReader.indexExists(indexDir);
                    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
                    IndexWriter writer = new IndexWriter(indexDir, analyzer,
                            recreate, IndexWriter.MaxFieldLength.UNLIMITED);
                    //see http://wiki.apache.org/lucene-java/NearRealtimeSearch
                    IndexReader reader = writer.getReader();
                    access = new IndexAccess();
                    access.writer = writer;
                    access.reader = reader;
                    access.searcher = new IndexSearcher(reader);
                    //## LUCENE3 end ##


                String       botname = resultSet.getString("botname").toLowerCase();
                String       channel = resultSet.getString("channel").substring(1).toLowerCase();
                String       key = getIndexName(botname, servername, channel);
                Document     doc = new Document();
                Timestamp     moment = resultSet.getTimestamp("moment");
                IndexWriter   writer = (IndexWriter)index_writer_cache.get(key);


                if (writer == null)
                {
                  writer = getIndexWriter(key, true);


                  index_writer_cache.put(key, writer);
                }


                beginningOfDay.setTime(moment);
                beginningOfDay.setTimeZone(TimeZone.getTimeZone(DroneConfig.getTimezone()));
                beginningOfDay.set(Calendar.HOUR_OF_DAY, 0); // HOUR is the 12-hour field and would leave PM times at noon
                beginningOfDay.set(Calendar.MINUTE, 0);
                beginningOfDay.set(Calendar.SECOND, 0);
                beginningOfDay.set(Calendar.MILLISECOND, 0);


                long dayInMillis = beginningOfDay.getTimeInMillis();
                long timeInMillis = moment.getTime() - dayInMillis;


                // need to store channel without the '#'
                doc.add(new Field("moment", DateField.dateToString(resultSet.getTimestamp("moment")), true, true, true));
                doc.add(new Field("momentDateSort", String.valueOf(dayInMillis), false, true, false));
                doc.add(new Field("momentTimeSort", String.valueOf(timeInMillis), false, true, false));
                doc.add(new Field("botname", botname, true, true, true));
                doc.add(new Field("channel", channel, true, true, true));
                doc.add(new Field("servername", servername, true, true, true));
                doc.add(new Field("nickname", resultSet.getString("nickname"), true, true, true));
                doc.add(new Field("username", resultSet.getString("username"), true, true, true));
                doc.add(new Field("hostname", resultSet.getString("hostname"), true, true, true));
                doc.add(new Field("message", resultSet.getString("message"), true, true, true));


                try
                {
                  writer.addDocument(doc, getAnalyzer());
                }
                catch (IOException e)
                {
                  Logger
                    .getLogger("com.uwyn.drone.tools")


        indexArticle( article );
    }


    private void indexArticle(Article article) throws IOException {
        Document doc = article.getOveriewDocument();
        IndexWriter writer = new IndexWriter( forum.getArticleOverview(), ws, true );
        writer.addDocument( doc );
        writer.close();
    }


  
  public void testIndexReaderReopen() throws Exception{
    Directory idxDir = new RAMDirectory();
    Document[] docs = buildData();
    
    IndexWriter writer = new IndexWriter(idxDir,new StandardAnalyzer(Version.LUCENE_29),MaxFieldLength.UNLIMITED);
    writer.addDocument(docs[0]);
    writer.optimize();
    writer.commit();
    
    IndexReader idxReader = IndexReader.open(idxDir,true);
    BoboIndexReader boboReader = BoboIndexReader.getInstance(idxReader,_fconf);


    
    for (int i=1;i<docs.length;++i){
      Document doc = docs[i];
      int numDocs = boboReader.numDocs();
      BoboIndexReader reader = (BoboIndexReader)boboReader.reopen(true);
      assertSame(boboReader,reader);
      
      Directory tmpDir = new RAMDirectory();
      IndexWriter subWriter = new IndexWriter(tmpDir,new StandardAnalyzer(Version.LUCENE_29),MaxFieldLength.UNLIMITED);
      subWriter.addDocument(doc);
      subWriter.optimize();
      subWriter.close();
      writer.addIndexesNoOptimize(new Directory[]{tmpDir});
      writer.commit();
      reader = (BoboIndexReader)boboReader.reopen();
      assertNotSame(boboReader, reader);
      assertEquals(numDocs+1,reader.numDocs());


  {
    List<FacetHandler<?>> facetHandlers = new ArrayList<FacetHandler<?>>();
    /* Underlying time facet for DynamicTimeRangeFacetHandler */
    facetHandlers.add(new RangeFacetHandler("timeinmillis", new PredefinedTermListFactory(Long.class, DynamicTimeRangeFacetHandler.NUMBER_FORMAT),null));
    Directory idxDir = new RAMDirectory();
    IndexWriter writer = new IndexWriter(idxDir,new StandardAnalyzer(Version.LUCENE_29),MaxFieldLength.UNLIMITED);
    
    long now = System.currentTimeMillis();
    DecimalFormat df = new DecimalFormat(DynamicTimeRangeFacetHandler.NUMBER_FORMAT);
    for(long l=0; l<53; l++)
    {
      Document d = new Document();
      d.add(buildMetaField("timeinmillis", df.format(now - l*3500000)));
      writer.addDocument(d);
      writer.optimize();
      writer.commit();
    }
    IndexReader idxReader = IndexReader.open(idxDir,true);
    BoboIndexReader boboReader = BoboIndexReader.getInstance(idxReader,facetHandlers);
    BoboBrowser browser = new BoboBrowser(boboReader);
    List<String> ranges = new ArrayList<String>();



Related Classes of org.apache.lucene.index.IndexWriter

com.alimama.mdrill.solr.hbaserealtime.realtime.RealTimeDirectoryData

com.alimama.mdrill.solr.realtime.realtime.RealTimeDirectoryData

com.browseengine.bobo.facets.attribute.AttributesFacetHandlerTest

com.browseengine.bobo.test.BoboTestCase

com.gentics.cr.lucene.indexaccessor.DefaultIndexAccessor

fi.foyt.hibernate.gae.search.GaeInstantWorkspace

index.IndexBuilder

it.eng.spagobi.commons.utilities.indexing.LuceneIndexer

org.apache.derby.optional.lucene.LuceneSupport

org.apache.jackrabbit.core.query.lucene.AbstractIndex
