-
Solr Timeouts | Giovanni Fernandez-Kincade | 2009-10-05, 16:03
Hi,
I'm attempting to index approximately 6 million HTML/Text files using SOLR 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64 bit Tomcat and JVM. I've fired up 4-5 different jobs that are making indexing requests using the ExtractionRequestHandler, and everything works well for about 30-40 minutes, after which all indexing requests start timing out. I profiled the server and found that all of the threads are getting blocked by this call to flush the Lucene index to disk (see below).

This leads me to a few questions:

1. Is this normal?
2. Can I reduce the frequency with which this happens somehow? I've greatly increased the indexing options in SolrConfig.xml (attached here) to no avail.
3. During these flushes, resource utilization (CPU, I/O, Memory Consumption) is significantly down compared to when requests are being handled. Is there any way to make this index go faster? I have plenty of bandwidth on the machine.

I appreciate any insight you can provide. We're currently using MS SQL 2005 as our full-text solution and are pretty much miserable. So far SOLR has been a great experience.

Thanks,
Gio.
http-8080-Processor21 [RUNNABLE] CPU time: 9:51
java.io.RandomAccessFile.seek(long)
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
org.apache.lucene.store.BufferedIndexInput.refill()
org.apache.lucene.store.BufferedIndexInput.readByte()
org.apache.lucene.store.IndexInput.readVInt()
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
org.apache.lucene.index.TermInfosReader.get(Term, boolean)
org.apache.lucene.index.TermInfosReader.get(Term)
org.apache.lucene.index.SegmentTermDocs.seek(Term)
org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.closeInternal(boolean)
org.apache.lucene.index.IndexWriter.close(boolean)
org.apache.lucene.index.IndexWriter.close()
org.apache.solr.update.SolrIndexWriter.close()
org.apache.solr.update.DirectUpdateHandler2.closeWriter()
org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
java.lang.Thread.run()
-
RE: Solr Timeouts | Feak, Todd | 2009-10-05, 16:10
How often are you committing?
Every time you commit, Solr will close the old index and open the new one. If you are doing this in parallel from multiple jobs (4-5 you mention) then eventually the server gets behind and you start to pile up commit requests. Once this starts to happen, it will cascade out of control if the rate of commits isn't slowed.

-Todd
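Todd's point about commits piling up can be put in rough numbers. The sketch below is a toy model, not a measurement of Solr: it assumes the server finishes commits one at a time, and the function name `pending_commits`, its parameters, and the figures in the note after it are all hypothetical.

```python
def pending_commits(elapsed_s, jobs, commit_interval_s, commit_duration_s):
    """Estimate how many commit requests are still waiting after elapsed_s seconds.

    Toy model (an assumption, not Solr's actual scheduling): each job sends one
    commit every commit_interval_s seconds, and the server completes commits
    one at a time, each taking commit_duration_s seconds.
    """
    # Commits requested so far by all jobs together.
    requested = jobs * (elapsed_s // commit_interval_s)
    # The server cannot complete more commits than were requested.
    completed = min(requested, elapsed_s // commit_duration_s)
    return requested - completed
```

In this model, five jobs each committing once a minute against a commit that takes 20 seconds leave 60 commits waiting after half an hour (`pending_commits(1800, 5, 60, 20)`). If the commit takes 10 seconds instead, the backlog is zero.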
-
Re: Solr Timeouts | Yonik Seeley | 2009-10-05, 16:14
On Mon, Oct 5, 2009 at 12:03 PM, Giovanni Fernandez-Kincade <[EMAIL PROTECTED]> wrote:
> 1. Is this normal?

Yes... one can't currently add documents when the first part of a commit is going on (closing the IndexWriter). The threads will normally block and then resume after the writer has been successfully closed. This is normally fine and you can work around it by increasing the servlet container timeout. Due to advances in Lucene, this restriction will probably be lifted in the next version of Solr (1.5).

> 2. Can I reduce the frequency with which this happens somehow? I’ve
> greatly increased the indexing options in SolrConfig.xml (attached here) to
> no avail.

It looks like Solr is committing because you told it to?

> 3. During these flushes, resource utilization (CPU, I/O, Memory
> Consumption) is significantly down compared to when requests are being
> handled. Is there any way to make this index go faster? I have plenty of
> bandwidth on the machine.

Don't commit until you're done a big indexing run? If you're using SolrJ, use the StreamingUpdateSolrServer.... it's much faster!

-Yonik
http://www.lucidimagination.com
-
RE: Solr Timeouts | Giovanni Fernandez-Kincade | 2009-10-05, 16:30
I'm not committing at all actually - I'm waiting for all 6 million to be done.
-
Re: Solr Timeouts | Yonik Seeley | 2009-10-05, 16:37
On Mon, Oct 5, 2009 at 12:30 PM, Giovanni Fernandez-Kincade <[EMAIL PROTECTED]> wrote:
> I'm not committing at all actually - I'm waiting for all 6 million to be done.

You either have solr auto commit set up, or a client is issuing a commit.

-Yonik
http://www.lucidimagination.com
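One way a client can issue a commit without anyone noticing is through the request URL itself: Solr's update handlers treat a `commit` or `optimize` request parameter as an instruction to commit. A minimal sketch for auditing the URLs an indexing job sends, assuming the job can log them. The host, the `/solr/update/extract` path, and the `literal.id` value in the usage lines are illustrative, and for simplicity only the literal value `true` counts as set here.

```python
from urllib.parse import urlsplit, parse_qs

def requests_commit(url):
    """Return True if the update URL asks Solr to commit or optimize."""
    # parse_qs maps each parameter name to the list of values it was given.
    params = parse_qs(urlsplit(url).query)
    return any(
        value.lower() == "true"
        for name in ("commit", "optimize")
        for value in params.get(name, [])
    )

# Hypothetical URLs, for illustration only.
with_commit = "http://localhost:8080/solr/update/extract?literal.id=doc1&commit=true"
without_commit = "http://localhost:8080/solr/update/extract?literal.id=doc1"
```

Here `requests_commit(with_commit)` is true and `requests_commit(without_commit)` is false. A job that sends the first form with every document is committing once per document.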
-
RE: Solr Timeouts | Feak, Todd | 2009-10-05, 16:38
Ok. Guess that isn't a problem. :)
A second consideration... I could see lock contention being an issue with multiple clients indexing at once. Is there any disadvantage to serializing the clients to remove lock contention?

-Todd
-
RE: Solr Timeouts | Feak, Todd | 2009-10-05, 16:40
Actually, ignore my other response.
I believe you are committing, whether you know it or not. This is in your provided stack trace:

org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)

I think Yonik gave you additional information for how to make it faster.

-Todd
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-05, 16:44
Is there somewhere other than solrConfig.xml that the autoCommit feature is enabled? I've looked through that file and found autocommit to be commented out:

<!--
  Perform a <commit/> automatically under certain conditions:
    maxDocs - number of updates since last commit is greater than this
    maxTime - oldest uncommited update (in ms) is this long ago
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>1000</maxTime>
  </autoCommit>
-->

This is what one of my SOLR requests looks like:

http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false
-
Re: Solr Timeouts Yonik Seeley 2009-10-05, 16:52
> This is what one of my SOLR requests look like:
>
> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false

Have you verified that all of your indexing jobs (you said you had 4 or 5) have commit=false?

Also make sure that your extract handler doesn't have a default of something that could cause a commit - like commitWithin or something.

-Yonik
http://www.lucidimagination.com
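The first check Yonik suggests (every job really sends commit=false) is easy to script against the request URLs the jobs generate. The following is a minimal sketch under stated assumptions: `commit_params` is a hypothetical helper, not part of Solr, and the `COMMIT_PARAMS` tuple is an illustrative guess at which query parameters matter, not an exhaustive list.

```python
from urllib.parse import urlsplit, parse_qs

# Query parameters that can ask Solr to commit. This set is an assumption
# made for the sketch, not an exhaustive list of Solr parameters.
COMMIT_PARAMS = ("commit", "commitWithin", "optimize")


def commit_params(url):
    """Return {name: value} for commit-related parameters not set to 'false'."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    flagged = {}
    for name in COMMIT_PARAMS:
        for value in query.get(name, []):
            if value.strip().lower() != "false":
                flagged[name] = value
    return flagged


# The request URL quoted in the thread, shortened to two literals for readability.
sample = ("http://titans:8080/solr/update/extract/"
          "?literal.versionId=684936&resource.name=684936.txt&commit=false")
print(commit_params(sample))
```

Running this over a log of the URLs each job actually sent would flag any request that carries a commit-triggering parameter. An empty dictionary for every URL means the requests themselves are not asking for a commit.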
-
RE: Solr Timeouts Walter Underwood 2009-10-05, 16:52
How long is your timeout? Maybe it should be longer, since this is normal Solr behavior.

--wunder
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-05, 17:04
I'm fairly certain that all of the indexing jobs are calling SOLR with commit=false. They all construct the indexing URLs using a CLR function I wrote, which takes in a Commit parameter, which is always set to false.

Also, I don't see any calls to commit in the Tomcat logs (whereas normally when I make a commit call I do).

This suggests that Solr is doing it automatically, but the extract handler doesn't seem to be the problem:

<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler" startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">fileData</str>
  </lst>
</requestHandler>

There is no external config file specified, and I don't see anything about commits here.

I've tried setting up more detailed indexer logging but haven't been able to get it to work:

<infoStream file="c:\solr\indexer.log">true</infoStream>

I tried relative and absolute paths, but no dice so far.

Any other ideas?

-Gio.
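The manual inspection of the extract handler described in this message can also be done mechanically. The following is a rough sketch, assuming the handler definition is available as an XML string. `commit_settings` is a hypothetical helper, and the parameter names in `COMMIT_PARAMS` are an illustrative guess rather than a complete list. It scans the `defaults`, `appends` and `invariants` sections of a `<requestHandler>` element for commit-related parameters.

```python
import xml.etree.ElementTree as ET

# Parameter names treated as commit triggers (assumption for this sketch).
COMMIT_PARAMS = ("commit", "commitWithin", "optimize")
# Sections of a requestHandler that can inject parameters into a request.
PARAM_SECTIONS = ("defaults", "appends", "invariants")


def commit_settings(handler_xml):
    """Return (section, parameter, value) tuples for commit-related settings."""
    root = ET.fromstring(handler_xml)
    flagged = []
    for section in root.findall("lst"):
        if section.get("name") not in PARAM_SECTIONS:
            continue
        for param in section:
            if param.get("name") in COMMIT_PARAMS:
                value = (param.text or "").strip()
                flagged.append((section.get("name"), param.get("name"), value))
    return flagged


# The handler definition posted in the thread.
HANDLER = """
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">fileData</str>
  </lst>
</requestHandler>
"""
print(commit_settings(HANDLER))
```

For the handler posted here the result is an empty list, which matches the observation that nothing in the handler configuration asks for a commit.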
-
Re: Solr Timeouts Yonik Seeley 2009-10-05, 17:17
OK... next step is to verify that SolrCell doesn't have a bug that causes it to commit. I'll try and verify today unless someone else beats me to it.

-Yonik
http://www.lucidimagination.com
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-05, 17:29
Thanks for your help!
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-05, 18:11
I just grabbed another stack trace for a thread that has been similarly blocking for over an hour. Notice that there is no Commit in this one:

http-8080-Processor67 [RUNNABLE] CPU time: 1:02:05
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
org.apache.lucene.index.TermInfosReader.get(Term, boolean)
org.apache.lucene.index.TermInfosReader.get(Term)
org.apache.lucene.index.SegmentTermDocs.seek(Term)
org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(SolrContentHandler, AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(SolrContentHandler)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
java.lang.Thread.run()
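The difference between the two traces posted in this thread is which caller reached IndexWriter.flush: the first goes through DirectUpdateHandler2.commit, while this one goes through IndexWriter.updateDocument during an ordinary add. With many thread dumps to compare, that distinction can be checked automatically. The following is a small sketch, assuming each trace is available as a list of frame strings, innermost first. `flush_trigger` is a hypothetical helper, and the frame names it matches are taken from the two traces in this thread only.

```python
def flush_trigger(frames):
    """Classify which caller led to IndexWriter.flush in a list of stack frames."""
    # Drop the argument lists so only the qualified method names remain.
    names = [frame.split("(")[0].strip() for frame in frames]
    if not any(name.endswith("IndexWriter.flush") for name in names):
        return "no flush"
    if any(name.endswith("DirectUpdateHandler2.commit") for name in names):
        return "commit"
    if any(name.endswith("IndexWriter.updateDocument") for name in names):
        return "add/update"
    return "other"


# Abbreviated versions of the two traces from the thread.
first_trace = [
    "org.apache.lucene.index.IndexWriter.applyDeletes()",
    "org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)",
    "org.apache.lucene.index.IndexWriter.close()",
    "org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)",
]
second_trace = [
    "org.apache.lucene.index.IndexWriter.applyDeletes()",
    "org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)",
    "org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)",
    "org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)",
]
print(flush_trigger(first_trace), flush_trigger(second_trace))
```

A dump dominated by "add/update" results would suggest that the long flushes are being reached from document adds rather than from explicit or automatic commits.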
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-06, 16:33
Is it possible that deletions are triggering these commits? Some of the documents that I'm making indexing requests for already exist in the index, so they would result in deletions. I tried messing with some of these parameters but I'm still running into the same problem:

<deletionPolicy class="solr.SolrDeletionPolicy">
  <!-- Keep only optimized commit points -->
  <str name="keepOptimizedOnly">false</str>
  <!-- The maximum number of commit points to be kept -->
  <str name="maxCommitsToKeep">100</str>
  <!--
      Delete all commit points once they have reached the given age.
      Supports DateMathParser syntax e.g.

      <str name="maxCommitAge">30MINUTES</str>
      <str name="maxCommitAge">1DAY</str>
  -->
</deletionPolicy>

This is happening like every 30-40 minutes and it's really hampering the indexing progress...
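The stack trace earlier in the thread runs from IndexWriter.updateDocument through flush into applyDeletes, which fits the question about deletions: in Lucene, updating a document means deleting whatever matches the given term and then adding the new version, and the buffered deletes are resolved against the existing segments when the writer flushes. The Python toy below is a sketch of that idea only, not Lucene's implementation; ToyWriter, its fields and the flush threshold are all hypothetical, and segment merging is ignored.

```python
class ToyWriter:
    """Toy model of an index writer that buffers delete-by-term operations."""

    def __init__(self, max_buffered_docs):
        self.max_buffered_docs = max_buffered_docs
        self.segments = []            # each flushed segment is a set of document ids
        self.buffered_docs = {}       # id -> document still held in memory
        self.buffered_deletes = set() # ids whose older versions must be removed
        self.term_lookups = 0         # work done while applying deletes

    def update_document(self, doc_id, doc):
        # An update is "delete anything with this id, then add the new version".
        self.buffered_deletes.add(doc_id)
        self.buffered_docs[doc_id] = doc
        if len(self.buffered_docs) >= self.max_buffered_docs:
            self.flush()

    def flush(self):
        # Every buffered delete is looked up in every segment already on disk.
        for segment in self.segments:
            for doc_id in self.buffered_deletes:
                self.term_lookups += 1
                segment.discard(doc_id)
        self.segments.append(set(self.buffered_docs))
        self.buffered_docs.clear()
        self.buffered_deletes.clear()


writer = ToyWriter(max_buffered_docs=100)
for i in range(1000):
    writer.update_document(f"doc-{i}", "body")
print(len(writer.segments), writer.term_lookups)
```

In this model the lookup cost of each flush grows with the number of segments already written, so flushes get slower as the index grows even when no commit is requested. Whether that is what is happening on the poster's server is not established by the thread.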
-
Re: Solr Timeouts Lance Norskog 2009-10-06, 18:59
Is this Java 1.5? There are known threading bugs in 1.5 that were fixed in Java 1.6. Also, there was one short series of 1.6 releases that wrote bogus Lucene index files. So, make sure you use the latest Java 1.6 release.

Also, I hope this is a local disk. Some shops try running over NFS or Windows file sharing and this often does not work well.

Lance

Lance Norskog
[EMAIL PROTECTED]
-
Re: Solr Timeouts Yonik Seeley 2009-10-06, 19:06
This specific thread was blocked for an hour? If so, I'd echo Lance... this is a local disk right?

-Yonik
http://www.lucidimagination.com

On Mon, Oct 5, 2009 at 2:11 PM, Giovanni Fernandez-Kincade <[EMAIL PROTECTED]> wrote:
> I just grabbed another stack trace for a thread that has been similarly blocking for over an hour.
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-06, 19:37
Yeah this is Java 1.6. The indexes are being written to a local disk, but the files being indexed live on NFS.
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-06, 19:38
That thread was blocking for an hour while all other threads were idle or blocked.
-
RE: Solr Timeouts Feak, Todd 2009-10-06, 20:32
I seem to recall hearing something about *not* putting a Solr index directory on an NFS mount. Might want to search on that.

That, of course, doesn't have anything to do with commits showing up unexpectedly in stack traces, per your original email.

-Todd
-
Re: Solr Timeouts Mark Miller 2009-10-06, 20:43
It sounds like he is indexing on a local disk, but reading the files to be indexed from NFS - which would be fine.

You can get Lucene indexes to work on NFS (though still not recommended), but you need to use a custom IndexDeletionPolicy to keep older commit points around longer and be sure not to use NIOFSDirectory.

- Mark
http://www.lucidimagination.com
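Mark's suggestion of keeping older commit points around longer can be pictured as an age-based retention rule, similar in spirit to the maxCommitAge setting that appears in the SolrDeletionPolicy comment quoted earlier in the thread. The Python function below is a toy illustration under that assumption; it is not Lucene's IndexDeletionPolicy interface, and the function name, the timestamps and the choice to always keep the newest commit are hypothetical.

```python
def surviving_commits(commit_times, now, max_age_seconds):
    """Return the commit timestamps that an age-based policy would keep.

    The newest commit always survives; older ones survive only while they
    are no older than max_age_seconds.
    """
    if not commit_times:
        return []
    newest = max(commit_times)
    return sorted(
        t for t in commit_times
        if t == newest or now - t <= max_age_seconds
    )


# Commits made 30 min, 20 min, 5 min and 10 s before "now" (times in seconds).
commits = [0, 600, 1500, 1790]
print(surviving_commits(commits, now=1800, max_age_seconds=600))
```

A longer retention window keeps more old commit points, and the files they reference, on disk, so the setting trades disk space for the safety of slow or remote readers.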
-
RE: Solr Timeouts Giovanni Fernandez-Kincad... 2009-10-06, 20:49
Yeah that's exactly right Mark.

What does the "maxCommitsToKeep" (from SolrDeletionPolicy in SolrConfig.xml) parameter actually do? Increasing this value seems to have helped a little, but I'm wary of cranking it without having a better understanding of what it does.
-
Re: Solr Timeouts Shalin Shekhar Mangar 2009-10-07, 09:28
On Wed, Oct 7, 2009 at 2:19 AM, Giovanni Fernandez-Kincade <[EMAIL PROTECTED]> wrote:
> What does the "maxCommitsToKeep"(from SolrDeletionPolicy in SolrConfig.xml)
> parameter actually do? Increasing this value seems to have helped a little,
> but I'm wary of cranking it without having a better understanding of what it
> does.

maxCommitsToKeep is the number of commit points (point-in-time snapshots of the index) to keep from getting deleted. But deletion of commit points only happens on startup or when someone calls commit/optimize.

--
Regards,
Shalin Shekhar Mangar.
|
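Shalin's description can be turned into a small simulation: commit points accumulate, and pruning down to maxCommitsToKeep happens only at the moment a commit is made. The Python sketch below is a toy model of that description, not Solr's SolrDeletionPolicy class; ToyDeletionPolicy, run_commits and the integer "generations" are hypothetical names used for illustration.

```python
class ToyDeletionPolicy:
    """Keeps at most max_commits_to_keep of the most recent commit points."""

    def __init__(self, max_commits_to_keep):
        if max_commits_to_keep < 1:
            raise ValueError("at least one commit point must be kept")
        self.max_commits_to_keep = max_commits_to_keep

    def on_commit(self, commit_points):
        # Called only when a commit happens; nothing is pruned in between.
        return commit_points[-self.max_commits_to_keep:]


def run_commits(policy, n_commits):
    """Make n_commits commits in a row and return the surviving generations."""
    kept = []
    for generation in range(1, n_commits + 1):
        kept.append(generation)
        kept = policy.on_commit(kept)
    return kept


print(run_commits(ToyDeletionPolicy(1), 5))
print(run_commits(ToyDeletionPolicy(100), 5))
```

Because pruning in this model is tied to commits, raising the limit changes nothing while documents are being added with commit=false, which suggests the setting may not be the lever for the flush stalls discussed above.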