Solr, mail # user - Solr Performance Improvement and degradation Help


Re: Solr Performance Improvement and degradation Help
Erick Erickson 2012-02-23, 16:50
Ah, no, my mistake. The wildcards in the fl list won't matter re:
maxBooleanClauses; I didn't read carefully enough.

I assume that just returning a field or two doesn't slow down....

But one possible culprit, especially since you say this kicks in after
a while, is garbage collection. Here's an excellent intro:

http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/

Especially look at the "getting a view into garbage collection" section
and try specifying those options. The result should be that your Solr log
gets stats dumped every time GC kicks in. If GC is the problem, look at
the times in the logfile after your system slows down: you'll see a bunch
of GC dumps that collect very little unused memory. You can also connect
to the process using JConsole (should be in the Java distro) and watch
the "memory" tab, especially after your server has slowed down. You can
also connect JConsole remotely...
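
For reference, the numbers JConsole shows on its "memory" tab come from the JVM's standard management beans, so they can also be read from code. What follows is only a minimal sketch, not anything from the linked post: the class name GcSnapshot and its helper methods are made up for illustration, and it reads the beans of the JVM it runs in. To watch the Solr JVM itself you would read the same beans over a remote JMX connection, which is what JConsole does.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Solr): reports heap use and GC activity
// for the JVM this code runs in, using the standard java.lang.management beans.
class GcSnapshot {

    // Share of JVM uptime spent in GC, as a percentage; 0 if uptime is not positive.
    static double gcOverheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    // Sum of accumulated collection time across all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // One-shot text summary: heap usage, per-collector counts and times, overall overhead.
    static String summarize() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("heap used=%d MB committed=%d MB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%s: count=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        sb.append(String.format("GC overhead=%.2f%%%n",
                gcOverheadPercent(totalGcMillis(), uptime)));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

If GC really is the culprit, you'd expect the collection count and time to climb quickly once the slowdown starts while "heap used" stays close to the maximum. On the HotSpot JVMs of this era, GC logging is typically switched on with flags such as -verbose:gc, -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps; those may or may not be exactly the options the linked post recommends.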

This is just an experiment, but any time I see "and it slows down after
### minutes", GC is the first thing I think of.

Best
Erick

On Thu, Feb 23, 2012 at 10:16 AM, naptowndev <[EMAIL PROTECTED]> wrote:
> Erick -
>
> Agreed, it is puzzling.
>
> What I've found is that it doesn't matter if I pass in wildcards for the
> field list or not...but that the overall response time from the newer builds
> of Solr that we've tested (e.g. 4.0.0.2012.02.16) is slower than the older
> (4.0.0.2010.12.10.08.54.56) build.
>
> If I run the exact same query against those two cores, bringing back a
> payload of just over 13MB (xml), the older build brings it back in about 1.6
> seconds and the newer build brings it back in about 8.4 seconds.
>
> Implementing the field list wildcard allows us to reduce the payload in the
> newer build (not an option in the older build).  The payload is reduced to
> 1.8MB but takes over 3.5 seconds to come back as compared to the full
> payload (13MB) in the older build at about 1.6 seconds.
>
> With everything else remaining the same (machine/processors/memory/network
> and the code base calling Solr) it seems to point to something in the newer
> builds that's causing the slowdown, but I'm not intimate enough with Solr to
> be able to figure that out.
>
> We are using the &debugQuery=on in our test to see timings and they aren't
> showing any anomalies, so that makes it even more confusing.
>
> From a wildcard perspective, it's on the fl parameter... here's a 'snippet'
> of part of our fl parameter for the query....
>
> &fl=id, CategoryGroupTypeID, MedicalSpecialtyDescription, TermsMisspelled,
> DictionarySource, timestamp, Category_*_MemberReports,
> Category_*_MemberReportRange, Category_*_NonMemberReports, Category_*_Grade,
> Category_*_GradeDisplay, Category_*_GradeTier, Category_*_ReportLocations,
> Category_*_ReportLocationCoordinates, Category_*_coordinate, score
>
> Please note that that fl param is greatly reduced from our full query, we
> have over 100 static fields and a slew of dynamic fields - but that should
> give you an idea of how we are using wildcards.
>
> I'm not sure about the maxBooleanClauses...not being all that familiar with
> Solr, does that apply to wildcards used in the fl list?
>
> Thanks!
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3769995.html
> Sent from the Solr - User mailing list archive at Nabble.com.
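
To put the timings quoted above side by side, it can help to turn them into an effective throughput (payload size divided by response time). This is just a sketch using the figures from the quoted message; the class and method names are made up for illustration, and the result says nothing about where the time is actually spent.

```java
// Hypothetical helper: compares the response timings reported in the thread
// as effective throughput in MB per second.
class ThroughputComparison {

    // Payload size in MB divided by response time in seconds.
    static double mbPerSecond(double payloadMb, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return payloadMb / seconds;
    }

    public static void main(String[] args) {
        // Figures as reported in the quoted message.
        double oldFull = mbPerSecond(13.0, 1.6);    // older build, full payload
        double newFull = mbPerSecond(13.0, 8.4);    // newer build, full payload
        double newReduced = mbPerSecond(1.8, 3.5);  // newer build, wildcard fl list

        System.out.printf("older build, 13 MB:  %.2f MB/s%n", oldFull);
        System.out.printf("newer build, 13 MB:  %.2f MB/s%n", newFull);
        System.out.printf("newer build, 1.8 MB: %.2f MB/s%n", newReduced);
    }
}
```

The newer build is slower per MB on both payloads, and the reduced payload has the lowest throughput of the three, which suggests that response size alone doesn't explain the slowdown.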