On 6/8/2018 8:59 AM, Markus Jelsma wrote:
<snip>
> Caused by: org.eclipse.jetty.io.EofException

If you haven't tweaked the shard handler config to drastically reduce
the socket timeout, that is weird.  The only thing that comes to mind is
extreme GC pauses that cause the socket timeout to be exceeded.

> We operate three distinct types of Solr collections; they only share the same Zookeeper quorum. The other two collections do not seem to have this problem, but I don't restart those as often as I restart this collection, as I am STILL trying to REPRODUCE the dreaded memory leak I reported having on 7.3 about two weeks ago. Sorry, but it drives me nuts!

I've reviewed the list messages about the leak.  As you might imagine,
my immediate thought is that the custom plugins you're running are
probably the cause, because we are not getting OOME reports like I would
expect if there were a leak in Solr itself.  It would not be unheard of
for a custom plugin to experience no leaks with one Solr version but
leak when Solr is upgraded, requiring a change in the plugin to properly
close resources.  I do not know if that's what's happening.
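To illustrate the resource-closing concern generically: a plugin that opens
thread pools, HTTP clients, or file handles per core must release them when
the core is closed or reloaded, or every reload leaks the old core.  A
minimal stand-alone Java sketch of the pattern (the class name and pool are
hypothetical; in a real Solr plugin you would do this release work in a
close hook registered on the core):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical per-core resource.  If close() is never called, the pool's
// live threads keep the old core's objects reachable after a reload,
// which looks exactly like a memory leak that grows with each restart.
public class PluginResource implements AutoCloseable {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private boolean closed = false;

    public boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        pool.shutdown(); // release threads so the old core can be collected
        closed = true;
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() runs even on an exception
        try (PluginResource r = new PluginResource()) {
            System.out.println("in use, closed=" + r.isClosed());
        }
    }
}
```

A plugin written this way against one Solr version can still leak on a
newer version if the lifecycle changed and close() stops being reached --
which is why upgrading sometimes requires a plugin change.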

A leak could lead to GC pause problems, but it does seem really odd for
that to happen on a Solr node that's just been started.  You could try
bumping the heap size by 25 to 50 percent and see if the behavior
changes at all.  Honestly I don't expect it to change, and if it
doesn't, then I do not know what the next troubleshooting step should
be.  I could review your solr.log, though I can't be sure I would see
something you didn't.
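If you do try the heap increase, it is a one-line change in the Solr include
script (the path and sizes here are examples; adjust to your own install and
current heap):

```shell
# /etc/default/solr.in.sh (service install) or bin/solr.in.sh
# e.g. going from 8g to 10g-12g is roughly the 25-50 percent bump
SOLR_HEAP="12g"
```

Restart the node afterward and watch whether the EofException behavior
changes at all.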

Thanks,
Shawn