I have from time to time posted questions to this list (and received
very prompt and helpful responses). But it seems that many of you are
operating in a very different space from me. The problems (and
lessons-learned) which I encounter are often very different from those
that are reflected in exchanges with most other participants.
So I thought it would be useful to describe what I'm about, and see if
there are others out there with similar implementations (or interest in
moving in that direction). A sort of paying it forward.
We (the Lakota Peoples Law Office) are a small public interest, pro bono
law firm actively engaged in defending Native American North Dakota
Water Protector clients against (ridiculously excessive) criminal charges.
I have a small Solr (6.6.0) implementation - just one shard. I'm using
the cloud mode mainly to be able to implement access controls. The
server is an ordinary (2.5GHz) laptop running Ubuntu 16.04 with 8GB of
RAM and 4 CPUs. We presently have 8 collections with a total
of about 60,000 documents, mostly PDFs and emails. The indexed
documents are partly our own files and partly those we obtain through
legal discovery (which, surprisingly, is allowed in ND for criminal
cases). We only have a few users (our lawyers and a couple of
researchers mostly), so traffic is minimal. However, there's a premium
on precision (and recall) in searches.
The document repository is local to the server. I piggyback on the
embedded Jetty httpd in order to serve files (selected from the
hitlists). I just use a symbolic link to tie the repository to
Solr/Jetty's "webapp" subdirectory.
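
The symlink arrangement described above could be sketched roughly as
follows in Python. This is only an illustration under assumptions: the
function name, the link name "docs" and any paths passed in are
hypothetical, not the actual layout used here. Note also that some Jetty
configurations refuse to serve symlinked content, so additional Jetty
settings may be needed depending on the version.

```python
from pathlib import Path


def link_repository(repo_dir, webapp_dir, link_name="docs"):
    """Create webapp_dir/link_name as a symlink to repo_dir.

    Hypothetical helper: the link name "docs" and both directories are
    placeholders chosen for illustration. Safe to call repeatedly.
    """
    repo = Path(repo_dir).resolve()
    link = Path(webapp_dir) / link_name

    if link.is_symlink():
        if link.resolve() == repo:
            return link          # already points at the repository
        link.unlink()            # stale link: replace it
    elif link.exists():
        # Never clobber a real file or directory inside the webapp.
        raise FileExistsError(f"{link} exists and is not a symlink")

    link.symlink_to(repo, target_is_directory=True)
    return link
```

With a link named "docs", a file such as repo_dir/case1/memo.pdf would
then be addressed relative to the webapp as docs/case1/memo.pdf; the
exact URL depends on how the Solr webapp context is configured.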
We provide remote access via ssh with port forwarding. This gives very
snappy performance over fully encrypted links, and it appears quite stable.
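
For readers unfamiliar with ssh port forwarding, a minimal sketch of how
a client might open such a tunnel from Python follows. The user and host
names are placeholders, and port 8983 (Solr's default) is an assumption;
the post does not say which ports are actually forwarded. The -N and -L
options are standard OpenSSH flags.

```python
import subprocess


def build_tunnel_command(user, host, local_port=8983, remote_port=8983):
    """Return the ssh argument list for a local port forward.

    Connections to localhost:local_port on the client are carried over
    the encrypted ssh link to localhost:remote_port on the server.
    Port 8983 is Solr's default and is assumed here.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    # -N: do not run a remote command, just forward ports.
    # -L: forward a local port to the given host:port on the server side.
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


def open_tunnel(user, host, **ports):
    """Start the tunnel in the background and return the process handle."""
    return subprocess.Popen(build_tunnel_command(user, host, **ports))
```

A user would call something like open_tunnel("alice", "solr.example.org")
(both names hypothetical), point a browser at http://localhost:8983/solr/,
and call terminate() on the returned process when finished.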
I've had some bizarre behavior apparently caused by an interaction
between repository permissions, Solr permissions and the ssh link. It
seems "solved" for the moment, but time will tell for how long.
If there are any folks out there who have similar requirements, I'd be
more than happy to share the insights I've gained and problems I've
encountered and (I think) overcome. There are so many unique parts of
this small-scale, specialized application (many dimensions of which are
not strictly internal to Solr) that dumping them all on this (excellent)
Solr list probably wouldn't be appreciated. So, if you encounter problems
peculiar to this kind of setup, we can perhaps help handle them off-list
(although if they have more general Solr applicability, we should, of
course, post them to the list).