If the index size on disk is about 750 GiB, then 2.3 GB of heap space for the FST seems fine. It's just a bit strange that you only have 10 million documents!

Are those documents huge, with lots of indexed text content, possibly OCR/scanned material? If so, the term dictionary may get huge because of the many misspelled terms.

Please also send us the output of "ls -lh" for your index directory so we can make a better guess.
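
If the listing is long, a per-extension summary may be easier to read. The following is only a rough sketch, not a Lucene tool: size_by_extension and human are hypothetical helper names, and it assumes the default block-tree postings format, where the term dictionary lives in the .tim files and the terms index (the FST) in the .tip files.

```python
import os
import sys
from collections import defaultdict


def size_by_extension(index_dir):
    """Return total bytes per file extension for files directly in index_dir."""
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        if entry.is_file():
            # Group by the final suffix (e.g. "_0.tim" -> ".tim").
            # Files without a suffix (e.g. "segments_1") are keyed by full name.
            ext = os.path.splitext(entry.name)[1] or entry.name
            totals[ext] += entry.stat().st_size
    return dict(totals)


def human(n):
    """Format a byte count using binary units, similar to 'ls -lh'."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # Usage (path is whatever your index directory is): python sizes.py <index_dir>
    totals = size_by_extension(sys.argv[1])
    # Largest file types first, so an oversized .tim/.tip stands out.
    for ext, size in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{ext:>12}  {human(size)}")
```

If the .tim and .tip totals are large relative to the rest of the index, that would be consistent with a very large number of unique terms.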


Uwe Schindler
Achterdiek 19, D-28357 Bremen