Thanks Karl and Furkan!!!

After pointing to a different Documentum instance, the performance issue was resolved.
So it looks like a Documentum issue.

Regards,
Tamizh Kumaran

From: Furkan KAMACI [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 06, 2017 3:22 PM
To: [EMAIL PROTECTED]
Cc: Sharnel Merdeck Pereira; Sundarapandian Arumaidurai Vethasigamani
Subject: Re: ManifoldCF slow documentum indexing performance

Hi Tamizh,

Set Xmx and Xms to the same value for better performance.
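
For example, starting from the -Xmx512m -Xms32m line quoted below, the change might look like this. It is only a sketch: JAVA_HEAP_OPTS is a hypothetical variable name, the real run script may pass the flags differently, and 512m is simply the existing maximum carried over rather than a tuned value.

```shell
# Hypothetical variable name; the actual run script may pass these flags another way.
# Initial heap (-Xms) and maximum heap (-Xmx) are set to the same size,
# so the JVM does not have to grow the heap while the crawl is running.
JAVA_HEAP_OPTS="-Xms512m -Xmx512m"
```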

Kind Regards,
Furkan KAMACI

On Thu, Jul 6, 2017 at 9:10 AM, Karl Wright <[EMAIL PROTECTED]> wrote:
Hi Tamizh,

The Documentum Server Process is a thin shell around DFC and its dependencies.  In order to get helpful suggestions, you will need to contact Documentum, I'm afraid.

Thanks,
Karl

On Thu, Jul 6, 2017 at 1:57 AM, Tamizh Kumaran Thamizharasan <[EMAIL PROTECTED]> wrote:
Thanks Karl!!

After monitoring the CPU usage of PostgreSQL, the agents process, and the Documentum server process, I found that the Documentum server process consumes the most CPU, and the agents process is the second-largest consumer.

In the Documentum server run script, the Java heap is set as below.
-Xmx512m -Xms32m

Is there any way to speed up indexing through heap configuration or by adding hardware?
If so, kindly share the details.

Regards,
Tamizh Kumaran

From: Karl Wright [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 05, 2017 6:19 PM
To: [EMAIL PROTECTED]
Cc: Sharnel Merdeck Pereira; Sundarapandian Arumaidurai Vethasigamani
Subject: Re: ManifoldCF slow documentum indexing performance

Hi Tamizh,

The likely culprit is Documentum itself.  In my experience it can be quite slow, depending on how it is configured.  But you can confirm that by monitoring the CPU usage of Postgresql, the agents process, and the documentum server process.  If none of these are CPU bound, then Documentum itself is the problem.
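
One way to take that CPU snapshot on Linux is sketched below. It assumes the standard ps and awk tools are available; the filter_procs helper and the three search patterns are illustrative guesses, so adjust them to however the processes appear on your machines.

```shell
# Keep the header line plus every line that matches the given pattern
# (case-insensitive). Reads `ps` output on stdin.
filter_procs() {
  awk -v pat="$1" 'NR == 1 || tolower($0) ~ tolower(pat)'
}

# Take one snapshot of PID, CPU %, and command line for all processes,
# then filter it once per process of interest.
if snap="$(ps -eo pid,pcpu,args)"; then
  # The patterns are assumptions about what each command line contains.
  printf '%s\n' "$snap" | filter_procs 'postgres'
  printf '%s\n' "$snap" | filter_procs 'manifoldcf'
  printf '%s\n' "$snap" | filter_procs 'documentum'
fi
```

On Linux, ps reports CPU usage averaged over each process's lifetime, so top gives a better live picture while a crawl is running.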

Thanks,
Karl

On Wed, Jul 5, 2017 at 8:24 AM, Tamizh Kumaran Thamizharasan <[EMAIL PROTECTED]> wrote:
Hi Team,

PostgreSQL 9.2, Solr 5.3.2, and ManifoldCF 2.7.1 are installed on the same Linux box. The Documentum server sits on a different Linux box. Indexing performance is slow (approx. 1000 documents per hour) with the Documentum crawler. The properties file in use is below for reference:

<configuration>
  <!-- Version string for UI -->
  <!-- Point to a specific (common) logging file -->
  <property name="org.apache.manifoldcf.logconfigfile" value="./logging.ini"/>
  <!-- Specify the connectors to be loaded -->
  <property name="org.apache.manifoldcf.connectorsconfigurationfile" value="../connectors.xml"/>
  <!-- Specify the path to the file resources directory -->
  <property name="org.apache.manifoldcf.fileresources" value="../file-resources"/>
  <property name="org.apache.manifoldcf.databaseimplementationclass" value="org.apache.manifoldcf.core.database.DBInterfacePostgreSQL"/>
  <property name="org.apache.manifoldcf.postgresql.hostname" value="localhost"/>
  <property name="org.apache.manifoldcf.postgresql.port" value="5432"/>
  <property name="org.apache.manifoldcf.dbsuperusername" value="postgres"/>
  <property name="org.apache.manifoldcf.dbsuperuserpassword" value=""/>
  <property name="org.apache.manifoldcf.database.name" value="manifoldcf"/>
  <property name="org.apache.manifoldcf.database.username" value="postgres"/>
  <property name="org.apache.manifoldcf.database.password" value=""/>
  <property name="org.apache.manifoldcf.database.maxhandles" value="100"/>
  <property name="org.apache.manifoldcf.crawler.threads" value="15"/>
  <property name="org.apache.manifoldcf.crawler.repository.store_history" value="false"/>

  <property name="org.apache.manifoldcf.zookeeper.connectstring" value="***********:8349"/>
  <property name="org.apache.manifoldcf.zookeeper.sessiontimeout" value="5000"/>
  <!-- Tell MCF where to find the connector jars -->
  <libdir path="../connector-lib"/>
  <libdir path="../connector-common-lib"/>
  <libdir path="../connector-lib-proprietary"/>
  <!-- Any additional local properties go here -->
</configuration>

Initially org.apache.manifoldcf.crawler.threads was set to 45, and we observed a long gap between each batch of 45 documents during processing.
Can you please point out any changes or recommendations that would speed up indexing?
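
To put the reported rate in perspective, here is some back-of-the-envelope arithmetic for roughly 1000 documents per hour with 15 worker threads. The per_doc_seconds helper is hypothetical, and the per-thread figure assumes every thread is busy the whole time, which may not hold in practice.

```shell
# Usage: per_doc_seconds DOCS_PER_HOUR THREADS
# Prints the aggregate seconds per document, followed by the seconds each
# thread would spend on one document if all threads were always busy.
per_doc_seconds() {
  awk -v rate="$1" -v threads="$2" 'BEGIN {
    aggregate = 3600 / rate
    printf "%.1f %.1f\n", aggregate, aggregate * threads
  }'
}

per_doc_seconds 1000 15
```

That works out to about 3.6 seconds per document overall, or about 54 seconds of thread time per document under the all-threads-busy assumption.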

Regards,
Tamizh Kumaran Thamizharasan