Thank you for this information. Since this is very much related to Any23 and microdata parsing, I’m going to ask what I believe is a related question but keep this same thread so it will be organized in one place:

I noticed that a lot of job boards, such as dice.com, monster.com, etc., use http://schema.org/JobPosting information; however, many seem to use <script type="application/ld+json">…</script> rather than RDF.
In summer 2017, Google announced structured data guidance for Jobs:
https://developers.google.com/search/docs/data-types/job-posting
and a testing tool to validate your HTML: https://search.google.com/structured-data/testing-tool
I verified a few sample listings from the above-mentioned job boards with Google's testing tool, and they validate OK.
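
For what it's worth, here is a minimal sketch of how one might double-check outside of Nutch and Any23 that a fetched page really carries a JobPosting inside a JSON-LD script block. It uses only the Python standard library and has nothing to do with Any23's own extractor code. The sample HTML, the class name JsonLdCollector and the function extract_job_postings are made up for illustration.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample page; real job boards embed far larger documents.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "JobPosting",
 "title": "Software Engineer"}
</script>
</head><body>Job listing</body></html>
"""


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_job_postings(html):
    """Return the top-level JSON-LD objects whose @type is JobPosting."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    postings = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "JobPosting":
                postings.append(item)
    return postings


if __name__ == "__main__":
    for posting in extract_job_postings(SAMPLE_HTML):
        print(posting.get("title"))
```

This only looks at top-level objects (it does not follow @graph or nested types), so it is a sanity check on the page content, not a substitute for a real JSON-LD processor.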

So after looking at http://any23.apache.org/getting-started.html for the supported extractors, I see that Any23 says it supports JSON-LD input, so I added this to nutch-site.xml to override the same property in nutch-default.xml:

<property>
    <name>any23.extractors</name>
    <value>html-microdata,html-embedded-jsonld,rdf-jsonld</value>
    <description>Comma-separated list of Any23 extractors (a list of extractors is available here: http://any23.apache.org/getting-started.html)</description>
</property>

I expected to see additional information from nutch parsechecker after adding the JSON-LD extractors; however, I see NO changes to the parsed Any23-Triples microdata.

What might I be doing wrong?