Lucene >> mail # user >> Lucene 4 -  POS and Syntactic Tagging


RE: Lucene 4 -  POS and Syntactic Tagging
> Mark McGuire wrote:
> I'm working on a project where I need to tag both the part of speech and other syntactic information on tokens

Picking up this thread from a few weeks back.

I've never done this myself, but I think your need to store extra information that is not itself a token at a particular position in the index is exactly what payloads are for:
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/

The above article even mentions:
"A payload can be used to store weights for specific terms or things like part of speech tags or other semantic information. "

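To make the payload idea concrete, here is a minimal sketch in plain Java; it is not Lucene code. The class and method names (PosPayloadSketch, TaggedToken, analyze, positionsOf) are hypothetical, and the word|TAG input format is only an assumption, loosely modeled on the delimiter convention of Lucene's DelimitedPayloadTokenFilter. It shows a part-of-speech tag travelling as opaque bytes alongside a term at a given position, which is roughly the role a payload plays.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: these types are hypothetical and are not Lucene classes. */
final class PosPayloadSketch {

    /** One indexed token: the term text, its position, and an opaque payload. */
    static final class TaggedToken {
        final String term;
        final int position;
        final byte[] payload; // the POS tag, encoded as UTF-8 bytes

        TaggedToken(String term, int position, byte[] payload) {
            this.term = term;
            this.position = position;
            this.payload = payload;
        }

        /** Decodes the payload bytes back into the tag string. */
        String tag() {
            return new String(payload, StandardCharsets.UTF_8);
        }
    }

    /** Splits "word|TAG word|TAG ..." into tokens whose payload carries the tag. */
    static List<TaggedToken> analyze(String tagged) {
        List<TaggedToken> tokens = new ArrayList<>();
        int position = 0;
        for (String piece : tagged.trim().split("\\s+")) {
            if (piece.isEmpty()) {
                continue; // empty input yields a single empty piece
            }
            int bar = piece.lastIndexOf('|');
            String term = bar < 0 ? piece : piece.substring(0, bar);
            String tag = bar < 0 ? "" : piece.substring(bar + 1);
            tokens.add(new TaggedToken(term, position++, tag.getBytes(StandardCharsets.UTF_8)));
        }
        return tokens;
    }

    /** Returns the positions at which the term occurs with the given tag in its payload. */
    static List<Integer> positionsOf(List<TaggedToken> tokens, String term, String tag) {
        List<Integer> hits = new ArrayList<>();
        for (TaggedToken t : tokens) {
            if (t.term.equals(term) && t.tag().equals(tag)) {
                hits.add(t.position);
            }
        }
        return hits;
    }
}
```

With the input "They|PRP duck|VBP when|WRB the|DT duck|NN flies|VBZ", asking positionsOf for "duck" with tag "NN" keeps only position 4, while tag "VBP" keeps only position 1. The term is found first and the payload is checked afterwards, which is why the payload acts as a filter on matches rather than as something searched directly.
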
I don't think "searching on attributes" is the right way to describe it. Attributes are features of some of Lucene's objects, a way to ask a complex object for something. Some attributes return information that comes from the index, but attributes themselves are not stored in the index; tokens and payloads are. I'm sure my understanding is incomplete as well: using a token type other than "WORD" seems like a promising approach, but I can't see how to get a query to search on a particular type of token.
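
The distinction between analysis-time attributes and what ends up in the index can be sketched in plain Java. This is a toy model, not Lucene code and not something proposed in this thread: the names TypeAsTermSketch, AnalyzedToken, index and positionsWithType, and the "_pos_" prefix, are all hypothetical. It assumes the token type only becomes searchable if it is written into the index in some form, here as an extra term at the same position as the word.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model, not Lucene: shows a token type surviving only if it is indexed as a term. */
final class TypeAsTermSketch {

    /** Analysis-time view of a token: its text plus a type attribute such as "NN". */
    static final class AnalyzedToken {
        final String text;
        final String type;

        AnalyzedToken(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    /**
     * Builds a term -> positions map. The type reaches the "index" only because
     * it is written as an extra term ("_pos_" + type) at the word's position.
     */
    static Map<String, List<Integer>> index(List<AnalyzedToken> stream) {
        Map<String, List<Integer>> postings = new LinkedHashMap<>();
        int position = 0;
        for (AnalyzedToken t : stream) {
            postings.computeIfAbsent(t.text, k -> new ArrayList<>()).add(position);
            postings.computeIfAbsent("_pos_" + t.type, k -> new ArrayList<>()).add(position);
            position++;
        }
        return postings;
    }

    /** Returns the positions where the word and the type term occur together. */
    static List<Integer> positionsWithType(Map<String, List<Integer>> postings,
                                           String text, String type) {
        List<Integer> hits =
                new ArrayList<>(postings.getOrDefault(text, Collections.<Integer>emptyList()));
        hits.retainAll(postings.getOrDefault("_pos_" + type, Collections.<Integer>emptyList()));
        return hits;
    }
}
```

In this model a query for a word with a given type is just an intersection of two position lists. Whether an equivalent arrangement is practical in Lucene 4 is left open here, as it is in the message above.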

-Paul