RE: Lucene tokenization
Steven A Rowe 2012-03-27, 18:11
Hi Nilesh,
Which version of Lucene are you using? StandardTokenizer behavior changed in v3.1.

Steve

-----Original Message-----
From: Nilesh Vijaywargiay [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 27, 2012 2:04 PM
To: [EMAIL PROTECTED]
Subject: Lucene tokenization

I have a string 01a_b-_-c-d which is tokenized as 01a_b c d,
and the string a_b-_-c_d which is tokenized as a b c d.

Why is there a difference when there is a digit at the beginning? I am using standard unstemmed tokenizer.