Lucene, mail # dev - Bug in org.apache.solr.common.util.XML


RE: Bug in org.apache.solr.common.util.XML
Fuad Efendi 2011-03-25, 13:55


Oh no!!! I am sorry.

 

  // many chars less than 0x20 are *not* valid XML, even when escaped!
  // for example, <foo>&#0;<foo> is invalid XML.
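
To illustrate the comment above, here is a minimal sketch that tests a code point against the Char production of the XML 1.0 specification. The class and method names are hypothetical and are not part of Solr; the ranges are the ones the specification lists.

```java
// Hypothetical helper, not taken from XML.java: checks a code point against
// the Char production of XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] |
// [#xE000-#xFFFD] | [#x10000-#x10FFFF]).
final class XmlCharCheck {
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // List the code points below 0x20 that XML 1.0 allows at all.
        for (int cp = 0; cp < 0x20; cp++) {
            if (isValidXmlChar(cp)) {
                System.out.println("allowed below 0x20: " + cp);
            }
        }
    }
}
```

Only 9, 10 and 13 (tab, line feed, carriage return) pass the check below 0x20. A character reference does not change this, because a reference must also resolve to a character that matches Char.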

 

 

(Sorry, the previous message went to the wrong mailing list.)

 

 

From: Fuad Efendi [mailto:[EMAIL PROTECTED]]
Sent: March-25-11 9:48 AM
To: [EMAIL PROTECTED]
Subject: Bug in org.apache.solr.common.util.XML

 

This is a previously unnoticed bug in org.apache.solr.common.util.XML

 

 

XML character references should be in the form &#123;

However, XML.java will generate #123; (without the leading ampersand) for some (very special) characters

 

We forgot the ampersand in these tables:

 

  private static final String[] chardata_escapes =
  {"#0;","#1;","#2;","#3;","#4;","#5;","#6;","#7;","#8;",null,null,"#11;","#12;",
   null,"#14;","#15;","#16;","#17;","#18;","#19;","#20;","#21;","#22;","#23;",
   "#24;","#25;","#26;","#27;","#28;","#29;","#30;","#31;",
   null,null,null,null,null,null,"&amp;",
   null,null,null,null,null,null,null,null,null,null,null,
   null,null,null,null,null,null,null,null,null,null,
   "&lt;",null,"&gt;"};

 

  private static final String[] attribute_escapes =
  {"#0;","#1;","#2;","#3;","#4;","#5;","#6;","#7;","#8;",null,null,"#11;","#12;",
   null,"#14;","#15;","#16;","#17;","#18;","#19;","#20;","#21;","#22;","#23;",
   "#24;","#25;","#26;","#27;","#28;","#29;","#30;","#31;",
   null,null,"&quot;",null,null,null,"&amp;",
   null,null,null,null,null,null,null,null,null,null,null,
   null,null,null,null,null,null,null,null,null,null,
   "&lt;"};
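
For readers who have not seen how such tables are used, here is a minimal sketch of a table-driven escaper. It assumes the table is indexed by character code and that a null entry means "write the character unchanged". The class name, the shortened table and the escape method are hypothetical; the real lookup code in XML.java is not quoted in this message.

```java
// Hypothetical sketch of a table-driven escaper; not the actual XML.java code.
final class EscapeTableSketch {
    // Shortened illustrative table: index = char code, null = no escaping.
    static final String[] ESCAPES = new String[63];
    static {
        ESCAPES[1] = "#1;";      // control char, written without '&' as in the tables above
        ESCAPES['&'] = "&amp;";
        ESCAPES['<'] = "&lt;";
        ESCAPES['>'] = "&gt;";
    }

    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            // Characters beyond the table, or with a null entry, pass through.
            String replacement = ch < ESCAPES.length ? ESCAPES[ch] : null;
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b\u0001&"));  // prints a&lt;b#1;&amp;
    }
}
```

With a table like this, a control character such as 0x01 comes out as the literal text #1; and not as the reference &#1;, so the output stays well-formed even though the original character is lost.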

 

 

 

 

Fuad Efendi

+1 416-993-2060

http://www.linkedin.com/in/liferay

 

Tokenizer Inc.

http://www.tokenizer.ca/

Data Mining, Vertical Search