-
merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-02, 05:48
First time user. Not interested in a flamewar, just making a suggestion.

I just got Solr working with my own schema and it was only a little more mysterious than I expected, having previously dealt with Nutch. Solr is exactly what I wanted in terms of (theoretical) ease of configurability.

However, my first try at defining a schema.xml file was tough because my only feedback for a long time was "NullPointerException" from SolrCore when I was trying to add content. I deduce that when SolrCore tried invoking methods on the schema instance, the schema instance was null.

From a design point of view, this could easily be modeled with the NullObject pattern, and an InvalidSchema object could be substituted as a default schema object. Method invocations on that schema would appropriately log why the proper schema failed to validate and instantiate.

I'd think that since the capacity to define a schema via XML is so attractively powerful, providing feedback on bad schemata would really speed deployment and adoption. It turned out that I had misspelled the unique key field reference. Silly, but it can't be uncommon for a first time user.

If there is already a method of pre-validating the schema, noting it on the wiki would be really helpful.

So far, that has been my only hangup. This has been so much easier and more appropriate than Nutch that I've been gung-ho all week setting this up. Thank you!

Jed
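The NullObject idea in this message can be illustrated with a small sketch. It is written in Python for brevity (Solr itself is Java), and every name in it (`Schema`, `InvalidSchema`, `load_schema`, `SchemaLoadError`) is hypothetical and not part of Solr. It only shows the shape of the pattern: a failed load returns an object that remembers why it failed, rather than a null.

```python
class SchemaLoadError(Exception):
    """Raised when an operation is attempted against a schema that failed to load."""


class Schema:
    """Hypothetical stand-in for a successfully parsed schema."""

    def __init__(self, fields, unique_key):
        self.fields = set(fields)
        self.unique_key = unique_key

    def unique_key_field(self):
        return self.unique_key


class InvalidSchema:
    """Null Object used in place of None when schema loading failed."""

    def __init__(self, reason):
        self.reason = reason

    def unique_key_field(self):
        # Report the original cause instead of an opaque null dereference.
        raise SchemaLoadError("schema is unusable: " + self.reason)


def load_schema(fields, unique_key):
    """Return a Schema, or an InvalidSchema that remembers why loading failed."""
    if unique_key not in fields:
        return InvalidSchema(
            "uniqueKey %r does not name a defined field" % unique_key)
    return Schema(fields, unique_key)
```

With this shape, a misspelled unique key surfaces as a message naming the bad value at the first use of the schema, which is the kind of feedback the message asks for.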
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Bertrand Delacretaz 2007-03-02, 07:32
On 3/2/07, Jed Reynolds <[EMAIL PROTECTED]> wrote:
> ...my first try at defining a schema.xml file was tough because my
> only feedback for a long time was "NullPointerException" from SolrCore
> when I was trying to add content...

Can you give us enough information to reproduce the problem? What was wrong in your schema, exactly?

Please also indicate which version of Solr you used.

-Bertrand
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-02, 15:28
Hi Jed,

NullPointerException when adding a document w/o the uniqueKey field is a known bug, and should be fixed shortly.

If the actual schema was null, then that was probably some problem parsing the schema. If that's the case, hopefully you saw an exception in the logs on startup?

Anyway, I agree that some config errors could be handled in a more user-friendly manner, and it would be nice if config failures could make it to the front-page admin screen or something.

-Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Otis Gospodnetic 2007-03-02, 15:41
Hi,

Ah, a convenient thread - I was about to mention that I was able to mistakenly define multiple <tokenizer .../>'s inside a fieldType's analyzer without getting any kind of error. The correct thing to do is to define one tokenizer followed by zero or more (token)filters.

Otis
Simpy -- http://www.simpy.com/ - Tag - Search - Share
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-03, 02:26
Yonik Seeley wrote:
> If the actual schema was null, then that was probably some problem
> parsing the schema.
> If that's the case, hopefully you saw an exception in the logs on
> startup?

Using apache-solr-1.1.0-incubating. Actually not at first, but now I do.

I've gone back and re-created the (or a similar) error, and the problem turned out to be the way I was watching my logs. When I first started, I was just doing a tail -F on catalina.out, but the exception was being written to the logfile localhost.2007-03-01.log. Ah, tomcat my best old buddy old pal. I've learned to just do a "tail -F *". I've obviously grown desensitized by other java projects throwing exceptions to logs, and by so much logging duplication between catalina.out and the tomcat contextual logs.

I almost didn't notice the exception fly by because there's soooo much log output, and I can see why I might not have noticed. Yay for scrollback! (Hrm, I might not have wanted to watch logging for 4 instances of solr all at once. Might explain why so much logging.)

Another helpful modification would be returning 500 error codes in the header. This would help a script detect errors without needing to grep or DOM-process the result element. The output of my php script to load documents was showing me the snippet below. Possibly making the error code configurable might help (I can see cases where forcing a 200 response is useful).

Array
(
    [errno] => 0
    [errstr] =>
    [response] => HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/xml;charset=UTF-8
Content-Length: 1329
Date: Sat, 03 Mar 2007 02:04:12 GMT
Connection: close

<result status="1">java.lang.NullPointerException
  at org.apache.solr.core.SolrCore.update(SolrCore.java:763)
  at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:53)
</result>
)

> Anyway, I agree that some config errors could be handled in a more
> user-friendly manner, and it would be nice if config failures could
> make it to the front-page admin screen or something.

That would be groovy!

I was able to see instances where a field was not defined. Now that I'm looking at all the log files, I'm seeing the error I should have seen earlier.

Thanks guys!

Jed

PS Last night I was able to index about 180,000 documents in about 2.5 hours. The resulting index is a bit over 800M. Compared to my self-crawling with Nutch, this is 1/4 the time to index and 1/30th the disk space used by indices. I am really impressed. I threw four concurrent scripts making 50,000 distinct (select distinct tag from taglist;) requests at this solr instance; my solr server was serving 50 requests per second per script and the solr server load average was about 3.2. That's 200 requests per second against a 4 core box. The tomcat instance was taking 606M ram, resident.
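Until the HTTP status reflects failures, a loading script has to inspect the body of the 200 response shown in this message. The following is a minimal sketch in Python (the original script was PHP). The function name `update_failed` is hypothetical, and it assumes the `<result status="...">` format shown above, with status "0" meaning success.

```python
import xml.etree.ElementTree as ET


def update_failed(body):
    """Return (failed, message) for the XML body of an update response.

    Assumes the <result status="..."> format shown in the message above and
    that status "0" means success; any other status is treated as an error.
    """
    root = ET.fromstring(body)
    status = root.get("status")
    message = (root.text or "").strip()
    return status != "0", message
```

A loader could call this on every response body and stop, or log the message, as soon as `failed` is true, instead of trusting the `200 OK` line.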
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-03, 02:42
> I almost didn't notice the exception fly by because there's soooo much
> log output, and I can see why I might not have noticed. Yay for
> scrollback! (Hrm, I might not have wanted to watch logging for 4
> instances of solr all at once. Might explain why so much logging.)

This has bitten me more than once too!

The rationale with the solrconfig stuff is that a broken config should behave as best it can. This is great if you are running a real site with people actively using it - it is a pain in the ass if you are getting started and don't notice errors.

I'd like to see a "strict" configuration parameter. If something fails on startup, nothing would work until it was fixed. If there is any interest, I can put this together.

The other one that can confuse you is if you add documents with fields that are undefined - rather than getting an error, solr adds the fields that are defined (it may print out an exception somewhere, but I've never noticed it).

> Another helpful modification would be returning 500 errors codes in the
> header. ...

The 'new' RequestHandler framework (apache-solr-1.2-dev) returns a proper response code (400, 500, etc). It is not (yet) the default handler for /select, but I hope it gets to be soon.

best
ryan
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-03, 04:11
On 3/2/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> The rationale with the solrconfig stuff is that a broken config should
> behave as best it can.

I don't think that's what I was actually going for in this instance (the schema). I was focused on getting correct stuff to work correctly, and worry about incorrect stuff later :-)

> The other one that can confuse you is if you add documents with fields
> that are undefined - rather than getting an error, solr adds the
> fields that are defined (it may print out an exception somewhere, but
> i've never noticed it)

Also unintended.

-Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-03, 04:33
On 3/2/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 3/2/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> > The rationale with the solrconfig stuff is that a broken config should
> > behave as best it can.
>
> I don't think that's what I was actually going for in this instance
> (the schema).
> I was focused on getting correct stuff to work correctly, and worry
> about incorrect stuff later :-)

Sorry, I was referring to solrconfig.xml... if something goes wrong loading handlers, it continues but prints out some log messages. I (think) there are code comments somewhere about how it should be ok to have an error and still keep a working system... I'd like to be able to configure a "strict" mode so it does not continue.

> > The other one that can confuse you is if you add documents with fields
> > that are undefined - rather than getting an error, solr adds the
> > fields that are defined (it may print out an exception somewhere, but
> > i've never noticed it)
>
> Also unintended.

How do you all feel about returning an error when you add a document with unknown fields?

I spent a long time tracking down an error where a document set used an uppercase field name against a schema configured with the lowercase field name.
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-03, 06:01
Ryan McKinley wrote:
>> I almost didn't notice the exception fly by because there's soooo much
>> log output, and I can see why I might not have noticed. Yay for
>> scrollback! (Hrm, I might not have wanted to watch logging for 4
>> instances of solr all at once. Might explain why so much logging.)
>
> This has bitten me more than once too!
>
> The rationale with the solrconfig stuff is that a broken config should
> behave as best it can. This is great if you are running a real site
> with people actively using it - it is a pain in the ass if you are
> getting started and don't notice errors.
>
> I'd like to see a "strict" configuration parameter. If something
> fails on startup, nothing would work until it was fixed. If there is
> any interest, I can put this together.

That would be helpful.

> The other one that can confuse you is if you add documents with fields
> that are undefined - rather than getting an error, solr adds the
> fields that are defined (it may print out an exception somewhere, but
> i've never noticed it)

I've read about this capability but I haven't experienced its effects yet.

>> Another helpful modification would be returning 500 errors codes in the
>> header. ...
>
> The 'new' RequestHandler framework (apache-solr-1.2-dev) returns a
> proper response code (400,500,etc). It is not (yet) the default
> handler for /select, but I hope it gets to be soon.

Bitchen! Looking forward to that. However, I've got a lot more learning and testing to do. Don't rush anything on account of me.

Jed
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-03, 06:13
Ryan McKinley wrote:
> sorry, I was referring to solrconfig.xml... if something goes wrong
> loading handlers it continues but prints out some log messages. I
> (think) there are code comments somewhere about how it should be ok to
> have an error and still keep a working system... I'd like to be able
> to configure a "strict" mode so it does not continue.
>
>> > The other one that can confuse you is if you add documents with fields
>> > that are undefined - rather than getting an error, solr adds the
>> > fields that are defined (it may print out an exception somewhere, but
>> > i've never noticed it)
>>
>> Also unintended.
>
> How do you all feel about returning an error when you add a document
> with unknown fields?

That sounds like a good option to specify in solrconfig.xml.

> I spent a long time tracking down an error with a document set with an
> uppercase field name to something configured with a lowercase field.

Isn't this the kind of error that XML validation is supposed to address?

I completely understand the appeal of loosely validating XML documents, of course. However, since adding a document to an index is not a lightweight operation, adding validation doesn't seem unreasonable. If writing a schema is required for validation, I'm willing to endure that step.

I can certainly see many instances when components in my system written by other staff won't fit into my Solr schema. A way to enforce a schema, strictly, in a dev environment, is entirely appropriate for me.

Jed
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Bertrand Delacretaz 2007-03-03, 08:50
On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> ...The rationale with the solrconfig stuff is that a broken config should
> behave as best it can. This is great if you are running a real site
> with people actively using it - it is a pain in the ass if you are
> getting started and don't notice errors....

I think it's a PITA in any case. I like my systems to fail loudly when something's wrong in the configs (with details about what's happening, of course).

-Bertrand
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-03, 09:36
Bertrand Delacretaz wrote:
> On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
>> ...The rationale with the solrconfig stuff is that a broken config
>> should behave as best it can. This is great if you are running a real
>> site with people actively using it - it is a pain in the ass if you are
>> getting started and don't notice errors....
>
> I think it's a PITA in any case, I like my systems to fail loudly when
> something's wrong in the configs (with details about what's happening,
> of course).
>
> -Bertrand

I think it's interesting seeing the difference. The system at CNET obviously needed to fail gracefully before it needed to fail fast. I have the luxury of a dev environment, and fail-fast is exactly the kinda thing I want so I know about as many limitations and problems as soon as possible.

Having this behavior toggled would be ideal. Version the solrconfig.xml between a fail-graceful setting for your production branch and a fail-fast setting for your dev branch.

Jed
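The fail-graceful versus fail-fast toggle discussed here can be sketched as a single flag on a loader. This is an illustration in Python under stated assumptions, not Solr code: `load_handlers`, `ConfigError`, the `strict` flag, and the dictionary-based config are all hypothetical.

```python
import logging

log = logging.getLogger("config_loader")


class ConfigError(Exception):
    """Raised in strict mode when any part of the configuration is broken."""


def load_handlers(handler_specs, factories, strict=False):
    """Instantiate each configured handler.

    handler_specs maps a handler name to a class key; factories maps a class
    key to a zero-argument callable. With strict=False a bad entry is logged
    and skipped (fail gracefully); with strict=True it aborts (fail fast).
    """
    loaded = {}
    for name, kind in handler_specs.items():
        factory = factories.get(kind)
        if factory is None:
            message = "handler %r refers to unknown class %r" % (name, kind)
            if strict:
                raise ConfigError(message)
            log.error(message)
            continue
        loaded[name] = factory()
    return loaded
```

A production branch would leave `strict` off and keep serving with whatever loaded; a dev branch would turn it on so the first broken entry stops startup with a message naming it.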
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Walter Underwood 2007-03-03, 18:17
I was bitten by this, too. It made getting started a lot harder.

I think I had something outside of an <lst> instead of inside. More recently, I got a query-time exception from a mis-formatted <mm> field.

Right now, Solr accesses the DOM as needed (at runtime) to fetch information. There isn't much up-front checking beyond the XML parser.

wunder

On 3/3/07 12:50 AM, "Bertrand Delacretaz" <[EMAIL PROTECTED]> wrote:
> On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
>> ...The rationale with the solrconfig stuff is that a broken config should
>> behave as best it can. This is great if you are running a real site
>> with people actively using it - it is a pain in the ass if you are
>> getting started and don't notice errors....
>
> I think it's a PITA in any case, I like my systems to fail loudly when
> something's wrong in the configs (with details about what's happening,
> of course).
>
> -Bertrand
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-03, 18:26
On 3/2/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> How do you all feel about returning an error when you add a document
> with unknown fields?

+1

dynamicField definitions can be used if desired (including "*" to match every undefined field).

-Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-03, 20:10
Yonik Seeley wrote:
> On 3/2/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>> How do you all feel about returning an error when you add a document
>> with unknown fields?
>
> +1
>
> dynamicField definitions can be used if desired (including "*" to
> match every undefined field).

If dynamicField definitions are removed from the schema.xml file (and your fields are not referencing them), does this have the same effect of disabling unknown-field generation?

Jed
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-03, 20:17
On 3/3/07, Jed Reynolds <[EMAIL PROTECTED]> wrote:
> If dynamicField definitions are removed from the schema.xml file (and
> your fields are not referencing them), does this have the same effect of
> disabling unknown-field generation?

Yes. You should get an error if you add a document with a field that doesn't match a defined field or a dynamic field.

There still may be a bug that Ryan mentioned about unknown fields simply being ignored, but that should be fixed if true.

-Yonik
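The rule described here (a field is acceptable only if it matches a defined field or a dynamic field pattern) can be checked on the client before sending a document. A minimal Python sketch follows; `unknown_fields` is a hypothetical helper, and it uses `fnmatchcase` as a stand-in for Solr's dynamic field matching, which is an assumption: `fnmatchcase` also accepts `?` and `[...]`, which dynamicField patterns may not.

```python
from fnmatch import fnmatchcase


def unknown_fields(doc_fields, defined_fields, dynamic_patterns=()):
    """Return the document field names that match no field and no pattern.

    Matching is case-sensitive, so "Title" does not match a field "title".
    """
    unknown = []
    for name in doc_fields:
        if name in defined_fields:
            continue
        if any(fnmatchcase(name, pattern) for pattern in dynamic_patterns):
            continue
        unknown.append(name)
    return unknown
```

For example, `unknown_fields(["id", "Title"], {"id", "title"})` reports `Title`, which is the uppercase/lowercase mismatch mentioned earlier in the thread, while a `"*"` pattern accepts every name.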
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Chris Hostetter 2007-03-03, 21:39
: I almost didn't notice the exception fly by because there's soooo much
: log output, and I can see why I might not have noticed. Yay for
: scrollback! (Hrm, I might not have wanted to watch logging for 4
: instances of solr all at once. Might explain why so much logging.)

FYI: Solr logs a lot of stuff at the INFO and DEBUG levels, but "errors" will always be at the SEVERE level (unless they aren't actually SEVERE and are just exceptions encountered during trivial unimportant things, in which case they are logged at the WARNING level).

it's up to your servlet container how verbose to be (ie: what level to log); you should even be able to configure it to put WARNING and SEVERE messages in a separate log file.

: > Anyway, I agree that some config errors could be handled in a more
: > user-friendly manner, and it would be nice if config failures could
: > make it to the front-page admin screen or something.
:
: That would groovy!

i've been thinking a Servlet that didn't depend on any special Solr code (so it will work even if SolrCore isn't initialized) but registers a log handler and records the last N messages from Solr above a certain level would be handy to refer people to when they are having issues and aren't overly comfortable with log files.

-Hoss
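The "last N messages above a certain level" idea can be sketched as a log handler backed by a fixed-size buffer. The version below uses Python's standard `logging` module as an analogue of the Java logging that Solr uses; `RecentProblemsHandler` is a hypothetical name, and nothing here is the servlet proposed in the message.

```python
import collections
import logging


class RecentProblemsHandler(logging.Handler):
    """Keep the last `capacity` records at or above `level` in memory."""

    def __init__(self, capacity=50, level=logging.WARNING):
        super().__init__(level=level)
        # A deque with maxlen silently discards the oldest entry when full.
        self.records = collections.deque(maxlen=capacity)

    def emit(self, record):
        self.records.append(self.format(record))

    def recent(self):
        """Return the retained messages, oldest first."""
        return list(self.records)
```

An admin page could then render `recent()` without needing the rest of the application to have initialized, since the handler depends only on the logging framework.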
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Chris Hostetter 2007-03-03, 21:41
: > I spent a long time tracking down an error with a document set with an
: > uppercase field name to something configured with a lowercase field.

: Isn't this the kind of error that XML validation is supposed to address?

it could be ... except that:

1) we can't use standard DTD/XSD style validation because we don't know all the field names (not to mention dynamic fields)
2) XML is just one of the transports for sending updates ... we expect to support a lot more customizable formats in the near future.

-Hoss
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Chris Hostetter 2007-03-03, 21:43
: Right now, Solr accesses the DOM as needed (at runtime) to fetch
: information. There isn't much up-front checking beyond the XML
: parser.

bingo, and adding more upfront checking is hard for at least two reasons i can think of...

1) keeping a DTD up to date is a pain as new features are added
2) the way some options are passed to pluggable classes makes it impossible to validate (ie: tokenizers, caches, etc...)

-Hoss
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-03, 21:57
> There still may be a bug that Ryan mentioned about unknown fields
> simply being ignored, but that should be fixed if true.

I just looked into this - /trunk code is fine. I wasn't noticing the errors because the response code is always 200 with an error included in the xml. My code was only checking errors on non-200 response codes.

Is there enough general interest in having error response codes to change the standard web.xml config to let the SolrDispatchFilter handle /select?

  <init-param>
    <param-name>handle-select</param-name>
    <param-value>true</param-value>
  </init-param>
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-03, 22:26
On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Is there enough general interest in having error response codes to
> change the standard web.xml config to let the SolrDispatchFilter
> handle /select?

/select should already use HTTP error codes, right?

-Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-03, 23:15
On 3/3/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> > Is there enough general interest in having error response codes to
> > change the standard web.xml config to let the SolrDispatchFilter
> > handle /select?
>
> /select should already use HTTP error codes, right?

I see what's happening... I ran into this while writing the SolrDispatchFilter - had me stumped for a while.

The SolrServlet passes along the status code from a SolrException. This works great if you throw a SolrException with a 'valid' HTTP status code (400, etc). But MANY of the SolrExceptions use a status code '1'. Then it depends on the servlet container what is actually sent to the client. I know resin and jetty do different things. In the SolrDispatchFilter, I send an HTTP status code 500 if the SolrException status is less than 100.
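The rule stated at the end of this message (send 500 whenever the exception's status is below 100) amounts to a one-line mapping. The sketch below restates it in Python for illustration; `http_status_for` is a hypothetical name and this is not the SolrDispatchFilter source.

```python
def http_status_for(exception_code):
    """Map an internal exception code to the HTTP status sent to the client.

    Codes below 100 (such as the internal "1") are not valid HTTP statuses,
    so they are reported as a generic 500; anything else passes through.
    """
    if exception_code < 100:
        return 500
    return exception_code
```

This keeps the client-visible status independent of how a particular servlet container happens to treat an out-of-range code.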
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-03, 23:20
/update does send 200 even if there was an error.

After SOLR-173 we may want to change the default solrconfig to map /update so that everything has a consistent error format.

On 3/3/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> > Is there enough general interest in having error response codes to
> > change the standard web.xml config to let the SolrDispatchFilter
> > handle /select?
>
> /select should already use HTTP error codes, right?
>
> -Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-04, 00:09
For anyone not on the dev list, I just posted:
http://issues.apache.org/jira/browse/SOLR-179

So it is not lost, I also posted Otis' bug report:
http://issues.apache.org/jira/browse/SOLR-180
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Yonik Seeley 2007-03-04, 01:54
On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> But MANY of the SolrExceptions use a status
> code '1'.

Hmmm, I did an audit of the exceptions before we entered the incubator, and I thought I caught all the ones that generated anything out of the 400 and 500 range and could be thrown during a query (most of the "1" return codes had to do with schema or config parsing I think).

Any I missed should be fixed.

> Then it depends on the servlet container what is actually
> sent to the client. I know resin and jetty do different things. In
> the SolrDispatchFilter, I send a HTTP status code 500 if the
> SolrException status is less than 100.

That sounds fine. I didn't realize it could even vary by container.

-Yonik
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Ryan McKinley 2007-03-04, 02:08
On 3/3/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 3/3/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> > But MANY of the SolrExceptions use a status
> > code '1'.
>
> Hmmm, I did an audit of the exceptions before we entered the incubator, and
> I thought I caught all the ones that generated anything out of the 400
> and 500 range
> and could be thrown during a query (most of the "1" return codes had
> to do with schema or config parsing I think).
>
> Any I missed should be fixed.

I clearly overstated the case with "MANY" - and you are right, none are reachable from /select, so I must be off base about the /select response code stuff.

A quick search shows:

IndexSchema.java - 3, "1" status codes
DirectUpdateHandler.java - 2, "2" status codes
UpdateHandler.java - 2, "1" status codes

Everything else has 500, 400, 503.
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Jed Reynolds 2007-03-04, 04:09
Chris Hostetter wrote:
> : I almost didn't notice the exception fly by because there's soooo much
> : log output, and I can see why I might not have noticed. Yay for
>
> you should be able to configure it to put WARNING and SEVERE messages in a
> separate log file even.

Certainly! I learned to reconfigure tomcat's logging when I was doing my Nutch deployment. I'm very likely going to reconfigure my logging.

> i've been thinking a Servlet that didn't depend on any special Solr code
> (so it will work even if SolrCore isn't initialized) but registers a log
> handler and records the last N messages from Solr above a certain level
> would be handy to refer people to when they are having issues and aren't
> overly comfortable with log files.

Yeah, like a ring buffer for the last x number of warning|severe messages. I'm pretty used to looking at apache log files. Some errors pointing out configuration or operational failure (like running out of file descriptors) on the admin and status pages would be helpful, because I think that some people are probably going to check those pages first, possibly because they're deving and not necessarily watching logs.

I'd still use Solr even if it didn't have a logging servlet, tho ;-)

Jed
-
Re: merely a suggestion: schema.xml validator or better schema validation logging
Walter Underwood 2007-03-04, 16:52
On 3/3/07 1:43 PM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
> : Right now, Solr accesses the DOM as needed (at runtime) to fetch
> : information. There isn't much up-front checking beyond the XML
> : parser.
>
> bingo, and adding more upfront checking is hard for at least two reasons i
> can think of...
>
> 1) keeping a DTD up to date is a pain as new features are added
> 2) the way some options are passed to pluggable classes makes it impossible
> to validate (ie: tokenizers, caches, etc...)

I was thinking of translating the config file into internal config properties when it was read, and logging Solr-specific errors then. Things like "I can't load this class" are pretty easy at that point.

DTDs are inadequate and XML Schema is horrid, plus the error messages from either would not be particularly useful.

wunder
-
Re: merely a suggestion: schema.xml validator or better schema validation loggingChris Hostetter 2007-03-04, 23:01
:
: > : Right now, Solr accesses the DOM as needed (at runtime) to fetch
: > : information. There isn't much up-front checking beyond the XML
: > : parser.

: I was thinking of translating the config file into internal config
: properties when it was read, and logging Solr specific errors then.
: Things like "I can't load this class" are pretty easy at that point.

most of that work is done right now when the solrconfig.xml and schema.xml are loaded ... any missing classes should be logged as errors immediately.

I'm actually having a hard time thinking of what kinds of "just in time" DOM walking is delayed until request ... all of the field names are already known, the analyzers are built, the requesthandlers and responsewriters all exist and have been initialized ... what stuff isn't checked until a request comes in?

-Hoss
-
Re: merely a suggestion: schema.xml validator or better schema validation loggingWalter Underwood 2007-03-05, 00:44
On 3/4/07 3:01 PM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:
> I'm actually having a hard time thinking of what kinds of "just in time"
> DOM walking is delayed until request ... all of the field names are already
> known, the analyzers are built, the requesthandlers and responsewriters
> all exist and have been initialized ... what stuff isn't checked until a
> request comes in?

I had <mm> (minimum match) blow up at query time with a number format exception (this is from memory).

I had a silent error that I can't remember the details of, but it was something like putting the <str> for boost functions outside the <lst>. It didn't blow up, but it was a nonsense config that was accepted.

wunder
-
Re: merely a suggestion: schema.xml validator or better schema validation loggingChris Hostetter 2007-03-05, 21:08
: I had <mm> (minimum match) blow up at query time with a number
: format exception (this is from memory).

That's a RequestHandler specific request param that can also be specified as a default/invariant/appended init param ... i'm not sure that SolrCore could do much to validate that when parsing the solrconfig.xml. DisMaxRequestHandler could possibly throw an exception from its init method if it sees a param it recognizes but can't parse ... but that's a dangerous road to go down ... what if i want to subclass DisMaxRequestHandler and change the format of the "mm" param?

One thing you could do to ensure that your RequestHandler configuration makes sense without waiting for an error generated by a request, is to put in some explicit cache warming as part of the firstSearcher listener that hits each configured requestHandler with the minimal amount of input you expect ... then you'll see an error in your log immediately.

: I had a silent error that I can't remember the details of, but it
: was something like putting the <str> for boost functions outside
: the <lst>. It didn't blow up, but it was a nonsense config that
: was accepted.

again, there's nothing erroneous about having a <str> outside of a <lst> when specifying the init params of a RequestHandler as far as SolrCore is concerned ... it has no idea what types of init params the RequestHandler wants ... and the StandardRequestHandler could say that if it sees any top level init params which aren't "defaults", "invariants" or "appended" then it could complain ... but again: what if i subclass StandardRequestHandler and i want to add some custom init param to determine behavior in my subclass?

-Hoss
-
Re: merely a suggestion: schema.xml validator or better schema validation loggingRyan McKinley 2007-03-05, 21:29
> : I had a silent error that I can't remember the details of, but it
> : was something like putting the <str> for boost functions outside
> : the <lst>. It didn't blow up, but it was a nonsense config that
> : was accepted.
>
> again, there's nothing erroneous about having a <str> outside of a <lst>
> when specifying the init params of a RequestHandler as far as SolrCore is
> concerned ... it has no idea what types of init params the RequestHandler
> wants ... and the StandardRequestHandler could say that if it sees
> any top level init params which aren't "defaults", "invariants" or
> "appended" then it could complain ... but again: what if i subclass
> StandardRequestHandler and i want to add some custom init param to
> determine behavior in my subclass?

One trick i have used elsewhere is to output the loaded config and compare it to the initialization config - if they are different, there may be a problem.

We could pretty easily add a utility method like this to RequestHandlerBase and let RequestHandlers 'validate' their config in init() - It would not be an automatic thing that applies to every request handler, but adding some validation to DisMaxRequestHandler and StandardRequestHandler would take care of most problems (especially for beginners)

ryan
|