RE: Solr MultiValue Fields and adding values
Dyer, James 2011-10-19, 14:26
While Solr/Lucene can't support true document updates, there are 2 ways you might be able to work around this in your situation.
1. If you store all of the fields, you can write something that will read back everything already indexed for the document, append whatever data you want, then write it back. This will increase index size and possibly make indexing too slow. On the other hand, it might be more efficient than requiring the database to return everything in order.
2. You could store your data as multiple documents per id (pick something else as your unique id). Then use the grouping functionality to roll up on your id whenever you query. This will mean changes to your application, probably a bigger index, and likely somewhat slower querying. But the performance losses might be slight, and this seems to me like it might be a good solution in your case. Perhaps it would mean you wouldn't have to entirely re-index each month or so. See http://wiki.apache.org/solr/FieldCollapsing for more information.
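The two workarounds above can be sketched roughly as follows. This is only a minimal Python illustration, not Solr client code: a plain dict stands in for the index, and the field names `id`, `product_id`, `name`, and `names`, as well as both helper functions, are assumptions made for the example. The only Solr-specific parts are the `group` and `group.field` query parameters described on the FieldCollapsing wiki page.

```python
from urllib.parse import urlencode


def append_name(index, doc_id, name):
    """Workaround 1: read the stored document back, append, rewrite it whole.

    `index` is a plain dict standing in for a Solr index in which every
    field is stored. A real setup would fetch and re-add the document
    through a Solr client instead.
    """
    doc = dict(index.get(doc_id, {"id": doc_id, "names": []}))
    names = list(doc["names"])
    if name not in names:
        names.append(name)
    doc["names"] = names
    index[doc_id] = doc  # re-adding replaces the whole document
    return doc


def grouped_query_params(text, group_field="product_id"):
    """Workaround 2: one document per name, rolled up at query time.

    Builds the query string for a request that groups results on the
    (hypothetical) non-unique `product_id` field.
    """
    return urlencode({
        "q": f"name:{text}",
        "group": "true",
        "group.field": group_field,
    })


index = {}
append_name(index, "B0051QVF7A", "Kindle")
append_name(index, "B0051QVF7A", "Amazon Kindle")
print(index["B0051QVF7A"]["names"])
print(grouped_query_params("kindle"))
```

With the first approach every added name costs a read plus a full rewrite of the document. With the second, indexing is append-only and the cost moves to query time.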
Ingram Content Group
From: Tiernan OToole [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 19, 2011 5:11 AM
To: [EMAIL PROTECTED]
Cc: Otis Gospodnetic
Subject: Re: Solr MultiValue Fields and adding values
I was hoping that wasn't going to be the case... I ended up querying for
all unique IDs in the DB, then querying for each unique ID to get
all its names, and inserting them that way... It seems a lot
slower than it really should be in theory...
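One possible alternative to a query per ID, assuming the database can return the rows sorted by ID, is a single ordered pass that builds each complete document as the rows stream by. The sketch below is illustrative only: an in-memory SQLite table stands in for the real database, and the table name, column names, the second sample ID, and the `documents_from_rows` helper are all hypothetical.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter


def documents_from_rows(rows):
    """Yield one complete document per ID from (id, name) rows sorted by ID."""
    for doc_id, group in groupby(rows, key=itemgetter(0)):
        yield {"id": doc_id, "names": [name for _, name in group]}


# In-memory SQLite stands in for the real database; the schema is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_names (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO product_names VALUES (?, ?)",
    [
        ("B0051QVF7A", "Kindle"),
        ("A000000001", "Some Other Product"),
        ("B0051QVF7A", "Amazon Kindle"),
    ],
)

# ORDER BY id makes all rows for one ID adjacent, which groupby requires.
cursor = conn.execute("SELECT id, name FROM product_names ORDER BY id, name")
docs = list(documents_from_rows(cursor))
for doc in docs:
    print(doc)  # each doc could be sent to the index as a whole document
```

Only one ID's names are held in memory at a time, so this should scale to millions of rows as long as the database can do the sort.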
On 18/10/2011 23:20, Otis Gospodnetic wrote:
> You'll need to construct the whole document and index it as such. You
> can't append values to document fields.
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>> From: Tiernan OToole <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Sent: Tuesday, October 18, 2011 11:41 AM
>> Subject: Solr MultiValue Fields and adding values
>> Good morning.
>> I asked this question on StackOverflow, but thought this group may be
>> able to help... the question is available on SO here:
>> here goes:
>> I am building a search engine, and have a not so unique ID for a lot of
>> different names... So, for example, there could be an id of B0051QVF7A
>> which would have multiple names like "Kindle" "Amazon Kindle" "Amazon
>> Kindle 3G" "Kindle Ebook Reader" "New Kindle" etc.
>> The problem, and the question I have, is that I am trying to enter this data
>> from a DB of 11-ish million rows. Each is being read one at a time, so I
>> don't have all the names of each ID. I am adding new documents to the
>> list each time.
>> What I am trying to find out is: how do I add names to an existing
>> document? If I am reading the documentation correctly, it seems to overwrite
>> the whole document, not add extra info to the field... I just want to
>> add an extra name to the document's multivalue field...
>> I know this could cause some weird and wonderful "issues" if a name is
>> removed (in the example above, "New Kindle" could be removed when a
>> newer Kindle gets released) but I am thinking of recreating the index
>> every now and again to clear out issues like that (once a month or so;
>> it's taking about 45 min currently to create the index).
>> So, how do you add a value to a multivalue field in Solr for an existing
>> document?
>> Thanks in advance.