Re: Trending topics?
Lance Norskog 2012-08-02, 22:48
Two easy ones:
1) Facets on a text field are simple word counts by document: for each
word, the number of documents that contain it.
2) If you want the number of times a word appears inside a document,
that requires a separate dataset called a 'term vector'. This is a
list of all words in a document with a count for each one.
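
To make the difference between the two counts concrete, here is a small
plain-Python sketch. It is not Solr code: the tokenize() helper is a
made-up stand-in for whatever analyzer your field uses, and the sample
documents are invented.

```python
from collections import Counter
import re

def tokenize(text):
    # Hypothetical stand-in for a Solr analyzer: lowercase, keep runs of letters.
    return re.findall(r"[a-z]+", text.lower())

def document_frequencies(docs):
    # Facet-style count: for each word, how many documents contain it.
    df = Counter()
    for text in docs:
        df.update(set(tokenize(text)))  # set() so a word counts once per document
    return df

def term_vector(text):
    # Term-vector-style count: how often each word occurs inside one document.
    return Counter(tokenize(text))

docs = [
    "Tennis tennis and more tennis",
    "Swimming and tennis",
    "Basketball season",
]

print(document_frequencies(docs)["tennis"])  # documents that mention tennis
print(term_vector(docs[0])["tennis"])        # occurrences within the first document
```

The first number only tells you how widely a word is used; the second
tells you how heavily one document uses it.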
Facet counts and term vectors are both simple queries. There are also
batch computations where you create a 'term-document matrix', with a
row for each document and a column for each term that appears in any
document. These computations require exporting all of your data into a
separate computation.
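
As a rough illustration of what such a matrix looks like once the text
has been exported, here is a minimal sketch in plain Python. The
tokenizer and the two sample documents are assumptions for the example;
a real job would use a proper analyzer and probably a sparse matrix
library.

```python
from collections import Counter
import re

def tokenize(text):
    # Hypothetical tokenizer: lowercase, keep runs of letters.
    return re.findall(r"[a-z]+", text.lower())

def term_document_matrix(docs):
    # One Counter per document, then one column per term seen anywhere.
    counts = [Counter(tokenize(text)) for text in docs]
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for terms a document does not contain.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

docs = ["tennis tennis swimming", "swimming basketball"]
vocabulary, matrix = term_document_matrix(docs)

print(vocabulary)  # ['basketball', 'swimming', 'tennis']
for row in matrix:
    print(row)     # [0, 1, 2] then [1, 1, 0]
```

Each row is one document and each column is one term, so summing a
column gives the total number of times that term appears in the whole
collection.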
On Thu, Aug 2, 2012 at 1:26 PM, Chris Dawson <[EMAIL PROTECTED]> wrote:
> Thanks for your response.
> I'd like to put an arbitrary set of text into Solr and then have Solr tell
> me the ten most popular "topics" that are in there. For example, if I put
> in 100 paragraphs of text about sports, I would like to retrieve topics
> like "swimming, basketball, tennis" if those are the three most popular
> and most discussed topics in the text.
> Is Solr the correct tool to do something like this? Or, is this too
> unstructured to get this kind of result without manually categorizing it?
> Is the correct term for this faceting? It seems to me that faceting
> requires putting the data into a more structured format (for example,
> telling the index that this is the "manufacturer", etc.)
> Basically, I would like to get something like a tag cloud (relevant topics
> with weights for each term) without asking users to tag things manually.
> On Thu, Aug 2, 2012 at 3:25 PM, Tor Henning Ueland <[EMAIL PROTECTED]> wrote:
>> On Thu, Aug 2, 2012 at 5:34 PM, Chris Dawson <[EMAIL PROTECTED]> wrote:
>> > How would I generate a list of trending topics using solr?
>> By putting them in solr.
>> (A generic question gets a generic answer.)
>> What do you mean? Trending searches, trending data, trending documents,
>> trending what?
>> Tor Henning Ueland
> Chris Dawson
> Human potential, travel and entrepreneurship: http://webiphany.com/
> Traveling to Portland, OR? http://www.airbnb.com/rooms/58909