Monday, May 12, 2008

say what????

Today I sent this email to a discussion list about the SOLR software:

"I'm experienced with Lucene, less so with SOLR. I am looking at two systems built on top of SOLR for a library discovery service: blacklight and vufind.

I checked the raw lucene index using Luke and noticed that both of these indexes have single character terms in the index, such as "d" or "f". I asked about this on the vufind list, and was told I didn't understand SOLR and why it would need these.

So I'm now asking: why would SOLR want single character terms? "a" is usually a stopword. I know the Library MARC data from which the index is derived has a lot of these characters because they denote subfields in the data. But why would we want them to be searchable?"

I got a personal email in response:

"Here is a trick: do not use a feminine name on the Internet :) The way you described this sounds very patronizing."

Huh????

Happily, a respected and knowledgeable poster answered my question on the list.
