Saturday, August 17, 2013

Make Custom Fields in ColdFusion Solr Collections Not Searchable

How

Here is how I did it. It might not be the best way, but it works:

  • Update Solr schema
    • Change the custom field definitions to type="string" and indexed="false"
    • Comment out all the copyField definitions for the custom fields
  • Purge all Solr collections and reindex all of them

Details of the XML Changes

Changes to the custom field definitions

Before:
<field name="custom1"   type="text"   indexed="true" stored="true" required="false" />
<field name="custom2"   type="text"   indexed="true" stored="true" required="false" />
<field name="custom3"   type="text"   indexed="true" stored="true" required="false" />
<field name="custom4"   type="text"   indexed="true" stored="true" required="false" />
After:
<field name="custom1"   type="string"   indexed="false" stored="true" required="false" />
<field name="custom2"   type="string"   indexed="false" stored="true" required="false" />
<field name="custom3"   type="string"   indexed="false" stored="true" required="false" />
<field name="custom4"   type="string"   indexed="false" stored="true" required="false" />
Changes to the copyField definitions
Before:
<copyField source="contents_*"  dest="contents" />
<copyField source="custom*"  dest="contents" />
<copyField source="title"  dest="contents" />
<copyField source="contents_*"  dest="contentsExact" />
<copyField source="custom*"  dest="contentsExact" />

After:
<copyField source="contents_*"  dest="contents" />
<!--
<copyField source="custom*"  dest="contents" />
-->
<copyField source="title"  dest="contents" />
<copyField source="contents_*"  dest="contentsExact" />
<!-- 
<copyField source="custom*"  dest="contentsExact" />
-->
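
To double-check the edited schema before purging and reindexing, a small script can confirm that no custom field is still indexed or still caught by a copyField. This is only a rough sketch in Python, not part of my original fix: the function name, the sample schema, and the "custom" prefix are assumptions for illustration, and it only understands the simple `*` wildcard used in the copyField sources above.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema fragment for illustration only;
# a real collection schema contains many more definitions.
SAMPLE_SCHEMA = """
<schema name="sample" version="1.1">
  <fields>
    <field name="title" type="text" indexed="true" stored="true" />
    <field name="custom1" type="string" indexed="false" stored="true" required="false" />
    <field name="custom2" type="string" indexed="false" stored="true" required="false" />
  </fields>
  <copyField source="contents_*" dest="contents" />
  <!--
  <copyField source="custom*" dest="contents" />
  -->
  <copyField source="title" dest="contents" />
</schema>
"""


def find_searchable_custom_fields(schema_xml, prefix="custom"):
    """Return a list of reasons why fields starting with `prefix` may still be searchable."""
    # The default parser drops XML comments, so commented-out copyFields are ignored.
    root = ET.fromstring(schema_xml)
    copy_fields = [
        (c.get("source", ""), c.get("dest", "")) for c in root.iter("copyField")
    ]
    problems = []
    for field in root.iter("field"):
        name = field.get("name", "")
        if not name.startswith(prefix):
            continue
        # Anything other than an explicit indexed="false" is reported.
        if field.get("indexed", "").lower() != "false":
            problems.append(f"{name}: indexed is not false")
        # A copyField whose source pattern matches the field makes it searchable
        # through the destination field.
        for source, dest in copy_fields:
            if fnmatch.fnmatchcase(name, source):
                problems.append(f"{name}: copied to {dest} via copyField source {source}")
    return problems


for problem in find_searchable_custom_fields(SAMPLE_SCHEMA):
    print(problem)
```

An empty result means both conditions from the list above hold for every custom field. To run it against a real collection, read that collection's schema.xml into a string and pass it in place of SAMPLE_SCHEMA.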
Environment
ColdFusion 10

Why


Why do I need to make custom fields not searchable?

Migrating ColdFusion collections from Verity to Solr is never straightforward. See my other post, Solr in Coldfusion 9, for some lessons I learned during the migration.

Fast forward to last week (a couple of months after the production servers were migrated), and another surprise surfaced. Some searches returned seemingly random text that did not appear to be relevant. After some investigation, I found out it was due to unexpected behavior of the custom fields.

Custom fields in Verity did not seem to be searchable. After the migration to Solr, the custom fields were indexed and became searchable. That may be a good thing for people who want them to be searchable. But in my case they were used for storage only: they contain database IDs and other content that should not be used for searching.

Parting Words

It took me nearly half a day to figure out how to do it. Hopefully this can save someone a few hours of frustration. I started by naively changing only the custom fields to indexed="false", and it turned out they were still searchable. A Google search did not turn up anything helpful, except a few posts asking the exact question I was asking. Finally, after exploring the Solr admin pages and looking at the schema of the collections, I found out that the custom fields are also copied into searchable fields, and that type="text" also runs an analyzer on them.
