Wednesday, June 4, 2003 03:55 PM
Is there an easy way to get the number of terms in common between the query and each returned document? I need to build a custom similarity function: 2C/(Q+D), where C is the number of terms in common, Q is the number of terms in the query, and D is the number of terms in the document.
Yes, this is a strange thing to do for typical IR, but we are implementing a fast search engine that will give us a string similarity score between an input (the query) and each string stored in a DB (the docs). This will be part of a bigger system.
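
To make the target score concrete, here is a minimal sketch in plain Python, outside any search library. It reads the formula as 2C/(Q+D) (the Dice coefficient) and makes two assumptions that are not stated above: terms are counted as distinct tokens, and tokenization is a simple lowercase whitespace split. The names tokenize and dice_similarity are hypothetical, chosen only for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace.

    Assumption: a stand-in for whatever analyzer the real index uses.
    """
    return text.lower().split()


def dice_similarity(query, document):
    """Return 2C / (Q + D), counting distinct terms only (an assumption).

    C = number of distinct terms shared by query and document
    Q = number of distinct terms in the query
    D = number of distinct terms in the document
    """
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(document))
    total = len(q_terms) + len(d_terms)
    if total == 0:
        # Both strings are empty: define the score as 0 to avoid dividing by zero.
        return 0.0
    common = len(q_terms & d_terms)
    return 2.0 * common / total


if __name__ == "__main__":
    # Two of the three distinct terms on each side overlap: C=2, Q=3, D=3.
    print(dice_similarity("fast search engine", "fast string search"))
```

The score ranges from 0 (no shared terms) to 1 (identical term sets). If repeated terms should count more than once, the sets would need to be replaced with multisets such as collections.Counter.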