[Pgbigm-hackers] Adding similarity() and similarity_op(), '%' to pg_bigm

Fujii Masao masao****@gmail*****
Thu, Oct 3, 2013 20:08:55 JST


On Thu, Oct 3, 2013 at 7:55 PM, Fujii Masao <masao****@gmail*****> wrote:
> On Thu, Oct 3, 2013 at 7:34 PM, Amit Langote <amitl****@gmail*****> wrote:
>> On Thu, Oct 3, 2013 at 7:28 PM, Fujii Masao <masao****@gmail*****> wrote:
>>> On Thu, Oct 3, 2013 at 5:52 PM, Amit Langote <amitl****@gmail*****> wrote:
>>>> On Thu, Oct 3, 2013 at 5:41 PM, Fujii Masao <masao****@gmail*****> wrote:
>>>>
>>>>>> So the way to make the similarity function case-insensitive would be to
>>>>>> change generate_bigm, not the similarity code itself. Also, the change
>>>>>> would make the show_bigm function behave differently.
>>>>>
>>>>> Yes, generate_bigm would need to be updated to make bigm_similarity
>>>>> case-insensitive.
>>>>>
>>>>> *From a user's point of view*, should bigm_similarity() and the upcoming
>>>>> similarity search be case-insensitive? If yes, we should change
>>>>> generate_bigm, but that change must not affect the behavior of the
>>>>> full-text search at all.
>>>>>
>>>>> Or should we just implement both case-sensitive and case-insensitive
>>>>> versions of bigm_similarity() and similarity search?
>>>>>
>>>>> Thoughts?
>>>>
>>>> Could we say that case-sensitivity applies more to comparison
>>>> functions, as in strcmp() vs strcmpi(), than to the similarity()
>>>> function? What do you think?
>>>
>>> It depends on the whole design of the full-text (similarity) search, I think.
>>> For example, pg_trgm's comparison function is case-*sensitive*.
>>> Its generate_trgm function is case-*insensitive*.
>>>
>>
>> How about making generate_bigm() accept a case-sensitivity parameter?
>
> Yep, I had the same idea.
>
>> Could we make some parts case-sensitive (text matching) and other
>> parts case-insensitive (similarity())?
>
> On second thought, I'm afraid this means we would need to build two
> kinds of GIN indexes (case-sensitive and case-insensitive) when we'd
> like to use both text matching and similarity search.

Anyway, my opinion is just to implement case-sensitive similarity search
first (i.e., generate_bigm doesn't need to be changed). It's simple. Then
we can improve the feature later if needed.
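
To make the trade-off concrete, here is a minimal Python sketch of the idea
discussed above. It is not pg_bigm's C code: the function names, the
case_sensitive parameter, the single-space padding, and the Jaccard-style
ratio are all assumptions chosen for illustration, and pg_bigm's actual
bigram extraction and similarity formula may differ.

```python
def generate_bigrams(text, case_sensitive=True):
    """Return the set of 2-character substrings of text.

    Hypothetical stand-in for generate_bigm: the text is padded with one
    space on each side (an assumption) and optionally lowercased first.
    """
    if not case_sensitive:
        text = text.lower()
    padded = f" {text} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bigram_similarity(a, b, case_sensitive=True):
    """Shared bigrams divided by all distinct bigrams (Jaccard-style ratio)."""
    bigrams_a = generate_bigrams(a, case_sensitive)
    bigrams_b = generate_bigrams(b, case_sensitive)
    # Padding guarantees each set holds at least one bigram,
    # so the union below is never empty.
    return len(bigrams_a & bigrams_b) / len(bigrams_a | bigrams_b)


if __name__ == "__main__":
    print(bigram_similarity("PostgreSQL", "postgresql", case_sensitive=True))
    print(bigram_similarity("PostgreSQL", "postgresql", case_sensitive=False))
```

In this sketch the case-insensitive call treats "PostgreSQL" and
"postgresql" as identical, while the case-sensitive call scores them lower
because bigrams such as "Po" and "po" no longer match. It also shows why the
two modes would need separate indexes: the sets of bigrams extracted from
the same text differ between them.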

Regards,

-- 
Fujii Masao



