[Pgbigm-hackers] Regarding C locale restriction


Amit Langote amitl****@gmail*****
Mon Oct 28 11:47:28 JST 2013


On Fri, Oct 25, 2013 at 10:17 PM, Beena Emerson <memis****@gmail*****> wrote:
> Hello,
>
> I have been trying a few different encodings and locales, but so far I
> have not been able to work out where exactly pg_bigm would fail if a
> different locale or encoding is used.
>
> There is a difference in the way the bi-grams are sorted, but that is
> expected, I guess. Besides that, there isn't any unexpected behavior.
>

pg_bigm uses bttextcmp(), which in turn calls varstr_cmp(). Under the
C locale, varstr_cmp() simply does a bytewise memcmp() to compare
strings. So, apart from ASCII alphanumerics, we can't really rely on
the sort order of strings in other languages when the database uses
the C locale. Is this restriction then more about performance, or
about correctness?
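
To make the bytewise ordering concrete, here is a minimal sketch of the
kind of comparison I mean. It is not the PostgreSQL source:
bytewise_cmp() is a hypothetical helper that assumes a memcmp() over the
common prefix with the shorter string sorting first on a tie, and it
takes explicit lengths rather than text datums.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an illustration, not PostgreSQL code): compare two
 * byte strings the way a C-locale comparison is described above, i.e.
 * memcmp() over the common prefix, then the shorter string sorts first.
 * Returns <0, 0 or >0.
 */
int
bytewise_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = (alen < blen) ? alen : blen;
    int    r;

    assert(a != NULL && b != NULL);

    /* memcmp() compares bytes as unsigned char values */
    r = memcmp(a, b, n);

    /* equal prefix: the shorter string sorts first */
    if (r == 0 && alen != blen)
        r = (alen < blen) ? -1 : 1;

    return r;
}
```

With this, "Z" sorts before "a" (0x5A < 0x61), and any multi-byte UTF-8
character such as "\xC3\xA9" sorts after every ASCII character, because
its lead byte is above 0x7F. The order is deterministic, just not the
one a language-aware collation would give.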

Am I missing something here?

--
Amit



