Amit Langote
amitl****@gmail*****
Mon Oct 28 11:47:28 JST 2013
On Fri, Oct 25, 2013 at 10:17 PM, Beena Emerson <memis****@gmail*****> wrote:
> Hello,
>
> I have been trying with few different encoding and locales, but so far I
> have not been able to understand where exactly pg_bigm will fail if
> different locale or encoding is used.
>
> There is difference in the way the bi-grams are sorted but that is expected
> I guess. Besides that there isn't any unexpected behavior.
>

pg_bigm uses bttextcmp(), which in turn uses varstr_cmp(). When using the C
locale, varstr_cmp() simply does a primitive memcmp() to compare strings.
So, besides ASCII alphanumerics, we can't really rely on the sorting order of
strings in other languages when using the C locale for the database.

So, is this restriction more from the performance perspective or for
correctness? Am I missing something here?

--
Amit
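
To illustrate the memcmp() point, here is a minimal Python sketch of a
byte-wise comparison. It is not PostgreSQL or pg_bigm code: memcmp_order()
is a hypothetical helper, and the UTF-8 encoding and the sample words are
assumptions chosen only for the example. It mimics comparing the raw bytes
of two strings, with the shorter string sorting first when one is a prefix
of the other.

```python
from functools import cmp_to_key


def memcmp_order(a: str, b: str, encoding: str = "utf-8") -> int:
    """Hypothetical helper: compare two strings by their raw encoded bytes.

    Python compares bytes objects lexicographically by unsigned byte value,
    and a proper prefix sorts before the longer string, which is the same
    ordering a memcmp() plus length tie-break produces.
    """
    ab, bb = a.encode(encoding), b.encode(encoding)
    return (ab > bb) - (ab < bb)


# Sample data (an assumption for illustration only).
words = ["zebra", "éclair", "apple", "Apple"]

# Byte order: 'A' (0x41) < 'a' (0x61) < 'z' (0x7A) < 'é' (0xC3 0xA9 in UTF-8).
byte_sorted = sorted(words, key=cmp_to_key(memcmp_order))
print(byte_sorted)
```

With these assumptions, "éclair" sorts after "zebra" because its first byte
is larger than any ASCII byte, which is not the order a language-aware
collation would normally give.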