Ticket #30827

User-defined matching predicate

Opened: 2013-02-22 05:07 Last updated: 2013-08-02 06:00

Reporter:
Owner:
Status: Closed
Component:
Priority: 7
Severity: 5 - Medium
Resolution: Accepted
File:

Details

Allow specification on the command line of a file that in some way lists kanji (EIDS heads), so that "is on the list" becomes available as a matching predicate. The killer app: I'd like to be able to search KanjiVG or CHISE for kanji that are not in Tsukurimashou but are lr or tb combinations of kanji in Tsukurimashou.

Detailed proposal: one or more command-line options to specify user-defined matching predicates. These might include:
  • specify a literal string in EIDS match-pattern syntax (matches anything the string does);
  • specify a text file name and match any non-blank character in the file, or any line;
  • specify a font file (OTF format, other?) and match any character that is defined in the font.

New matching operator, unary .user. or .#. (with sugary brackets), which looks at the head of its child as a decimal number (considered equal to 1 if not otherwise valid) and takes that as the 1-based index into the list of user-defined matching predicates that have been specified. Using the head of the child means that in the common case of no more than nine user-defined predicates, we can use simple "#1", "#2", "#3" syntax, which looks like argument substitution in languages such as TeX.
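
The index rule can be sketched roughly as follows. This is only an illustration in Python, not the idsgrep implementation; the names `predicate_index`, `chars_from_text`, and `select_predicate` are hypothetical. Treating "0" as invalid and returning None for an index past the end of the list are assumptions, since the proposal does not say what happens in those cases.

```python
def predicate_index(head):
    """Return the 1-based predicate index encoded by a child's head.

    A head that is missing, or is not a positive ASCII decimal number,
    counts as 1. Treating "0" as invalid is an assumption.
    """
    if head is not None and head.isascii() and head.isdigit():
        value = int(head)
        if value >= 1:
            return value
    return 1


def chars_from_text(text):
    """Text-file predicate: the set of non-blank characters in the text."""
    return {ch for ch in text if not ch.isspace()}


def select_predicate(predicates, head):
    """Pick the predicate a '#' operator refers to.

    Returning None when the index is past the end of the list is an
    assumption; the proposal leaves that case open.
    """
    index = predicate_index(head)
    if index <= len(predicates):
        return predicates[index - 1]
    return None


# Two user-defined predicates, as if given by two command-line options.
predicates = [chars_from_text("日 月\n木"), chars_from_text("火 水")]
print("月" in select_predicate(predicates, "1"))    # head "1": first predicate
print("水" in select_predicate(predicates, "2"))    # head "2": second predicate
print("木" in select_predicate(predicates, None))   # no head: defaults to 1
```

With this rule, a bare "#" and "#1" refer to the same predicate, which keeps the single-predicate case short.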

Example of use:

idsgrep --user-predicate Tsukurimashou.otf -dkanjivg '&!#1|[lr]#1#1[tb]#1#1'

(beware editing this - the Wiki software here will garble it if not carefully escaped with extra exclamation points)

That would mean "match in KanjiVG dictionary, any kanji not in the font file, where the kanji is either an [lr] or a [tb] combination of two kanji that are in the font file." Such a query would return kanji that might easily be added to the font.
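
To make the intended result concrete, here is a toy model of that query in Python. It is a sketch under simplifying assumptions: each dictionary entry is reduced to a head, a top-level operator, and the heads of its two parts, which is far simpler than real EIDS trees, and `Entry`, `easy_additions`, `FONT`, and `DICTIONARY` are hypothetical names with made-up sample data.

```python
from typing import NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    head: str                 # the kanji this entry describes
    op: Optional[str]         # "lr", "tb", or None for anything else
    parts: Tuple[str, ...]    # heads of the immediate children


def easy_additions(entries, in_font):
    """Heads not in the font whose two parts are both in the font."""
    hits = []
    for entry in entries:
        if entry.head in in_font:
            continue  # corresponds to the "!#1" half of the query
        if (entry.op in ("lr", "tb")
                and len(entry.parts) == 2
                and all(part in in_font for part in entry.parts)):
            hits.append(entry.head)  # "[lr]#1#1" or "[tb]#1#1"
    return hits


FONT = {"日", "月", "木"}  # stand-in for the characters defined in the font
DICTIONARY = [
    Entry("明", "lr", ("日", "月")),
    Entry("林", "lr", ("木", "木")),
    Entry("昌", "tb", ("日", "日")),
    Entry("杏", "tb", ("木", "口")),  # 口 is not in FONT, so no match
    Entry("日", None, ()),            # already in FONT, so no match
]

print(easy_additions(DICTIONARY, FONT))  # prints ['明', '林', '昌']
```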

Ticket History (3/3 Histories)

2013-02-22 05:07 Updated by: mskala
  • New Ticket "User-defined matching predicate" created
2013-02-22 05:12 Updated by: mskala
  • Details Updated
2013-08-02 06:00 Updated by: mskala
  • Resolution Update from (none) to Accepted
  • Status Update from Open to Closed
  • Ticket Close date is changed to 2013-08-02 06:00
Comment

This basically works. I'd be happier if I had a wider range of font files to test on, but that can happen later.

Attachment File List

No attachments
