
Tsukurimashou Font Family and IDSgrep

Ticket #33629

Syntax errors in CHISE and CJKVI databases

Date opened: 2014-04-04 03:06 Last updated: 2014-04-04 03:07


Reporter:

mskala

Owner:

mskala

Type:

Bugs

Status:

Open [Owner assigned]

Component:

IDSgrep

Milestone:

(None)

Priority:

5 - Medium

Severity:

5 - Medium

Resolution:

None

File:

None

Details

Although we detect and filter some errors of this type, the CHISE and CJKVI databases compiled by the IDSgrep build process still contain some "entries" that are not single syntactically valid EIDSes. This is caused by syntax errors in the original databases we build from, and it shows up in the output as a discrepancy between the number of lines in a result set and the count reported by --statistics. Those two numbers should differ when the multi-line headers from the dictionaries are included in the results, but only then; all actual dictionary entries should be single-line. Usually what happens is that a partial entry on one line consumes one or two entries on the following lines to make up its missing children, so the tree count ends up smaller than the line count. This is possible because lines are not special to the EIDS parser.
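To illustrate how a partial entry swallows the entries after it, here is a minimal Python sketch. It does not use real EIDS syntax: it assumes a much simpler prefix grammar in which the standard Unicode ideographic description characters take two or three children and every other character is a leaf, with no heads or brackets. The name count_trees and the sample data are hypothetical.

```python
# Simplified prefix grammar, NOT real EIDS syntax (assumption for illustration).
BINARY = set("⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻")   # operators taking two children
TERNARY = set("⿲⿳")             # operators taking three children


def count_trees(text):
    """Count complete trees, treating newlines as ordinary whitespace."""
    trees = 0
    pending = 0  # children still needed to finish the current tree
    for ch in text:
        if ch.isspace():
            continue  # line breaks are not special, as in the ticket
        if pending == 0:
            pending = 1  # this symbol starts a new tree
        pending -= 1  # the symbol fills one open slot
        if ch in BINARY:
            pending += 2
        elif ch in TERNARY:
            pending += 3
        if pending == 0:
            trees += 1
    return trees


# The second line lacks a child, so it consumes the third line's tree.
sample = "⿰日月\n⿱山\n⿰木木\n林\n"
line_count = len(sample.splitlines())
tree_count = count_trees(sample)
print(line_count, tree_count)
```

With this sample the line count is 4 and the tree count is 3, which is the same kind of mismatch described above between counted lines and the --statistics total.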

Since this is properly an issue with input data that we didn't write (IDSgrep is functioning correctly, given its specification and the bad data), and there's no way to fix it properly short of creating our own replacement dictionary entries for the bad ones, it may not be top priority. It is a nuisance for speed tests, though, because it means we can't just count lines to count matches but must capture and sum the STATS lines. Filing it as a bug and not a hairy yak because we're already attempting to filter out bad data in input dictionaries, and that filtering has evidently failed in this case. Maybe consider a syntax-check feature that would *make* lines special to the EIDS parser and throw an error if a tree is incomplete at line end; then errors of this type could at least be detected during dictionary creation.
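A rough sketch of what such a line-aware check might look like, in Python rather than as a change to IDSgrep itself. It again assumes a simplified prefix grammar (Unicode ideographic description characters as binary or ternary operators, everything else a leaf), not real EIDS syntax, and it makes no attempt to exempt the multi-line dictionary headers, which a real check would need to handle. The names ARITY and check_lines are hypothetical.

```python
# Simplified prefix grammar, NOT real EIDS syntax (assumption for illustration).
ARITY = {ch: 2 for ch in "⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻"}
ARITY.update({ch: 3 for ch in "⿲⿳"})


def check_lines(lines):
    """Report lines that are not exactly one complete tree."""
    problems = []
    for number, line in enumerate(lines, start=1):
        symbols = [ch for ch in line if not ch.isspace()]
        if not symbols:
            continue  # blank lines are skipped in this sketch
        pending = 1  # each line must supply exactly one tree
        for index, ch in enumerate(symbols):
            # The symbol fills one slot and opens ARITY slots for children.
            pending += ARITY.get(ch, 0) - 1
            if pending == 0:
                if index + 1 < len(symbols):
                    problems.append((number, "text after complete tree"))
                break
        else:
            # Ran out of symbols with children still missing.
            problems.append((number, "tree incomplete at line end"))
    return problems


report = check_lines(["⿰日月", "⿱山", "⿰木木", "林 林"])
for number, message in report:
    print(f"line {number}: {message}")
```

Run during dictionary creation, a check along these lines would flag the partial entry itself instead of letting it silently absorb the entries that follow.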

Ticket History (2/2 Histories)

2014-04-04 03:06 Updated by: mskala

New Ticket "Syntax errors in CHISE and CJKVI databases" created

2014-04-04 03:07 Updated by: mskala

Details Updated

Attachment File List

No attachments