sumom****@users*****
Thu Oct 2 17:23:48 JST 2008
Index: julius4/mkbingram/00readme-ja.txt
diff -u julius4/mkbingram/00readme-ja.txt:1.2 julius4/mkbingram/00readme-ja.txt:1.3
--- julius4/mkbingram/00readme-ja.txt:1.2 Tue Dec 18 23:08:22 2007
+++ julius4/mkbingram/00readme-ja.txt Thu Oct 2 17:23:48 2008
@@ -1,90 +1,86 @@
-MKBINGRAM(1)                                                      MKBINGRAM(1)
+    mkbingram
+MKBINGRAM(1)                                                      MKBINGRAM(1)

-NAME
-       mkbingram - make binary N-gram from arpa N-gram file

-SYNOPSIS
-       mkbingram -nlr forward_ngram.arpa -nrl backward_ngram.arpa bingram
+Name
+           mkbingram
+          - binary N-gram conversion
+
+Overview
+       mkbingram [-nlr forward_ngram.arpa] [-nrl backward_ngram.arpa]
+                 [-d old_bingram_file] {output_bingram_file}

 DESCRIPTION
-       mkbingram is a tool that combines and converts forward/backward N-grams
-       in ARPA format into a binary file. Using it greatly speeds up the
-       startup of Julius.
-
-       From Rev.4.0, N-grams of 4-gram and higher are also supported. The upper
-       limit is 10.
-
-       When a forward N-gram is specified with "-nlr" and no backward N-gram is
-       specified, mkbingram generates a binary N-gram from the forward N-gram
-       alone. When using this binary N-gram, Julius performs the 1st pass with
-       the 2-gram in it, and in the 2nd pass it performs recognition while
-       computing backward probabilities from the forward ones by Bayes' rule.
-
-       When a backward N-gram is specified with "-nrl" and no forward N-gram is
-       specified, mkbingram generates a binary N-gram from the backward N-gram
-       alone. When using this binary N-gram, Julius performs the 1st pass while
-       computing probabilities from the backward 2-gram in it by Bayes' rule,
-       and performs the 2nd pass using the backward N-gram.
-
-       When both are specified, a binary N-gram is generated that combines the
-       2-gram in the forward N-gram with the backward N-gram. Julius performs
-       the 1st pass with the forward 2-gram and the 2nd pass with the backward
-       N-gram. Note that both N-grams must be trained on the same corpus under
-       the same conditions (cut-off values, back-off calculation method, etc.)
-       and must have the same vocabulary.
-
-       mkbingram can read gzip-compressed ARPA files as they are.
-
-       Please note that binary N-gram files converted with the mkbingram bundled
-       with Julius 4.0 and later cannot be read by 3.x.
+       mkbingram is a tool that converts N-gram definition files in ARPA format
+       into binary N-gram files for Julius. Converting them in advance greatly
+       speeds up the startup of Julius.
+
+       From Julius-4, an N-gram can be given as forward, backward, or both.
+       mkbingram can likewise create a binary N-gram from either one alone.
+       When both are specified, the two N-grams are combined into a single
+       binary N-gram.
+
+       When only a forward N-gram is specified, mkbingram generates a binary
+       N-gram from it alone. When using this binary N-gram, Julius performs the
+       1st pass with the 2-gram in it, and in the 2nd pass it performs recognition
+       while computing backward probabilities from the forward ones by Bayes' rule.
+
+       When only a backward N-gram is specified, mkbingram generates a binary
+       N-gram from it alone. When using this binary N-gram, Julius performs the
+       1st pass while computing probabilities from the backward 2-gram in it by
+       Bayes' rule, and performs the 2nd pass using the backward N-gram.
+
+       When both are specified, a binary N-gram is generated that combines the
+       2-gram in the forward N-gram with the backward N-gram. Julius performs
+       the 1st pass with the forward 2-gram and the 2nd pass with the backward
+       N-gram. Note that both N-grams must be trained on the same corpus under
+       the same conditions (cut-off values, back-off calculation method, etc.)
+       and must have the same vocabulary.
+
+       mkbingram can also read gzip-compressed ARPA files as
+       they are.
+
+       Binary N-grams created with version 3.x or earlier can be read as they
+       are by 4.0. They can also be converted from the old binary format to the
+       new one by passing them to mkbingram with -d. Please note that binary
+       N-gram files created with mkbingram 4.0 or later cannot be used with
+       versions 3.x or earlier.

 OPTIONS
-       -nlr forward_ngram.arpa
-              Forward word N-gram file in ARPA standard format.
-
-       -nrl backward_ngram.arpa
-              Reverse-direction word N-gram file in ARPA standard format.
-
-       -d binary N-gram
-              Binary N-gram file to use as input (for reconverting an old binary N-gram).
-
-       bingram
-              Output file (binary format for Julius).
+        -nlr forward_ngram.arpa
+           Read a forward (left-to-right) N-gram file in ARPA format.

-EXAMPLE
-       Convert ARPA-format N-grams to binary format:
+        -nrl backward_ngram.arpa
+           Read a backward (right-to-left) N-gram file in ARPA format.

-           % mkbingram -nlr ARPA_2gram -nrl ARPA_rev_3gram outfile
+        -d old_bingram_file
+           Read a binary N-gram (for converting the old binary format).

-       Convert an old binary N-gram file to the format of 3.5 and later:
-
-           % mkbingram -d old_bingram new_bingram
-
-
-USAGE
-       When specifying the language model in Julius, instead of "-nlr 2gramfile
-       -nrl rev3gramfile" for the original ARPA files, specify the binary file
-       converted by mkbingram as "-d bingramfile".
+        output_bingram_file
+           Name of the output binary N-gram file.
+
+EXAMPLES
+       Convert ARPA-format N-grams to binary format (forward + backward):
+       Convert an ARPA-format forward 4-gram to binary format (forward only):
+       Convert an old binary N-gram file to the current format:

 SEE ALSO
-       julius(1)
-
-BUGS
-       Please send bug reports, inquiries and comments to
-       julius-info at lists.sourceforge.jp.
+        julius ( 1 ) ,
+        mkbinhmm ( 1 )

 COPYRIGHT
-       Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
-       Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
-       Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology
+       Copyright (c) 1991-2008 Kawahara Lab., Kyoto University
+
+       Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan (IPA)
+
+       Copyright (c) 2000-2008 Shikano Lab., Nara Institute of Science and Technology

-AUTHORS
-       Implemented by LEE Akinobu (Nagoya Institute of Technology).
+       Copyright (c) 2005-2008 Julius project team, Nagoya Institute of Technology

 LICENSE
        Subject to the Julius license terms.

-4.3 Berkeley Distribution            LOCAL                        MKBINGRAM(1)
+                                  10/02/2008                      MKBINGRAM(1)
Index: julius4/mkbingram/00readme.txt
diff -u julius4/mkbingram/00readme.txt:1.2 julius4/mkbingram/00readme.txt:1.3
--- julius4/mkbingram/00readme.txt:1.2 Tue Dec 18 23:08:22 2007
+++ julius4/mkbingram/00readme.txt Thu Oct 2 17:23:48 2008
@@ -1,89 +1,94 @@
+    mkbingram
+
 MKBINGRAM(1)                                                      MKBINGRAM(1)

 NAME
-       mkbingram - make binary N-gram from arpa N-gram file
+           mkbingram
+          - make binary N-gram from ARPA N-gram file

 SYNOPSIS
-       mkbingram -nlr forward_ngram.arpa -nrl backward_ngram.arpa bingram
+       mkbingram [-nlr forward_ngram.arpa] [-nrl backward_ngram.arpa]
+                 [-d old_bingram_file] {output_bingram_file}

 DESCRIPTION
-       mkbingram makes a binary N-gram file for Julius from forward (left-to-
-       right) word N-gram and/or backward (right-to-left) word N-gram LMs in
-       ARPA standard format. Using the binary file, the initial startup of
-       Julius becomes much faster.
-
-       From rev. 4.0, longer N-gram (N < 10) is supported.
-
-       When only a forward N-gram is specified by "-nlr" and no backward N-
-       gram is specified, mkbingram generates binary N-gram for recognition
-       with only the forward N-gram. The 1st pass will use the 2-gram entry
-       in the given N-gram, and The 2nd pass will use the given N-gram, with
-       converting forward probabilities to backward probabilities by Bayes
+       mkbingram is a tool to convert N-gram definition file(s) in ARPA
+       standard format to a compact Julius binary format. It will speed up the
+       initial loading time of N-gram much faster. It can read gzipped file
+       directly.
+
+       From rev.4.0, Julius can deal with forward N-gram, backward N-gram and
+       their combinations. So, mkbingram now generates binary N-gram file from
+       one of them, or combining them two to produce one binary N-gram.
+
+       When only a forward N-gram is specified, mkbingram generates binary
+       N-gram from only the forward N-gram. When using this binary N-gram at
+       Julius, it performs the 1st pass with the 2-gram probabilities in the
+       N-gram, and run the 2nd pass with the given N-gram fully, with
+       converting forward probabilities to backward probabilities by Bayes
        rule.

-       When only a backward N-gram is specified by "-nrl" and no forward N-
-       gram is specified, mkbingram generates binary N-gram for recognition
-       with only the backward N-gram. The 1st pass will use the forward
-       2-gram probability computed from the backward 2-gram using Bayes rule.
-       The 2nd pass fully use the given backward N-gram.
-
-       When both forward and backward N-grams are specified, forward 2-gram
-       part and backward N-gram are gathered together into single bingram
-       file, to use the forward 2-gram for the 1st pass and backward N-gram
-       for the 2nd pass. Note that both N-gram should be trained in the same
-       corpus with same parameters (i.e. cut-off thresholds), with same vocab-
-       ulary.
-
-       mkbingram can read gzipped ARPA file.
+       When only a backward N-gram is specified, mkbingram generates an binary
+       N-gram file that contains only the backward N-gram. The 1st pass will
+       use forward 2-gram probabilities that can be computed from the backward
+       2-gram using Bayes rule, and the 2nd pass use the given backward N-gram
+       fully.
+
+       When both forward and backward N-grams are specified, the 2-gram part
+       in the forward N-gram and all backward N-gram will be combined into
+       single bingram file. The forward 2-gram will be applied for the 1st
+       pass and backward N-gram for the 2nd pass. Note that both N-gram should
+       be trained in the same corpus with same parameters (i.e. cut-off
+       thresholds), with same vocabulary.
+
+       The old binary N-gram produced by mkbingram of version 3.x and earlier
+       can be used in Julius-4, but you can convert the old version to the new
+       version by specifying it as input of current mkbingram by option "-d".

-       Please note that binary N-gram file converted by mkbingram of version
-       4.0 and later cannot be read by Julius 3.x.
+       Please note that binary N-gram file converted by mkbingram of version
+       4.0 and later cannot be read by older Julius 3.x.

 OPTIONS
-       -nlr forward_ngram.arpa
-              Forward (left-to-right) word N-gram file in ARPA standard for-
-              mat.
-
-       -nrl backward_ngram.arpa
-              Backward (right-to-left) word N-gram file in ARPA standard for-
-              mat.
-
-       -d old_bingram
-              Read in an old binary N-gram file (for conversion to the new
-              format).
-
-       bingram
-              output binary N-gram file.
-
-EXAMPLE
-       Convert ARPA files to binary format:
-
-           % mkbingram -nlr ARPA_2gram -nrl ARPA_rev_3gram outfile
-
-       Convert old binary N-gram file to new format:
-
-           % mkbingram -d old_bingram new_bingram
-
+        -nlr forward_ngram.arpa
+           Read in a forward (left-to-right) word N-gram file in ARPA standard
+           format.
+
+        -nrl backward_ngram.arpa
+           Read in a backward (right-to-left) word N-gram file in ARPA standard
+           format.
+
+        -d old_bingram_file
+           Read in a binary N-gram file.
+
+        output_bingram_file
+           binary N-gram file name to output.
+
+EXAMPLES
+       Convert a set of forward and backward N-gram in ARPA format into Julius
+       binary form:
+       Convert a single forward 4-gram in ARPA format into a binary file:
+       Convert old binary N-gram file to current format:

 SEE ALSO
-       julius(1)
+        julius ( 1 ) ,
+        mkbinhmm ( 1 ) ,
+        mkbinhmmlist ( 1 )

 COPYRIGHT
-       Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
-       Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and
+       Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
+
+       Copyright (c) 1991-2008 Kawahara Lab., Kyoto University
+
+       Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and
        Technology
-       Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Tech-
-       nology

-AUTHORS
-       LEE Akinobu (Nagoya Institute of Technology, Japan)
-       contact: juliu****@lists*****
+       Copyright (c) 2005-2008 Julius project team, Nagoya Institute of
+       Technology

 LICENSE
-       Same as Julius.
+       The same as Julius.
-4.3 Berkeley Distribution            LOCAL                        MKBINGRAM(1)
+                                  10/02/2008                      MKBINGRAM(1)
Index: julius4/mkbingram/mkbingram.man
diff -u julius4/mkbingram/mkbingram.man:1.2 julius4/mkbingram/mkbingram.man:removed
--- julius4/mkbingram/mkbingram.man:1.2 Tue Dec 18 23:08:22 2007
+++ julius4/mkbingram/mkbingram.man Thu Oct 2 17:23:48 2008
@@ -1,83 +0,0 @@
-.de Sp
-.if t .sp .5v
-.if n .sp
-..
-.de Ip
-.br
-.ie \\n.$>=3 .ne \\$3
-.el .ne 3
-.IP "\\$1" \\$2
-..
-.TH MKBINGRAM 1 LOCAL
-.UC 6
-.SH NAME
-mkbingram - make binary N-gram from arpa N-gram file
-.SH SYNOPSIS
-.B mkbingram -nlr forward_ngram.arpa -nrl backward_ngram.arpa bingram
-.SH DESCRIPTION
-.I mkbingram
-makes a binary N-gram file for Julius from forward (left-to-right)
-word N-gram and/or backward (right-to-left) word N-gram LMs in ARPA
-standard format. Using the binary file, the initial startup of Julius
-becomes much faster.
-.PP
-From rev. 4.0, longer N-gram (N < 10) is supported.
-.PP
-When only a forward N-gram is specified by "-nlr" and no backward
-N-gram is specified, mkbingram generates binary N-gram for recognition
-with only the forward N-gram. The 1st pass will use the 2-gram entry
-in the given N-gram, and The 2nd pass will use the given N-gram, with
-converting forward probabilities to backward probabilities by Bayes
-rule.
-.PP
-When only a backward N-gram is specified by "-nrl" and no forward
-N-gram is specified, mkbingram generates binary N-gram for recognition
-with only the backward N-gram. The 1st pass will use the forward
-2-gram probability computed from the backward 2-gram using Bayes rule.
-The 2nd pass fully use the given backward N-gram.
-.PP
-When both forward and backward N-grams are specified, forward 2-gram
-part and backward N-gram are gathered together into single bingram
-file, to use the forward 2-gram for the 1st pass and backward N-gram
-for the 2nd pass. Note that both N-gram should be trained in the same
-corpus with same parameters (i.e. cut-off thresholds), with same
-vocabulary.
-.PP
-.I mkbingram
-can read gzipped ARPA file.
-.PP
-Please note that binary N-gram file converted by mkbingram of version
-4.0 and later cannot be read by Julius 3.x.
-.SH OPTIONS
-.Ip "-nlr forward_ngram.arpa"
-Forward (left-to-right) word N-gram file in ARPA standard format.
-.Ip "-nrl backward_ngram.arpa"
-Backward (right-to-left) word N-gram file in ARPA standard format.
-.Ip "-d old_bingram"
-Read in an old binary N-gram file (for conversion to the new format).
-.Ip "bingram"
-output binary N-gram file.
-.SH EXAMPLE
-Convert ARPA files to binary format:
-.PP
-    % mkbingram -nlr ARPA_2gram -nrl ARPA_rev_3gram outfile
-.PP
-Convert old binary N-gram file to new format:
-.PP
-    % mkbingram -d old_bingram new_bingram
-
-.SH "SEE ALSO"
-julius(1)
-.SH COPYRIGHT
-Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
-.br
-Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
-.br
-Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology
-.SH AUTHORS
-LEE Akinobu (Nagoya Institute of Technology, Japan)
-.br
-contact: juliu****@lists*****
-.SH LICENSE
-Same as
-.I Julius.
Index: julius4/mkbingram/mkbingram.man.ja
diff -u julius4/mkbingram/mkbingram.man.ja:1.2 julius4/mkbingram/mkbingram.man.ja:removed
--- julius4/mkbingram/mkbingram.man.ja:1.2 Tue Dec 18 23:08:22 2007
+++ julius4/mkbingram/mkbingram.man.ja Thu Oct 2 17:23:48 2008
@@ -1,86 +0,0 @@
-.de Sp
-.if t .sp .5v
-.if n .sp
-..
-.de Ip
-.br
-.ie \\n.$>=3 .ne \\$3
-.el .ne 3
-.IP "\\$1" \\$2
-..
-.TH MKBINGRAM 1 LOCAL
-.UC 6
-.SH NAME
-mkbingram - make binary N-gram from arpa N-gram file
-.SH SYNOPSIS
-.B mkbingram -nlr forward_ngram.arpa -nrl backward_ngram.arpa bingram
-.SH DESCRIPTION
-.I mkbingram
-combines and converts forward/backward N-grams in ARPA format into a
-binary file. Using it greatly speeds up the startup of
-Julius.
-.PP
-From Rev.4.0, N-grams of 4-gram and higher are also supported. The upper limit
-is 10.
-.PP
-When a forward N-gram is specified with "-nlr" and no backward N-gram is
-specified, mkbingram generates a binary N-gram from the forward N-gram alone.
-When using this binary N-gram, Julius performs the 1st pass with the 2-gram
-in it, and in the 2nd pass it performs recognition while computing backward
-probabilities from the forward ones by Bayes' rule.
-.PP
-When a backward N-gram is specified with "-nrl" and no forward N-gram is
-specified, mkbingram generates a binary N-gram from the backward N-gram alone.
-When using this binary N-gram, Julius performs the 1st pass while computing
-probabilities from the backward 2-gram in it by Bayes' rule, and performs the
-2nd pass using the backward N-gram.
-.PP
-When both are specified, a binary N-gram is generated that combines the 2-gram
-in the forward N-gram with the backward N-gram. Julius performs the 1st pass
-with the forward 2-gram and the 2nd pass with the backward N-gram. Note that
-both N-grams must be trained on the same corpus under the same conditions
-(cut-off values, back-off calculation method, etc.) and have the same vocabulary.
-.PP
-.I mkbingram
-can read gzip-compressed ARPA files as they are.
-.PP
-Please note that binary N-gram files converted with the mkbingram bundled with
-Julius 4.0 and later cannot be read by 3.x.
-.SH OPTIONS
-.Ip "-nlr forward_ngram.arpa"
-Forward word N-gram file in ARPA standard format.
-.Ip "-nrl backward_ngram.arpa"
-Reverse-direction word N-gram file in ARPA standard format.
-.Ip "-d binary N-gram"
-Binary N-gram file to use as input (for reconverting an old binary N-gram).
-.Ip "bingram"
-Output file (binary format for Julius).
-.SH EXAMPLE
-Convert ARPA-format N-grams to binary format:
-.PP
-    % mkbingram -nlr ARPA_2gram -nrl ARPA_rev_3gram outfile
-.PP
-Convert an old binary N-gram file to the format of 3.5 and later:
-.PP
-    % mkbingram -d old_bingram new_bingram
-
-.SH USAGE
-When specifying the language model in Julius, instead of "-nlr 2gramfile
--nrl rev3gramfile" for the original ARPA files, specify the binary file
-converted by mkbingram as "-d bingramfile".
-.SH "SEE ALSO"
-julius(1)
-.SH BUGS
-Please send bug reports, inquiries and comments to
-juli****@lists*****.
-.SH COPYRIGHT
-Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
-.br
-Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
-.br
-Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology
-.SH AUTHORS
-Implemented by LEE Akinobu (Nagoya Institute of Technology).
-.SH LICENSE
-.I Julius
-license terms apply.
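
The EXAMPLES sections added in revision 1.3 of both readme files describe three conversions but carry no command lines. A minimal sketch of what those invocations could look like, built only from the options in the new SYNOPSIS; every file name below is a hypothetical placeholder, and the MKBINGRAM variable is an assumption of this sketch that defaults to a dry run (it prints each command line instead of running the tool):

```shell
#!/bin/sh
# Sketch only: file names are placeholders, not files shipped with Julius.
# MKBINGRAM defaults to "echo mkbingram" (dry run: print the command line).
# Set MKBINGRAM=mkbingram to actually run the conversions.
MKBINGRAM="${MKBINGRAM:-echo mkbingram}"

mkbingram_examples() {
    # Forward and backward ARPA N-grams combined into one binary N-gram
    $MKBINGRAM -nlr forward_2gram.arpa -nrl backward_3gram.arpa combined.bingram

    # A single forward 4-gram converted on its own
    $MKBINGRAM -nlr forward_4gram.arpa forward_only.bingram

    # An old (3.x) binary N-gram converted to the current format
    $MKBINGRAM -d old.bingram new.bingram
}

mkbingram_examples
```

Running the script with MKBINGRAM=mkbingram in the environment should perform the conversions for real, provided the input files exist.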
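
The DESCRIPTION text in these diffs says that, when only one direction of N-gram is supplied, the probabilities for the other direction are derived by Bayes' rule. For the 2-gram case this is the textbook identity below, written for a word pair in which w_1 is followed by w_2. It is a sketch of the general relation, not taken from the Julius source, and how Julius treats back-off weights in this step is not stated in the commit. The log form is included on the assumption that the values come from an ARPA file, which stores base-10 log probabilities.

```latex
% forward 2-gram:  P(w_2 \mid w_1)      backward 2-gram:  P(w_1 \mid w_2)
\[
  P(w_1 \mid w_2) \;=\; \frac{P(w_2 \mid w_1)\, P(w_1)}{P(w_2)}
\]
\[
  \log_{10} P(w_1 \mid w_2) \;=\; \log_{10} P(w_2 \mid w_1) + \log_{10} P(w_1) - \log_{10} P(w_2)
\]
```

Swapping the roles of w_1 and w_2 gives the opposite conversion, from a backward 2-gram to the forward 2-gram used in the 1st pass.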