... the word order in target language. To this end, we propose a simple but effective ranking-based approach to word reordering. The ranking model is automatically derived from the word aligned ... baseline system for In order to show whether the improved performance is really due to improved reordering, we would like to measure the reorder performance directly. Reorder...
Uploaded: 19/02/2014, 19:20
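... $P(e \mid f) \sim p_{\text{lm-word}}(e_{\text{word}}) \cdot p_{\text{lm-suffix}}(e_{\text{suffix}}) \cdot \sum_{i=1}^{n} p(e_{\text{word-}j} \,\&\, e_{\text{suffix-}j} \mid f_j) \cdot \sum_{i=1}^{n} p(f_j \mid e_{\text{word-}j} \,\&\, e_{\text{suffix-}j})$, where $p_{\text{lm-word}}$ is the n-gram language model probability over the word surface ... surface forms. Similarly, $p_{\text{lm-suffix}}(e_{\text{suffix}})$ is the language model probability over suffix sequences. $p(e_{\text{word-}j} \,\&\, e_{\text{suffix-}j} \mid f_j)$ and $p(f_j \mid e_{\text{word-}j} \,\&\, e_{\text{suffix-}j})$ are translation probabilit...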
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "A Discriminative Syntactic Word Order Model for Machine Translation" pdf
... the local tree order model outperformed either model by a large margin. This indicates that combining syntactic (from the LTOM model) and surface-based (from the language model) information is very ... language-model like features discriminatively to optimize ordering performance, is indeed worthwhile. Next we compare data set First-pass models Model Lang Model (Permutations) Lan...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation" pptx
... Such a segmentation process in the training stage facilitates the utilisation of word lattice decoding. Bootstrapped word segmentation: Once the candidates are extracted, we perform word segmentation ... Stanford segmenters utilise machine learning techniques, with Hidden Markov Models for ICT (Zhang et al., 2003) and conditional random fields for the Stanford segmenter (Tseng...
Uploaded: 22/02/2014, 02:20
Scientific report: "Tailoring Word Alignments to Syntactic Machine Translation" docx
... choice in stage is to STOP at the current leaf, then stage and are unnecessary. Hence, a choice to STOP immediately is a choice to emit another foreign word from the current English word. We flatten ... to the third line. Conclusion: In light of the need to reconcile word alignments with phrase structure trees for syntactic MT, we have proposed an HMM-like model whose distort...
Uploaded: 08/03/2014, 02:21
Scientific report: "Word Sense Disambiguation Improves Statistical Machine Translation" docx
... techniques for lexical selection in statistical machine translation. Technical report, University of Maryland. M. Carpuat and D. Wu. 2005. Word sense disambiguation vs. statistical machine translation. In Proc. ... classification for word sense disambiguation with a kernel PCA model. In Proc. of SENSEVAL-3, pages 88–92. D. Vickrey, L. Biewald, M. Teyssier, and D. Koller. 2005. Word-sense disa...
Uploaded: 08/03/2014, 02:21
Scientific report: "Multi-Engine Machine Translation Guided by Explicit Word Matching" docx
... the word-alignment matcher provides three main benefits. First, it explicitly identifies translated words that appear in multiple MT translations, allowing the MEMT algorithm to reinforce words ... hypotheses is extended by incorporating an additional word from one of the original translations. For each partial hypothesis, a data structure keeps track of the words from the original transl...
Uploaded: 08/03/2014, 04:22
Scientific report: "Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages" ppt
... alignments for MK–EN and EN–BG, from which we extracted four conditional lexical translation probabilities: Pr(m|e) and Pr(e|m) for MK–EN, and Pr(b|e) and Pr(e|b) for EN–BG, where m, e, and b stand for ... of character- and word-level translation models for translating between closely-related languages with scarce resources. In future work, we want to use such a m...
Uploaded: 23/03/2014, 14:20
Scientific report: "Pseudo-word for Phrase-based Machine Translation" pot
... Baseline Performance: Our baseline system feeds words into the PB-SMT pipeline. We use the GIZA++ model for word alignment and Moses for phrase-based decoding. The setting of language model order for each ... of machine translation: Parameter estimation. Computational Linguistics, 19:263–312. P.-C. Chang, M. Galley, and C. D. Manning. 2008. Optimizing Chinese word segmentation for machine translat...
Uploaded: 23/03/2014, 16:20
Scientific report: "Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages" docx
... Koehn, and Ivona Kučerová. 2005. Clause restructuring for statistical machine translation. In Proc. of ACL, pages 531–540. Yonggang Deng and Bowen Zhou. 2009. Optimizing word alignment combination ... estimation. Computational Linguistics, 19(2):263–311. In this work, we have presented a word alignment combination method that improves both the alignment quality and the t...
Uploaded: 23/03/2014, 16:20
Scientific report: "A Word-Class Approach to Labeling PSCFG Rules for Machine Translation" pot
... labeling phrase pairs to create automatically learned PSCFG rules for machine translation. Crucially, our methods only rely on “shallow” lexical tags, either generated by POS taggers or by automatic ... unsupervised clustering approaches, we first need to decide how to determine the number of word classes, N. A straightforward approach is to run experiments and report test se...
Uploaded: 23/03/2014, 16:20
Scientific report: "Post-ordering by Parsing for Japanese-English Statistical Machine Translation" potx
... a max-chart-span 15 for the hierarchical phrase-based SMT. We used distortion limits of 12 or 20 for PBMT and a max-chart-span 15 for HPBMT. The parameters for SMT were tuned by MERT using the first ... Feature Forest Models for Probabilistic HPSG Parsing. In Computational Linguistics, Volume 34, Number 1, pages 81–88. Slav Petrov and Dan Klein. 2007. Improved Inference for Unlexical...
Uploaded: 30/03/2014, 17:20
Scientific report: "Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation" docx
... progress in language modeling. Technical report, Microsoft Research. Reinhard Kneser and Hermann Ney. 1993. Improved clustering techniques for class-based statistical language modelling. In Proceedings ... the largest improvements using a specific clustering for the last word of each trigram but no clustering at all for the first two word positions. Generalizing this leads...
Uploaded: 31/03/2014, 00:20
Scientific report: "Improved Word-Level System Combination for Machine Translation" doc
... - the TER tuned combination is the best in terms of TER, the BLEU tuned in terms of BLEU, and the METEOR tuned in ... [Table rows, Chinese test: system A to system F, no weights] ... and BLEU, and lower-case METEOR scores on Arabic NIST MT03+MT04 [Table rows, Arabic test: system A to system F, no weights, baseline, TER tuned, BLEU tune...]
Uploaded: 31/03/2014, 01:20
Scientific report: "Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora" potx
... sentence-aligned data. 5.4 Ratio of word- to sentence-aligned data: We also varied the ratio of word-aligned to sentence-aligned data, and evaluated the AER and Bleu scores, and assigned high value to ... weight the relative contributions of the word-aligned and sentence-aligned data, and relating it to the ratio experiments. Showing that improvements to AER and trans...
Uploaded: 31/03/2014, 03:20