<!DOCTYPE html><html>
<head>
<title>Rule-based machine-translation between Finnish and German</title>
<!--Generated on Tue Mar 13 16:25:56 2018 by LaTeXML (version 0.8.2) http://dlmf.nist.gov/LaTeXML/.-->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" href="../latexml/LaTeXML.css" type="text/css">
<link rel="stylesheet" href="../latexml/ltx-article.css" type="text/css">
</head>
<body>
<div class="ltx_page_main">
<div class="ltx_page_content">
<article class="ltx_document ltx_authors_1line">
<h1 class="ltx_title ltx_title_document">Rule-based machine-translation between Finnish and German</h1>
<div class="ltx_authors">
<span class="ltx_creator ltx_role_author">
<span class="ltx_personname">Tommi A Pirinen
</span></span>
</div>
<div class="ltx_abstract">
<h6 class="ltx_title ltx_title_abstract">Abstract</h6>
<p class="ltx_p">With this poster I present a work-in-progress rule-based machine
translation system between German and Finnish, built on the
Apertium<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><a href="http://wiki.apertium.org/" title="" class="ltx_ref ltx_url ltx_font_typewriter">http://wiki.apertium.org/</a></span></span></span> machine translation
platform (Forcada et al. 2011). The system is composed of the canonical NLP
components of rule-based systems: morphological analysis, chunking, lexical
selection, chunk re-ordering, structural translation and morphological
generation. One of the points that I want to highlight in this poster is the
workflow and the supporting infrastructure for it; unlike a typical coursework or
research project, this machine translation system has been modeled as a <span class="ltx_text ltx_font_italic">lexicon
and grammar engineering while language learning</span> type of project. On the software
engineering side I have developed tools to extend mono- and bilingual lexicons
while learning the out-of-vocabulary (OOV) words (notably, words that are OOV for RBMT are new words
for the language learner) in texts, and I am in the process of extending these tools
to cover the interaction between grammar learning and structural transfer. I have also
modernised the build infrastructure from sf.net SVN to GitHub with full support
for continuous integration and automatic testing, providing an excellent platform
for language learners to extend the lexicons and grammars without fear of
breaking other existing systems that depend on these lexicons and grammars.</p>
<p class="ltx_p">Apertium systems are modular pipelines combining basic NLP tools
with machine translation specific modules. The three lexical modules
that are the most important for system development and language learning are
the two morphological analysers and the lexical translation module, which
correspond to the vocabulary of the learner / MT system. The resources for Finnish
and German morphological analysis were available at the start of the project,
but to my knowledge, this is the first free and open source Finnish-German
bilingual resource of its kind. The other parts of the pipeline are
more specific to Apertium: chunking (shallow syntax parsing),
re-ordering and transfer rules.</p>
</div>
<div id="p2" class="ltx_para">
<ul id="I1" class="ltx_itemize">
<li id="I1.i1" class="ltx_item" style="list-style-type:none;">
<span class="ltx_tag ltx_tag_itemize">•</span>
<div id="I1.i1.p1" class="ltx_para">
<p class="ltx_p">Forcada, M.L. et al. (2011): Apertium: a free/open-source platform for
rule-based machine translation. <em class="ltx_emph">Machine Translation</em> 25(2). 127–144.</p>
</div>
</li>
</ul>
</div>
</article>
</div>
<footer class="ltx_page_footer">
<div class="ltx_page_logo">Generated on Tue Mar 13 16:25:56 2018 by <a href="http://dlmf.nist.gov/LaTeXML/">LaTeXML <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAsAAAAOCAYAAAD5YeaVAAAAAXNSR0IArs4c6QAAAAZiS0dEAP8A/wD/oL2nkwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wKExQZLWTEaOUAAAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAdpJREFUKM9tkL+L2nAARz9fPZNCKFapUn8kyI0e4iRHSR1Kb8ng0lJw6FYHFwv2LwhOpcWxTjeUunYqOmqd6hEoRDhtDWdA8ApRYsSUCDHNt5ul13vz4w0vWCgUnnEc975arX6ORqN3VqtVZbfbTQC4uEHANM3jSqXymFI6yWazP2KxWAXAL9zCUa1Wy2tXVxheKA9YNoR8Pt+aTqe4FVVVvz05O6MBhqUIBGk8Hn8HAOVy+T+XLJfLS4ZhTiRJgqIoVBRFIoric47jPnmeB1mW/9rr9ZpSSn3Lsmir1fJZlqWlUonKsvwWwD8ymc/nXwVBeLjf7xEKhdBut9Hr9WgmkyGEkJwsy5eHG5vN5g0AKIoCAEgkEkin0wQAfN9/cXPdheu6P33fBwB4ngcAcByHJpPJl+fn54mD3Gg0NrquXxeLRQAAwzAYj8cwTZPwPH9/sVg8PXweDAauqqr2cDjEer1GJBLBZDJBs9mE4zjwfZ85lAGg2+06hmGgXq+j3+/DsixYlgVN03a9Xu8jgCNCyIegIAgx13Vfd7vdu+FweG8YRkjXdWy329+dTgeSJD3ieZ7RNO0VAXAPwDEAO5VKndi2fWrb9jWl9Esul6PZbDY9Go1OZ7PZ9z/lyuD3OozU2wAAAABJRU5ErkJggg==" alt="[LOGO]"></a>
</div></footer>
</div>
</body>
</html>