<!DOCTYPE html><html>
<head>
<title>An Example of Universal dependencies for Uralic languages</title>
<!--Generated on Fri Sep 29 15:46:02 2017 by LaTeXML (version 0.8.2) http://dlmf.nist.gov/LaTeXML/.-->
<!--Document created on .-->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" href="../latexml/LaTeXML.css" type="text/css">
<link rel="stylesheet" href="../latexml/ltx-article.css" type="text/css">
</head>
<body>
<div class="ltx_page_main">
<div class="ltx_page_content">
<article class="ltx_document ltx_authors_1line">
<h1 class="ltx_title ltx_title_document">Intermediate representation in rule-based machine translation for the Uralic languages
<br class="ltx_break">This work is licensed under a Creative Commons Attribution-NoDerivatives
4.0 International Licence. Licence details:
http://creativecommons.org/licenses/by-nd/4.0/.
Original publication in the proceedings of the second IWCLUL, held in Szeged in 2016</h1>
<div class="ltx_authors">
<span class="ltx_creator ltx_role_author">
<span class="ltx_personname"><span class="ltx_text" style="font-size:90%;">Francis M. Tyers,
<br class="ltx_break">HSL-fakultehta
<br class="ltx_break">UiT Norgga árktalaš universitehta
<br class="ltx_break"><span class="ltx_text ltx_font_typewriter">[email protected]</span></span>
</span></span>
<span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author">
<span class="ltx_personname"><span class="ltx_text" style="font-size:90%;">Tommi A. Pirinen
<br class="ltx_break">ADAPT Centre
<br class="ltx_break">School of Computing,
<br class="ltx_break">Dublin City University
<br class="ltx_break"><span class="ltx_text ltx_font_typewriter">[email protected]</span></span>
</span></span>
</div>
<div class="ltx_date ltx_role_creation"></div>
<div class="ltx_abstract">
<h6 class="ltx_title ltx_title_abstract">Abstract</h6>
<p class="ltx_p">This paper presents some of the major obstacles and challenges in creating machine translation systems
between Uralic languages where the intermediate representation is based on morphology and syntax. The Uralic languages are alike in many ways: they have similar case inventories, word order and non-finite clause forms. However, current rule-based grammatical resources take many different approaches to encoding this information. These approaches are sometimes based on legacy or traditional grammatical description, which is important for making the tools comfortable for linguists, but sometimes on arbitrary and incompatible decisions. This paper presents an overview of some of the issues in working with existing tools and representations and provides some guidelines and suggestions to facilitate future work.</p>
</div>
<section id="S1" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">1 </span>Introduction</h2>
<div id="S1.p1" class="ltx_para">
<p class="ltx_p">Creating <span class="ltx_text ltx_font_italic">rule-based machine translation</span> (RBMT) systems is a process where one creates a mapping between units of the source language and units of the target language.
The units differ depending on the approach to the problem, on a scale that runs from translating word-forms to word-forms up to translating via an intermediate abstract universal language, or an <span class="ltx_text ltx_font_italic">interlingua</span>.
In this article we study the approach of using just morphological analysis with the Uralic languages.
The problem with such a system is that, even when the morphologies of the closely related Uralic languages are expected to match, there are often engineering issues that make the work more tedious and cumbersome than necessary.
Minimising the amount of simple engineering work is vital for making rule-based machine translation attractive to linguists and programmers alike.</p>
</div>
<div id="S1.p2" class="ltx_para">
<p class="ltx_p">The rest of the article is structured as follows: first we describe the background of the problem in Section <a href="#S2" title="2 Background" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2</span></a>, then we introduce the resources we use in Section <a href="#S3" title="3 Resources" class="ltx_ref"><span class="ltx_text ltx_ref_tag">3</span></a>. We evaluate strategies for fixing systematic mismatches in Section <a href="#S4" title="4 Strategies" class="ltx_ref"><span class="ltx_text ltx_ref_tag">4</span></a> and cover specific linguistic issues in Section <a href="#S5" title="5 Specific linguistic issues" class="ltx_ref"><span class="ltx_text ltx_ref_tag">5</span></a>. We suggest some common best practices in Section <a href="#S6" title="6 Guidelines" class="ltx_ref"><span class="ltx_text ltx_ref_tag">6</span></a>, in Section <a href="#S7" title="7 Universal dependencies" class="ltx_ref"><span class="ltx_text ltx_ref_tag">7</span></a> we briefly describe universal parts-of-speech and morphological features, and finally in Section <a href="#S8" title="8 Concluding remarks" class="ltx_ref"><span class="ltx_text ltx_ref_tag">8</span></a> we provide some short concluding remarks.</p>
</div>
</section>
<section id="S2" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">2 </span>Background</h2>
<div id="S2.p1" class="ltx_para">
<p class="ltx_p">RBMT is a popular way of developing high-quality machine translation between related languages <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib6" title="Apertium: a free/open-source platform for rule-based machine translation platform" class="ltx_ref">1</a>]</cite>.
An RBMT system for related languages can be built rapidly, as has been done for, e.g., Dutch and Afrikaans <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib145" title="Rapid rule-based machine translation between dutch and afrikaans" class="ltx_ref">5</a>]</cite>.
Wide-coverage machine translation requires wide-coverage lexical resources for the languages involved.
Developing an analyser to a stage where it is usable by multiple applications, including RBMT, can take years, so it is often a good idea to use readily available resources instead of writing a new analyser from scratch.
However, the majority of existing analysers are made with language-dependent annotation systems, which unnecessarily complicate the description of machine translation.
It should be clear that if two related languages use the same morphological and syntactic structures to describe a phenomenon, a rule mapping between the two should be entirely trivial.
This is not the case with most off-the-shelf morphological analysers for contemporary Uralic languages. Table <a href="#S2.T1" title="Table 1 ‣ 2 Background" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a> shows an example of the morphological annotation of five Uralic languages for a simple five-word sentence.</p>
</div>
<figure id="S2.T1" class="ltx_table">
<p class="ltx_p"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_align_middle">
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">Mary</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+N+Prop+Sem/Mal+Sg+Nom</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+CC</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+N+Prop+Sem/Fem+Sg+Nom</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">leaba</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">gárdimis</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">.</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+V+IV+Ind+Prs+Du3</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+N+Sg+Loc</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+CLB</span></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_th_row"><em class="ltx_emph" style="font-size:70%;">Джеймс</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">марто</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Мария</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">+N+Prop+Sem/Mal+Sg+Nom+Indef</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">марто+Po+COM</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+N+Prop+Sem/Fem+Pl+Nom+Indef</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><em class="ltx_emph" style="font-size:70%;">садпиресэть</em></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">.</span></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">+N+SP+Ine+Indef+Der/Pr+V+Ind+Prs+ScPl3</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+CLB</span></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ovat</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">N Prop Nom Sg</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Part</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">N Prop Nom Sg</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">V Prs Act Pl3</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">puutarhassa</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">.</em></span>
<span class="ltx_td"></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">N Ine Sg</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Punct</span></span>
<span class="ltx_td"></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">on</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+H+sg+nom</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+J</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+H+sg+nom</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+V+indic+pres+ps3+pl+ps+af</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">aias</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">.</em></span>
<span class="ltx_td"></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">+S+sg+in</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">.</span></span>
<span class="ltx_td"></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_align_middle">
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">és</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">Mary</em></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">a</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/CONJ</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/ART</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">kértben</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">vannak</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">.</em></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/ADJ&lt;CAS&lt;INE&gt;&gt;</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/VERB&lt;PLUR&gt;</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/PUNCT</span></span>
<span class="ltx_td"></span></span>
</span>
</span><span class="ltx_text" style="font-size:70%;">
<span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
</span></p>
<figcaption class="ltx_caption" style="font-size:70%;"><span class="ltx_tag ltx_tag_table">Table 1: </span>Translations of the sentence ‘James and Mary are in the garden.’ in several Uralic languages (North Sámi, Erzya, Finnish, Estonian, Hungarian) with the tag strings used in their morphological analysers. There are examples of real morphosyntactic differences (compare the third-person dual in North Sámi with the third-person plural in other languages) and arbitrary tag differences (compare the tag that the word for <em class="ltx_emph">and</em> receives in the different languages).</figcaption>
</figure>
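<div class="ltx_para">
<p class="ltx_p">The divergence shown in Table 1 can be made concrete with a small Python sketch. It assumes that the tag strings of the copula verb are copied by hand from the table (Erzya is left out because its verbal reading is derived from a noun). The names <span class="ltx_text ltx_font_typewriter">VERB_TAGS</span>, <span class="ltx_text ltx_font_typewriter">SPLITTERS</span>, <span class="ltx_text ltx_font_typewriter">tag_set</span> and <span class="ltx_text ltx_font_typewriter">shared_tags</span> are hypothetical and belong to none of the analysers.</p>
</div>

```python
import re

# Tag strings of the copula verb in Table 1, copied by hand (an assumption of
# this sketch; a real system would obtain them by running each analyser).
VERB_TAGS = {
    "North Sámi": "+V+IV+Ind+Prs+Du3",
    "Finnish": "V Prs Act Pl3",
    "Estonian": "+V+indic+pres+ps3+pl+ps+af",
    "Hungarian": "/VERB<PLUR>",
}

# Each analyser delimits its tags differently, so each needs its own splitter.
SPLITTERS = {
    "North Sámi": lambda s: s.strip("+").split("+"),
    "Finnish": lambda s: s.split(),
    "Estonian": lambda s: s.strip("+").split("+"),
    "Hungarian": lambda s: [t for t in re.split(r"[/<>]", s) if t],
}


def tag_set(language):
    """Return the set of individual tags the analyser gives the verb."""
    return set(SPLITTERS[language](VERB_TAGS[language]))


def shared_tags(languages):
    """Return the tags that are spelled identically in all given languages."""
    return set.intersection(*(tag_set(lang) for lang in languages))


if __name__ == "__main__":
    print(shared_tags(["North Sámi", "Finnish"]))
    print(shared_tags(VERB_TAGS))
```

<div class="ltx_para">
<p class="ltx_p">In this sketch North Sámi and Finnish share only the spellings V and Prs, and no tag is spelled the same way in all four analysers, so even a trivial rule over the verb needs a mapping between tagsets.</p>
</div>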
<section id="S2.SS1" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">2.1 </span>Intermediate representations</h3>
<figure id="S2.F1" class="ltx_figure"><svg id="S2.F1.pic1" class="ltx_picture ltx_centering" style="width:166.4pt;height:96.3pt;" height="133.21" overflow="visible" version="1.1" viewBox="-113.6 -119.6 230.19 133.21" width="230.19"><g transform="matrix(1 0 0 -1 0 -105.99)"><g stroke="#000000"><g fill="#000000"><g color="#000000" stroke-width="0.4pt"><g transform="matrix(1 0 0 1 -57.08 -3.46)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 12.45)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="114.15">
<p class="ltx_p"><span class="ltx_text ltx_font_sansserif">interlingua</span></p></foreignObject></switch></g></g><g transform="matrix(1 0 0 1 -68.82 -55.37)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 2.77)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="100%">
<p class="ltx_p"></p></foreignObject></switch></g></g><g transform="matrix(1 0 0 1 -106.22 -94.74)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 2.77)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="100%">
<p class="ltx_p"></p></foreignObject></switch></g></g><g transform="matrix(1 0 0 1 106.22 -94.74)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 2.77)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="100%">
<p class="ltx_p"></p></foreignObject></switch></g></g><g transform="matrix(1 0 0 1 68.82 -55.37)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 2.77)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="100%">
<p class="ltx_p"></p></foreignObject></switch></g></g><g stroke-width="0.6pt"><g stroke-dasharray="none" stroke-dashoffset="0.0pt"><g stroke-linejoin="miter"></g></g><path d="M -13.59 -12.12 L -101.34 -90.39" style="fill:none"></path><g transform="matrix(0.747 0.66626 -0.66626 0.747 -13.59 -12.12)"><path d="M -7.83 3.51 L 0.45 0 L -7.83 -3.51 Z"></path></g><g transform="matrix(0.747 0.66626 -0.66626 0.747 -93.1 -72.59)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 12.45)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="83.02">
<p class="ltx_p"><span class="ltx_text ltx_font_sansserif">analysis</span></p></foreignObject></switch></g></g></g><g stroke-width="0.6pt"><path d="M 12.46 -11.12 L 100.21 -89.38" style="fill:none"></path><g transform="matrix(0.747 -0.66626 0.66626 0.747 100.21 -89.38)"><path d="M -7.83 3.51 L 0.45 0 L -7.83 -3.51 Z"></path></g><g transform="matrix(0.747 -0.66626 0.66626 0.747 23.33 -10.36)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 12.45)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="103.78">
<p class="ltx_p"><span class="ltx_text ltx_font_sansserif">generation</span></p></foreignObject></switch></g></g></g><g stroke-width="0.6pt"><path d="M -101.34 -94.74 L 99.82 -94.74" style="fill:none"></path><g transform="matrix(1 0 0 1 99.82 -94.74)"><path d="M -7.83 3.51 L 0.45 0 L -7.83 -3.51 Z"></path></g><g transform="matrix(1 0 0 1 -31.13 -109.46)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 12.45)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="62.27">
<p class="ltx_p"><span class="ltx_text ltx_font_sansserif">direct</span></p></foreignObject></switch></g></g></g><g stroke-width="0.6pt"><path d="M -63.93 -55.37 L 62.42 -55.37" style="fill:none"></path><g transform="matrix(1 0 0 1 62.42 -55.37)"><path d="M -7.83 3.51 L 0.45 0 L -7.83 -3.51 Z"></path></g><g transform="matrix(1 0 0 1 -41.51 -70.09)"><g class="ltx_svg_fog" transform="matrix(1 0 0 -1 0 12.45)"><switch><foreignObject color="#000000" height="100%" overflow="visible" width="83.02">
<p class="ltx_p"><span class="ltx_text ltx_font_sansserif">transfer</span></p></foreignObject></switch></g></g></g></g></g></g></g></svg>
<figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 1: </span>The Vauquois triangle, which illustrates the amount of transfer needed at different levels of intermediate representation.</figcaption>
</figure>
<div id="S2.SS1.p1" class="ltx_para">
<p class="ltx_p">In machine translation, an intermediate representation is an abstraction away from the surface forms of the language. Figure <a href="#S2.F1" title="Figure 1 ‣ 2.1 Intermediate representations ‣ 2 Background" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a> shows the Vauquois triangle, a common illustration of different levels of intermediate representation.</p>
</div>
<div id="S2.SS1.p2" class="ltx_para">
<p class="ltx_p">At the bottom of the triangle, there is no intermediate representation and translation is performed on a word-for-word basis. At the apex of the triangle is interlingual translation, where the source language is first mapped to a language-independent semantic representation, and this representation is then used to generate the target language.</p>
</div>
<div id="S2.SS1.p3" class="ltx_para">
<p class="ltx_p">In the middle is (morpho-)syntactic transfer. Here the source language is analysed to a language-dependent intermediate representation (usually based on a combination of syntactic structure and morphosyntactic features) and then transfer rules are applied to convert the source language intermediate representation to one compatible with the target-language generation component.</p>
</div>
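<div class="ltx_para">
<p class="ltx_p">The three steps of (morpho-)syntactic transfer can be sketched as a toy Python pipeline. This is only an illustration under stated assumptions: the one-entry lexicons are built from the Finnish form <em class="ltx_emph">puutarhassa</em> (N Ine Sg) and the Estonian form <em class="ltx_emph">aias</em> (+S+sg+in) in Table 1, the lemmas <em class="ltx_emph">puutarha</em> and <em class="ltx_emph">aed</em> are supplied by us, and all table and function names are hypothetical. A real system would call the morphological analysers and generators instead of looking forms up in dictionaries.</p>
</div>

```python
# Toy one-entry lexicons standing in for a real analyser and generator
# (hypothetical; forms and tags are taken from Table 1, lemmas are assumed).
FIN_ANALYSES = {"puutarhassa": ("puutarha", ("N", "Ine", "Sg"))}
EST_FORMS = {("aed", ("S", "sg", "in")): "aias"}

# Transfer tables from the Finnish scheme to the Estonian scheme.
LEMMA_TRANSFER = {"puutarha": "aed"}
TAG_TRANSFER = {"N": "S", "Sg": "sg", "Ine": "in"}

# The Estonian analyser orders its tags differently (+S+sg+in).
EST_TAG_ORDER = ("S", "sg", "in")


def analyse(form):
    """Analysis: surface form -> (lemma, source-language tags)."""
    return FIN_ANALYSES[form]


def transfer(lemma, tags):
    """Transfer: map lemma and tags to the target-language representation."""
    mapped = {TAG_TRANSFER[tag] for tag in tags}
    ordered = tuple(tag for tag in EST_TAG_ORDER if tag in mapped)
    return LEMMA_TRANSFER[lemma], ordered


def generate(lemma, tags):
    """Generation: (lemma, target-language tags) -> surface form."""
    return EST_FORMS[(lemma, tags)]


def translate(form):
    """Run analysis, transfer and generation in sequence."""
    return generate(*transfer(*analyse(form)))


if __name__ == "__main__":
    print(translate("puutarhassa"))
```

<div class="ltx_para">
<p class="ltx_p">Even in this minimal case the transfer step has to rename every tag and reorder them, although both languages express the same inessive singular noun.</p>
</div>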
</section>
</section>
<section id="S3" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">3 </span>Resources</h2>
<div id="S3.p1" class="ltx_para">
<p class="ltx_p">In this paper we make use of five sets of linguistic data for five different Uralic languages: Finnish, North Sámi, Erzya, Estonian and Hungarian.
We take the North Sámi and Erzya data from the Giellatekno language technology repository.<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup>http://giellatekno.uit.no</span></span></span>
The North Sámi data has primarily been developed by the Divvun and Giellatekno groups at UiT Norgga árktalaš universitehta and the Erzya data has been developed by Jack Rueter at Helsingin yliopisto <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib116" title="Adnominal person in the morphological system of erzya" class="ltx_ref">9</a>]</cite>.
For the Estonian data, we use the <em class="ltx_emph">plamk</em> analyser<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">2</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">2</sup>https://github.com/jjpp/plamk</span></span></span> written by Jaak Pruulmann-Vengerfeldt; for Finnish, <em class="ltx_emph">omorfi</em> <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib194" title="Omorfi–free and open source morphological lexical database for Finnish" class="ltx_ref">6</a>]</cite>;<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">3</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">3</sup>https://github.com/flammie/omorfi</span></span></span> and for Hungarian, <em class="ltx_emph">hunmorph</em> <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib136" title="Hunmorph: open source word analysis" class="ltx_ref">10</a>]</cite>.<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">4</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">4</sup>http://mokk.bme.hu/resources/hunmorph/</span></span></span></p>
</div>
</section>
<section id="S4" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">4 </span>Strategies</h2>
<div id="S4.p1" class="ltx_para">
<p class="ltx_p">There are different ways to fix systematic mismatches.
We evaluate the following:</p>
</div>
<section id="S4.SS1" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">4.1 </span>Relabelling</h3>
<div id="S4.SS1.p1" class="ltx_para">
<p class="ltx_p">An obvious approach to getting around the problem of divergent tagsets is to simply perform relabelling. This is where you replace the canonical tags in one language with their equivalents in the other language, or with a common equivalent in both languages.</p>
</div>
<div id="S4.SS1.p2" class="ltx_para">
<table class="ltx_tabular ltx_centering ltx_align_middle">
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td ltx_align_center">+CC <math id="S4.SS1.p2.m1" class="ltx_Math" alttext="\rightarrow" display="inline"><mo>→</mo></math> &lt;cnjcoo&gt; <math id="S4.SS1.p2.m2" class="ltx_Math" alttext="\leftarrow" display="inline"><mo>←</mo></math> +J+Coord</td>
</tr>
</tbody>
</table>
</div>
<div id="S4.SS1.p3" class="ltx_para">
<p class="ltx_p">However, this solution has its disadvantages.
Even though both +J and +CC are used for conjunctions, the <em class="ltx_emph">plamk</em> tag +J is also used for subordinating and other conjunctions, while the Giellatekno tag +CC excludes those. Relabelling +J+Coord to +CC and any other +J to +CS might work in the analyser, but it will not work in a disambiguation rule saying “select the noun reading if the word to the right is tagged +J”. Here we need to relabel +J to (+CS or +CC). In the opposite direction, +CS would need to be relabelled to (+J but not +Coord). The distinction between these may be irrelevant for the translation process (in all cases, <em class="ltx_emph">ja</em> in North Sámi will be translated to <em class="ltx_emph">ja</em> in Estonian), but for the intervening grammatical tools it may be vital to make (or not make) the distinction.</p>
</div>
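<div class="ltx_para">
<p class="ltx_p">The asymmetry between relabelling analyses and relabelling rule contexts can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names and the list-of-tags representation are hypothetical, and only the tags +J, +Coord, +CC and +CS are taken from the discussion above.</p>
</div>

```python
def relabel_reading(tags):
    """Relabel one analysis (a list of tags) from +J/+Coord to +CC or +CS."""
    if "+J" not in tags:
        return list(tags)
    # +J+Coord becomes +CC; any other +J becomes +CS.
    target = "+CC" if "+Coord" in tags else "+CS"
    kept = [t for t in tags if t not in ("+J", "+Coord")]
    return kept + [target]


def relabel_rule_context(tag):
    """Expand a tag mentioned in a rule context into the set of tags it must match."""
    if tag == "+J":
        # A bare +J in a rule covers every conjunction, so both targets are needed.
        return {"+CC", "+CS"}
    return {tag}
```

<div class="ltx_para">
<p class="ltx_p">An analysis maps to exactly one target tag, whereas a rule context has to be widened to a set of tags, which is why a single relabelling table does not serve both purposes.</p>
</div>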
</section>
<section id="S4.SS2" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">4.2 </span>Interlingua</h3>
<div id="S4.SS2.p1" class="ltx_para">
<p class="ltx_p">Another potential solution is to use a semantic interlingua (see the description in Section <a href="#S2.SS1" title="2.1 Intermediate representations ‣ 2 Background ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2.1</span></a>). This is the approach adopted by the machine translation system based on Grammatical Framework <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib117" title="Grammatical framework: programming with multilingual grammars" class="ltx_ref">8</a>]</cite>.<span class="ltx_note ltx_role_footnote"><sup class="ltx_note_mark">5</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">5</sup>http://grammaticalframework.org</span></span></span> In this framework there is no direct transfer of morphological features.</p>
</div>
</section>
</section>
<section id="S5" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">5 </span>Specific linguistic issues</h2>
<div id="S5.p1" class="ltx_para">
<p class="ltx_p">There are a number of linguistic issues in RBMT.
We cover the following in detail:</p>
</div>
<section id="S5.SS1" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">5.1 </span>Copula</h3>
<div id="S5.SS1.p1" class="ltx_para">
<p class="ltx_p">There are two main copula constructions in the Uralic languages. The first functions more or less as in the Germanic languages: the copula is a normal verb that agrees with the subject. The second works as in the Turkic languages: the copula typically does not surface in the third-person singular present tense. In our examples, North Sámi, Finnish and Estonian are of the Germanic type, while Hungarian and Erzya are of the Turkic type.</p>
</div>
<div id="S5.SS1.p2" class="ltx_para">
<table class="ltx_tabular ltx_centering ltx_align_middle">
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td"></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">‘She is a student.’</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">‘She was a student.’</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Son lea studeanta.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Son lei studeanta.</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Сон студент.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Сон студентель.</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hän on opiskelija.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hän oli opiskelija.</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Ta on üliõpilane.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Ta oli üliõpilane.</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Ő hallgató.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Ő hallgató volt.</span></td>
</tr>
</tbody>
</table>
</div>
<div id="S5.SS1.p3" class="ltx_para">
<p class="ltx_p">In North Sámi, Finnish and Estonian, the treatment of <em class="ltx_emph">lea, on</em> is similar. It is a verb which inflects and agrees like other verbs.</p>
</div>
<div id="S5.SS1.p4" class="ltx_para">
<p class="ltx_p">There are divergences when we look at the Erzya and Hungarian examples. Although they have the same structure (zero copula in the present tense and a surfacing copula in the past tense), the analysers treat them differently. The morphological analyser for Erzya treats the copula as a derivation:</p>
</div>
<div id="S5.SS1.p5" class="ltx_para">
<p class="ltx_p">студент+N+Sg+Nom+Indef+Der/Pr+V+Ind+Prs+ScSg3</p>
</div>
<div id="S5.SS1.p6" class="ltx_para">
<p class="ltx_p">In Hungarian, by contrast, the copula is simply omitted in the present (if it surfaced it would be <em class="ltx_emph">van</em>), and in the past it is considered a verb form.</p>
</div>
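<div class="ltx_para">
<p class="ltx_p">One way to bring the Erzya analysis closer to the other languages is to split it at the predicative derivation tag, giving a nominal part and a verbal part that later tools can treat separately. The following is a minimal sketch, assuming that analyses are plain strings of +-separated tags and that +Der/Pr marks the boundary as in the example above; the function name is hypothetical.</p>
</div>

```python
def split_predicative(analysis, marker="+Der/Pr"):
    """Split an analysis string at the predicative derivation marker.

    Returns (nominal_part, verbal_tags); verbal_tags is None when the
    marker is absent, i.e. when no copula is encoded in the analysis.
    """
    nominal, found, verbal = analysis.partition(marker)
    if not found:
        return nominal, None
    # Strip the leading '+' so that splitting yields no empty first tag.
    return nominal, verbal.lstrip("+").split("+")


erzya = "студент+N+Sg+Nom+Indef+Der/Pr+V+Ind+Prs+ScSg3"
nominal, verbal = split_predicative(erzya)
```

<div class="ltx_para">
<p class="ltx_p">After the split, the verbal tags can be handled like the copula readings of the other languages, while an analysis without the marker is returned unchanged.</p>
</div>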
</section>
<section id="S5.SS2" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">5.2 </span>Non-finite verb forms</h3>
<div id="S5.SS2.p1" class="ltx_para">
<p class="ltx_p">Non-finite verb forms are infinitives and participles on the one hand and derivations on the other. Their number differs between languages, and their roles range from syntactic arguments of constructions to derived words; a wide range of analyses is used to accommodate this. Some of the differences are shown in Table <a href="#S5.T2" title="Table 2 ‣ 5.2 Non-finite verb forms ‣ 5 Specific linguistic issues ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2</span></a>.</p>
</div>
<figure id="S5.T2" class="ltx_table">
<table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle">
<thead class="ltx_thead">
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Language</span></th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Sentence</span></th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Non-finite tag</span></th>
</tr>
</thead>
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td ltx_border_t"></td>
<td class="ltx_td ltx_align_left ltx_border_t"><span class="ltx_text" style="font-size:70%;">‘I see the man who is running’</span></td>
<td class="ltx_td ltx_border_t"></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Oidnen dievddu viehkame</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Actio+Ess</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Неян цёранть, конась чийни.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/Ы+ActPrcShort+A</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Näen miehen juoksemassa.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">InfMA+Ine</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Näen meest, kes jookseb.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">—</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Látom a futó embert.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/VERB[IMPERF_PART]/ADJ</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_border_t"></td>
<td class="ltx_td ltx_align_left ltx_border_t"><span class="ltx_text" style="font-size:70%;">‘While running I saw the man’</span></td>
<td class="ltx_td ltx_border_t"></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Oidnen dievddu viegadettiinan.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Ger+Px1Sg</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Неян чийниця цёранть.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/Ыця+ActDemPrc+A</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Näin miehen juostessani.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">InfE+Ine+PxSg1</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Jooksmise ajal nägin ma meest.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/mine+Gen</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Futás közben láttam az embert.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/VERB[GERUND]/NOUN</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_border_t"></td>
<td class="ltx_td ltx_align_left ltx_border_t"><span class="ltx_text" style="font-size:70%;">‘I see the running man.’</span></td>
<td class="ltx_td ltx_border_t"></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Oainnán viehkki dievddu.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PrsPrc</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Чийнемась седень кецявты.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/ОмА+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Näen juoksevan miehen.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PrsPrc</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Näen jooksvat meest.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/v+A+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Látom a futó embert.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/VERB[IMPERF_PART]/ADJ</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_border_t"></td>
<td class="ltx_td ltx_align_left ltx_border_t"><span class="ltx_text" style="font-size:70%;">‘Running is fun.’</span></td>
<td class="ltx_td ltx_border_t"></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Viehkan lea suohtas.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Actio+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Мелезэнь тукшны чийнемась.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/ОмА+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Juokseminen on kivaa.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/minen+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Jooksmine on lahe.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/mine+Nom</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">A futás jó dolog.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">/VERB[GERUND]/NOUN</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_border_t"></td>
<td class="ltx_td ltx_align_left ltx_border_t"><span class="ltx_text" style="font-size:70%;">‘I like running.’</span></td>
<td class="ltx_td ltx_border_t"></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Liikon viehkat.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Inf</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Чийнемстэ неия цёранть.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Inf+Ela</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Pidän juoksemisesta.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Der/minen+Ela</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Mulle meeldib joosta.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Inf</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left ltx_border_b"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left ltx_border_b"><span class="ltx_text" style="font-size:70%;">Szeretem futni.</span></td>
<td class="ltx_td ltx_align_left ltx_border_b"><span class="ltx_text" style="font-size:70%;">/VERB&lt;INF&gt;</span></td>
</tr>
</tbody>
</table>
<figcaption class="ltx_caption ltx_centering" style="font-size:70%;"><span class="ltx_tag ltx_tag_table">Table 2: </span>Examples of the use and tagging of non-finite verb forms in the languages in our sample. It is not to be expected that the tags are completely equivalent, but for example, given the similarity in structure, should there be a difference in annotation between Finnish PrsPrc and Estonian Der/v+A?</figcaption>
</figure>
</section>
<section id="S5.SS3" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">5.3 </span>Derivation, compounding and lexicalisation</h3>
<div id="S5.SS3.p1" class="ltx_para">
<p class="ltx_p">A classical problem in computational morphologies lies in the question of lexicalisation and the productivity of certain processes: is a morphologically created word-form a new word, or a form of a possibly distant root?
Morphologies take widely different and opposing approaches to this, ranging from lexicalise-everything to collect-everything. See the examples below:</p>
<table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle">
<thead class="ltx_thead">
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_th_row">
<br class="ltx_break">
</th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text" style="font-size:70%;">‘to drink’</span></th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text" style="font-size:70%;">‘a drink’</span></th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text" style="font-size:70%;">‘drinker’</span></th>
<th class="ltx_td ltx_align_left ltx_th ltx_th_column"><span class="ltx_text" style="font-size:70%;">‘brewery’</span></th>
</tr>
</thead>
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">North Sámi</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">juhkat</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">juhkamuš</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">—</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">vuolla·buvttadeaddji</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Erzya</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">симемс</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">симема-пель</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">симиця</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">пиянь завод</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Finnish</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">juoda</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">juo-ma</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">juo—ja</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">olut·tehdas</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Estonian</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">jooma</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">joo—gi</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">joo—</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">õlle·tehas</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Hungarian</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">iszik</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">ital</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">iv—ó</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">sör·főzde</span></td>
</tr>
</tbody>
</table>
</div>
<div id="S5.SS3.p2" class="ltx_para">
<p class="ltx_p">The symbols ‘·’, ‘-’ and ‘—’ stand for compounding, inflection and derivation, respectively.</p>
</div>
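<div class="ltx_para">
<p class="ltx_p">The segmentation notation used in the table can be read mechanically. The sketch below is a hypothetical helper, not part of any of the analysers discussed: it assumes that the three boundary symbols never occur inside a morph and labels each morph with the kind of boundary that precedes it.</p>
</div>

```python
import re

# Boundary symbols as used in the table: middle dot, hyphen, long dash.
BOUNDARIES = {
    "\u00b7": "compounding",
    "-": "inflection",
    "\u2014": "derivation",
}


def segment(form):
    """Return (first_morph, [(boundary_type, morph), ...]) for a segmented form."""
    # The capturing group keeps the boundary symbols in the result list.
    parts = re.split("([\u00b7\u2014-])", form)
    first = parts[0]
    rest = [(BOUNDARIES[sym], morph) for sym, morph in zip(parts[1::2], parts[2::2])]
    return first, rest
```

<div class="ltx_para">
<p class="ltx_p">Comparing the boundary types returned for translation equivalents makes it explicit where one morphology treats a word as a derivation and another treats it as a lexicalised stem with no internal boundary.</p>
</div>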
</section>
<section id="S5.SS4" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">5.4 </span>Pronouns and determiners</h3>
<div id="S5.SS4.p1" class="ltx_para">
<p class="ltx_p">The distinction between pronoun and determiner is not widely made in traditional grammars of most Uralic languages. Words which may be considered both pronouns and determiners are lumped into a single morphosyntactic class (usually pronoun). Consider the following examples involving the word ‘this’:</p>
</div>
<div id="S5.SS4.p2" class="ltx_para">
<table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle">
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td"></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">‘I see this house.’</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">‘I see this.’</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">North Sámi</span></th>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Oainnán dán viesu.</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Oainnán dán.</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Erzya</span></th>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Неян те_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">det</span><span class="ltx_text" style="font-size:70%;"> кудонть.</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Неян тень_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">pron</span><span class="ltx_text" style="font-size:70%;">.</span>
</td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Finnish</span></th>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Mä näen tämän_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">pron</span><span class="ltx_text" style="font-size:70%;"> talon.</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Mä näen tämän_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">pron</span><span class="ltx_text" style="font-size:70%;">.</span>
</td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Estonian</span></th>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Ma näen selle_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">pron</span><span class="ltx_text" style="font-size:70%;"> maja.</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Ma näen selle_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">pron</span><span class="ltx_text" style="font-size:70%;">.</span>
</td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Hungarian</span></th>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Nézem ezt_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">det/noun</span><span class="ltx_text" style="font-size:70%;"> a_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">art</span><span class="ltx_text" style="font-size:70%;"> házat.</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">Nézem azt_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">det/noun</span>
</td>
</tr>
</tbody>
</table>
</div>
<div id="S5.SS4.p3" class="ltx_para">
<p class="ltx_p">In traditional grammars of North Sámi, Finnish and Estonian both the pronominal and the modifier analyses of ‘this’ are classified as pronouns. In Hungarian and Erzya, a distinction is made, with Hungarian making a pronoun/determiner distinction and Erzya making a distinction between quantifier (determiner) and nominalised quantifier.</p>
</div>
<div id="S5.SS4.p4" class="ltx_para">
<p class="ltx_p">If we consider a standard definition of <em class="ltx_emph">pronoun</em> to be ‘that which stands in place (pro-) of a noun phrase (-noun)’ then we can see that in the above, only the tools for Erzya follow this. The other languages leave the distinction to tools later in the pipeline.</p>
</div>
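<div class="ltx_para">
<p class="ltx_p">Where the analyser leaves the distinction to later tools, a contextual step has to recover it. The following is a minimal sketch, assuming tokens are (form, tag) pairs with the hypothetical coarse tags V, N, Pron and Det, and assuming the crude heuristic that a demonstrative directly followed by a noun is a determiner:</p>
</div>

```python
def refine_demonstratives(tokens):
    """Relabel Pron as Det when the next token is a noun.

    tokens is a list of (form, tag) pairs; a new list is returned.
    """
    # Tag of the following token for every position, None at the end.
    following = [tag for _, tag in tokens[1:]] + [None]
    refined = []
    for (form, tag), next_tag in zip(tokens, following):
        if tag == "Pron" and next_tag == "N":
            tag = "Det"
        refined.append((form, tag))
    return refined
```

<div class="ltx_para">
<p class="ltx_p">This heuristic would miss the Hungarian pattern above, where an article intervenes between the demonstrative and the noun; a real pipeline would express the same idea as a contextual rule in its disambiguation grammar.</p>
</div>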
</section>
<section id="S5.SS5" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">5.5 </span>Non-inflecting words</h3>
<div id="S5.SS5.p1" class="ltx_para">
<p class="ltx_p">All languages in the Uralic family have a wide variety of non-inflecting word forms. Depending on the grammatical tradition followed by the language resource, these may be simply lumped into a single class, or they may have extensive syntactic or semantic subcategorisation. Table <a href="#S5.T3" title="Table 3 ‣ 5.5 Non-inflecting words ‣ 5 Specific linguistic issues ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">3</span></a> gives a number of examples of non-inflecting words and the equivalent morphological analyses they receive in each of the languages we are studying. To a machine translation practitioner, these distinctions are largely superfluous: <em class="ltx_emph">ja</em> in North Sámi will be translated as <em class="ltx_emph">ja</em> in Finnish and <em class="ltx_emph">ja</em> in Estonian. However, the distinctions may be vital for the intervening disambiguation tools, and as such need to be taken into account.</p>
</div>
<figure id="S5.T3" class="ltx_table">
<table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle">
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td"></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text ltx_font_bold" style="font-size:70%;">North Sámi</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Erzya</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Finnish</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Estonian</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text ltx_font_bold" style="font-size:70%;">Hungarian</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">and</span></th>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">ja+CC</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">марто+Po+COM</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">ja Part</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">ja+J</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">és /CONJ</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">very</span></th>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">hui+Adv</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">пек+Adv+AdA</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">tosi Part</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">väga+Adv</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">nagyon /ADV</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">under</span></th>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">vuolde+Po</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">алов+Po+Lat</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">alle Part</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">alla+K</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">alatt /POSTP</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">now</span></th>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">dál+Adv</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">ней+Adv+Temp</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">nyt Part</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">praegu+Adv</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">most /ADV</span></td>
</tr>
<tr class="ltx_tr">
<th class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">hello</span></th>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">bures+Interj</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">шумбрачи+Interj+Formulaic</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">moi Part</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">tere+I</span></td>
<td class="ltx_td ltx_align_center"><span class="ltx_text" style="font-size:70%;">szia /UTT-INT</span></td>
</tr>
</tbody>
</table>
<figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 3: </span>Some examples of non-inflecting words with divergent morphological and syntactic annotation. In terms of morphology, the transfer of these tags may be a simple one-to-one substitution. However, the syntactic environments may vary substantially.</figcaption>
</figure>
</section>
</section>
<section id="S6" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">6 </span>Guidelines</h2>
<section id="S6.SS1" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">6.1 </span>Separation of lexicon and morphotactics</h3>
<div id="S6.SS1.p1" class="ltx_para">
<p class="ltx_p">One of the main components of any rule-based system for morphologically complex languages is a lexicon consisting of stems and inflectional/derivational categories. In some cases, such as for Finnish, these are partly provided by a state institution, such as a language board. In other cases they are the product of many years of work.</p>
</div>
<div id="S6.SS1.p2" class="ltx_para">
<p class="ltx_p">Although categorising stems for inclusion in a morphological lexicon (many contain over 100,000 entries) can take a substantial amount of work, even if done semi-automatically, implementing the morphotactics (that is, the rules covering inflection, derivation and compounding) may take substantially less time.</p>
</div>
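<div class="ltx_para">
<p class="ltx_p">The separation can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of any existing analyser: the paradigm label <span class="ltx_text ltx_font_typewriter">BACK_NOUN</span>, the dictionary layout and the function <span class="ltx_text ltx_font_typewriter">generate</span> are hypothetical, and the two Finnish nouns were chosen because they need no stem alternation.</p>
</div>

```python
# Lexicon: stems paired with a paradigm label.
# "BACK_NOUN" is a hypothetical label for back-vowel nouns without stem changes.
LEXICON = {
    "talo": "BACK_NOUN",
    "puutarha": "BACK_NOUN",
}

# Morphotactics: for each paradigm, the suffix that realises a feature bundle.
# Only two bundles are listed; a real description would cover many more.
MORPHOTACTICS = {
    "BACK_NOUN": {
        "Number=Sing|Case=Nom": "",
        "Number=Sing|Case=Ine": "ssa",
    },
}


def generate(stem, feats):
    """Build a surface form by looking up the stem's paradigm, then the suffix."""
    paradigm = LEXICON[stem]
    suffix = MORPHOTACTICS[paradigm][feats]
    return stem + suffix


if __name__ == "__main__":
    print(generate("puutarha", "Number=Sing|Case=Ine"))
```

<div class="ltx_para">
<p class="ltx_p">Adding a new stem touches only the lexicon table, while adding a new case touches only the morphotactics table, which is the point of keeping the two apart.</p>
</div>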
</section>
<section id="S6.SS2" class="ltx_subsection">
<h3 class="ltx_title ltx_title_subsection">
<span class="ltx_tag ltx_tag_subsection">6.2 </span>Maximise parallelism</h3>
<div id="S6.SS2.p1" class="ltx_para">
<p class="ltx_p">In line with the Universal Dependencies project (see <a href="#S7" title="7 Universal dependencies ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">7</span></a>), we propose the adoption of a principle of maximum parallelism. In short, “things that are the same should be tagged the same”. We do not propose that this should mean that all distinctions should be made in all languages. For example, those Uralic languages without object conjugation should not be required to adopt the agreement tags of those that have it. But it should be possible to come up with principled and consistent guidelines for closed categories.</p>
</div>
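<div class="ltx_para">
<p class="ltx_p">As a minimal sketch of what “tagged the same” could mean in practice, the Python fragment below maps some of the native tags from Table 3 onto one shared label per category. The choice of the labels ADP, ADV and INTJ as targets, the use of the ISO 639-3 codes sme, est and hun as keys, and the function names are assumptions made for illustration; the native tags themselves are taken from the table.</p>
</div>

```python
# Hypothetical per-language tables: native tag -> shared tag.
# Keys are ISO 639-3 codes: sme = North Sami, est = Estonian, hun = Hungarian.
SHARED_TAGS = {
    "sme": {"Po": "ADP", "Adv": "ADV", "Interj": "INTJ"},
    "est": {"K": "ADP", "Adv": "ADV", "I": "INTJ"},
    "hun": {"POSTP": "ADP", "ADV": "ADV", "UTT-INT": "INTJ"},
}


def to_shared(language, native_tag):
    """Return the shared tag; raises KeyError when no mapping is defined."""
    return SHARED_TAGS[language][native_tag]


def same_category(first, second):
    """True when two (language, native_tag) pairs map to the same shared tag."""
    return to_shared(*first) == to_shared(*second)


if __name__ == "__main__":
    # 'under' in North Sami and Hungarian: different native tags, same category.
    print(same_category(("sme", "Po"), ("hun", "POSTP")))
```

<div class="ltx_para">
<p class="ltx_p">Raising an error for unmapped tags, rather than passing them through, makes gaps in the mapping visible while it is being developed.</p>
</div>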
</section>
</section>
<section id="S7" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">7 </span>Universal dependencies</h2>
<div id="S7.p1" class="ltx_para">
<p class="ltx_p">Universal dependencies is a large multi-language project <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib112" title="Universal dependency annotation for multilingual parsing." class="ltx_ref">3</a>]</cite> aiming at a common tagset for parts of speech, morphosyntactic features and dependency relations.
We do not propose adopting the exact tagset of the universal dependency project. Most projects working on Uralic languages have been ongoing for many years, and the tools that they create are used for more than just machine translation. What we find more important is to adopt, or make available, tools based on a consistent theoretical background and a consistent morphosyntactic description. This could form the basis of a kind of <em class="ltx_emph">universal</em> morphosyntactic interlingua for the Uralic languages. These tools do not have to replace the current tools, and may be automatically generated from them, but they must be consistent. A systematic mapping from the existing tagsets therefore needs to be considered during development.
The national Uralic languages have specifications for universal dependencies <cite class="ltx_cite ltx_citemacro_cite">[<a href="#bib.bib113" title="Universal dependencies for finnish" class="ltx_ref">7</a>, <a href="#bib.bib114" title="Estonian dependency treebank and its annotation scheme" class="ltx_ref">4</a>, <a href="#bib.bib115" title="Hungarian dependency treebank." class="ltx_ref">11</a>]</cite>. However, these specifications differ in unnecessary ways. For example, consider the annotation of ‘that house’ in the two treebanks for Finnish, Turku Dependency Treebank (TDT) and FinnTreeBank (FTB), and in Hungarian:</p>
</div>
<div id="S7.p2" class="ltx_para">
<table class="ltx_tabular ltx_centering ltx_align_middle">
<tbody class="ltx_tbody">
<tr class="ltx_tr">
<td class="ltx_td"></td>
<td class="ltx_td ltx_align_left" colspan="2"><span class="ltx_text" style="font-size:70%;">this</span></td>
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">house</span></td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish (TDT)</span></td>
<td class="ltx_td ltx_align_left" colspan="2">
<span class="ltx_text" style="font-size:70%;">tämä_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">PRON</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">talo_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">NOUN</span>
</td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Finnish (FTB)</span></td>
<td class="ltx_td ltx_align_left" colspan="2">
<span class="ltx_text" style="font-size:70%;">tämä_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">DET</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">talo_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">NOUN</span>
</td>
</tr>
<tr class="ltx_tr">
<td class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Hungarian</span></td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">az_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">PRON</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">a_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">ART</span>
</td>
<td class="ltx_td ltx_align_left">
<span class="ltx_text" style="font-size:70%;">ház_</span><span class="ltx_text ltx_font_smallcaps" style="font-size:70%;">NOUN</span>
</td>
</tr>
</tbody>
</table>
</div>
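<div class="ltx_para">
<p class="ltx_p">Divergences of this kind can be found mechanically. The following Python sketch, given as an illustration rather than an existing tool, groups the Finnish annotations from the comparison above by word form and reports every form that receives more than one part-of-speech tag. The triple layout and the function name <span class="ltx_text ltx_font_typewriter">divergent_forms</span> are assumptions.</p>
</div>

```python
from collections import defaultdict

# (treebank, word form, part-of-speech tag) triples from the comparison above.
ANNOTATIONS = [
    ("Finnish (TDT)", "tämä", "PRON"),
    ("Finnish (FTB)", "tämä", "DET"),
    ("Finnish (TDT)", "talo", "NOUN"),
    ("Finnish (FTB)", "talo", "NOUN"),
]


def divergent_forms(annotations):
    """Map each word form tagged inconsistently to its sorted list of tags."""
    tags_by_form = defaultdict(set)
    for _treebank, form, pos in annotations:
        tags_by_form[form].add(pos)
    return {
        form: sorted(tags)
        for form, tags in tags_by_form.items()
        if len(tags) > 1
    }


if __name__ == "__main__":
    print(divergent_forms(ANNOTATIONS))
```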
</section>
<section id="S8" class="ltx_section">
<h2 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">8 </span>Concluding remarks</h2>
<div id="S8.p1" class="ltx_para">
<p class="ltx_p">Rule-based machine translation provides a fascinating basis for exploring real linguistic differences between the Uralic languages. However, as we have shown, in current state-of-the-art tools, real linguistic differences are hidden behind a combination of incompatible tagsets and idiosyncratic traditional grammatical norms. We do not propose that the North Sámi adopt the Finnish norms or the Hungarians the Erzya norms; instead, we propose developing a common morphological annotation scheme for the Uralic languages based on the guidelines of the Universal dependencies project. It is not our aim for this to supersede national standards, but to provide a common bridge between them to facilitate cross-linguistic study and functional rule-based machine translation.</p>
</div>
</section>
<section id="Sx1" class="ltx_section">
<h2 class="ltx_title ltx_title_section">Acknowledgements</h2>
<div id="Sx1.p1" class="ltx_para">
<p class="ltx_p">Heiki-Jaan Kaalep, Jack Rueter and László Tihany, as well as the anonymous reviewers, have all contributed to the language examples; the remaining mistakes are ours.</p>
</div>
</section>
<section id="A1" class="ltx_appendix">
<h2 class="ltx_title ltx_title_appendix">
<span class="ltx_tag ltx_tag_appendix">Appendix A </span>Example of Universal dependencies for Uralic languages</h2>
<div id="A1.p1" class="ltx_para">
<p class="ltx_p">An example is shown in Table <a href="#A1.T4" title="Table 4 ‣ Appendix A Example of Universal dependencies for Uralic languages ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">4</span></a>.</p>
</div>
<figure id="A1.T4" class="ltx_table">
<p class="ltx_p"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">CONJ</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span>
<span class="ltx_td"></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">leaba</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">gárdimis</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">.</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">VERB</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PUNCT</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Mood=Ind|Tense=Pres|Person=3|Number=Dual</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Loc</span></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><em class="ltx_emph" style="font-size:70%;">Джеймс</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">марто</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">CONJ</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom|Definite=Ind</span></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><em class="ltx_emph" style="font-size:70%;">Мария</em></span>
<span class="ltx_td ltx_align_left"><em class="ltx_emph" style="font-size:70%;">садпиресэ-</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">NOUN</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Number=Plur|Case=Nom|Definite=Ind</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Case=Ine|Definite=Ind</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><em class="ltx_emph" style="font-size:70%;">-ть</em></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">.</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">VERB</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PUNCT</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_row"><span class="ltx_text" style="font-size:70%;">Mood=Ind|Tense=Pres|Pers[subj]=3|Number[subj]=Plur</span></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">CONJ</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span>
<span class="ltx_td"></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ovat</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">puutarhassa</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">.</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">VERB</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PUNCT</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Mood=Ind|Tense=Pres|Person=3|Number=Plur</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Ine</span></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">ja</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">CONJ</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span>
<span class="ltx_td"></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">on</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">aias</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">.</em></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">VERB</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PUNCT</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Mood=Ind|Tense=Pres|Person=3|Number=Plur</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Ine</span></span>
<span class="ltx_td"></span></span>
</span>
</span>
<br class="ltx_break"><span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
<span class="ltx_tabular ltx_guessed_headers ltx_align_middle">
<span class="ltx_thead">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">James</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">és</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">Mary</em></span></span>
</span>
<span class="ltx_tbody">
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">CONJ</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PROPN</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span>
<span class="ltx_td"></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Nom</span></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">kértben</em></span>
<span class="ltx_td ltx_align_left ltx_th ltx_th_column"><em class="ltx_emph" style="font-size:70%;">.</em></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">NOUN</span></span>
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">PUNCT</span></span>
<span class="ltx_td"></span></span>
<span class="ltx_tr">
<span class="ltx_td ltx_align_left"><span class="ltx_text" style="font-size:70%;">Number=Sing|Case=Ine</span></span>
<span class="ltx_td"></span>
<span class="ltx_td"></span></span>
</span>
</span><span class="ltx_text" style="font-size:70%;">
<span class="ltx_rule" style="width:100%;height:1px;background:black;display:inline-block;"> </span>
</span></p>
<figcaption class="ltx_caption" style="font-size:70%;"><span class="ltx_tag ltx_tag_table">Table 4: </span>An example of applying universal part-of-speech tags and morphological features to the Uralic languages. Note how, compared with Table <a href="#S2.T1" title="Table 1 ‣ 2 Background ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a>, the massive differences in annotation are reduced to only the linguistically relevant ones.</figcaption>
</figure>
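<div class="ltx_para">
<p class="ltx_p">Once the annotations share one feature inventory, the remaining differences can be computed directly. The Python sketch below is an illustration under stated assumptions: it takes the verb features of the North Sámi and Finnish examples in Table 4, written with the <span class="ltx_text ltx_font_typewriter">|</span> separator of the Universal Dependencies feature format, and returns only the features whose values differ. The function names <span class="ltx_text ltx_font_typewriter">parse_feats</span> and <span class="ltx_text ltx_font_typewriter">feature_diff</span> are hypothetical.</p>
</div>

```python
def parse_feats(feats):
    """Parse 'Key=Value|Key=Value' into a dict; '' or '_' means no features."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_diff(first, second):
    """Return {feature: (value_in_first, value_in_second)} for differing features.

    A feature missing on one side is reported with None on that side.
    """
    a, b = parse_feats(first), parse_feats(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    north_sami_verb = "Mood=Ind|Tense=Pres|Person=3|Number=Dual"  # leaba
    finnish_verb = "Mood=Ind|Tense=Pres|Person=3|Number=Plur"     # ovat
    print(feature_diff(north_sami_verb, finnish_verb))
```

<div class="ltx_para">
<p class="ltx_p">For the two verbs the only reported difference is the number value, dual against plural, which is a genuine grammatical difference between the languages rather than an artefact of the tagsets.</p>
</div>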
</section>
<section id="bib" class="ltx_bibliography">
<h2 class="ltx_title ltx_title_bibliography">References</h2>
<ul id="L1" class="ltx_biblist">
<li id="bib.bib6" class="ltx_bibitem ltx_bib_article">
<span class="ltx_bibtag ltx_bib_key ltx_role_refnum">[1]</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_author">M. L. Forcada, M. G. Rosell, J. Nordfalk, J. O’Regan, S. Ortiz-Rojas, J. A. Pérez-Ortiz, G. Ramírez-Sánchez, F. Sánchez-Martínez and F. M. Tyers</span><span class="ltx_text ltx_bib_year"> (2010)</span>
</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Apertium: a free/open-source platform for rule-based machine translation platform</span>.
</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_journal">Machine Translation</span>.
</span>
<span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#S2.p1" title="2 Background ‣ Intermediate representation in rule-based machine translation for the Uralic languages. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by-nd/4.0/. Original publication in proceedings of second IWCLUL held in Szeged 2016" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2</span></a>.
</span>
</li>
<li id="bib.bib38" class="ltx_bibitem ltx_bib_article">
<span class="ltx_bibtag ltx_bib_key ltx_role_refnum">[2]</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_author">M. L. Forcada, M. G. Rosell, J. Nordfalk, J. O’Regan, S. Ortiz-Rojas, J. A. Pérez-Ortiz, G. Ramírez-Sánchez, F. Sánchez-Martínez and F. M. Tyers</span><span class="ltx_text ltx_bib_year"> (2010)</span>
</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Apertium: a free/open-source platform for rule-based machine translation platform</span>.
</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_journal">Machine Translation</span>.
</span>
<span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#bib.bib6" title="Apertium: a free/open-source platform for rule-based machine translation platform" class="ltx_ref">1</a>.
</span>
</li>
<li id="bib.bib112" class="ltx_bibitem ltx_bib_inproceedings">
<span class="ltx_bibtag ltx_bib_key ltx_role_refnum">[3]</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_author">R. T. McDonald, J. Nivre, Y. Quirmbach-Brundage, Y. Goldberg, D. Das, K. Ganchev, K. B. Hall, S. Petrov, H. Zhang and O. Täckström</span><span class="ltx_text ltx_bib_year"> (2013)</span>
</span>
<span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Universal dependency annotation for multilingual parsing</span>.
</span>
<span class="ltx_bibblock">In <span class="ltx_text ltx_bib_inbook">ACL (2)</span>,
</span>