
Volume 5, Issue 4, April – 2020 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Bidirectional Dictionary Based Machine Translation for Wolaytegna-Amharic by Java
Temesgen Mengistu Helana
Department of Computer Science
School of Informatics
Wolaita Sodo University, WSU
Wolaita Sodo, Ethiopia

Abstract:- In this paper, Wolaytegna to Amharic machine translation was conducted using a dictionary based machine translation approach. Machine translation is one of the key applications of Natural Language Processing; it is the process of translating from one language to another. In this study the researcher translated between two Ethiopian languages, a local language (Wolaytegna) and the official language of the country (Amharic), using the dictionary based approach. This research is important for the development of the Wolaytegna language, which is spoken by around 7 million people in Wolaytta zone and other parts of Ethiopia. For this research we used Java and a MySQL database, and 5400 word entries were created in the dictionary database to support accurate translation. For every word of the source language we defined its meaning in the target language in the bilingual dictionary. The proposed methodology uses the dictionary to translate word by word, because this kind of approach is advisable for linguistically less resourced languages like Wolaytegna.

Keywords:- Wolaytegna, Machine translation, Dictionary, Bilingual, Multilingual, Natural language processing.

I. INTRODUCTION

Translation systems play a vital role in narrowing the communication barrier between people from different corners of the world. Natural Language Processing (NLP) is a core discipline in machine translation; it is a field of computer science devoted to the improvement of models and technologies that empower computers to use human languages both as input and output [3]. One of the aims of NLP is to develop computational models that can perform like humans in the tasks of reading, writing, learning, speaking and understanding. Computational models are useful to explore the nature of linguistic communication as well as for enabling effective human-machine interaction.

The speedy growth of data on the internet has encouraged MT researchers to develop more profitable MT systems to deliver worldwide communication.

In this research work the bilingual dictionary, which is used for Wolaytegna to Amharic translation and vice versa, is the core component of the machine translation between these two languages. There are many approaches for developing MT systems, and each approach has its own advantages and disadvantages. Of these approaches, dictionary based machine translation is the most recommended for linguistically less resourced languages like Wolaytegna. Ethiopia has about 80 different languages, of which Wolaytegna is the 7th most spoken; it is spoken by around 7 million people in the country, especially by the Wolaytta people in SNNPR, and it is one of the languages with few resources published electronically on the internet and in other media. In contrast, Amharic is a historically advantaged language in Ethiopia, because different regimes at different periods used it as the official language of the country, so it is one of the linguistically well resourced languages compared to other Ethiopian languages.

So this research work will support Amharic speakers in using Wolaytegna, and vice versa, through dictionary based machine translation.

The biggest challenge for Statistical Machine Translation is obtaining a high quality corpus, because data sources for a language like Wolaytegna are insufficient. The Dictionary Based Machine Translation (DBMT) approach is used when few linguistic resources are available for the languages. In dictionary based translation, a system is defined which contains a set of source language words and the corresponding target language words. At run time, dictionary based translation uses a bilingual corpus, defined in the form of a dictionary, as its database. This database is stored in the translation memory. Since the two languages, Wolaytegna and Amharic, have the same grammatical sentence structure, when the system encounters a sentence it does not need to rearrange it; rather, it translates directly by retrieving from the translation memory.
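The bilingual dictionary described in this section can be pictured as a pair of lookup tables, one per translation direction. The following is only a minimal sketch, assuming an in-memory stand-in for the MySQL table; the class name `BilingualDictionary`, its methods and any entries passed to it are hypothetical and are not taken from the implementation reported in this paper.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the bilingual dictionary table.
// One map per direction lets the same entries serve both translation directions.
class BilingualDictionary {
    private final Map<String, String> wolaytegnaToAmharic = new HashMap<>();
    private final Map<String, String> amharicToWolaytegna = new HashMap<>();

    // Register one word pair; keys are lower-cased so lookups ignore letter case.
    void addEntry(String wolaytegna, String amharic) {
        wolaytegnaToAmharic.put(wolaytegna.toLowerCase(Locale.ROOT), amharic);
        amharicToWolaytegna.put(amharic.toLowerCase(Locale.ROOT), wolaytegna);
    }

    // Look up the Amharic meaning of a Wolaytegna word, if one is stored.
    Optional<String> toAmharic(String wolaytegnaWord) {
        return Optional.ofNullable(
                wolaytegnaToAmharic.get(wolaytegnaWord.toLowerCase(Locale.ROOT)));
    }

    // Look up the Wolaytegna meaning of an Amharic word, if one is stored.
    Optional<String> toWolaytegna(String amharicWord) {
        return Optional.ofNullable(
                amharicToWolaytegna.get(amharicWord.toLowerCase(Locale.ROOT)));
    }
}
```

In this sketch a later entry overwrites an earlier one that shares the same key, so words with several meanings would need a richer structure than a single string per key.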

IJISRT20APR095 www.ijisrt.com 1344


II. CHALLENGES IN DICTIONARY BASED MACHINE TRANSLATION SYSTEM

One challenge in translating Wolaytegna to Amharic is that some Wolaytegna words have the same spelling and pronunciation but different meanings depending on the sentence context, while dictionary based machine translation translates one language to the other using a word based dictionary stored in a database.

Amharic has no prepositions and takes both prefixes and postfixes, whereas Wolaytegna takes only postfixes. This difference makes translation between the two languages challenging, because from one Wolaytegna root word we may derive numerous words with postfixes, which may or may not come from the same Amharic word. The following example illustrates sample words:

Example:
Naaga - ተተተ - wait (he)
Naagu - ተተተ - wait (she)
Naagikke - ተተተተተተ - don't wait (he)
Naagiis - ተተተተ - waited
Naago - ተተተተ - let him wait
Naagoo - ተተተተተ - can I wait? …… (he)

III. PROPOSED SYSTEM MODEL

Most Wolaytegna to Amharic translations look similar to the sample example below. The sentence formation [Subject, Object, Verb] is similar in both languages.
Example: => Temesgen went to school.

Fig 1:- The diagram illustrates grammatically how the two language translation works

As described in the above diagram, translation cannot happen in the same way for every Wolaytegna sentence, because when we work with tense, gender, pronouns and other aspects, in special cases the sentence sometimes reverses and words may be reshuffled. Within this translation system we considered all these aspects.

Fig 2:- Framework of dictionary based machine translation of Wolaytegna to Amharic

We explain the basic details of the steps in the architectural model below.

A. Splitting sentence into words
In this research translation can be word based, phrase based or sentence based. If the input is a phrase or a sentence, it must be broken into words, because the entries in the dictionary database are word based only.

B. Identifying tense, gender, plurality
Since the languages use prefixes and postfixes to identify tense, gender and plurality, it is better to detect the root word and the prefix and postfix attached to it. In this research work we faced challenges in handling these three key morphological features.

C. Detecting contextual ambiguity from alternatives
A word may have more than one meaning in different sentences with the same spelling and pronunciation.

D. Retrieving target word
In this stage, each word of the source language sentence is checked for availability in the dictionary, and the result is stored in a defined array.

E. Reconstructing sentence in target language
After the equivalent meaning of each word in the source text has been extracted from the dictionary, the reconstruction process takes place.
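Detecting the root word and its postfix (step B) can be sketched with a longest-suffix match over the sample words listed in Section II. This is only an illustration: the root "naag" and the suffix list are an assumed segmentation of those six sample words, not a verified morphological analysis of Wolaytegna, and the class `SuffixSplitter` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical suffix splitter built from the six sample words in Section II.
// The segmentation into root + suffix is an assumption made for illustration.
class SuffixSplitter {
    // Suffix -> English gloss, copied from the paper's example list.
    private static final Map<String, String> SUFFIX_GLOSS = new LinkedHashMap<>();
    static {
        SUFFIX_GLOSS.put("ikke", "don't wait (he)");
        SUFFIX_GLOSS.put("iis", "waited");
        SUFFIX_GLOSS.put("oo", "can I wait?");
        SUFFIX_GLOSS.put("a", "wait (he)");
        SUFFIX_GLOSS.put("u", "wait (she)");
        SUFFIX_GLOSS.put("o", "let him wait");
    }

    // Returns {root, suffix}; the suffix is "" when no known suffix matches.
    // The longest matching suffix wins, so "oo" is preferred over "o".
    static String[] split(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        String best = "";
        for (String suffix : SUFFIX_GLOSS.keySet()) {
            boolean leavesRoot = lower.length() > suffix.length();
            if (leavesRoot && lower.endsWith(suffix) && suffix.length() > best.length()) {
                best = suffix;
            }
        }
        String root = lower.substring(0, lower.length() - best.length());
        return new String[] { root, best };
    }

    // Gloss recorded for a suffix, or null if the suffix is unknown.
    static String gloss(String suffix) {
        return SUFFIX_GLOSS.get(suffix);
    }
}
```

A table this small would split any word ending in "a", "u" or "o", so a real system would need to confirm the root against the dictionary before accepting a split.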

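For step C, one possible way to make ambiguity detectable is to store every recorded meaning of a source word rather than a single one. The sketch below assumes such a structure; the class `AlternativesDictionary` and its methods are hypothetical, and the paper does not state how the system chooses among alternatives, so no selection rule is shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical dictionary that keeps all recorded meanings of a source word,
// so that ambiguous words can be flagged before a target word is chosen.
class AlternativesDictionary {
    private final Map<String, List<String>> meanings = new HashMap<>();

    // Record one more target-language meaning for a source word.
    void addMeaning(String sourceWord, String targetWord) {
        meanings.computeIfAbsent(sourceWord, k -> new ArrayList<>()).add(targetWord);
    }

    // All recorded alternatives in insertion order; empty if the word is unknown.
    List<String> alternatives(String sourceWord) {
        return meanings.getOrDefault(sourceWord, Collections.emptyList());
    }

    // A word is treated as ambiguous when more than one alternative is stored.
    boolean isAmbiguous(String sourceWord) {
        return alternatives(sourceWord).size() > 1;
    }
}
```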
IV. THE ALGORITHM

A. Pseudo code
Step 1: Start
Step 2: Read source_text
Step 3: Split source_text into words
Step 4: For each word in words
            If the word is found in the database
                Retrieve the matching meaning value
                Append the value to target_text
            Else
                Break (word not found in db)
Step 5: Display target_text
Step 6: End
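The pseudo code can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the code of the reported system: a `Map` stands in for the MySQL dictionary table, the class `WordByWordTranslator` is hypothetical, and returning an empty `Optional` when a word is missing is one possible reading of the "not found in db" branch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Sketch of Steps 1-6; a Map stands in for the MySQL dictionary table (assumption).
class WordByWordTranslator {
    // Returns the translated text, or an empty Optional if any word is missing
    // from the dictionary (the "not found in db" branch of Step 4).
    static Optional<String> translate(String sourceText, Map<String, String> dictionary) {
        // Step 3: split the source text on runs of whitespace.
        String[] words = sourceText.trim().split("\\s+");
        List<String> targetWords = new ArrayList<>();
        // Step 4: look up each word in its original order.
        for (String word : words) {
            String meaning = dictionary.get(word);
            if (meaning == null) {
                return Optional.empty(); // word not found: stop translating
            }
            targetWords.add(meaning);
        }
        // Word order is kept unchanged, relying on the shared sentence structure.
        return Optional.of(String.join(" ", targetWords));
    }
}
```

The sketch does no reordering because the paper relies on the two languages sharing the same sentence structure; the special cases where words are reshuffled are not covered here.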
B. Flowchart
The following flowchart shows how the machine translation algorithm works. Whether the input text is a single word or a sentence, the system translates it into the target text using the dictionary database.

Fig 3:- Dictionary based machine translation algorithm of Wolaytegna to Amharic

V. EXPERIMENTAL RESULTS

The following section illustrates some screenshot outputs of the experimental reports and sample code related to the bidirectional dictionary based machine translation system.

Fig 4:- Inserting created text dictionary into database

Fig 5:- Retrieving the MT entry from the dictionary


Fig 6:- Store entries for dictionary in wamp database.

Fig 7:- The main interface before feeding texts of source language.


Fig 8:- The figure illustrates how the bilingual machine translator works

VI. CONCLUSION

Machine translation plays a vital role in breaking the language barrier and promoting inter-lingual communication in a multilingual country like Ethiopia. In this paper, the dictionary based machine translation approach is used to develop the MT system for Wolaytegna and Amharic. The dictionary based approach is well suited to languages with minimal linguistic resources and to languages with similar structure. For the dictionary based approach the bilingual dictionary is the crucial resource. Here a bilingual dictionary with 4500 entries was developed and stored in MySQL, and the user interface was designed with Java. Amharic words with prefixes and postfixes are translated by the machine translation algorithm to Wolaytegna words with postfixes only. The dictionary based approach can be further improved by adding more corpus data and contextualized grammatical translation for both languages. Words with multiple meanings were a challenge in bilingual dictionary based machine translation from Wolaytegna to Amharic.

REFERENCES

[1]. D V Sindhu and B M Sagar, 2017, IOP Conf. Ser.: Mater. Sci. Eng. 225 012182.
[2]. W. John Hutchins and Harold L. Somers, “An Introduction to Machine Translation”, Academic Press Ltd., 1992, pp. 1-9.
[3]. Daniel Jurafsky and James H. Martin, Speech and Language Processing, Pearson Education Inc, 2005.
[4]. Remya Rajan, “Rule Based Machine Translation from English to Malayalam”, 2009.
[5]. K. Narayana Murthy, “A Machine Assisted Translation System”, Department of Computer and Information Sciences, University of Hyderabad, Hyderabad, India.
[6]. Tariku Tsegaye, “English-Tigrigna Factored Statistical Machine Translation”, 2014.
[7]. S. Kereto, C. Wongchaisuwat, Y. Poovarawan, “Machine translation research and development”, in Proceedings of the Symposium on Natural Language Processing in Thailand, pages 167-195, March 1993.
[8]. K. Narayana Murthy, “A Machine Assisted Translation System”, Department of Computer and Information Sciences, University of Hyderabad, Hyderabad, India.
[9]. Murthy, K., “MAT: A Machine Assisted Translation System”, in Proceedings of Symposium on Translation Support System (STRANS-2002), IIT Kanpur, 2002, pp. 134-139.
[10]. Balajapally, P., Bandaru, P., Ganapathiraju, M., Balakrishnan, N., & Reddy, R. (2006). Multilingual Book Reader: Transliteration, Word-to-Word Translation and Full-text Translation. In VAVA 2006.
[11]. Jingying Zhao, Hai Guo, Zhenhong Zheng, Nan Jiang, “The Implementation of Chinese-Tai Lue Electronic Dictionary Based on C#”, Department of Computer Science and Engineering, University of Dalian Nationalities, China, 2010.
