This paper describes our experiments on two cross-lingual and one monolingual English text retrieval tasks in the ad-hoc track at CLEF. The cross-language task involves retrieving English documents in response to queries in the two most widely spoken Indian languages, Hindi and Bengali. For our experiment, we had access to a Hindi-English bilingual lexicon, 'Shabdanjali', consisting of approx. However, we had neither an effective Bengali-English bilingual lexicon nor any parallel corpora from which to build a statistical lexicon. Given these limited resources, we depended mostly on our phoneme-based transliterations to generate equivalent English queries from the Hindi and Bengali topics. We adopted an Automatic Query Generation and Machine Translation approach for our experiment. Other language-specific resources included a Bengali morphological analyzer, a Hindi stemmer, and sets of 200 Hindi and 273 Bengali stop-words. The Lucene framework was used for stemming, indexing, retrieval and scoring of the corpus documents. The CLEF results suggested the need for a rich bilingual lexicon for CLIR involving Indian languages. The best MAP values in our experiment for Bengali, Hindi and English queries were 7.26, 4.77 and 36.49, respectively.
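For readers unfamiliar with the MAP (mean average precision) figures quoted above, here is a minimal sketch of how the metric is commonly computed. It is not the evaluation code used in the experiment: the function names, topic ids and document ids are hypothetical, and it assumes each ranked list contains no duplicate documents. The reported scores appear to be on a 0-100 scale, so the sketch multiplies by 100 at the end.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against its set of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over all judged topics."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical toy data: two topics, three retrieved documents each.
runs = {"t1": ["d1", "d2", "d3"], "t2": ["d4", "d5", "d6"]}
qrels = {"t1": {"d1", "d3"}, "t2": {"d5"}}

map_score = mean_average_precision(runs, qrels)
print(f"MAP = {100 * map_score:.2f}")
```

Because each topic's score is divided by the number of relevant documents rather than the number retrieved, a query translation that misses key terms lowers MAP even when the few documents it does retrieve are ranked well.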