
    Ensemble Statistical and Heuristic Models for Unsupervised Word Alignment

    Mohaghegh, Mahsa; Sarrafzadeh, Hossein; Mohammadi, Mehdi

    ICMLA_CameraReady.pdf (139.8Kb)
    Date
    2014
    Citation:
    Mohaghegh, M., Sarrafzadeh, A., and Mohammadi, M. (2014). Ensemble Statistical and Heuristic Models for Unsupervised Word Alignment. The 13th International Conference on Machine Learning and Applications (ICMLA'14), Detroit, Michigan, USA.
    Permanent link to Research Bank record:
    https://hdl.handle.net/10652/2969
    Abstract
    Statistical word alignment models need large amounts of training data and perform poorly on small corpora. This paper proposes an unsupervised hybrid word alignment technique based on ensemble learning. The algorithm runs three base alignment models over several rounds to generate alignments. It uses a weighted scheme to resample the training data and a voting score to aggregate the resulting alignments. The base alignment algorithms used in this study are IBM Model 1, IBM Model 2, and a heuristic method based on the Dice coefficient. Our experimental results show that this approach can improve the alignment error rate of the base alignment models by at least 15%.
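As a rough illustration of two ideas named in the abstract, the sketch below computes Dice scores from sentence-level co-occurrence counts, links each target word to its best-scoring source word, and combines several alignments by a simple vote. It is not the authors' implementation: the function names, the toy corpus, and the plain vote threshold are assumptions made for this example, and the paper's weighted resampling and voting score are not reproduced here.

```python
from collections import Counter
from itertools import product


def dice_scores(corpus):
    """Dice coefficient for every co-occurring (source, target) word pair.

    corpus: list of (source_tokens, target_tokens) sentence pairs.
    dice(s, t) = 2 * C(s, t) / (C(s) + C(t)), where each count is the
    number of sentence pairs in which the word (or word pair) occurs.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        pair_count.update(product(src_set, tgt_set))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def dice_align(src, tgt, scores):
    """Link each target position to the source position with the top Dice score."""
    links = set()
    for j, t in enumerate(tgt):
        best_i = max(range(len(src)), key=lambda i: scores.get((src[i], t), 0.0))
        links.add((best_i, j))
    return links


def vote(alignments, min_votes):
    """Keep the links proposed by at least min_votes of the given alignments.

    A plain vote threshold, used here as a stand-in for the paper's voting score.
    """
    counts = Counter(link for alignment in alignments for link in alignment)
    return {link for link, n in counts.items() if n >= min_votes}


# Hypothetical toy corpus, for illustration only.
corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "house"], ["ein", "haus"]),
]
scores = dice_scores(corpus)
links = dice_align(["the", "house"], ["das", "haus"], scores)
```

In an ensemble setting, `vote` would receive one link set per base model (for example IBM Model 1, IBM Model 2, and the Dice heuristic) for the same sentence pair, and only links with enough support would be kept.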
    Keywords:
    statistical machine translation (SMT), statistical word alignment, ensemble learning, heuristic word alignment
    ANZSRC Field of Research:
    200323 Translation and Interpretation Studies
    Copyright Holder:
    IEEE (Institute of Electrical and Electronics Engineers)
    Available Online at:
    http://www.icmla-conference.org/icmla14/
    http://www.researchgate.net/publication/272353675_Ensemble_Statistical_and_Heuristic_Models_for_Unsupervised_Word_Alignment
    Rights:
    This digital work is protected by copyright. It may be consulted by you, provided you comply with the provisions of the Act and the following conditions of use: Any use you make of these documents or images must be for research or private study purposes only, and you may not make them available to any other person. You will recognise the author's and publisher's rights and give due acknowledgement where appropriate.
    This item appears in
    • Computing Conference Papers [147]

    Library home
    Send Feedback
    Research publications
    Unitec
    Moodle
    © Unitec Institute of Technology, Private Bag 92025, Victoria Street West, Auckland 1142
     

     
