[Paper Summary: Modesty is the Formula for Success] Good applications for crummy machine translation

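This paper examines the application and evaluation of machine translation (MT) since the ALPAC report. The authors stress that finding niche applications for MT matters more than pursuing generality. MT succeeds by identifying specific high-payoff applications, such as the workstation approach, which introduces automated tools gradually to help professional translators work more efficiently. MT can also serve end users who do not require high quality. Traditional evaluation metrics tend to overlook system-specific strengths, so the authors propose that MT evaluation should focus on the intended use. With reasonable expectations and a sound economic case, MT can find its value in certain domains.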

Good applications for crummy machine translation

— Kenneth W. Church & Eduard H. Hovy, 1993


  • There is a risk that eval can devolve into mindless metrics.
  • The success of the eval often depends very strongly on the selection of an appropriate application.
  • It is wise to identify the niche application first (one where the strengths of machines are valued). We are then in a much better position to address evaluation questions and to steer MT efforts towards high-payoff niches of functionality.
  • The authors agree with ALPAC that this basic research could not be justified in terms of short-term return on investment. Compared with human capabilities, MT systems of the time were not deemed a success, and might never be.

Has anything changed since ALPAC?

  → Increasing commercial value

The application venue of MT has shifted from government to industry, and so have the funding providers. One must choose an application that exploits the strengths of the machine and does not compete with the strengths of humans. This point is well put in the following: The question now is not whether MT is feasible, but in what domains it is most likely to be effective... The object of an evaluation is to determine whether a system permits an adequate response to given needs and constraints. --- Lehrberger and Bourbeau, 1988


The blame is to be laid on the desire for generality

In spite of all the literature on MT, the general evaluation measures often fail to pinpoint the strengths of systems. They seem to confound important and less important aspects. Unfortunately, this failure seems to be characteristic of many task-independent evaluation metrics. The authors propose that MT eval metrics should be sensitive to the intended use of the system. It then becomes crucial to the success of an MT effort to identify a high-payoff niche application, so that the MT system will stand up well to the eval even though it might produce crummy translations. By and large, this comes across as somewhat grants-oriented, but the world does need this kind of effort.


Traditional Eval Metrics
  • System-based
    Tied to a particular system, can’t be used effectively f