Machine Translation Quality Estimation

Evaluating MTQE Results

Once MTQE has been enabled and used as part of the Machine Translation process, the accuracy of MTQE scores for the content and engine can be evaluated. Direct comparison between segment-level MTQE scores and post-editing analysis is not available, but the following options provide ways to evaluate MTQE scores quantitatively and qualitatively.

Evaluating with Post-editing Analysis

Post-editing analysis indicates editing effort: how much text the Linguist or Proofreader had to edit. In projects with Machine Translation and MTQE, results are calculated as the difference between the machine translation suggestions and the final text after post-editing is finished.

To evaluate the results of the post-editing analysis, first run a Default Analysis before the first step of the workflow to see how MT matches are categorized into MTQE bands.

When post-editing is complete, run a Post-editing Analysis with the Analyze MT option.

If the machine translation or non-translatable suggestion was accepted without any editing, the results will indicate 100%.

If the machine translation has been changed, the match rate is lower; the more the segment is changed, the lower the score. This is the same score-counting algorithm that is used to calculate the score of translation memory fuzzy matches.
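
As a rough illustration of how such a score behaves, the sketch below uses Python's difflib.SequenceMatcher as a stand-in similarity measure. It is not the algorithm used by the analysis, which is not described in this article; the function name post_edit_score and the sample segments are hypothetical.

```python
from difflib import SequenceMatcher


def post_edit_score(mt_suggestion: str, final_text: str) -> int:
    """Return a 0-100 similarity score between an MT suggestion and the final text.

    Assumption: SequenceMatcher's ratio stands in for the real fuzzy-match
    scoring, which is not documented here.
    """
    if mt_suggestion == final_text:
        # Suggestion accepted without any editing.
        return 100
    ratio = SequenceMatcher(None, mt_suggestion, final_text, autojunk=False).ratio()
    # Round down so that any edit at all yields a score below 100.
    return int(ratio * 100)


# Hypothetical segments: one unchanged, one lightly edited, one rewritten.
mt = "The printer is ready to use."
unchanged = post_edit_score(mt, "The printer is ready to use.")
light_edit = post_edit_score(mt, "The printer is ready for use.")
heavy_edit = post_edit_score(mt, "You can now start printing.")
```

The unchanged segment scores 100, and the lightly edited segment scores higher than the rewritten one, which mirrors the behavior described above.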

If the default analysis indicates a high number of quality MT matches (75% or above), the post-editing analysis should reflect a correspondingly minimal to moderate amount of editing to the MT suggestions.
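
One way to check this relationship is to compare the share of text in the 75% and higher MTQE bands with the share of text that kept a high score in the post-editing analysis. The sketch below assumes the band totals have been copied by hand from the two analyses. The function name, the band keys, and all numbers are illustrative only, and 0 is used here to stand in for "No score".

```python
def share_at_or_above(band_counts: dict[int, int], threshold: int) -> float:
    """Return the fraction of the total count that falls in bands >= threshold."""
    total = sum(band_counts.values())
    if total == 0:
        return 0.0
    kept = sum(count for band, count in band_counts.items() if band >= threshold)
    return kept / total


# Hypothetical word counts per MTQE band from a Default Analysis (0 = no score).
default_analysis = {100: 400, 99: 300, 75: 200, 0: 100}

# Hypothetical word counts per score band from a Post-editing Analysis,
# keyed by the lower bound of each band.
post_editing_analysis = {100: 350, 95: 250, 85: 200, 75: 100, 50: 100}

predicted_good = share_at_or_above(default_analysis, 75)
observed_good = share_at_or_above(post_editing_analysis, 75)
gap = abs(predicted_good - observed_good)
```

A small gap suggests the MTQE bands were a fair predictor of editing effort for this content and engine; a large gap suggests they were not.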

Evaluating the Segment Changes

To evaluate the substance of the changes made during post-editing, create a workflow that generates a report showing the changes at the segment level.

To create this workflow, follow these steps:

  1. Create a project with two Workflow steps (e.g. pre-translation and post-editing).

  2. In the first Workflow step, pre-translate the job with only MT. This provides a snapshot of the matches to be used.

  3. In the second Workflow step, let the translator post-edit normally.

  4. Once the workflow is completed, run the post-editing analysis to see the edit distance between the two steps (the number of changes).

  5. Select the relevant jobs, then go to Tools and select Export Workflow Changes.

    The different versions of the segments are presented.
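
To review the exported versions outside the report, a word-level comparison can make the substance of each change easier to see. The sketch below is a minimal example using Python's difflib; it assumes the two versions of a segment are already available as plain strings, and the function name word_changes and the sample text are hypothetical. It does not read the export file itself.

```python
from difflib import SequenceMatcher


def word_changes(mt_text: str, edited_text: str) -> list[tuple[str, str, str]]:
    """List the word-level edits that turn the MT version into the edited version.

    Each entry is (operation, words_in_mt, words_in_edited), where operation
    is "replace", "delete", or "insert".
    """
    mt_words = mt_text.split()
    edited_words = edited_text.split()
    matcher = SequenceMatcher(None, mt_words, edited_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (tag, " ".join(mt_words[i1:i2]), " ".join(edited_words[j1:j2]))
            )
    return changes


# Hypothetical segment before and after post-editing.
changes = word_changes(
    "The printer is ready to use.",
    "The printer is ready for use.",
)
```

Here changes holds a single replacement of "to" with "for"; an unedited segment produces an empty list.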

Quality Scores

Scoring categories:

  • 100% - Excellent MT match, probably no post-editing required

  • 99% - Near-perfect MT output, possibly minor post-editing required for mostly typographical errors

  • 75% - Good MT match, but likely to require some post-editing

  • No score - When there is no score, it is very likely that the MT output is of low quality. In general, it is recommended that this output not be post-edited but used for reference only.
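
When working with scores in a script or spreadsheet export, the categories above can be expressed as a simple lookup. This is a minimal sketch; the function name describe_mtqe_score and the use of None for "No score" are assumptions made for illustration.

```python
from typing import Optional

# The three scored categories listed above.
MTQE_CATEGORIES = {
    100: "Excellent MT match, probably no post-editing required",
    99: "Near-perfect MT output, possibly minor post-editing required",
    75: "Good MT match, but likely to require some post-editing",
}


def describe_mtqe_score(score: Optional[int]) -> str:
    """Return the category description for an MTQE score.

    Assumption: a missing score is passed as None.
    """
    if score is None:
        return "No score: likely low-quality MT output, use for reference only"
    if score not in MTQE_CATEGORIES:
        raise ValueError(f"Unexpected MTQE score: {score}")
    return MTQE_CATEGORIES[score]


good_match = describe_mtqe_score(75)
no_score = describe_mtqe_score(None)
```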

MTQE scores appear at the segment level together with other translation resources (TM, NT, TB). Match origin is presented in a tooltip and at the bottom of the CAT panel in the metadata section.
