OmegaT

Using segment IDs as semantic identifiers

If you have downloaded this presentation and are looking at it offline, the latest version (in case of any updates) is always available at:

  • Alternative translations need to be created when a repeated segment must have multiple translations.
  • The purpose of alternative translations is to prevent auto-propagation of the default translation into certain repetitions of the segment.
  • This is achieved by creating in-context exact matches (adding "context" as a matching criterion):
    • Surrounding text (previous and next segments)
    • ID or resname attributes
    • Filename
  • Default translations = source text match
  • Alternative translations = source text + context match
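The two matching rules above can be sketched as a lookup. This is only an illustration, not OmegaT's actual implementation: the `Segment` type, the `find_translation` helper, and the sample IDs and translations are all hypothetical, and "context" is reduced to a single string (for example an ID).

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    source: str
    context: Optional[str] = None  # hypothetical: an ID, a filename, or neighbouring text


def find_translation(segment, defaults, alternatives):
    """Prefer a source + context match; otherwise fall back to the source-only match."""
    key = (segment.source, segment.context)
    if key in alternatives:
        return alternatives[key]          # alternative translation
    return defaults.get(segment.source)   # default translation (auto-propagated)


# Made-up data: "Open" is a verb on a button but an adjective in a status label.
defaults = {"Open": "Abrir"}
alternatives = {("Open", "status_label"): "Abierto"}

on_button = find_translation(Segment("Open", "open_button"), defaults, alternatives)
in_status = find_translation(Segment("Open", "status_label"), defaults, alternatives)
```

Here `on_button` receives the default translation, while `in_status` receives the alternative one because both its source text and its context match.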

Alternative translations

  • Alternative translations must be created manually for any segment where the default translation should not be used.
  • That is okay when there are just a few special cases where the default translation is not suitable for whatever reason.
  • However, when many repetitions in the project need a translation other than the default, creating that many alternative translations is cumbersome, time-consuming and error-prone.
    • For example, when there are many repetitions and half of them require one translation while the other half require another.

Alternative translations

Default XLIFF filter

  • Expects a target node populated with the source text
  • Repairs non-unique IDs in XLIFF files
    • by adding _01, _02, etc.
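As a rough sketch of what such a repair amounts to, the hypothetical function below appends a two-digit counter to every ID that occurs more than once. Whether the default filter also renames the first occurrence, and how it handles collisions with IDs that already end in `_01`, is not stated here, so both are assumptions of this sketch.

```python
from collections import Counter


def make_ids_unique(ids):
    """Append _01, _02, ... to each ID that appears more than once (assumed behaviour)."""
    totals = Counter(ids)   # how often each ID occurs overall
    seen = Counter()        # how often each ID has been seen so far
    result = []
    for seg_id in ids:
        if totals[seg_id] > 1:
            seen[seg_id] += 1
            result.append(f"{seg_id}_{seen[seg_id]:02d}")
        else:
            result.append(seg_id)  # already unique, left untouched
    return result


repaired = make_ids_unique(["01_welcome", "01_welcome", "02_intro"])
```

After a repair of this kind every segment has its own ID, so the ID can no longer identify a group of segments.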

Okapi XLIFF (plugin)

  • Extracts the text content of both source and target nodes
  • Is more lenient and accepts non-unique IDs (same ID value used in more than one segment)

Two XLIFF filters available

<trans-unit xml:space="preserve" id="01_welcome">
  <source xml:lang="en">Welcome to this questionnaire!</source>
  <target xml:lang="xx"></target>
</trans-unit>
<trans-unit xml:space="preserve" id="01_welcome">
  <source xml:lang="en">Welcome to this questionnaire!</source>
  <target xml:lang="xx">Welcome to this questionnaire!</target>
</trans-unit>

Default XLIFF filter

Okapi XLIFF (plugin)

Compliant XML

ID attribute values must be unique in an XML document.

Okapi is lenient

Since IDs don't need to be unique in bilingual XLIFF edited in OmegaT, they can be used to identify groups of segments (that share the same ID).

Semantic identification

If non-unique IDs (or resnames) are tolerated, they can be used to identify groups of segments.

That makes it possible to create one alternative translation for a whole group of segments rather than many alternative translations (one for each segment).
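To illustrate the saving, the sketch below groups the units of a small, made-up XLIFF fragment by source text and ID. The fragment, the IDs `nav` and `survey`, and the `group_sizes` helper are hypothetical, and the XLIFF namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up fragment: the same source text repeated under two shared (non-unique) IDs.
XLIFF_BODY = """
<body>
  <trans-unit id="nav"><source>Next</source><target></target></trans-unit>
  <trans-unit id="nav"><source>Next</source><target></target></trans-unit>
  <trans-unit id="survey"><source>Next</source><target></target></trans-unit>
  <trans-unit id="survey"><source>Next</source><target></target></trans-unit>
</body>
"""


def group_sizes(xml_text):
    """Count how many trans-units share each (source text, ID) pair."""
    root = ET.fromstring(xml_text.strip())
    groups = defaultdict(int)
    for unit in root.iter("trans-unit"):
        groups[(unit.findtext("source"), unit.get("id"))] += 1
    return dict(groups)


sizes = group_sizes(XLIFF_BODY)
```

The four repetitions of "Next" fall into two groups, so two translations keyed by source text and ID would cover all of them, instead of one alternative translation per segment.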

OmegaT - Using IDs as semantic identifiers

By msoutopico


Advanced session about using IDs to identify segment groups semantically (for localization engineers)
