cApStAn joins the OASIS XLIFF TC :)

Contents

  • Introduce myself
  • How I use XLIFF
  • Challenges faced when working with XLIFF
  • Proposals/ideas for the TC/new version
  • XLIFF versions

Manuel Souto Pico

Education:

  • Translation and applied linguistics

Work:

  • Software localization testing engineer - Lionbridge Tampere
  • Terminologist and technical support specialist - STAR Spain
  • Language coordinator - Lionbridge Dublin
  • Translation technologist - cApStAn

Skills:

  • Languages: English, French, Portuguese, Spanish, Arabic
  • Programming: Python, PHP, bash, XML+, HTML+
  • Working with CAT tools and XLIFF since 2010 approx.

cApStAn Linguistic Quality Control

  • Founded in Belgium in 2000
  • Methodology for standardised evaluation of translation quality
  • Offices in Brussels and Philadelphia
  • Close cooperation with academic world

Scenarios / use cases

OmegaT

OmegaT is the CAT tool we use as XLIFF editor. It is:

  • Free software, hence convenient for low-budget projects
  • Filters for more than 30 file formats (XLIFF, DOCX, etc):
    • XLIFF 1.2 (off the shelf -- only extracts the target)
    • XLIFF 1.2 and 2.0 (Okapi filter plugin - bilingual)
  • Open source: you may modify the code (or hire a developer to do it for you) to suit your own requirements
  • Customisable and expandable by means of scripts/macros and plugins

XLIFF created by third-parties

  • Third-party organizations (content creators or platform developers):
    • Create XLIFF files (often with home-made routines)
  • Our role in those cases is to provide:
    • internationalization consultancy (e.g. XLIFF preparation best practices), feedback and advanced CAT tool support for the people who prepare the files
    • CAT tool support to users
      • installation and customization
      • training
      • translation and revision guides
      • helpdesk

XLIFF created by my team

  • Third-party organizations (content creators or platform developers):
    • Provide source content
  • Our role in those cases is to provide:
    • internationalization consultancy about the best way to produce source content
    • localization engineering (creating XLIFF files and translation packages)
      • Okapi for full workflow automation in the server
      • memoQ for better control over file preparation (e.g. more sophisticated filters)
    • CAT tool support to users (...)

no XLIFF

  • Roles as in the previous two scenarios...
  • XLIFF makes sense to ensure interoperability but not so much when we provide the CAT tool / license.
  • Relative added value (other than interoperability):
    • Pros:
      • segment IDs guarantee ICE (in-context exact) matches
      • protection of source files
      • segment status (no support in OmegaT)
    • Cons:
      • one extra layer of complexity
      • no direct preview

Source-language survey

Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue

And which, if any, of these have you actively taken part in?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue

trans-unit: id="abc123"

trans-unit: id="xyz567"
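
The two trans-units above can be sketched as an XLIFF 1.2 fragment read with Python's standard library. This is only an illustrative sketch: the file name `survey.html`, the helper names `load_units` and `ice_match`, and the assignment of each Greek rendering to a particular id are assumptions made for the example, not part of the original project. It shows why the segment ID matters: both units share the same source text, yet each needs its own translation.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, mapped to the prefix "x" for path lookups.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical file: same source text twice, told apart only by the id.
XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="survey.html" source-language="en" target-language="el" datatype="html">
    <body>
      <trans-unit id="abc123">
        <source>Petitions to the European Parliament</source>
        <target>Τα αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο</target>
      </trans-unit>
      <trans-unit id="xyz567">
        <source>Petitions to the European Parliament</source>
        <target>Σε αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def load_units(xliff_text):
    """Return {unit id: (source text, target text)} for every trans-unit."""
    root = ET.fromstring(xliff_text)
    units = {}
    for tu in root.iterfind(".//x:trans-unit", NS):
        source = tu.find("x:source", NS).text
        target = tu.find("x:target", NS).text
        units[tu.get("id")] = (source, target)
    return units


def ice_match(units, unit_id, source):
    """Return the stored target only if both the id and the source text match."""
    entry = units.get(unit_id)
    if entry is not None and entry[0] == source:
        return entry[1]
    return None


units = load_units(XLIFF)
for unit_id, (source, target) in units.items():
    print(unit_id, "|", source, "|", target)
```

A lookup keyed on source text alone could not tell these two segments apart; keying on the id as well is what makes the match in-context.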

Greek translation

Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue

And which, if any, of these have you actively taken part in?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue
  • Τα αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο
  • Σε αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο

Czech translation

Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue

And which, if any, of these have you actively taken part in?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue
  • O peticích určených Evropskému parlamentu
  • Peticí určených Evropskému parlamentu

Hungarian translation

Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue

And which, if any, of these have you actively taken part in?

  • Events or online activities organised by together.eu
  • Petitions to the European Parliament
  • Contacting an MEP about an issue
  • Az Európai Parlamenthez benyújtott petíciók
  • Az Európai Parlamenthez benyújtott petíciókban

Challenges

Multi-disciplinary teams

  • People at the organizations that create the XLIFF files we must work with lack the localization engineering know-how they need:
    • to develop translation-friendly authoring tools
    • to create translation-friendly XLIFF files
    • to understand our feedback
  • They are often very reluctant to re-think their tools and content to introduce localization best practices
  • We spend a lot of time testing and making our feedback accessible to them
  • We might still end up working in sub-optimal conditions (valid but poorly crafted XLIFF)

Technology

The open-source CAT tools and localization toolkits we use are often not optimal, and improving them is not easy.

  • Open-source promise: "Just hire a developer to do what you need"
  • In practice:
    • Developers who know the code base
      • are not easy to find
      • don't have capacity
      • have prohibitive prices
    • Other developers
      • are not reliable and might introduce bugs

Technology

Commercial tools (memoQ, Trados, Swordfish*):

  • tend to be more mature in terms of modern translation best practices
  • manufacturers react more quickly if a bug is found
  • not affordable as workstation tools for linguists in some low-budget projects
    • we often use them as localization toolkits to generate XLIFF files that we then translate/revise in OmegaT

 

Proposals / Ideas

A second level of validation??

XLIFF files

  • can validate in an XLIFF checker or against the DTD
  • while being totally unworkable

Could the spec include constraints that prevent bad practices, for example around:

  • line breaks and whitespace
  • leading/trailing tags
  • inline codes

Tech fiction? + I am aware these deficiencies are probably very rare in the localization industry.
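
As a rough illustration of what such a second level of validation might look like, here is a minimal sketch of a "lint" pass over XLIFF 1.2 sources, written with Python's standard library. The rule set, the issue messages, the sample file and the names `text_pieces`, `lint_source` and `lint_xliff` are all assumptions made for this example; they are not taken from the XLIFF specification or from any existing checker.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# XLIFF 1.2 inline elements whose content is native code, not translatable text.
CODE_TAGS = {"ph", "bpt", "ept", "it", "x", "bx", "ex"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def text_pieces(elem):
    """Yield translatable text in document order, skipping code content."""
    yield elem.text or ""
    for child in elem:
        if local_name(child.tag) not in CODE_TAGS:
            yield from text_pieces(child)  # e.g. <g> or <mrk> wrapping text
        yield child.tail or ""


def lint_source(source):
    """Return a list of (assumed) bad-practice findings for one <source>."""
    issues = []
    children = list(source)
    flat = "".join(text_pieces(source))
    if "\n" in flat:
        issues.append("line break inside segment")
    if flat != flat.strip():
        issues.append("leading or trailing whitespace")
    if children and not (source.text or "").strip():
        issues.append("leading inline tag")
    if children and not (children[-1].tail or "").strip():
        issues.append("trailing inline tag")
    if children and not flat.strip():
        issues.append("inline codes only, no translatable text")
    return issues


def lint_xliff(xliff_text):
    """Return {trans-unit id: [issues]} for units with at least one finding."""
    root = ET.fromstring(xliff_text)
    report = {}
    for tu in root.iter("{%s}trans-unit" % XLIFF_NS):
        source = tu.find("{%s}source" % XLIFF_NS)
        if source is None:
            continue
        issues = lint_source(source)
        if issues:
            report[tu.get("id")] = issues
    return report


# Hypothetical sample: a schema-valid file that is still awkward to translate.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.html" source-language="en" datatype="html">
    <body>
      <trans-unit id="ok"><source>Contacting an MEP about an issue</source></trans-unit>
      <trans-unit id="ws"><source>  Petitions to the
European Parliament </source></trans-unit>
      <trans-unit id="tags"><source><g id="1">Petitions to the European Parliament</g></source></trans-unit>
    </body>
  </file>
</xliff>"""

report = lint_xliff(SAMPLE)
for unit_id, issues in report.items():
    print(unit_id, "->", "; ".join(issues))
```

All three sample units would pass schema validation; the point of the sketch is that checks of this kind sit on top of it and flag the units that are valid but hard to work with.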

manuel.souto@capstan.be

OASIS XLIFF TC

By cApStAn LQC
