cApStAn joins the OASIS XLIFF TC :)
Contents
- Introduce myself
- How I use XLIFF
- Challenges faced when working with XLIFF
- Proposals/ideas for the TC/new version
- XLIFF versions
Manuel Souto Pico
Education:
- Translation and applied linguistics
Work:
- Software localization testing engineer - Lionbridge Tampere
- Terminologist and technical support specialist - STAR Spain
- Language coordinator - Lionbridge Dublin
- Translation technologist - cApStAn
Skills:
- Languages: English, French, Portuguese, Spanish, Arabic
- Programming: Python, PHP, bash, XML+, HTML+
- Working with CAT tools and XLIFF since 2010 approx.
cApStAn Linguistic Quality Control
- Founded in Belgium in 2000
- Methodology for standardised evaluation of translation quality
- Offices in Brussels and Philadelphia
- Close cooperation with academic world
Scenarios / use cases
OmegaT
OmegaT is the CAT tool we use as XLIFF editor. It is:
- Free software, hence convenient for low-budget projects
- Equipped with filters for more than 30 file formats (XLIFF, DOCX, etc.):
- XLIFF 1.2 (off the shelf -- only extracts the target)
- XLIFF 1.2 and 2.0 (Okapi filter plugin - bilingual)
- Open source: you may modify the code (or hire a developer to do it for you) to suit your own requirements
- Customisable and expandable by means of scripts/macros and plugins
XLIFF created by third-parties
- Third-party organizations (content creators or platform developers):
- Create XLIFF files (often with home-made routines)
- Our role in those cases is to provide:
- internationalization consultancy (e.g. XLIFF preparation best practices), feedback and advanced CAT tool support to the people who prepare the files
- CAT tool support to users
- installation and customization
- training
- translation and revision guides
- helpdesk
XLIFF created by my team
- Third-party organizations (content creators or platform developers):
- Provide source content
- Our role in those cases is to provide:
- internationalization consultancy about the best way to produce source content
- localization engineering (creating XLIFF files and translation packages)
- Okapi for full workflow automation in the server
- memoQ for better control over file preparation (e.g. more sophisticated filters)
- CAT tool support to users (...)
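As a rough illustration of the "creating XLIFF files" step, here is a minimal sketch in Python that wraps a list of source strings into an XLIFF 1.2 skeleton using only the standard library. It is not how Okapi or memoQ do it; the function name `build_xliff`, the `original` value and the `plaintext` datatype are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(segments, original, source_lang, target_lang):
    """Return an XLIFF 1.2 document (as a string) with one trans-unit per segment.

    `segments` is an iterable of (unit_id, source_text) pairs. The ids are
    supplied by the caller so that they stay stable across file versions.
    """
    # Serialize XLIFF elements in the default namespace (no prefix).
    ET.register_namespace("", XLIFF_NS)
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,          # hypothetical source file name
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",       # assumption: plain-text content
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, text in segments:
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        source = ET.SubElement(unit, f"{{{XLIFF_NS}}}source")
        source.text = text
    return ET.tostring(root, encoding="unicode")
```

Keeping the ids under the caller's control, rather than generating them from position, is what later allows translations to be matched back by segment ID.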
No XLIFF
- Roles as in the previous two scenarios...
- XLIFF makes sense to ensure interoperability but not so much when we provide the CAT tool / license.
- Relative added value (other than interoperability):
- Pros:
- segment ID guarantees ICE translations
- protection of source files
- segment status (no support in OmegaT)
- Cons:
- one extra layer of complexity
- no direct preview
Source-language survey
Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
And which, if any, of these have you actively taken part in?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
trans-unit: id="abc123"
trans-unit: id="xyz567"
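The point of the two ids can be sketched in a few lines of Python: the same source string appears twice in the survey, and only the segment ID tells the two occurrences apart. The XLIFF 1.2 snippet below is a hand-made illustration; which id belongs to which occurrence, and the pairing with the Greek targets shown on the next slide, are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical file: two trans-units with identical source text.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="survey" source-language="en" target-language="el" datatype="plaintext">
    <body>
      <trans-unit id="abc123">
        <source>Petitions to the European Parliament</source>
        <target>Τα αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο</target>
      </trans-unit>
      <trans-unit id="xyz567">
        <source>Petitions to the European Parliament</source>
        <target>Σε αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def targets_by_id(xliff_text):
    """Map each trans-unit id to its target text (one entry per segment)."""
    root = ET.fromstring(xliff_text)
    return {
        unit.get("id"): unit.findtext("x:target", namespaces=NS)
        for unit in root.iterfind(".//x:trans-unit", NS)
    }


def targets_by_source(xliff_text):
    """Map source text to target text; identical sources overwrite each other."""
    root = ET.fromstring(xliff_text)
    return {
        unit.findtext("x:source", namespaces=NS): unit.findtext("x:target", namespaces=NS)
        for unit in root.iterfind(".//x:trans-unit", NS)
    }


by_id = targets_by_id(SAMPLE)
by_source = targets_by_source(SAMPLE)
print(len(by_id), len(by_source))
```

Keyed by id, both context-dependent translations survive; keyed by source text alone, one of them is lost, which is why the segment ID is what makes an in-context exact (ICE) match possible.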
Greek translation
Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
And which, if any, of these have you actively taken part in?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
- Τα αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο
- Σε αιτήματα προς το Ευρωπαϊκό Κοινοβούλιο
Czech translation
Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
And which, if any, of these have you actively taken part in?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
- O peticích určených Evropskému parlamentu
- Peticí určených Evropskému parlamentu
Hungarian translation
Below are some ways citizens can get involved in the work of the European Union. Which, if any, have you heard of?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
And which, if any, of these have you actively taken part in?
- Events or online activities organised by together.eu
- Petitions to the European Parliament
- Contacting an MEP about an issue
- Az Európai Parlamenthez benyújtott petíciók
- Az Európai Parlamenthez benyújtott petíciókban
Challenges
Multi-disciplinary teams
- People at the organizations that create the XLIFF files we must work with do not have the localization engineering know-how they need:
- to develop translation-friendly authoring tools
- to create translation-friendly XLIFF files
- to understand our feedback
- They are often very reluctant to re-think their tools and content to introduce localization best practices
- We spend a lot of time on testing and on making our feedback accessible to them
- We might still end up working in sub-optimal conditions (valid XLIFF but poorly crafted)
Technology
The open-source CAT tools and localization toolkits we use are often not optimal, and improving them is not easy.
- Open-source promise: "Just hire a developer to do what you need"
- In practice:
- Developers who know the code base
- are not easy to find
- don't have capacity
- have prohibitive prices
- Other developers
- are not reliable and might introduce bugs
Technology
Commercial tools (memoQ, Trados, Swordfish*):
- tend to be more mature in terms of modern translation best practices
- manufacturers react more quickly if a bug is found
- not affordable as workstation tools for linguists in some low-budget projects
- we often use them as localization toolkits to generate XLIFF files that we then translate/revise in OmegaT
Proposals / Ideas
A second level of validation??
XLIFF files
- can validate in an XLIFF checker or against the DTD
- while being totally unworkable
Could the specs include some constraints that prevent bad practices? like:
- line breaks and whitespace
- leading/trailing tags
- inline codes
Tech fiction? + I am aware these deficiencies are probably very rare in the localization industry.
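To make the idea of a second level of validation more concrete, here is a minimal sketch in Python of a checker that goes beyond schema validity for XLIFF 1.2 files. The three rules it applies (line breaks inside a segment, leading/trailing whitespace, inline codes at the very start or end of a segment) are one possible reading of the list above, not anything defined in the spec, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.2"
# XLIFF 1.2 inline elements that can appear inside <source>.
INLINE = {f"{{{NS}}}{name}" for name in ("g", "x", "bx", "ex", "ph", "bpt", "ept", "it")}


def check_source(source):
    """Return a list of warnings for one <source> element."""
    warnings = []
    children = list(source)
    full = "".join(source.itertext())  # all text, including text inside codes
    if "\n" in full:
        warnings.append("line break inside segment")
    if full != full.strip():
        warnings.append("leading/trailing whitespace")
    if children:
        # No translatable text before the first inline code.
        if not (source.text or "").strip() and children[0].tag in INLINE:
            warnings.append("leading inline code")
        # No translatable text after the last inline code.
        if not (children[-1].tail or "").strip() and children[-1].tag in INLINE:
            warnings.append("trailing inline code")
    return warnings


def check_xliff(xliff_text):
    """Map trans-unit id -> warnings, listing only units that have any."""
    root = ET.fromstring(xliff_text)
    report = {}
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        source = unit.find(f"{{{NS}}}source")
        if source is None:
            continue
        warnings = check_source(source)
        if warnings:
            report[unit.get("id")] = warnings
    return report
```

A file can pass schema validation and still produce a long report here; that gap between "valid" and "workable" is what such constraints would aim to close.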
manuel.souto@capstan.be
OASIS XLIFF TC
By cApStAn LQC