OmegaT (2)
Standard functions for translation and revision
If you have downloaded this presentation and are looking at it offline, the latest version (in case of any updates) is always available at:
OmegaT training for ETS
Roles and responsibilities
- cApStAn prepares and provides project templates
- Engineers prepare projects based on those templates:
  - Adding files
  - Defining project name, language codes, etc.
- PMs need to edit translations
  - in the translations of the project (working TM)
  - in a reference TM added to the project
- PMs need to update source files
- cApStAn provides training and technical support
Is your OmegaT customization up to date?
## Update 73_cs0 (2021-11-29)
Contents
- Unpacking / exporting / packing
- GUI and editor
  - Project files, segment properties, notes, comments
- Matches
  - Assessing, inserting, replacing
- Repetitions
  - Auto-propagation, alternative/multiple transl.
- Tags
  - Inserting, moving, fixing
- QA checks
  - Spell checking, completion, LT, tags, glossary
Clicking is handy but...
shortcuts save a lot of time
Ctrl+S to save
Ctrl+C to copy
Ctrl+X to cut (copy and remove)
Ctrl+V to paste
Build some muscle memory
We propose just a few shortcuts to remember:
- Ctrl+L
- Ctrl+J
- Ctrl+T
- Ctrl+D
- Ctrl+I
- Ctrl+Space
Un/pack
Open/Close
Download
A more mundane question
If an OmegaT project is a folder which contains other folders and files...
then a project package is...
[Diagram labels: unpack, pack, close; incoming, outgoing, working folder, volatile memory]
Both the project package and the project itself have the same content. Confusing?
Unpack and pack the project
[Diagram label: open]
Unpack and pack+delete
[Diagram labels: open, pack & delete; incoming, outgoing, working folder, volatile memory]
Alternatively, you can delete the project when you pack it.
Unpacking options
- When unpacking
  - Do you want to delete the OMT package?
- When packing
  - The OMT package is saved inside the project folder
  - The OMT package will not include:
    - OMT/Zip/Rar/7z packages
    - Master TMs and target files
    - Hidden files, etc.
- When packing and deleting
  - The OMT package is saved in the parent folder (at the same level as the project itself)
If the user happens to unpack this OMT file from inside the project, they will have to choose which translation must be kept.
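The packing rules above can be sketched in a few lines of Python. This is only an illustration, not the code behind the Pack command: it assumes that an OMT package is an ordinary zip archive and that target files live in the project's `target` folder. The names `is_packable` and `pack_project` are hypothetical. Master TMs are not handled because the slides do not say where they are stored.

```python
import shutil
import zipfile
from pathlib import Path

# Assumption: these extensions identify packages that must not be re-packed.
PACKAGE_SUFFIXES = {".omt", ".zip", ".rar", ".7z"}


def is_packable(path: Path, project_dir: Path) -> bool:
    """Return True if the file should be included in the package."""
    relative = path.relative_to(project_dir)
    # Skip hidden files and anything stored inside a hidden folder.
    if any(part.startswith(".") for part in relative.parts):
        return False
    # Skip nested packages (OMT/Zip/Rar/7z).
    if path.suffix.lower() in PACKAGE_SUFFIXES:
        return False
    # Assumption: target files are the ones under the top-level "target" folder.
    if relative.parts[0] == "target":
        return False
    return True


def pack_project(project_dir: Path, delete_after: bool = False) -> Path:
    """Zip the project; optionally delete the project folder afterwards."""
    project_dir = project_dir.resolve()
    # Plain packing saves inside the project; pack+delete saves next to it.
    out_dir = project_dir.parent if delete_after else project_dir
    package = out_dir / f"{project_dir.name}.omt"
    files = [
        p for p in sorted(project_dir.rglob("*"))
        if p.is_file() and is_packable(p, project_dir)
    ]
    with zipfile.ZipFile(package, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, path.relative_to(project_dir).as_posix())
    if delete_after:
        shutil.rmtree(project_dir)
    return package
```

Calling `pack_project(Path("my_project"))` would leave `my_project.omt` inside the project folder, while `delete_after=True` would write it one level up and remove the folder, which matches the two locations described above.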
Interface
Editor
Panes
Segment properties
- File name
- ID (if set)
- Duplicate status
- Resource name
- Notes
- Author of the first translation
- Date/time of the first translation
- Author of the last edit
- Date/time of the last edit
- Match type
source text
target text
Match type
Duplicate status
Project Files pane
filename of each file
filter used
encoding
segment counts
progress tracking
Project Files pane
This dialog will always tell you in which file you are.
Ctrl+L
to show the list of files
Moving through the project
Enter
Ctrl+Enter
Forward/backward keys:
Go-to functions / shortcuts
- Ctrl+U
  - Go to next untranslated segment
  - When you are translating
- Ctrl+Shift+U
  - Go to next translated segment
  - When you are reviewing (but not all is translated)
- Ctrl+J
  - Go to a specific segment by its number
- Ctrl+Shift+P
  - Back/Forward in history
  - Very useful if you want to go back to where you were
If you don't remember a shortcut, you may check the Go To menu.
Ctrl+J
go to specific segment by number
Create
translated
documents
Generate target files / export
- To be done always before any preview
- Ctrl+D
  - Generate all target files in the project
  - Easy to remember
  - Slow if you have many files or if files are big
- Ctrl+Shift+D
  - Generate current target file
  - Faster, so more practical
Ctrl+D
create translated documents
Ctrl+Shift+D
create current translated document
Matches
Types of matches
- Exact match
Strongly agree. <segment 0001>
-------------------------------------------------
Strongly agree.
<100/100/100%>
- Fuzzy match
Strongly agree. <segment 0009>
-------------------------------------------------
Strongly agreedisagree.
<50/50/75%>
- In-context exact (ICE)
Match similarity metric
- Three percentages:
- Stemmed without stopwords
- Inflected form without stopwords
- All tokens (inflected forms and stopwords)
Differences are conveniently shown in diff mode.
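To make the three percentages more concrete, here is a minimal sketch of a token-based similarity score. It is not OmegaT's actual implementation: the stopword list, the naive `stem` function and the tokenization rules are assumptions chosen for illustration. With these assumptions, comparing "Strongly agree." with "Strongly disagree." gives the 50/50/75 scores shown in the fuzzy match example.

```python
import re

# Assumption: a tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "to"}


def stem(word):
    """Hypothetical stand-in for a real stemmer: strips a plural 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def words(text, stemmed):
    """Lowercased word tokens without stopwords, optionally stemmed."""
    tokens = [w.lower() for w in re.findall(r"\w+", text)]
    tokens = [w for w in tokens if w not in STOPWORDS]
    return [stem(w) for w in tokens] if stemmed else tokens


def all_tokens(text):
    """Words, whitespace runs and punctuation marks each count as a token."""
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Percentage of tokens left untouched by the cheapest edit script."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return 100 * (longest - edit_distance(a, b)) // longest


def match_scores(segment, candidate):
    """Return (stemmed, unstemmed, all-token) similarity percentages."""
    return (
        similarity(words(segment, True), words(candidate, True)),
        similarity(words(segment, False), words(candidate, False)),
        similarity(all_tokens(segment), all_tokens(candidate)),
    )
```

The third score is higher in that example because the space and the full stop also count as matching tokens, so one changed token out of four weighs less than one changed word out of two.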
Types of match behavior
- A reference match (of any score)
  - Must be inserted manually
- An auto-populated match (exact)
  - It is inserted automatically when the project loads...
  - ... as long as the segment is not translated
- An enforced match (exact)
  - It is inserted automatically when the project loads...
  - ... replacing any text in the segment.
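The three behaviors can be summarized as a small decision function. This is a sketch of the rules listed above, not of OmegaT's internals; `Behavior`, `Match` and `apply_on_load` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Behavior(Enum):
    REFERENCE = "reference"
    AUTO = "auto-populated"
    ENFORCED = "enforced"


@dataclass
class Match:
    target: str
    score: int  # 100 means an exact match
    behavior: Behavior


def apply_on_load(current: Optional[str], match: Match) -> Optional[str]:
    """Return the segment's translation after the project has loaded."""
    exact = match.score == 100
    if match.behavior is Behavior.ENFORCED and exact:
        # Enforced: replaces any text already in the segment.
        return match.target
    if match.behavior is Behavior.AUTO and exact and not current:
        # Auto-populated: only fills segments that are still untranslated.
        return match.target
    # Reference matches are never inserted automatically.
    return current
```

For instance, an existing translation survives an auto-populated match but is overwritten by an enforced one, and a reference match leaves the segment as it was in both cases.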
Automatic leverage
Manual leverage
- Matches from /tm must be leveraged manually
- Exact matches: probably do not require updates
- Fuzzy matches: probably do require updates
- Updates:
  - Inserting full match + editing/overwriting invalid part
  - Selection of valid part + insertion + editing
  - Automatically!
Carrying over matches
A match must be active so that it can be carried over.
- Ctrl+#
  - Activates the match numbered with #
An active match can be carried over to the translation in 2 ways.
- Ctrl+R
  - Replaces the translation with the match or the selected part
- Ctrl+I
  - Inserts the match or the selected part
  - It can be used to replace in combination with Ctrl+A
Ctrl+I
insert match or selection
Ctrl+R
replace with match or selection
Ctrl+#
activate match by number (1-5)
Repetitions
Email message | |
Subject: | |
Fw: Summer Streets | |
Dear Subscriber | |
... | ... |
Library Catalog | Catálogo bibliotecario |
genetically modified food | |
Food for the world | |
Subject: | |
biotechnology | |
Modern farming |
files/editor
working TM
file 1
file 2
Library Catalog | Catálogo bibliotecario |
Tema:
Subject:
Tema:
Subject:
Auto-propagation
Email message | |
Subject: | |
Fw: Summer Streets | |
Dear Subscriber | |
... | ... |
Library Catalog | Catálogo bibliotecario |
genetically modified food | |
Food for the world | |
Subject: | |
biotechnology | |
Modern farming |
files/editor
working TM
Library Catalog | Catálogo bibliotecario |
Tema:
Subject:
Tema:
Subject:
Tema:
Tema:
Tema:
Auto-propagation
Email message | |
Subject: | |
Fw: Summer Streets | |
Dear Subscriber | |
... | ... |
Library Catalog | Catálogo bibliotecario |
genetically modified food | |
Food for the world | |
Subject: | |
biotechnology | |
Modern farming |
files/editor
working TM
Library Catalog | Catálogo bibliotecario |
Subject:
Tema:
Subject:
Tema:
Tema:
Tema:
Auto-propagation
Email message | |
Subject: | |
Fw: Summer Streets | |
Dear Subscriber | |
... | ... |
Library Catalog | Catálogo bibliotecario |
genetically modified food | |
Food for the world | |
Subject: | |
biotechnology | |
Modern farming |
files/editor
working TM
Library Catalog | Catálogo bibliotecario |
Subject:
Tema:
Subject:
Context determines meaning
Tema:
Tema:
Tema:
The subject field of a book
The topic of an email
Email message | |
Subject: | |
Fw: Summer Streets | |
Dear Subscriber | |
... | ... |
Library Catalog | Catálogo bibliotecario |
genetically modified food | |
Food for the world | |
Subject: | |
biotechnology | |
Modern farming |
files/editor
working TM
Library Catalog | Catálogo bibliotecario |
Subject:
Tema:
Subject:
Alternative translations
Tema:
Tema:
Tema:
Create alternative translation
Asunto:
Subject:
Asunto:
prev: Email message
next: Fw: Summer Streets
Alternative translations are like crossing the street.
If you proceed with caution you should be fine ;)
In-context exact (ICE) match
- Alternative translations rely on a context-wide match
- Source text
- Previous and next source texts
- File name
Not very robust if other repetitions have the same context.
ID-bound match (XLIFF)
- Alternative translations based on stricter kinds of context
- Source text
- Segment ID
- File name
For formats that allow it (e.g. XLIFF), we can use segment IDs to create a unique context.
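The difference between a default translation (auto-propagated to every repetition) and an alternative translation (bound to one context) can be sketched with two lookup tables. This is an illustrative model only: `Segment`, `context_key`, `WorkingTM` and the file names used in the usage note are hypothetical, and the real working TM is not a Python dictionary.

```python
from typing import Dict, NamedTuple, Optional


class Segment(NamedTuple):
    file: str
    source: str
    prev: str = ""
    next: str = ""
    seg_id: Optional[str] = None  # available in formats such as XLIFF


def context_key(seg: Segment) -> tuple:
    """Build the context that makes an alternative translation unique."""
    if seg.seg_id is not None:
        # ID-bound match: source text, segment ID and file name.
        return (seg.file, seg.source, seg.seg_id)
    # ICE match: source text, previous/next source texts and file name.
    return (seg.file, seg.source, seg.prev, seg.next)


class WorkingTM:
    def __init__(self) -> None:
        self.default: Dict[str, str] = {}
        self.alternative: Dict[tuple, str] = {}

    def translate(self, seg: Segment, text: str, alternative: bool = False) -> None:
        if alternative:
            self.alternative[context_key(seg)] = text
        else:
            # Keyed by source text only, so every repetition receives it.
            self.default[seg.source] = text

    def restore_default(self, seg: Segment) -> None:
        """Drop the alternative so the default translation applies again."""
        self.alternative.pop(context_key(seg), None)

    def lookup(self, seg: Segment) -> Optional[str]:
        alt = self.alternative.get(context_key(seg))
        return alt if alt is not None else self.default.get(seg.source)
```

In the example from the slides, translating the catalog's "Subject:" as "Tema:" makes the email's "Subject:" return "Tema:" as well. Saving "Asunto:" as an alternative for the email segment (prev: "Email message", next: "Fw: Summer Streets") changes only that occurrence. The sketch also shows why the ICE key is fragile: two repetitions with the same neighbours in the same file produce the same key.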
Create alternative translation
A default translation exists.
Right click the segment and select "Create Alternative Translation"
Edit the translation, then press Ctrl+S to save.
When you save, the new alternative translation will appear in the Multiple Translations pane.
Restore default translation
An alternative translation exists.
Delete the translation, then press Ctrl+S to save.
When you save, the default translation appears in the segment.
And the alternative translation disappears from the Multiple Translations pane.
Tags
(placeables)
Tags stand for codes/markup
- Structural markup/codes define the different parts of the document and how they are arranged together. Parts can be paragraph, list, table, footnote, title, etc.
  - Some of those parts contain text. Text might contain some embedded elements, like icons, buttons, etc.
- Formatting markup/codes define how the text is styled or decorated. Styles include bold, italics, underlining, font-family, colour, size, etc.
  - These codes embed the formatted text, which can be a whole sentence or part of a sentence.
Let's see an example: a web page
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title></title>
<link rel="stylesheet" href="github-markdown.css">
</head>
<body class="markdown-body">
<h1>Page about OmegaT</h1>
<p>This is <i>just</i> a draft. Stay tuned ;)</p>
<div id="mc_embed_signup">
<form action="https://capstan.us18.list-manage.com" method="post">
<div id="mc_embed_signup_scroll">
<label for="mce-EMAIL">Subscribe</label>
<input type="email" value="" name="EMAIL" class="email" id="mce-EMAIL" placeholder="email address" required>
<!-- do not remove this or risk form bot signups-->
<div class="clear"><input type="submit" value="Subscribe" class="button"></div>
</div>
</form>
</div>
</body>
</html>
Namely, an HTML file to translate...
<h1>Page about OmegaT</h1>
<p>This is <i>just</i> a draft. Stay tuned ;)</p>
<label for="mce-EMAIL">Subscribe</label>
Page about OmegaT
This is <i>just</i> a draft. Stay tuned ;)
Subscribe
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
<file original="03_index.html" source-language="en-US" target-language="fr-FR" datatype="html">
<body>
<trans-unit id="tu2" restype="x-h1">
<source xml:lang="en-US">Page about OmegaT</source>
<target xml:lang="fr-FR">Page about OmegaT</target>
</trans-unit>
<trans-unit id="tu3" restype="x-paragraph">
<source xml:lang="en-US">This is <g id="1">just</g> a draft. Stay tuned ;)</source>
<target xml:lang="fr-FR">This is <g id="1">just</g> a draft. Stay tuned ;)</target>
</trans-unit>
<trans-unit id="tu4">
<source xml:lang="en-US"><g id="1">Subscribe</g> <x id="2"/> <x id="3"/></source>
<target xml:lang="fr-FR"><g id="1">Subscribe</g> <x id="2"/> <x id="3"/></target>
</trans-unit>
<trans-unit id="tu5">
<source xml:lang="en-US"><x id="1"/></source>
<target xml:lang="fr-FR"><x id="1"/></target>
</trans-unit>
<trans-unit id="tu7" restype="x-value">
<source xml:lang="en-US">Subscribe</source>
<target xml:lang="fr-FR">Subscribe</target>
</trans-unit>
<trans-unit id="tu6">
<source xml:lang="en-US"><x id="1"/></source>
<target xml:lang="fr-FR"><x id="1"/></target>
</trans-unit>
</body>
</file>
</xliff>
This is the XLIFF used to translate that HTML file
from which OmegaT will only extract the target text.
Page about OmegaT
This is <g id="1">just</g> a draft. Stay tuned ;)
Subscribe
Subscribe
This is what it looks like when you add the file to the OmegaT project.
Only inline tags need to be placed in the translation.
Leading and trailing tags do not need to be extracted/placed.
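The extraction step described above can be approximated with Python's standard XML parser. This is a rough sketch, not the XLIFF filter that OmegaT uses: `render` and `target_segments` are hypothetical helpers, the sample is a shortened version of the file shown earlier, and skipping tag-only units is an assumption based on the segments listed on the slide. The sketch does not strip leading and trailing tags.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Shortened sample based on the XLIFF file shown above.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file original="03_index.html" source-language="en-US" target-language="fr-FR" datatype="html">
<body>
<trans-unit id="tu2"><source>Page about OmegaT</source><target>Page about OmegaT</target></trans-unit>
<trans-unit id="tu3"><source>This is <g id="1">just</g> a draft.</source><target>This is <g id="1">just</g> a draft.</target></trans-unit>
<trans-unit id="tu5"><source><x id="1"/></source><target><x id="1"/></target></trans-unit>
</body>
</file>
</xliff>"""


def local(tag):
    """Strip the namespace from an element name."""
    return tag.rsplit("}", 1)[-1]


def render(elem):
    """Flatten a <target> element, keeping inline codes visible as tags."""
    parts = [elem.text or ""]
    for child in elem:
        name, code_id = local(child.tag), child.get("id")
        if name == "x":
            # Standalone placeholder code.
            parts.append(f'<{name} id="{code_id}"/>')
        else:
            # Paired code such as <g>, which embeds text.
            parts.append(f'<{name} id="{code_id}">{render(child)}</{name}>')
        parts.append(child.tail or "")
    return "".join(parts)


def target_segments(xliff_text):
    """Return (unit id, target text) pairs for units that contain text."""
    root = ET.fromstring(xliff_text)
    segments = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = unit.find("x:target", XLIFF_NS)
        if target is None or not "".join(target.itertext()).strip():
            continue  # assumption: tag-only units have nothing to translate
        segments.append((unit.get("id"), render(target)))
    return segments
```

Running `target_segments(SAMPLE)` returns the heading and the paragraph with its inline `<g id="1">` pair, and leaves out the unit that contains only a placeholder.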
Tags in PISA XLIFF files
- Tags stand for formatting codes.
- XLIFF files contain escaped HTML, not real OmegaT tags (customization is necessary to lock them).
- The escaped tags are converted to HTML code in preview.
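A short sketch of what "escaped HTML" means in practice, using only Python's standard `html` module. The function names are hypothetical and the round trip is an illustration of the idea, not of the actual PISA tooling.

```python
import html
import re

# Matches opening and closing HTML tags such as <b> or </b>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def escape_for_xliff(markup):
    """Store HTML markup as plain text inside a <source>/<target> element."""
    return html.escape(markup, quote=False)


def preview(segment_text):
    """Turn the escaped tags back into HTML code, as the preview does."""
    return html.unescape(segment_text)


def visible_tags(segment_text):
    """List the HTML tags that the translator sees as ordinary text."""
    return TAG.findall(html.unescape(segment_text))
```

For example, `escape_for_xliff("median <b>wind</b> capacity")` yields `median &lt;b&gt;wind&lt;/b&gt; capacity`. To the XML parser that is just text, which is why such tags are not real OmegaT tags unless the customization locks them.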
<!DOCTYPE html>
<html lang="en" dir="ltr">
<body class="markdown-body">
<table class="greytable">
<tr>
<th class="alignLeft" id="Q03tr1th1">Statement</th>
<th id="Q03tr1th2" class="center">True</th>
<th id="Q03tr1th3" class="center">False</th>
</tr>
<tr>
<td id="Q03tr2td1">The red line for median <b>solar</b> capacity would move to the left.</td>
<td id="Q03tr2td2"><input type="radio" id="M105Q03RADIO_1_1" value="0" /></td>
<td id="Q03tr2td3"><input type="radio" id="M105Q03RADIO_1_2" value="1" /></td>
</tr>
<tr>
<td id="Q03tr3td1">The red line for median <b>wind</b> capacity would move down.</td>
<td id="Q03tr3td2"><input type="radio" id="M105Q03RADIO_2_1" value="0" /></td>
<td id="Q03tr3td3"><input type="radio" id="M105Q03RADIO_2_2" value="1" /></td>
</tr>
</table>
</body>
</html>
Now, let's see a PISA unit.
Again, we want to translate only the text.
<th class="alignLeft" id="Q03tr1th1">Statement</th>
<th id="Q03tr1th2" class="center">True</th>
<th id="Q03tr1th3" class="center">False</th>
<td id="Q03tr2td1">The red line for median <b>solar</b> capacity would move to the left.</td>
<td id="Q03tr3td1">The red line for median <b>wind</b> capacity would move down.</td>
Statement
True
False
The red line for median <b>solar</b> capacity would move to the left.
The red line for median <b>wind</b> capacity would move down.
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:utt="http://www.ets.org/utt" version="1.0">
<file datatype="plaintext" xml:space="preserve" original="stimulus3.html" source-language="eng-ZZZ" target-language="esp-URY">
<body>
<trans-unit id="M105_question3_Q03tr1th1_5c9d2136724fa6.98760829">
<source xml:lang="eng-ZZZ">Statement</source>
<target xml:lang="esp-URY">Statement</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr1th2_5c9d2136725139.12753161">
<source xml:lang="eng-ZZZ">True</source>
<target xml:lang="esp-URY">True</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr1th3_5c9d2136725238.08930641">
<source xml:lang="eng-ZZZ">False</source>
<target xml:lang="esp-URY">False</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr2td1_5c9d2136725402.17773132">
<source xml:lang="eng-ZZZ">The red line for median <b>solar</b> capacity would move to the left.</source>
<target xml:lang="esp-URY">The red line for median <b>solar</b> capacity would move to the left.</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr3td1_5c9d2136725782.87515073">
<source xml:lang="eng-ZZZ">The red line for median <b>wind</b> capacity would move down.</source>
<target xml:lang="esp-URY">The red line for median <b>wind</b> capacity would move down.</target>
</trans-unit>
</body>
</file>
</xliff>
This is the XLIFF used to translate that HTML file
This is what the XLIFF file looks like when you translate it in OmegaT.
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:utt="http://www.ets.org/utt" version="1.0">
<file datatype="plaintext" xml:space="preserve" original="stimulus3.html" source-language="eng-ZZZ" target-language="esp-URY">
<body>
<trans-unit id="M105_question3_Q03tr1th1_5c9d2136724fa6.98760829">
<source xml:lang="eng-ZZZ">Statement</source>
<target xml:lang="esp-URY">Afirmación</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr1th2_5c9d2136725139.12753161">
<source xml:lang="eng-ZZZ">True</source>
<target xml:lang="esp-URY">Verdadera</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr1th3_5c9d2136725238.08930641">
<source xml:lang="eng-ZZZ">False</source>
<target xml:lang="esp-URY">Falsa</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr2td1_5c9d2136725402.17773132">
<source xml:lang="eng-ZZZ">The red line for median <b>solar</b> capacity would move to the left.</source>
<target xml:lang="esp-URY">La línea roja que indica la capacidad <b>solar</b> mediana se movería hacia la izquierda.</target>
</trans-unit>
<trans-unit id="M105_question3_Q03tr3td1_5c9d2136725782.87515073">
<source xml:lang="eng-ZZZ">The red line for median <b>wind</b> capacity would move down.</source>
<target xml:lang="esp-URY">La línea roja que indica la capacidad <b>eólica</b> mediana se movería hacia abajo.</target>
</trans-unit>
</body>
</file>
</xliff>
And this is the translated XLIFF file.
What we have just seen is how HTML markup has historically been handled in PISA.
This is not necessarily the best approach: it has pros and cons, and it can be reconsidered.
Inserting tags: next missing
Ctrl+T
- Inserts the next missing tag.
- During translation...
- During editing (on a given translation)...
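The idea of "next missing tag" can be sketched as follows: walk through the tags of the source text in order and return the first one that the translation does not contain yet. This is a simplified model with a hypothetical function name and a plain regular expression for tags, not OmegaT's own logic.

```python
import re
from collections import Counter
from typing import Optional

# Matches opening and closing tags such as <b> or </b>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def next_missing_tag(source: str, target: str) -> Optional[str]:
    """Return the first source tag not yet present in the target, if any."""
    remaining = Counter(TAG.findall(target))
    for tag in TAG.findall(source):
        if remaining[tag] > 0:
            # This tag is already in the translation; use it up.
            remaining[tag] -= 1
        else:
            return tag
    return None
```

With the source "The red line for median <b>solar</b> capacity would move to the left.", an empty translation gives `<b>`, a translation that already contains `<b>` gives `</b>`, and a translation with both tags gives `None`.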
Inserting tags: auto-complete
Ctrl+Space
- Allows you to insert any missing tag...
- Or to insert any tag pair to create an embedding...
Fixing tags: position
Drag & Drop:
- Double-click to select the tag
- Drag it to its new position
- Drop it
or Cut & Paste:
- Double-click to select the tag
- Cut the tag (Ctrl+X)
- Click on the new position
- Paste the tag (Ctrl+V)
Ctrl+T
to insert next missing tag
Ctrl+Space
to launch the Auto-Completer
Search
Search is a powerful feature
The Search dialog (Ctrl+F) can help you:
- Find concordances in the working or reference TMs
  - (what doesn't appear in the Matches pane)
- Find any text in reference files
  - (in any folder, not necessarily in the project)
- Find segments with some particular property:
  - Some text in the note
  - Edited in a time range or by someone in particular
- Filter by certain results:
  - Useful to focus on one particular aspect.
Ctrl+F
to launch the search dialog to find things
QA
QA checks: last but never least
Some QA checks are automated:
- Tags
- Spelling mistakes
- Adherence to glossary (terminological consistency)
- LanguageTool issues
Some QA checks are not automated:
- Completion
- Statistics: zero remaining segments
All tag issues should always be fixed! (manually)
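As a rough illustration of two of these checks (completion and tags), the sketch below scans a list of source/target pairs and reports untranslated segments and tag mismatches. The function name and the report format are hypothetical. OmegaT's own checks are more thorough, and this sketch does not cover spelling, glossary or LanguageTool issues.

```python
import re
from collections import Counter

# Matches opening and closing tags such as <b> or </b>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def qa_report(segments):
    """Return (segment number, issue) pairs for a list of (source, target)."""
    report = []
    for number, (source, target) in enumerate(segments, start=1):
        if not target.strip():
            # Completion check: the segment still has no translation.
            report.append((number, "untranslated"))
            continue
        src = Counter(TAG.findall(source))
        tgt = Counter(TAG.findall(target))
        for tag in (src - tgt).elements():
            report.append((number, f"missing tag {tag}"))
        for tag in (tgt - src).elements():
            report.append((number, f"extra tag {tag}"))
    return report
```

An empty report corresponds to zero remaining segments and no tag issues, at least for the two checks covered here.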
If you need help with OmegaT:
You can contact cApStAn's OmegaT Helpdesk on https://pisa.capstan.be/
Please do not struggle!
If you find a problem in OmegaT
Please let us know at cApStAn's OmegaT Helpdesk on https://pisa.capstan.be/
We will strive to find a solution.
(if you tell us about the issue)
OmegaT for PMs and engineers (2)
By cApStAn LQC
Session 2: Standard functions for translation and review (for PMs)