OmegaT (2)

Standard functions for translation and revision

If you have downloaded this presentation and are looking at it offline, the latest version (in case of any updates) is always available at:

OmegaT training for ETS

Roles and responsibilities

  • cApStAn prepares and provides project templates
  • Engineers prepare projects based on those templates:
    • Adding files
    • Defining project name, language codes, etc.
  • PMs need to edit translations
    • in the translations of the project (working TM)
    • in a reference TM added to the project
  • PMs need to update source files
  • cApStAn provides training and technical support

Is your OmegaT customization up to date?

## Update 73_cs0 (2021-11-29)

Contents

  • Unpacking / exporting / packing
  • GUI and editor
    • Project files, segment properties, notes, comments
  • Matches
    • Assessing, inserting, replacing
  • Repetitions
    • Auto-propagation, alternative/multiple transl.
  • Tags
    • Inserting, moving, fixing
  • QA checks
    • Spell checking, completion, LT, tags, glossary

Clicking is handy but...

shortcuts save a lot of time

Ctrl+S

to save

Ctrl+C

to copy

Ctrl+X

to cut (copy and remove)

Ctrl+V

to paste

Build some muscle memory

We propose just a few shortcuts to remember:

  • Ctrl+L
  • Ctrl+J
  • Ctrl+T
  • Ctrl+D
  • Ctrl+I
  • Ctrl+Space

Un/pack
Open/Close
Download

A more mundane question

If an OmegaT project is a folder which contains other folders and files...

then a project package is...

[Diagram labels: incoming, unpack, working folder, volatile memory, close, pack, outgoing]

Both the project package and the project itself have the same content. Confusing?

Unpack and pack the project

Unpack and pack+delete

[Diagram labels: incoming, open, working folder, volatile memory, pack & delete, outgoing]

Alternatively, you can delete the project when you pack it.

Unpacking and packing options

  • When unpacking
    • Do you want to delete the OMT package?
  • When packing
    • The OMT package is saved inside the project folder
    • The OMT package will not include:
      • OMT/Zip/Rar/7z packages
      • Master TMs and target files
      • Hidden files, etc.
  • When packing and deleting
    • The OMT package is saved in the parent folder (at the same level as the project itself)

If the user happens to unpack this OMT file from inside the project, they will have to choose which translation must be kept.
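
The packing rules above can be sketched in a few lines of Python. This is only an illustration, not the code that OmegaT runs: it assumes that an OMT package is an ordinary zip archive, and the names `pack_project`, `is_excluded` and `PACKAGE_SUFFIXES` are hypothetical. Master TMs and target files are left out of the sketch because their folder names are not given here.

```python
import shutil
import zipfile
from pathlib import Path

# Hypothetical exclusion list, taken from the bullets above.
PACKAGE_SUFFIXES = {".omt", ".zip", ".rar", ".7z"}


def is_excluded(relative_path: Path) -> bool:
    """Return True for nested packages and for hidden files or folders."""
    if relative_path.suffix.lower() in PACKAGE_SUFFIXES:
        return True
    return any(part.startswith(".") for part in relative_path.parts)


def pack_project(project_dir, delete_after: bool = False) -> Path:
    """Pack a project folder into <project name>.omt.

    Plain packing saves the package inside the project folder.
    Packing and deleting saves it in the parent folder, then removes the project.
    """
    project_dir = Path(project_dir)
    target_dir = project_dir.parent if delete_after else project_dir
    package = target_dir / f"{project_dir.name}.omt"

    # Collect the files first, so the new package never includes itself.
    files = [
        path
        for path in sorted(project_dir.rglob("*"))
        if path.is_file() and not is_excluded(path.relative_to(project_dir))
    ]
    with zipfile.ZipFile(package, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, path.relative_to(project_dir).as_posix())

    if delete_after:
        shutil.rmtree(project_dir)
    return package
```

Because older packages are skipped by extension, packing the same project twice does not nest one package inside the other.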

Interface
Editor
Panes

Segment properties

  • File name
  • ID (if set)
  • Duplicate status
  • Resource name
  • Notes
  • Author of the first translation
  • Date/time of the first translation
  • Author of the last edit
  • Date/time of the last edit
  • Match type

[Screenshot callouts: source text, target text, Match type, Duplicate status]

Project Files pane

  • filename of each file
  • filter used
  • encoding
  • segment counts
  • progress tracking

Project Files pane

This dialog will always tell you which file you are in.

Ctrl+L

to show the list of files

Moving through the project

 

Forward/backward keys:

  • Enter (forward)
  • Ctrl+Enter (backward)

Go-to functions / shortcuts

  • Ctrl+U
    • Go to next untranslated segment
    • When you are translating
  • Ctrl+Shift+U
    • Go to next translated segment
    • When you are reviewing (but not all is translated)
  • Ctrl+J
    • Go to a specific segment by its number
  • Ctrl+Shift+P / Ctrl+Shift+N
    • Back/Forward in history
    • Very useful to go back to where you were

If you don't remember a shortcut, you may check the Go To menu.

Ctrl+J

go to specific segment by number

Create translated documents

Generate target files / export

  • Always do this before any preview
  • Ctrl+D
    • Generate all target files in the project
    • Easy to remember
    • Slow if you have many files or if files are big
  • Ctrl+Shift+D
    • Generate current target file
    • Faster, so more practical

Ctrl+D

create translated documents

Ctrl+Shift+D

create current translated document

Matches

Types of matches

  • Exact match

Strongly agree. <segment 0001>
-------------------------------------------------
Strongly agree.
<100/100/100%>

  • Fuzzy match

Strongly agree. <segment 0009>
-------------------------------------------------
Strongly agreedisagree.
<50/50/75%>

  • In-context exact (ICE)

Match similarity metric

  • Three percentages:
    • Stemmed without stopwords
    • Inflected form without stopwords
    • All tokens (inflected forms and stopwords)

Differences are conveniently shown in diff mode.
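
To make the three percentages more concrete, here is a rough sketch of how such scores could be computed. It is not OmegaT's actual implementation: the tiny stopword list, the `naive_stem` helper and the token-level edit distance are assumptions chosen only for illustration. With these assumptions, comparing "Strongly agree." with "Strongly disagree." yields 50/50/75, because the third score also counts the space and the full stop as tokens.

```python
import re

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "of", "to", "is"}


def words(text):
    """Lowercased word tokens only (no spaces, no punctuation)."""
    return re.findall(r"\w+", text.lower())


def all_tokens(text):
    """Every token: words, runs of whitespace and punctuation marks."""
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def naive_stem(token):
    """Very rough stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Percentage of tokens that do not need editing."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (longest - edit_distance(a, b)) / longest)


def match_scores(segment, candidate):
    """Return (stemmed, inflected without stopwords, all tokens)."""
    seg = [t for t in words(segment) if t not in STOPWORDS]
    cand = [t for t in words(candidate) if t not in STOPWORDS]
    stemmed = similarity([naive_stem(t) for t in seg], [naive_stem(t) for t in cand])
    inflected = similarity(seg, cand)
    everything = similarity(all_tokens(segment), all_tokens(candidate))
    return stemmed, inflected, everything


print(match_scores("Strongly agree.", "Strongly disagree."))  # (50, 50, 75)
print(match_scores("Strongly agree.", "Strongly agree."))     # (100, 100, 100)
```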

Types of match behavior

  • A reference match (of any score)
    • Must be inserted manually
  • Auto-populated match (exact)
    • It is inserted automatically when the project loads...
      • ... as long as the segment is not translated
  • An enforced match (exact)
    • It is inserted automatically when the project loads...
      • ... replacing any text in the segment. 
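
The three behaviors can be summarised as a small decision function. This is a sketch of the rules listed above, not OmegaT code; `MatchBehavior` and `translation_after_load` are hypothetical names.

```python
from enum import Enum


class MatchBehavior(Enum):
    REFERENCE = "reference"
    AUTO_POPULATED = "auto-populated"
    ENFORCED = "enforced"


def translation_after_load(current: str, match: str, is_exact: bool,
                           behavior: MatchBehavior) -> str:
    """Return the text found in the segment once the project has loaded."""
    # Reference matches (of any score) are never inserted automatically,
    # and only exact matches qualify for automatic insertion.
    if behavior is MatchBehavior.REFERENCE or not is_exact:
        return current
    # Enforced matches replace whatever the segment contains.
    if behavior is MatchBehavior.ENFORCED:
        return match
    # Auto-populated matches only fill segments that are still untranslated.
    return current if current else match
```

For example, an enforced exact match overwrites an existing translation, whereas an auto-populated one leaves it alone.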

Automatic leverage

Manual leverage

Manual leverage

  • Matches from /tm must be leveraged manually
    • Exact matches: probably do not require updates
    • Fuzzy matches: probably do require updates
  • Updates:
    • Inserting the full match + editing/overwriting the invalid part
    • Selecting the valid part + insertion + editing
    • Automatically!

Carrying over matches

A match must be active so that it can be carried over.

  • Ctrl+#
    • Activates the match numbered with #

An active match can be carried over to the translation in 2 ways.

  • Ctrl+R
    • Replaces the translation with the match or the selected part
  • Ctrl+I
    • Inserts the match or the selected part
    • It can be used to replace in combination with Ctrl+A

Ctrl+I

insert match or selection

Ctrl+R

replace with match or selection

Ctrl+#

activates match by number (1-5)

Repetitions

Repetitions

[Diagram: the editor shows two files next to the working TM. File 1 is an email message (Email message / Subject: / Fw: Summer Streets / Dear Subscriber / ...) and file 2 is a library catalog (Library Catalog / genetically modified food / Food for the world / Subject: / biotechnology / Modern farming). "Library Catalog" is translated as "Catálogo bibliotecario", and the repeated segment "Subject:" is translated as "Tema:".]

Auto-propagation

[Diagram: an email message and a library catalog both contain the segment "Subject:". Once one occurrence of "Subject:" is translated as "Tema:", the translation is saved in the working TM and auto-propagated to the other occurrence.]

Context determines meaning

[Diagram: auto-propagation gives "Tema:" to both occurrences of "Subject:", but the two segments mean different things.]

  • In the email message: the topic of an email
  • In the library catalog: the subject field of a book

Alternative translations

[Diagram: the "Subject:" segment in the email message (prev: Email message, next: Fw: Summer Streets) receives the alternative translation "Asunto:" through "Create alternative translation", while the "Subject:" segment in the library catalog keeps the default translation "Tema:".]

Alternative translations are like crossing the street.

If you proceed with caution you should be fine ;)

In-context exact (ICE) match

  • Alternative translations rely on a context-wide match
    • Source text
    • Previous and next source texts
    • File name

Not very robust if other repetitions have the same context.
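
One way to picture this is a working TM that stores default translations by source text and alternative translations by the whole context. The sketch below is a simplified model under that assumption, not OmegaT's data structure; `WorkingTM`, `lookup` and the file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (file name, previous source text, source text, next source text)
ContextKey = Tuple[str, str, str, str]


@dataclass
class WorkingTM:
    defaults: Dict[str, str] = field(default_factory=dict)
    alternatives: Dict[ContextKey, str] = field(default_factory=dict)

    def lookup(self, file: str, prev: str, source: str, nxt: str) -> Optional[str]:
        """Prefer the alternative bound to this context, else the default."""
        key = (file, prev, source, nxt)
        if key in self.alternatives:
            return self.alternatives[key]
        return self.defaults.get(source)


tm = WorkingTM()
tm.defaults["Subject:"] = "Tema:"  # default, auto-propagated everywhere

# Alternative translation for the occurrence in the email message only.
email_key = ("email.html", "Email message", "Subject:", "Fw: Summer Streets")
tm.alternatives[email_key] = "Asunto:"

print(tm.lookup(*email_key))  # Asunto:
print(tm.lookup("catalog.html", "Food for the world", "Subject:", "biotechnology"))  # Tema:
```

If two repetitions share the same file, previous and next source texts, they produce the same key, which is exactly the robustness limit mentioned above.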

ID-bound match (XLIFF)

  • Alternative translations based on stricter kinds of context
    • Source text
    • Segment ID
    • File name

For formats that allow it (e.g. XLIFF), we can use segment IDs to create a unique context.

Create alternative translation

A default translation exists.

Right click the segment and select "Create Alternative Translation"

Edit the translation, then press Ctrl+S to save.

When you save, the new alternative translation will appear in the Multiple Translations pane.

Restore default translation

An alternative translation exists.

Delete the translation, then press Ctrl+S to save.

When you save, the default translation appears in the segment.

And the alternative translation disappears from the Multiple Translations pane.

Tags
(placeables)

Tags stand for codes/markup

  • Structural markup/codes define the different parts of the document and how they are arranged together. Parts can be paragraph, list, table, footnote, title, etc.
    • Some of those parts contain text. Text might contain some embedded elements, like icons, buttons, etc.
  • Formatting markup/codes define how the text is styled or decorated. Styles include bold, italics, underlining, font-family, colour, size, etc.
    • These codes embed the formatted text, which can be a whole sentence or part of a sentence.

Let's see an example: a web page

<!DOCTYPE html>
<html lang="en" dir="ltr">
	<head>
		<meta charset="utf-8">
		<title></title>
		<link rel="stylesheet" href="github-markdown.css">
	</head>
	<body class="markdown-body">
		<h1>Page about OmegaT</h1>
		<p>This is <i>just</i> a draft. Stay tuned ;)</p>
		<div id="mc_embed_signup">
			<form action="https://capstan.us18.list-manage.com" method="post">
				<div id="mc_embed_signup_scroll">
					<label for="mce-EMAIL">Subscribe</label>
					<input type="email" value="" name="EMAIL" class="email" id="mce-EMAIL" placeholder="email address" required>
					<!-- do not remove this or risk form bot signups-->
					<div class="clear"><input type="submit" value="Subscribe" class="button"></div>
				</div>
			</form>
		</div>
	</body>
</html>

Namely, an HTML file to translate...

		<h1>Page about OmegaT</h1>
		<p>This is <i>just</i> a draft. Stay tuned ;)</p>
			<label for="mce-EMAIL">Subscribe</label>

Page about OmegaT
This is <i>just</i> a draft. Stay tuned ;)
Subscribe

<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
    <file original="03_index.html" source-language="en-US" target-language="fr-FR" datatype="html">
        <body>
            <trans-unit id="tu2" restype="x-h1">
                <source xml:lang="en-US">Page about OmegaT</source>
                <target xml:lang="fr-FR">Page about OmegaT</target>
            </trans-unit>
            <trans-unit id="tu3" restype="x-paragraph">
                <source xml:lang="en-US">This is <g id="1">just</g> a draft. Stay tuned ;)</source>
                <target xml:lang="fr-FR">This is <g id="1">just</g> a draft. Stay tuned ;)</target>
            </trans-unit>
            <trans-unit id="tu4">
                <source xml:lang="en-US"><g id="1">Subscribe</g> <x id="2"/> <x id="3"/></source>
                <target xml:lang="fr-FR"><g id="1">Subscribe</g> <x id="2"/> <x id="3"/></target>
            </trans-unit>
            <trans-unit id="tu5">
                <source xml:lang="en-US"><x id="1"/></source>
                <target xml:lang="fr-FR"><x id="1"/></target>
            </trans-unit>
            <trans-unit id="tu7" restype="x-value">
                <source xml:lang="en-US">Subscribe</source>
                <target xml:lang="fr-FR">Subscribe</target>
            </trans-unit>
            <trans-unit id="tu6">
                <source xml:lang="en-US"><x id="1"/></source>
                <target xml:lang="fr-FR"><x id="1"/></target>
            </trans-unit>
        </body>
    </file>
</xliff>

This is the XLIFF used to translate that HTML file

from which OmegaT will only extract the target text.
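
As a rough illustration of what "extracting the text" means, the sketch below reads a shortened copy of the XLIFF above with Python's standard library and lists the units whose target contains actual text. It is not how OmegaT's file filter works internally; `units_with_text` is a hypothetical helper, and inline tags are simply dropped from the displayed text.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Shortened copy of the file shown above (three of the six units).
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="03_index.html" source-language="en-US" target-language="fr-FR" datatype="html">
    <body>
      <trans-unit id="tu2">
        <source>Page about OmegaT</source>
        <target>Page about OmegaT</target>
      </trans-unit>
      <trans-unit id="tu3">
        <source>This is <g id="1">just</g> a draft. Stay tuned ;)</source>
        <target>This is <g id="1">just</g> a draft. Stay tuned ;)</target>
      </trans-unit>
      <trans-unit id="tu5">
        <source><x id="1"/></source>
        <target><x id="1"/></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def units_with_text(xliff_text):
    """Return (unit id, plain target text) for units that contain text."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = unit.find("x:target", XLIFF_NS)
        text = "".join(target.itertext()) if target is not None else ""
        if text.strip():  # skip units made only of placeholders
            units.append((unit.get("id"), text))
    return units


for unit_id, text in units_with_text(SAMPLE):
    print(unit_id, "|", text)
```

Units such as `tu5`, which hold only an `<x/>` placeholder, produce no text and are skipped.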


                                                                                                                                                                                                                                                 




Page about OmegaT
This is <g id="1">just</g> a draft. Stay tuned ;)
Subscribe
Subscribe

This is what it looks like when you add the file to the OmegaT project.

Only inline tags need to be placed in the translation.

Leading and trailing tags do not need to be extracted/placed.

Tags in PISA XLIFF files

  • Tags stand for formatting codes.

Not real OmegaT tags (customization is necessary to lock them).

Tags in PISA XLIFF files

  • XLIFF files contain escaped HTML (not real OmegaT tags)
  • The escaped tags are converted to HTML code in preview

<!DOCTYPE html>
<html lang="en" dir="ltr">
  <body class="markdown-body">
    <table class="greytable">
      <tr>
        <th class="alignLeft" id="Q03tr1th1">Statement</th>
        <th id="Q03tr1th2" class="center">True</th>
        <th id="Q03tr1th3" class="center">False</th>
      </tr>
      <tr>
        <td id="Q03tr2td1">The red line for median <b>solar</b> capacity would move to the left.</td>
        <td id="Q03tr2td2"><input type="radio" id="M105Q03RADIO_1_1" value="0" /></td>
        <td id="Q03tr2td3"><input type="radio" id="M105Q03RADIO_1_2" value="1" /></td>
      </tr>
      <tr>
        <td id="Q03tr3td1">The red line for median <b>wind</b> capacity would move down.</td>
        <td id="Q03tr3td2"><input type="radio" id="M105Q03RADIO_2_1" value="0" /></td>
        <td id="Q03tr3td3"><input type="radio" id="M105Q03RADIO_2_2" value="1" /></td>
      </tr>
    </table>
  </body>
</html>
  





Now, let's see a PISA unit.

Again, we want to translate only the text.

        <th class="alignLeft" id="Q03tr1th1">Statement</th>
        <th id="Q03tr1th2" class="center">True</th>
        <th id="Q03tr1th3" class="center">False</th>
        <td id="Q03tr2td1">The red line for median <b>solar</b> capacity would move to the left.</td>
        <td id="Q03tr3td1">The red line for median <b>wind</b> capacity would move down.</td>

Statement
True
False
The red line for median <b>solar</b> capacity would move to the left.
The red line for median <b>wind</b> capacity would move down.

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:utt="http://www.ets.org/utt" version="1.0">
  <file datatype="plaintext" xml:space="preserve" original="stimulus3.html" source-language="eng-ZZZ" target-language="esp-URY">
    <body>
      <trans-unit id="M105_question3_Q03tr1th1_5c9d2136724fa6.98760829">
        <source xml:lang="eng-ZZZ">Statement</source>
        <target xml:lang="esp-URY">Statement</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr1th2_5c9d2136725139.12753161">
        <source xml:lang="eng-ZZZ">True</source>
        <target xml:lang="esp-URY">True</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr1th3_5c9d2136725238.08930641">
        <source xml:lang="eng-ZZZ">False</source>
        <target xml:lang="esp-URY">False</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr2td1_5c9d2136725402.17773132">
        <source xml:lang="eng-ZZZ">The red line for median &lt;b&gt;solar&lt;/b&gt; capacity would move to the left.</source>
        <target xml:lang="esp-URY">The red line for median &lt;b&gt;solar&lt;/b&gt; capacity would move to the left.</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr3td1_5c9d2136725782.87515073">
        <source xml:lang="eng-ZZZ">The red line for median &lt;b&gt;wind&lt;/b&gt; capacity would move down.</source>
        <target xml:lang="esp-URY">The red line for median &lt;b&gt;wind&lt;/b&gt; capacity would move down.</target>
      </trans-unit>
    </body>
  </file>
</xliff>

This is the XLIFF used to translate that HTML file

This is what the XLIFF file looks like when you translate it in OmegaT.

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:utt="http://www.ets.org/utt" version="1.0">
  <file datatype="plaintext" xml:space="preserve" original="stimulus3.html" source-language="eng-ZZZ" target-language="esp-URY">
    <body>
      <trans-unit id="M105_question3_Q03tr1th1_5c9d2136724fa6.98760829">
        <source xml:lang="eng-ZZZ">Statement</source>
        <target xml:lang="esp-URY">Afirmación</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr1th2_5c9d2136725139.12753161">
        <source xml:lang="eng-ZZZ">True</source>
        <target xml:lang="esp-URY">Verdadera</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr1th3_5c9d2136725238.08930641">
        <source xml:lang="eng-ZZZ">False</source>
        <target xml:lang="esp-URY">Falsa</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr2td1_5c9d2136725402.17773132">
        <source xml:lang="eng-ZZZ">The red line for median &lt;b&gt;solar&lt;/b&gt; capacity would move to the left.</source>
        <target xml:lang="esp-URY">La línea roja que indica la capacidad &lt;b&gt;solar&lt;/b&gt; mediana se movería hacia la izquierda.</target>
      </trans-unit>
      <trans-unit id="M105_question3_Q03tr3td1_5c9d2136725782.87515073">
        <source xml:lang="eng-ZZZ">The red line for median &lt;b&gt;wind&lt;/b&gt; capacity would move down.</source>
        <target xml:lang="esp-URY">La línea roja que indica la capacidad &lt;b&gt;eólica&lt;/b&gt; mediana se movería hacia abajo.</target>
      </trans-unit>
    </body>
  </file>
</xliff>

And this is the translated XLIFF file.
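
The difference between escaped HTML and real markup can be seen by parsing one unit. In the sketch below (a shortened unit with a hypothetical `id`), the XML parser turns `&lt;b&gt;` back into a literal `<b>`, so the target text is a fragment of HTML that a preview can render. This only illustrates the principle; it is not the preview mechanism itself.

```python
import xml.etree.ElementTree as ET
from html import escape

# Shortened unit; the id "demo" is hypothetical.
UNIT = (
    '<trans-unit id="demo">'
    '<source xml:lang="eng-ZZZ">The red line for median '
    '&lt;b&gt;wind&lt;/b&gt; capacity would move down.</source>'
    '<target xml:lang="esp-URY">La línea roja que indica la capacidad '
    '&lt;b&gt;eólica&lt;/b&gt; mediana se movería hacia abajo.</target>'
    '</trans-unit>'
)


def target_markup(unit_xml: str) -> str:
    """Return the target text, where escaped tags become literal HTML."""
    unit = ET.fromstring(unit_xml)
    return unit.findtext("target")


markup = target_markup(UNIT)
print(markup)                       # ... capacidad <b>eólica</b> mediana ...
print(escape(markup, quote=False))  # back to the escaped form stored in the XLIFF
```

Since the tags are plain text as far as the XLIFF is concerned, nothing prevents a translator from damaging them, which is why customization is needed to lock them.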

What we have just seen is how HTML markup has historically been handled in PISA.

This is not necessarily the best approach: it has pros and cons, and it can be reconsidered.

Inserting tags: next missing

Ctrl+T

  • Inserts the next missing tag.
    • During translation...

 

 

    • During editing (on a given translation)...
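
The idea behind "next missing tag" can be sketched as follows: walk through the source tags in order and return the first one that the translation does not contain yet. This is an illustration only, not OmegaT's implementation; the `TAG` pattern is an assumption that covers simple tags without attributes, such as `<b>`, `</b>` or `<g1>`.

```python
import re
from collections import Counter
from typing import Optional

# Assumed tag shape: simple opening, closing or standalone tags, no attributes.
TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*\s*/?>")


def next_missing_tag(source: str, target: str) -> Optional[str]:
    """Return the first source tag not yet present in the target, if any."""
    present = Counter(TAG.findall(target))
    for tag in TAG.findall(source):
        if present[tag] > 0:
            present[tag] -= 1  # this occurrence is already placed
        else:
            return tag
    return None


source = "The red line for median <b>solar</b> capacity would move to the left."
print(next_missing_tag(source, "La línea roja que indica la capacidad "))        # <b>
print(next_missing_tag(source, "La línea roja que indica la capacidad <b>solar"))  # </b>
```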

Inserting tags: auto-complete

Ctrl+Space

  • Allows you to insert any missing tag...

 

 

 

 

  • Or to insert any tag pair to create an embedding...

Fixing tags: position

Drag & Drop:

  • Double-click to select the tag
  • Drag it to its new position
  • Drop it

or Cut & Paste:

  • Double-click to select the tag
  • Cut the tag (Ctrl+X)
  • Click on the new position
  • Paste the tag (Ctrl+V)

Ctrl+T

to insert next missing tag

Ctrl+Space

to launch the Auto-Completer

Search

Search is a powerful feature

The Search dialog (Ctrl+F) can help you:

  • Find concordances in the working or reference TMs
    • (what doesn't appear in the Matches pane)
  • Find any text in reference files
    • (in any folder, not necessarily in the project)
  • Find segments with some particular property:
    • Some text in the note
    • Edited in a time range or by someone in particular
  • Filter by certain results:
    • Useful to focus on one particular aspect.
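
Conceptually, such a search is a filter over segments and their properties. The sketch below models that idea with a hypothetical `Segment` record and `search` function; the field names and sample authors are assumptions and do not reflect how OmegaT stores its data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Segment:
    number: int
    source: str
    target: str
    note: str
    author: str
    changed: datetime


def search(segments: Iterable[Segment],
           text: Optional[str] = None,
           note: Optional[str] = None,
           author: Optional[str] = None,
           after: Optional[datetime] = None,
           before: Optional[datetime] = None) -> List[Segment]:
    """Keep the segments that satisfy every criterion that was given."""
    hits = []
    for seg in segments:
        haystack = f"{seg.source}\n{seg.target}".lower()
        if text is not None and text.lower() not in haystack:
            continue
        if note is not None and note.lower() not in seg.note.lower():
            continue
        if author is not None and seg.author != author:
            continue
        if after is not None and seg.changed < after:
            continue
        if before is not None and seg.changed > before:
            continue
        hits.append(seg)
    return hits


segments = [
    Segment(1, "Statement", "Afirmación", "", "ana", datetime(2021, 11, 1)),
    Segment(2, "True", "Verdadera", "check gender", "ben", datetime(2021, 11, 20)),
    Segment(3, "False", "Falsa", "check gender", "ana", datetime(2021, 11, 25)),
]
print([s.number for s in search(segments, note="gender")])  # [2, 3]
```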

Ctrl+F

to launch the Search dialog

QA

QA checks: last but never least

Some QA checks are automated:

  • Tags
  • Spelling mistakes
  • Adherence to glossary (terminological consistency)
  • LanguageTool issues

Some QA checks are not automated:

  • Completion
    • Statistics: zero remaining segments

All tag issues should always be fixed! (manually)
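
A minimal sketch of two of these checks, completion and tags, is shown below. It assumes segments are available as (source, target) pairs and that tags are simple forms without attributes; `qa_report` and the `TAG` pattern are hypothetical, and the built-in checks in OmegaT remain the reference.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Assumed tag shape: simple opening, closing or standalone tags, no attributes.
TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*\s*/?>")


def qa_report(segments: List[Tuple[str, str]]) -> Dict[str, List[int]]:
    """List segment numbers that are untranslated or have tag mismatches."""
    report: Dict[str, List[int]] = {"untranslated": [], "tag_issues": []}
    for number, (source, target) in enumerate(segments, start=1):
        if not target.strip():
            report["untranslated"].append(number)
            continue
        # Same tags, same number of times, in source and target.
        if Counter(TAG.findall(source)) != Counter(TAG.findall(target)):
            report["tag_issues"].append(number)
    return report


project = [
    ("Statement", "Afirmación"),
    ("True", ""),
    ("median <b>wind</b> capacity", "capacidad <b>eólica mediana"),
]
print(qa_report(project))  # {'untranslated': [2], 'tag_issues': [3]}
```

A project passes these two checks when both lists are empty, which corresponds to zero remaining segments in the statistics and no tag issues.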

If you need help with OmegaT:

You can contact cApStAn's OmegaT Helpdesk on https://pisa.capstan.be/
 

Please do not struggle!

If you find a problem in OmegaT

Please let us know at cApStAn's OmegaT Helpdesk on https://pisa.capstan.be/

We will strive to find a solution.

(if you tell us about the issue)

OmegaT for PMs and engineers (2)

By cApStAn LQC

Session 2: Standard functions for translation and review (for PMs)
