Understanding bidirectionality in OmegaT
Some
Character
Glyph
Context-dependent and font-dependent. Examples:
ه ههه
U+0647 U+0647U+0647U+0647
ع ععع
U+0639 U+0639U+0639U+0639
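The difference between character and glyph can also be checked outside OmegaT. The following is a minimal Python sketch (the helper name `code_points` is hypothetical and only for illustration) that lists the code points stored in a string. It suggests that the joined letters are the same character repeated, and that only the rendered glyph changes with the position in the word.

```python
import unicodedata


def code_points(text):
    """Return the code point of each character in text, e.g. 'U+0647'."""
    return [f"U+{ord(ch):04X}" for ch in text]


heh = "\u0647"  # ARABIC LETTER HEH
ain = "\u0639"  # ARABIC LETTER AIN

# Three joined letters are stored as the same character three times;
# the initial, medial and final shapes are picked by the font at render time.
print(code_points(heh * 3))
print(code_points(ain * 3))
print(unicodedata.name(heh), "/", unicodedata.name(ain))
```

The output contains one code point per letter, whatever shape the font draws for it.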
How bidirectionality works
Cursor
Arabic and Hebrew are bidirectional languages, but the actual text always runs right-to-left (RTL).
Logical order
We should not try to tweak the visual appearance by changing the order of the text: the text must stay in logical order, i.e. the order in which it is read.
Figures, mathematical expressions, foreign names in Latin script, etc. are left-to-right (LTR) blocks.
The cursor shows the directionality, either through its movement (Word) or through its flag (OmegaT).
Right-to-left
Left-to-right
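To illustrate logical order, here is a short Python sketch that uses the standard `unicodedata` module to print the directionality class of each character in the order in which it is stored. The helper name `bidi_classes` and the sample segment are assumptions made for this example; they do not come from OmegaT.

```python
import unicodedata


def bidi_classes(text):
    """Return (code point, bidi class) pairs in logical (storage) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical segment: two Arabic letters, then a Latin name and a figure.
sample = "\u0647\u0639 PISA 2021"

for code_point, bidi_class in bidi_classes(sample):
    # AL = Arabic letter (RTL), L = left-to-right letter,
    # EN = European number, WS = whitespace (neutral)
    print(code_point, bidi_class)
```

The stored order never changes. The display engine uses these classes to decide where each block is drawn, which is why reordering the text by hand to "fix" the display tends to make things worse.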
Levels and degrees of directionality
Handling tags in bidirectional segments
Using Unicode bidirectionality control characters
Examples:
embeddings
ara
formulas
https://recordit.co/0qIfYjW9Q2
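As a rough sketch of what the control characters do, the Python snippet below wraps an LTR fragment (for example a formula) in an embedding or in an isolate. The helper names `embed_ltr` and `isolate_ltr` and the sample formula are hypothetical. Which control characters work best in a given project is an assumption to verify in OmegaT and in the final layout.

```python
import unicodedata

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING (closes an embedding)
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE (closes an isolate)


def embed_ltr(fragment):
    """Wrap fragment in an LTR embedding (LRE ... PDF)."""
    return f"{LRE}{fragment}{PDF}"


def isolate_ltr(fragment):
    """Wrap fragment in an LTR isolate (LRI ... PDI)."""
    return f"{LRI}{fragment}{PDI}"


formula = "x + 2 = 5"  # hypothetical LTR block inside an RTL segment
segment = "\u0647\u0639 " + embed_ltr(formula)

# The control characters are invisible, so list the code points to check them.
print([f"U+{ord(ch):04X}" for ch in segment])
print(unicodedata.name(LRE), "/", unicodedata.name(PDF))
```

Every opening control character needs its closing counterpart inside the same segment; an unclosed embedding is a common cause of display problems.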
Recommendations for formatting tags
In Math texts, don't use them inline: insert them at the end, after a line break (e.g. XYZ_ara-ARE).
Examples:
scopes and styles
https://recordit.co/qeoJ4en3Iv
We can only intervene at the segment level, not above.
Measurements
easily (see https://vimeo.com/387945710 and https://vimeo.com/387943549 - they will hopefully help you understand what to do in these cases). See also these quick summaries: https://recordit.co/BX04TRcIuH and https://recordit.co/JV7fJkHR7G.
mail: Re[4]: PISA2021 FT COG Math Batch 1 + XYZ [ara-ARE] Final Review
Font issues
The target text will inherit the font settings of the source text; OmegaT does not modify the font.
Not all fonts can render all characters in all styles in all languages.
When the text looks wrong, it can be a problem in the translation, or a technical glitch that messes up the directionality.
Watch out for paired punctuation symbols (parentheses, brackets, etc.).
The user wrote literally (but in Arabic):
"Code) 0 or (00"
Characters with neutral directionality might change their position and their glyph (representation).
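A simple check can catch this kind of problem. The Python sketch below is an illustration only: the helper name `brackets_balanced` is hypothetical, and the "intended" string is an assumption about what the user meant. It shows that a parenthesis is a neutral, mirrored character, and that paired punctuation typed to look right on screen is out of order in the stored text.

```python
import unicodedata

# Closing symbol -> matching opening symbol
PAIRS = {")": "(", "]": "[", "}": "{"}


def brackets_balanced(text):
    """Check that paired punctuation opens and closes in logical order."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


# ON = other neutral; mirrored == 1 means the glyph is flipped in RTL runs.
print(unicodedata.bidirectional("("), unicodedata.mirrored("("))

print(brackets_balanced("Code) 0 or (00"))  # as typed by the user
print(brackets_balanced("Code (0 or 00)"))  # assumed intended text
```

The check works on the stored order only, so it applies equally to Arabic text: the opening symbol must always be typed before the closing one.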
By msoutopico