Runtime Rules

 

Please scroll down to look for the information you require:

  1. What are Runtime Rules

  2. Basic Runtime Rules

  3. How to Create a Basic Runtime Rule

  4. How to Save a Runtime Rule


For details on how to upload the rule to the engine please refer to https://languagestudio.freshdesk.com/support/solutions/articles/12000014443-how-to-upload-a-runtime-rule-to-ls4


What are runtime rules?

 

One MT strategy is to maintain a single engine per language pair and domain, and to use rules and project folders to adapt that engine to different projects where the context may differ.


Project folders and runtime rules give you the option to upload glossaries and other files that change the terminology at runtime without having to retrain the engine. We call these files runtime rules.


Runtime rules are stored in project folders that are created in your account and are specific to a Language Pair, Custom Engine and Project. They include JavaScript rules, Runtime Glossaries (GLO), Pre-Translation Corrections (PTC), Post-Translation Adjustments (PTA) and Non-Translatable Terms (NTT).


To add rules for an engine, create a new project folder and store the rule files in it. At runtime, select this project folder by specifying its project number, so that the translation uses the settings associated with that folder and domain.

 

 

Please Note - Runtime rules are always absolute: if a TM already on the engine contradicts a runtime rule, the runtime rule will always be chosen.


Language Studio™ runtime rules can be simple search and replace rules with basic text or regular expressions, or more complex rules that leverage syntax, part of speech and other linguistic information. Language Studio™ Linguists will guide you on the best approaches to solve a specific linguistic challenge.


To learn more about Regular Expressions, see these websites: http://en.wikipedia.org/wiki/Regular_expression, http://www.regular-expressions.info/ and http://regexlib.com/

 

What types of rules are there? 

There are two types of rules, Basic Runtime Rules and JavaScript Rules.


They are applied to the project to execute either BEFORE Machine Translation (we call this a pre-processing rule) or AFTER Machine Translation (called a post-processing rule).
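
The order of execution can be pictured with a minimal Python sketch. Everything in it is a hypothetical stand-in: run_pipeline, fix_typo, fake_translate and add_trademark are illustrative names, and the "translation" is a one-entry lookup table, not the Language Studio™ engine.

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_pipeline(text: str, pre_rules: List[Step], translate: Step,
                 post_rules: List[Step]) -> str:
    """Apply pre-processing rules, then translate, then post-processing rules."""
    for rule in pre_rules:       # BEFORE Machine Translation
        text = rule(text)
    text = translate(text)       # the translation step itself
    for rule in post_rules:      # AFTER Machine Translation
        text = rule(text)
    return text


# Hypothetical stand-ins, for illustration only.
def fix_typo(text: str) -> str:
    # Pre-processing: repair a known source error.
    return text.replace("Welcom ", "Welcome ")


def fake_translate(text: str) -> str:
    # Placeholder for the engine: a one-entry lookup table.
    table = {"Welcome to Language Studio": "Bienvenido a Language Studio"}
    return table.get(text, text)


def add_trademark(text: str) -> str:
    # Post-processing: normalise a term in the translated output.
    return text.replace("Language Studio", "Language Studio™")


result = run_pipeline("Welcom to Language Studio",
                      [fix_typo], fake_translate, [add_trademark])
print(result)
```

Because the typo is repaired before the lookup, the stand-in translation finds its entry; the trademark is added only to the translated output.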


Please refer to the JavaScript page for information on how to create JavaScript rules.

 

Basic Runtime Rules

 

The four basic runtime rules are:

Pre-Translation Correction (PTC): Often the original source material may contain outdated terminology. Language Studio™ allows for the adjustment of terminology so that older terms are transformed into their more modern form prior to translation. As the correct form is passed into the translation engine, the translation will also choose the correct form. Pre-Translation Corrections can also correct known errors such as spelling mistakes, common OCR errors, glued words, etc.


Runtime Glossaries (GLO): Language Studio™ allows glossaries to be defined on a customer, project and job level. Like RBMT systems, the preferred term can guide the translation. Unlike RBMT systems, most terms are already known by the SMT platform via the translation memories provided, thereby reducing the amount of work needed for a specialist or a linguist to refine the engine.


Non-Translatable Terms (NTT): Some terms such as product names, venue names, etc. should not be translated. Language Studio™ allows a list of non-translatable terms to be specified.


Post-Translation Adjustments (PTA): Statistics may determine a preferred term based upon the training data provided. A preferred term for one customer may not be the preferred term for another customer. One of our clients has two clients of their own in the commercial real-estate business. One prefers to call some buildings by their older name, while the other prefers the new name. A single engine can be used, with a Post-Translation Adjustment making the necessary change for each specific customer.


Benefits:

  •   Client has complete control of all their projects and rules.
  •   Can create as many projects and rules as they like.
  •   Can add and remove rules at will.
  •   No need to wait for training of an engine.

Disadvantages:

  •   Applying rules to a live engine can affect the speed of translations slightly.

 

Examples of when these rules can be applied:


Runtime rules are usually not a permanent solution; however, they offer a quick, temporary fix or change. They are useful when a new version of the engine has yet to be trained.


PTC – Use this rule to provide a list of terms that adjust the source text, fixing common issues and making it more suitable for translation.

For example, a PTC can correct common spelling mistakes that you might expect in a source text, change common words from British to American English and repair common glued words.


NTT – Use this to provide a list of monolingual terms that must not be translated.

For example, you may want to keep a company name such as Apple as it is in the source rather than translate it.


GLO – Use this to provide a list of bilingual terms that ensure terminology is translated in a specific way to match the project.

For example, the words “real estate” as two separate words translate into Spanish as “real inmuebles”, however the term “real estate” translates as “bienes raíces”. By adding a glossary the correct term will be picked.


PTA – Use this to provide a list of terms in the target language that modify the translated output. This can be useful for normalization of target terms.


For example, change the translated output of Language Studio to show the trademark, Language Studio™.

 

How to Create a Runtime Rule

 

All runtime rules can be created in a text editor and must be saved with UTF-8 encoding.

 

Case Sensitive rule 

Each line of a runtime rule file carries a case sensitive rule, which sets whether the source term is matched Case Sensitive (CS) or Case Insensitive (CI).

By default the source side is CS. The target side is also case sensitive. Language Studio™ lets the user specify the case sensitivity of the source.

Set it in the case sensitive rule column, using CS, CI or Null (no value given, which keeps the default).

 

A case sensitive rule has one of the following values:

cs: case sensitive

ci: case insensitive

rcs: regex case sensitive

rci: regex case insensitive

 

Format

PTC, GLO and PTA format – 3 columns

Columns are tab separated, for example:

source term 1<tab>target term 1<tab>case sensitive rule

source term 2<tab>target term 2<tab>case sensitive rule

source term 3<tab>target term 3<tab>case sensitive rule

Please note – <tab> stands for the TAB key on the keyboard. Press TAB to separate the columns; do not type the word <tab>.

 

Example:

metallic plate<tab>placa metálica<tab>ci

plate<tab>plato<tab>ci

IMPORTANT NOTE<tab>NOTA IMPORTANTE<tab>cs

 

NTT format – 2 columns (no target needed)

source term 1<tab>case sensitive rule

source term 2<tab>case sensitive rule

source term 3<tab>case sensitive rule

 

Example:

Omniscien Technologies<tab>ci

JAVA<tab>cs

BASIC<tab>cs
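
Before uploading, it can help to check a file against the two formats above. The following Python sketch is one possible check, not a Language Studio™ tool: check_rule_lines is a hypothetical helper, the column counts and the four values come from this page, and treating an empty last column as Null is an assumption.

```python
from typing import List

# Values listed above; "" stands for Null (assumption: an empty last column).
VALID_FLAGS = {"cs", "ci", "rcs", "rci", ""}

# PTC, GLO and PTA use three columns; NTT uses two (no target).
COLUMNS = {"ptc": 3, "glo": 3, "pta": 3, "ntt": 2}


def check_rule_lines(lines: List[str], rule_type: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    expected = COLUMNS[rule_type]
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        columns = line.split("\t")
        if len(columns) != expected:
            problems.append(
                f"line {number}: expected {expected} columns, found {len(columns)}")
            continue
        flag = columns[-1]
        if flag not in VALID_FLAGS:
            problems.append(
                f"line {number}: unknown case sensitive rule {flag!r}")
    return problems


print(check_rule_lines(["Omniscien Technologies\tci", "JAVA\tcs"], "ntt"))
print(check_rule_lines(["plate\tplato\t ci"], "glo"))
```

The second call reports a problem because of the stray space before ci, a mistake that is easy to make when typing the TAB by hand.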


Using Regular Expressions in Glossaries

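To reference a captured regular expression group in the target, use ${groupno}, where groupno is the number of the group. The example below uses $1 for the first group.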

{source regex rule}<tab>target<tab>regex case sensitive rule


Example:

([0-9]+)\-year\-old<tab>$1 años<tab>rci
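
To see what the four case sensitive rule values mean in practice, here is a rough Python sketch that treats a rule as a plain search and replace on one string. It uses Python's re module as a stand-in; the regular expression dialect and matching behaviour of Language Studio™ itself are not described on this page, and apply_rule is a hypothetical name.

```python
import re


def apply_rule(text: str, source: str, target: str, flag: str = "cs") -> str:
    """Apply one rule to text; flag is cs, ci, rcs or rci (default cs)."""
    flags = re.IGNORECASE if flag in ("ci", "rci") else 0

    if flag in ("rcs", "rci"):
        pattern = source  # the source column is already a regular expression

        def expand(match):
            # Replace each $1, $2, ... in the target with the captured group.
            return re.sub(r"\$(\d+)",
                          lambda ref: match.group(int(ref.group(1))) or "",
                          target)
    else:
        pattern = re.escape(source)  # plain text: match it literally

        def expand(match):
            return target

    return re.sub(pattern, expand, text, flags=flags)


print(apply_rule("a 5-year-old boy", r"([0-9]+)\-year\-old", "$1 años", "rci"))
print(apply_rule("Plate", "plate", "plato", "ci"))
print(apply_rule("important note", "IMPORTANT NOTE", "NOTA IMPORTANTE", "cs"))
```

The first call rewrites "5-year-old" using the captured number, the second matches despite the capital P, and the third leaves the text unchanged because the cs rule only matches the upper-case form.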


 

Sorting a Runtime Rule


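To ensure that some patterns do not break other patterns, please make sure that you sort the rules by the length of the source search pattern, with the longest pattern at the top. Otherwise a shorter pattern such as "plate" can match inside a longer one such as "metallic plate" and break it.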

 

Correct

metallic plate<tab>placa metálica<tab>ci

plate<tab>plato<tab>ci

 

Incorrect

plate<tab>plato<tab>ci

metallic plate<tab>placa metálica<tab>ci
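
The sorting can be automated. The Python sketch below orders rule lines by the length of the source column, longest first. The helper apply_in_order is only a naive illustration of why the order matters (it replaces terms one rule after another and ignores the case sensitive rule column); it is an assumption for demonstration, not a description of how the engine applies rules.

```python
from typing import List


def sort_rules(lines: List[str]) -> List[str]:
    """Sort rule lines so that the longest source pattern comes first."""
    return sorted(lines, key=lambda line: len(line.split("\t")[0]), reverse=True)


def apply_in_order(text: str, lines: List[str]) -> str:
    """Naive illustration: apply each rule in turn as a literal replacement."""
    for line in lines:
        source, target = line.split("\t")[:2]
        text = text.replace(source, target)
    return text


unsorted_rules = [
    "plate\tplato\tci",
    "metallic plate\tplaca metálica\tci",
]

print(apply_in_order("metallic plate", unsorted_rules))
print(apply_in_order("metallic plate", sort_rules(unsorted_rules)))
```

With the unsorted list the short rule fires first and leaves "metallic plato", so the longer rule never matches; after sorting, the result is "placa metálica".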



How to save a runtime rule

 

Saving the file

When the file is complete and sorted, save it with the extension .ptc, .ntt, .glo or .pta, as appropriate (you may need to save the file as .txt first, then rename it).

Example:

default.1303267.glo

default.1303267.ntt

default.1303267.pta

default.1303267.ptc
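
If the rules are generated by a script rather than typed in a text editor, the file can be written directly with the right extension and encoding. This is a minimal Python sketch: save_rule_file is a hypothetical helper, the file name is copied from the example above, and the use of a single line feed as the line ending is an assumption, since this page does not specify one.

```python
from pathlib import Path
from typing import Iterable, Sequence

RULE_EXTENSIONS = {".ptc", ".ntt", ".glo", ".pta"}


def save_rule_file(path: str, rows: Iterable[Sequence[str]]) -> None:
    """Write rows as tab-separated lines in UTF-8 to a runtime rule file."""
    if Path(path).suffix not in RULE_EXTENSIONS:
        raise ValueError(f"unexpected extension for a runtime rule file: {path}")
    # newline="\n" writes line feeds as they are (assumed line ending).
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for row in rows:
            handle.write("\t".join(row) + "\n")


save_rule_file("default.1303267.glo", [
    ("metallic plate", "placa metálica", "ci"),
    ("plate", "plato", "ci"),
])
```

Writing the extension directly avoids the save-as-.txt-then-rename step that some text editors require.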

 

What Next?

To upload the rule to the engine, please refer to this link, https://languagestudio.freshdesk.com/support/solutions/articles/12000014443-how-to-upload-a-runtime-rule-to-ls4