Language Studio™ V4.0 Custom JavaScript Pre and Post Processing

There are 25 steps in processing a translation. Each step performs a specific function, either preparing data for translation or reformatting and adjusting text after translation. Between the major functions are custom JavaScript steps that you can use to implement requirements that fall outside the scope of standard pre and post processing.


Applying JavaScript Rules to Pre and Post Processing

JavaScript pre and post processing is specific to a custom engine sub-domain and a project. Each custom engine has a unique Domain Code. This is displayed in Language Studio™ Enterprise when you are viewing the sub-domain properties.

Sub-Domain Codes

Each JavaScript step has a separate file name that is made up as follows:

default.<domain code>.js<step>


So, for Domain Code 1301518 and step 5, the file name would be default.1301518.js5
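
As a minimal sketch of the naming rule above, the helper below assembles a script file name from a Domain Code and a JavaScript step number. The function name buildScriptFileName is hypothetical and is not part of Language Studio™. It assumes the step number is the numeric suffix of the step ID (js1 to js12).

```javascript
// Hypothetical helper for illustration only; not part of Language Studio.
// Builds "default.<domain code>.js<step>" for a JavaScript step (1 to 12).
function buildScriptFileName(domainCode, step) {
  if (!Number.isInteger(step) || step < 1 || step > 12) {
    throw new RangeError("step must be an integer from 1 to 12");
  }
  return "default." + String(domainCode) + ".js" + step;
}

console.log(buildScriptFileName(1301518, 5)); // default.1301518.js5
```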


Scripts are applied at a project level. To apply a default script that is loaded automatically each time a job is executed, upload the file into the project folder. When a job is submitted for processing in that project folder with the specified Domain Code, the script is applied to the job automatically.


Translation Workflow Steps

# ID Step Name Description
1 js1 Pre-process JavaScript 1 Custom script for modifying a document before it is loaded for processing.
This is useful for changing the document ahead of any processing. Examples include executing OCR, changing fonts or adjusting the document to address issues that have a negative impact on the translation.

Input Parameters: sSourceFileName, JobID
2 rts Retrieve Source The source content is loaded in preparation for processing.
3 js2 Pre-process JavaScript 2 Custom script for modifying the source before it is processed.
This is useful for manipulating a document ahead of processing, once it has been loaded into memory. This is usually only useful for text, XML or HTML documents.
Examples include blocking sections of the document so that they will not be translated.

Input Parameters: sAllText, JobID
4 etu Extract Translation Units Extracts the contents out of the source file into translation units that are to be translated. When extracting, content is sentence segmented and only text with markup is extracted from the document. The resulting output is 1 segment per line. Identical segments are merged so as not to repeat the translation.
5 js3 Pre-process JavaScript 3 At this stage, all the text that is to be translated has been extracted as sentences. The data has 1 sentence per line.

Input Parameters: sAllSegments, sJobID
6 ptc Pre-Translation Correction (PTC) Applies Pre-Translation Corrections.
7 js4 Pre-process JavaScript 4 Input Parameters: sAllSegments, sJobID
8 ntt Non-Translatable Terms (NTT) Applies Non-Translatable Terms.
9 js5 Pre-process JavaScript 5 Input Parameters: sAllSegments, sJobID
10 glo Runtime Glossary (GLO) Applies Runtime Glossary
11 js6 Pre-process JavaScript 6 Input Parameters: sAllSegments, sJobID
12 tok Tokenize This step breaks each segment down into tokens for translation by inserting spaces between each logical token. For languages such as Chinese where there are no spaces between words, the text of the segment is analysed and spaces are inserted between each word automatically.
13 js7 Post-Process JavaScript 7 Input Parameters: sAllSegments
14 etm Extract Markup Extracts any XML markup storing its position in the original source segment. This includes HTML, XML, XLIFF and TMX tags.
15 tx Translate Translation of the segments.
16 txu Translate Unknown Words Automated resolution of unknown words. This step is integrated with the Translate step, but tracked separately for logging and troubleshooting purposes.
17 js8 Post-Process JavaScript 8 Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID
18 cap Capitalize Adjusts the capitalization of the MT output. This step does not apply to TrueCase translation engines.
19 js9 Post-Process JavaScript 9 Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID
20 det Reinsert and Detokenize Reinserts the XML markup in the correct locations after words have been translated and reordered. Any spaces around tokens that should not be in the final output are removed.
21 js10 Post-Process JavaScript 10 At this stage you can modify the output for specific final formatting and apply rules that are too complex for a PTA.

Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID
22 pta Post-Translation Adjustments (PTA) Applies Post-Translation Adjustments.
23 js11 Post-Process JavaScript 11 This is the last stage before the text is merged back into the original document format.

Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID
24 mrg Merge and Save Output Document Merges the translation back into the original source document format.
25 js12 Post-Process JavaScript 12 This stage provides an opportunity to modify the final file. This could be useful for changing fonts, inserting additional information or reformatting the content.

Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID
       
sAllSourceSegments = Source text from step ETU (Extract Translation Units)
sAllTargetSegments = Text read from the output file. After this text is modified, it overwrites the output file.
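
The segment-based steps in the workflow table can be illustrated with a short sketch. How Language Studio™ passes the input parameters to a script and reads the result back is not described here, so the two functions below are hypothetical wrappers that simply take the documented parameters as strings and return the modified text. The sketch also assumes that segments are separated by "\n" and that line N of sAllTargetSegments is the translation of line N of sAllSourceSegments.

```javascript
// Illustrative sketch only. The function names and the "return a string"
// contract are assumptions; they are not a documented Language Studio API.

// Pre-process style (for example js3 to js6): sAllSegments holds one
// segment per line. Example rule: collapse runs of spaces and tabs.
// sJobID is accepted to mirror the documented parameters but is unused.
function preProcessSegments(sAllSegments, sJobID) {
  return sAllSegments
    .split("\n")
    .map(function (segment) {
      return segment.replace(/[ \t]+/g, " ").trim();
    })
    .join("\n");
}

// Post-process style (for example js8 to js11): pair each target line
// with its source line. Example rule: if the source segment consists only
// of digits and common numeric punctuation, keep it exactly as in the source.
function postProcessSegments(sAllSourceSegments, sAllTargetSegments, sJobID) {
  var sources = sAllSourceSegments.split("\n");
  var targets = sAllTargetSegments.split("\n");
  if (sources.length !== targets.length) {
    return sAllTargetSegments; // lines cannot be paired safely; leave as is
  }
  var numericOnly = /^[0-9\s.,:;()\/+%-]+$/;
  return targets
    .map(function (target, i) {
      return numericOnly.test(sources[i]) ? sources[i] : target;
    })
    .join("\n");
}

console.log(preProcessSegments("Hello   world \nSecond\tline", "job-1"));
console.log(postProcessSegments("Total\n1,250.00", "Gesamt\n1.250,00", "job-1"));
```

With the sample input, the first call prints "Hello world" and "Second line" on separate lines, and the second prints "Gesamt" followed by "1,250.00". Rules of this kind are candidates for a JavaScript step when they depend on comparing source and target, which a simple find-and-replace adjustment cannot do.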