Language Studio™ V4.0 Custom JavaScript Pre and Post Processing
Processing a translation involves 25 steps. Each step performs a specific function, either preparing data for translation or reformatting and adjusting text after translation. Between the major functions are custom JavaScript steps (js1 to js12) in which you can run your own scripts to meet requirements that normally fall outside the scope of standard pre and post processing.
Applying JavaScript Rules to Pre and Post Processing
JavaScript pre and post processing is specific to a custom engine sub-domain and a project. Each custom engine has a unique Domain Code. This is displayed in Language Studio™ Enterprise when you are viewing the sub-domain properties.
Each JavaScript step has a separate file name that is made up as follows:
default.<domain code>.js<step>
So, for Domain Code 1301518 and JavaScript step 5 (js5), the file name would be default.1301518.js5.
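The naming rule above can be sketched as a small helper. The function name scriptFileName is hypothetical and is not part of Language Studio™; it only assembles the string from a domain code and a JavaScript step number, assuming the 12 JavaScript steps (js1 to js12) listed in the workflow table.

```javascript
// Hypothetical helper (not a Language Studio API): builds the script file
// name "default.<domain code>.js<step>" described above.
function scriptFileName(domainCode, step) {
  // Assumption: only JavaScript steps js1..js12 exist in the workflow.
  if (!Number.isInteger(step) || step < 1 || step > 12) {
    throw new RangeError("step must be an integer from 1 to 12");
  }
  return "default." + String(domainCode) + ".js" + step;
}

console.log(scriptFileName(1301518, 5)); // default.1301518.js5
```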
Scripts are applied at a project level. To apply a default script that will be loaded automatically each time a job is executed, upload the file into the project folder. When a job is submitted for processing in that project folder using the domain code specified in the file name, the script is applied to the job automatically.
Translation Workflow Steps
# | ID | Step Name | Description |
1 | js1 | Pre-process JavaScript 1 | Custom script for modifying a document before it is loaded for processing. Examples include executing OCR, changing fonts, or correcting issues in the document that would have a negative impact on the translation. Input Parameters: sSourceFileName, JobID |
2 | rts | Retrieve Source | The source content is loaded in preparation for processing. |
3 | js2 | Pre-process JavaScript 2 | Custom script for modifying the source before it is processed. This is useful for manipulating a document after it has been loaded into memory but ahead of any further processing. It is usually only useful for text, XML or HTML documents. Examples include blocking sections of the document so that they will not be translated. Input Parameters: sAllText, JobID |
4 | etu | Extract Translation Units | Extracts the contents out of the source file into translation units that are to be translated. When extracting, content is sentence segmented and only text with markup is extracted from the document. The resulting output is 1 segment per line. Identical segments are merged so as not to repeat the translation. |
5 | js3 | Pre-process JavaScript 3 | At this stage, all the text that is to be translated has been extracted as sentences. The data has 1 sentence per line. Input Parameters: sAllSegments, sJobID |
6 | ptc | Pre-Translation Correction (PTC) | Applies Pre-Translation Corrections. |
7 | js4 | Pre-process JavaScript 4 | Input Parameters: sAllSegments, sJobID |
8 | ntt | Non-Translatable Terms (NTT) | Applies Non-Translatable Terms. |
9 | js5 | Pre-process JavaScript 5 | Input Parameters: sAllSegments, sJobID |
10 | glo | Runtime Glossary (GLO) | Applies Runtime Glossary. |
11 | js6 | Pre-process JavaScript 6 | Input Parameters: sAllSegments, sJobID |
12 | tok | Tokenize | This step breaks each segment down into tokens for translation by inserting spaces between each logical token. For languages such as Chinese where there are no spaces between words, the text of the segment is analysed and spaces are inserted between each word automatically. |
13 | js7 | Post-Process JavaScript 7 | Input Parameters: sAllSegments |
14 | etm | Extract Markup | Extracts any XML markup, storing its position in the original source segment. This includes HTML, XML, XLIFF and TMX tags. |
15 | tx | Translate | Translation of the segments. |
16 | txu | Translate Unknown Words | Automated resolution of unknown words. This step is integrated with the Translate step, but tracked separately for logging and troubleshooting purposes. |
17 | js8 | Post-Process JavaScript 8 | Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID |
18 | cap | Capitalize | Adjusts the capitalization of the MT output. This step does not apply to TrueCase translation engines. |
19 | js9 | Post-Process JavaScript 9 | Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID |
20 | det | Reinsert and Detokenize | Reinserts the XML markup in the correct locations after words have been translated and reordered. Any spaces around tokens that should not be in the final output are removed. |
21 | js10 | Post-Process JavaScript 10 | At this stage you can modify the output for specific final formatting and apply rules that are too complex for a PTA. Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID |
22 | pta | Post-Translation Adjustments (PTA) | Applies Post-Translation Adjustments. |
23 | js11 | Post-Process JavaScript 11 | This is the last stage before the text is merged back into the original document format. Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID |
24 | mrg | Merge and Save Output Document | Merges the translation back into the original source document format. |
25 | js12 | Post-Process JavaScript 12 | This stage provides an opportunity to modify the final file. This could be useful for changing fonts, inserting additional information or reformatting the content. Input Parameters: sAllSourceSegments, sAllTargetSegments, sJobID. Here sAllSourceSegments is the source text from step ETU (Extract Translation Units) and sAllTargetSegments is the text read from the output file. After the script modifies this text, it overwrites the output file. |
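As an illustration of the kind of logic a segment-level script might contain, here is a minimal sketch. This section does not state how Language Studio™ passes the input parameters to a script or how it reads the result back, so the function wrappers and their names (mapSegments, normalizeSegments, fillEmptyTargets) are hypothetical; only the parameter names sAllSegments, sAllSourceSegments and sAllTargetSegments come from the table. The post-process function also assumes that source and target segments are aligned line by line, which may hold for steps js8 to js11 but is not confirmed here.

```javascript
// Hypothetical helper: applies fn to each segment (one segment per line)
// and keeps the number of lines unchanged.
function mapSegments(sAllSegments, fn) {
  return sAllSegments.split(/\r?\n/).map(fn).join("\n");
}

// Pre-process sketch (e.g. js3 to js6): collapse runs of spaces and tabs
// inside each segment and trim leading and trailing whitespace.
function normalizeSegments(sAllSegments) {
  return mapSegments(sAllSegments, function (segment) {
    return segment.replace(/[ \t]+/g, " ").trim();
  });
}

// Post-process sketch (e.g. js8 to js11): where a target segment is empty,
// fall back to the source segment on the same line.
// Assumption: source and target have one segment per line, in the same order.
function fillEmptyTargets(sAllSourceSegments, sAllTargetSegments) {
  var source = sAllSourceSegments.split(/\r?\n/);
  var target = sAllTargetSegments.split(/\r?\n/);
  if (source.length !== target.length) {
    // Not aligned as assumed: leave the target text untouched.
    return sAllTargetSegments;
  }
  return target
    .map(function (segment, i) {
      return segment.trim() === "" ? source[i] : segment;
    })
    .join("\n");
}
```

Both functions preserve the line count, since the workflow relies on one segment per line between steps.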