Language Studio requires data to develop MT systems. These data sets consist of:

Translation Memories

Bilingual Dictionaries and Glossaries (direct translations only)

Target Language Monolingual data in file format (.txt or MS Word; PDFs are less preferable)

URLs of websites in the same subject domain as the required output.

Monolingual Source data (optional)

After all data has been uploaded into the MT engine Version, the user must click Submit for Processing. Language Studio linguists only start processing the data once this has been done.

There are two ways to upload files to a Custom Engine's Data Catalog.

Approach 1: Upload a Small to Medium Number of Files Directly to the Data Catalog

Follow the steps below:
  1. After logging in, select the Custom Engine Catalog from the menu in the top left corner of the browser.

  2. Select the Custom Engine Sub-Domain that you want to upload the files to. The shortest path is to select the latest VERSION using the button in the rightmost column.

  3. Scroll down on the Sub-Domain Version screen until you see the Data Catalog. Click on the button to upload new files.

  4. Select the type of data that you wish to upload using the Data Type drop-down list. Valid file formats will be displayed for each data type.

  5. Click on the button to upload files.

Approach 2: ZIP Large Files or Many Files and Upload to the File Pickup / Drop Off Area

Follow the steps below:

  1. After logging in, select File Pickup / Drop Off from the menu in the top left corner of the browser.

  2. Click the button to upload files that have been zipped.

  3. After uploading, select your ZIP file and UNZIP it.

    Note: At present, the ZIP file cannot contain sub-folders.

  4. Go through the steps of Approach 1 until you get to the Upload screen. On the Data Catalog - Upload screen, select Pickup / Drop Off Folder and select the files that you want to upload.

  5. Once the upload is complete, it is good practice to go back to the File Pickup / Drop Off area and delete the files, as this area is only temporary storage for moving files in and out of Language Studio.
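
Because the ZIP file used in Approach 2 cannot contain sub-folders, it can help to build the archive with a small script before uploading it. The following is a minimal sketch, assuming Python 3 is available on the user's own machine. It is not part of Language Studio, and the function name build_flat_zip is hypothetical. It collects every file under a local folder and stores each one at the top level of the ZIP, stopping if two files would end up with the same name.

```python
import zipfile
from pathlib import Path


def build_flat_zip(source_dir, zip_path):
    """Zip every file under source_dir with no folders inside the archive.

    Hypothetical local helper; it does not talk to Language Studio.
    Returns the sorted list of file names stored in the archive.
    """
    source_dir = Path(source_dir)
    zip_path = Path(zip_path).resolve()

    # Collect files first so a name clash is detected before anything is written.
    files = {}
    for path in sorted(source_dir.rglob("*")):
        # Skip directories and the output archive itself.
        if not path.is_file() or path.resolve() == zip_path:
            continue
        if path.name in files:
            raise ValueError(f"duplicate file name after flattening: {path.name}")
        files[path.name] = path

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, path in files.items():
            # arcname=name stores only the file name, so no sub-folders are created.
            archive.write(path, arcname=name)

    return sorted(files)
```

For example, build_flat_zip("engine_data", "upload.zip") (both names are placeholders) would produce an upload.zip that can then be uploaded and unzipped in the File Pickup / Drop Off area as described in steps 2 and 3.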