All MT systems require data, and in general, the larger the data volume, the better the quality of the resulting system. However, not all clients have the required data. Omniscien Technologies has developed a series of tools, collectively called Advanced Data Manufacturing, to give a wider range of users access to the benefits of MT.

Language Studio™ Advanced Data Manufacturing is a set of more than 100 tools that can be applied to training data to analyze it and to create new data. Not every tool is required for every project, and some tools are language-specific. Language Studio™ Linguists analyze the requirements of a project, assess the data that is available, and build a Customized Training Plan that uses the appropriate tools to create supplementary data for each custom engine.

Advanced Data Manufacturing can be very complex, and determining which tools to use for optimal results is a task performed by the Language Studio™ Linguist assigned to your account or custom engine project. Language Studio™ Linguists have built a deep understanding of the intrinsic properties of each language through the creation of thousands of custom engines for clients. This experience and skill ensure the best results possible.

Different languages and language pairs require different tools. For example, for Slavic languages and other heavily inflected languages, tools analyze the Language Studio™ Foundation Data and client data to determine which inflected forms are required but not present in the available data. The tools then automatically manufacture bilingual data to fill these gaps in data coverage.
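
The gap-detection step described above can be illustrated with a minimal sketch. It is not the Language Studio™ implementation, which is not documented here. It assumes a hypothetical paradigm table (`PARADIGMS`) listing the inflected forms expected for each lemma, and it uses a toy Polish example. The function names `observed_tokens` and `missing_forms` are likewise invented for illustration. The sketch only reports which expected forms never appear in the corpus. Manufacturing bilingual data for those forms is a separate step and is not shown.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical paradigm table: lemma -> inflected forms we expect to cover.
# Toy singular forms of the Polish noun "kot" (cat), for illustration only.
PARADIGMS: Dict[str, List[str]] = {
    "kot": ["kot", "kota", "kotu", "kotem", "kocie"],
}


def observed_tokens(sentences: Iterable[str]) -> Set[str]:
    """Collect the lowercased vocabulary of a corpus, split on whitespace."""
    vocab: Set[str] = set()
    for sentence in sentences:
        for token in sentence.lower().split():
            # Strip common trailing/leading punctuation from each token.
            vocab.add(token.strip(".,;:!?"))
    return vocab


def missing_forms(
    paradigms: Dict[str, List[str]], vocab: Set[str]
) -> Dict[str, List[str]]:
    """Return, per lemma, the expected forms that never occur in the corpus."""
    gaps: Dict[str, List[str]] = {}
    for lemma, forms in paradigms.items():
        absent = [form for form in forms if form not in vocab]
        if absent:
            gaps[lemma] = absent
    return gaps


# Toy source-side corpus standing in for foundation and client data.
corpus = ["To jest kot.", "Nie ma kota."]
gaps = missing_forms(PARADIGMS, observed_tokens(corpus))
print(gaps)
```

In this toy corpus only two forms of the lemma occur, so the remaining three are reported as coverage gaps that supplementary data would need to address.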