File Preparation

Let our in-house localisation engineers optimise your files so that your multilingual content is easier to manage. You’ll also make better use of your Translation Memory, save time and speed up the translation process.

Improve quality

Professional preparation improves the visual quality of your translated content.

Save time and money

Optimise your files to increase Translation Memory (TM) leverage.

About file preparation

Before a language expert can work with your content, it must be editable. Although Computer Assisted Translation (CAT) has come a long way in the past 20 years, it still relies on the keen eye of an expert to judge whether a file can be processed. This human touch is needed even with the most advanced software available today.

Text is readily editable in most file formats, but not in all of them. In those cases, file preparation can include:

  • Extracting text from images or desktop publishing software
  • Removing unnecessary line breaks and space characters that would otherwise make working with your content difficult
  • Tidying up PDF files after optical character recognition (OCR) processing
  • Filtering irrelevant content in Excel files
  • Creating tag settings (.ini) files for .xml and similar file types
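
As an illustration of the whitespace clean-up mentioned in the list above, here is a minimal Python sketch. It assumes that a blank line marks a real paragraph boundary and that any other line break or run of spaces is unwanted. The function name tidy_text is hypothetical, and real files usually need format-specific handling and a human check.

```python
import re


def tidy_text(raw: str) -> str:
    """Collapse stray whitespace while keeping paragraph breaks.

    Assumption: a blank line separates paragraphs; every other line
    break, tab or run of spaces is treated as unwanted.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Split into paragraphs on one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for paragraph in paragraphs:
        # Replace every run of whitespace (spaces, tabs, single line
        # breaks, non-breaking spaces) with one ordinary space.
        paragraph = re.sub(r"\s+", " ", paragraph).strip()
        if paragraph:
            cleaned.append(paragraph)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Hello   world,\nthis is\r\na test.\n\n\nSecond  paragraph."
    print(tidy_text(sample))
```

Keeping the paragraph boundaries intact matters because they usually carry meaning for the layout, whereas the breaks inside a paragraph are often just leftovers from the source format.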

Text extraction

Text extraction is often required for InDesign (.indd), QuarkXPress (.qxp), PDF (.pdf), Adobe Photoshop (.psd) and PowerPoint (.ppt) files. Expert engineers go through your ‘non-editable’ content and extract it into an editable format (e.g. Microsoft Word, the LanguageWire Editor or other CAT tools). Once the text has been translated, proofread or put through another language service, it is added back into your document so that the result looks as close as possible to the source content.

The work involved in text extraction normally falls into one of two categories: conversion or extraction. Here are some common examples of each:

Conversion

  • Converting PDF files into Microsoft Word files with editable text
  • Making the text in rasterised images editable
  • Converting QuarkXPress files into InDesign

Extraction

  • Extracting text from Adobe Illustrator (.ai) files
  • Extracting text from layers (InDesign, AutoCAD, etc.)

Optical character recognition (OCR)

OCR is the process of converting text in scanned documents or images into an editable format. It is frequently used in the translation world. Without it, we are unable to provide accurate analyses and, in turn, cost estimates for such files.

OCR conversions look great visually. However, if you scratch the surface you will see formatting that no human would ever have implemented. This can seriously affect the translation process, lengthen your translation lead time and increase the overall cost of your translation project.
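
To make this concrete, the Python sketch below shows one kind of hidden formatting: hard line breaks and end-of-line hyphens in the middle of a sentence. Everything in it is an assumption made for illustration. The segmenter is deliberately naive, the sample text and the one-entry memory are invented, and the similarity score from Python's difflib only stands in for the fuzzy matching a real Translation Memory performs.

```python
import re
from difflib import SequenceMatcher


def naive_segments(text):
    """Split text into segments at line breaks and sentence ends.

    A simplification for illustration, not how any specific CAT tool
    segments content.
    """
    segments = []
    for line in text.split("\n"):
        for part in re.split(r"(?<=[.!?])\s+", line.strip()):
            if part:
                segments.append(part)
    return segments


def repair_ocr_breaks(text):
    """Undo hard line breaks that fall inside a paragraph."""
    # Rejoin words hyphenated across a line break: "but-\nton" -> "button".
    # Caution: this also merges genuine compounds such as "well-\nknown",
    # which is one reason the result still needs a human check.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn remaining single line breaks into spaces; keep blank lines.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


def best_match(segment, memory):
    """Return the highest similarity (0.0 to 1.0) against the memory."""
    return max(SequenceMatcher(None, segment, entry).ratio() for entry in memory)


if __name__ == "__main__":
    # Hypothetical OCR output and a hypothetical one-entry memory.
    ocr_output = "Press the power but-\nton to start\nthe device. Wait for the light."
    memory = ["Press the power button to start the device."]

    for label, text in (("before", ocr_output), ("after", repair_ocr_breaks(ocr_output))):
        for segment in naive_segments(text):
            print(label, repr(segment), round(best_match(segment, memory), 2))
```

Before the repair, the first sentence is scattered across several fragments and none of them matches the stored sentence exactly. After the repair, it is a single segment that matches in full.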

Our in-house engineers fix this content, optimising your files to increase Translation Memory (TM) leverage and speeding up your time-to-market.

In short

  • Preparation of content for processing in CAT tools by language experts
  • Extracting text from images or desktop publishing software
  • Removing unnecessary line breaks and space characters
  • Tidying up PDF files after optical character recognition (OCR) processing
  • Filtering irrelevant content in Excel files
  • Creating tag settings (.ini) files for .xml and similar file types