Electronic files are divided into two types: original layouts and non-editable copies of original layouts.
The benefits of computer-aided translation (CAT) systems for customers are obvious. CAT tools support 65 file formats, and the number keeps growing.
To divide source files into types correctly, you first need to understand the principle CAT tools use to split text into segments.
Besides the CAT segmentation rules, keep in mind that the translated text is usually 5-20% longer than the source.
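The effect of the 5-20% expansion is easy to estimate before translation starts. The snippet below is a minimal sketch: the function name and the idea of counting characters are illustrative assumptions, and only the 5-20% range comes from the text above.

```python
def expanded_length_range(source_chars, low=0.05, high=0.20):
    """Return the (min, max) character count to expect in the translation.

    The default bounds reflect the 5-20% growth mentioned above;
    counting characters is an assumption made for illustration.
    """
    return round(source_chars * (1 + low)), round(source_chars * (1 + high))


# A text box holding 1,000 characters of source text should leave
# room for roughly 1,050 to 1,200 characters of translation.
print(expanded_length_range(1000))  # (1050, 1200)
```

If a layout has no spare room for the upper bound, the translated text will overflow its frame and the file will need extra formatting work.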
The source file has a correct layout: the translation only needs to be formatted, and then the file is ready to be handed over to the client.
The source file has a wrong layout: fonts, lines, paragraphs, captions, pictures, and tables will not be in their proper places in the translation. Formatting won’t help. The only option is to rebuild the layout correctly. This work is done by DTP specialists, whose services raise the cost and extend the deadline. Visit this page to learn the cost of DTP & Layout.
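To see why a wrong layout hurts, it helps to look at what segmentation does with it. The sketch below assumes, as a simplification, that a CAT tool starts a new segment at every paragraph break and after sentence-ending punctuation. Real tools use configurable rule sets, so the `segment` function is a hypothetical stand-in rather than any tool's actual algorithm.

```python
import re


def segment(text):
    """Split text into segments at paragraph breaks and sentence ends.

    Simplified assumption: every newline is a paragraph break, and a
    sentence ends at '.', '!' or '?' followed by whitespace.
    """
    segments = []
    for paragraph in text.split("\n"):
        for piece in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if piece:
                segments.append(piece)
    return segments


# Correct layout: each sentence stays in one piece.
good = "Press the red button to stop the pump. Wait ten seconds."

# Wrong layout: hard line breaks inserted in the middle of sentences.
bad = "Press the red button\nto stop the pump. Wait ten\nseconds."

print(segment(good))  # 2 segments, one per sentence
print(segment(bad))   # 4 fragments, none of them a full sentence
```

With the broken layout the translator receives sentence fragments instead of sentences, which is why formatting alone cannot fix such a file.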
With this in mind, file formats fall into two broad types: original layouts and non-editable copies of original layouts.
1. Original Layouts of Electronic Files.
By providing such files for translation, the customer saves money and time. They usually have the correct layout and can be uploaded to the CAT tool and translated immediately. The system will segment the text correctly, and formatting the finished translation will take only a few minutes. The most common formats for such files are listed below:
- Microsoft Office: DOC/DOCX, XLS/XLSX, PPT/PPTX, PPS/PPSX, POT/POTX
- OpenOffice: ODT, ODP
- Text: TXT, RTF
- AutoCAD drawings: DWG (special utilities export drawing text to the MS Office package and import it back)
- Hypertext, source code: HTML, XHTML, PHP
- Bilingual interchange formats: XLIFF / XLF / SDLXLIFF / MQXLIFF, PO, TTX
- Desktop publishing: MIF, IDML
- Technical writing: DITA XML, HELP+MANUAL XML
- Localization: XML, Android XML, RESX, DTD, JSON, TJSON, YML, INC, INX, MIF, STRINGS, PROPERTIES
- Subtitles: SRT, TTML
- Script formats: STORY
- Packages: TTX, SDLPPX / SDLRPX, ZIP, WSXZ
2. Non-editable Copies of Original Layouts.
Often these are scanned documents and PDF files. Recognition (OCR) is the first step in preparing such files for translation. It means recreating the original layout with correct formatting. This work is done in FineReader by specially trained DTP specialists. Our team knows the intricacies of CAT tools and takes them into account when recreating the layout.
Recognition (OCR) services are much cheaper than DTP & Layout services, but the client still has to pay for them to get a competent, well-executed translation. The most common formats for such files are listed below:
- JPG/JPEG
- TIF/TIFF
- BMP
- PNG
- GIF
- DJVU/DJV
- DCX
- PCX
- JP2
- JPC
- JFIF
- JB2
- Any document format in which the text is saved as a picture, photo, or image.
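The two lists can be turned into a quick pre-check that sorts incoming files by extension. This is only a sketch: the `classify` function and the set names are hypothetical, the extensions are taken from the lists in this article, and PDF is placed in the non-editable group because PDFs are described above as typical non-editable copies.

```python
from pathlib import Path

# Extensions from the "original layouts" list in this article.
ORIGINAL_LAYOUT = {
    "doc", "docx", "xls", "xlsx", "ppt", "pptx", "pps", "ppsx", "pot", "potx",
    "odt", "odp", "txt", "rtf", "dwg", "html", "xhtml", "php",
    "xliff", "xlf", "sdlxliff", "mqxliff", "po", "ttx", "mif", "idml",
    "xml", "resx", "dtd", "json", "tjson", "yml", "inc", "inx",
    "strings", "properties", "srt", "ttml", "story",
    "sdlppx", "sdlrpx", "zip", "wsxz",
}

# Extensions from the "non-editable copies" list, plus PDF (assumption:
# treated as non-editable here, as in the typical case described above).
NON_EDITABLE = {
    "pdf", "jpg", "jpeg", "tif", "tiff", "bmp", "png", "gif",
    "djvu", "djv", "dcx", "pcx", "jp2", "jpc", "jfif", "jb2",
}


def classify(filename):
    """Return the file type suggested by the extension alone."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in ORIGINAL_LAYOUT:
        return "original layout"
    if ext in NON_EDITABLE:
        return "non-editable copy"
    return "unknown"


for name in ["manual.DOCX", "scan_001.tiff", "contract.pdf", "notes.xyz"]:
    print(name, "->", classify(name))
```

An extension check cannot catch the last item in the list: a DOCX that contains only scanned page images still counts as a non-editable copy, so the result should be treated as a first guess.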
Every non-editable copy has an original layout in electronic format somewhere. You just have to find it. It’s often enough to ask a colleague, partner, or supplier for it. This will help reduce the cost of Recognition (OCR) and DTP & Layout services. Sometimes DTP & Layout services cost several times, or even dozens of times, more than the translation itself.
Please don’t send files for translation that have been automatically recognized by web services, FineReader, or other applications. It will only make the work more difficult and expensive.