Sometimes you need to translate a PDF file (or an image file), especially when you do not have the source document.
What do we mean by a “source document”? Glad you asked.
A PDF file does not come into existence as a PDF file. Often, the document is created in an application such as MS Word, MS PowerPoint, or InDesign.
It is always preferable to translate these source files rather than the resulting PDF.
We will explain why in a bit.
But before that, let’s look at what goes into translating a PDF file (or any image-based file format for that matter).
How much does it cost to translate a PDF file?
The global translation community (translators, translation agencies, and many of the clients that regularly purchase translation services) operates on a “per word” basis.
This means that clients receive a price quote based on the number of words their documents contain, and translators are compensated based on the number of words they work on.
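To make the per-word idea concrete, here is a minimal Python sketch. The function name and the rate used in the example are made up for illustration; this is not MotaWord's actual pricing logic or price list.

```python
from decimal import Decimal


def quote_price(word_count: int, rate_per_word: Decimal) -> Decimal:
    """Return a per-word quote, rounded to two decimal places.

    Both the function and the rate passed to it are hypothetical;
    real quotes depend on the provider and the language pair.
    """
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    # Decimal avoids the rounding surprises of binary floats in money math.
    return (word_count * rate_per_word).quantize(Decimal("0.01"))


# A 1,200-word document at an assumed rate of 0.10 per word.
example_quote = quote_price(1200, Decimal("0.10"))
```

With these assumed numbers, the quote is simply 1,200 words times 0.10 per word.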
So the number one issue in translating a PDF file is determining how many words it contains.
MotaWord does that using a hybrid system of language detection, OCR (optical character recognition), document size analysis, and document page number analysis.
All of these calculations allow our Artificial Intelligence to determine the number of words in an image-based document and to check that the calculated number makes sense. (If you are getting 5,000 words to be translated on a 1-page document, there must be a problem, right?)
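The sanity check described above could look something like the following Python sketch. The function name and the limit of 1,000 words per page are assumptions chosen for illustration, not values taken from MotaWord's system.

```python
def word_count_is_plausible(
    word_count: int,
    page_count: int,
    max_words_per_page: int = 1000,  # assumed ceiling, purely illustrative
) -> bool:
    """Return True if an OCR word count looks reasonable for the page count."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    # Flag counts that are negative or far too dense for the number of pages.
    return 0 <= word_count <= page_count * max_words_per_page


# The example from the text: 5,000 words on a single page looks wrong.
suspicious = not word_count_is_plausible(5000, 1)
```

A count that fails a check like this would be a signal to re-run the analysis or review the document by hand before quoting.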
So, in translating a PDF file, the first issue is determining the number of words.
The second issue, especially important when translating PDF files for USCIS use, is correctly detecting the language. We do that by running OCR on the content and detecting the language from the extracted text. There are times when our clients do not know whether a document is in Arabic or Farsi. The same kind of confusion happens when they send PDF files in Asian languages for translation.
MotaWord is able to detect the language of any document. This ensures that our clients can send in documents and start projects in a seamless way, without having to worry about figuring out what language a certain document is in.
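As a toy illustration of how software can tell Arabic from Farsi in OCR'ed text, the Python sketch below looks for four letters that Persian uses and standard Arabic does not. This is a simplified assumption and not MotaWord's detection method: real detectors rely on statistical models, and other languages such as Urdu use these letters too.

```python
# Letters used in Persian but not in standard Arabic: pe, che, zhe, gaf.
PERSIAN_ONLY_LETTERS = set("\u067e\u0686\u0698\u06af")


def guess_arabic_or_farsi(text: str) -> str:
    """Very rough guess: 'farsi', 'arabic', or 'unknown' (hypothetical helper)."""
    # Keep only characters from the basic Arabic Unicode block.
    arabic_script = [ch for ch in text if "\u0600" <= ch <= "\u06ff"]
    if not arabic_script:
        return "unknown"
    if any(ch in PERSIAN_ONLY_LETTERS for ch in arabic_script):
        return "farsi"
    return "arabic"


farsi_guess = guess_arabic_or_farsi("گربه")   # Persian word for "cat"
arabic_guess = guess_arabic_or_farsi("كتاب")  # Arabic word for "book"
```

A short Farsi text that happens to contain none of these four letters would be misread as Arabic, which is one reason a real system needs more than a letter check.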
Faster and More Accurate PDF File Translation
In November 2020 we pushed a major update to our language detection and OCR systems.
We work non-stop to make sure our systems keep getting better and faster. Thanks to this update, our translation quote engine is now 15x more accurate across all languages and 2x faster for the image and PDF files you submit for translation on MotaWord.
No matter how skewed your scanned documents are, and no matter what the background of your scan looks like, the quote engine will perform flawlessly. Feel free to give it a try on the quote page.
We also ensured that our “automatic language detection” works with image files as well. Previously it was only available for PDF files. Now we can also detect the language of JPEG and PNG files, two of the most commonly used formats. So the next time you are unsure of the language of a document, all you need to do is go to the MotaWord quote page and find out within seconds.
Free Service - who does not like that?
While we do not provide free translation of PDF files, getting a word count from a document in any language and format can easily be done using our fully automated quoting algorithms.
There is absolutely no obligation to purchase when you use the MotaWord quoting page, and our system will easily provide you with the number of words in your PDF document. This can be the first step in translating a PDF file that you have, even if you do not choose us to provide the service.
Another free service you can get from us is the language detection of a document.
If you are ever unsure of what language a birth certificate or immigration-related document is in, you can go to our quote page and upload the document, and our detection algorithm will tell you the language (and, of course, the number of words in your content).
No need to thank us; we love working with documents and getting to help our translation community. In fact, many of our translators use our ordering engine to get word counts on many different document types.
Translating a PDF file (or not)
At the beginning of this article, we said that translating a PDF file, while perfectly easy, is not the preferred method.
It is always preferable to translate the source file that created the PDF.
The number one reason is the ability to use a Translation Memory, along with MotaWord’s unique ability to detect “Duplicate Content”. We do not charge our clients for duplicate content within a document, and we keep a Translation Memory free of charge. Sending source documents therefore ensures that our clients pay only the bare minimum and are not unnecessarily charged for repetitions or previously translated content.
We wrote an article about this. We especially like the title: “Translation Memory - It's What Friends Use For Your Translations”.
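The duplicate-content idea can be sketched in a few lines of Python: split the text into sentences and count the words of each distinct sentence only once. The function name and the sentence-splitting rule are assumptions made for illustration; production tools segment text and match repetitions in more sophisticated ways.

```python
import re
from typing import Tuple


def billable_word_count(text: str) -> Tuple[int, int]:
    """Return (total_words, billable_words), counting repeated sentences once.

    Hypothetical helper: sentences are split naively on ., ! and ?.
    """
    segments = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    seen = set()
    total = 0
    billable = 0
    for segment in segments:
        words = len(segment.split())
        total += words
        key = segment.casefold()  # treat "Sign here." and "sign here." alike
        if key not in seen:
            seen.add(key)
            billable += words
    return total, billable


total_words, billable_words = billable_word_count("Sign here. Sign here. Thank you.")
```

In this example the repeated sentence is only counted once toward the billable total, which is the effect that working from clean source text makes possible.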
There are also formatting considerations.
When you send a PDF file to be translated, the formatting will be matched to the best of our ability, but all the images, special design elements, and fonts will be lost.
This is especially true for corporate presentations that can contain images, specific layouts, and of course pagination limitations.
In these cases, sending the source file will ensure that the translated document looks exactly the same as the original document in the source language.
Here we are: you now have much more information than you needed about translating a PDF file. But as comprehensive as we have tried to be, we realize that we might not have answered every question.
Never fear. If you are already a MotaWord client, you know that we are available 24/7 to answer any question you may have. All you need to do is go to the MotaWord home page and use our live chat function to speak to us.
If you’d just like to see how well our quoting algorithms perform on a PDF file, go to the quote page and give it a try.