How to convert a (scanned) PDF into Word in Trados 2019
Thread poster: Ulrike Cisar

Ulrike Cisar  Identity Verified
Germany
Member (since 2015)
French to German
Jan 13

Hi,

I'd need some help from someone familiar with Trados 2019. I bought the latest update in autumn, mostly for its OCR software for the conversion of (scanned) PDF files into Word since I receive quite a large number of non-editable PDFs from a certain agency. I hoped the software would save me time and effort but somehow I just don't see how the conversion of the file in Trados is done. Watching tutorials hasn't helped, either. Is anybody here willing to help and share some useful advice?

Thank you!


 

Sonia Cunha-Goldner
United States
Member (since 2007)
English to Portuguese
Adobe Acrobat DC Jan 13

Hi, I use Adobe Acrobat DC to perform the OCR, and then import the PDF to Trados (I have 2021, but 2019 works too). Sometimes, it works, but sometimes I have to export from Adobe to Word and fix the errors, before importing to SDL Trados as a Word file.

 

Elena Bailey  Identity Verified
Spain
ProZ.com member
Spanish to English
Add pdf directly to project and double-check converted document in project folder Jan 13

I upload the PDF files directly to the project (when creating a new project). Trados converts the file when it "prepares" the project so you can then translate it. I usually double-check that the conversion has been done correctly before starting to translate by going to the project folder (wherever you have saved it, locally or in the cloud) and opening the source language folder. In this folder, you should see the original PDF file that you uploaded plus the converted file.
Sometimes Trados struggles with scanned documents or poor-quality files, so in these cases I also use Adobe to convert the PDF and then upload the Word file. But I would say that, in general, PDF conversion has improved in recent versions of Trados (I use the 2019 version too).
If for some reason your files are not converting at all, maybe double-check the "File Types" section under "Options", then scroll down to see the options selected for PDF files.
Send me a private message if you still have issues and I can send you some screenshots.
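
The folder check described above could also be scripted. This is only a minimal sketch, not something Trados provides: the function name `find_converted_files` is made up for illustration, and it assumes the converted file sits next to the PDF in the source language folder and has a name that starts with the PDF's base name followed by a dot. The actual naming may differ between Trados versions, so treat the matching rule as an assumption.

```python
from pathlib import Path


def find_converted_files(source_dir):
    """Map each PDF in source_dir to sibling files that share its base name.

    Assumption (not confirmed for any Trados version): a converted file is
    named "<pdf base name>.<something>", e.g. "contract.docx" or
    "contract.pdf.docx", and lives in the same folder as the PDF.
    """
    folder = Path(source_dir)
    results = {}
    for pdf in sorted(folder.glob("*.pdf")):
        prefix = pdf.stem + "."
        # Collect every other file in the folder whose name starts with the prefix.
        siblings = sorted(
            p.name
            for p in folder.iterdir()
            if p.is_file() and p != pdf and p.name.startswith(prefix)
        )
        results[pdf.name] = siblings
    return results
```

Calling `find_converted_files` with the path of your own source language folder returns a dictionary; an empty list next to a PDF name suggests that no converted file with a matching name was found there.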
 

Roy Oestensen  Identity Verified
Norway
Member (since 2010)
English to Norwegian (Bokmål)
I would invest in an OCR software Jan 14

I never trust the automatic OCR functions, as I find they mostly work well with editable PDF files, and even then the result isn't always satisfactory.

Personally I rather rely on ABBYY FineReader, which I can recommend. It gives me the option to indicate manually what parts of the document should be OCR-ed, which is especially useful when the PDF is a scanned, and therefore not editable, document. After that I can export it as, for instance, a Word document, which I can import into Studio.
 

finnword1
United States
English to Finnish
Use an external OCR software Jan 15

I agree 100% with Roy. Do the OCR first using stand-alone software, then clean up the formatting and any OCR errors, and import the result into whichever CAT tool you use.

 

Schtroumpf
German to French
Trados has never worked fine on PDF Jan 15

Despite their claims, and AFAIK, Trados has never succeeded in producing anything worthwhile from a PDF, even if the document was not scanned but generated directly from MSO.
This is not only my humble opinion but also what I have always been told in my professional environment, and some people there have really deep insight into Trados.
I agree 120% with the colleagues' suggestion that you should OCR your PDF before the Trados stage (and fix the layout and spelling as well).


 

Dan Lucas  Identity Verified
United Kingdom
Member (since 2014)
Japanese to English
Consider software, but also surcharges Jan 15

Roy Oestensen wrote:
Personally I rather rely on ABBYY FineReader, which I can recommend.

I initially misunderstood the OP's question and thought she was asking about editable PDFs. In my experience Studio handles those very well, and transparently.

For image-only PDFs, I too would recommend ABBYY FineReader (or Nuance OmniPage). The problem is that OCR processing of documents is seldom straightforward, even with the best OCR software out there.

I charge clients who want me to tackle a non-editable PDF file the equivalent of a couple of hours of my time to get the source file into a readable state. The OCR process is just the start of it. At the very least, you'd want to compare the readable Word document to the image-only PDF source document before you started work, to ensure there are no unfortunate errors or spelling mistakes.

In addition, when I quote them a charge for the OCR, it's surprising how often clients suddenly find themselves able to produce a Word file after all. I prefer it that way, not least because they then have responsibility for the source file.

Regards,
Dan


 

A. & S. Witte
Germany
Member (since 2007)
German to English
OK, sounds good. But does it really have to be subscription-based SaaS? Jan 17

Sonia Cunha-Goldner wrote:

Hi, I use Adobe Acrobat DC to perform the OCR, and then import the PDF to Trados (I have 2021, but 2019 works too). Sometimes, it works, but sometimes I have to export from Adobe to Word and fix the errors, before importing to SDL Trados as a Word file.


Admittedly, the above is a success rate that even a skilled operator of the automatic and semi-automatic modes of newer versions of FineReader, like my wife, cannot report: those modes mostly yield results that require manual post-OCR formatting for your CAT tool, which is why, in the case of poor source documents, she sometimes uses its intricate manual mode. So that cuts it for me, although the comparison between the two tools obviously depends on what sort of source documents you still accept (how poor they can be if you take them). However, does it really have to be subscription-based Software as a Service? See, we are only translators, not Jennifer and Jonathan Hart.

Cheers,

Sebastian


 

Anthony Rudd

German to English
PDFs and Trados Jan 19

I would agree with "Schtroumpf". I recently had to process a non-scanned PDF file. Although Trados recognized the text correctly, the order of a great many segments was "random", which made merging segments impracticable (many segments were partial sentences). memoQ processed the same file without any problems.

 

Ulrike Cisar  Identity Verified
Germany
Member (since 2015)
French to German
THREAD STARTER
How to convert a (scanned) PDF into Word in Trados 2019 Jan 19

Thank you all for your valuable input and sharing your experiences!

 

