PDF to Text Not Rendering Thai Properly

Posted: Mon Oct 15, 2018 12:47 pm
by tbsqa
Hello,

I compare several hundred PDF documents in a variety of languages. Most of them are in English, but about a third are in other languages. I have no problem comparing Japanese, Chinese, or Arabic, but when I compare Thai, a large number of boxes show up in the comparison window. I'm fairly sure the issue lies with the PDF to Text plug-in, because I had to configure it a few years ago to get useful comparisons for the languages above, but I'm unsure whether it's an issue with ExamDiff Pro, Xpdf's pdftotext, or something else. I've visually checked the PDFs themselves and ruled them out as the cause.

Re: PDF to Text Not Rendering Thai Properly

Posted: Mon Oct 15, 2018 4:47 pm
by psguru
The plug-in is a third-party app (Xpdf), and, unfortunately, we have no control over it.

Re: PDF to Text Not Rendering Thai Properly

Posted: Tue Oct 16, 2018 4:00 am
by tbsqa
That is unfortunately what I thought. Thank you for the help.

Re: PDF to Text Not Rendering Thai Properly

Posted: Tue Oct 16, 2018 11:07 pm
by JeremyNicoll
According to https://www.xpdfreader.com/download.html there's a set of 'language packs' for Xpdf, and one of them is for Thai. I downloaded it and looked at it, and its instructions seem to say that the files in it just need to be placed in specific folders where Xpdf is installed, and a config file updated.

The example config file in my version of Xpdf (inside ...\ExamDiff Pro\Plug-Ins\Xpdf) even explicitly mentions Thai as an example, so I'd be surprised if it is not possible to make Xpdf's Thai support work.

It might be sensible first to download the most up-to-date Xpdf from the download page I mentioned above, along with the language pack(s) you need, and experiment with those outwith EDP. Then, if you can get that to work, replace the plug-in that EDP has with the up-to-date one.
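
For the "experiment outwith EDP" step, a small script can make the check less of an eyeball job. The following is only a rough sketch in Python, not anything supplied by Xpdf or ExamDiff Pro. It assumes the Xpdf command-line tool is called pdftotext and accepts the -enc and -cfg options (run pdftotext -h to confirm for your version). The function names and any file names you pass in are made up for illustration. The idea is to extract the text as UTF-8 and measure what share of the characters fall in the Unicode Thai block (U+0E00 to U+0E7F): a Thai document that extracts properly should score high, while one that comes out as boxes or junk should score near zero.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Unicode Thai block boundaries.
THAI_START, THAI_END = 0x0E00, 0x0E7F


def thai_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Thai block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    thai = sum(1 for c in chars if THAI_START <= ord(c) <= THAI_END)
    return thai / len(chars)


def build_command(pdftotext: str, pdf: str, out_txt: str,
                  cfg: Optional[str] = None) -> List[str]:
    """Build the pdftotext command line.

    Assumption: -enc selects the output encoding and -cfg points at an
    alternative config file, as in the Xpdf command-line tools.
    """
    cmd = [pdftotext, "-enc", "UTF-8"]
    if cfg:
        cmd += ["-cfg", cfg]
    cmd += [pdf, out_txt]
    return cmd


def extract_and_check(pdftotext: str, pdf: str, out_txt: str,
                      cfg: Optional[str] = None) -> float:
    """Run pdftotext on one PDF and return the Thai ratio of its output."""
    subprocess.run(build_command(pdftotext, pdf, out_txt, cfg), check=True)
    # errors="replace" keeps the script going if the output is not valid UTF-8.
    text = Path(out_txt).read_text(encoding="utf-8", errors="replace")
    return thai_ratio(text)
```

To use it, call extract_and_check with the path to the newly downloaded pdftotext, one of the Thai PDFs, an output file name, and (optionally) the config file you edited for the language pack. Running it once with the old plug-in's pdftotext and once with the new one gives a quick before-and-after comparison before anything inside EDP is replaced.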