I'm comparing a Word document (left) with a PDF file (right). The PDF file was generated from a newer revision of the Word document.
The PDF has hard line breaks in places where the Word document just visually wraps to the next line. As a result, ExamDiff concludes that the two files are very different.
Is there a way to get ExamDiff to examine sentences rather than lines?
I've tried:
- Replacing \n with spaces in ExamDiff
- Playing with the wrap type and width settings in ExamDiff
- Converting the word document to PDF, but this ends up producing a PDF with slightly different wrapping to the other PDF
- Playing with the margins in Microsoft Word to achieve the same PDF wrapping
No, there's really no way to do it in EDP. If there were a tool that converts line breaks within sentences to spaces, it could be used as a plug-in in EDP, but I don't know of such a tool.
In the end I used this process to make the files comparable:
- From ExamDiff, copy the text from the left pane into Notepad++
- In Notepad++
//The following removes the line breaks (carriage return + line feed)
Press Ctrl+H
Find what: \r\n
Replace with: (leave this blank)
Search mode: Regular Expression
Click 'Replace All'
//The following places each sentence on its own line
Press Ctrl+H
Find what: \.
Replace with: \n
Search mode: Regular Expression
Click 'Replace All'
- Do the same for the PDF content from ExamDiff
- Now create a new ExamDiff window and compare the two texts from Notepad++
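For anyone who would rather not do this by hand, the two Notepad++ replacements above could be sketched in Python roughly as follows. This is only an illustration: the function name one_sentence_per_line is made up here, and it assumes the text copied out of ExamDiff uses Windows (\r\n) line breaks.

```python
def one_sentence_per_line(text: str) -> str:
    """Mirror the two Notepad++ 'Replace All' steps (hypothetical helper)."""
    # Step 1: Find what: \r\n, Replace with: (blank).
    # Assumption: the pasted text uses Windows line endings.
    joined = text.replace("\r\n", "")
    # Step 2: Find what: \., Replace with: \n.
    # Every period becomes a line break, so each sentence gets its own line.
    return joined.replace(".", "\n")


if __name__ == "__main__":
    sample = "First sentence. Second one \r\nwraps here.\r\n"
    print(one_sentence_per_line(sample))
```

Note that, exactly like the manual steps, this drops the periods and joins a wrapped line directly onto the next one, so it relies on both texts being treated the same way before they are compared.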
If it's the process you described, you could write two scripts, one for .DOC files and the other for .PDF, using, say, sed, and use them as additional plug-ins for these respective file types (Options | Tools | Plug-ins).
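As a rough idea of what such a script might look like, here is a sketch in Python rather than sed. It is not an EDP feature and makes several assumptions: that the text has already been extracted from the .DOC or .PDF file, that the script is called with an input path and an output path (check how EDP actually invokes plug-ins), and that the names normalize and convert are just placeholders. Unlike the manual steps, it replaces line breaks with a space and keeps the sentence-ending punctuation.

```python
import re
import sys


def normalize(text: str) -> str:
    """Put each sentence on its own line (hypothetical helper)."""
    # Collapse every run of whitespace, including hard line breaks, to one space.
    flat = re.sub(r"\s+", " ", text).strip()
    # Start a new line after '.', '!' or '?' when a space follows it.
    return re.sub(r"(?<=[.!?]) ", "\n", flat) + "\n"


def convert(src: str, dst: str) -> None:
    # Assumption: src is plain text already extracted from the document.
    with open(src, encoding="utf-8", errors="replace") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8") as f:
        f.write(normalize(text))


if __name__ == "__main__":
    # Assumed calling convention: script.py <input file> <output file>
    if len(sys.argv) == 3:
        convert(sys.argv[1], sys.argv[2])
    else:
        print("usage: script.py <input file> <output file>", file=sys.stderr)
```

The sentence split is deliberately naive, so abbreviations such as "e.g." will also start a new line. That is usually harmless for diffing as long as both files go through the same script.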