Don't know if this is helpful or not, but I built a Windows application for a friend of mine in the publishing industry. It creates royalty reports for individual authors from a single giant PDF that contains the data for thousands of authors. The LiveCode app shells out to pdfinfo.exe, pdftk.exe, and pdftotext.exe to do all the actual PDF processing. It converts each page of the PDF to text, reads that text to determine where one author's report ends and the next one begins, and then splits the massive PDF into separate PDF files, one per author's report, which can then be emailed to the appropriate recipients.
pdfinfo, pdftk, and pdftotext are all standalone executables. I store them in a folder "below" the main app. So to get all the text off of page "tPageNum" from the pdf "sFile", I run:
Code:
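-- Quoted path to pdftotext.exe in the "library" folder under the app's resources
put quote & specialFolderPath("resources") & slash & "library/pdftotext.exe" & quote into sPDFtotext
-- -f and -l restrict output to page tPageNum, -layout keeps the page layout, "-" sends the text to stdout
put shell(sPDFtotext && "-f" && tPageNum && "-l" && tPageNum && "-layout" && sFile && "-") into tShellResults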
Now I have the text of that page in tShellResults.
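Once every page has been read this way, the "where does one report end and the next begin" step comes down to grouping consecutive pages that belong to the same author. A rough sketch of that grouping, not the app's actual code: authorPageRanges is a hypothetical helper, and it assumes an author name (or ID) has already been pulled out of every page's text, one per line in page order. How that name is found depends entirely on the report layout.

```livecode
-- Hypothetical helper, not part of the original app.
-- pAuthors: one author key per line; line N is the author found on page N.
-- Assumes every page carries an author key (no blank continuation pages).
-- Returns one line per report: firstPage, lastPage, author (tab-separated).
function authorPageRanges pAuthors
   local tRanges, tPageNum, tAuthor, tCurrent, tStart
   put 0 into tPageNum
   repeat for each line tAuthor in pAuthors
      add 1 to tPageNum
      if tPageNum is 1 then
         -- The first page opens the first report.
         put tAuthor into tCurrent
         put 1 into tStart
      else if tAuthor is not tCurrent then
         -- A new author starts here, so close the previous report.
         put tStart & tab & (tPageNum - 1) & tab & tCurrent & return after tRanges
         put tAuthor into tCurrent
         put tPageNum into tStart
      end if
   end repeat
   -- Close the report that was still open on the last page.
   if tPageNum > 0 then
      put tStart & tab & tPageNum & tab & tCurrent & return after tRanges
   end if
   delete the last char of tRanges
   return tRanges
end authorPageRanges
```

For example, passing "Adams", "Adams", "Baker" (one per line) gives two lines back: pages 1 to 2 for Adams and pages 3 to 3 for Baker. Each line is then one page range to cut out of the big PDF.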
I haven't checked, but I expect that all these utilities also run on Mac. (I believe they were all originally Linux utilities.)
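The other two tools can be called the same way as pdftotext. A minimal sketch, again not the app's actual code: pdfPageCount and extractPages are hypothetical handler names, pPDFinfo and pPDFtk are assumed to be quoted paths built like sPDFtotext above, and pFile and pOutFile are assumed to be usable on the command line as given (paths with spaces would need extra quoting).

```livecode
-- Sketch only. pdfinfo prints a line such as "Pages:          1234";
-- this returns the number from that line, or 0 if no such line is found.
function pdfPageCount pPDFinfo, pFile
   local tInfo, tLine
   put shell(pPDFinfo && pFile) into tInfo
   repeat for each line tLine in tInfo
      if tLine begins with "Pages:" then
         return word 2 of tLine
      end if
   end repeat
   return 0
end pdfPageCount

-- pdftk copies a page range into a new file:
--    pdftk in.pdf cat 3-5 output out.pdf
-- Whatever pdftk prints (normally nothing on success) ends up in the result.
command extractPages pPDFtk, pFile, pFirst, pLast, pOutFile
   get shell(pPDFtk && pFile && "cat" && pFirst & "-" & pLast && "output" && pOutFile)
   return it
end extractPages
```

The page count gives the upper bound for the page-by-page pdftotext loop, and extractPages would be called once per author with that author's first and last page.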
If that's helpful to you, I'm happy to share more about it. And if not, no worries.
Jeff