PDFpen OCR Folder Action Script

As discussed on Mac Power Users episode 3, "Going Paperless," the nice people at Smile On My Mac put together an AppleScript that, when combined with a folder action, automatically runs OCR on documents using PDFpen or PDFpenPro. Here is the promised walkthrough:

What you'll need:

1. Some scanned PDF images;
2. PDFpen or PDFpenPro (See my review here);
3. A bit of patience.

Step 1 - Load up the Script Editor


Script Editor.png

This little application allows you to create and save AppleScripts.

Step 2 - Copy in the script below


on adding folder items to this_folder after receiving added_items
    try
        -- Process each item that was dropped into the folder
        repeat with i from 1 to number of items in added_items
            set this_item to item i of added_items
            tell application "PDFpenPro"
                open this_item
                set theDoc to document 1
                -- OCR the document one page at a time
                repeat with aPage in pages of theDoc
                    ocr aPage
                    -- Looks like we need to modify PDFpen so that we can detect when OCR is done; for now use 15 seconds
                    delay 15
                end repeat
                -- Save the recognized text back into the same file
                save theDoc
                close theDoc
            end tell
        end repeat
    on error errText
        -- Any failure stops the batch and shows the error message
        display dialog "Error: " & errText
    end try
end adding folder items to

-------------

Note - the script as posted targets PDFpenPro. If you use PDFpen instead, open the script and edit the command that reads tell application "PDFpenPro" so that it reads tell application "PDFpen".

Note 2 - WordPress seems to have converted the double dash before the comment into an em-dash and the quotes to smart quotes. Although I fixed it in the WordPress code, it still reverts to "fixing" things when I publish, so you'll have to correct those in your editor. Sorry. If anyone knows a better way to post AppleScript via WordPress, please drop me a note.

Step 3 - Save the script


You need to save it to a specific directory:

HD/Library/Scripts/Folder Action Scripts/

I named mine "PDFpen Scriptacular."

Step 4 - Create a folder


Save the folder wherever is convenient. Perhaps in your Documents folder or (for you anarchists) on the desktop. By the way, did you know that Command-Shift-N gets you a new folder? I named mine "OCR Drop."
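If you would rather script this step too, here is a minimal sketch that asks the Finder to create the folder. It assumes you want the folder on the desktop and named "OCR Drop" as above; change either to suit.

```applescript
-- Sketch: create the drop folder on the desktop if it is not already there.
-- The location (desktop) and the name "OCR Drop" are assumptions taken from this walkthrough.
tell application "Finder"
	if not (exists folder "OCR Drop" of desktop) then
		make new folder at desktop with properties {name:"OCR Drop"}
	end if
end tell
```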

Step 5 - Enable folder actions


Secondary-click (right-click) the folder and enable folder actions under the "More" item.
Enable Folder Actions.jpg

Step 6 - Configure Folder Action


Right-clicking the folder a second time gives you a new option, "Configure Folder Actions." Click it.
Configure Folder Actions-1.jpg

Step 7 - Pick Your Folder


On the menu that appears, hit the plus (+) sign under the "Folders with Actions" box.
FA pick folder.jpg

Select your folder, wherever you located it. It will then ask you to pick a script. Pick "PDFpen Scriptacular.scpt."
pick script.jpg

It should now look like this.
Script menu.jpg

Close the window and you are done.
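Steps 5 through 7 can also be done from a script instead of the contextual menus. The following is only a sketch using the Folder Actions commands in System Events. It assumes the folder is named "OCR Drop" and sits on the desktop, and that the script was saved as "PDFpen Scriptacular.scpt" in a Folder Action Scripts folder; adjust the names and path to match your setup.

```applescript
-- Sketch: turn on folder actions and attach the OCR script to the drop folder.
-- Assumed names: folder "OCR Drop" on the desktop, script "PDFpen Scriptacular.scpt".
set dropFolder to (POSIX path of (path to desktop folder)) & "OCR Drop"

tell application "System Events"
	-- Same effect as "Enable Folder Actions" in the contextual menu
	set folder actions enabled to true
	-- Register the folder if it has no folder action yet
	if not (exists folder action "OCR Drop") then
		make new folder action at end of folder actions with properties {name:"OCR Drop", path:dropFolder}
	end if
	-- Attach the script; System Events looks it up by name in the Folder Action Scripts folders
	tell folder action "OCR Drop"
		make new script at end of scripts with properties {name:"PDFpen Scriptacular.scpt"}
	end tell
end tell
```

Run it once from Script Editor. The folder should then show up in the Folder Actions setup window with the script attached, just as if you had clicked through the steps above.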

Now just drag a few PDFs in and let the script go to work. Copy the OCR'd PDFs where they belong and you are done. There are a few additional points:

1. There is no AppleScript command in PDFpen that reports when it is done doing an OCR, so instead the script waits 15 seconds after each page. The PDFpen wizards report they are going to try to fix this in a future release.

2. While this script generally works, it sometimes gave me an error when I overloaded it. Be patient.
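One way to make the script a little more forgiving is to catch errors per file rather than for the whole batch, and to skip anything that is not a PDF. What follows is my own variation on the script above, not something from Smile On My Mac, and I have not checked it against every PDFpen version. The PDFpen commands are the same ones used above; the error counting and the extension check are additions.

```applescript
on adding folder items to this_folder after receiving added_items
	set failure_count to 0
	set last_error to ""
	repeat with i from 1 to number of items in added_items
		set this_item to item i of added_items
		try
			-- Only hand PDFs to PDFpen; the comparison ignores case
			if name extension of (info for this_item) is "pdf" then
				tell application "PDFpenPro"
					open this_item
					set theDoc to document 1
					repeat with aPage in pages of theDoc
						ocr aPage
						-- Same fixed wait as the original script
						delay 15
					end repeat
					save theDoc
					close theDoc
				end tell
			end if
		on error errText
			-- Remember the failure and move on to the next file
			set failure_count to failure_count + 1
			set last_error to errText
		end try
	end repeat
	-- Report once at the end instead of stopping the batch at the first error
	if failure_count > 0 then
		display dialog "OCR failed for " & failure_count & " item(s). Last error: " & last_error
	end if
end adding folder items to
```

As with the original, change "PDFpenPro" to "PDFpen" if that is the version you own.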

I want to give my personal thanks to the gang at Smile On My Mac, particularly Greg, who put this script together for Mac Power Users just because we asked.