How to create a searchable PDF file using Acrobat X or XI

Learn how to OCR PDF to convert a PDF scan into a fully searchable PDF file.

By Lori Kassuba – December 13, 2011

 



Have you ever opened a PDF file only to find that none of the information is searchable? In this tutorial, you'll learn how to convert a scanned document into a fully searchable PDF file using Acrobat X or XI by applying OCR, which stands for Optical Character Recognition. To apply OCR, the original scanner resolution must be at least 72 dpi. In the Recognize Text dialog, you can set options such as the OCR language and the PDF Output Style. There are three PDF Output Styles to choose from.



Have you ever opened a PDF file only to find that none of the information is searchable? In this particular file, I’m unable to find any words within the document because it’s just a scanned image. However, if you have Acrobat X Standard or Pro, it’s easy to change your document into a fully searchable file: I can click OK on this dialog, or I can access the same command from the Tools pane by opening the Recognize Text panel and selecting the In This File command.

To apply OCR, which stands for Optical Character Recognition, the original scanner resolution must be at least 72 dpi. In the Recognize Text dialog, you can set options such as the OCR language and the PDF Output Style. There are three different PDF Output Styles you can select; I’m going to go ahead and use Searchable Image. This ensures that the text is searchable and selectable for cut and paste operations. This particular PDF Output Style also places an invisible text layer over the original image.
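
As a rough illustration of the resolution requirement, the effective resolution of a scan can be estimated from the image's pixel dimensions and the physical page size. The following Python sketch is not part of Acrobat. The function names and the US Letter page width are assumptions made for illustration; only the 72 dpi minimum comes from the tutorial.

```python
MIN_OCR_DPI = 72  # minimum scanner resolution stated in the tutorial


def effective_dpi(pixels: int, inches: float) -> float:
    """Return dots per inch for an image dimension of `pixels` spanning `inches`."""
    if inches <= 0:
        raise ValueError("inches must be positive")
    return pixels / inches


def meets_ocr_minimum(pixel_width: int, page_width_inches: float) -> bool:
    """Check the estimated horizontal resolution against the 72 dpi minimum."""
    return effective_dpi(pixel_width, page_width_inches) >= MIN_OCR_DPI


# Hypothetical example: a US Letter page (8.5 in wide) scanned 2550 pixels wide.
print(effective_dpi(2550, 8.5))      # 300.0
print(meets_ocr_minimum(2550, 8.5))  # True
print(meets_ocr_minimum(510, 8.5))   # False (about 60 dpi, below the minimum)
```

A scan that only just meets the minimum may still give poor recognition results, so this check is a lower bound rather than a quality guarantee.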

At this point, you can also downsample your file to reduce its size if necessary.
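
To give a feel for why downsampling reduces file size: resolution applies to both width and height, so halving the dpi leaves roughly a quarter of the pixels. The sketch below is simple arithmetic and does not reflect how Acrobat actually downsamples or compresses images. The function name and the sample resolutions are assumptions for illustration, and the real file size also depends on compression.

```python
def pixel_ratio_after_downsampling(source_dpi: float, target_dpi: float) -> float:
    """Fraction of the original pixel count left after downsampling.

    Resolution scales in both width and height, so the pixel count
    scales with the square of the dpi ratio.
    """
    if source_dpi <= 0 or target_dpi <= 0:
        raise ValueError("resolutions must be positive")
    if target_dpi > source_dpi:
        raise ValueError("downsampling cannot increase resolution")
    return (target_dpi / source_dpi) ** 2


# Hypothetical example: a 600 dpi scan downsampled to 300 dpi.
print(pixel_ratio_after_downsampling(600, 300))  # 0.25
```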

The Recognize Text command now converts your image to searchable text so that you can search and find words in your document. Your scanned image is now fully searchable and much more usable. The Recognize Text command can also be run on multiple files or folders using the In Multiple Files command located under the Recognize Text panel.



Products covered:

Acrobat XI, Acrobat X

Related topics:

Scan and Optimize



13 comments

Comments for this tutorial are now closed.

Lori Kassuba

November 10, 2015

Hi Sylester,

Some of the video tutorials have closed captioning. You’ll see a CC button in the control area that you can turn on.

Thanks,
Lori

Sylester

November 3, 2015

Hey guys, can you copy PDF transcripts over to video files?

Lori Kassuba

February 20, 2015

Hi Hiren Nakum,

You can create an Action to run OCR in batch but it does not create bookmarks.

Thanks,
Lori

Hiren Nakum

February 19, 2015

Good stuff!

Will this work in batch mode and will this also create bookmarks?

Thanks!

Hiren

Lori Kassuba

February 13, 2015

Hi Libby,

If you can’t search a PDF file, it’s probably because the scanned image has not been OCR’d. You can OCR a document using Acrobat (and export to Word), or you can use the ExportPDF service to convert PDF to Word (OCR is applied during processing).

Thanks,
Lori

Libby

February 12, 2015

If I have a PDF document, do I need Adobe Professional to convert it to a Word document? Or is it Acrobat?

After I convert it to a Word document, can I use the ‘find’ function?

Lori Kassuba

February 11, 2015

Hi Nabil Sinada,

Acrobat is designed as a desktop application, not a server application. Server applications can handle these types of workloads much more effectively.
Unfortunately, I was unable to locate any Acrobat ACEs in Saudi Arabia at:
http://training.adobe.com/certification/ace-finder.html#p=1&country=saudi arabia

Lori

Nabil Sinada

February 8, 2015

I want to convert approx. 500,000 hardcopy documents to PDF. Is your software capable of scanning and indexing that massive a quantity?
Secondly, do you have a nearby (Saudi Arabia) representative who can help me handle my project?

Lori Kassuba

January 29, 2015

Hi ghynson,

You’ll need Acrobat and not just the free Reader to run the Text Recognition. Or, you could subscribe to the PDF Pack service to convert scanned images to PDF and recognize the text. More information about this here:
https://www.acrobat.com/en_us/products/pdf-pack/benefits.html

Thanks,
Lori

ghynson

January 23, 2015

I don’t have a recognize text tab in the tools menu, what now?

Lori Kassuba

November 14, 2014

Hi Morben,

What do you see when running the Remove Hidden Information command? Don’t actually select the Remove button - just use it to Preview what is hidden.

Lori

Morben

November 12, 2014

I exported a page to a TIFF image, then converted it back to PDF. I resized it, ran Recognize Text, and couldn’t find any text. Any suggestions?

Lori Kassuba

October 8, 2014

Hi Nandeesh,

In order to search the text of a scanned file, you need to run the Text Recognition command. This command is only found in Acrobat and not the free Reader.

Thanks,
Lori

Nandeesh

October 6, 2014

I have scanned some pages to PDF format, and I have installed Adobe Reader 11.
Is it possible to search for words in Adobe Reader 11?

Lori Kassuba

July 1, 2014

Hi Bell,

You can set the primary OCR language but it is limited to those listed under the Recognize Text General Settings. There is a list of these languages in this discussion:
http://answers.acrobatusers.com/acrobat-xI-supports-languages-OCR-q67192.aspx

Thanks,
Lori

Bell

June 30, 2014

Is it possible to use this OCR for scanned papers, if on the paper there are small symbols (like small images, drawn with black, exactly like the letters) and the text is written not only horizontally, but also in the vertical direction?
Thank you!

Lori Kassuba

June 25, 2014

Hi Amanda,

This error appears if the PDF document already contains some type of editable text. So, perhaps the PDF is a mix of scanned images and other content. If you really need to OCR a specific page, you could export that page as a TIFF image, run OCR on it, and then replace the page. However, you might lose a bit of quality, especially if the page contains any vector information.

Thanks,
Lori

Amanda

June 23, 2014

Thanks for the video.

I am currently trying to make a 200 page PDF searchable and am finding that some pages pop up with the error that Adobe cannot perform OCR because “the pages contain renderable text”.

Is there a way to fix this?

Thanks.

Joe

April 16, 2014

Spot on! Thanks!

Inderpal Singh

July 5, 2013

Thanks for the video.

Lori Kassuba

April 24, 2013

Yes, just be sure to run the Recognize Text command on your images to create searchable text.

Thanks,
Lori

gary

April 18, 2013

I scanned my book pages as a JPEG and then converted it to PDF. Will I still be able to convert it to a searchable PDF file?

Pam

February 21, 2012

My text sometimes becomes blurry after optimizing my scanned document.
After I run “Optimize Scanned PDF” on a scanned document, the text becomes blurry, even though it was clear before I ran it.

Any suggestions?

Pam,

Try modifying your settings in the Optimize Scanned PDF dialog in the compression area - it sounds like you’ve lost some quality here.

Lori
