Text from pictures in Excel

Excel recently added the ability to extract text from an image, either on the clipboard or from a selected file. To try this out I used a screenshot of a table with vertically aligned text, taken from a PDF copy of an AutoCAD file:

The procedure for importing the data is straightforward. Select Get-Data From Picture on the Data tab:

Select “Picture From Clipboard” and the process to detect and convert the text will start:

When complete it displays an image of the extracted text, with options to review or paste directly to the spreadsheet:

Unfortunately with vertical text the results were a little disappointing!

After rotating the image through 90 degrees (using IrfanView), the results were much better:

In the screenshot above, the data in columns A to C was extracted from the image on the clipboard, and the image itself has been pasted over columns E to I. The results are still not perfect, in particular:

  • Some 1s at the start or end of a number have been missed.
  • Some zeros have been converted to the letter “o”.
  • Spaces have been inserted into some numbers, usually associated with a 1.

In columns J to K the extracted text has been converted with the VALUE function, which returns either a number or a #VALUE! error where the text cannot be read as a number.

In columns N to P, VALUE has been used in conjunction with SUBSTITUTE to remove any spaces inserted between digits, so that the text can be converted to valid numbers. Note that the results still need to be checked carefully: there is no way to detect where digits have been dropped, other than a visual check against the original table.
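For anyone who prefers to do the clean-up outside the spreadsheet, the same idea can be sketched in Python. This is only a rough illustration, not the worksheet formulas used above: the function name clean_ocr_number and the sample strings are made up for the example, and the optional replacement of the letter "o" with zero is an extra step of my own that the Substitute formula does not perform. Python's float() is also not an exact match for Excel's VALUE function, which accepts other formats such as dates and percentages.

```python
from typing import Optional


def clean_ocr_number(text: str, fix_letter_o: bool = False) -> Optional[float]:
    """Return the number held in an OCR'd cell, or None if it cannot be parsed.

    Hypothetical helper: roughly the VALUE + SUBSTITUTE step, done in Python.
    """
    # Remove spaces inserted by the text recognition (the SUBSTITUTE step).
    cleaned = text.replace(" ", "")
    if fix_letter_o:
        # Assumption, not part of the spreadsheet formulas: treat o/O as zero.
        cleaned = cleaned.replace("o", "0").replace("O", "0")
    try:
        # Rough stand-in for VALUE.
        return float(cleaned)
    except ValueError:
        # Plays the role of the #VALUE! error.
        return None


# Made-up sample strings showing the kinds of error listed above.
samples = ["12 5.40", "3o.5", "17.25", "n/a"]
for raw in samples:
    print(repr(raw), "->", clean_ocr_number(raw, fix_letter_o=True))
```

As with the spreadsheet version, this can only repair spaces and misread characters. A digit that was dropped altogether still produces a valid-looking number, so the visual check against the original table is still needed.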
