Monday, May 4, 2009

You Heard It Here First Fifth:
How Is Google Gonna Scan All Those Books?

Curious as to how Google was planning to scan millions of books for online resale, New Scientist magazine searched through stacks of the company's patents, hoping to find an answer. And they did! The story was quickly picked up by NPR, then reprinted by The Guardian UK, then linked to at Publishers Weekly, and finally squeezed of any remaining life by yours truly.
So what's the answer?
In NPR's words, "Google created some seriously nifty infrared camera technology that detects the three-dimensional shape and angle of book pages when the book is placed in the scanner. This information is transmitted to the OCR software, which adjusts for the distortions and allows the OCR software to read text more accurately. No more broken bindings, no more inefficient glass plates. Google has finally figured out a way to digitize books en masse."
And just like that, one of the world's few remaining mysteries is solved... anticlimactically. If you're the sort of person who enjoys reading patent-ese, look up Patent #7,508,978.