Google has 25 million books scanned but nobody is allowed to read them - Calgarypuck Forums - The Unofficial Calgary Flames Fan Community

Cecil Terwilliger · 04-25-2017, 05:29 PM

Good read.

https://www.theatlantic.com/technolo...m_source=1-2-2

Quote:

You were going to get one-click access to the full text of nearly every book that’s ever been published. Books still in print you’d have to pay for, but everything else—a collection slated to grow larger than the holdings at the Library of Congress, Harvard, the University of Michigan, at any of the great national libraries of Europe—would have been available for free at terminals that were going to be placed in every local library that wanted one.

At the terminal you were going to be able to search tens of millions of books and read every page of any book you found. You’d be able to highlight passages and make annotations and share them; for the first time, you’d be able to pinpoint an idea somewhere inside the vastness of the printed record, and send somebody straight to it with a link. Books would become as instantly available, searchable, copy-pasteable—as alive in the digital world—as web pages.

Quote:

By 2004, Google had started scanning. In just over a decade, after making deals with Michigan, Harvard, Stanford, Oxford, the New York Public Library, and dozens of other library systems, the company, outpacing Page’s prediction, had scanned about 25 million books. It cost them an estimated $400 million.

taco.vidal · 04-25-2017, 06:12 PM

This is important.

Quote:

On March 22 of that year, however, the legal agreement that would have unlocked a century’s worth of books and peppered the country with access terminals to a universal library was rejected under Rule 23(e)(2) of the Federal Rules of Civil Procedure by the U.S. District Court for the Southern District of New York

CaptainCrunch · 04-26-2017, 08:24 AM

I'm glad they've chosen to keep the Necronomicon and the Darkhold out of normal people's hands.

Kybosh · 04-26-2017, 08:51 AM

That's not fair.

Mazrim · 04-26-2017, 09:16 AM

And to think one of the main thrusts of the opposition to this was that other companies wouldn't be able to do the same thing. Would every company really want to scan millions of old out-of-print books on the off chance they actually make some money?

The argument about science journals being outrageously priced is true though - there would need to be something to prevent that from happening.

temple5 · 04-26-2017, 09:17 AM

If Google scanned books that are still under copyright (I assume that exists for books) without paying the owners, why would anyone think they would be allowed to release them for free?

Older books, transcripts, maps, etc. should be fine, but for books you can still buy on Amazon, why would it be OK to release those for free?

I like Google's idea, but to think they would be able to circumvent that by force of will, that's a bit nasty for a corporation to do. I guess Google, Facebook, and Twitter have been doing that lately with no seeming hope of antitrust lawsuits, but wow.

CorsiHockeyLeague · 04-26-2017, 09:21 AM

Did you actually read the article? They weren't going to release the things for free. If you searched for a passage that appeared in a book you'd get a small excerpt of a few sentences on either side of it, like a card catalogue. Their argument is that this is totally different from you being able to read the book, which is true.

BloodFetish · 04-26-2017, 11:09 AM

I watched a TED Talk a while ago about how filling out a CAPTCHA helps in the effort by Google and others to digitize books.

Essentially, the random characters in a CAPTCHA are not random: each is a tiny part of a scanned book page that needs verification by humans because computers have trouble deciphering that bit with OCR.

https://www.ted.com/talks/luis_von_a...ation#t-205693

Mazrim · 04-26-2017, 11:23 AM

Quote:

Originally Posted by temple5

If Google scanned books that are still under copyright (I assume that exists for books) without paying the owners, why would anyone think they would be allowed to release them for free?

Older books, transcripts, maps, etc. should be fine, but for books you can still buy on Amazon, why would it be OK to release those for free?

The article said that books still for sale aren't part of this. The article also said that Google scanning first and asking for forgiveness after is what led to the settlement of the class-action lawsuit, where Google would offer the entire library to institutions, charge a nominal fee for out-of-print books, and keep the money in escrow until the author stepped up to claim the money they were owed from the sales.

The reason it fell through was that some people felt it gave Google too much of a monopoly on books (even though the Kindle store on Amazon has the lion's share of the ebook market), some felt Google was going to overcharge institutions for access in the future, and some felt that Google shouldn't be profiting at all from its efforts.

Coach · 04-26-2017, 11:24 AM

Quote:

Originally Posted by BloodFetish

I watched a TED Talk a while ago about how filling out a CAPTCHA helps in the effort by Google and others to digitize books.

Essentially, the random characters in a CAPTCHA are not random: each is a tiny part of a scanned book page that needs verification by humans because computers have trouble deciphering that bit with OCR.

https://www.ted.com/talks/luis_von_a...ation#t-205693

Watched this as well. They're also doing the same thing with Duolingo, translating the internet into multiple languages.