Convopage : @amac : .@scottros's in-depth Google Books article (url) is an example of press exacerbating an SV problem (+ contains errors) 1/

How Google Book Search Got Lost – Backchannel

Backchannel@backchnnl

Google Books was the company’s first moonshot. But 15 years later, the project is stuck in low-Earth orbit.

backchannel.com

69 replies and sub-replies as of Apr 14 2017

While I ❤ @backchannel, @scottros’s history of great work, a Google Books #longread & ref to Mr. Penumbra’s…, it is a really flawed piece 2/

SV (and many others) often does two things poorly. 1) don’t celebrate victories, & 2) don’t celebrate fine tuning and maintenance 3/

This article reinforces both. First, it misses that the moonshot reached the moon it was aiming for. /4

The Google Book Search moonshot was scanning tens of millions of books so that people could find them. /5

Remember, this was back when Google was mostly search and go somewhere else for the content. /6

You always had a much more modest vision of GB than Brin did. Users get that GB never lived up to hype. Clear that Page abandoned it. No?

Larry & Sergey & users can speak for themselves but I always understood the goal to be index. Google back then was mostly search->leave /a

I thought users would be well served by the settlement. Google & @AuthorsGuild tried that but court did not agree. /b

Alexis C. Madrigal@alexismadrigal

but, as Google said publicly at the time, the settlement structure was @AuthorsGuild idea, not Google's. /fin

James Gleick@JamesGleick

I, too, still feel that users (though we call them "readers") would have been well served by that settlement. A lost opportunity.

I do not think this was how it was widely understood at the time

Tim O'Reilly@timoreilly

I would agree that an honest assessment is that the settlement with the publishers and the Authors Guild scuttled the original big vision

Brandon Butler@bc_butler

Really?? What got scuttled?

Timothy B. Lee@binarybits

The theory of the settlement was that the class action mechanism could get Google the rights to display full digital copies of orphan works.

David Riordan🖖@riordan

There were going to be terminals in every public library where you could read any book at any time.

Brandon Butler@bc_butler

In the settlement, yes. Tim says the settlement scuttled a bigger vision. I don’t know what that was.

Timothy B. Lee@binarybits

Oh yeah I'm not sure what that larger vision would have been.

Brandon Butler@bc_butler

Right - and that a class action settlement could get them MORE than default fair use. Tim O says this was LESS than G’s original vision.

David Riordan🖖@riordan

The rhetoric of the lawsuit shaped the perception as much as the pronouncements from Google.

Opinion | A Library to Last Forever

Brin did speak for himself, and he offered up this grand nonsense:

Google’s books project is a win-win for authors, publishers and Google, but the real winners are readers, who will have access to an expanded world of books.

nytimes.com

I'm not sure Brin's op-ed is "nonsense," unless everybody who writes about the potential of digital libraries is also guilty of "nonsense."

It's total nonsense because 1) Nothing Google did resembled a library and 2) Preservation was never a standard or practice of Google Books.

Brin didn't write the headline. That's on the Times. /1

As for his claims about preservation, Google provided copies to libraries--for preservation (among other things)./2

No. Low-res scans with terrible metadata are not preservation-quality. Brin was bullshitting.

The Hathi Trust doesn't see it that way.

I'm not sure how often you speak to librarians or HATHI about preservation standards. But I can assure you that you are wrong.

Perhaps you've misunderstood the point. Hathi built itself on the Google scans.

That piece was trying to save the embattled settlement, which would have enabled them to scan many more books (and give libraries more)./3

Nothing about settlement enabled or prevented scanning. Scanning continued and continues.

So, incomplete and self-serving, but hardly "nonsense." 4/4

That was after the settlement was announced.

The idea was find relevant books and go to Amazon, the publisher, or a library to get the content. /7

The moonshot was thinking you could create full text search for tens of millions of hard copy books. /8

Many thought it could not be done in any reasonable time or cost. Including engineers on the team. /9

13yrs later, Google has tens of millions of books all full text searchable in a split second. That’s what a flag on the moon looks like. /10

+ many other projects were inspired or got new motivation through Google’s audacity. But the article dismisses that accomplishment. /11

Second, the less-glamorous work that engineers are now doing to maintain & tune the index is dismissed as less worthy. /12

This happens all too often in SV & is not the press’s fault, but that doesn’t mean it should be reinforced. /13

I for one am VERY happy that folks are still working to scan books, even if through lists of missing books rather than whole shelves. /14

I hope that engineers are trying to improve the book search algorithms. They work pretty well for me but incremental improvement is good /15

Incidentally, and understandably given the complexity, @scottros also gets a bunch of stuff wrong: /16

(a) the definition of “orphan works”: these are works whose copyright owners are unknown or can’t be found. /17

Out of print books whose owners are clear and easy to contact are typically not considered orphaned because rights can be acquired /18

(b) @AuthorsGuild lawsuit was re if scans for indexing & other stuff was fair use (as he says later) not “a custody fight over orphans” /19

(c) GB was never a “read sharing service” for the in-copyright books. The idea that GB started off as that & changed course is incorrect /20

Do I wish the Google Books settlement had been approved? Yes. Do I hope that another solution can be found for orphan works? Yes. /21

But, the original Google Books moonshot wasn’t about either of those. /22

Google Books is unique & useful today regardless of whether it takes a click or a visit to the library to read the book you find. /23

Scanning the world’s books so we could find them through full-text search was ~mostly~ accomplished. We should celebrate, not mourn. /24

And we should all continue to work on making books even more accessible & useful. And, as @scottros mentions, many including Google are. /25

I’m working on a little something in that vein as well, but I’ll save more details on that for another time. /26

Finally, a disclaimer. In case not obvious, I care personally (a lot) about book search and Google Book search. /27

I worked on the project for 6+years & will always have a soft spot in my heart for it. So take what I say here w/ a grain of salt /end

PS You should read @scottros's piece. It is good reporting. Don't let my rant be your only experience of it.

bowerbird@bbirdiman

google-books moon-shot achieved much of its claimed original goal. but there was much more lurking, all of which is now exclusive to google.

jessamyn west@jessamyn

Was surprised to not see more mention of the work @internetarchive has been doing with this. Their search could be more robust but it works.

Scott Rosenberg@scottros

Internet Archive's work is very important, I agree--deserves a whole 'nother article. This one was already getting... long.

jessamyn west@jessamyn

And I loved it, didn't want to just be an internet nitpicker. Appreciated the behind the curtain view. Always wondered what happened to them

Agreed about main lawsuit, but Settlement debate could fairly be construed as about "orphans" (broadly construed).

One might give @scottros a break on not attending to complexities of orphan works in a short piece. /1

Jessica McKenzie@jessimckenzi

There are competing meanings (outside US copyright office), and his is one of them. /2

Matthew Shaw@UnivLibDean

But what about rights? I love the discovery but unserved populations without #library access can't effectively utilise. #Google

Kiran@bkiran

True. Most books now are available as ebooks so the scanning project doesn't need to run but the searching project is as successful as ever

Melissa Levine@Msmsmele

The piece (almost inadvertently) gets at why Google and libraries are not the same thing - and why that's aok.

what's an SV problem?

Jessica McKenzie@jessimckenzi

SV is shorthand for Silicon Valley

oh oh of course, sorry. from the context I thought it was something that started with "Search" or "Service" or something