The rise and fall of Google Books.

From Wikipedia, the free encyclopedia


I decided to write a series of posts on how to get your hands on things that, in theory, you are only allowed to look at, or are not even allowed to look at. Of course, this is not about boobs (everything is easier with them), but about electronic libraries and archives.

Some coolhacker will certainly clutch his belly with laughter while reading this text, full as it is of dense lamerism and naive noobism, but for many flotophages, i.e. people who are often well on in years and not very well versed in these Internets of yours, the text may be useful. The natural place to start is, of course, Google Books.

Pick any anonymizer marked with an American flag, click on it, and you land on a suspicious-looking site that usually has an "enter url" field. Type the coveted address http://books.google.com/ into it and click "Go".

If everything is fine, you see a picture similar to the one you get with google-sharing. If you are lucky with the anonymizer, 100% of the potentially downloadable books will be visible. Of course, downloading is not easy: the connection drops regularly and the download speed is very unstable, but, as they say, you don't look a gift horse in the ass.
The most reliable way is to skip the anonymizer and set up a proxy connection in the browser instead. To do this, google something like "US proxy list" and pick a site from the search results that offers a list of proxy addresses.

There is no point in giving links to specific sites, because there are thousands of them.
From here on it is trial and error. Open the browser settings. In Opera, go to "general settings > advanced > network > proxy server". In IE it is a little different (under "connections", if memory serves), but that is beside the point. On the site with the proxy list, pick at random one of the proxies of type HTTP (SOCKS does not seem to work). In the browser's proxy settings, copy the numbers before the colon into the long field, and copy the port number (the numbers after the colon) into the small field on the same line.

That's it. Click "OK" and, with the proxy configured this way, try to open some site. If the site opens, the proxy works, and you can safely go to Google Books. If no sites open, or they open extremely slowly, repeat the procedure with another proxy from the list. The method described above is essentially universal: through a proxy you can not only rummage through Google Books, but also get onto resources where you are banned or where you simply do not want to expose your IP.
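The "numbers before the colon, numbers after the colon" step can also be sketched in code. This is only an illustration, not something from the original post: the function names are made up, the address is a reserved documentation address rather than a real proxy, and the sketch uses Python's standard urllib instead of browser settings.

```python
import urllib.request
from typing import Tuple


def parse_proxy(entry: str) -> Tuple[str, int]:
    """Split a 'host:port' line from a proxy list into the two browser fields."""
    host, sep, port = entry.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {entry!r}")
    return host, int(port)


def build_opener_for(entry: str) -> urllib.request.OpenerDirector:
    """Build an opener that would send HTTP(S) requests through the proxy.

    Nothing is contacted here; a request happens only when opener.open() is called.
    """
    host, port = parse_proxy(entry)
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # 203.0.113.7 is a placeholder from a reserved documentation range.
    print(parse_proxy("203.0.113.7:8080"))  # ('203.0.113.7', 8080)
```

Opening any page with the returned opener (with a timeout) and seeing whether it loads corresponds to the "try to open some site" check described above.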
A real coolhacker would, of course, get clever and set up some fancy program such as Proxy Switcher, but why should a simple flotophage go to such trouble?

Sometimes Google has very tasty books (some even published fairly recently) that you can leaf through a little, page by page, and you can even do it from a Russian IP. The catch is that Google lets you look at only a couple of dozen pages at a time, and that's it. There are several ways to pull out much more than you are "supposed to". People get clever as best they can. Some visit the book they want from thousands of different IPs (by the way, going through a dozen Dutch anonymizers really can get you half the book in one sitting); some ask friends and acquaintances to download from such-and-such a page and email the result. Finally, many people use the google book downloader 2.3 script for Firefox (however, without hooking it up to a proxy switcher, which I personally never mastered, this is not very effective: only every fourth page gets downloaded, nothing more). By far the most effective solution until recently was the google book downloader program (developer adma). The latest version of the program was 0.7.0 (build 0). A great thing: it attacked the coveted book from dozens of different IPs, pulling pages in random order and waiting a few seconds before each attempt so as not to attract the attention of Google's bots. In three passes it downloaded 70-80% of the book (that is, the maximum possible) and saved the result, as you preferred, either as a PDF file or as a set of JPEGs in a separate folder. I downloaded twenty books this way and started on as many more, but never finished them, simply because "it's not worth it anymore."

It stopped working about a month and a half ago. Alas. One can only hope that sooner or later the same thing will happen as with earlier versions of the program, i.e. the developer will again appear out of nowhere and release a new version. In general, a search for "google book downloader" returns links to a lot of worthless junk that cannot really download even the permitted 30 pages, or to old versions of the same GBD (by the way, someone mentioned that old versions of GBD, build 0.6.9 in particular, still seem to work, but I won't lie: I haven't tried). So at the moment only the old-fashioned way remains: saving pages manually, one by one. After Google bans our IP ("your browsing limit is exhausted"), we enter the book through a Dutch anonymizer (found by googling "Dutch proxy"), save ten more pages, get banned, come in from another Dutch anonymizer, and so on, until we are blue in the face or the book gets a total ban (Google sometimes does this if it sees that the same book is being accessed too intensively).

I think I have said everything about Google Books, but I have a nagging feeling that I forgot something. If I remember, I'll update the post.

Next time I will write about HathiTrust.

Google Books was the company's first ambitious experiment. Yet even after 15 years, it has not managed to change the world. Journalist Scott Rosenberg tells how the idea of digitizing every book was born and how it developed.

Books can do wonders. As Franz Kafka once said, "a book should be an ax for the frozen sea within us."

This saying belongs to Kafka, right? Google can confirm. But where and under what circumstances did Kafka say it? A search turns up quotation websites, but you cannot rely on them. They tend to attribute every quote to Mark Twain.

To answer this and similar questions, you need Google Book Search, a tool that can search the text of millions of digitized publications. All you have to do is find the little "More" button at the top of the search results; it comes after the "Images", "Videos" and "News" tabs. Click this button and select "Books".

It turns out that the "frozen sea" quote appears in Kafka's "Letters to Friends, Family, and Editors", in a letter to Oskar Pollak dated January 27, 1904.
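The article describes the web interface, but the same kind of phrase lookup can be sketched against the public Google Books API instead. This is an assumption-laden illustration, not the method used in the article: the helper names are invented, and the response fields read here ("items", "volumeInfo", "searchInfo") should be checked against the current API documentation.

```python
from urllib.parse import urlencode

# Public "volumes" endpoint of the Google Books API.
API_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def build_quote_query(quote: str, max_results: int = 5) -> str:
    """Build a search URL that looks for the quote as an exact phrase."""
    params = {"q": f'"{quote}"', "maxResults": max_results}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def summarize(response: dict) -> list:
    """Reduce a decoded JSON response to (title, snippet) pairs.

    Missing fields become empty strings, so partial records do not break the loop.
    """
    rows = []
    for item in response.get("items", []):
        title = item.get("volumeInfo", {}).get("title", "")
        snippet = item.get("searchInfo", {}).get("textSnippet", "")
        rows.append((title, snippet))
    return rows


if __name__ == "__main__":
    # Only builds the URL; fetching it (e.g. with urllib.request) is left out.
    print(build_quote_query("frozen sea within us"))
```

Fetching the printed URL and passing the decoded JSON to summarize gives a short list of candidate books, which still has to be checked by hand to find where the quote actually comes from.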

Photo: Clive Darra/Flickr

Google Book Search is an amazingly effective tool for such tasks. When the service first appeared 15 years ago, it seemed impossibly ambitious: a young tech company that had only recently managed to tame and organize the vast information jungle of the World Wide Web was about to extend its search to the offline world. By scanning millions of printed books from the libraries that joined the project, the company could add to its database all of human history from before the advent of the internet.

"Books contain thousands of years of human knowledge, and probably the highest-quality knowledge," Google co-founder Sergey Brin said in an interview with The New Yorker at the time. "Not taking advantage of that is too big an omission."

Today, Google is famous for its approach to ambitious projects, its willingness to take on colossal tasks around the world. Many Google veterans agree that Books was the first such project in the company's history: just think, scanning all the books in the world!

At its inception, Google Books promised the world a vision of a literary utopia that would combine the convenience of the digital age with the wisdom of printed books. At the time, the idea seemed like a kind of singularity for the printed word: we would upload all the books into the ether, and somehow this would raise the literacy of everyone on Earth. Instead, Google Books has settled into a quiet middle age, serving up quotes and excerpts from the more than 25 million books in its database.

Google employees say they didn't expect more. Perhaps this is true. But we can say for sure that they made everyone else hope for more.

On the way from cosmic promises to the mundane, two things happened to Google Books. Shortly after launch, the project went from an idealist's paradise to a legal hell: authors began to contest Google's right to index their work, and publishers moved to defend their industry from the onslaught of electronic services. What followed was a decade-long legal battle that ended only last year, when the US Supreme Court turned away the Authors Guild's appeal and finally removed all barriers to Google's literary ambitions.

During that time, however, another change came over Google Books, one familiar to almost every organization or group of people bogged down in long legal battles: the project lost its former drive and ambition.

When I started working on this text, I was afraid that the Books project no longer existed as a meaningful part of Google, that the company had wound it down completely. Like many of the company's other ideas, Google Books has always been wrapped in a certain veil of secrecy, but now, when I began to ask questions, everyone seemed to vanish into thin air. For weeks I could not find anyone who could tell me even roughly about the current state of the project.

Photo: imadc/Depositphotos

Several former project employees I spoke with shared their suspicions that the company had stopped scanning books. Subsequently, I learned that a small group of employees are still working on the book search and adding new titles, albeit at a much lower intensity than at the peak of the project in 2010-2011.

“For us, flashy features that users notice right away are not so important,” says the current head of the project, Stephane Jaskiewicz, who has been with the team for about ten years. “We are more likely to work behind the scenes and hone the technical side: we add new content, process it so that the book can be viewed online, and fine-tune the search algorithm.”

One task has always been important for Google Books: improving the scanners that add new books to the "corpus", as the database is commonly called. At the start of the project, in 2002, Larry Page and Marissa Mayer set a metronome next to a digital camera on a tripod to estimate how long it would take to scan every book on Earth. Once the company set itself the goal of making scanning fast enough to be practical, the details of the work were carefully kept secret.

Jaskiewicz confirms that the scanning stations continue to improve, with updates released every six months. LED lighting, which was not so common when the project launched, helps a lot. So does teaching the scanner operators more efficient page-turning techniques. “It's a lot like picking a guitar,” says Jaskiewicz. “We find people who have their own ways of flipping pages: a particular use of the thumb and other similar methods.”

However, the bulk of the work at Google Books is still improving "search quality", so that you can find the Kafka quote you need even faster and more reliably. This is a game whose winners do not win universal acclaim; at most, an award for best player off the bench.

To understand how the Google Books project got to this point, you need to know a thing or two about copyright, which divides all books into three categories. Some books are in the public domain, meaning you can do whatever you want with their text. These are mostly books published before 1923, along with more recent works whose authors chose to forgo standard copyright. Many newer books are still in print and protected by copyright: if you want to use their text for your own purposes, you have to reach an agreement with the author and publisher. The third category is books that are still under copyright but out of print.

How many books fall into this category? It is hard to say, since no one knows exactly how many books there are on Earth in total. The number depends on what counts as a "book", and that boundary is not easy to draw. In 2010, Google engineer Leonid Taycher wrote in a blog post that, after analyzing the metadata, Google Books had put the total number of books (at that time) at about 130 million. Other experts called this study "nonsense". Most likely, the actual number of books is lower than Taycher's estimate, but significantly higher than the 25 million currently contained in the Google Books database.

Either way, a significant share of that huge number consists of "orphan works". Until recently they caused no particular trouble: you could borrow one from a library or find one in a bookstore without any problems. But as soon as Google announced that it wanted to scan these books and make them available on the Internet, everyone laid claim to them.

The legal debate that followed was, in effect, a custody battle over these orphans. Google, publishers and authors each tried to seize the right to control how these books would be brought to a new home in the digital world. In the end, the three parties reached a compromise known as the "Google Books Settlement". Under its terms, Google would have been able to put books online in full without having to pay compensation to copyright holders. However, in 2011 a federal judge rejected the settlement, citing the concern that a private for-profit company would become the permanent registrar of new books and collect a profit on all of the world's literature.

Once the deal fell through, Google went back to scanning books, and publishers plunged into the promising e-book market, where the success of the Amazon Kindle hurt Google's position in the race for leadership. But the Authors Guild did not drop its lawsuit, arguing that Google's audacious attempt to scan and index all books without the permission of the copyright holders was illegal.

Google is a rich company, but not rich enough to ignore the threat of multi-billion-dollar copyright fines (millions of books, with thousands of dollars in fines for each). The whole process dragged on until the US Supreme Court put an end to it last year, securing once and for all Google's right to catalog books and display brief excerpts ("snippets") in search results, just as it does for web pages.
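The "millions of books, thousands of dollars each" arithmetic is easy to check. The figures below are round placeholders chosen only to show the order of magnitude; they are not actual damage amounts from the case, and the function name is invented for this sketch.

```python
def total_exposure(books: int, fine_per_book: int) -> int:
    """Worst-case total in dollars if every book carried the same fine."""
    return books * fine_per_book


if __name__ == "__main__":
    # Illustrative inputs: one million books at one thousand dollars each.
    print(f"${total_exposure(1_000_000, 1_000):,}")  # $1,000,000,000
```

Even at the low end of "millions" and "thousands", the total already reaches a billion dollars, which is why the threat could not simply be ignored.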

This court decision was a fundamental achievement for the future of online search, both for Google and for the world. "Now it's an official precedent that everyone will benefit from," says Erin Simon, currently product counsel for Google Books. "This case will be in the textbooks. Above all, it matters for defining what exactly 'fair use' means." (Simon also noted with a smile that when the copyright holders filed their lawsuit, she had not even started law school.)

Even though the Authors Guild lost the case, its representatives are sure they fought for a just cause. Google "went down the wrong path from the start," says Guild president James Gleick. "They started out without the creative community whose work they are building on. Large companies treat creative work as if they had a 'right of the first night'. They consider themselves masters of the world. Instead, it was simply a matter of obtaining licenses."

One would think that victory in the Supreme Court would mean a new burst of energy for Google Books: start the scanners, full speed ahead! However, all indications are that nothing of the sort happened, partly because the database was already huge.

“We have a fixed budget for everything,” explains Jaskiewicz. “At first we scanned everything we could get our hands on. At some point there were a lot of duplicates.” Now Google has begun giving its partner libraries lists of the books it is most interested in.

Photo: Amy/Flickr

There are many other explanations for the loss of the former enthusiasm. The unpleasant aftertaste of litigation. The rise of promising new ideas that paid for themselves faster. And one more thing: the gradual realization that scanning all the books in the world, however useful it may be, may not change the world as much as we would like.

For many bibliophiles, Google's ambition to become the world's library never made sense: that was a job for some public institution. Once Google showed that scanning the world's literature was feasible, others took up the idea as well. Brewster Kahle's Internet Archive, which documents the development of the Internet, has already scanned a collection of its own. The Digital Public Library of America grew out of meetings at Harvard's Berkman Center in 2010 and now aggregates the digital collections of many libraries and organizations.

When Google negotiated with university libraries to scan their collections, the company committed to providing the libraries with copies of the data, and in 2008 HathiTrust began collecting these files and offering them for use. (It, too, had to defend itself against lawsuits from the Authors Guild.)

HathiTrust is made up of 125 organizations and institutions that “are confident that together we can support scholarly research and cultural exchange better than by going it alone or leaving it to companies like Google,” said HathiTrust director Mike Furlough. There is also the Library of Congress, whose new head, Carla Hayden, has undertaken to digitize its collections and make them publicly available.

Each of these organizations is in some way a competitor to Google Books. However, in reality, Google has gone so far ahead that it is unlikely that any of them will be able to compete with the company on an equal footing.

Many experts agree that it took Google several hundred million dollars to create Google Books, and no other organization would go to such an expense to get an alternative.

However, nonprofits have an advantage that Google lacks: they are immune to the reprioritization that can happen at a giant corporation. All their attention is focused on books, and they do not have to divide it between running one of the largest businesses in the world and maintaining an operating system for smartphones. Unlike Google, non-profit organizations will always be interested in finding new ways to connect readers with books that will help, as Kafka would say, to break the frozen sea within.

More than once in history, never-ending trials have turned into powerful whirlpools that dragged down and drowned everyone involved in the case. (In literature, this was most vividly illustrated by Dickens in Bleak House: the multi-generational case of Jarndyce v. Jarndyce ended with all the assets at stake being used up to pay legal fees.)

“If Google could take this corpus of data, divide it into genres, topics, time periods and every other possible category, and then give engineers and machine learning enthusiasts access to it, something interesting could come of it; right now it is impossible even to say what,” Sloan explains. He suggested that Google is already doing this internally. Jaskiewicz and others at Google declined to comment on these speculations.

Perhaps when some neural network of the future gains consciousness and feels Kafka's existential crisis, it will be able to find solace in the right book to help break the ice. Or, unlike us, this network will be able to read all the books we scanned, really read them, the way it should be done. What would it do then?