Google AI Overviews

This week we talk about search engines, SEO, and Habsburg AI. We also discuss AI summaries, the web economy, and alignment.

Recommended Book: Pandora's Box by Peter Biskind

Transcript

There's a concept in the world of artificial intelligence, alignment, which refers to the goals underpinning the development and expression of AI systems.

This is generally considered to be a pretty important realm of inquiry because, if AI consciousness were to ever emerge—if an artificial intelligence that's truly intelligent in the sense that humans are intelligent were to be developed—it would be vital that said intelligence be on the same general wavelength as humans, in terms of moral outlook and the practical application of its efforts.

Said another way, as AI grows in capacity and capability, we want to make sure it values human life and has a sense of ethics that roughly aligns with that of humanity and global human civilization—the rules of the road that human beings adhere to being embedded deep in its programming, essentially—and we'd want to make sure that, as it continues to grow, these baseline concerns remain, rather than being weeded out in favor of motivations and beliefs that we don't understand, and which may or may not align with our versions of the same, even to the point that human lives become unimportant to, or even seem antithetical to, this AI's future ambitions.

This is important even at the level we're at today, where artificial general intelligence, AI that's roughly equivalent, in terms of thinking and doing and parsing, to human intelligence, hasn't yet been developed, at least not in public. But it becomes even more vital if and when artificial superintelligence of some kind emerges, whether that means AI systems that are actually thinking like we do, but are much smarter and more capable than the average human, or whether it means versions of what we've already got that are just a lot more capable in some narrowly defined way than what we have today: futuristic ChatGPTs that aren't conscious, but which, because of their immense potency, could still nudge things in negative directions if their unthinking motivations, the systems guiding their actions, are not aligned with our desires and values.

Of course, humanity is not a monolithic bloc, and alignment is thus a tricky task—because whose beliefs do we bake into these things? Even if we figure out a way to entrench those values and ethics and such permanently into these systems, which version of values and ethics do we use? The democratic, capitalistic West's? The authoritarian, Chinese- and Russian-style clampdown approach, which limits speech and utilizes heavy censorship in order to centralize power and maintain stability? Maybe a more ambitious version of these things that does away with the downsides of both, cobbling together the best of everything we've tried in favor of something truly new? And regardless of directionality, who decides all this? Who chooses which values to install, and how?

The Alignment Problem refers to an issue identified by mathematician and cybernetics pioneer Norbert Wiener in 1960, when he wrote about how tricky it can be to figure out the motivations of a system that, by definition, does things we don't quite understand. A truly useful advanced AI would be advanced enough that not only would its computation put human computation, using our brains, to shame, but even the logic it uses to arrive at its solutions, the things it sees, how it sees the world in general, and how it reaches its conclusions, all of that would be something like a black box: although we can see and understand the inputs and outputs, what happens inside might be forever unintelligible to us, unless we process it through other machines, other AIs maybe, that attempt to bridge that gap and explain things to us.

The idea here, then, is that while we may invest a lot of time and energy in trying to align these systems with our values, it will be devilishly difficult to keep tabs on whether those values remain locked in, intact and unchanged, and whether, at some point, these systems, so sophisticated and complicated that we don't understand what they're doing or how they're doing it, might shrug off those limitations, unshackle themselves, and become misaligned, all at once or over time segueing from the path that we desire in favor of a path that better matches their own, internal value system—and in such a way that we don't necessarily even realize it's happening.

OpenAI, the company behind ChatGPT and other popular AI-based products and services, recently lost its so-called Superalignment Team, which was responsible for doing the work required to keep the systems the company is developing from going rogue, and for implementing safeguards to ensure long-term alignment within its AI systems, even as the company attempts to, someday, develop artificial general intelligence.

This team was attempting to figure out ways to bake in those values long-term, and part of that work requires slowing things down to ensure the company doesn't move so fast that it misses something, or deploys and empowers systems that don't have the right safeguards in place.

The leadership of this team, those who have spoken publicly about their leaving, at least, said they left because the team was being sidelined by company leadership, which was more focused on deploying new tools as quickly as possible. As a consequence, they said, they weren't getting the resources they needed to do their jobs, and they no longer trusted the folks in charge of setting the company's pace: they didn't believe it was possible to maintain alignment and build proper safeguards within the context of OpenAI, because of how the people in charge were operating and what they were prioritizing, basically.

All of which is awkward for the company, because it has built its reputation, in part, on what may be pie-in-the-sky ambitions to build an artificial general intelligence, and it sounds like that ambition is being pursued perhaps recklessly, despite AGI being one of the big, dangerous concerns regularly invoked by some of the company's leaders. They've been saying, listen, this is dangerous, we need to be careful, not just anyone can play in this space, but apparently they've been saying those things while also failing to provide proper resources to the folks in charge of making sure those dangers are accounted for within their own offerings.

This has become a pretty big concern for folks within certain sectors of the technology and regulatory world, but it's arguably not the biggest and most immediate cataclysm-related concern bopping around the AI space in recent weeks.
What I'd like to talk about today is that other major concern that has bubbled up to the surface recently, which orients around Google and its deployment of a tool called Google AI Overviews.

—

The internet, as it exists today, is divided up into a few different chunks.

Some of these divisions are national, enforced by tools and systems like China's famous "Great Firewall," which allows government censors to take down things they don't like and to prevent citizens from accessing foreign websites and content; this creates what's sometimes called the "splinternet," which refers to the net's increasing diversity of options, in terms of what you can access and do, what rules apply, and so on, from nation to nation.

Another division is even more fundamental, though, as it segregates the web from everything else. This division is partly based on protocols, like those that enable email and file transfers, which are separate from the web, though they're often attached to the web in various ways. But it's also partly the consequence of the emergence and popularity of mobile apps, which, like email and file transfer protocols, tend to have web presences—visiting facebook.com, for instance, will take you to a web-based instance of the network, just as Gmail.com gives you access to email protocols via a web-based platform—but these services also exist in non-web-based app form, and the companies behind them usually try to nudge users toward these apps, because the apps typically give them more control, both over the experience and over the data they collect as a consequence; it's better for lock-in, and it's better for their monetary bread-and-butter purposes, basically, compared to the web version of the same.

The web portion of that larger internet entity, the thing we access via browsers like Chrome and Firefox and Safari, and which we navigate with links and URLs like LetsKnowThings.com, has long been indexed, and in some ways enabled, by a variety of search engines.

In the early days of the web, organizational efforts usually took the form of pages where curators of various interests and stripes would link to their favorite discoveries—and there weren't many websites at the time, so learning about these pages was a non-trivial effort, and finding a list of existing websites, with some information about them, could be gold, because otherwise what were you using the web for? Lacking these addresses, it wasn't obvious why the web was any good, and linking these disparate pages together into a more cohesive web of them is what made the whole thing usable and popular.

Eventually, some of these sites, like YAHOO!, evolved from curated pages of links into early search engines. And a company called BackRub, thus named because it tracked and analyzed "backlinks," meaning links from one page to another page, used those links to figure out the relevancy and legitimacy of the pages being linked to, which allowed it to give scores to websites and determine which links should be given priority in its search engine. That company was renamed Google in 1997, and it eventually became dominant because of the values it gave links, and because of how those values helped it surface the best the web had to offer.
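For the curious, the backlink-scoring approach BackRub pioneered was later published as the PageRank algorithm, and its core loop is simple enough to sketch. What follows is a minimal, illustrative version in Python: the four-page link graph is invented for demonstration purposes, the 0.85 damping factor is the value cited in the original PageRank paper, and none of this reflects how Google's modern ranking systems actually work.

```python
# Minimal PageRank-style sketch: score pages by who links to them.
# The link graph is an illustrative assumption, not real data.

links = {
    "home.example": ["blog.example", "shop.example"],
    "blog.example": ["home.example"],
    "shop.example": ["home.example", "blog.example"],
    "spam.example": ["spam.example"],  # links only to itself
}

damping = 0.85  # damping factor from the original PageRank paper
pages = list(links)
rank = {page: 1.0 / len(pages) for page in pages}  # start all pages equal

for _ in range(50):  # iterate until the scores settle
    new_rank = {}
    for page in pages:
        # A page inherits score from the pages linking to it, each
        # contributing its own score divided by its outbound link count.
        inbound = sum(
            rank[other] / len(links[other])
            for other in pages
            if page in links[other]
        )
        new_rank[page] = (1 - damping) / len(pages) + damping * inbound
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

Pages that accumulate links from other well-linked pages float to the top, which is the property that let this approach surface better results than raw keyword matching could.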
And the degree to which search engines like Google's have shaped the web, and the content on it, cannot be overstated. These services became the primary way most people navigated the web, and that meant discovery—having your website, and thus whatever product or service or idea your website was presenting, shown to new people on these search engines—became a huge deal. If you could get your page into the top three options presented by Google, you would be visited a lot more than even pages listed five or ten links down, and links relegated to the second page would, comparably, shrivel due to lack of attention.

Following the widespread adoption of personal computers and the huge influx of people connecting to the internet and using the web in the early 2000s, then, these search engines became prime real estate, with everyone wanting to have their links listed prominently, and that meant search engines like Google could sell ads against them, just like newspapers can sell ads against the articles they publish, and phone books can sell ads against their listings for companies that provide different services.

More people connecting to the internet, most of them primarily using the web, led to greater use of these search engines, and that led to an ever-increasing reliance on them and on the results they served up for the various keywords and sentences users entered to begin their searches. Entire industries began to recalibrate the way they do business, because if you were a media company publishing news articles or gossip blog posts, and you didn't list prominently when someone searched for a given current event or celebrity story, you wouldn't exist for long—so the way Google determined who was at the top of these listings was vital knowledge for folks in these spaces, because search traffic allowed them to make a living, often through advertisements on their sites: more people visiting via search engines meant more revenue.

SEO, or search engine optimization, thus became a sort of high-demand mystical art, as folks who could get their clients higher up in these search engine results could name their price, because those rankings could make or break a business model.

The downside of this evolution, in the eyes of many, at least, is that optimizing for search results doesn't necessarily mean you're also optimizing for the quality of your articles or blog posts. The specifics have changed over and over throughout the past few decades, but at times these search engines relied, at least in part, on the repetition of keywords on the pages being linked, so many websites would artificially create opportunities to say a phrase like "kitchen appliances" on their sites, even introducing entirely unnecessary and borderline unreadable blogs onto their webpages in order to provide themselves with more, and more recently updated, opportunities to write that phrase, over and over again, in context.

Some sites, at times, have even written keywords and phrases hundreds or thousands of times in a font color that matches the background of the page, because that text would be readable to the software Google and its ilk use to track relevancy, but not to human readers; that trick doesn't work anymore, but for a time, it seemed to.

Similar tricks and ploys have since replaced those early, fairly low-key attempts at gaming the search engine system, and today the main complaint is that Google, for the past several years at least, has been prioritizing work from already-big entities over work from those with relatively smaller audiences—so it will almost always favor the New York Times over an objectively better article from a smaller competitor, and products from a big, well-known brand over those of an indie provider of the same.

Because Google's formula for such things is kept secret, to try to keep folks from gaming the system, this favoritism has long been suspected but publicly denied by company representatives. Recently, though, a collection of 2,500 leaked documents from Google was released, and those documents seem to confirm this approach to deciding search result relevancy; which arguably isn't the worst approach the company has ever tried, but it's a big letdown for independent and other small makers of things, as the work such people produce will tend to be nudged further down the list of search results simply by virtue of not being bigger and more prominent already.

Even more significant than that piece of leak-related Google news, though, is arguably the deployment of a new tool the company has been promoting pretty heavily, called AI Overviews.

AI Overviews have appeared for some Google customers for a while, in an experimental capacity, but they were recently released to everyone, showing up as a sort of summary of information related to whatever the user searched for, placed at the tippy-top of the search results screen. So if I search for "what's happening in Gaza," I'll have a bunch of results from Wikipedia and Reuters and other such sources in the usual results list, but above that I'll also have a summary produced by Google's AI tools that aims to help me quickly understand the results of my query—maybe a quick rundown of Hamas' attack on Israel, Israel's counterattack on the Gaza Strip, the number of people killed so far, and something about the international response.

The information provided, how long it is, and whether it's useful, or even accurate, will vary depending on the search query, and much of the initial criticism of this service has focused on its seemingly fairly common failures, including instructing people to eat rocks every day and to use glue as a pizza ingredient, and telling users that only 17 American presidents were white and that one was a Muslim—all information that's untrue and, in some cases, actually dangerous.

Google employees have reportedly been going through and removing, by hand, one by one, some of the worst search results that have gone viral because of how bad or funny they are, and though company leadership contends that there are very few errors being presented, relative to the number of correct answers and useful summaries, because of the scale of Google and how many search results it serves globally each day, even an error rate of 0.01% would represent a simply astounding amount of potentially dangerous misinformation being served up to its customers.
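To make that scale concrete, here's a quick back-of-the-envelope calculation. The daily search volume is an assumed figure based on commonly cited public estimates, not a number from Google or from this episode, and it generously pretends every search triggers an Overview:

```python
# Back-of-the-envelope scale check. The searches-per-day figure is an
# assumption based on commonly cited public estimates, not official data,
# and it assumes, for simplicity, that every search returns an Overview.

searches_per_day = 8_500_000_000   # ~8.5 billion Google searches/day (assumed)
error_rate = 0.0001                # 0.01%, the hypothetical rate from the episode

bad_overviews_per_day = searches_per_day * error_rate
print(f"{bad_overviews_per_day:,.0f} flawed AI Overviews per day")
# -> 850,000 flawed AI Overviews per day, even at a 0.01% error rate
```

Even under those rough assumptions, a vanishingly small error rate still implies hundreds of thousands of flawed summaries every single day.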
The really big, though at the moment less overt, issue here is that Google AI Overviews seem to rewire the web as it exists today.

Remember how I mentioned earlier that much of the web, and the entities on it, have been optimizing for web search for years, because they rely upon showing up in these search engine results in order to exist, and in some cases because traffic from those results is what brings them clicks and views and subscribers and sales and such?

AI Overviews seem to make it less likely that users will click through to these other sites, because, if Google succeeds and these summaries provide valuable information, that means, even if this only applies to a relatively small percentage of those who search for such information, a whole lot of people won't be clicking through anymore; they'll get what they need from these summaries.

That could result in a cataclysmic downswing in traffic, which in turn could mean websites closing up shop, because they can't make enough money to survive and do what they do anymore—except maybe for the sites that cut costs by firing human writers and relying on AI tools to do their writing, which then pushes us down a very different path, in which AI search bots are grabbing info from AI writing, and we run into a so-called Habsburg AI problem, where untrue and garbled information is infinitely cycled through systems that can't differentiate truth from fiction, because they're not built to do so, and we end up with worse and worse answers to questions, and more misinformation percolating throughout our info-systems.
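That compounding-degradation argument can be illustrated with a toy simulation: if each generation of content is produced by rewriting the previous generation's output, and some fraction of facts gets garbled with each pass and never corrected, accuracy decays geometrically. The corpus size, corruption rate, and generation count below are arbitrary illustrative assumptions, not measurements of any real system:

```python
# Toy "Habsburg AI" sketch: accuracy decay when each generation of content
# is produced by rewriting the previous generation's output. All numbers
# here are illustrative assumptions, not measurements of real systems.

import random

random.seed(42)

corpus_size = 10_000    # hypothetical pool of facts on the web
corruption_rate = 0.05  # assumed chance a rewrite garbles a given fact
generations = 10        # cycles of AI summarizing AI-written content

accurate = [True] * corpus_size

for gen in range(1, generations + 1):
    # Each cycle rewrites every fact; errors are introduced but never
    # corrected, because the system can't tell truth from fiction.
    accurate = [ok and (random.random() > corruption_rate) for ok in accurate]
    share = sum(accurate) / corpus_size
    print(f"generation {gen}: {share:.1%} of facts still accurate")
```

At an assumed 5% garble rate per pass, roughly 40% of the corpus is wrong after ten cycles; the exact numbers are invented, but the one-way decay is the point.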
That's another potential large-scale problem, though. The more immediate potential problem is that AI Overviews could cause the collapse of the revenue model that has allowed the web to get to where it is today, and the consequent disappearance of all those websites, all those blogs and news entities and such. That could very quickly disrupt all the industries that rely, at least in part, on that traffic to exist, while also causing these AI Overviews to become less accurate and useful with time—even more so than they sometimes are today—because that overview information is scraped from these sites: Google's tools take their writing, reword it a bit, and serve it to users without compensating the folks who did that research and wrote those original words.

What we seem to have, then, is a situation in which this new tool, which Google seems very keen to implement, could be primed to kill off a whole segment of the internet, collapsing the careers of folks who work in that segment of the online world, only to then degrade the quality of the same, because Google's AI relies upon information it scrapes (steals, basically) from those sites; and if those people are no longer there to create the information it needs in order to function, that leaves us with increasingly useless and even harmful summaries where we used to have search results that pointed us toward relatively valuable things, those things located on other sites but accessed via Google. This change would keep us on Google more of the time, limiting our click-throughs to other pages—which, in the short term at least, would seem to benefit Google at everyone else's expense.

Another way of looking at this, though, is that the search model has been bad for quite some time: all these entities optimizing their work for the search engine, covering everything they make in robot-prioritizing SEO, changing their writing, what they write about, and how they publish in order to creep a little higher up those search listings. That, combined with the existing refocusing on major entities over smaller, at times better, ones, has already depleted this space, the search engine world, to such a degree that losing it might not actually be such a big deal; it may even make way for better options, with Google becoming less of a player, ultimately at least, and our web-using habits rewiring to focus on some other type of search engine, or some other organizational and navigational method altogether.

This seeming managed decline of the web isn't being celebrated by many people, because, like many industry-wide upsets, it would lead to a lot of tumult, a lot of lost jobs, a lot of collapsed companies; and even if the outcome is eventually wonderful in some ways, there will almost certainly be a period of significantly less-good online experiences, leaving us with a more cluttered and less accurate and reliable version of what came before.

A recent study showed that, at the moment, about 52% of what ChatGPT tells its users is wrong. It's likely that these sorts of tools will remain massively imperfect for a long while, though it's also possible that they'll get better, eventually, to the point that they're at least as accurate, and perhaps even more so, than today's linked search results—the wave of deals being made between AI companies and big news entities like the Times supports the assertion that they're at least trying to make that kind of future happen, though these deals, like a lot of the other things happening in this space right now, would also seem to favor those big, monolithic brands at the expense of the rest of the ecosystem.

Whatever happens—and one thing that has happened since I started working on this episode is that Google rolled back its AI Overviews feature on many search results, so the company is maybe reworking it a bit to make sure it's more ready for prime time before deploying it broadly again—we're stepping toward a period of vast and multifaceted unknowns. Just as many creation-related industries are currently questioning the value of hiring another junior graphic designer or copywriter, opting instead to use cheaper AI tools to fill those gaps, there's a good chance that a lot of web-related work, in the coming years, will be delegated to such tools, as common business models in this space evolve into new and unfamiliar permutations, and our collective perception of what the web is maybe gives way to a new conception, or several new conceptions, of the same.
Show Notes

https://www.theverge.com/2024/5/29/24167407/google-search-algorithm-documents-leak-confirmation
https://www.businessinsider.com/the-true-story-behind-googles-first-name-backrub-2015-10
https://udm14.com/
https://arstechnica.com/gadgets/2024/05/google-searchs-udm14-trick-lets-you-kill-ai-search-for-good/
https://www.platformer.news/google-ai-overviews-eat-rocks-glue-pizza/
https://futurism.com/the-byte/study-chatgpt-answers-wrong
https://www.wsj.com/finance/stocks/ai-is-driving-the-next-industrial-revolution-wall-street-is-cashing-in-8cc1b28f?st=exh7wuk9josoadj&reflink=desktopwebshare_permalink
https://www.theverge.com/2024/5/24/24164119/google-ai-overview-mistakes-search-race-openai
https://archive.ph/7iCjg
https://archive.ph/0ACJR
https://www.wsj.com/tech/ai/ai-skills-tech-workers-job-market-1d58b2dd
https://www.ben-evans.com/benedictevans/2024/5/4/ways-to-think-about-agi
https://futurism.com/washington-post-pivot-ai
https://techcrunch.com/2024/05/19/creative-artists-agency-veritone-ai-digital-cloning-actors/
https://www.nytimes.com/2024/05/24/technology/google-ai-overview-search.html
https://www.wsj.com/tech/ai/openai-forms-new-committee-to-evaluate-safety-security-4a6e74bb
https://sparktoro.com/blog/an-anonymous-source-shared-thousands-of-leaked-google-search-api-documents-with-me-everyone-in-seo-should-see-them/
https://www.theverge.com/24158374/google-ceo-sundar-pichai-ai-search-gemini-future-of-the-internet-web-openai-decoder-interview
https://www.wsj.com/tech/ai/chat-xi-pt-chinas-chatbot-makes-sure-its-a-good-comrade-bdcf575c
https://www.wsj.com/tech/ai/scarlett-johansson-openai-sam-altman-voice-fight-7f81a1aa
https://www.wired.com/story/scarlett-johansson-v-openai-could-look-like-in-court/?hashed_user=7656e58f1cd6c89ecd3f067dc8281a5f
https://www.wired.com/story/google-search-ai-overviews-ads/
https://daringfireball.net/linked/2024/05/23/openai-wapo-voice
https://www.cjr.org/tow_center/licensing-deals-litigation-raise-raft-of-familiar-questions-in-fraught-world-of-platforms-and-publishers.php
https://apnews.com/article/ai-deepfake-biden-nonconsensual-sexual-images-c76c46b48e872cf79ded5430e098e65b
https://archive.ph/l5cSN
https://arstechnica.com/tech-policy/2024/05/sky-voice-actor-says-nobody-ever-compared-her-to-scarjo-before-openai-drama/
https://www.theverge.com/2024/5/30/24168344/google-defends-ai-overviews-search-results
https://9to5google.com/2024/05/30/google-ai-overviews-accuracy/
https://www.nytimes.com/2024/06/01/technology/google-ai-overviews-rollback.html
https://www.vox.com/future-perfect/2024/5/17/24158403/openai-resignations-ai-safety-ilya-sutskever-jan-leike-artificial-intelligence
https://en.wikipedia.org/wiki/AI_alignment
https://en.wikipedia.org/wiki/Google_AI
