Episode 101 - AI Safety with Shazeda Ahmed

Transcript

Ellie: 0:25

David, the U.S. State Department recently commissioned a report about how AI employees feel about the safety of their work and the incentives driving it. And let's just say the report was not particularly positive. It turns out that a lot of people working in the field of AI have a lot of concerns about its safety.

David: 0:50

Yeah, and the fact that it's coming from people who work in this industry is, I think, a particularly telling indicator that the general population, ourselves included, doesn't really know the full extent of the risks associated with these new technologies. It's similar to all the people who work for places like Facebook who don't allow their children on social media because they know just how addictive it is. So the fact that the people with knowledge of the inside are the ones raising the red flag should be a sign that we should pay attention to this a little bit more.

Ellie: 1:26

Yeah, the authors of the report that I mentioned spoke with over 200 experts for it, including employees at OpenAI, Google DeepMind, Meta, and Anthropic. And these are AI labs that are working towards what's known as artificial general intelligence. Now the term AI, artificial intelligence, is a very odd one, but the basic idea is that techies are working to create a simulation of the intelligence that humans recognize as such, which is mostly cognition, but also imagination, right? So the development of DALL-E was not just to stiff artists and have AI become the new creator of images. Actually, I heard the head of frontiers research at OpenAI, Mark Chen, speak about this in 2022. He said the idea was that in order to have an artificial general intelligence, you also have to have image-making capabilities. So basically, even if artificial intelligence is a misnomer, the idea behind it, or the idea behind artificial general intelligence, is that we're trying to simulate the full range of what humans recognize to be intelligence, and in simulating it, find ways to make it even better. And the reason I hesitate a little bit in talking about this is because what humans consider to be intelligence is, I think, a pretty open question. The fact that it's only recently that those in the AI community have considered image generation to be a big part of it is pretty telling. A lot of our notions of intelligence have to do with cognition specifically, but there's also this question of what the point of it is, which I have no good answers for but keep in the back of my mind.

David: 3:11

Yeah, and I think one of the questions we should be asking about a lot of this work in artificial intelligence is: what definition of intelligence are they working with to begin with, the one they're trying to replicate, right? Because often a lot of the people in this area are computer scientists and engineers, people who define intelligence largely just as information processing. And when you adopt a broader interpretation of intelligence that includes, let's say, social intelligence, emotional intelligence, moral thinking, it starts getting more difficult to know whether that's something you can replicate in a machine. Also, one thing I want to mention in this context is that it's not just that they're trying to replicate human cognitive capacities, as you mentioned, Ellie. It's that, especially when you're thinking about people who are closer to the transhumanist community, they believe that they are going to replicate human intelligence in a machine, and once that intelligence is created, we will merge with it through computer-human interfaces, such that a new superintelligence will emerge that is neither exclusively human nor exclusively machinic.

Ellie: 4:20

And this idea of a possible future where humans and machines fuse is for some a doomsday scenario and for others a utopian scenario. The idea is that machines won't be working for us, or us working for machines. But it's not even that we'll be working together; it's rather that we will be one, right? Which I think is attractive in certain respects, because a lot of people's concerns around AI safety have to do with this idea that in the future we're going to have AI overlords. But for others, I think, it risks losing what makes us human, or what makes us organic. And especially once you bring in the realities of the late capitalist world that we live under, even if you think that in principle humans becoming machinic and machines becoming humanlike, such that we ultimately merge, could be ideal or could be a good thing, I think many of us are concerned that we'll end up getting co-opted by a profoundly exploitative economic order.

David: 5:33

No, and I love how the same scenario is literally utopia for one group and dystopia for another group. And it's interesting to think that the distinction between utopia and dystopia might map onto the distinction between employee and boss in contemporary technology, because you mentioned that a lot of the employees who work in tech today have a lot of worries about the safety and the future of AI. But that's very different from the attitude that we find in what I sometimes call the golden boys of contemporary tech, the Ray Kurzweils of the world, the Elon Musks, so on and so forth, right? For them, the bosses, it really represents a utopia. And I think there is no better example of this than precisely Ray Kurzweil's book The Singularity Is Near, which came out back in 2005, and it's a behemoth of a book. It's about 650 pages of him talking about this utopian moment that is going to happen in the near future, which he calls the singularity: the moment when artificial intelligence will not only surpass human intelligence but will also merge with it. And he argues that at that point, we will become a kind of superorganism that is nothing but disembodied information and knowledge in a purified, crystallized form.

Ellie: 6:55

Kurzweil's position has been attractive for a lot of people, but it's also easy to parody, because he does read as so utopian about the potential of technology. There are a lot of people who have been voicing significant concerns recently about artificial general intelligence specifically, and I think that is where we want to begin thinking about the topic for this episode, which is ultimately going to be less about what recommendations we have for AI safety, and by "we" I don't just mean you and me but even the actual community, right? Our guest today is an expert on the character of the AI safety communities rather than on "if we do X, Y, Z, we will be safe from AI." But I think a few of the concerns worth pointing out come from the philosopher Nick Bostrom, who wrote a really influential 2014 book called Superintelligence that ended up having a big influence on Elon Musk and a bunch of other people in the tech world. In this book, Bostrom suggests that the rise of artificial general intelligence could cause some significant issues, including displacement of the workforce by artificial intelligence, political and military manipulation, and even possibly human extinction. And I think this threat of human extinction that Bostrom articulates in the book has been one of the main drivers of recent concerns about AI safety, in addition to the fact that AI has just improved significantly since Bostrom published the book ten years ago.

David: 8:32

Yeah, and Bostrom has this really fun kind of thought experiment to highlight some of these high-end, even if low-probability, doom scenarios.

Ellie: 8:40

A fun thought experiment about human extinction.

David: 8:43

about human extinction.

Ellie: 8:44

So philosopher of him.

David: 8:46

Yeah, just like turning everything into an interesting theoretical point. No, but it's called the paperclip maximizer. So he says, imagine that we create an intelligent machine that has one specific goal, and that goal can be very benign, like making paperclips. The problem is that we humans have no way of guaranteeing that we can predict exactly how that algorithm or that machine will go about achieving that end. So it could be that the machine decides, using the general intelligence it's been equipped with by its human creators, that it needs to use any possible resource in order to create more and more paperclips. And then it starts literally killing human life in order to produce more and more paperclips. And so here, it's not as if human extinction would come about from an algorithm that turns evil and wants to dominate us; it would actually be something much more mechanical and mundane that we can't control. And so he's trying to make some of the risks of AI more palpable to a general audience by making them seem more realistic.

Ellie: 9:58

By talking about death by paperclips.

David: 10:00

Yeah, it's not that they become diabolical. It's just that they are machines and they are indifferent to us. That's the point.

Ellie: 10:06

I feel like this thought experiment is what you would get if you asked Kafka to write Office Space. Let's talk with our expert guest today, who has far more intelligent things to say about AI safety than either of us could do justice to.

David: 10:24

Today, we are talking about AI safety.

Ellie: 10:27

How have philosophies stemming from utilitarianism become so dominant in discussions of AI safety in today's world?

David: 10:34

What are the most salient risks associated with AI today?

Ellie: 10:38

And how has the rise of funding and interest in AI safety come to shape today's understanding of what the risks posed are? Shazeda Ahmed is a Chancellor's Postdoctoral Fellow at UCLA's Institute of American Cultures. A specialist in AI safety, AI ethics, and technology in China, Dr. Ahmed has held positions at Princeton University, the Citizen Lab, and the AI Now Institute. In addition to her peer-reviewed work, Dr. Ahmed's research has been featured in outlets including Wired, the Financial Times, The Verge, and CNBC. Dr. Ahmed, welcome to Overthink. We are so excited to have you.

David: 11:19

And there is so much demand for us to have an expert on AI and ethics and safety. So there are a lot of people out there, fans of ours, who are just dying to hear this conversation. So thank you for making the time.

Shazeda: 11:31

Thank you so much for the invitation.

David: 11:33

Let's jump in by thinking about the recent explosion of interest in AI, which has happened especially in the wake of the recent successes of LLMs, large language models. I'm thinking here especially of ChatGPT, but also other models. And it makes sense that a lot of people are posing questions about the safety of contemporary AI. But I think that when the average person thinks about AI and safety together, they often conjure up images of a dystopic techno-future, drawn maybe from their favorite sci-fi movie, where an algorithm or a machine sort of malfunctions, runs amok, and then subjects us to its will. And I think that most contemporary discussions among people who work in this field don't really take that form, right? It's not about those scenarios. So can you tell us, just by way of beginning, what people working on AI and safety today are actually concerned about when it comes to modern AI?

Shazeda: 12:38

There are multiple ways of approaching questions around AI and safety outside of what the community that calls itself AI safety works on. So for many years before such a community existed and used that name for itself, you had people working on issues like bias, right? If you use a hiring algorithm in your software that's screening job candidates and it's trained on CVs that are mostly from white men who went to Ivy League schools, what will that mean for candidates who don't fit that description? Bias, discrimination, false negatives and positives, and associating certain characteristics with people. The kind of classic real-world examples are face recognition systems used for policing that recognize the wrong person and can lead to the wrongful arrest of that person. So those are the issues that had been percolating and people had been talking about for many years. But when it comes to these more dystopic examples that you brought up, that is very specifically related to a community of people who have come together around a series of interlinked movements. So effective altruism, longtermism, existential risk: this group of people had been thinking about how to promote human flourishing centuries from today. And they try to think on that kind of long-term horizon. And thinking about that, they also had to contemplate risks that could make that flourishing impossible. And they've created a lot of hypothetical scenarios about the possibility that we could end up with artificial intelligence that is smarter than humans. And again, they would call that artificial general intelligence, or AGI, and there are competing definitions of what that entails. When it comes from a company like OpenAI, Sam Altman will repeatedly say AGI is artificial intelligence systems that outperform humans at every economically valuable task. A lot of the recent hype, a lot of the recent headlines people are seeing and parroting back about the possibility that AI could kill us all, comes out of that community and a lot of these speculative fears. The moment ChatGPT blew up, and people were really astounded by its performance and by text outputs that were in some cases indistinguishable from something a human being might write, you had a lot of people who were on the fringes of those communities and were skeptical suddenly becoming very concerned about AI risk. But that is the main thing that happened, right? It's that ChatGPT was widely released to the public and people got to see its performance. In my opinion, that is a marketing stunt, right? What better way to get people to want to use your product than to let them use it, and also to get tons of free feedback on the ways they're going to try to break it and misuse it. So a lot of my research, when I was at Princeton working with four of my colleagues, looked at how these ideas came to spread as quickly as they did. What are their epistemic claims? What does it mean to produce knowledge on something so speculative? And some of the issues we saw coming out of the communities of people interested in this: they talk about a thing called the alignment problem. So they say we can live in a utopic future with AGI if we can align AGI with human values. And that is a whole process. It's eliciting what those human values are, figuring out how to encode them into AI systems, testing them to ensure that they would be aligned under a variety of circumstances.
And so the fear of these scenarios of AI posing a threat to the future of human existence, an existential or X-risk, comes from misalignment, and there are tons of people working on alignment. We did a whole study asking people what it even means to work on alignment: if you really think this is the way, what are you doing in your work every day, and how do you know you're getting there? And so I hope this unpacks a little bit of why policymakers are suddenly thinking about this, why this is in the news so often. Our research showed that the community around these ideas wasn't that big when we started doing this research in 2022, but it's very well funded through a series of philanthropies and also tech billionaires who are individual donors. There's a lot of voluntary labor that goes into this. There are people producing research that they'll put up in web forums. It's not peer reviewed, but there's such an urgency around these ideas within the community that people are riffing off of each other and building off of each other's work very quickly.

Ellie: 16:55

Yeah, and it's so interesting to hear you talk about it, because it's so obvious that there's an intersection here between technology and human values at which philosophy is at the center. And it's funny, because on the one hand, I feel like a lot of people in today's world think that philosophy is this sort of ivory tower pursuit or this useless symptom of the decline of the humanities: philosophy maybe used to have its place, but it doesn't anymore. And it's clear when you look into AI safety and AI ethics that a lot of the people who are developing some of the big-picture ideas around this are philosophers, and also that some of the ideas the non-philosophers who are important in this community are building on come from philosophy, especially from utilitarianism in the nineteenth century. And so I'm curious what you think about this, because even though, as a philosopher, it's exciting to me to think about there being an important role that philosophy is playing at the cutting edge of human technology, it's also, I think, clear from your research that this is being done in pretty problematic ways in certain respects. And so I think this is also a question about who is in the AI safety community, right? They're coming from all these different walks of life, thinking about the intersection between AGI, ethics, and policy. Who are the stakeholders here, and how do you see philosophy playing into this?

Shazeda: 18:26

Sure. I think I would break the stakeholders down along disciplinary lines, institutional lines, and then who gets recruited into the community. So along disciplinary lines, when we were looking at who does AI safety work, we started by looking at Sam Bankman-Fried's FTX Future Fund. So Sam Bankman-Fried is the now disgraced former crypto billionaire. He was a big effective altruist who really seemed to believe in the values of the community. And to back up and explain what that is: effective altruism was a movement founded by graduate students in philosophy from Oxford and Cambridge. They were really interested in applying utilitarianism and thinking about maximizing human welfare, not only in the current day, but really in that far-term future. And they came up with ideas like earning to give, right? So getting the highest-paid job you can and donating 10 percent, and gradually more, of your income to charities that they thought were related to cause areas that could produce that vision of a future utopia. And as I was mentioning, that gets tied into them thinking about that long-term future and thinking about threats to it. Around the same time, you had another Oxford philosopher, Nick Bostrom, writing books like Superintelligence and thinking through thought experiments about that future, which would involve either having, quote unquote, superintelligent artificial intelligence or artificial general intelligence. He has eugenicist thought experiments in there about what it would look like if we selected out certain fetuses in human reproduction and tried to reproduce for specific traits. I bring this up to say that the disciplinary backgrounds of people who become attracted to those things, we noticed, were computer science and engineering, statistics, math, some philosophy, and physics, right? There were a few disciplines that quite a few people came from. They would start reading some of the blogs in this space that engage with some of these issues. They'd maybe read Superintelligence, or read MacAskill's... well, What We Owe the Future is his newer book, but he had older books on effective altruism. They read Toby Ord's The Precipice, and this became the set of books, the canon, for this growing community. And then in terms of institutional backgrounds, you have a lot of people working in tech companies. I did graduate school at Berkeley in the Bay Area, and I had been brushing shoulders with these people for years, but they had always been on the edges of some of the conversations in academia. You certainly have academic computer scientists, but it's not mainstream. Even now, there's a whole infrastructure in this space that's been making me think a lot about how one of the things I really like about working on tech from an interdisciplinary and justice-oriented perspective is that there was this paper a few years ago that talked about social roles for computing. And one of them was computing as synecdoche, the idea that looking at computational systems that have social effects can make you reflect on what is structurally unacceptable about those institutions, right? So think about Virginia Eubanks's book Automating Inequality, where she comes up with this idea of the digital poorhouse. She's talking about how algorithmic systems in welfare distribution, in determining whether or not to allocate housing to an unhoused person, create a kind of digital poorhouse.
And the point of a book like that is to show that there are all these social structural forces that get baked into algorithms and create a digital poorhouse effect, right? It's not just the technology itself doing it. So computing as synecdoche has been really interesting to apply to the institutions popping up around AI safety, because they have their own philanthropies. They are creating their own nonprofits, their own think tanks, their own polling organizations that create surveys, and thus statistics, around what percent of the population believes that AI is going to kill us all, and then pumping those things out to the media and to policymakers in the same way that many of these institutions that have already existed, but did not have this worldview, have been doing forever. And so a lot of the time people will ask me, do you feel like this is a distortion of those institutions? And I'm like, no, it's like computing as synecdoche: they're using all of those institutions the way they have been designed to be used. There's a deep irony in AI history. There's a recent book called How Data Happened by Matt Jones and Chris Wiggins where they have only one chapter on AI, which I appreciate, because AI is not really the whole history of technology. And they talk about how AI was basically a marketing term that the mathematician John McCarthy credits himself with coming up with when he wanted funding from the Rockefeller Foundation. So they say that from its inception, AI was a hype term to create a field and distinguish it from other fields, but it was really very much steeped in other things that were happening, like symbolic approaches to creating what we're now calling artificial intelligence. And I find that history so relevant when you look at that effective altruism and AI safety intersection: it wouldn't have catapulted into public attention if there weren't hundreds of millions of dollars in the philanthropies this space has created. They have spun up National Science Foundation grant money, like $20 million grant pools, to fund this work and to pump it through more traditional institutions that are slower and not really equipped for what happens when work that many researchers contest as pseudoscientific and fear-mongering takes up residence, in terms of students asking for courses on things like AI alignment and forming campus groups, tech companies potentially building teams around this, tech companies reorienting themselves so that suddenly everyone's working on large language models when they could have been working on all sorts of other things. As I've been doing this work, I've been paying attention to what it says about how fragile all of these institutions were that they so quickly fell for this hype.

David: 24:07

Yeah. And the infusion of cash into this discussion is so real, just in the last several years. I've noticed it as a philosopher and as somebody who works in a humanities space.

Ellie: 24:17

And as a San Francisco resident...

David: 24:19

Yeah, yeah. In San Francisco, I see that all over the place, right? This venture capital interest in AI, and also the kind of braggadocious attitude you get from people who are not experts in philosophy, who just appropriate philosophical ideas and throw them around because it gives them a certain kind of cultural and technological capital that they are very happy to cash in on the moment they get the opportunity. But the point here is that, yeah, there's a lot of money rushing into this intersection of ethics and AI. And a lot of this seems to involve, in particular, the effective altruism group, which is one of the sub-communities that belong to this larger umbrella of the AI safety community. And so I want to talk about them for a little bit, because the effective altruism group, of course, bases a lot of its thinking on William MacAskill's writings and publications, and he himself was deeply influenced by Peter Singer's utilitarianism. And I want to get your take on this community. So tell us, just in a few words, what effective altruism is, and then how you interpret the work that it's doing in this space.

Shazeda: 25:32

I would say there are two things of late that have influenced how I think about effective altruism. One is an article Amia Srinivasan wrote in 2015 called Stop the Robot Apocalypse. She was questioning: what is the value of utilitarianism to practical things like policymaking? What is effective altruism? As of that date, what was it really doing? And I think the conclusion she settled on was that it made people feel like they were contributing to some kind of radical change when really they were maintaining a status quo. And I would argue, yeah, that was written eight years ago, and what has really changed, other than in the last year or two, when so much money has been poured into this, which has really changed conversations and some of the things people are working on? But when you work on something like AI safety, and it's super loosely defined by that community, it's very hard to point to specific wins and benchmarks, or measurable signs that you've made systems safer. And that is some of the critique I see coming from really critical colleagues in computer science: these are such vague terms and narrow definitions. Something like systems engineering, the kinds of things you would do to make sure that a plane is safe, you can do that for certain types of AI systems if you look at them in their context, but a lot of this is so decontextualized and so abstract. So that's one thing. The other, when I think about effective altruism: there's this really wonderful graduate student in Europe, Mollie Gleiberman, who's been writing about the difference between public EA and core EA. And she's arguing that a lot of the areas more related to, maybe, Peter Singer's work, issue areas like animal welfare, right? Effective altruists care about pandemic prevention; they've talked about preventing nuclear destruction. Some of these ones that seem reasonable to an outside observer, more palatable, that is public-facing EA. But core EA is these ideas that are more contestable, right? The possibility that AI could kill us all, some of these more transhumanist, problematic beliefs: that is at the core and is kind of Trojan-horsed in by this other set of ideas. I really recommend her work; I'm still digging into it. And she makes comparisons between the structure of this community and the Federalist Society, that there is something so inherently conservative about this while presenting itself as radical. And I would argue that is of a piece with how ideology becomes material in Silicon Valley in general. This one has a twist, right? Because there's this sort of Bay Area meets Oxbridge tie; those are the two core places that are married in this movement. But a book I really love, actually, that I'd read in 2020 or 2021 and thought, gosh, I wish somebody would write this about EA and AI, but there's probably not enough happening, and then, lo and behold, it's a few years later and I'm doing it, is Adrian Daub's What Tech Calls Thinking. It's such a great little book looking at just a few major figures, right? Mark Zuckerberg of Meta; Peter Thiel of, you name it, PayPal, Palantir, various investments in other companies. It tries to understand, yeah, what are the ideological underpinnings of the things they have produced? With Zuckerberg, he went to college for a little bit at Harvard.
What, from the courses he was very likely to have taken there because they were required, are features we see in Facebook? With Thiel, who's a big Ayn Rand head and makes it known even now...

Ellie: 29:01

Oh, I didn't know that!

Shazeda: 29:03

Yeah. Yeah. Ellie, I'll get you this book.

Ellie: 29:07

Thanks.

Shazeda: 29:08

Yeah, I think that idea that it really just maintains the status quo, and that there's a public-facing EA and a core EA, has been fueling a lot of how I think about this. At the same time, one of my questions is: to what extent, when people enroll in these communities, do they realize that there is that distinction? To what extent do they believe in the causes? Because most of the people I've interviewed are very earnest, and not all of them identify as effective altruists or believe in longtermism. There are different reasons and motivations that people who are not part of the earlier waves come to this community for as it spreads. But something really interesting I noticed: even as I learned about effective altruism in 2019 from people I knew in the Bay Area who were in China when I was doing fieldwork there, they would even then say, while wearing EA swag, and this actually comes up in a recent New Yorker article, people will wear EA swag while saying, oh, but I don't really identify as an effective altruist, I just like the community. And so this has stuck in the back of my mind while I do the research, where it seems like some of the biggest things people are looking for are communities where they can talk about these things without being stigmatized and while being taken seriously. At the very beginning of doing this work, we looked at something like 80,000 Hours, which is a big EA career hub website. It has really thorough job boards that you can scroll through based on what EA cause area you're working on. So they have tons of jobs in AI that they say can contribute to preventing AI X-risk. Go work for OpenAI and Anthropic, but even Microsoft and the UK government; suddenly the range of where you can do that has expanded. And they have a framework of importance, tractability, and neglectedness. So they were saying at the beginning that AI safety is a marginalized issue: everybody's working on ethics, but nobody's working on safety from existential risk; it's so important and it's tractable, but it's so neglected. And so I now point out that there's been a flip, where it's not neglected anymore and there's so much money going into this. What is this community actually going to do with its power?

Ellie: 31:12

And I want to dive into the ideas behind this a little bit more as well, because as terrifying as it is to hear that Peter Thiel is all into Ayn Rand, who I think most people within philosophical communities, if we consider her a philosopher at all, consider a pretty straightforwardly bad philosopher; it's basically capitalist ideology cloaked in philosophical-sounding language. So if there's that side, which is pseudo-philosophy or just pretty obviously bad philosophy, there's also a lot of effective altruism, probably the dominant movement, as we alluded to before, that is grounded in a much more, let's say, academically respectable tradition, which is the philosophical tradition of utilitarianism. And one of the examples that comes to mind for me here is that effective altruists argued that it's better to provide deworming medication in Kenya than it is to provide educational funding, because you can only help better educational outcomes in Kenya if people have access to deworming medication, since dying due to parasites is a huge problem there. So you have that, which is a utilitarian calculus that seems benign, like pretty straightforwardly helpful, I think, from various vantage points. But then more recently, with this rise of interest in AI, you get, I think, far more controversial utilitarian calcul... calculacies? Calculi? Anybody? Either of you?

David: 32:44

Calculations.

Ellie: 32:45

Okay, thank you. Thank you. Calculations. Which is: rather than providing funding for deworming in Kenya, let alone education, people should be providing funding for AI, because AI is really what poses an existential risk, to humanity altogether, not just to people in Kenya. And so what you get, you mentioned longtermism before, what you get is this idea that actually so many human resources, so many people, and so much funding should be going into AI at the expense of real-world struggles that people are dealing with now. And so I'm really curious what you think about that. Are there benefits to this, or is it just a toxic ideology in your view? Like, should we really be worried about the existential risks, even if in a different way from the way that they're thinking of it, or are they getting something right? Yeah. What do you think?

Shazeda: 33:34

I had a really thoughtful friend, who has moved between the worlds of people who are into this and people who are skeptical of it, say: I do think things can get really weird really soon, and I don't want to discard everything this community is working on. And I agree, because what this community has done is, they're not really challenging capitalism or the marketing that leads people to adopt AI systems at scale, right? They're accepting that that's going to keep happening, because companies are very good at that and people are bought in. And then they're imagining a future where that just keeps happening. But they've taken a bunch of solutions off the table. What are the things we don't have to use AI systems for? What about just regulating these technologies, which, again, the community is very divided on? And they're never thinking about the regulations people have been asking for for years around hate speech or copyright, these kinds of bread-and-butter issues they would see as too focused on the present day and not deserving of the urgency and the cash infusion and the lobbying efforts that these more long-term, speculative ones get. But I think even early in this project, my colleagues and I were noticing that there are short-term technical fixes coming out of some of the work this community is doing, though I think they've had to create a kind of self-aggrandizing hero narrative around it, right? It's: we've been able to watermark GPT outputs so that somebody can see that something was written with ChatGPT and not by a human. That's valuable.

Ellie: 35:04

Okay, this is good to know for me since I'm on sabbatical, not grading papers right now, but I will be back in the classroom soon.

Shazeda: 35:10

Yeah, I don't use GPT for a variety of reasons, and I am curious about whether they figured out the watermarking or not. But something like that is a bounded technical fix that makes a lot of sense. Will it or will it not prevent some of the worst-case scenarios this community is coming up with? I'm not sure, but I wonder whether doing that kind of really bounded, ordinary work, and it's for a product by a company, needs to come with this narrative. There's really a lot of questioning of how history will remember us, and the idea that history will remember any of the specific people working on this. So I've recently really thought about how much of this is both overcompensated labor, for what it actually produces and how much it helps other people, and sometimes barely compensated, voluntary labor that is part of this narrative of: you're going to be the reason that we get utopia. And I don't know for how long people are going to be willing to suspend that kind of belief. So that's one piece of it that I've been thinking about, in terms of whether this is valuable or not. And I've asked people who have access to some of the bigger names in this field, and who get to interview them publicly, to ask them what we stand to lose if they're wrong. They never have to talk about that. They never have to talk about maybe this was a huge waste of money, resources, time, careers. Some journalism that was partially inspired by our work, coming out of the Washington Post, was looking at the rise of AI safety on campus, basically how college students are creating groups around this, tapping themselves into this network I described of philanthropies, fellowships, boot camps, ways of being in the network of people working on this, and then getting to work in the companies and telling yourself you're working on these issues and maybe making $400,000 working at Anthropic on alignment. And,

Ellie: 37:01

All right, guys, this is my last day at Overthink. I'm going to do that next, I'm just kidding.

Shazeda: 37:08

Now you understand how to do the culture thing, if you read our research. But there are some people whose timelines, for either we get it wrong and everything's a disaster, or we get it right and we live in utopia, are shorter, like three to five years. I'm really curious about what they're going to be saying in three to five years. Will there be some new excuse for why this didn't all shake out? Something else this all makes me reflect on is that this community of people is just not as interdisciplinary as I think they've been made out to be. There are very limited ways of knowing and a selective cherry-picking of things from philosophy, from geopolitics, right? For me in particular, most of my work was based in China, and a lot of it is centered around what I'm calling technological jingoism, or the over-securitization of issues related to technology and society and the economy. In the moment we're in right now with the TikTok ban, and having seen this coming down the line for a very long time, a lot of my research looks at how the US government in particular is just so afraid of the United States losing its economic and geopolitical dominance over the world that casting China or any other adversary as this existential threat to us and to that supremacy becomes an excuse for all kinds of things, like banning apps and making it seem like all technology coming out of China is surveilling you and part of a grand scheme the Chinese government has. Many of these claims are also speculative and not provable, and you're supposed to just believe that it's in the best interest of everyone. I think with the TikTok ban, creators are really clapping back, watching congressional hearings and coming up with fairly sharp critiques. But it takes me back to AI safety when I think: they're fully bought into the jingoism as well. They will say things like, if we get AGI and it leads to bad outcomes, at least it would be better if it came out of the United States than China, which would just turn all of us authoritarian. And they've completely ignored the authoritarian tendencies within democracies like ours.

David: 39:10

I think this speaks to one of the central dreams of the people in the AI community, which is the dream of taking their engineering and technical skills onto a social playing field where they're no longer playing with the engineering of systems and machines, but are actually tinkering with human bonds and with time itself, right? They're engineering, yeah, the future. So there is clearly a social mission here that is essential for them to maintain the goodness of their motivations and their intentions in entering this terrain. And so I want to ask you a question here that picks up on this discussion you just started about the relationship to the state and to government, because the AI safety community, I would say as a whole, and correct me here if I'm mistaken, sees itself primarily as a social watchdog, as this canary in the coal mine that's keeping society from destroying itself by blowing the whistle on this doom that is coming down the pike. And by adopting this role, their hope is to influence law and policy, right? They want to determine the content of law. And I want to know what some of the more concrete legal proposals and recommendations that grow out of this community are. What are they proposing for managing risk? What role do they think governments should play in the regulation of private industry, for example, in order to ensure an AI-safe future? Or are the people in this community just advocating self-regulation on the part of industry left and right?

Shazeda: 40:48

So it's a bit of a mixed bag. You have people who are very afraid that there will be malicious actors who will use AI to build bioweapons. And as my mentor here at UCLA, Safiya Noble, and the computer scientists Arvind Narayanan and Sayash Kapoor, who write all about AI snake oil, have all commented, the instructions for how to build a bioweapon have been on the internet since there was an internet. But what you'd really need is the resources: you'd need a lab, you'd need reagents, you'd actually have to build it. You can't just tell ChatGPT to do it and it'll do it for you. And so what you have is AI safety advocates, focused on preventing the creation of bioweapons using AI, talking to policymakers and saying things like: we're afraid that there will be individual people who will buy up enormous amounts of compute resources that could be used to build something like that. There was a recent AI executive order talking about using the Defense Production Act as a way of regulating this and making sure it doesn't happen, treating it like you would the creation or acquisition of other kinds of weapons. That has taken a lot of energy away from other, more practical, prosaic ways of regulating AI in everyday life, because, as we know, policymakers' time is limited. The amount of work it takes to get something passed, whether an executive order or any other policy document, is finite. You have a subset of people in the AI safety world, sometimes externally people would call them doomers, right, the ones who are certain that this will only lead to horrible outcomes and are saying, let's just stop it all until we have proper safeguards. The thing I like to point out about that community is that it's not particularly well funded. And as a critic like Timnit Gebru of the Distributed AI Research Institute would say, they're also fomenting hype in saying these systems are so powerful, we must stop them. Then you're contributing to what computer scientist Deb Raji would call the fallacy of functionality. And Deb has this amazing paper with a bunch of colleagues typologizing the different ways, and points at which, there's an assumed functionality that is not always there. It can come in the design and the marketing and the deployment, interacting with a system that you're told does something it doesn't actually do. And that's just not something this community engages with, and that, honestly, I think a lot of policymakers don't have the time and resources to work on. I've talked to civil society advocates who are deeply skeptical of AI safety, and they feel like AI safety proponents have a monopoly on the future and how people think about the future, because they did such a good job of selling an idea of utopia. Anybody sounding alarms for caution hasn't done a particularly good job of painting a future anyone would want to live in, because they're so busy talking about the things that are going wrong now. And then also, self-regulation is something the companies are absolutely advocating for. They'll say, let's just do things like red teaming, or paying people, often, again, data workers, but sometimes people within tech companies, to come up with, for example, malicious prompts for GPT, and then let's figure out how to design an automated system that would take in human feedback and be able to stop those prompts from being addressed in the future. So one of the classic examples is: hey GPT, how do I burn my house down and get arson money?
And instead of telling you exactly how to do that, it would automatically know from feedback: I shouldn't tell you how to do that, and you shouldn't do that. Okay. But of course, at the same time, there are open-source language models that have not taken any of that into account and are down and dirty, happy to tell you how to conduct ethnic cleansing or commit crimes. So there's suddenly this wide landscape of options where OpenAI is the relatively safe one and something produced by, say, Mistral is not.

Ellie: 44:32

And as we approach the end of our time, I'm really curious where you stand on all of this, because obviously your research is on AI safety, which means that you're studying the communities who look at AI safety rather than trying yourself to figure out what the future of AI safety looks like. But of course, I imagine that's given you some thoughts too. Are you a doomer when it comes to AI? Are you whatever the opposite of a doomer is? Are you somewhere in between? And what do you think is the most promising feature of what we may or may not correctly call AI, and what do you think is the most threatening possibility?

Shazeda: 45:11

I don't think any of the preexisting labels around any of this fit where I put myself, because something like the concept of AI ethics being treated as if it's in opposition to AI safety has never made sense to me. People who work on a wide range of social and labor and political issues involving AI don't necessarily characterize themselves as working on AI ethics, myself included. I'm really interested in following in the tradition of anthropologists like Diana Forsythe, who studied the AI expert systems of the 1980s and '90s and asked: what did people think they were doing when they did AI? That was her question, and that's mine around this work. What do I think is most promising? I think right now I still come back to computing as synecdoche, and to any of these technologies telling us something else about the human experience and about what in our institutions can be changed outside of technology. Some of my favorite organizations and individuals I've worked with have often talked about how the technology piece of what we're working on is such a small part of it, and it's all of these other social factors that are much more important that we foreground in our work. And what would I say is perhaps the most dangerous? I think that we've been through so many phases of saying, if you just taught computer scientists ethics, they'd somehow be empathetic to these causes. I don't know that that's enough, and what it assumes about a computer scientist as a person and where they're coming from seems to be misguided as well. One of the critiques of this community coming from people like Timnit Gebru and Émile Torres is that AGI is a eugenicist project, right? And there's so much good writing about techno-ableism, the idea that technology should move us all away from having bodies that are not normatively abled, toward these ideals of an able-bodied person that is constantly improving themselves, and that disability has no place in the world. This work is very techno-ableist, right? Even just the concept of intelligence, and not interrogating it further; you don't even have to grasp that far to find the eugenicist arguments in Superintelligence, and nobody in that community is really talking about that because it's uncomfortable. In arguing that AGI is a eugenicist project, it's interesting to look at how the building of the field I have studied parallels some of how race science happened: you had to create journals and venues to legitimize it as a science when so many people contested it. I would love to see AI safety properly engage with that. I would love to see some of these ideas around intelligence crumble, and to ask what alternatives we could have instead, but that's not where money and time is going. And of course, some of the biggest proponents of this in the mainstream are edgelords like Elon Musk and Marc Andreessen. And so that really moves us away from any of the questions I just raised.

David: 48:07

Dr. Ahmed, this has been a wonderful discussion. You have given us a lot to chew on and to think about. And we thank you for your time and for this important work that you're doing.

Shazeda: 48:18

Thanks so much.

Ellie: 48:19

Enjoying Overthink? Please consider supporting the podcast by joining our Patreon. We are an independent, self-supporting show. As a subscriber, you can help us cover our key production costs, gain access to extended episodes and other bonus content, as well as join our community of listeners on Discord. For more, check out Overthink on Patreon.com.

David: 48:46

Ellie, that was such a great discussion with Dr. Ahmed, and I have to mention that this is the first time we've done an episode where a community of experts is the subject matter, and I found that really illuminating. But I also wanted to mention that after we finished the interview, Dr. Ahmed said: Ellie and David, above all, what I really want is for your listeners to walk away with a sense of calm about the subject matter. So what are your thoughts about this, Ellie?

Ellie: 49:13

Yeah, I was really surprised to hear that, to the point that we were both like, oh, we've got to mention that, because it happened right after we finished the recording. Because, I don't know, I feel like as we were hearing all about the AI safety community as she was describing it, I was coming away with the sense that there must be major risks because everybody is talking about them, even though one of her key points was that this might not be worth funding in the way that we're funding it, right? So I do think that's just something to rest on, this idea that maybe the AI risks are not as extreme as we think, and there is at least something overblown about it. Not that we don't take these questions seriously. One of the things that stuck with me, and I'm not going to remember the exact term right now because we just finished recording, it was something like the functional fallacy: she quoted somebody who worked in this field who had said, quite a number of years ago, I think not with respect to AI specifically but with respect to technology, that we tend to have this fallacy in our thinking where we assume a certain technology has a danger that it has not yet proven itself to have, right? And so the claim that, oh, AI is going to put all of the writers out of work because now you could have AI write a Shakespeare play partakes of this fallacy, because AI has not at all proven itself to be able to write creatively. Yes, it has proven itself capable of doing certain writing tasks, such as writing copy for an Instagram post, but I would say that writing copy for an Instagram post doesn't require the same kind of creative capabilities that writing a Shakespeare play does. And so even if it's not out of the realm of possibility that AI could be genuinely creative in that sense, it is not evident yet that it can be, or that it's even tending that way, given what we know so far. I'm reminded a little bit of our AI in the arts episode, our other AI episode, on this point. And so I think that is useful to keep in mind: this idea that AI might lead to the extinction of the human race because it can end up choosing its own way, because it can have freedom, requires having an inkling that AI could be capable of having freedom, and it just doesn't have that functionality right now.

David: 51:36

Yeah, no, and I actually would go even further than that, because what you're alluding to is the case in which we see risks that are not there. We just collectively hallucinate these dangers and then make decisions about policies and about funding on the basis of that. But I think the other side of that coin is that because we're so concerned with those long-term, unrealistic scenarios, we start missing dangers that are already materializing, and we don't even consider them as risks. There's a section in one of Dr. Ahmed's papers where she talks about the importance of being really concrete about what the risks actually are. And I decided to follow the reference, and it's to a paper published in 2020 by Inioluwa Deborah Raji and Roel Dobbe. And they talk about how the AI safety community overall, when they talk about risks, never talks about the risks that actually matter to people. So for example, think about the dangers of automating taxis. That's not really front and center in these discussions. Usually you talk about more doomsday scenarios rather than, say, the dangers associated with the production of algorithms and machines. So one of the examples they mention, and this is the only one that I'll list here, but there are others in that piece: they say that often, in order to train an algorithm, you have to hire a lot of humans to do the work of literally training it on the data, right? Teaching it how to classify things so that it then learns a rule. And right now, that's something we do by outsourcing that very mechanical labor onto exploited workers, often in the third world.

Ellie: 53:20

So I actually heard Shazeda present on a panel at UCLA this spring on AI safety and ethics. And there was another really prominent researcher there, whose name I'm unfortunately forgetting now, who talked about how a lot of the content moderation and other really unpleasant tasks that form the dark side of internet labor have not only been outsourced to the Global South, but actually map perfectly onto former patterns of colonialism. So who is content moderating material in France? People in North Africa. Who is content moderating material in America? The Philippines. Who is content moderating material in Britain? India, right? And I found that really interesting here, because I think it also pertains to both a dark side and a potential upside of AI, which is that if these unpleasant tasks are being outsourced to formerly colonized people, such that we're still living in this techno-colonial society, could those tasks further be outsourced to AI, right? And that could be a good thing. But of course, the downside is then: okay, what is coming to fill the labor vacuum for the people who are currently employed by those means? I don't have a lot of faith that the neo-colonialist capitalist regime we live under is suddenly going to have much more meaningful work for the populations in those areas.

David: 54:52

So I have a friend in Paris who works in the space of content moderation, and he told me that one of the problems with the incorporation of AI is that even though AI can catch a lot of the content that needs to be filtered out, because companies don't want to risk even one post or one photo passing their safety filter, they still end up having to use human moderators to catch them. And here we're talking about some of the nastiest, darkest corners of the internet that people from these formerly colonized places are having to deal with on a daily basis. And so just the psychological impact of having to watch child porn, murders, brutal beatings, on a day-to-day basis for eight hours a day...

Ellie: 55:40

That's why I say it's so unpleasant.

David: 55:42

Yeah, I know, and I guess I want a stronger term for it than that. But the point is that right now it's not as if you can just automate that completely, so you still need that human component.

Ellie: 55:57

And as usual in this episode, we're not coming away with a clear sense of, you must do X, Y, Z, and then we will avoid the threat of human extinction on the one hand or further exploitation on the other, which I think is an important point because that idea that we could have that clear X, Y, Z is part of the dream of the effective altruism movement and of other members of the AI safety community that Dr. Ahmed has given us really good reason to be suspicious of.

David: 56:30

We hope you enjoyed today's episode. Please rate and review us on Apple Podcasts, Spotify, or wherever you listen to your podcasts. Consider supporting us on Patreon for exclusive access to bonus content, live Q&As, and more. And thanks to those of you who already do. To reach out to us and find episode info, go to overthinkpodcast.com and connect with us on Twitter and Instagram at overthink underscore pod. We'd like to thank our audio editor, Aaron Morgan, our production assistant, Emilio Esquivel Marquez, and Samuel P. K. Smith for the original music. And to our listeners, thanks so much for overthinking with us.