AI Risk Might Be More Subtle Than We Expect

May 02, 2019

There’s a famous experiment among people who study addiction called Rat Park. It was conducted by Bruce K. Alexander of Simon Fraser University to test a hypothesis he had about addiction. At the time there were lots of experiments which showed rats becoming so addicted to drugs like heroin and cocaine that they would ignore food and water in favor of self-administering more of the drug, eventually dying from dehydration. Alexander felt that this had less to do with the drugs and more to do with the experimental conditions, which generally involved caging the rats in small spaces, isolated from all the other rats and, on top of all that, with a big needle permanently stuck in them to administer the drugs. Alexander’s hypothesis was that the rats’ addiction came about as a result of these horrible conditions, and that if you put rats in an environment that more closely mirrored their natural one they wouldn’t get addicted. To test this theory he created Rat Park.

According to Wikipedia, Rat Park was, “a large housing colony, 200 times the floor area of a standard laboratory cage. There were 16–20 rats of both sexes in residence, food, balls and wheels for play, and enough space for mating.” And, according to Alexander, despite being offered a sweetened morphine solution right next to the water dispenser, the rats did not become addicted to morphine. From this Alexander argued that opiates aren’t actually addictive. It’s rotten conditions which cause the addiction, not the drugs themselves. As you might imagine, he extended this to humans, arguing that it’s terrible slums and poverty that cause addictions, and that the drugs themselves have no inherent addictiveness.

At this point there are many of you who arrived at this blog from the Slate Star Codex podcast and you remember an article from SSC pointing out that Rat Park is one of those things that didn’t seem to replicate very well, despite all the press it got. (You may in fact remember me reading that very post.) To review some of the arguments:

On the pro-Rat Park side:

Only about 10% of people put on opiates for chronic pain become addicted.
German soldiers during World War II popped meth like it was candy and yet after the war they mostly had no problems with later addiction. (I understand the same thing happened with Vietnam Vets and heroin.)
And of course there are vast numbers of people who drink alcohol without ever becoming alcoholics.

On the anti-Rat Park side:

Plenty of people who seem to “have it all” definitely get addicted. (In the SSC post he mentions Ogedei Khan and celebrities.)
There also definitely seems to be a genetic component to drug reactions, particularly where alcohol is concerned.
And, certainly, there are people who have been raised out of poverty and given every possible support who still can’t shake their addiction.

The SSC conclusion is that, on top of the study not replicating very well, there are obviously a whole host of factors involved in addiction, and that the causes of addiction are complicated. There are obviously environmental and cultural factors, as Alexander hypothesized, but saying it’s entirely environmental is naive. On top of the environmental factors, it’s clear that genes have a role as well. It’s equally clear that some drugs are just more addictive than others. All of this means that treating addiction is hard.

II.

Thus far we’ve mostly talked about rats and heroin, so why did I choose the title “AI Risk Might Be More Subtle Than We Expect”? Well, to begin with we have to talk about what sort of AI risk most people expect. When you talk about AI risk with an average individual they generally end up imagining something along the lines of Skynet from the Terminator movies: we’re going along, gradually making computers more and more powerful, and then one day we cross some critical threshold. The computer “wakes up”, and it’s not happy. This is obviously an oversimplification, but it gets at the key point. Most people don’t start worrying about AI risk until we build a computer with human or greater than human level intelligence. When that happens, if it has a morality different than our own (or no morality at all), we could be in a lot of trouble.

Given the difficulties attendant to building an AI with human level intelligence, which is to say that it has to not only play chess as well as a human, but do everything as well as a human can, many people will claim that there’s nothing to worry about. And even if there is, such a worry is a long way off. But this whole scenario seems to be imagining that there’s some stark cutoff where right before we reach human level intelligence there’s zero potential harm, and right after that there’s severe potential harm. Now, I’m sure that this is once again an oversimplification, that there are researchers out there who have thought about the potential harm an AI could cause at capabilities below those of full human intelligence. But such discussions are vanishingly rare compared to discussions of risk on the greater than human side of the spectrum. This is unfortunate because by not having them I think we’re overlooking some potential AI risks. So let’s have that discussion now.

It would be useful if AI progressed in a fashion similar to biology, so that we could speak of fish-level AI and dog-level AI, and so on. We know what kind of damage a fish can cause, and what kind of damage a dog can cause. (My sister’s dog recently got loose and killed six of her neighbor’s chickens, so dog damage is on my mind at the moment.) And knowing this we could have some reasonable expectation of preventing the kind of damage those AIs might cause. But artificial intelligence hasn’t progressed in the same fashion as biological intelligence. Instead, there are some things an AI can do much better than a human, for example playing chess, and other things it still does much worse, for example tying its shoes. The question then becomes, is there any danger attached to the things AIs do really well? With chess, it’s just our pride at stake, but are there areas with more at stake than that?

III.

As I mentioned above we’re still a long ways away from general, human-level AI, but we have made a lot of progress in some specific AI sub-domains. In particular, one of the things that AI has gotten very good at is brute force pattern detection. The example of this which has gotten the most press is image recognition.

As you can probably guess, image recognition is a very hard problem. You might think that if you were trying to get a computer to recognize pictures of cats you could just describe what a cat is. But once you actually attempt to explain the concept of a cat it turns out to be basically impossible. So instead what researchers do is feed the AI lots of pictures with cats, and lots of pictures without cats, until eventually the AI figures out how to spot the image of a cat. But just as we can’t explain what a cat is to the AI, the AI can’t explain what a cat is to us either. It just knows it when it “sees it”.

Now imagine that instead of maximizing the AI’s success rate at identifying cats, you want it to maximize engagement. You want it to pick content that ends up maximizing the time someone spends on your platform. As a more specific example, instead of the AI picking out cats you want it to identify Facebook timeline content that keeps an individual on Facebook for as long as possible. To do this, instead of feeding in cat pictures and pictures without cats, you feed in data about what content they like vs. what content they don’t like. In the first example you get better cat recognition; in the second you get more engaging content.
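For readers who think better in code, here is a minimal sketch of that swap. It is a toy, not a description of how Facebook or any real platform works: the hand-named features, the tiny data sets, and the function names `train` and `predict` are all assumptions I made up for illustration, and real image recognition learns from raw pixels rather than tidy labels like "whiskers". The point is only that the same learning routine serves both goals, and that what comes out is a list of numbers rather than an explanation.

```python
import math


def train(examples, labels, epochs=2000, lr=0.5):
    """Fit a tiny logistic-regression model with plain gradient descent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
            error = p - y                   # how wrong the guess was
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, x):
    """Return the model's probability that the label is 1 for input x."""
    weights, bias = model
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical picture features: [fur, whiskers, wheels]
pictures = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 1]]
is_cat = [1, 0, 0, 1, 0, 1]

# Hypothetical post features: [outrage, cute_animals, news_from_friends]
posts = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1]]
kept_scrolling = [1, 1, 0, 0, 1, 0]

# Same routine, different target: only the labels change.
cat_model = train(pictures, is_cat)
engagement_model = train(posts, kept_scrolling)

print("cat score for a furry, whiskered picture:",
      round(predict(cat_model, [1, 1, 0]), 3))
print("engagement score for an outrage post:",
      round(predict(engagement_model, [1, 0, 0]), 3))
print("what the engagement model 'knows':", engagement_model)
```

In this made-up data the user kept scrolling exactly when the post contained outrage, so that is the pattern the model latches onto. Nobody told it to prefer outrage. It was told to predict scrolling, and the weights it prints at the end are the only account of itself it can give.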

Thus far everyone pretty much agrees that this is what Facebook and similar platforms do. Where opinions start to diverge is on the question of whether this engagement is bad. And here we bring back in the issue of addiction. Is there a level at which engagement is the same as addiction? Or, coming at it from the other direction, would creating addiction be a good way of achieving engagement? If so, is there any reason to doubt that AIs would eventually figure out how to create this addiction as part of their brute-force pattern matching?

How would they go about creating it? Well, as I said above, the causes of addiction are complicated, but that’s precisely where AIs excel. Not only that, but it seems easier to create addiction than to cure it. Maybe certain kinds of content are more addictive, so the AI will show them more often. (I’m sure you’ve heard the term clickbait.) Maybe it will use variable operant conditioning, or maybe, if Rat Park has any validity, it will do it by making us sad and lonely.

To be clear I agree with SSC that the most extreme claims made by Bruce K. Alexander are probably false, but on the other hand it’s difficult to imagine that being sad and lonely wouldn’t contribute on some level to addictive behavior. Or to put it another way, does being psychologically healthy make someone less likely to engage in addictive behavior or more likely? If less likely, then the AI is incentivized to undermine otherwise healthy individuals. And, as it happens, there is plenty of data to back up the idea that this is precisely the effect social media has on people.

As I said, an AI can’t explain to us how it determines whether there’s a cat in the picture or not. In the same fashion it also can’t explain to us how it achieves greater engagement. If it is making people sad and lonely in order to create addictive engagement, this is not because it’s naturally cruel. It understands neither cruelty nor sadness; it only knows what works.

Lots of ink has been spilled on the more flashy side of AI risk. AI overlords with no regard for biological life. Out of control versions of the broom in the Sorcerer’s Apprentice. Or an AI that simply plays the stock market like it plays chess and takes all the money. But at the moment I’m far more worried about the dangers I’ve just described. Not only might we be experiencing that harm right now, rather than 50 years from now, but if it is happening, the effect is very subtle, so much so that it’s entirely possible that we won’t really recognize it until it’s too late.

IV.

I had intended to end on that point about the subtlety of this danger, but then yesterday I came across an article published last week in Wired covering much the same ground, though the argument was broader. The title was “Tristan Harris: Tech Is ‘Downgrading Humans.’ It’s Time to Fight Back.” The major thrust of the article is how Harris spent a whole year trying to come up with the perfect phrase to describe what was happening. Given that I’ve only been thinking about it for the last couple of weeks, it’s possible that Harris’ argument is more convincing. As such, I thought I’d better include it.

As he struggled with the words, he had a few eureka moments. One was when he realized that the danger for humans isn’t when technology surpasses our strengths, like when machines powered by AI can make creative decisions and write symphonies better than Beethoven. The danger point is when computers can overpower our weaknesses—when algorithms can sense our emotional vulnerabilities and exploit them for profit.

Another breakthrough came in a meeting when he blurted out, “There’s Hurricane Cambridge Analytica, Hurricane Fake News, and there’s Hurricane Tech Addiction. And no one’s asking the question ‘Why are we getting all these hurricanes?’”

He didn’t want to define the problem as one of evil technology companies. Even social media platforms do all sorts of good, and Harris, in fact, uses them all, albeit in grayscale. There are also plenty of technologies that don’t ever hack us, help elect fascists, or drive teens to cut themselves. Think about Adobe Photoshop or Microsoft Word. He needed a phrase that didn’t make him seem like a Luddite or a crank.

Finally, in February, he got it. He and Raskin had been spending time with someone whom Harris won’t identify, except to note that the mysterious friend consulted on the famous “story of stuff” video. In any case, the three of them were brainstorming, kicking around the concept of downgrading. “It feels like a downgrading of humans, a downgrading of humanity,” he remembers them saying, “a downgrading of our relationships, a downgrading of our attention, a downgrading of democracy, a downgrading of our sense of decency.”

I think Harris is correct: some technological advances have had the effect of downgrading humans, and my point about AI and addiction represents one of many specific examples of how it might be happening.

The article included something else, a quote which may sum up all of the problems I’ve been talking about. It’s from E. O. Wilson:

Humans have “paleolithic emotions, medieval institutions, and god-like technology.”

I guess the point of this post is that I might get more donations if I make you feel sad and lonely. But also that doing so is kind of awful. So I’m just going to hope you donate because you enjoy what I write.

We Are Not Saved
