An AI reckoning coming in August

With help from Derek Robertson

The big AI headline in Washington today was Vice President Kamala Harris hosting the CEOs of Microsoft, Google, Anthropic and OpenAI in a closed-door meeting.

But the real attention of the AI community is now fixed on an August event that could provide a very public reckoning for the large language models these tech corporations have produced.

Tucked into the White House’s press release Thursday on “new actions that will further promote responsible American innovation in artificial intelligence (AI) and protect people’s rights and safety,” was a nod to DEFCON 31 — a giant hacker convention held across multiple Las Vegas hotels from August 10-13 that now has an unusual endorsement from the Biden administration. Amid all the noise and bluster about regulating AI, it’s the most concrete move yet to provide some public accountability — and public testing — of the fast-moving platforms at the heart of the conversation.

Collectively, generative AI tools (ChatGPT, Midjourney and their ilk) have suddenly lit a fire under the federal government. Congress has more questions than answers right now, and the White House has laid out (purely voluntary) AI development guidelines, frameworks and roadmaps.

But now the White House has effectively signed onto a public experiment to find out whether rapidly developing AI models are secure and safe enough for widespread adoption, both by the public and by the government itself. This isn’t a formal audit. Instead, the plan is to let the world (or at least the part of the world at DEFCON this year) test models from Anthropic, Google, Hugging Face, NVIDIA, OpenAI and Stability AI, the companies behind some of the most popular AI models out there.

“Nobody’s ever done anything like this before,” said Seed AI CEO Austin Carson — one of the people organizing the DEFCON 31 hacking exercise.

It’s not precisely unprecedented, although the scale of the hacking exercise might be. The Pentagon held a red-teaming exercise at last year’s DEFCON for a project to build individual “micro” electric grids for some of its military bases.

August’s exercise is meant to function as a pilot, Carson says. “If you go on ChatGPT and you try to do some stuff, it’ll kick you out eventually, right? But we want folks to have at least this hour… to explore all the ways this thing possibly works, explore the guardrails and the functionality.”

Carson plans to bring in a few hundred students from a coalition of community colleges (alongside the regular DEFCON crowd of hackers) — and let them rip. So, basically: a giant coordinated red team exercise on a bevy of headline-grabbing AI platforms to test if they are worthy of public trust.

But will it be enough? Heather Frase, a senior fellow at Georgetown’s Center for Security and Emerging Technology who works on AI standards and testing, doesn’t think so, though she does think it’s a step in the right direction.

The sources of unease around AI adoption are manifold, from triggering the apocalypse to sinister dark patterns that define people’s experience in the digital world. “It’s like when you’re playing a video game and you get to a new continent on your screen: there’s so much you don’t know,” Frase said. The DEFCON assessments are “going to give us a huge thing of value in a short amount of time,” she said, “but it is not sufficient.”

“We need to see this as a long-term commitment,” Frase said. “There’s gonna be things that we’re not gonna discover quickly, even with DEFCON. So we are in this for the long haul.”

Scale AI CEO Alexandr Wang — whose company is building the testing platform for this massive red-teaming exercise — said something similar: “This test at DEFCON is not going to be the last time that we have to test and evaluate these models for safety, reliability, and accuracy. These are going to be requirements for society going forward,” Wang said. He compared his company to Switzerland, “unbiased and not beholden to any single ecosystem.” As for which AI models from each company will come out to play: “We’re collaborating directly with all of these foundation model companies to understand which models they would like to test and evaluate,” Wang said.

“Advances in technology, including the challenges posed by AI, are complex. Government, private companies, and others in society must tackle these challenges together,” Vice President Kamala Harris said in a statement released after her meeting with the four tech CEOs. “President Biden and I are committed to doing our part – including by advancing potential new regulations and supporting new legislation – so that everyone can safely benefit from technological innovations.”

And all those big AI model-makers (all those CEOs) are expecting to play ball, providing jailbroken versions of their models for DEFCON 31. If you’re curious about what the big tech participants are gaining from this exercise: it’s not just consumer trust but government trust, said Frase. The open-door policy that Harris is adopting with big tech CEOs is actually a diplomatic tightrope walk to get all the players to the table, said Carson, and a signal that the White House is looking to have a constructive conversation with tech corporations that exist largely outside of any data regulations in the U.S. In a nutshell, the White House’s ultimate goal in setting up the public AI assessment in August was “to evaluate the models that are out in the wild and being used today,” said Wang.

but wait, there's more...

Senate Majority Leader Chuck Schumer took to the Senate floor today to emphasize bipartisanship in any potential AI regulation, hot on the heels of a report this morning from POLITICO’s Brendan Bordelon and Mohar Chatterjee on how the newly hot topic is taking over Capitol Hill.

“I want to underscore that anything we do regarding AI must be thoroughly bipartisan,” Schumer said. “That is why I am talking to a good number of my Republican colleagues here in the Senate. And we [have] got to be clear-eyed about the pros and cons of this technology.”

What are those pros and cons, as lawmakers on the Hill see it? Well… it’s still pretty unclear. AI researchers told Brendan and Mohar that Schumer’s plan is “incredibly vague,” and there’s still a significant philosophical difference between those who think the tech industry should have a free hand to innovate and those who think it should be guided by agreed-upon principles.

Senate AI Caucus Chair Martin Heinrich (D-N.M.) said they need to get over those differences, and quickly: “We have two choices here,” he said. “We can either be proactive and get ahead of this now — and I think we have enough information to do that in a thoughtful way — or we can wait until one of these edge cases really bites us in the ass, and then act.” — Derek Robertson

what about agi?

Given the ongoing speculation about the birth of an “artificial general intelligence,” it’s worth taking a second to ponder… what actual policy could, or should, be made around it?

In a recent post, the VC and tech blogger Rohit Krishnan suggested breaking the topic down into four buckets: The technology’s economic and legal impact, the social considerations around it, its effect on geopolitics, and then on domestic politics.

“We have precedent” for regulating something with a world-changing effect on our relationship with computers, Krishnan writes, citing current legislation overseas like the U.K.’s Online Safety Bill and the European Union’s General Data Protection Regulation, and the ongoing debate over net neutrality.

“To think about the implications of this technology is to think about our society as it’s structured, and the philosophies we espouse collectively by deciding to live this way,” Krishnan writes. Drawing on the aforementioned examples, as well as government policies around similarly revolutionary technologies like genetically modified organisms, he advises regulators to “become more nimble in response and forward thinking in terms of investment, and definitely to avoid kneejerk bans which are both stifling and counterproductive.” — Derek Robertson

Stay in touch with the whole team: Ben Schreckinger, Derek Robertson, Mohar Chatterjee, Steve Heuser and Benton Ives. Follow us @DigitalFuture on Twitter.