My friend Devin Kalish recently convinced me, at least to some extent, to focus on AI governance in the short-to-medium term, more than technical AI safety.
The argument that persuaded me was this:
A key point of stress among AI doomsayers like Yudkowsky is that we, first of all, have too little time, and second of all, have no way to implement whatever progress alignment workers do make. Both of these are governance problems, not alignment problems. It is also, arguably, far easier to picture promising interventions in governance than in alignment research.
To lay out the logic more explicitly…
1. AI safety is urgent insofar as multiple capabilities groups are racing toward advanced AI.
2. Capabilities groups are propelled at least partly by "if we don't do this, another group will anyway", or the subtly different "we must do this, or another group will do it first".
3. (2) is a coordination problem, potentially solvable with community governance.
4. AI safety is less tractable insofar as capabilities groups don't have ways to implement alignment research in their work.
5. Solutions to (4) will, at some point, require alignment expertise and resources to be made available to capabilities groups.
6. (5) looks kinda like a governance problem in practice.
The AI alignment problem is quite hard, on the technical level. Governance work, as noted in (3) and (6), is both more tractable and more neglected than technical work. At least, it is right now.
The rest of this essay is less organized, but contains my thoughts for how and why this could work.
An OpenAI team is getting ready to train a new model, but they're worried about its self-improvement capabilities getting out of hand. Luckily, they can consult MIRI's 2025 Reflexivity Standards when reviewing their codebase, and get third-party auditing done by The Actually Pretty Good Auditing Group (founded 2023).
A DeepMind employee has an idea for speeding up agent-training, but is worried about its potential to get out of hand. Worse, she's afraid she'll look like a fearmonger if she brings up her concerns at work. Luckily, she can report her concerns to The Pretty Decent Independent Tip Line, which can relay them to her boss anonymously.
OpenAI, DeepMind, and Facebook AI Research are all worried about their ability to control their new systems, but the relevant project managers are resigned to fatalism. Luckily, they can all communicate their progress with each other through The Actually Pretty Good Red Phone Forum, and their bosses can make a treaty through The Actually Pretty Trustworthy AI Governance Group to not train more powerful models until concrete problems X, Y, and Z are solved.
These aren’t necessarily the exact solutions to the above problems. Rather, they’re intuition pumps for what AI governance could look like on the ground.
What happened to the Partnership on AI? Or the Asilomar conference? Can we take existing channels like these and build them out into coordination mechanisms that researchers can actually interact productively with?
If coordination is the bottleneck, a full effort is called for. This means hokey coordination mechanisms borrowed from open-source and academia: groups for peer-reviewing, math-checking, software auditing, and standards-writing. Anything other than declaring "coordination is the bottleneck!" on a public forum and then getting nothing done.
Many people in this community are turned off by politics, which perhaps explains some of the shortage of AI governance work. But "politics", especially in this neglected area, probably isn't as hard as you think.
There’s a middle ground between “do nothing” and “become President or wage warfare”. Indeed, most effective activism is there.
This sub-site is hosted inside Thinking Much Better, though not necessarily licensed, authored, or owned in precisely the same way. On this sub-site, the below statement overrides Thinking Much Better's default licensing:

Unless otherwise specified on individual pages, all posts in this sub-website are licensed under a Creative Commons Attribution 4.0 International License.