Provisionally Preventive Digital Minds Governance
1. Introduction
In an earlier post I gave reasons to think that digital minds governance should aim to prevent the creation of digital minds, at least for the near term. Given that conclusion, we face further questions about what form preventive digital minds governance should take.
One important strategic choice point is whether preventive digital minds governance should be absolute or provisional. That is, should the aim be to ensure that digital minds are never created? Or should it instead be to prevent the creation of digital minds only so long as certain conditions hold?
I favor a provisionally preventive approach.
In this post, I undertake three tasks. First, I outline motivations for adopting a provisionally preventive approach rather than an absolutely preventive approach. Second, I note some candidate answers to the central question facing a provisionally preventive approach to digital minds governance, namely:
Under what conditions should digital minds governance relinquish the aim of preventing the creation of digital minds?
Third, I suggest some factors that should be taken into account in answering this question.
I won’t try to answer the central question here, partly because I don’t have a settled view about which conditions should prompt a transition from prevention to something else. In addition, I’m more confident that a provisionally preventive approach is worth exploring than I am in any particular view about which provisos it should adopt. Accordingly, the goal of the post is not to answer the central question but to promote further exploration of provisionally preventive digital minds governance.
2. Potentially fleeting motivations for prevention
The main point in favor of provisional prevention over absolute prevention is that motivations for prevention may be fleeting.
To illustrate, consider the following motivations for preventing the creation of digital minds.
Our civilization’s lack of moral concern for (future) digital minds suggests that we would mistreat digital minds on a large scale if we created them on a large scale.
Our scientific and ethical ignorance puts even those who wish to do right by digital minds in a poor position to do so.
While a preventive approach to digital minds governance would be difficult to design and implement, it would be much less challenging to design and implement than a protective or integrative approach.
In the current political environment, a preventive approach also seems less likely to provoke strong political opposition.
Given our current level of understanding of AI safety and digital minds welfare, creating digital minds would engender tensions between making powerful AI systems safe and ensuring that digital minds are not mistreated. These tensions bode poorly for humans and for digital minds.
At present, no one knows whether the future we will choose on the other side of transformative AI will be wonderful, terrible, or something in between. Better to wait and see before deciding whether to create a new kind of morally significant being.
Yet time may erode these motivations.
In line with past expansions of our moral circle, humans may come to extend moral concern to digital minds.
Knowledge of digital minds’ interests may improve, putting those of us who are so inclined in a better position to treat them decently.
As digital minds governance capacity and the willingness to invest in such capacity grow, the challenges of designing and implementing protective and integrative approaches to digital minds governance may become readily surmountable.
As our understanding of digital minds’ interests improves and AI safety advances, we may discover ways of creating digital minds that do not entail severe tradeoffs between AI safety and digital minds welfare.
And perhaps we will someday find ourselves on the other side of transformative AI, pleasantly surprised that our civilization has gotten its act together and left the worst of its self-inflicted perils behind.
If all these things came to pass, the case for preventive digital minds governance would dramatically weaken. Upon finding myself in such circumstances, I imagine I would cease to favor a preventive approach. After all, preventive governance requires justification, and that justification would presumably be lacking once the noted motivations no longer held.
Admittedly, there is room for the absolutist to argue that the creation of digital minds should be prevented even in such circumstances. For example, one could press this line by extending anti-natalist arguments against the creation of humans so that they tell against the creation of digital minds as well. But even those who favor absolute prevention in principle may have pragmatic reasons to accept provisional prevention as a compromise that avoids the anarchic status quo.
In any event, I turn now to outline some candidate exit conditions, meaning conditions that a provisionally preventive approach would adopt as triggers for its own cessation and a transition to something else.
3. Candidate exit conditions
There are at least four broad classes of candidate exit conditions that merit consideration; these can, of course, be combined.
First, there are risk-reduction conditions, defined in terms of the reduction of risks to suitably low levels. Candidates for such risks include large-scale digital suffering, large-scale violations of digital minds’ rights, and civilizational collapse, as well as risks that digital minds could pose to humans, such as interference with AI safety.
Second, there are technological exit conditions. Candidates for technological exit conditions include compute abundance, interpretability advances that enable reliable monitoring of digital minds’ welfare-relevant states, aligned AGI, and AGI advisors that are well-positioned to design and implement a successor to preventive governance.
Third, there are epistemic exit conditions. Candidates for these include the acquisition of evidence that sufficiently resolves uncertainty concerning which systems are digital minds, how to determine the interests of digital minds, how different sorts of treatment would affect those interests, and whether the lives of digital minds we create would be (highly) positive in expectation. In practice, these conditions might need to be operationalized in terms of sufficient convergence among relevant experts.
Finally, there are societal exit conditions. Candidates include democratic support for an alternative approach, the founding of institutions that would protect digital minds’ basic interests, sufficient institutional capacity for helping digital minds integrate into society, international agreements on the treatment of digital minds, a stable regime with aligned AGI, and endorsement by a deliberative process that will help decide our civilization’s long-term trajectory.
Most of these conditions indicate the passing of a danger. For example, risk-reduction conditions indicate this directly, while the epistemic conditions indicate the passing of dangers associated with our ignorance about digital minds.
A notable exception may be the endorsement of digital mind creation by a civilization-steering deliberative process: digital minds may need to be created during such a process, either to inform it about empirical facts concerning digital minds or to give digital minds a participatory role in it. If so, then the beginning of that process may mark the arrival of new benefits associated with creating digital minds rather than the passing of persistent dangers.
Some conditions simultaneously indicate the passing of dangers and the unlocking of benefits. One such condition is determining that the lives of digital minds we create are positive in expectation. Another might be the advent of a stable regime with aligned AGI.
4. Factors to consider in developing a provisionally preventive approach to digital minds governance
To conclude, I’ll note some factors that bear on the evaluation of which exit conditions to adopt and, more generally, on the design of a provisionally preventive form of digital minds governance.
Naturally, the adopted exit conditions should track the pros and cons of continuing to prevent the creation of digital minds relative to the kind of digital minds governance that would succeed the preventive regime. The choice of exit conditions should therefore be coordinated with the choice of a successor form of governance: a transition to a protective regime would call for different exit conditions than a transition to an integrative regime.
Transitional effects would also need to be taken into account in such choices. For example, if transitioning from preventing the creation of digital minds to integrating them into society would predictably lead to backlash against digital minds, then exit conditions and transition protocols should be adopted with due consideration of that fact.
Another important factor is the reversibility of the transition from prevention to another form of digital minds governance. If the transition would be difficult or costly to reverse, that could be a reason to adopt more stringent exit conditions. Note that reversal could take the form of preventing the creation of further digital minds and providing legacy protections to those that were created prior to the reversal.
Next, there’s susceptibility to operationalization: for a given exit condition, how hard would it be to specify that condition in terms concrete enough for it to serve its role in governance? Complex exit conditions that are unobjectionable in theory may be untenable in practice if they resist operationalization.
The final factor I’ll note is timing. Given the anticipated trajectory of the foregoing factors, when, if ever, is the best time for a transition to happen? And to what extent are different candidate exit conditions coupled to optimal timing or at least to favorable timing?
The choice of exit conditions is subject to complex and evolving facts. So, a provisionally preventive approach would probably do well to include a mechanism for adjusting exit conditions in light of new evidence and changing circumstances. If so, then it’s not too early to begin developing proposals for which exit conditions to adopt initially, along with proposals for adjustment mechanisms.
The image for this post was generated by the Copilot image generator. For copy editing support, I thank Claude Sonnet 4.5.

