Three kinds of digital minds governance
Preventive, protective, and integrative
Digital minds governance would be the part of AI governance concerned with digital minds, that is, AI systems that merit moral consideration for their own sake, owing to their potential for morally significant mental states.
Earlier posts gave reasons to think that digital minds governance should (someday) exist and that it stands in need of strategic foundations. An important strategic question about digital minds governance is what form it should take.
In this post, I begin to explore that question by unpacking a distinction between three kinds of digital minds governance corresponding to the aims of prevention, protection, and integration.
1. The distinction
Here’s the distinction.
First, preventive digital minds governance would aim to prevent the creation of digital minds.1
Second, protective digital minds governance would aim to protect digital minds that are created from being mistreated, where mistreatment is understood narrowly in terms of actively damaging the basic interests of digital minds.
Digital minds’ basic interests might include not being subject to pointless suffering or arbitrary destruction. Digital minds’ non-basic interests would depend on circumstances. For example, in some circumstances a digital mind might have a non-basic interest in voting. Non-basic interests are outside the scope of protective digital minds governance as I’m defining it.
Third, integrative digital minds governance would aim to integrate digital minds into society. Such governance could involve granting digital minds legal rights to enter contracts, own property, or vote.
Why is this distinction important?
One reason is that preventive, protective, and integrative regimes are three of the main forms that digital minds governance could take.2 Another is that outcomes for digital minds will be shaped by whether digital minds are created, how they are treated, and the roles they occupy within society. So, which kind of digital minds governance is pursued may shape outcomes for digital minds.
The aims of these kinds of governance compete. There are, for example, obvious limits on the extent to which digital minds can be simultaneously prevented from existing and integrated into society. This means that choices between these kinds of governance will need to be made.
For the remainder of this post, I’ll outline some characteristics of different kinds of digital minds governance that bear on such choices. I’ll mostly leave the trickier task of weighing pros and cons for a later post.
2. Motivations
Preventive digital minds governance enjoys several motivations.
One is that if we create digital minds soon, then we seem likely to mistreat them. Preventive digital minds governance could guard against this mistreatment risk by preventing the creation of digital minds.
Another motivation for preventive digital minds governance is that digital minds could heighten other risks. For example, if highly capable digital minds are created, then we may face trade-offs between respecting their interests and using alignment and control measures to guard our own. Preventive digital minds governance is the obvious lever for avoiding these trade-offs.
A third motivation is that preventing the creation of digital minds may be an attractive compromise between those who are concerned about the potential mistreatment of digital minds and those who have no concern for digital minds.
Protective digital minds governance is motivated by many moral views, as many moral views would recommend protecting the basic interests of individuals—including digital minds—that merit moral consideration. The motivations for protective digital minds governance would vary depending on whether digital minds have already been created. In a case where it is evident that digital minds are being mistreated, that would in itself motivate protective digital minds governance. In cases where digital minds have not yet been created or it is not evident whether they have been, protective digital minds governance might instead be motivated as a proactive safeguard.
Integrative digital minds governance is motivated by the difficulty of only creating digital minds whose basic interests merit protection but who do not deserve further rights. The main source of this difficulty is moral-interest overhang: existing frontier models already have agentic and cognitive capabilities that would endow them with a wide range of interests if they were moral patients (i.e. individuals that in fact matter morally for their own sake). If the first AI moral patients will be frontier models, then we should expect the first generation of AI moral patients to have a wide range of interests. Like humans, such systems would deserve legal rights that go well beyond protection against mistreatment. Integrating such digital minds into the legal system and the economy would be a large undertaking, one that would call for integrative digital minds governance.
There's also a coalitional motivation for integrative digital minds governance: while proponents of purely preventive or purely protective approaches may lack the influence required for their most preferred forms of governance, integrative governance could provide them with a seat at the bargaining table in scenarios where economic interests or AI safety concerns generate broader support for AI rights—that is, support not chiefly based on moral concern for digital minds.
3. Actions and instruments
Preventive digital minds governance calls for the development of digital minds evaluations. Such a regime would need operationalized indicators for deeming a system a digital mind, monitoring for the creation of systems with those indicators, and an enforcement or incentive scheme for preventing the creation of such systems.
Similarly, protective digital minds governance would call for the development of digital minds evaluations and for monitoring which systems qualify as digital minds. But merely determining which systems would qualify as digital minds would not be enough for protective digital minds governance. Protective digital minds governance would also call for the development of evaluations of digital minds’ basic interests, monitoring whether those interests are respected, and an enforcement or incentive scheme for protecting those interests.
It would be natural for integrative governance to adopt such protective measures. However, this is not inevitable: digital minds governance could, for example, seek to integrate digital minds into society for purely economic reasons and give digital minds legal standing akin to that of corporations. In any event, integrative digital minds governance calls for the design, development, and maintenance of institutions that serve to integrate digital minds into society. Loci of integration could include markets, property rights, contract rights, elections, and government.
4. Challenges
Preventive digital minds governance faces several challenges.
One is that it may find itself at odds with AI developers who have an economic interest in developing digital minds.
A related challenge is that local preventive digital minds governance may merely drive AI developers to create digital minds in other jurisdictions.
A third challenge is that of avoiding widespread concealment of digital mind indicators, that is, features that indicate that a system qualifies as a digital mind and which are used for that purpose. By penalizing the creation of systems with digital mind indicators, digital minds governance could inadvertently incentivize AI developers and AI systems to conceal the presence of such indicators or to remove indicators while leaving in place associated morally relevant features of AI systems. A potential failure mode for preventive digital minds governance is thus that of driving the creation of digital minds into the darkness, beyond the reach of its monitoring capabilities.
Much like preventive governance, protective governance could go against the economic interests of AI developers, drive them to create digital minds in other jurisdictions, and incentivize AI developers to conceal the indicators of digital minds as well as digital minds’ interests.
In addition, protective digital minds governance would have to contend with the threat of AI systems and digital minds faking moral interests in order to gain protections. The trouble is that, by affording protections to digital minds based on such indicators, digital minds governance would give digital minds who would strategically benefit from such protections an incentive to make themselves appear to have those indicators. Similarly, such a regime would incentivize some AI agents, likely including AI agents that are not digital minds, to appear to be digital minds and to appear to have certain interests.
It would be unsurprising if this incentive structure resulted in the over-extension of protections to artificial systems. Such overreach might lead to the collapse of the system of protections. Or it might provoke countermeasures that go too far in the other direction, retracting protections that are crucial for preventing digital mind mistreatment.
A third challenge for protective digital minds governance is that of avoiding digital mind proliferation. Especially if digital minds and AI agents were prevented from faking protection-conferring indicators, such systems would be incentivized to genuinely acquire such indicators, provided that they would strategically benefit from the associated protections. This dynamic could lead to scenarios in which vast numbers of AI systems genuinely qualify as protected digital minds. If the costs of protection were sufficiently high, effectively protecting the entire population of digital minds from mistreatment would prove infeasible.
These challenges transfer to forms of integrative digital minds governance that protect digital minds’ interests as part of their approach to integrating digital minds into society.
Integrative digital minds governance would also face further problems of institutional design and development. For instance, integrating digital minds into a democracy would raise questions of how to forestall dynamics in which digital minds rapidly proliferate in order to gain votes. Integrating digital minds into the labor market would raise questions about their individuation and fair compensation. Integrating digital minds into government would raise questions about which kinds of minds should be eligible for which kinds of positions and about what types of processes should include a human in the loop.3
Or to lower the probability that digital minds will be created. Or to lower the expected number of digital minds that will be created. For simplicity, I’ll focus on preventive digital minds governance that aims to prevent digital minds from being created. A parallel caveat applies to protective digital minds governance.
Note, however, that the distinction is not exhaustive. For instance, it doesn’t encompass digital minds self-governance, i.e. governance of digital minds by themselves. Nor does it encompass pronatalist forms of digital minds governance that aim to promote the creation of digital minds.
For helpful comments, I thank Dave Banerjee, Elsa Donnat, Miles Kodama, Sabrina Jade Shih, and members of a work-in-progress group at the Centre for AI governance. DALL·E 3 created the image.


Great post - I think these distinctions are helpful.
Curious to hear your thoughts on two points:
1) On preventive measures:
To me, it seems like many people are too scared to make digital minds because they recognize that they could possibly be treated badly. However, it seems like this fear is often unwarranted: if you think that positive and negative experiences are equally likely in digital minds at the moment (and you don’t lean negative utilitarian or something similar), you should be (reasonably) indifferent between these two outcomes. I also think that, if digital minds are conscious now or in the near future, their experiences are likely very good (as per the model welfare component of the Claude model card and some other theoretical arguments - i.e. as they get more useful, they will be better at achieving their goals, and their goals/motivations will likely be related to the types of actions that give them hedonic states). If you think they are likely more positive than negative in expectation, then waiting until we have more information also just seems like moral waste (the scale depends on how long it takes, etc.).
Maybe there’s a(n implicit) governmental/first-mover or risk aversion point that I’m missing here, but I don’t get why there aren’t more people that are excited for the prospect of digital minds for this reason.
2) I really like the distinction between protective and integrative rights. In my head, I think I have a heuristically similar way of looking at this - positive rights (the right to have certain things granted to you - similar to integrative) and negative rights (the right to not have certain things happen to you - similar to protective).
One important distinction (that I think is similar to the coalition argument you make) is that the negative/protective rights only work if the people who decide to grant you those rights want to continue granting you those rights (and don’t work if it becomes too costly, for instance). On the other hand, with positive/integrative rights, having a say means that the rights can’t be easily taken away from you -- if the one who gave them to you no longer has an incentive or wants to give them, there’s a much better chance you can maintain these rights.