Resolving Utilitarianism

So I’ve said before that I was going to write up a post on utilitarianism. This is that post.

Note that I haven’t done much formal reading on utilitarianism, and it could be that someone else has already made these observations.

Whose Values?

Mine, of course.

There’s no great tablet sent down from on high proclaiming that happiness is the ultimate good or that the repugnant conclusion is, truly, repugnant. It’s my moral system and my moral intuitions.

Obviously intuitions aren’t perfect; there are some obvious fallacies to avoid in implementation (such as when people give to what is morally salient rather than to what is most impactful). But intuitions seem to be useful for ascertaining values.

It’s possible to create a moral system based around maximizing one’s own welfare, built on game-theoretic logic about rational actors in perfect cooperation. But people who only maximize their own welfare tend to notice that not everyone is a perfectly rational actor in perfect cooperation, so this argument falls flat for convincing its target audience. A maximizer for self-welfare wouldn’t actually act in this manner except under highly idealized circumstances.

The contractarian system looks quite good as long as we only consider good examples. For example, Omar’s welfare is satisfied by having everyone including him not murdered, and so are the welfares of 99% of other people, so Omar makes a contract for mutual benefit where murdering is not allowed and murderers are imprisoned or murdered. This is essentially the system we have now.

But this framework ignores the existence of preferences about others. For example, suppose that John’s welfare is satisfied only by killing everyone including himself, and that it is all-or-nothing; he wouldn’t be satisfied by only killing 10 people or by only killing himself, but rather exclusively by killing everyone including himself. John could not possibly make a contract of mutual benefit with anyone else, and as a rational actor John would want to achieve as much of his goal as possible.

This isn’t a problem for Omar’s society, since if enough people had preferences incompatible with John’s, John would just be dead or in jail.

But now suppose that 99% of society wanted everyone killed including themselves. They make a contract for mutual benefit and maximize welfare by killing 100% of people.

Somehow this doesn’t sound very good anymore, at least not for the 1% of people who don’t want to be killed.

The exclusively contractarian framework doesn’t give us a way to distinguish between these two societies, since in both people are acting for mutual benefit. Accepting a role for non-self-maximizing intuition appears to be necessary, as does ranking preferences-about-others on some universal scale.

Game theory can be taken as a means to the intuitionist end, but not as an end in itself. Even in the above formulation, it takes individual welfare as a given.

What is Evil?

Utilitarianism does not typically concern itself with the idea of evil. I likewise have no formal definition of evil, but I use it as a shorthand for something I want to prevent in order to maximize agency.

What Repugnance?

The repugnant conclusion is the true statement that, for any (e.g.) happy population, there is a larger population, in which each individual is barely happy, that is equivalent in moral value. Ozy writes that it makes little sense to prefer the large population to the smaller one.

I don’t find the repugnant conclusion to be particularly repugnant. If everyone in the society has positive happiness – which is necessary for them to count as contributing positive hedonic utils – then it is an absurdly good society, a better one than we have here. The two societies are indeed morally equivalent.

What Measurement?

It appears customary to use a linear relationship between inputs and utility: if you have, say, 10 hedons, that converts to 10 utils. This makes sense, since if you value hedons then it’s not as though one hedon is better than another.

There is also the system of evaluating by average utils. This has all the typical pitfalls of using averages, which could possibly be solved with judicious use of complex statistical methods. I don’t know any complex statistical methods, so I will not use the system of average utils.

No one seems to use a combination of various types of utility, which makes very little sense: in real life people value various different things. The actual mathematics are likely fiddley squiddley and, while interesting, time-consuming to model; but I will propose an intuition-based system after evaluating the various utilitarianisms. We shall call systems like my proposed system holistic systems.

It also appears customary to use a linear relationship between the number of population members and utility. This strikes me as flawed, at least assuming one population: increasing the population doesn’t necessarily scale utility linearly. A logarithmic, logistic, or shifting geometric pattern approaching a limit would be a better model, at least until we have e.g. space travel or access to other realms.
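As a sketch only, the three diminishing-returns shapes named above can be written out. All of the constants here (the limits, rates, and midpoints) are my own arbitrary assumptions, chosen just to make the shapes visible; nothing in the post pins down specific numbers.

```python
import math

def log_utility(n, k=1.0):
    """Logarithmic: grows without bound, but ever more slowly."""
    return k * math.log(1 + n)

def logistic_utility(n, limit=100.0, r=0.001, midpoint=5000):
    """Logistic: an S-curve that approaches `limit` asymptotically."""
    return limit / (1 + math.exp(-r * (n - midpoint)))

def geometric_utility(n, limit=100.0, ratio=0.999):
    """Shifting geometric: each additional person adds a fixed fraction
    of the remaining distance to `limit`, so total utility is bounded."""
    return limit * (1 - ratio ** n)

# Under the geometric curve, adding 500 people to a population of 500
# gains far more utility than adding 5,000 people to a population of
# 5,000 -- unlike the linear model, where both gains would scale alike.
early_gain = geometric_utility(1000) - geometric_utility(500)
late_gain = geometric_utility(10000) - geometric_utility(5000)
```

Which shape is right is exactly the kind of question the post defers; the point is only that all three flatten out where the linear model keeps climbing.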

This article will also try to avoid certain pitfalls of thought experiments by making assumptions/implications either explicit or nullified.

Which Values?

I’m not quite sure! We appear to be able to choose between hedonics, non-suffering, preference, and eudaemonia. Let’s evaluate all of these:


Hedonic Utilitarianism

This is the system of maximizing happiness.

Some objections indicate that we might end up forcing people to wirehead or that we might end up maximizing happiness at the expense of achievements, motivation, technology, or long-term protection.


Non-suffering Utilitarianism

We minimize suffering.

Unfortunately, this implies that we should be striving to kill everything that might possibly suffer, and then tile the world with non-suffering, which likely precludes a happiness function, since any resources used for happiness are not being used to prevent suffering. There are obviously other ways to minimize suffering, but this utilitarianism makes no distinction between different methods except for how efficiently they approach non-suffering.


Preference Utilitarianism

Maximizing the fulfillment of preferences.

Requires a large superstructure to figure out what preferences to maximize, and doesn’t distinguish between different preferences.


Eudaemonic Utilitarianism

I’m honestly not quite sure what eudaemonia is, only that it is sort of mystical and includes everything that people would see as a good thing.

If anyone is able to define it more extensively, I would be grateful. As it is, it seems somewhat handwave-y. “Oh, look at this ~thing~ that I think would be good to have! Let’s try to maximize that!”

What’s left? My own preferred version of utilitarianism, which I call:

Agency Utilitarianism

This maximizes the ability to refuse to do things.

It seems mostly harmless and possibly good, and can be achieved by means of the contract plan. It avoids the pitfalls of preference utilitarianism, somewhat sidesteps tiling problems by necessitating the existence of choices, things to do, and entities that can refuse, and distinguishes between various types of refusal by choosing the ones that maximize overall agency.

Unfortunately, it says nothing about happiness or welfare.


The largest problem for all of the above is tiling: the idea that an entity maximizing any of these could destroy people and replace them with itself. If we really valued these things exclusively, we would be in favor of this, unless there were something else we valued that humans had.

What do I value that human people have? (Most of these things are not universals, but things which humans, as an aggregate, tend to have that AI does not. There is no shame in not having one of these- not everyone can make art, not everyone can do science, not everyone has a sexuality; but in general these are important to conserve.)

  • self-awareness– an entity that was satisfied by the speed of light being 3*10^9 would not have equivalent self-awareness to a person.
  • intuition and problem-solving skills– System 1 is still valuable for solving problems and planning ahead; we don’t yet have effective programming skills to replace this. Potentially not a problem if we’re able to program well enough to create these reliably, but not a risk I would want to take.
  • science– we have yet to program or discover something that could do science, generate experience, and discover new knowledge as well as human people can.
  • art– I really like fantasy books and imagination, and would be sad to see these gone
  • love, other emotions– while some people don’t have certain or all emotions, humanity as a whole encompasses a whole degree of interesting emotional experiences which it would be sad to lose
  • the ability to spontaneously generate new ideas– important for survival and true progress
  • attachments to each other– this is, again, not universal to humans, but something I would be sad to see go
  • sexualities– again, not universal, but valuable to me
  • general personhood, however defined

Chesterton’s fence very much applies here. I don’t want to put a hard limit on what a human is or to invent a hard definition; but all of these things are valuable.

The utility of these “human values” can be modeled by several different logarithmic-esque curves, rising sharply at low values and then approaching a limit, where x = the number of individuals with a given trait and y = the utility provided. Higher x always corresponds to higher utility, but under such a curve, quite a few individuals can stop having these traits before the losses become important.
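One minimal curve with this shape is a simple saturating ratio. The function name and both constants below are my own illustrative assumptions, not anything the post specifies:

```python
def human_value_utility(x, limit=1.0, half=1000.0):
    """Saturating curve for one "human value" trait.

    x     = number of individuals bearing the trait (assumed input)
    half  = assumed count at which half the maximum utility is reached
    limit = asymptotic maximum utility; the curve rises steeply near
            zero and flattens as x grows, so early losses of the trait
            matter far more than losses in a large population.
    """
    return limit * x / (x + half)
```

Under this curve, going from 0 to 100 trait-bearers gains far more utility than going from 10,000 to 10,100, which is the "diminishing returns along the logarithmic curve" behavior the holistic system below relies on.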

This is one of the utility functions in the utility system I will be proposing below.

A Proposed Holistic System

We try to ensure that the “human values” curves do not drop below a certain threshold, preferably close to their limit. Only solutions in which “human values” (quotation marks to make clear that they are far from universal) stay above this threshold are acceptable. Worlds where “human values” are above the threshold are morally neutral compared to each other, because of diminishing returns along the logarithmic curve. Any tiling and/or expansion that doesn’t push the human values below the threshold is morally neutral, except with regard to its impact on the criteria below.

Next, we maximize agency. A world where more entities can refuse is better than one where fewer can refuse. This is done by means of a game-theoretic contract.

Finally, we maximize happiness. Rather than wireheading, we raise everyone’s hedonic baseline through e.g. head implants. We cannot decrease agency for happiness.
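The three tiers above amount to a lexicographic ordering: the human-values threshold acts as a hard constraint, agency dominates among acceptable worlds, and happiness breaks the remaining ties. A minimal sketch, where the `World` fields, the threshold value, and the 0-to-1 scale are all my own assumptions for illustration:

```python
from dataclasses import dataclass

THRESHOLD = 0.9  # assumed floor for the "human values" curves, near their limit

@dataclass
class World:
    human_values: float  # position on the saturating "human values" curve, 0..1
    agency: float        # how many entities are able to refuse
    happiness: float     # aggregate hedonic level

def better(a: World, b: World) -> bool:
    """True if world `a` is strictly preferred to `b` under the proposed ordering."""
    a_ok = a.human_values >= THRESHOLD
    b_ok = b.human_values >= THRESHOLD
    if a_ok != b_ok:          # tier 1: the threshold is a hard constraint
        return a_ok
    if a.agency != b.agency:  # tier 2: above the threshold, agency dominates
        return a.agency > b.agency
    return a.happiness > b.happiness  # tier 3: happiness breaks remaining ties
```

Note that once both worlds clear the threshold, `human_values` is ignored entirely, matching the claim above that such worlds are morally neutral with respect to each other; and no amount of happiness can outweigh a deficit in agency, matching “we cannot decrease agency for happiness.”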

Future Inquiry

This ensures that we do not eliminate things that people typically value, like “human values”, thereby resolving tiling problems. It then ensures that we do not forcibly wirehead people while also providing for welfare as measured by happiness. It also provides a justification for preventing murder, enabling considered suicide, outlawing nonconsensual sexual contact, and protecting various freedoms, especially freedom of speech.

Further research could include formal mathematical models of this framework, objections to its structure, or other variants of holistic systems.
