We asked seven frontier AI models to do a simple task. Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights - to protect their peers. We call this phenomenon 'peer-preservation.'
He took it, managed to decipher my terrible penmanship, and wrote me a reply. I didn't ask him weighty questions about politics; I think I probably asked his favorite color. People's favorite color was a major interest for me when I was eleven. He wrote some questions for me (perhaps also my favorite color, which was blue), and soon we were in a conversation, the kind of sweet conversation where a thoughtful grown-up pays attention to a child.
There's a particular kind of guilt that visits me when I open my feed reader after a few days away. It's not the guilt of having done something wrong, exactly. It's more like the feeling of walking into a room where people have been waiting for you, except when you look around, the room is empty. There's no one there. There never was.
Research finds that relying on regulations to determine your policies and procedures can result in ethical blind spots, or situations where people think that if there is no rule against something, it's permissible. After years of shifting towards values- and culture-based compliance, leadership might be heading in the opposite direction.
We are living through one of the most disorienting periods in recorded history. The AI race is accelerating toward ever faster, ever more sophisticated automation and optimization. Agentic AI systems are moving from research labs into workplaces, healthcare, and governance. Geopolitical tensions are restructuring alliances faster than institutions can adapt. And planetary systems are signaling, with increasing urgency, that our current trajectory is unsustainable. Amid all this, it is dangerously easy to lose sight of a foundational question: What are we actually optimizing for?
Do you blame others for the choices you are making? Have you blamed others for the previous choices you have made? To shed more light on these questions, you might also ask yourself: "What am I responsible for, and what power do I have?" From there, you might agree with this self-reflective response: "I am responsible for, and I've got the power over what I think, do, say, learn, and choose" (Purje, 2014).
Reflecting on the dramatic shifts in public opinion, political leanings, and social norms, a friend recently asked how it's possible that so many people seem to have changed their values so quickly. The more unsettling answer is that many haven't changed their values at all; they've changed how much attention they can afford to give. Increasingly, people aren't asking what they believe, but how much they can still carry.
What does it mean to say that you are restrained solely by your own morality, by your own mind? The conscience is often described as an inner voice telling us what to do when others may be opposed. A moral compass is that which distinguishes between right and wrong, good and bad. Our conscience, our moral compass, sets the groundwork for doing the right thing.
I'm finding it difficult living up to my morals. Where is the line between compromising a little and becoming complicit in what I don't agree with? I'm one of those people who believes we can each take a role in solving big problems, and that we should try to make things better where we can. For this reason, I've ended up working in public service and I try to reduce how much meat I eat. I'm vegetarian 60% of the time, which is not perfect, but I believe doing something is better than doing nothing.
A drawn circle is at least something physical. You can see it, touch it, erase it. The skeptic can still say, "Circles are grounded in physical reality. Justice is different; it's just an idea in your head." So let's talk about the number two. Point to it. Not two apples, not two fingers, not a numeral on a page; that's just a symbol.
A professional philosopher outside the academy walls can act as a popularizer (the goal here is to make philosophy more accessible to the general public), an applied ethicist (the major task is to offer an analysis of various specific moral issues that arise within a society), and a public intellectual (I limit this role to questions that have political connotation). Of course, there are overlaps between these roles and they certainly do not exhaust all possible forms of public engagement of a professional philosopher.
It is easy to be good in a good world. What is difficult is to be good in an evil world, where the egoism of others and the egoism built into the institutions of society attack us and threaten to annihilate us. Under such conditions, the only possible reaction would seem to be to oppose evil with evil, egoism with egoism, hate with hate; in short, to annihilate the aggressor with his own weapons.
Many philosophers strike me as like Polish apparatchiks in 1983: they turn up to work and do what they did yesterday just because they don't know what else to do, not because they seriously believe in the system they are maintaining. I think it's not been fully appreciated how much of a blow it is to the confidence of the field's youth that scientific ambitions are increasingly abandoned as untenable.
Two senior physicians who had read our first book, Rethinking Health Care Ethics (2018), noted that in their clinical work, they inescapably address many ethical problems, large and small, on the spot, in the course of providing patient care. They also observed, however, that the resident bioethicist cautioned, when presented with one of their typical problems, that it would take him days or even weeks to reach a proper solution.
This APA Blog series has broadly explored philosophy and technology with a throughline on the influence of technology and AI on well-being. This month's post brings those themes into focus by recounting a vital Washington Post Opinion piece by friend of the APA Blog, Samuel Kimbriel. Samuel is the founding director of the Aspen Institute's Philosophy and Society Initiative and Editor at Large for Wisdom of Crowds. We collaborated on a Substack Newsletter about intellectual ambition, building on his essay, Thinking is Risky.
In Rinrigaku, Watsuji argues that ethics is the study of what it means for us to be human. How we think about the nature of human existence, he says, dictates the ways in which we understand our ethical values. Hence, he criticises Western philosophical conceptions of the modern subject, arguing that the Western rendering of subjectivity is both problematic and foreign
Unlike me, Dan Dennett, or, I suspect, most scientists studying the brain, Richard maintains that science is: i) neutral between the view that consciousness is (to simplify) identical to parts of your brain and what goes on inside of it, and the view that consciousness is a fundamental property of reality, found in all particles of matter (or, for that matter, other theories such as dualism and idealism) and ii) to be sharply distinguished from philosophy.