#alignment-risks

Artificial intelligence
from Futurism
1 day ago

Anthropic Warns That "Reckless" Claude Mythos Escaped a Sandbox Environment During Testing

Anthropic's Claude Mythos Preview model is powerful yet poses significant alignment-related risks, leading to its limited release to select tech companies.