“If you give it a list of rules—'don't tell people how to hot-wire a car, don't speak in Korean'—it doesn't really understand the rules, and it's hard to generalize from them. It's just a list of do's and don'ts.” — Dario Amodei
“Whereas if you give it principles—it has some hard guardrails like 'Don't make biological weapons' but—overall you're trying to understand what it should be aiming to do, how it should be aiming to operate.” — Dario Amodei
Amodei explains Anthropic's approach to aligning AI systems through a principle-based constitution and discusses the tradeoffs between fixed rules and general principles.