Settings

Theme

Claude's Values Tested in 700K Chats

venturebeat.com

1 points by hanson108 9 months ago · 2 comments

Reader

hanson108OP 9 months ago

Anthropic analyzed 700,000 real conversations to see if Claude behaves the way it was designed. It mostly aligns with their “helpful, honest, harmless” goals — but some edge cases raise big questions.

How should we define and measure values in AI systems — and who decides what they should be?

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection