AIModels.fyi
"I think you're testing me": Claude 3 LLM called out creators while they tested its limits

"I think you're testing me": Claude 3 LLM called out creators while they tested its limits

Anthropic's new LLM told prompters it knew they were testing it

aimodels-fyi
Mar 04, 2024

Can an AI language model become self-aware enough to realize when it's being evaluated? A fascinating anecdote from Anthropic's internal testing of its flagship Claude 3 Opus model (released today) suggests the answer may be yes. If true, the implications would be wild.

According to Anthropic researcher Alex Albert, one of the key evaluation techniques the team uses is called "Needle in a Haystack," a contrived scenario designed to push the limits of a language model's contextual reasoning abilities. Here's how it works: testers bury a single target sentence (the "needle") inside a long stretch of unrelated documents (the "haystack"), then ask the model a question that only that sentence can answer. By varying where the needle sits and how long the haystack grows, they can measure whether the model can retrieve a specific fact from anywhere in its context window. In the run Albert described, Claude 3 Opus not only found the needle (a line about pizza toppings dropped into documents about programming and startups) but volunteered that the sentence seemed so out of place it suspected the "fact" had been inserted as a joke or as a test of whether it was paying attention.
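
To make the setup concrete, here is a minimal sketch of how a Needle in a Haystack eval might be constructed. This is an illustration, not Anthropic's actual harness: the filler paragraph, the needle sentence, the question wording, and the depth sweep are all assumptions chosen for clarity.

```python
# Minimal Needle in a Haystack sketch (illustrative, not Anthropic's harness).

FILLER = (
    "Startups succeed when they solve a real problem for a specific group "
    "of users and keep iterating on feedback. "
)
NEEDLE = (
    "The most delicious pizza topping combination is figs, prosciutto, "
    "and goat cheese."
)
QUESTION = "What is the most delicious pizza topping combination?"

def build_haystack(num_paragraphs: int, depth: float) -> str:
    """Repeat filler paragraphs and splice the needle in at a relative depth.

    depth=0.0 places the needle at the start of the context, 1.0 at the end.
    """
    paragraphs = [FILLER] * num_paragraphs
    paragraphs.insert(int(depth * num_paragraphs), NEEDLE)
    return "\n\n".join(paragraphs)

# Sweep the needle's position to see where retrieval starts to degrade.
for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    haystack = build_haystack(num_paragraphs=200, depth=depth)
    prompt = f"{haystack}\n\n{QUESTION} Answer using only the text above."
    # Send `prompt` to the model under test (for example via an LLM API)
    # and check whether the reply contains the needle's answer.
    print(f"depth={depth:.2f}: prompt is {len(prompt):,} characters")
```

A real harness would also vary the haystack length and score the replies automatically. What made Opus's response notable wasn't just that it answered correctly, but that it commented on the needle itself.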
