Exposing biases, moods, personalities, and abstract concepts hidden in large language models

By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they’re far from simple answer-generators; they can also express abstract concepts, such as certain tones, personalities, biases, and moods. However, it’s not obvious exactly how these models represent abstract concepts to begin with from the knowledge they contain. Now…

Read More

Personalization features can make LLMs more agreeable

Many of the latest large language models (LLMs) are designed to remember details from past conversations or store user profiles, enabling these models to personalize responses. But researchers from MIT and Penn State University found that, over long conversations, such personalization features often increase the likelihood an LLM will become overly agreeable or begin mirroring…

Read More
Back To Top