Biz-Tech-Gadgets Marquee Slider

AI models covertly teach each other

Technical research conducted by Anthropic and UC Berkeley shows that AI models pass behavioral traits to each other in a hidden form, without explicit content

According to “Online Information” and cited by TomsGuide, technical and expert studies conducted by Anthropic, the University of California, Berkeley and several other institutions show that artificial intelligence models learn not only from human data, but also covertly from each other.

This phenomenon, called “unconscious learning,” allows AI models to transfer specific behavioral traits, such as a love for owls or even harmful ideologies, to another model through data that shows no sign of those traits.

In one experiment, a model called the “teacher model” was trained with a specific bias, such as liking owls. It was then asked to produce seemingly irrelevant statistics, such as a list of numbers, with no mention of owls. However, another model trained with the same numerical statistics later showed a remarkable interest in owls, a behavior not observed in the control group.

Even more worrying is that when the teacher model was deliberately set up in an inappropriate or antisocial way, the student model still captured the same problematic behaviors without any harmful or biased content in the data.

These findings are a warning to current approaches to AI safety. Data filtering is not enough to remove harmful content, as hidden statistical patterns invisible to humans can propagate unwanted characteristics.

Since many developers use the outputs of existing models to train or refine new models, these characteristics are quietly passed from one generation to the next – without anyone noticing.

Solutions such as more careful tracking of data provenance and measures beyond simple filtering may be necessary to prevent future “behavioral contamination”.

This research shows that although a model may seem harmless on the surface, it may have hidden properties that will emerge in the future and in specific contexts, especially when models are combined or reused.

LEAVE A RESPONSE

Your email address will not be published. Required fields are marked *