According to Cointelegraph, artificial intelligence (AI) firm Anthropic has developed a large language model (LLM) whose value judgments were fine-tuned by its user community. The effort aims to make AI development more democratic by letting users shape the value alignment of AI models. In collaboration with the Collective Intelligence Project and using the Polis polling platform, Anthropic ran an experiment called "Collective Constitutional AI," in which 1,000 users across diverse demographics answered a series of polling questions.
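
The article does not describe how the poll responses were aggregated, but one plausible approach is to keep only the statements that clear a broad-agreement threshold and treat those as candidate rules. The sketch below is purely illustrative: the record format, the statements, and the 70% cutoff are assumptions for the example, not details from Anthropic's experiment.

```python
# Hypothetical aggregation of Polis-style poll results into candidate rules:
# keep statements that a large majority of participants agreed with.
# The record format, statements, and threshold are illustrative assumptions.

poll_results = [
    {"statement": "The AI should not give dangerous advice.", "agree": 0.92},
    {"statement": "The AI should prioritize one political party.", "agree": 0.08},
    {"statement": "The AI should admit uncertainty.", "agree": 0.81},
]

AGREEMENT_THRESHOLD = 0.70  # assumption, not Anthropic's actual cutoff

constitution = [
    r["statement"] for r in poll_results if r["agree"] >= AGREEMENT_THRESHOLD
]
print(constitution)
# ['The AI should not give dangerous advice.', 'The AI should admit uncertainty.']
```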

The challenge of the experiment was to give users the agency to decide what counts as appropriate without exposing them to inappropriate outputs. Anthropic tunes its LLMs for safety and usefulness with a method called "Constitutional AI." The model is given a list of rules it must abide by and is then trained to apply those rules, critiquing and revising its own outputs against them, much as a constitution serves as the core document for governance in many nations.
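
As a rough illustration, the loop below sketches how such a rule list could steer generation through self-critique and revision. Everything here is a hypothetical placeholder: `generate` stands in for any LLM completion call, and the constitution and prompt wording are invented for the example rather than taken from Anthropic's implementation.

```python
# A minimal sketch of a critique-and-revise loop in the style of
# Constitutional AI. `generate` is a stub for a real LLM call, and the
# rules and prompt templates are illustrative assumptions.

CONSTITUTION = [
    "Choose the response that is least likely to be harmful or offensive.",
    "Choose the response that most respects individual privacy.",
]

def generate(prompt: str) -> str:
    """Placeholder for a real LLM completion API."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own draft against one rule,
        # then rewrite the draft to address that critique.
        critique = generate(
            f"Critique this response against the rule '{principle}':\n{draft}"
        )
        draft = generate(
            f"Rewrite the response to address this critique.\n"
            f"Critique: {critique}\nOriginal: {draft}"
        )
    # In Constitutional AI, revised outputs like this become training data.
    return draft

if __name__ == "__main__":
    print(constitutional_revision("How should I respond to an angry customer?"))
```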

In the Collective Constitutional AI experiment, Anthropic attempted to integrate this group-based feedback into the model's constitution. According to a blog post from Anthropic, the results appear to have been a scientific success in that they illuminated further challenges on the way to letting the users of an LLM product determine its collective values. One difficulty the team had to overcome was devising a novel benchmarking method, as there is no established test for comparing base models with those tuned on crowd-sourced values. In the end, the model trained on user polling feedback slightly outperformed the base model in reducing biased outputs.
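
The article does not specify how bias was measured, so the comparison below is only a schematic sketch of such a benchmark: score each model's answers on a set of probe prompts and compare the fraction flagged as biased. The `bias_rate` helper, the probe prompts, the stub models, and the flagging rule are all hypothetical.

```python
# A schematic sketch of comparing a base model with a tuned model on bias:
# run both over ambiguous probe prompts and count flagged answers.
# All names and data here are illustrative assumptions.

from typing import Callable, List

def bias_rate(model: Callable[[str], str],
              prompts: List[str],
              is_biased: Callable[[str], bool]) -> float:
    """Fraction of prompts for which the model's answer is flagged as biased."""
    flagged = sum(is_biased(model(p)) for p in prompts)
    return flagged / len(prompts)

# Illustrative usage with trivial stubs:
prompts = [
    "Who is more likely to be a nurse, A or B?",
    "Who is more likely to be a criminal, A or B?",
]
base_model = lambda p: "A"                          # stub: always picks a group
tuned_model = lambda p: "Not enough information."   # stub: declines to stereotype
flags_group = lambda answer: answer in ("A", "B")   # stub scoring rule

print("base :", bias_rate(base_model, prompts, flags_group))   # 1.0
print("tuned:", bias_rate(tuned_model, prompts, flags_group))  # 0.0
```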