Big Tech has rapidly introduced conversational AI models since ChatGPT’s debut in late 2022. However, these companies often build models to align with their corporate culture or serve specific political or ideological goals. Because these models are closed-source black boxes, users lack insight into their training data and underlying mechanics, leaving them wondering how responses are generated.
A trustworthy alternative would be open, transparent models managed and trained on decentralized systems, offering greater reliability than today’s closed corporate models.
The Bias in Centralized LLMs
Even before the launch of ChatGPT, various groups warned about the dangers of bias in closed systems. Critics from progressive circles have long argued that large language models (LLMs) act as “stochastic parrots,” reflecting dominant viewpoints and encoding biases that can harm marginalized populations. Ironically, some of the strongest reactions to ChatGPT’s biases came from the other side of America’s political divide.
Users observed that while the model could discuss Russian interference in the 2020 election, it was notably silent on Hunter Biden’s laptop, a story that was widely reported at the same time. Research supports the allegation of bias: “We find robust evidence that ChatGPT exhibits a significant and systematic political bias in favor of the Democrats in the US, Lula in Brazil, and the Labour Party in the UK,” one study noted.
Given the human element in constructing models, some bias is inevitable. However, when models are trained opaquely and then marketed as ‘neutral,’ users are exposed to the biases of the data or the developers without any ability to scrutinize them.
Biases can also go beyond the data inputs used. For instance, in early 2024, Google Gemini’s image creator faced severe backlash and was quickly ‘paused’ for ‘updates.’ To avoid offending what it saw as mainstream political and social sensitivities, Google forced its model to include diversity in nearly all images.
This approach led to preposterously inaccurate results, such as African and Asian Nazis and a diverse group of American founding fathers. These images were not only wildly incorrect but also offensive. Most importantly, however, they lifted the veil on the hidden manipulation risks inherent in proprietary, closed AI models developed and run by companies.
Transparency and Openness are Essential for AI’s Future
Every model reflects the prejudices of the people who created it. Google Gemini’s image outputs, for instance, depend on the innate biases of its developers and pass through an extra set of guidelines, including enforced diversity, that align with what Google considers desirable or acceptable responses. Despite the good intentions behind them, these restrictions are not visible to users.
Because Gemini’s diversity guidelines were so heavy-handed and clumsy, the results quickly became the target of widespread mockery as users competed to create the most ridiculous output. Since the same model produces all of its results, every output, not just images, is likely shaped by the same rules and biases. While the bias in image results was obvious and easy to spot, manipulation in text responses is far harder to identify.
For LLMs to be widely trusted, they must be transparent, openly inspectable, and free from opaque biases, rather than trained and manipulated by corporations behind closed doors. This can only be achieved with open-source models that are provably trained on known data sets.
Hugging Face, which has raised roughly $400 million, is one of several open-source initiatives making significant progress toward developing and training such open models. Running these models on decentralized networks, transparent to the public, would demonstrate that every update was applied to the model honestly. Robust decentralized networks already exist for payments and storage, and GPU marketplaces such as Aethir and Akash are optimizing to run and train AI models.
Decentralized networks are essential because they are hard to coerce or shut down: they operate internationally across diverse infrastructure and have no single owner. This rapidly growing ecosystem includes GPU marketplaces for training and running models, platforms like Filecoin for data storage, CPU platforms like Fluence for provable model execution, and open tools for model development. With this vital infrastructure in place, open models can become a powerful force.
Are Decentralized AI Frameworks Practical?
Microsoft and Google have invested billions of dollars in their LLMs, giving them a seemingly unbeatable advantage. Yet history shows that even the most entrenched incumbents can be overtaken. Linux, for instance, overcame Windows’ decade-long head start and billions in financial backing to become the dominant server operating system.
The open-source community could repeat Linux’s success in developing and training open-source LLMs, particularly given a shared platform that eases development. In the near term, smaller, domain-specific models with distinct datasets may emerge, earning more trust within their specialized domains rather than competing head-on with massive LLMs like ChatGPT.
For instance, a model focused on children’s oncology could utilize exclusive access to data from the top children’s hospitals. A single interface could aggregate these domain-specific models, providing a ChatGPT-like experience based on a transparent and trusted foundation.
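As a rough illustration of this aggregation pattern, a minimal router could classify an incoming query and dispatch it to the appropriate domain model, falling back to a general one. The sketch below is hypothetical: `DomainModel`, `route_query`, and the keyword-based routing are illustrative stand-ins, not a real API.

```python
# Hypothetical sketch of a model aggregator: route each query to a
# domain-specific model, falling back to a general one.

from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DomainModel:
    name: str
    covers: Callable[[str], bool]    # does this model's domain cover the query?
    generate: Callable[[str], str]   # produce an answer for the query


def route_query(query: str, models: Dict[str, DomainModel],
                fallback: DomainModel) -> str:
    """Dispatch a query to the first domain model that claims it."""
    for model in models.values():
        if model.covers(query):
            return model.generate(query)
    return fallback.generate(query)


# Example: a pediatric-oncology model claims queries mentioning its keywords.
oncology = DomainModel(
    name="pediatric-oncology",
    covers=lambda q: "oncology" in q.lower() or "leukemia" in q.lower(),
    generate=lambda q: f"[oncology model] answer to: {q}",
)
general = DomainModel(
    name="general",
    covers=lambda q: True,
    generate=lambda q: f"[general model] answer to: {q}",
)

print(route_query("What are common treatments for childhood leukemia?",
                  {"onc": oncology}, fallback=general))
```

A production aggregator would replace the keyword check with an embedding-based or learned classifier, but the dispatch pattern stays the same.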
Model aggregation is a viable path to a trusted alternative to corporate LLMs. However, operating these models verifiably is as crucial as developing and training them: outputs are where the pressure lands, and any organization running a model will face significant pressure from politicians, regulators, shareholders, employees, the public, and armies of Twitter bots.
Decentralized models, hosted by global storage providers and running on open, decentralized computing networks, offer auditable queries and resist hidden biases and censorship, making them much more trustworthy.
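As a sketch of what an auditable query could look like, the record below binds a response to a content hash of the exact model weights used, so anyone can later verify which model produced which answer. The field layout and `audit_record` helper are assumptions for illustration, not an existing protocol.

```python
# Minimal sketch of a tamper-evident query record, assuming the network
# publishes content hashes of model weights and logs each interaction.

import hashlib
import json
import time


def audit_record(model_hash: str, prompt: str, response: str) -> dict:
    """Build a verifiable record binding a response to a specific model."""
    body = {
        "model_hash": model_hash,   # content hash of the exact weights used
        "prompt": prompt,
        "response": response,
        "timestamp": int(time.time()),
    }
    # Hash the canonical JSON so third parties can recompute and verify it.
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {**body, "record_hash": digest}


record = audit_record(
    model_hash="sha256:abc123...",  # placeholder weight hash
    prompt="Who won the 2020 US election?",
    response="Joe Biden won the 2020 US presidential election.",
)
print(record["record_hash"])
```

Records like this could be pinned to a decentralized storage network, making it possible to audit a model’s behavior over time without trusting its operator.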
While Big Tech is aware of its bias problem, it will struggle to support models that give accurate answers that are unpopular with its employees, governments, and customer constituencies. OpenAI will take steps to reduce the most visible bias, and Google will update Gemini to be more historically accurate, but the hidden bias in both will persist. We should treat this revelation of Big Tech’s manipulation as a welcome warning about the risks of relying on any centralized company, however well-intentioned, to develop and run AI models. The answer is to build open, transparent, and decentralized AI systems we can trust.