Why is Meta going open source with its AI language model?
Facebook owner Meta has been developing a rival to the powerful GPT-3 artificial intelligence model and is inviting the research community to make use of it.
Meta's Open Pretrained Transformer (OPT-175B) is a large deep learning model that Meta has been feeding vast amounts of data in order to hone its natural language processing capabilities.
OpenAI is currently considered the leader in the field with its GPT-3 large language model, which has been used to generate some impressive poems and articles that read as though they could have been penned by humans.
Meta says OPT-175B has 175 billion parameters, matching the size of GPT-3, the third generation of OpenAI's Generative Pre-trained Transformer. Released in 2020, GPT-3 has more than 100 times as many parameters as its predecessor, GPT-2, which had 1.5 billion.
The key difference, says Meta, is that OpenAI charges for access to GPT-3 while OPT-175B will be made available for free to the research community.
"While in some cases the public can interact with these models through paid APIs, full research access is still limited to only a few highly resourced labs," Meta researchers noted in a blog post.
"This restricted access has limited researchers' ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues such as bias and toxicity."
Therein lies the reason Meta, a company known for holding its intellectual property tightly within its grasp, is giving away a powerful piece of technology. Meta's Facebook platform continues to be criticised for allowing biased information and hate speech to go viral. The company's efforts to automate its systems to sort through the billions of pieces of text its users post every day have improved Facebook's content moderation, but not sufficiently to cleanse newsfeeds of toxic content.
Open-sourcing OPT-175B is a way of accelerating developments in the research community that could ultimately help Meta tackle its biggest headache.
"We believe the entire AI community - academic researchers, civil society, policymakers, and industry - must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular, given their centrality in many downstream language applications," Meta's researchers wrote.
The codebase used to train and deploy the model is being shared, in a format that only requires 16 NVIDIA V100 GPUs, according to Meta. That should allow a wide range of research institutions to access the processing capacity to run the model, which Meta claims is much more efficient to run, producing "1/7th the carbon footprint as that of GPT-3".
Meta is also releasing "a suite of smaller-scale baseline models, trained on the same data set and using similar settings as OPT-175B, to enable researchers to study the effect of scale alone".
A number of local researchers and start-ups have been using OpenAI's GPT-3 to power natural language processing engines. Digital human creator Uneeq plugged Sophie, its responsive digital avatar, into GPT-3 last year in an effort to allow for wider-ranging and more naturalistic conversations.
The vast library of content GPT-3 has been fed massively expands the options for humanlike discussion. But it comes with its downsides too.
"[Sophie's] thoughts and opinions are very much driven by the information available to her, which includes billions of pages of data from across the internet," Uneeq pointed out.
"As such, her dialogue doesn't necessarily represent the views of UneeQ, OpenAI or any other affiliates of ours. Her utterances and opinions are fluid, and what she says should never be taken as advice - neither professional, financial, medical nor in any other context."
In other words, there's a chance Sophie could spread misinformation or toxicity herself, a risk Meta says is a distinct possibility with its own language model. But that's the whole point: working out ways to identify and eliminate bias and inappropriate content.
More information about OPT-175B and how researchers can access it is available on the Meta blog.