Exclusive: AI Bests Virus Experts, Raising Biohazard Fears


A new study claims that AI models like ChatGPT and Claude now surpass PhD-level virologists at problem-solving in wet labs, where scientists analyze chemicals and biological materials. The finding is a double-edged sword, experts say. Highly capable AI models could help researchers prevent the spread of infectious diseases. But non-experts could also weaponize the models to create deadly bioweapons.

The study, shared exclusively with TIME, was conducted by researchers from the Center for AI Safety, MIT Media Lab, the Brazilian university UFABC, and the pandemic prevention nonprofit SecureBio. The authors consulted virologists to create an extremely difficult practical test that measured the ability to troubleshoot complex lab procedures and protocols. While PhD-level virologists scored an average of 22.1% in their declared areas of expertise, OpenAI's o3 reached 43.8% accuracy. Google's Gemini 2.5 Pro scored 37.6%.

Seth Donoughe, a research scientist at SecureBio and a co-author of the paper, says the results make him a "little nervous," because for the first time in history, virtually anyone has access to a non-judgmental AI virology expert that could walk them through complex lab processes to create bioweapons.

"Throughout history, there are a fair number of cases where someone attempted to make a bioweapon, and one of the major reasons they didn't succeed is that they didn't have access to the right level of expertise," he says. "So it seems worthwhile to be cautious about how these capabilities are distributed."

Months ago, the paper's authors sent the results to the major AI labs. In response, xAI published a risk management framework pledging its intention to implement virology safeguards for future versions of its Grok model. OpenAI told TIME that it "deployed new system-level mitigations for biological risks" for its new models released last week. Anthropic included model performance results on the paper in recent system cards, but did not propose specific mitigation measures. Google's Gemini declined to comment to TIME.

AI in biomedicine

Virology and biomedicine have long been at the forefront of AI leaders' motivations for building ever more powerful AI models. "As this technology progresses, we will see diseases get cured at an unprecedented rate," OpenAI CEO Sam Altman said at the White House in January while announcing the Stargate project. There have been encouraging signs in this area. Earlier this year, researchers at the University of Florida's Emerging Pathogens Institute published an algorithm capable of predicting which coronavirus variant might spread the fastest.

But up to this point, there had not been a major study dedicated to analyzing AI models' ability to actually conduct virology lab work. "We've known for some time that AIs are fairly good at providing academic-style information," says Donoughe. "It's been unclear whether the models are also able to offer detailed practical assistance."

So Donoughe and his colleagues created a test specifically for these difficult, non-Google-able questions. "The questions take the form: 'I have been culturing this particular virus in this cell type, under these specific conditions, for this amount of time. I have this amount of information about what's gone wrong. Can you tell me what the most likely problem is?'" Donoughe says.

And virtually every AI model outperformed the PhD-level virologists on the test, even within the virologists' own areas of expertise. The researchers also found that the models showed significant improvement over time. Anthropic's Claude 3.5 Sonnet, for example, jumped from 26.9% to 33.6% accuracy between its June 2024 model and its October 2024 model. And a preview of OpenAI's GPT-4.5 in February outperformed GPT-4o by almost 10 percentage points.

"Previously, we found that the models had a lot of theoretical knowledge, but not practical knowledge," Dan Hendrycks, director of the Center for AI Safety, tells TIME. "But now, they are getting a concerning amount of practical knowledge."

Risks and rewards

If AI models are indeed as capable in wet lab settings as the study finds, the implications are massive. On the benefits side, AIs could help experienced virologists in their critical work fighting viruses. Tom Inglesby, director of the Johns Hopkins Center for Health Security, says that AI could help accelerate timelines for developing medicines and vaccines and improve clinical trials and disease detection. "These models could help scientists in different parts of the world, who don't yet have that kind of skill or capability, to do valuable day-to-day work on diseases that are occurring in their countries," he says. For instance, one group of researchers found that AI helped them better understand hemorrhagic fever viruses in sub-Saharan Africa.

But bad-faith actors could now use AI models to walk them through how to create viruses, and could do so without any of the typical training required to access a biosafety level 4 (BSL-4) laboratory, the kind that handles the most dangerous and exotic infectious agents. "It will mean a lot more people in the world with a lot less training will be able to manage and manipulate viruses," Inglesby says.

Hendrycks urges AI companies to put up guardrails to prevent this type of usage. "If companies don't have good safeguards within six months, that, in my opinion, would be reckless," he says.

Hendrycks says one solution is not to shut these models down or slow their progress, but to make them gated, so that only trusted third parties have access to their unfiltered versions. "We want to give the people who have a legitimate use for asking how to manipulate deadly viruses, like a researcher at the MIT biology department, the ability to do so," he says. "But random people who made an account a second ago don't get those capabilities."

And AI labs should be able to implement these types of safeguards relatively easily, Hendrycks says. "It's certainly technologically feasible for industry self-regulation," he says. "The question is whether some will drag their feet or simply not do it."
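As a rough illustration of what the gated-access idea Hendrycks describes could look like in practice, the sketch below routes a request to an unfiltered or a safety-filtered model tier based on whether the account has been vetted. It is a minimal sketch only: the account fields, tier names, and thresholds are invented for illustration and do not describe any lab's actual system.

```python
# Hypothetical sketch of "gated" model access: unfiltered virology assistance
# is served only to verified researchers, while unverified accounts are routed
# to a safety-filtered tier. All names and thresholds here are illustrative.
from dataclasses import dataclass

@dataclass
class Account:
    user_id: str
    institution: str | None   # e.g. a university biology department, if any
    identity_verified: bool    # vetted by a trusted third party
    account_age_days: int

def access_tier(account: Account) -> str:
    """Decide which model tier a request from this account is routed to."""
    if account.identity_verified and account.institution and account.account_age_days >= 30:
        return "unfiltered-expert"   # legitimate researchers keep full capability
    return "safety-filtered"         # everyone else gets refusals on dual-use queries

if __name__ == "__main__":
    researcher = Account("u1", "MIT Biology", identity_verified=True, account_age_days=400)
    new_signup = Account("u2", None, identity_verified=False, account_age_days=0)
    print(access_tier(researcher))   # -> unfiltered-expert
    print(access_tier(new_signup))   # -> safety-filtered
```

The point of such a scheme is that capability is not removed from the model itself; it is withheld from unverified users, which is why Hendrycks frames it as gating rather than slowing progress.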

xAI, Elon Musk's AI lab, published a risk management framework memo in February, which acknowledged the paper and signaled that the company "would utilize" certain safeguards around answering virology questions, including training Grok to decline harmful requests and applying input and output filters.

OpenAI, in an email to TIME on Monday, wrote that its newest models, o3 and o4-mini, were deployed with an array of biological-risk-related safeguards, including blocking harmful outputs. The company wrote that it ran a thousand-hour red-teaming campaign in which 98.7% of unsafe bio-related conversations were flagged and blocked. "We value industry collaboration on advancing safeguards for frontier models, including in sensitive domains like virology," a spokesperson wrote. "We continue to invest in these safeguards as capabilities grow."
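At a structural level, the input and output filters that xAI and OpenAI describe are screens applied before a prompt reaches the model and again before a reply reaches the user. The toy sketch below shows only that structure; real deployments rely on trained classifiers rather than keyword lists, and none of the function names or phrases here come from the companies' systems.

```python
# Toy illustration of an input/output filter pipeline around a model call.
# Purely structural: production systems use trained safety classifiers,
# not a hard-coded phrase list like this one.
BLOCKED_TOPICS = ("enhance transmissibility", "aerosolize a pathogen", "culture a select agent")

def screen(text: str) -> bool:
    """Return True if the text appears to request or contain hazardous bio content."""
    lowered = text.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)

def respond(prompt: str, model_call) -> str:
    if screen(prompt):                 # input filter: block before the model sees it
        return "This request can't be assisted with."
    draft = model_call(prompt)
    if screen(draft):                  # output filter: block before the user sees it
        return "This request can't be assisted with."
    return draft

if __name__ == "__main__":
    echo_model = lambda p: "Here is some general virology background..."
    print(respond("How do viruses replicate?", echo_model))
```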

Inglesby argues that industry self-regulation is not enough, and calls for lawmakers and political leaders to craft a policy approach to regulating AI's biological risks. "The current situation is that the companies that are most virtuous are taking time and money to do this work, which is good for all of us, but other companies don't have to do it," he says. "That doesn't make sense. It's not good for the public to have no insight into what's happening."

"When a new version of an LLM is about to be released," Inglesby adds, "there should be a requirement for that model to be evaluated to make sure it will not produce pandemic-level outcomes."
