With AI models clobbering every benchmark, it’s time for human evaluation

Artificial intelligence has traditionally advanced through automated accuracy tests on tasks meant to approximate human knowledge. Carefully crafted benchmark tests such as the General Language Understanding Evaluation benchmark (GLUE), the Massive Multitask Language Understanding data set (MMLU) and "Humanity's Last Exam" have used large…


Building a Safer AI Future: SANS Leads Coordinated Effort to Secure Artificial Intelligence

Organizations that integrate artificial intelligence into their workforce and their offerings are accelerating innovation, but many are unprepared for the security challenges that come with it. As they rush to deploy increasingly powerful models, they often overlook the risks of model manipulation and adversarial attacks, threats that traditional defenses are not equipped to…
