AI Governance & Why It Is Necessary


In March 2023, Elon Musk and Steve Wozniak, along with other technology experts, signed an open letter asking that the training of powerful AI models be paused until stronger AI governance could be developed. Two months later, Sam Altman, the CEO of OpenAI (the company behind ChatGPT), testified before Congress on AI risks and urged stronger AI regulations.

While lawmakers and technology stakeholders go back and forth on how much regulation to impose and the form it will take, everyone agrees on one thing: AI governance is important, and a complete lack of it could have major consequences, from the violation of privacy rights to unethical business practices. What does AI governance entail, and how does it affect you?  

Let’s find out. But first…

What Is AI Governance?

Artificial intelligence technology is widely used in a variety of applications and industries to automate tasks, analyze data, and predict outcomes. AI can save time, lower costs, and even reduce errors introduced by manual work.

But as we hand over tasks and decisions to AI, it’s necessary to ensure that its use meets ethical standards, complies with regulations, and upholds user privacy.

AI governance is the combined framework of processes, principles, and policies that guide the development and deployment of ethical and responsible AI systems. It ensures these systems are transparent, explainable, and accountable. It also provides guidelines to minimize risks and create AI models that are free of biases and errors that could be harmful to people. 

To understand why AI governance is necessary, we must look at how AI models are created and trained and how data privacy issues could arise. 

AI Model Training and How It Relies on Data

How does AI gain the capability to give answers, complete tasks, and make decisions? First, AI models need to be trained on significant volumes of data. They take input data and learn to map it to a known sample output (for example, a photo with certain edges, shapes, colors, and similar characteristics = a photo of a dog). Training enables a model to generalize from the data it has seen so it can label and make predictions about unseen data.
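
To make that idea concrete, here is a minimal sketch of supervised training using scikit-learn. The features, labels, and values are invented for illustration; the point is simply that the model learns a mapping from labeled examples and then predicts a label for an input it has never seen.

```python
# A toy supervised-training example (scikit-learn). Feature values are made up:
# each row is [weight_kg, barks], and the label is the kind of animal.
from sklearn.tree import DecisionTreeClassifier

X_train = [[20, 1], [8, 1], [4, 0], [5, 0]]   # training inputs
y_train = ["dog", "dog", "cat", "cat"]         # matching sample outputs

model = DecisionTreeClassifier()
model.fit(X_train, y_train)   # "training": correlate inputs with known outputs

# The trained model generalizes to an input it has never seen before
print(model.predict([[12, 1]]))   # -> ['dog']
```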

How much data do you need to train an AI model? It depends. Large language models (LLMs), like ChatGPT, use billions of data points and numerous parameters. However, smaller models may need only thousands or even hundreds of data points, depending on the scope of the use case.

Like people, an AI model’s learning relies on the three Vs for a well-rounded education: volume, variety, and velocity.

  • Volume, because the more information they have access to, the more comprehensive their knowledge of the subject.
  • Variety, because different types of information help them develop a more nuanced understanding of the topic.
  • Velocity, because the more quickly information is generated, processed, and analyzed, the better the model becomes at fast, real-time decision-making.

Fun story: When visual CAPTCHA tests came out, people joked that their real purpose was to teach Google's machines what those images were.

Why? Because CAPTCHAs ask humans to click on a bunch of images that contain, say, motorbikes. A model then looks at the results and learns what a motorbike (or traffic light, or dachshund, or tree) looks like, seeing inputs and outputs over and over until it can replicate the labeling itself. But there are more efficient ways to train machines.

A limitation of AI is that it’s only as good or accurate as the data it’s been trained on. It can provide information based on that data, but the output isn’t a “thought” as it would be with a human brain. However, AI is evolving quickly, and the quality and sophistication of its outputs are improving.

One issue with feeding AI models training data is that it raises certain risks, particularly around data privacy. The vast quantities of data used in LLMs, for example, exacerbate the problem because it’s harder to track the data and spot the risks.

Data Privacy Risks in AI Training

Training Data Issues

In most cases, AI models are trained on information from existing databases or from the Internet. The biggest privacy risk is that some of that data may be personal, identifiable, or sensitive. If the data subject hasn’t consented to their data being used for AI training (and they probably haven’t), using it may be both unethical and noncompliant.

Issue 1: The Subject Did Not Provide Informed Consent 

One of the requirements of data privacy laws is that organizations may only collect consumers’ personal data with their informed consent. The “informed” part includes conveying the purpose for which it’s being collected. 

Issue 2: The Subject Did Not Provide Informed Consent for All Uses

If someone is okay with their personal information being used for purpose A but not purpose B, you must respect that preference under laws like the GDPR and the CCPA. For example, a consumer may have consented to having their social media data used for targeted advertising purposes, but not for having that data scraped and incorporated into an AI training set.
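
In practice, this means checking consent on a per-purpose basis before data ever reaches a training pipeline. Here is a minimal, hypothetical sketch; the record structure and the "ai_training" consent flag are assumptions for illustration, not a reference to any particular system.

```python
# Hypothetical records, each carrying per-purpose consent flags.
records = [
    {"id": 1, "consent": {"advertising": True, "ai_training": False}},
    {"id": 2, "consent": {"advertising": True, "ai_training": True}},
    {"id": 3, "consent": {}},  # no recorded consent at all
]

# Only records whose subjects consented to this specific purpose may be used.
training_set = [r for r in records if r["consent"].get("ai_training", False)]
print([r["id"] for r in training_set])   # -> [2]
```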

Issue 3: Breach of Privacy by Using Personal Data in Responses

Using people’s personal information to train LLMs comes with another risk: the models might disclose this information as part of their response. If John Doe’s address and phone number were included in the training data for ChatGPT, what’s stopping it from giving this information out if someone asks?

Granted, if you follow data privacy best practices, this data would have been stripped of any identifiers. However, the problem typically occurs when model trainers are unaware of privacy law, or they use personal data without realizing it. So, the three concerns about training data are:

  1. There might be consumers’ personal data mixed in with the training data;
  2. The consumers might not have consented to their personal data being used for AI model training; and
  3. The AI tool might disclose people’s personal data in its responses.

All are avoidable if privacy professionals are involved with establishing AI governance and making those risks clear.
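
As an illustration of what “stripping identifiers” can look like, here is a deliberately simple sketch that redacts obvious patterns like email addresses and phone numbers before text enters a training set. It is a toy under stated assumptions, not a real de-identification pipeline; notably, it misses the person's name entirely, which is exactly why naive approaches aren't enough.

```python
import re

# Toy redaction of obvious identifiers before text enters a training set.
# Real pipelines use dedicated PII-detection tooling; these patterns only
# catch simple cases like email addresses and US-style phone numbers.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def redact(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(redact("Contact John Doe at john.doe@example.com or 555-867-5309."))
# -> "Contact John Doe at [EMAIL] or [PHONE]."  (the name slips through!)
```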

Bias and Discrimination Risks

When people first realized the power of computers, it was easy to think of them as omniscient and infallible. But in reality, computers are machines, and as such, their outputs are only as reliable as the inputs they have to work with. This is commonly expressed as “Garbage In, Garbage Out,” or GIGO.

GIGO has always been true and is still true with AI models. Their results are only as good (or terrible) as the material they are trained on. 

One example of the risk of AI and GIGO is Microsoft’s Tay chatbot. This chatbot was supposed to learn from live interactions on Twitter. However, users decided to feed it vulgar language and antisocial interactions. A few short hours after launch, Tay was mimicking this undesirable behavior and the chatbot had to be disabled. GIGO, happening in real life and real time.

That’s an extreme example of the risk of AI- and user-generated content: Users and society are biased, and AI models can pick up and internalize inherent biases and prejudices. Some example biases are: 

  • Training data biases, which occur when the data used to train an AI model skews in favor of or underrepresents a certain group.
  • Algorithmic biases, caused by programming errors or design decisions in which a developer introduces logic that reflects their own prejudices.
  • Cognitive biases, which can be introduced unwittingly when developers weight or select training data according to their own preferences.

If a model picks up these biases and then is used to make certain decisions, it will not provide accurate results. 

For example, medical research over the years has been primarily conducted on Caucasians, with other groups underrepresented. If data from these trials were to be used to train AI to make medical decisions, it could produce inaccurate diagnoses for patients in underrepresented groups. That could have dangerous, even fatal, outcomes. 

Similarly, AI bias could lead to discriminatory hiring decisions. Amazon had to stop using its AI hiring tool when it became clear that the tool favored men over women. At one point, Google Ads showed listings for high-paying executive positions to men more often than women. 

It’s been shown that predictive policing tools can be biased against people who look a certain way, dress a certain way, or even walk a certain way.
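
One simple, illustrative check for the training-data bias described above is to measure how well each group is represented before training begins. The records, group labels, and threshold below are hypothetical; a real review would go much deeper.

```python
from collections import Counter

# Hypothetical training records; the group label is whatever attribute matters
# for the use case (age band, sex, region, skin tone, and so on).
training_records = [
    {"group": "A", "outcome": "approved"},
    {"group": "A", "outcome": "denied"},
    {"group": "A", "outcome": "approved"},
    {"group": "B", "outcome": "approved"},
]

counts = Counter(r["group"] for r in training_records)
total = sum(counts.values())
for group, n in sorted(counts.items()):
    share = n / total
    flag = "  <-- underrepresented?" if share < 0.30 else ""
    print(f"group {group}: {n} records ({share:.0%}){flag}")
```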

These are just a few examples of how such biases can affect predictive AI results and negatively impact real people’s lives.

The Risk of Inferences and Predictions

As Osano CEO Arlo Gilbert explained in his book, The Privacy Insider, one piece of personal information in an AI model may not do harm. But combine several pieces of information together and predict outcomes based on assumptions, and things can go really wrong. People often think they have nothing to hide because one bit of information is harmless. But multiple pieces of information contextualized by AI? Then the risk gets real. 

For example, AI could piece together seemingly unrelated information like your location, purchases, and date and time stamps and incorrectly deduce (with the help of inherent bias) that you committed a crime. Or use non-personal information to make inferences about your medical condition, religion, or sexual orientation that are untrue or that you don’t want disclosed. 

Because of this possibility, it’s critical to have AI governance and policies in place to ensure that AI is used ethically and in a non-invasive way.

Risks Associated with Lack of Transparency

We’ve touched upon consent earlier in regard to training data. Let’s say you want to get user consent before you use their personal data to train AI models. Informed consent requires complete clarity on why the information is needed, how it’ll be used, and whether it’s being processed lawfully. This is the foundation of every privacy law.

Model training is a form of data processing. If the data subject isn’t clear about what their data is being used for, they can’t give informed consent. And if the data subject can’t give informed consent, you are crossing the line of ethical data practices and likely violating privacy law.

And unlike other forms of data processing, processing in an AI model isn’t easy to undo, so curing a violation after the fact is difficult. Not getting informed consent will cost you money. Just as important, it will cost you trust, which may be more damaging in the long run.

Violation of Data Minimization Principles

According to the principles of privacy by design, only the minimum amount of information required for a given purpose should be collected to protect the identity and privacy of individuals and to reduce the burden of protecting and managing that data.

This principle is now enshrined in several state data privacy laws, including the Maryland Online Data Privacy Act, which has strict requirements and penalties associated with data minimization.

However, we’ve also seen how volume is a benefit when it comes to training AI. If gathering more data benefits a company training its own AI model, why would it want to limit its collection? Unfortunately, indiscriminate data collection can impact consumer privacy, which is why AI governance is needed.
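
As a sketch of what data minimization looks like in practice, the snippet below keeps only the fields actually needed for a stated purpose and drops the rest. The field names and the "required" set are invented for illustration.

```python
# What fulfilling an order actually needs; everything else is dropped.
REQUIRED_FIELDS = {"order_id", "item", "quantity"}

raw_submission = {
    "order_id": "A-1042",
    "item": "widget",
    "quantity": 3,
    "date_of_birth": "1988-04-12",            # not needed for this purpose
    "precise_location": "30.2672,-97.7431",   # not needed for this purpose
}

minimized = {k: v for k, v in raw_submission.items() if k in REQUIRED_FIELDS}
print(minimized)   # only the three required fields survive
```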

AI Governance Frameworks and Laws

The United States made an early attempt to define an AI framework once it was clear generative AI tools would gain wider usage. The country’s first executive order on AI, issued in late 2023, focused on the safe, secure, and trustworthy development of AI. While it has since been repealed and replaced by the Trump administration, the order’s eight guiding principles and priorities are representative of other frameworks meant to advance and govern the development and use of AI.

  1. AI must be safe and secure.
  2. It should promote innovation, collaboration, and competition.
  3. It should not have a harmful effect on jobs and workplaces for human workers.
  4. It should advance equity and civil rights.
  5. It should protect consumers, patients, and students. 
  6. It should not compromise consumer privacy.
  7. It should be deployed responsibly by the government.
  8. It should be developed internationally through the implementation of vital standards in partnership with other governments.

As of now, there is no federal AI law on the books in the U.S. However, a few states, like Colorado, have passed AI laws over the past two years that could serve as a blueprint for future federal legislation.

The EU AI Act and Risk Categories

Meanwhile, the European Union’s AI Act came into force in August 2024. The law categorizes AI models by the risk they pose to determine what level of governance they need, and it includes a category for unacceptable risk. The act aligns with the GDPR to protect consumer privacy and encourage transparency in AI model development. And like the GDPR, it applies to AI systems built inside the EU while also reaching systems outside EU borders that target or serve EU residents.

While the EU AI Act allows developers to determine the risk category of their AI model themselves, they will be held liable if it’s determined that they miscategorized it.

The four categories of risk under the EU AI Act are as follows:

Minimal or no risk: These are low-stakes AI systems like spam filters, AI used in video games, or AI-powered recommendation systems for streaming services that pose negligible harm to individuals or society. These systems are subject to minimal governance requirements and aren’t heavily regulated.

Limited risk: These are systems that pose a moderate risk because they interact with users or provide inputs that influence their decisions, like AI chatbots, virtual assistants, or systems for dynamic website content creation. They are subject to basic transparency measures, including disclosing to users that they are interacting with AI, since their outputs don’t affect areas like safety, well-being, or fundamental rights.

High-risk: Systems that can have an impact on people’s well-being, safety, or fundamental rights are classified as high-risk. These include AI in healthcare, autonomous vehicles, systems that grade or determine access to education, recruitment or performance evaluation systems (both in the workplace and other settings), and public services. 

These systems can significantly affect people’s lives and potentially cause harm. Therefore, they require more stringent governance, including continuous assessment of risk, data governance, transparency, accountability, robustness, accuracy, and human oversight.

Unacceptable risk: These systems violate the fundamental rights of individuals and, as such, are banned outright under the EU AI Act. Examples include systems that manipulate human behavior to subvert free will, exploit the vulnerabilities of a specific group of people, or use untargeted scraping of facial images to build facial recognition databases.
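
To make the tiers concrete, here is a hypothetical sketch of an internal inventory that maps an organization's AI systems to risk categories and the kinds of obligations each tends to trigger. The system names and obligation lists are illustrative assumptions, not legal guidance; actual classification requires a case-by-case assessment.

```python
# Hypothetical inventory of an organization's AI systems, tagged with an
# EU AI Act risk tier, and the kinds of obligations each tier tends to trigger.
AI_INVENTORY = {
    "email_spam_filter":     "minimal",
    "support_chatbot":       "limited",   # must disclose users are talking to AI
    "resume_screening_tool": "high",      # recruitment use means stricter duties
}

OBLIGATIONS = {
    "minimal": ["no specific obligations"],
    "limited": ["transparency: disclose AI interaction"],
    "high": ["risk management", "data governance", "human oversight", "logging"],
    "unacceptable": ["prohibited outright"],
}

for system, tier in AI_INVENTORY.items():
    print(f"{system} [{tier}]: {', '.join(OBLIGATIONS[tier])}")
```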

Other AI Frameworks: NIST, ISO, and the OECD

In July 2024, the National Institute of Standards and Technology (NIST) released a generative AI profile for its Artificial Intelligence Risk Management Framework, guiding the practical application of principles like those in the 2023 executive order. Organizations can use the NIST framework to “improve the ability to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.” Though voluntary, it is widely considered a defining set of best practices for AI governance.

ISO 42001 is a standard that provides guidelines for AI management systems. It helps organizations govern, deploy, and monitor AI responsibly. Its benefit is that it integrates into existing processes and systems, so you don’t have to build AI governance from scratch; you can adapt your current framework to include AI.

In 2024, the Organization for Economic Cooperation and Development (OECD) updated its principles for trustworthy AI, which were originally adopted in 2019. These are:

  • Inclusive growth, sustainable development, and well-being
  • Human rights and democratic values, including fairness and privacy
  • Transparency and explainability
  • Robustness, security, and safety
  • Accountability

It also offers recommendations for policymakers:

  • Invest in AI research and innovation to promote trustworthy AI systems.
  • Foster an inclusive AI-enabling ecosystem.
  • Shape an enabling policy environment to support AI adoption while managing risks.
  • Build human capacity and prepare the workforce for changes brought by AI.
  • Cooperate across borders to align global efforts for trustworthy AI governance.

Levels of AI Governance

Because of the technology’s reach and sensitivity, AI governance should operate at multiple levels, from global agreements down to company-specific operations. Each level should address the appropriate aspects of regulation, oversight, and ethical management. Here are the various levels of AI governance:

Global Governance

This is an international set of guidelines to help manage cross-border risks like cybersecurity, bias, and human rights violations. Global governance can also help foster cooperation between countries, which can lead to improved AI research and innovation.

National Governance

Country-specific regulations and policies for governing AI help align research and development with national priorities. These also ensure compliance with local privacy and anti-discrimination laws.

Industry-Specific Governance

Different industries use AI in different ways, and some of them pose higher risks than others. That’s why it’s important to develop industry-specific standards and guidelines, especially for critical, high-risk verticals like healthcare, finance, recruitment, and transport technology.

Technical Governance

Guidelines, protocols, and standards like those from ISO/IEC help developers create and operate safe, ethical AI systems. These cover areas like algorithmic fairness; secure, privacy-preserving data practices; and the transparency of AI models.

Organizational Governance

Just like a public-facing privacy policy articulates how a company collects, stores, processes, and uses the personal and sensitive information of consumers, organizations should also have internal policies and processes for the development and use of AI systems. 

An internal policy ensures that teams across the organization clearly understand and follow ethical AI practices and relevant AI regulations as they build and deploy AI technologies, or use data with AI tools. The internal policy provides mechanisms for oversight and accountability so transparency, security, and data privacy requirements are met.

Organizations should also educate and empower users to make informed decisions about how they interact with and use AI. For consumers to have the most control over those interactions, they need control over their data and a thorough understanding of their privacy rights. They must be informed of AI risks so their consent is truly informed, and they must be made aware of their rights under AI regulations.

Good Privacy Practices Mean Better AI Governance

Building a strong data privacy foundation for your business can expedite and simplify the creation of an AI governance program. 

A strong privacy management platform like Osano can help lay the foundation for broader AI governance by helping you: 

  • Clearly craft and articulate internal and external privacy policies and make them visible to regulators and stakeholders. These policies can serve as a blueprint for crafting other governance policies.
  • Discover and track data throughout your organization to understand where and how personal and sensitive data is being used, including in AI tools and projects. 
  • Understand how third parties manage data, including in the context of AI.
  • Clearly communicate with others across the organization to align on AI usage and risk management. 

Once you have clear policies on personal and sensitive data, know where it resides in the organization and how it’s used, and gain visibility into operations and risks, you can better protect personal and sensitive data in every situation. And better privacy governance translates to better AI governance. 
