Risk & Governance | June 1, 2025 | 7 min read

What Is Shadow AI? The Hidden Data Security Risk in Your Organization

Employees are using dozens of AI tools you haven't approved. Here's what shadow AI is, why it's dangerous, and how to get visibility without blocking productivity.

AIovert Security Team

GDPR & EU AI Act practitioners

Quick answers

What is shadow AI?

Shadow AI refers to employees using AI tools (such as ChatGPT, Claude, or Gemini) without formal IT approval or security review. These tools receive sensitive business data (customer PII, API keys, internal documents) with no data processing agreement in place.

Why is shadow AI a security risk?

AI tool providers may retain conversation data for model training. Without a DPA or enterprise agreement, your customers' personal data can end up in a third-party training dataset. That is a potential GDPR violation, and the most serious infringements carry fines of up to €20 million or 4% of global annual turnover, whichever is higher.

How widespread is shadow AI?

Security teams that deploy monitoring consistently find 55–75% of employees using unapproved AI tools. Most employees do not know their inputs may be used for training.

The gap between your AI policy and your employees' browsers

Most enterprises have an AI acceptable-use policy. Some also enforce it at the network layer, blocking traffic to known AI domains. Neither approach works: employees access ChatGPT on personal hotspots, use AI-powered writing assistants embedded in tools like Notion, Grammarly, or Figma, and paste data into free-tier Perplexity on lunch breaks.

This is shadow AI: the use of generative AI tools that sit outside your visibility, your vendor review process, and your compliance controls. It is not malicious. It is, in most cases, an employee trying to be more productive with the best tools available to them.

The problem is that productivity and security pull in different directions. Your employees are optimising for speed. Your data protection obligations are not.

What data is actually being shared?

Organisations that deploy browser-level AI monitoring consistently report the same categories of sensitive data appearing in AI tool inputs:

  • Customer PII: names, email addresses, phone numbers, Social Security Numbers. Sales teams paste CRM exports to draft outreach. Support agents paste ticket data to summarise issues.
  • API keys and credentials: developers paste error logs, configuration files, or environment variables to debug issues. AWS keys, GitHub tokens, and OpenAI API keys regularly appear in ChatGPT conversations.
  • Internal documents: legal teams summarise contracts. Finance teams paste forecast spreadsheets. HR teams ask AI to draft performance reviews containing employee data.
  • Source code: engineering teams paste entire modules. This is particularly sensitive in regulated industries where proprietary algorithms are trade secrets.
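
To make these categories concrete, here is a minimal Python sketch of pattern-based detection for a few of them. It is an illustration under stated assumptions, not a description of any particular product: the `PATTERNS` table and `classify` function are hypothetical names, and the regular expressions are simplified approximations of commonly seen formats (US Social Security Numbers, AWS access key IDs, GitHub token prefixes, `sk-` style API keys). Real detectors need validation, context checks, and tuning to keep false positives down.

```python
import re

# Simplified, illustrative patterns. These are assumptions about common
# formats and will produce both false positives and false negatives.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "sk_style_api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b"),
}


def classify(text: str) -> dict[str, int]:
    """Count matches per data type. The matched text itself is never returned."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        matches = len(pattern.findall(text))
        if matches:
            counts[label] = matches
    return counts


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com, SSN 123-45-6789, key AKIAIOSFODNN7EXAMPLE"
    print(classify(sample))
```

The function returns only labels and counts, so a caller can record that an input contained an email address or a credential without storing the value itself.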

Why traditional DLP doesn't work

Network DLP sees TLS-encrypted traffic to chatgpt.com and reports “HTTPS 443.” It cannot inspect the payload. Even SSL inspection appliances face limitations: modern browsers flag custom root certificates, employees notice performance degradation, and mobile devices and personal hotspots bypass the perimeter entirely.

Endpoint DLP is theoretically capable of capturing clipboard events, but it produces enormous volumes of low-fidelity alerts, requires heavyweight agents, and typically misses web-based input fields entirely.

The fundamental architectural problem is that network and endpoint DLP were designed before the browser became the primary enterprise application runtime.

The compliance exposure

Under GDPR Article 28, sharing personal data with a third-party processor requires a Data Processing Agreement (DPA). OpenAI, Google, and Anthropic offer enterprise DPAs, but only on paid enterprise plans with specific data processing terms enabled. Free-tier and consumer-tier users have no DPA. This means every employee using a free ChatGPT account with customer data is creating an unprotected data transfer.

Under GDPR Article 32, organisations must implement appropriate technical and organisational measures to ensure data security. A regulator asking “what controls do you have over employee AI tool usage?” requires a technical answer, not just a policy document.

SOC 2 Type II auditors are beginning to include specific AI tool usage questions. ISO 27001:2022 Annex A 5.9 (inventory of information and other associated assets) increasingly covers data processed through AI tools.

How to address shadow AI without killing productivity

The worst response to shadow AI is a blanket block. Employees are more productive with AI tools, and overly restrictive policies push usage to personal devices where you have even less visibility.

An effective shadow AI programme has three components:

  1. Visibility: know what tools are being used, by whom, and with what types of data. This requires browser-level monitoring, not network monitoring.
  2. Classification, not content: log the data type (SSN, API key, email), not the raw content. This preserves employee privacy while giving security teams the signal they need.
  3. Evidence: maintain an audit log that can be exported for compliance reviews. When a regulator asks, you need a timestamp, a username, a tool, and a data classification, not a policy PDF.
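
The second and third components can be sketched together: an audit record that holds only a timestamp, a username, a tool, and a data classification, plus a CSV export for compliance reviews. This is a minimal Python sketch under assumptions. `AuditEvent`, `make_event`, and `export_csv` are hypothetical names, the field layout is one possible choice, and a real deployment would also need access controls, retention rules, and tamper protection for the log.

```python
import csv
import io
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional

FIELDS = ["timestamp", "username", "tool", "classification"]


@dataclass(frozen=True)
class AuditEvent:
    timestamp: str        # ISO 8601 timestamp in UTC
    username: str
    tool: str             # for example, the hostname of the AI tool
    classification: str   # data type label only, never the raw content


def make_event(username: str, tool: str, classification: str,
               now: Optional[datetime] = None) -> AuditEvent:
    """Build an audit record; `now` can be injected for reproducible tests."""
    moment = now or datetime.now(timezone.utc)
    return AuditEvent(moment.isoformat(), username, tool, classification)


def export_csv(events: Iterable[AuditEvent]) -> str:
    """Serialise events to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for event in events:
        writer.writerow(asdict(event))
    return buffer.getvalue()


if __name__ == "__main__":
    event = make_event("jdoe", "chatgpt.com", "api_key")
    print(export_csv([event]))
```

Because the record has no field for the pasted text, the log cannot leak the content it describes, which is the point of logging classification rather than content.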

AIovert addresses all three requirements with a Chrome extension deployed via MDM. Detection runs on-device, so raw content never leaves the browser. The dashboard provides per-employee risk scores, trend analysis, and exportable audit logs.

The bottom line

Shadow AI is not a future risk. It is happening today, in your organisation, with your data. The question is whether you have visibility. If you don't, the next breach investigation or regulatory audit will be the first time you find out.

See the practical guide to preventing ChatGPT data leaks, or read the full enterprise AI DLP guide.

See what's leaving your organisation

AIovert deploys in 15 minutes via Google Workspace or Intune. No proxy, no certificates, no employee action required.