See What Your AI Is Doing, When, and Why

Fourteen months after deploying an AI system across their legal practice, a UK firm received a data subject access request from a former client. Under GDPR, the firm was legally obliged to respond within one month with every AI interaction that had used that client's data.

They had uptime logs. They had latency dashboards. They had no interaction records.

---

Two Different Systems

Your AI uptime dashboard shows green. Your governance record shows nothing. Those are different systems, and only one of them matters when a regulator calls.

Operational monitoring tells you whether the AI infrastructure is running: uptime, latency, error rates. These metrics answer the question: is the system available? They tell you nothing about what the system did with data, which model processed which request, or whether decisions stayed within approved boundaries.

Governance visibility answers a different question: can I show what the AI did? Which user submitted which request, at what time, using which data source, processed by which model, resulting in what response and action? This is the record that answers a client dispute, satisfies a GDPR data subject access request (which must normally be answered within one month), or responds to a regulator asking for 18 months of complete interaction logs.

Most enterprise AI deployments configure only the first system. The 89% of enterprise AI usage that is invisible to IT teams — a figure from Netskope's 2024 enterprise data report — isn't a failure of operational monitoring. Infrastructure monitoring works fine. The governance record was never built.

---

What Absence of Records Costs

Three examples from different industries show the same underlying pattern.

A Belgian insurance firm deployed a claims-processing AI and monitored uptime and response time. When a policyholder alleged the AI had misclassified their claim, the firm had no record of which data the AI had accessed or what its reasoning pathway was. The dispute took 8 months to resolve without the AI record, at a cost of €340,000 in legal fees. A complete interaction log would have resolved it in days — either confirming the AI's reasoning was sound or identifying where the misconfiguration occurred.

A Dutch healthcare network's deployment showed what observability enables, using the same Recorder component that Leeloo builds into every deployment. In week three of production, the system flagged a behavioral drift: response quality for a specific query type dropped from 91% to 67% before any user complaint arrived. The operations team pushed a model update before the drift caused a clinical recommendation error. Without continuous behavioral monitoring, that drift would have stayed invisible until someone noticed the output quality problem, at which point weeks of degraded responses would already exist without explanation.
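The kind of drift check described above can be sketched as a comparison between a baseline quality score and a rolling average of recent scores for each query type. This is a minimal illustration, not the Recorder's actual implementation: the `DriftMonitor` class, the query type names, the window size, the 0.10 tolerance, and the idea of a 0-to-1 quality score per response are all assumptions made for the example.

```python
from collections import defaultdict, deque


class DriftMonitor:
    """Flags query types whose recent average quality falls below baseline.

    Hypothetical sketch: quality scores are assumed to be floats in [0, 1]
    produced by an upstream evaluation step that is not shown here.
    """

    def __init__(self, baselines, window=50, tolerance=0.10):
        self.baselines = dict(baselines)   # query type -> expected quality
        self.tolerance = tolerance         # allowed drop before flagging
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def record(self, query_type, quality):
        """Store one scored response for the given query type."""
        self.recent[query_type].append(quality)

    def drifted(self):
        """Return {query_type: recent_average} for types beyond tolerance."""
        flagged = {}
        for query_type, scores in self.recent.items():
            baseline = self.baselines.get(query_type)
            if baseline is None or not scores:
                continue
            average = sum(scores) / len(scores)
            if baseline - average > self.tolerance:
                flagged[query_type] = average
        return flagged


# Hypothetical query types and scores, echoing the 91% -> 67% drop above.
monitor = DriftMonitor({"dosage_lookup": 0.91, "scheduling": 0.88})
for _ in range(20):
    monitor.record("dosage_lookup", 0.67)   # degraded responses
    monitor.record("scheduling", 0.87)      # small dip, within tolerance
print(monitor.drifted())
```

The rolling window keeps the check sensitive to recent behavior; a longer window would smooth out noise at the cost of slower detection.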

A German manufacturer's monitoring dashboard showed normal operations for 11 months while its Router had been failing 6% of sensitive queries over to a cloud model, a problem discovered only during a supplier audit by a defense client. Eleven months of operational green lights, eleven months of data flowing outside the approved boundary. The performance monitoring never flagged it because the system was responding correctly. Which model those responses were routed through was outside the monitoring scope.
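A boundary check of the kind that was missing here can be sketched as a pass over routing records that counts sensitive queries handled by a model outside an approved list. The record fields (`sensitivity`, `model`), the model names, and the `APPROVED_FOR_SENSITIVE` set are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical list of models approved to process sensitive queries.
APPROVED_FOR_SENSITIVE = {"onprem-llm-v2"}


def boundary_violations(routing_log):
    """Return (violations, rate) for sensitive queries routed off-boundary."""
    sensitive = [r for r in routing_log if r["sensitivity"] == "sensitive"]
    violations = [r for r in sensitive if r["model"] not in APPROVED_FOR_SENSITIVE]
    rate = len(violations) / len(sensitive) if sensitive else 0.0
    return violations, rate


# Illustrative log: 100 sensitive queries, 6 of which failed over to a cloud model,
# plus one public query that is allowed to use the cloud model.
log = [{"id": i, "sensitivity": "sensitive", "model": "onprem-llm-v2"} for i in range(94)]
log += [{"id": 94 + i, "sensitivity": "sensitive", "model": "cloud-llm"} for i in range(6)]
log += [{"id": 200, "sensitivity": "public", "model": "cloud-llm"}]

violations, rate = boundary_violations(log)
print(f"{len(violations)} off-boundary sensitive queries ({rate:.0%})")
```

A check like this runs against the governance record rather than the latency dashboard, which is why uptime and response-time metrics alone would not surface the problem.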

---

What to Log — and What Not to Log

Logging everything is not the right answer. It's expensive, creates its own privacy exposure, and generates far more data than anyone will analyze.

Selective observability — configured around regulatory requirements and business decision impact — gives you the governance record you need without storing information that creates new problems. Three things must always be logged: which model processed the query, which data was accessed, and what decision or action followed. Everything else is configurable based on your regulatory environment and risk profile.
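One way to express this split is a logging configuration in which the three mandatory fields cannot be switched off and everything else is opt-in. The field names and the `build_log_fields` and `filter_event` helpers below are assumptions for illustration, not a documented Leeloo configuration format.

```python
# The three always-logged items from the text, under hypothetical field names.
MANDATORY_FIELDS = frozenset({"model_id", "data_accessed", "action_taken"})
# Hypothetical optional fields, enabled per regulatory environment and risk profile.
OPTIONAL_FIELDS = frozenset({"user_role", "latency_ms", "prompt_template", "client_ip"})


def build_log_fields(enabled_optional):
    """Return the fields to log: mandatory ones plus the chosen optional ones."""
    enabled = set(enabled_optional)
    unknown = enabled - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown optional fields: {sorted(unknown)}")
    return MANDATORY_FIELDS | enabled


def filter_event(event, fields):
    """Keep only configured fields; fail loudly if a mandatory one is missing."""
    missing = MANDATORY_FIELDS - set(event)
    if missing:
        raise ValueError(f"event lacks mandatory fields: {sorted(missing)}")
    return {k: v for k, v in event.items() if k in fields}


fields = build_log_fields({"user_role"})
event = {
    "model_id": "onprem-llm-v2",
    "data_accessed": ["claims_db"],
    "action_taken": "claim_flagged_for_review",
    "user_role": "adjuster",
    "client_ip": "10.0.0.7",   # dropped: not enabled in this configuration
}
print(filter_event(event, fields))
```

Rejecting events that lack a mandatory field, rather than logging them partially, keeps gaps in the governance record visible at write time instead of at audit time.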

The Leeloo Recorder captures six elements for every AI interaction: user identity and role at time of request; timestamp with millisecond precision; a hash of the query content along with its sensitivity classification (the hash protects privacy; the classification records what category of data was involved); data sources accessed with the access control decision for each; model identity and version number; and response delivery confirmation along with any downstream action taken.
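The six elements map naturally onto a single immutable record. The sketch below is one possible shape, assuming SHA-256 for the content hash and hypothetical field names and example values; it is not the Recorder's actual schema.

```python
import hashlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceAccess:
    source: str      # data source identifier
    decision: str    # access control outcome, e.g. "allowed" or "denied"


@dataclass(frozen=True)
class InteractionRecord:
    user_id: str            # 1. user identity...
    user_role: str          #    ...and role at time of request
    timestamp: str          # 2. ISO 8601 UTC timestamp, millisecond precision
    query_hash: str         # 3. hash of the query content...
    sensitivity: str        #    ...and its sensitivity classification
    sources: tuple          # 4. one SourceAccess entry per data source
    model_id: str           # 5. model identity...
    model_version: str      #    ...and version number
    delivered: bool         # 6. response delivery confirmation...
    downstream_action: str  #    ...and any downstream action taken


def make_record(user_id, user_role, query, sensitivity, sources,
                model_id, model_version, delivered, downstream_action):
    """Build a record that stores a hash of the query, never the query text."""
    now = datetime.now(timezone.utc)
    return InteractionRecord(
        user_id=user_id,
        user_role=user_role,
        timestamp=now.isoformat(timespec="milliseconds"),
        query_hash=hashlib.sha256(query.encode("utf-8")).hexdigest(),
        sensitivity=sensitivity,
        sources=tuple(sources),
        model_id=model_id,
        model_version=model_version,
        delivered=delivered,
        downstream_action=downstream_action,
    )


# All values below are invented for illustration.
record = make_record(
    user_id="u-1042",
    user_role="paralegal",
    query="Summarise the engagement letter for client A",
    sensitivity="client-confidential",
    sources=[SourceAccess("document_store", "allowed"),
             SourceAccess("billing_db", "denied")],
    model_id="onprem-llm",
    model_version="2.3.1",
    delivered=True,
    downstream_action="summary_saved_to_matter_file",
)
print(asdict(record))
```

Freezing the dataclass means a record cannot be altered in memory after creation; tamper resistance in storage would need a separate mechanism that this sketch does not cover.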

These six elements address the record-keeping questions currently raised by GDPR, the EU AI Act, and sector-specific frameworks including HDS (the French healthcare data hosting standard) and SOX (the US Sarbanes-Oxley Act on financial reporting controls, which has implications for AI used in reporting). The EU AI Act, in force since August 2024 with its obligations phasing in over the following years, requires high-risk AI systems to automatically record events throughout their lifetime, and this configuration is designed to satisfy that requirement. A data protection authority requesting a complete record of all interactions involving a specific individual receives it as an exportable report.

---

The Governance Record in Practice

In Leeloo's deployed customer base, organizations that maintain complete AI observability resolve data disputes in an average of 11 days. Organizations without complete logs resolve the same categories of dispute in an average of 94 days, and 34% of those disputes escalate to formal regulatory notifications, versus 3% where a complete record exists.

Beyond compliance, this record creates a second capability: improvement with evidence. Organizations that log every AI interaction can analyze which query types produce low-quality responses, which data sources generate the most value, and which workflows have the highest abandonment rates. Governance infrastructure built for compliance becomes an optimization tool.

There's also an underappreciated commercial angle. Regulated-sector clients — law firms, healthcare providers, financial institutions — are putting AI observability requirements into vendor due diligence questionnaires. Organizations that can produce a complete observability record in response to a supplier audit close those conversations quickly. Organizations that can't are disqualified before the price conversation begins.

---

Building It Right the First Time

Observability standards for AI are still developing. The EU AI Act's logging requirements define the principle ("automatically record events") without yet specifying exact event types for each system category. We've configured the Recorder around the most defensible interpretation of current requirements, on the assumption that regulators will clarify over the next 24 months. Logging more now and reducing scope if regulators specify less is a better position than the reverse.

Privacy and observability can appear to pull against each other: detailed AI interaction logs may contain personal data, and logging personal data creates its own obligations. The resolution is purpose limitation — log the decision record rather than the data payload, except where the payload itself is the subject of a dispute. A hash of the query content (which can be verified without being read) satisfies most governance requirements without storing the sensitive content that was processed.
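Verification without reading works because anyone holding the disputed text can recompute the hash and compare it with the stored value. The sketch below uses a keyed HMAC-SHA-256 rather than a bare hash, on the assumption that short or predictable queries could otherwise be guessed by brute force; the key, the function names, and the example queries are illustrative only.

```python
import hashlib
import hmac


def fingerprint(query: str, key: bytes) -> str:
    """Return a keyed SHA-256 fingerprint of the query text as hex."""
    return hmac.new(key, query.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(stored: str, candidate: str, key: bytes) -> bool:
    """Check a disputed query against the stored fingerprint in constant time."""
    return hmac.compare_digest(stored, fingerprint(candidate, key))


# Hypothetical key; a real deployment would keep it in a secret store.
key = b"example-key-held-by-the-log-owner"

# Only the fingerprint is written to the log, not the query text.
stored = fingerprint("Summarise contract 4471 for client A", key)

print(matches(stored, "Summarise contract 4471 for client A", key))  # True
print(matches(stored, "Summarise contract 4471 for client B", key))  # False
```

The trade-off is that the log alone cannot reproduce the query: a party to the dispute has to supply the text, which is the point of purpose limitation.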

We built the Recorder into the Framework as a standard component because we wouldn't be comfortable explaining a product that lacked it to a regulator. Every Leeloo deployment ships with governance visibility configured, not as an optional module that gets added later if someone asks for it.

---

When the Regulator Calls

Complete observability changes the nature of a regulatory inquiry. An organization that can produce a full interaction record in hours answers a routine audit routinely. An organization that discovers its governance record has gaps during an audit answers a different, more difficult question.

Each new AI capability added after the first benefits from the same foundation. An organization with complete observability on its first deployment adds the second use case knowing exactly what record will exist, what governance questions will be answerable, and what the audit response will look like. That accumulated confidence — the knowledge that you can account for everything your AI has done — is what makes expansion straightforward rather than anxious.

Every AI deployment should come with the ability to explain what it did. That's not an advanced feature. It's the baseline from which everything else is built.

---

Leeloo is a sovereign AI implementation company based in Luxembourg, EU. The Recorder is a standard component in every Leeloo Framework deployment. [leeloo.ai]