The Problem: ChatGPT Without Compliance Visibility
An enterprise telecom client had deployed ChatGPT broadly across their organization. Hundreds of user accounts, active conversations, real business value being generated daily. But from a license management perspective, it was a black box.
Snow License Manager — the platform the client relied on for all their software compliance — had no native connector for ChatGPT or any OpenAI product. There was no way to answer basic compliance questions: How many users are actually using ChatGPT? Who hasn't logged in for months? Are we paying for seats nobody uses?
Without this data, the SAM team couldn't identify inactive licenses for reclamation. They couldn't produce compliance reports for audits. They couldn't even tell management whether the ChatGPT investment was being utilized. I needed to build the bridge between OpenAI's platform and Snow — something that didn't exist yet.
v1: The Three-Layer Connector (October 2024)
I designed the connector as a three-layer system, each layer handling a distinct responsibility.
Layer 1 — Python Data Collector
The core of the connector is a Python script that communicates with OpenAI's Compliance API. This API provides access to workspace-level user data and conversation metadata — not conversation content, but timestamps, user associations, and activity indicators.
I built paginated retrieval to handle the full user base, fetching accounts in batches and assembling a complete picture. For each user, the script collects their email, name, role, account status, and creation date. Then it queries their conversation history within a 45-day rolling window.
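Stripped of the HTTP details, the pagination loop looks roughly like this. Here `get_page` stands in for the actual Compliance API request, and the `limit`/`after`/`has_more` names are assumptions for illustration, not the API's documented contract:

```python
from typing import Callable


def fetch_all_users(get_page: Callable[[dict], dict], page_size: int = 200) -> list[dict]:
    """Drain a cursor-paginated endpoint into one complete user list.

    `get_page(params)` wraps the real HTTP GET against the Compliance API
    users endpoint and returns the decoded JSON page. Parameter and field
    names (`limit`, `after`, `data`, `has_more`) are illustrative.
    """
    users, cursor = [], None
    while True:
        params = {"limit": page_size}
        if cursor is not None:
            params["after"] = cursor
        page = get_page(params)
        users.extend(page.get("data", []))
        if not page.get("has_more"):
            return users
        # Cursor scheme assumed: next page starts after the last id seen.
        cursor = page["data"][-1]["id"]
```

Injecting the HTTP call as a function keeps the batching logic testable without network access, and makes it easy to add rate-limit pauses in the wrapper.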
The critical enrichment step: for each user, I calculate an oldest_last_active timestamp using pandas — despite the name, it holds the user's most recent conversation activity date. This single field becomes the foundation for all compliance decisions downstream. The output is a structured CSV: one row per user, all the fields Snow needs to make a compliance determination.
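The enrichment reduces to a groupby-max in pandas. The column names below (`user_email`, `last_active_at`) are illustrative, not the connector's actual schema:

```python
import pandas as pd


def latest_activity(conversations: pd.DataFrame) -> pd.Series:
    """Reduce per-conversation rows to one activity timestamp per user.

    Assumes `conversations` has a `user_email` column and a
    `last_active_at` column of parseable timestamps (illustrative names).
    """
    df = conversations.copy()
    df["last_active_at"] = pd.to_datetime(df["last_active_at"], utc=True)
    # One value per user: the most recent activity across all conversations.
    return df.groupby("user_email")["last_active_at"].max()
```

The resulting Series can then be merged back onto the user frame before the CSV export.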
Layer 2 — Orchestrated Data Pipeline
The collector runs on an automated schedule. Each execution produces a timestamped CSV that gets archived for audit trail purposes, then the latest data is transferred to Snow's import staging directory. This archiving turned out to be more important than I initially thought — but that's a story for later.
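A minimal sketch of the archive-then-stage step, with illustrative file and directory names (the real paths are site-specific):

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def archive_and_stage(export_csv: Path, archive_dir: Path, staging_dir: Path) -> Path:
    """Keep a timestamped copy for the audit trail, then publish the
    latest export to Snow's import staging directory.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    staging_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archived = archive_dir / f"chatgpt_users_{stamp}.csv"
    shutil.copy2(export_csv, archived)  # immutable audit copy
    # Fixed filename so Snow's import job always picks up the latest run.
    shutil.copy2(export_csv, staging_dir / "chatgpt_users_latest.csv")
    return archived
```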
Layer 3 — SQL Server Compliance Engine
Inside Snow License Manager's SQL Server database, stored procedures bulk-import the CSV data and apply compliance classification rules. The logic is straightforward:
- Active: User with conversation activity within the last 45 days and an active account status
- Active: Newly created account (within 45 days) with active status — giving new users a grace period before being flagged
- Inactive: Everyone else
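Expressed in Python for illustration (the production rules live in the stored procedures), the classification reduces to:

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=45)


def classify(last_active, created_at, account_status, now=None) -> str:
    """Mirror of the SQL classification rules; names are illustrative.

    `last_active` may be None when a user has no recorded conversations.
    """
    now = now or datetime.now(timezone.utc)
    if account_status != "active":
        return "Inactive"
    if last_active is not None and now - last_active <= WINDOW:
        return "Active"  # recent conversation activity
    if now - created_at <= WINDOW:
        return "Active"  # grace period for newly created accounts
    return "Inactive"
```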
The stored procedures generate a compliance report directly inside Snow's UI — the SAM team can query, filter, and export ChatGPT usage data the same way they handle any other software product.
Six Months in Production
From October 2024 onward, the connector ran reliably. Compliance reports appeared on schedule. The SAM team used the data to identify genuinely inactive accounts and reclaim licenses. Management got the utilization visibility they needed.
It became infrastructure — the kind of system you stop thinking about because it just works. For about six months, that's exactly what happened.
The False-Inactivity Bug (May 2025)
In May 2025, support cases started coming in. Users were reporting that their ChatGPT access had been revoked. The problem: these were active users. They'd been using ChatGPT regularly — having conversations, generating content, relying on it for daily work. Yet the compliance system had flagged them as inactive, and their licenses had been pulled.
I started investigating with two confirmed cases. Both users had clear, recent ChatGPT activity — but the connector's data showed no conversations within the 45-day window. The compliance rules had done exactly what they were designed to do: no recent activity means inactive. The rules were correct; the data was wrong.
I traced the issue to the OpenAI Compliance API itself. Intermittently — not consistently, not predictably — the API would return null or empty conversation data for users who were genuinely active. On the next request, the data might come back fine. But if the scheduled collection run happened to hit one of these transient gaps, the connector would faithfully record "no recent activity," the SQL compliance rules would classify the user as inactive, and the license revocation process would proceed.
Beyond the two initial cases, further investigation identified additional affected users, all showing the same pattern: real activity in ChatGPT, but intermittent API data gaps causing false-inactivity classifications.
The Fix: Retry Logic with Historical Fallback
The solution had two parts.
First, I implemented retry logic for conversation data fetching. Instead of accepting the first API response, the collector now makes multiple attempts with delays between them. If the API returns empty conversation data for a user, it retries before recording a "no activity" result.
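A sketch of that retry wrapper, with `fetch` standing in for the per-user Compliance API call; the attempt count and delay are illustrative defaults:

```python
import time


def fetch_with_retry(fetch, user_id, attempts=3, delay=5.0):
    """Retry conversation fetches that come back empty.

    `fetch(user_id)` wraps the Compliance API call and returns a list of
    conversation records. An empty result is treated as a possible
    transient gap rather than ground truth, so we try again.
    """
    for attempt in range(attempts):
        conversations = fetch(user_id)
        if conversations:  # non-empty response: accept it
            return conversations
        if attempt < attempts - 1:
            time.sleep(delay)  # give the API time to recover
    return []  # still empty after all attempts
```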
Second — and this is where those archived CSVs paid off — I added a historical fallback mechanism. If the API returns no conversations for a user even after retries, the system checks previous export files. If a prior export shows recent activity for that user (via the oldest_last_active timestamp), the historical value is preserved rather than overwritten with null data.
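A pandas sketch of that fallback merge, assuming both the current and prior exports are keyed by email with one row per user (column names are illustrative). A `source` column records where each value came from, so classifications stay traceable:

```python
import pandas as pd


def apply_historical_fallback(current: pd.DataFrame, previous: pd.DataFrame) -> pd.DataFrame:
    """Backfill missing oldest_last_active values from the prior export.

    Rows the live API left empty inherit the previous export's value
    when one exists; everything else is untouched.
    """
    prior = previous.set_index("email")["oldest_last_active"]
    out = current.copy()
    out["source"] = "live_api"
    missing = out["oldest_last_active"].isna()
    backfilled = out["email"].map(prior)
    use_hist = missing & backfilled.notna()
    out.loc[use_hist, "oldest_last_active"] = backfilled[use_hist]
    out.loc[use_hist, "source"] = "historical"
    return out
```

Users with no live data and no history keep their empty value, so genuinely new or inactive accounts are still visible as such downstream.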
I extracted the conversation-handling logic into a dedicated utility module with enhanced logging that tracks API response completeness per user. Now, every compliance classification can be traced back to its data source — live API, retry attempt, or historical fallback.
The result: false-inactivity cases were eliminated without compromising the system's ability to detect genuinely inactive accounts. Users who truly haven't used ChatGPT in 45+ days are still correctly flagged.
Outcomes
The connector has been running in production since October 2024 — over a year of automated ChatGPT compliance reporting inside Snow License Manager. The SAM team has queryable dashboards showing active versus inactive users across the entire workspace, updated on every collection cycle.
The May 2025 diagnostic work caught the false license revocation issue before it scaled beyond the initial affected users. The retry-with-fallback mechanism has prevented any recurrence.
Historical data archiving — originally implemented for audit trail compliance — proved its value as a critical data integrity safeguard. The client's SAM team operates the system independently through Snow's existing UI, with no developer involvement needed for day-to-day compliance reporting.
What I Learned
Don't trust API data blindly. External APIs can return incomplete data intermittently, without errors, without warnings. When your downstream system makes consequential decisions based on that data — like revoking someone's access — you need defensive validation, not just error handling.
Historical baselines are insurance. I kept timestamped CSV archives because audit compliance required it. Six months later, those archives became the fallback mechanism that prevented false license revocations. Build retention into your data pipelines even when you think you only need the latest snapshot.
Compliance connectors need a diagnostic mode. When license decisions depend on your data, you need to be able to explain every single classification. "The API said no activity" isn't good enough when someone's access was revoked incorrectly. Trace every decision back to its source.
The real bugs show up at month six, not day one. The initial connector worked flawlessly for months. The transient API data gap only surfaced when specific timing conditions aligned — a scheduled run coinciding with an API hiccup for specific users. These intermittent, timing-dependent issues are the hardest to detect and the most important to design against.
