May 13, 2026

OpenAI API Key in HTTP Response Headers: Found in 7 Minutes

Viktor Bulanek
Founder & CTO, Penetrify
MSc IT Security · 20+ years in security · 4x Ex-CTO

The OpenAI billing dashboard was showing charges the founder couldn't explain. The application had roughly 800 users on a freemium model, but the API usage was trending toward $2,000 a month — far higher than the user activity on the platform justified. The founder assumed they had an inefficient prompt somewhere and made a note to investigate.

Seven minutes into a Penetrify scan, the reason became clear: the OpenAI API key was being passed back to users in the HTTP response headers of every proxied API call. 800 users had seen it. Some of them were using it.


The Architecture: Where the Leak Came From

The application was a FastAPI backend serving a React frontend. Its core functionality was to proxy user requests to OpenAI's API, adding custom system prompts, storing conversation history, and applying the founder's proprietary prompt engineering layer. This is a common pattern for AI wrapper products — the value isn't the model, it's the product built around it.

The way the application worked:

  1. User sends a prompt from the React frontend
  2. Frontend sends it to POST /api/generate
  3. FastAPI handler adds the system prompt and calls OpenAI's API
  4. FastAPI returns the completion to the frontend

Somewhere in the FastAPI route implementation, the Authorization header from the outbound OpenAI request — containing the API key in Bearer token format — was being forwarded back in the response. This is a specific class of header forwarding bug: the application was passing through response headers from the upstream OpenAI API call rather than constructing its own response headers.

The response headers on every /api/generate call included:

HTTP/1.1 200 OK
Content-Type: application/json
Authorization: Bearer sk-proj-...[OpenAI API key]
...
{"completion": "..."}

Every user who had ever used the generate feature — all 800 of them — had received the API key in the response headers of their requests. It was visible in the browser's DevTools Network tab, in any HTTP proxy, and in any programmatic client that read response headers.


What an OpenAI API Key Gives You

An OpenAI API key with no usage restrictions gives the holder full access to the corresponding account's API quota. This means:

  • Unlimited model access at the key owner's expense — GPT-4o, o1, o3, image generation, embeddings, fine-tuning
  • No per-request cap until the account's monthly spending limit is reached
  • Access to any fine-tuned models the account has created
  • Ability to read stored files if the account uses the Files API

For an individual founder whose application is processing $200–400/month in legitimate usage, having their key abused externally can push the monthly bill to $2,000, $5,000, or more — depending on how widely the key circulates and what the abusers are generating.

The cost model for OpenAI API abuse is asymmetric: the attacker pays nothing, the key owner pays for everything.


The Unexplained Billing Spikes, Explained

Once the key exposure was identified, the billing spikes made sense. The founder pulled the OpenAI usage dashboard filtered by endpoint and time. The spike pattern showed high-volume requests that didn't correlate with user activity on the platform — requests at 3am, requests from IP ranges that didn't match any known user geography, requests for model types the application didn't use.

Someone had extracted the key — possibly several people — and was using it directly against the OpenAI API with the stolen credentials, bypassing the founder's application entirely.

The key had been exposed since approximately the first week of the application's public launch. By the time of the scan, it had been live and leaking for several months.


The Other Findings

The OpenAI key exposure was the most immediately damaging finding, but three additional issues were reported:

MEDIUM — IDOR on /api/history/:userId

The application stored conversation history per user and exposed it at a predictable endpoint:

GET /api/history/abc123

The route handler fetched conversation history for the user ID in the path parameter without checking whether the requesting user owned those records. Any authenticated user could read any other user's conversation history by substituting their ID. Since the conversations included user-supplied prompts, this was also a privacy exposure: an attacker could read what questions other users had been asking the AI tool.
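The missing check can be sketched in a few lines of plain Python. The record and exception names here are illustrative, not the real handler's:

```python
# Hypothetical sketch of the ownership check the vulnerable handler skipped.

class Forbidden(Exception):
    pass

def fetch_history(records: dict, requested_user_id: str, requesting_user_id: str):
    """Return conversation history only when the requester owns it."""
    if requested_user_id != requesting_user_id:
        # The vulnerable route trusted the path parameter as-is and
        # never performed this comparison.
        raise Forbidden("cannot read another user's history")
    return records.get(requested_user_id, [])
```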

MEDIUM — FastAPI debug mode enabled in production

The application was running with FastAPI(debug=True). In debug mode, any unhandled exception returns a full stack trace in the HTTP response, including internal file paths, dependency versions, and environment variable names (though not values). This information is directly useful for planning further attacks — knowing the exact FastAPI version, Pydantic version, and Python version narrows the list of applicable CVEs significantly.

Separately, FastAPI serves its interactive documentation at /docs and /redoc by default regardless of debug mode. In this deployment it was accessible in production and documented every internal API endpoint, including those not intended for user access.

LOW — HTTP not redirecting to HTTPS

The HTTP version of the application served full content without redirecting to HTTPS. On public or shared networks, an attacker performing a man-in-the-middle attack could intercept unencrypted sessions and extract session tokens, user-submitted prompts, and API responses.
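One common remediation in this stack is Starlette's HTTPSRedirectMiddleware, re-exported by FastAPI. This is a sketch, assuming a standard FastAPI app; a TLS-terminating reverse proxy can also handle the redirect:

```python
# Sketch: enforce HTTPS at the application layer.
from fastapi import FastAPI
from fastapi.middleware.httpsredirect import HTTPSRedirectMiddleware

app = FastAPI()
app.add_middleware(HTTPSRedirectMiddleware)  # http:// requests are redirected to https://
```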


The Fix: Deployed the Same Evening

The founder deployed fixes for all findings within three hours of receiving the report.

Rotate the key first

Before touching any code, the immediate action was to revoke the compromised key in the OpenAI dashboard and generate a new one. This instantly cut off any ongoing abuse. OpenAI's key rotation is immediate — the old key stops working the moment you delete it.

Fix the header forwarding bug

The root cause was that the FastAPI route was using a generic HTTP client that forwarded all response headers from the upstream OpenAI call. The fix was to construct explicit response headers rather than passing through upstream ones:

# Before (vulnerable) — forwarding all upstream headers
upstream_response = await client.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {settings.OPENAI_API_KEY}", ...},
    json=payload
)
return Response(
    content=upstream_response.content,
    headers=dict(upstream_response.headers)  # ← this forwards the Authorization header back
)

# After (fixed) — explicit response construction
upstream_response = await client.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {settings.OPENAI_API_KEY}", ...},
    json=payload
)
completion_data = upstream_response.json()
return JSONResponse(content={"completion": completion_data["choices"][0]["message"]["content"]})
# Only the data we explicitly want to return — no upstream headers forwarded

Fix the IDOR

The conversation history endpoint was updated to extract the user ID from the verified JWT rather than from the path parameter:

@router.get("/api/history")
async def get_history(current_user: User = Depends(get_current_user)):
    # User ID comes from the verified JWT — can't be spoofed
    history = await db.get_history(user_id=current_user.id)
    return history
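The get_current_user dependency is not shown in the report. Its typical job is to verify the token's signature before trusting anything inside it; real code would use a JWT library such as PyJWT, but the principle can be illustrated with a stdlib HMAC signature as a stand-in:

```python
# Hypothetical sketch of the verification behind get_current_user.
# A real implementation would use a JWT library (e.g. PyJWT); this
# stdlib HMAC check only illustrates the principle: the user ID comes
# from a token whose signature the server has verified.
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # illustrative; load from config in practice

def sign(payload: dict) -> str:
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify(token: str) -> dict:
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid token signature")
    return json.loads(base64.urlsafe_b64decode(body))
```

Because the user ID comes out of a verified token rather than the request path, a client substituting someone else's ID has nothing to substitute it into.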

Disable debug mode

# In config.py
app = FastAPI(
    debug=settings.DEBUG,                          # read from an environment variable
    docs_url="/docs" if settings.DEBUG else None,  # hide interactive docs in production
    redoc_url="/redoc" if settings.DEBUG else None
)

With DEBUG=false set in the production environment, the interactive docs and verbose error responses disappeared immediately on the next deployment.
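The settings.DEBUG flag can come from any config layer; pydantic's BaseSettings is common in FastAPI projects, but a stdlib sketch (names are assumptions, not the real app's config) shows the one subtlety worth getting right, which is that environment variables are strings:

```python
# Minimal sketch of an environment-driven DEBUG flag (illustrative names).
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Parse common truthy strings so DEBUG=false really disables debug mode."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}

DEBUG = env_flag("DEBUG")
```

Without the parsing step, bool("false") is True, and a "disabled" debug flag silently stays on.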


Adding OpenAI Usage Limits as a Safety Net

Beyond fixing the leak, the founder added two defensive measures to limit blast radius from any future key exposure:

Usage limits: In the OpenAI dashboard under Billing → Usage limits, set a monthly hard limit and a soft notification threshold. Even if a key is compromised again, the attacker's ability to run up charges is capped.

Dedicated keys per service: Create a separate API key for each application or environment. If a key is compromised, you can rotate just that key without disrupting other services, and the usage logs for each key are cleanly separated — making unauthorized access much easier to detect.


How Common Is This?

API key exposure in HTTP responses is less common than exposure in JavaScript bundles, but we see it regularly in AI wrapper applications specifically. The pattern almost always has the same root cause: a developer building a proxy layer uses a generic HTTP client that forwards response headers, and they don't audit what those headers contain.

The header forwarding mistake is easy to make because it often simplifies the implementation. Why construct a new response when you can forward the upstream one? The answer, in this case, is that the upstream response contains credentials you don't want to share with your users.

If your application proxies calls to OpenAI, Anthropic, or any other external API, audit your response headers explicitly. Use a tool like curl -v or your browser's DevTools to look at every header returned by every API endpoint. Headers are easy to overlook precisely because most of the time they're uninteresting — which is what makes them such an effective hiding spot for a leak.
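A header audit can also be scripted. This sketch assumes a hypothetical endpoint list; the core is a denylist of header names that should never reach a client:

```python
# Sketch of a response-header audit. The endpoint in the usage comment is
# hypothetical; the point is to flag credential-bearing headers.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "proxy-authorization"}

def flag_sensitive(headers: dict) -> list[str]:
    """Return the names of response headers that should never reach a client."""
    return sorted(h for h in headers if h.lower() in SENSITIVE_HEADERS)

# With urllib from the stdlib:
#   import urllib.request
#   resp = urllib.request.urlopen("https://app.example.com/api/generate")
#   leaks = flag_sensitive(dict(resp.headers))
#   if leaks:
#       print("leaking:", leaks)
```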


The YC Application Context

The founder was preparing a YC application at the time of the scan. The combination of unexplained billing spikes, an exposed API key, and an IDOR vulnerability affecting all users' conversation history would have been a significant problem to explain to investors — or, worse, to discover after funding.

Security issues at the pre-launch or early traction stage are fixable in hours. The same issues discovered after a security incident, a data breach notification, or a hostile media story take months to recover from and can end a company that hasn't yet built the goodwill to survive the news cycle.

The founder ran Penetrify again before submitting the YC application. The report came back clean.

Frequently Asked Questions

What types of vulnerabilities does Penetrify detect?

Penetrify detects every OWASP Top 10 vulnerability category, including SQL injection, XSS, CSRF, IDOR, broken authentication, security misconfigurations, and sensitive data exposure. It also tests API security, session management, and common misconfigurations in Supabase, Firebase, and Bubble.

How long does an AI penetration test take?

A quick scan completes in 15–30 minutes. A standard scan takes 1–2 hours with broader coverage. A deep scan can take several hours on complex applications.

What does a Penetrify report include?

Every report includes an executive summary, an overall security score, findings classified by severity (Critical, High, Medium, Low), detailed reproduction steps, and concrete remediation guidance written for developers, not compliance officers.
