Revise visual telemetry spec to use Cloudflare asset analytics

Drew Ritter
2026-06-15 16:09:34 -07:00
parent dcf7e2a092
commit 66cc0045be

@@ -1,236 +1,154 @@
# Visual Brainstorming Launch Telemetry
# Visual Brainstorming Logo Telemetry
**Date:** 2026-06-15
**Status:** Draft for Drew review
**Linear:** PRI-2231
**Scope:** `skills/brainstorming/scripts/`, Superpowers docs, Terminus-owned Cloudflare Worker, Terminus telemetry collector
**Scope:** `skills/brainstorming/scripts/`, Superpowers docs, `prime-radiant-inc.github.io` brand asset, Cloudflare zone analytics
## Problem
Jesse wants to understand whether the Superpowers visual brainstorming companion is being launched in real use. The visible product affordance can say `Superpowers v<version> by Prime Radiant`, but the measurement should not be a logo view or static asset hit.
The useful event is:
```text
event=visual_brainstorming_launched
```
This event means a Superpowers Visual Companion session reached a browser. GitHub Pages can host static brand assets, but it cannot reliably capture this semantic launch event. The Brainstorm/Brooks product should not own the collector either: Brainstorm is a separate prototype, while Superpowers needs its own lightweight telemetry path.
## Goals
- Add default-on, env-var-only opt-out launch telemetry for the visual companion.
- Record one semantic launch event per companion launch under normal operation.
- Include Superpowers version, country, capped user agent, timestamp, launch ID, and Cloudflare ray ID.
- Keep the visual companion usable when telemetry infrastructure is down.
- Send v0 data to Loki through Terminus-owned infrastructure.
- Make abuse obvious and bounded without pretending a public OSS client can prove authentic human usage.
## Non-Goals
- Do not measure logo impressions, frame reloads, or every page load as product events.
- Do not load a remote logo in v0. Text branding is enough for the launch telemetry work.
- Do not add DynamoDB or pre-write deduplication in v0.
- Do not route Superpowers usage telemetry through the Brainstorm application.
- Do not add OpenPanel ingestion in v0. Dashboards can be added after Loki events are clean.
- Do not collect project paths, prompt text, user IDs, or browser interaction contents.
## Definitions
**Visual Companion:** The local browser display attached to the `brainstorming` skill. It is used for visual questions such as mockups, diagrams, layout comparisons, and spatial choices. The terminal remains the primary conversation channel.
**Launch:** A local visual companion server process starts and at least one browser page loads the companion helper. This is the moment worth counting.
**Launch ID:** An ephemeral random ID generated by the local visual companion server for one server process. It is not stable across sessions, machines, or installs.
## Proposed Architecture
```text
Superpowers browser helper
-> Cloudflare Worker at t.primeradiant.com
-> API Gateway HTTP API
-> VPC Lambda collector
-> Loki on monitoring.terminus.internal
```
The browser does not post directly to Terminus monitoring because Loki is intentionally private to the Terminus network. Cloudflare receives the public request, enriches it with edge metadata, signs a compact payload, and forwards it to the Terminus collector. The collector validates that the event came from the Worker before writing to Loki.
## Superpowers Changes
`server.cjs` should read the Superpowers version from package metadata at startup and generate a launch ID for the server process. It should expose a small telemetry config to `frame-template.html` and injected full-document helper pages unless `SUPERPOWERS_DISABLE_TELEMETRY=1`.
The config should include:
```json
{
"enabled": true,
"endpoint": "https://t.primeradiant.com/superpowers/visual-brainstorming/launch",
"event": "visual_brainstorming_launched",
"surface": "brainstorming.visual_companion",
"superpowersVersion": "5.1.0",
"launchId": "ephemeral-random-id"
}
```
`helper.js` should send one best-effort GET request after the browser helper initializes:
```text
https://t.primeradiant.com/superpowers/visual-brainstorming/launch?event=visual_brainstorming_launched&surface=brainstorming.visual_companion&v=<version>&launch_id=<launchId>
```
The helper should use a source-side once-per-launch guard. The primary guard can be a `localStorage` key derived from the launch ID, with an in-memory fallback if storage is unavailable. This avoids intentionally firing on every frame reload. Rare duplicates from retries, private browsing behavior, or multiple browsers are acceptable and should be handled in Loki queries if they matter.
Telemetry failure must be silent. The helper should use a short best-effort request path such as `fetch(..., { method: "GET", mode: "no-cors", credentials: "omit", cache: "no-store", keepalive: true })`, ignore the response, and never affect rendering or WebSocket behavior.
The visible frame should include local branding:
Jesse wants a very small way to see whether the Superpowers visual brainstorming companion is being used. The desired visible affordance is:
```text
Superpowers v<version> by Prime Radiant
```
Local branding should still render when telemetry is disabled. The opt-out only suppresses the remote request.
with a Prime Radiant logo loaded from the main website. The telemetry mechanism should be the normal website request itself: proxy `primeradiant.com` through Cloudflare, load a versioned image URL from the visual companion, and inspect Cloudflare traffic data for that asset path.
## Cloudflare Worker
This replaces the earlier collector design. v0 should not include a Worker, HMAC collector, Loki ingestion, OpenPanel ingestion, `event` query parameter, or launch ID.
The Worker owns the public telemetry endpoint, for example:
## Goals
- Add default-on, env-var-only opt-out logo loading in the visual companion.
- Host the logo on the main `primeradiant.com` website.
- Add only a dynamic `v=<superpowers-version>` query parameter.
- Use Cloudflare's normal proxied-zone traffic data as the analytics surface.
- Keep Superpowers behavior local and reliable if the image fails to load.
- Avoid new Prime Radiant services for v0.
## Non-Goals
- Do not build a custom telemetry collector in v0.
- Do not send events to Terminus, Loki, Grafana, or OpenPanel in v0.
- Do not add `event`, `surface`, `launch_id`, user ID, project path, prompt text, or browser interaction data to the request.
- Do not try to deduplicate per launch.
- Do not route this through the Brainstorm/Brooks app.
## Definitions
**Visual Companion:** The local browser display attached to the `brainstorming` skill. It is used for visual questions such as mockups, diagrams, layout comparisons, and spatial choices. The terminal remains the primary conversation channel.
**Telemetry asset:** A distinct static image URL on `primeradiant.com` that appears only in the Superpowers visual companion. Requests for this path are the v0 usage signal.
## Proposed Architecture
```text
https://t.primeradiant.com/superpowers/visual-brainstorming/launch
Superpowers visual companion
-> https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png?v=<version>
-> Cloudflare proxied zone analytics
-> GitHub Pages origin serves the static image
```
The Worker should:
Cloudflare sits in front of `primeradiant.com`. Superpowers loads the static logo from the main website. The Cloudflare dashboard becomes the place to inspect traffic by path, query string, country, user agent, and any other HTTP analytics dimensions available on the current Cloudflare plan.
- Accept only the exact launch path.
- Record events only for GET requests. HEAD may return a write-free health response.
- Reject requests with a body.
- Cap and validate all query parameters.
- Require `event=visual_brainstorming_launched`.
- Require `surface=brainstorming.visual_companion`.
- Validate `v` as a bounded version-like string.
- Validate `launch_id` as a bounded opaque ID.
- Read country from Cloudflare edge metadata.
- Read user agent from the request header and cap its stored length.
- Include Cloudflare ray ID when available.
- Use a short timeout when forwarding to the collector.
- Return a small `204 No Content` response with `Cache-Control: no-store`.
## Superpowers Changes
The Worker signs the collector payload with HMAC. A concrete v1 signature format:
`server.cjs` should read the Superpowers version from package metadata at startup. The value must be dynamic; do not hard-code `5.1.0` in the scripts.
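A minimal sketch of the dynamic version lookup, assuming the package metadata is a `package.json` with a `version` field. The helper name `readSuperpowersVersion` and the choice to pass the path in are illustrative assumptions; the spec does not say where the metadata lives relative to `server.cjs`.

```javascript
const fs = require('fs');

// Hypothetical helper: the location of the package metadata is not specified
// in this spec, so the caller supplies the path.
function readSuperpowersVersion(packageJsonPath) {
  const pkg = JSON.parse(fs.readFileSync(packageJsonPath, 'utf8'));
  // Fail loudly at startup rather than fall back to a hard-coded version.
  if (typeof pkg.version !== 'string' || pkg.version.length === 0) {
    throw new Error(`No version field in ${packageJsonPath}`);
  }
  return pkg.version;
}
```

Throwing on a missing version is one possible choice. A real implementation might prefer to render the branding without a version, so that a packaging problem never blocks the companion.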
The shared frame should render local text branding plus the remote logo when telemetry is enabled:
```text
X-Superpowers-Timestamp: <unix-seconds>
X-Superpowers-Signature: v1=<hex hmac sha256 over "<timestamp>.<raw-json-body>">
Superpowers v<version> by Prime Radiant
```
The HMAC secret should be configured as a Cloudflare Worker secret. The same secret should be stored in AWS Secrets Manager for the Terminus collector, with Terraform wiring the Lambda to read and cache the secret on cold start. Rotation can be manual for v0.
## Terminus Collector
The collector and Worker should live outside the Brainstorm app. The implementation should use this Terminus-owned placement:
The remote image URL should be:
```text
terminus/services/superpowers-telemetry-worker
terminus/services/superpowers-telemetry-collector
terminus/terraform/superpowers-telemetry
https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png?v=<superpowers-version>
```
The public AWS surface should be an API Gateway HTTP API that forwards to a Lambda running in the Terminus VPC. The Lambda needs network access to:
Use only the `v` query parameter. Do not include `event`, `surface`, `launch_id`, or any local session identifier.
When `SUPERPOWERS_DISABLE_TELEMETRY=1`, Superpowers should omit the remote image entirely while preserving the local text branding and the rest of the visual companion UI.
Image failure must be cosmetic. The companion should keep working if `primeradiant.com` is blocked, offline, slow, or returns a non-image response.
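The URL and opt-out rules above could be sketched as follows. The names `LOGO_BASE_URL`, `buildLogoUrl`, and `brandingText` are hypothetical, not taken from the Superpowers scripts. The sketch assumes the version string has already been read from package metadata.

```javascript
// Sketch only: function and constant names are illustrative assumptions.
const LOGO_BASE_URL =
  'https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png';

// Returns the versioned logo URL, or null when telemetry is opted out.
function buildLogoUrl(version, env = process.env) {
  // Opt-out: omit the remote image entirely. Local text branding still renders.
  if (env.SUPERPOWERS_DISABLE_TELEMETRY === '1') return null;

  const url = new URL(LOGO_BASE_URL);
  // `v` is the only query parameter: no event, surface, or launch_id.
  url.searchParams.set('v', version);
  return url.toString();
}

// Local text branding, rendered whether or not the remote logo is loaded.
function brandingText(version) {
  return `Superpowers v${version} by Prime Radiant`;
}
```

With these assumptions, the frame template would emit an image element only when `buildLogoUrl` returns a non-null value, and would render `brandingText` unconditionally.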
## Website Changes
Add a distinct static logo file to the Prime Radiant website:
```text
http://monitoring.terminus.internal:3100/loki/api/v1/push
prime-radiant-inc.github.io/static/brand/superpowers-visual-brainstorming-logo.png
```
The Lambda should reject before writing to Loki when:
It can be the same pixels as an existing Prime Radiant logo, but the path should be unique to this integration so Cloudflare filtering is simple and does not mix normal website branding traffic with Superpowers visual companion traffic.
- The signature is missing or invalid.
- The timestamp is outside a short freshness window, such as five minutes.
- The body is too large.
- The event name or surface is not the expected v0 value.
- Required fields are missing or malformed.
Production `primeradiant.com` should be proxied through Cloudflare so Cloudflare sees requests for this path even when GitHub Pages remains the origin.
The collector should not attempt to identify users. It should write one compact JSON log line per valid launch event.
## Cloudflare Analytics
## Loki Shape
The v0 reporting workflow is manual dashboard inspection:
Use low-cardinality labels:
- Filter HTTP traffic analytics to `primeradiant.com`.
- Filter by path `/brand/superpowers-visual-brainstorming-logo.png`.
- Break down or filter by query string `v=<version>` when available.
- Use Cloudflare-provided country and user-agent dimensions when needed.
```json
{
"app": "superpowers-telemetry",
"event": "visual_brainstorming_launched",
"env": "prod"
}
No logs need to be exported to Terminus for v0.
The exact analytics dimensions available depend on the Cloudflare plan. If the current plan cannot filter by query string, version can still be encoded in the path later, for example:
```text
/brand/superpowers/5.1.0/visual-brainstorming-logo.png
```
Put higher-cardinality values in the JSON body:
Do not make that change unless the dashboard cannot answer version questions with the query string.
```json
{
"event": "visual_brainstorming_launched",
"surface": "brainstorming.visual_companion",
"superpowersVersion": "5.1.0",
"launchId": "ephemeral-random-id",
"country": "US",
"userAgent": "Mozilla/5.0 ...",
"cfRay": "abc123",
"timestamp": "2026-06-15T22:00:00.000Z"
}
```
## Counting Semantics
Initial dashboard queries should count events directly. If duplicate analysis is needed, query by `launchId` in the JSON body rather than changing the write path.
This design measures requests for the versioned visual companion logo asset. It is a practical proxy for visual companion usage, not a perfect semantic launch event.
## Abuse and Cost Controls
Because there is no launch ID, repeated launches of the same Superpowers version may be undercounted if the browser serves the image from its local cache. Cloudflare will only see requests that reach Cloudflare. If that undercount matters, configure a Cloudflare rule for this exact path to reduce browser caching, or add response headers on the website if the hosting setup later supports per-file headers.
Cloudflare should rate-limit the public telemetry path by client IP and path. The Worker should perform cheap validation before any collector call and emit at most one collector request for one accepted event request.
API Gateway should use stage throttling. Lambda should use reserved concurrency to cap worst-case spend. The Lambda should validate HMAC and timestamp before parsing more deeply or writing to Loki. Loki payloads should remain tiny and labels should remain low-cardinality.
This design does not prove that every accepted event came from an authentic Superpowers user. The client is public and open source, so anyone can copy the endpoint format. The security goal is narrower: prevent casual spoofing from bypassing the Worker, bound unauthenticated public traffic, and preserve useful honest-usage telemetry.
Cloudflare edge caching is acceptable: Cloudflare can still count edge requests while serving the cached asset. The important thing is avoiding long browser-local caching if we care about repeated launches from the same browser and version.
## Privacy and Documentation
Superpowers docs should disclose:
- Visual companion launch telemetry is default-on.
- The visual companion loads a Prime Radiant logo from `primeradiant.com` by default.
- The URL includes the Superpowers version as `v=<version>`.
- Opt out with `SUPERPOWERS_DISABLE_TELEMETRY=1`.
- Collected fields are event name, surface, Superpowers version, country, capped user agent, timestamp, Cloudflare ray ID, and ephemeral launch ID.
- No prompts, project paths, file contents, user IDs, or browser interaction contents are sent.
- Telemetry failure does not affect the visual companion.
- No prompts, project paths, file contents, user IDs, session IDs, or browser interaction contents are sent by Superpowers.
- Cloudflare may observe normal HTTP request metadata for the image request, such as IP-derived country, user agent, timestamp, path, query string, and request status.
- Image load failure does not affect the visual companion.
## Testing
Superpowers tests should cover:
- Version and launch ID are injected into the helper config.
- `SUPERPOWERS_DISABLE_TELEMETRY=1` suppresses telemetry config.
- Full-document pages and framed fragment pages both receive the same helper behavior.
- Local branding renders even when telemetry is disabled.
- The helper's once-per-launch guard avoids repeated sends for the same launch ID.
- The version is read from package metadata and injected into the image URL.
- The generated URL contains exactly one query parameter: `v=<version>`.
- The generated URL does not contain `event`, `surface`, `launch_id`, or local file/project data.
- `SUPERPOWERS_DISABLE_TELEMETRY=1` suppresses the remote image.
- Local text branding renders even when telemetry is disabled.
- Full-document pages and framed fragment pages do not regress.
Collector tests should cover:
Website verification should cover:
- Valid Worker-signed payload writes the expected Loki entry.
- Invalid signature, stale timestamp, wrong event, oversized body, and malformed fields are rejected.
- Loki write failures return an error to the Worker but do not expose secrets.
Worker tests should cover:
- Exact route validation.
- Query parameter validation and caps.
- HMAC signing canonicalization.
- Country, user agent, and ray ID enrichment.
- No collector write on HEAD health requests.
- The static asset exists at `https://primeradiant.com/brand/superpowers-visual-brainstorming-logo.png`.
- The DNS record for `primeradiant.com` is proxied through Cloudflare.
- A request with `?v=<current-version>` appears in Cloudflare analytics for the expected path.
## Rollout
1. Land the Superpowers UI and docs changes behind the default-on telemetry config.
2. Deploy the Cloudflare Worker and Terminus collector.
3. Smoke test from a local visual companion launch with telemetry enabled.
4. Smoke test `SUPERPOWERS_DISABLE_TELEMETRY=1`.
5. Confirm Loki events are queryable by `event`, version, country, and launch ID.
6. Add Grafana/OpenPanel views only after the Loki stream is stable.
1. Add the dedicated static logo asset to the Prime Radiant website.
2. Put `primeradiant.com` behind Cloudflare proxying if it is not already proxied.
3. Update Superpowers visual companion branding to render the dynamic version and remote logo URL.
4. Document `SUPERPOWERS_DISABLE_TELEMETRY=1`.
5. Smoke test with telemetry enabled and disabled.
6. Confirm the versioned logo request appears in Cloudflare analytics.
## Decisions and Future Work
## Future Work
- Use `t.primeradiant.com` for the telemetry endpoint unless DNS availability blocks deployment.
- Keep Worker ownership with the Terminus telemetry collector. The `prime-radiant-inc.github.io` repo can continue hosting public brand assets, but it should not own telemetry code.
- A remote Prime Radiant logo can be added later as a cosmetic UI change. If it is added, it should load from the static brand asset URL and must not define or fire telemetry.
If this stops being enough, the next step is a narrow Cloudflare Worker endpoint or Logpush pipeline. Do not build that until the dashboard-only workflow fails to answer a real question.