feat: 004/phase 1

This commit is contained in:
2025-11-14 23:54:50 +08:00
parent 9692219e0f
commit 0d52bae1e8
34 changed files with 1222 additions and 21 deletions

View File

@@ -0,0 +1,34 @@
# Specification Quality Checklist: Modular Proxy & Cache Segmentation
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2025-11-14
**Feature**: /home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/spec.md
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`

View File

@@ -0,0 +1,99 @@
openapi: 3.0.3
info:
  title: Any-Hub Module Registry API
  version: 0.1.0
  description: |
    Internal diagnostics endpoint exposing registered proxy+cache modules and per-hub bindings.
servers:
  - url: http://localhost:3000
paths:
  /-/modules:
    get:
      summary: List registered modules and hub bindings
      tags: [modules]
      responses:
        '200':
          description: Module summary
          content:
            application/json:
              schema:
                type: object
                properties:
                  modules:
                    type: array
                    items:
                      $ref: '#/components/schemas/Module'
                  hubs:
                    type: array
                    items:
                      $ref: '#/components/schemas/HubBinding'
  /-/modules/{key}:
    get:
      summary: Inspect a single module metadata record
      tags: [modules]
      parameters:
        - in: path
          name: key
          schema:
            type: string
          required: true
          description: Module key, e.g., npm-tarball
      responses:
        '200':
          description: Module metadata
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Module'
        '404':
          description: Module not found
components:
  schemas:
    Module:
      type: object
      required: [key, description, migration_state, cache_strategy]
      properties:
        key:
          type: string
        description:
          type: string
        migration_state:
          type: string
          enum: [legacy, beta, ga]
        supported_protocols:
          type: array
          items:
            type: string
        cache_strategy:
          $ref: '#/components/schemas/CacheStrategy'
    CacheStrategy:
      type: object
      properties:
        ttl_seconds:
          type: integer
          minimum: 1
        validation_mode:
          type: string
          enum: [etag, last-modified, never]
        disk_layout:
          type: string
        requires_metadata_file:
          type: boolean
        supports_streaming_write:
          type: boolean
    HubBinding:
      type: object
      required: [hub_name, module_key, domain, port]
      properties:
        hub_name:
          type: string
        module_key:
          type: string
        domain:
          type: string
        port:
          type: integer
        rollout_flag:
          type: string
          enum: [legacy-only, dual, modular]

View File

@@ -0,0 +1,95 @@
# Data Model: Modular Proxy & Cache Segmentation
## Overview
The modular architecture introduces explicit metadata describing which proxy+cache module each hub uses, how modules register themselves, and what cache policies they expose. The underlying storage layout (`StoragePath/<Hub>/<path>.body`) remains unchanged, but new metadata ensures the runtime can resolve modules, enforce compatibility, and migrate legacy hubs incrementally.
## Entities
### 1. HubConfigEntry
- **Source**: `[[Hub]]` blocks in `config.toml` (decoded via `internal/config`).
- **Fields**:
- `Name` *(string, required)* unique per config; used as hub identifier and storage namespace.
- `Domain` *(string, required)* hostname clients access; must be unique per process.
- `Port` *(int, required)* listen port; validated to the range 1-65535.
- `Upstream` *(string, required)* base URL for upstream registry; must be HTTPS or explicitly whitelisted HTTP.
- `Module` *(string, optional, default `"legacy"`)* key resolved through module registry. Validation ensures module exists at load time.
- `CacheTTL`, `Proxy`, and other overrides *(optional)* reuse existing schema; modules may read these via dependency injection.
- **Relationships**:
- `HubConfigEntry.Module` → `ModuleMetadata.Key` (many-to-one).
- **Validation Rules**:
- Missing `Module` implicitly maps to `legacy` to preserve backward compatibility.
- Changing `Module` requires a migration plan; config loader logs module name for observability.
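To make the shape concrete, here is a minimal Go sketch of how this entry might be modeled in `internal/config`; the TOML tags and the `Normalize` helper are illustrative assumptions, not the final schema.

```go
package config

import "fmt"

// HubConfigEntry mirrors one [[Hub]] block in config.toml.
// The field set follows the data model above; TOML tags are illustrative.
type HubConfigEntry struct {
	Name     string `toml:"Name"`
	Domain   string `toml:"Domain"`
	Port     int    `toml:"Port"`
	Upstream string `toml:"Upstream"`
	Module   string `toml:"Module"` // optional; empty means "legacy"
}

// Normalize applies the documented default and basic range checks.
func (h *HubConfigEntry) Normalize() error {
	if h.Module == "" {
		h.Module = "legacy" // backward-compatible default
	}
	if h.Port < 1 || h.Port > 65535 {
		return fmt.Errorf("hub %q: port %d out of range 1-65535", h.Name, h.Port)
	}
	return nil
}
```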
### 2. ModuleMetadata
- **Fields**:
- `Key` *(string, required)* canonical identifier (e.g., `npm-tarball`).
- `Description` *(string)* human-readable summary.
- `SupportedProtocols` *([]string)* e.g., `HTTP`, `HTTPS`, `OCI`.
- `CacheStrategy` *(CacheStrategyProfile)* embedded policy descriptor.
- `MigrationState` *(enum: `legacy`, `beta`, `ga`)* used for rollout dashboards.
- `Factory` *(function)* constructs proxy+cache handlers; not serialized but referenced in registry code.
- **Relationships**:
- One `ModuleMetadata` may serve many hubs via config binding.
### 3. ModuleRegistry
- **Representation**: in-memory map maintained by `internal/hubmodule/registry.go` at process boot.
- **Fields**:
- `Modules` *(map[string]ModuleMetadata)* keyed by `ModuleMetadata.Key`.
- `DefaultKey` *(string)* `legacy`.
- **Behavior**:
- `Register(meta ModuleMetadata)` called during init of each module package.
- `Resolve(key string) (ModuleMetadata, error)` used by router bootstrap; errors bubble to config validation.
- **Constraints**:
- Duplicate registrations fail fast.
- Registry must export a list function for diagnostics (`List()`), enabling observability endpoints if needed.
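A minimal sketch of the registry behavior described above; the struct layout and error wording are assumptions, but the duplicate-key, default-resolution, and `List()` rules follow this data model.

```go
package hubmodule

import (
	"fmt"
	"sort"
)

// ModuleMetadata is trimmed here to the fields the registry itself needs.
type ModuleMetadata struct {
	Key            string
	Description    string
	MigrationState string // legacy | beta | ga
}

// Registry keeps all registered modules in memory; populated during package init.
type Registry struct {
	modules    map[string]ModuleMetadata
	defaultKey string
}

func NewRegistry() *Registry {
	return &Registry{modules: map[string]ModuleMetadata{}, defaultKey: "legacy"}
}

// Register fails fast on duplicate keys, as required by the constraints above.
func (r *Registry) Register(meta ModuleMetadata) error {
	if _, exists := r.modules[meta.Key]; exists {
		return fmt.Errorf("hubmodule: duplicate module key %q", meta.Key)
	}
	r.modules[meta.Key] = meta
	return nil
}

// Resolve maps a config key to metadata; an empty key falls back to the default.
func (r *Registry) Resolve(key string) (ModuleMetadata, error) {
	if key == "" {
		key = r.defaultKey
	}
	meta, ok := r.modules[key]
	if !ok {
		return ModuleMetadata{}, fmt.Errorf("hubmodule: unknown module %q", key)
	}
	return meta, nil
}

// List returns a stable view for diagnostics endpoints.
func (r *Registry) List() []ModuleMetadata {
	out := make([]ModuleMetadata, 0, len(r.modules))
	for _, m := range r.modules {
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}
```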
### 4. CacheStrategyProfile
- **Fields**:
- `TTL` *(duration)* default TTL per module; hubs may override via config.
- `ValidationMode` *(enum: `etag`, `last-modified`, `never`)* defines revalidation behavior.
- `DiskLayout` *(string)* description of path mapping rules (default `.body` suffix).
- `RequiresMetadataFile` *(bool)* whether `.meta` entries are required.
- `SupportsStreamingWrite` *(bool)* indicates module can write cache while proxying upstream.
- **Relationships**:
- Owned by `ModuleMetadata`; not independently referenced.
- **Validation**:
- TTL must be positive.
- Modules flagged as `SupportsStreamingWrite=false` must document fallback behavior before registration.
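A hedged sketch of the profile and its validation rules; the `Validate` method name is an assumption.

```go
package hubmodule

import (
	"fmt"
	"time"
)

// CacheStrategyProfile carries the policy knobs described above.
type CacheStrategyProfile struct {
	TTL                    time.Duration
	ValidationMode         string // etag | last-modified | never
	DiskLayout             string // default ".body" suffix layout
	RequiresMetadataFile   bool
	SupportsStreamingWrite bool
}

// Validate enforces the documented rules: positive TTL and a known validation mode.
func (p CacheStrategyProfile) Validate() error {
	if p.TTL <= 0 {
		return fmt.Errorf("cache strategy: TTL must be positive, got %s", p.TTL)
	}
	switch p.ValidationMode {
	case "etag", "last-modified", "never":
	default:
		return fmt.Errorf("cache strategy: unknown validation mode %q", p.ValidationMode)
	}
	return nil
}
```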
### 5. LegacyAdapterState
- **Purpose**: Tracks which hubs still run through the old shared implementation to support progressive migration.
- **Fields**:
- `HubName` *(string)* references `HubConfigEntry.Name`.
- `RolloutFlag` *(enum: `legacy-only`, `dual`, `modular`)* indicates traffic split for that hub.
- `FallbackDeadline` *(timestamp, optional)* when legacy path will be removed.
- **Storage**: In-memory map derived from config + environment flags; optionally surfaced via diagnostics endpoint.
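A small sketch of how this state could be represented in Go; the type and constant names are illustrative only.

```go
package hubmodule

import "time"

// RolloutFlag captures the traffic split for a hub during migration.
type RolloutFlag string

const (
	RolloutLegacyOnly RolloutFlag = "legacy-only"
	RolloutDual       RolloutFlag = "dual"
	RolloutModular    RolloutFlag = "modular"
)

// LegacyAdapterState tracks one hub's migration status; kept in an in-memory map
// keyed by hub name and rebuilt from config plus environment flags at startup.
type LegacyAdapterState struct {
	HubName          string
	RolloutFlag      RolloutFlag
	FallbackDeadline *time.Time // nil when no removal date is scheduled
}
```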
## State Transitions
1. **Module Adoption**
- Start: `HubConfigEntry.Module = "legacy"`.
- Transition: operator edits config to new module key, runs validation.
- Result: registry resolves new module, `LegacyAdapterState` updated to `dual` until rollout flag toggled fully.
2. **Cache Strategy Update**
- Start: Module uses default TTL.
- Transition: hub-level override applied in config.
- Result: Module receives override via dependency injection and persists it in module-local settings without affecting other hubs.
3. **Module Registration Lifecycle**
- Start: module package calls `Register` in its `init()`.
- Transition: duplicate key registration rejected; module must rename key or remove old registration.
- Result: `ModuleRegistry.Modules[key]` available during server bootstrap.
## Data Volume & Scale Assumptions
- Module metadata count is small (<20) and loaded entirely in memory.
- Hub count typically <50 per binary, so per-hub module resolution happens at startup and is cached.
- Disk usage remains the dominant storage cost; metadata adds negligible overhead.
## Identity & Uniqueness Rules
- `HubConfigEntry.Name` and `ModuleMetadata.Key` must each be unique (case-insensitive) within a config/process.
- Module registry rejects duplicate keys to avoid ambiguous bindings.

View File

@@ -0,0 +1,117 @@
# Implementation Plan: Modular Proxy & Cache Segmentation
**Branch**: `004-modular-proxy-cache` | **Date**: 2025-11-14 | **Spec**: /home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/spec.md
**Input**: Feature specification from `/specs/004-modular-proxy-cache/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.
## Summary
Modularize the proxy and cache layers so every hub type (npm, Docker, PyPI, future ecosystems) implements a self-contained module that conforms to shared interfaces, is registered via config, and exposes hub-specific cache strategies while preserving legacy behavior during phased migration. The work introduces a module registry/factory, per-hub configuration for selecting modules, migration tooling, and observability tags so operators can attribute incidents to specific modules.
## Technical Context
**Language/Version**: Go 1.25+ (statically linked, delivered as a single binary)
**Primary Dependencies**: Fiber v3 (HTTP server), Viper (configuration), Logrus + Lumberjack (structured logging & rotation), standard library `net/http`/`io`
**Storage**: Local filesystem cache directory `StoragePath/<Hub>/<path>.body` plus `.meta` metadata (modules must reuse the same layout)
**Testing**: `go test ./...`, using `httptest`, temporary directories, and self-built fake upstream services to verify config/cache/proxy paths
**Target Platform**: Linux/Unix CLI process managed by systemd/supervisor; anonymous downstream clients
**Project Type**: Single Go project (`cmd/` entry point + `internal/*` packages)
**Performance Goals**: Cache hits return directly; upstream fetches must stream, keeping per-request resident memory under 256MB; hit/fetch events must be traceable in logs
**Constraints**: No web UI or account system; all behavior controlled by a single TOML config; each hub requires its own Domain/Port binding; anonymous access only
**Scale/Scope**: Supports proxying multiple registries (Docker/NPM/Go/PyPI, etc.) for weak-network and offline cache-reuse scenarios
**Module Registry Location**: `internal/hubmodule/registry.go` exposes the register/resolve API; module subdirectories live under `internal/hubmodule/<name>/`
**Config Binding for Modules**: The `[[Hub]].Module` field selects the module name, defaulting to `legacy`; config loading validates that the value matches a registered module
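A hedged sketch of the load-time check the `Module` field implies; the function and interface names are assumptions, but the `legacy` default and fail-fast behavior follow this plan.

```go
package config

import "fmt"

// Hub holds only the fields the module check needs (the full schema lives in types.go).
type Hub struct {
	Name   string
	Module string
}

// registryView is the minimal registry surface the loader depends on.
type registryView interface {
	Has(key string) bool
}

// ValidateHubModules applies the "legacy" default and rejects unregistered keys
// at load time, so misconfiguration fails before the server binds any port.
func ValidateHubModules(hubs []Hub, reg registryView) error {
	for i := range hubs {
		if hubs[i].Module == "" {
			hubs[i].Module = "legacy"
		}
		if !reg.Has(hubs[i].Module) {
			return fmt.Errorf("hub %q references unregistered module %q", hubs[i].Name, hubs[i].Module)
		}
	}
	return nil
}
```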
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- The feature remains a "lightweight multi-registry CLI proxy"; it introduces no web UI, account system, or capabilities unrelated to proxying.
- Only Go plus the dependencies named in the constitution are used; any new third-party library is justified and reviewed within this plan.
- Behavior is fully driven by `config.toml`; the new `[[Hub]].Module` option has a planned default value, validation, and migration strategy.
- The design keeps the cache-first + streaming upstream path and specifies logging/observability for hits, upstream fetches, and failures.
- The plan lists the mandatory test coverage for config parsing, cache read/write, and Host Header routing, plus the Chinese-comment delivery scope.
**Gate Status**: ✅ All pre-research gates satisfied; no violations logged in Complexity Tracking.
## Project Structure
### Documentation (this feature)
```text
specs/[###-feature]/
├── plan.md # This file (/speckit.plan command output)
├── research.md # Phase 0 output (/speckit.plan command)
├── data-model.md # Phase 1 output (/speckit.plan command)
├── quickstart.md # Phase 1 output (/speckit.plan command)
├── contracts/ # Phase 1 output (/speckit.plan command)
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
```text
cmd/any-hub/main.go # CLI entry point, argument parsing
internal/config/ # TOML loading, defaults, validation
internal/server/ # Fiber server, routing, middleware
internal/cache/ # disk/memory cache and .meta management
internal/proxy/ # upstream access, cache strategy, streaming copy
configs/ # sample config.toml (if needed)
tests/ # unit/integration tests under `go test`, using temp directories
```
**Structure Decision**: Keep the single Go project structure; feature code goes into the existing directories above. Any new package or directory must explain its relationship to `internal/*` and include a follow-up maintenance strategy.
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |
## Phase 0 Research
### Unknowns & Tasks
- **Module registry location** → researched Go package placement that keeps modules isolated yet internal.
- **Config binding for modules** → determined safest schema extension and defaults.
- **Dependency best practices** → confirmed singletons for Fiber/Viper/Logrus and storage layout compatibility.
- **Testing harness expectations** → documented shared approach for new modules.
### Output Artifact
- `/home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/research.md` summarizes each decision with rationale and alternatives.
### Impact on Plan
- Technical Context now references concrete package paths and configuration fields.
- Implementation will add `internal/hubmodule/` with registry helpers plus validation wiring in `internal/config`.
## Phase 1 Design & Contracts
### Data Model
- `/home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/data-model.md` defines HubConfigEntry, ModuleMetadata, ModuleRegistry, CacheStrategyProfile, and LegacyAdapterState including validation and state transitions.
### API Contracts
- `/home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/contracts/module-registry.openapi.yaml` introduces a diagnostics API (`GET /-/modules`, `GET /-/modules/{key}`) for observability around module registrations and hub bindings.
### Quickstart Guidance
- `/home/rogee/Projects/any-hub/specs/004-modular-proxy-cache/quickstart.md` walks engineers through adding a module, wiring config, running tests, and verifying logs/storage.
### Agent Context Update
- `.specify/scripts/bash/update-agent-context.sh codex` executed to sync AGENTS.md with Go/Fiber/Viper/logging/storage context relevant to this feature.
### Post-Design Constitution Check
- New diagnostics endpoint remains internal and optional; no UI/login introduced. ✅ Principle I
- Code still single Go binary with existing dependency set. ✅ Principle II
- `Module` field documented with defaults, validation, and migration path; no extra config sources. ✅ Principle III
- Cache strategy enforces `.body` layout and streaming flow, with telemetry requirements captured in contracts. ✅ Principle IV
- Logs/quickstart/test guidance ensure observability and Chinese documentation continue. ✅ Principle V
## Phase 2 Implementation Outlook (pre-tasks)
1. **Module Registry & Interfaces**: Create `internal/hubmodule` package, define shared interfaces, implement registry with tests, and expose diagnostics data source reused by HTTP endpoints.
2. **Config Loader & Validation**: Extend `internal/config/types.go` and `validation.go` to include `Module` with default `legacy`, plus wiring to registry resolution during startup.
3. **Legacy Adapter & Migration Switches**: Provide adapter module that wraps current shared proxy/cache, plus feature flags or config toggles to control rollout states per hub.
4. **Module Implementations**: Carve existing npm/docker/pypi logic into dedicated modules within `internal/hubmodule/`, ensuring cache writer uses `.body` layout and telemetry tags.
5. **Observability/Diagnostics**: Implement the `/-/modules` endpoint (Fiber route) and log tags showing `module_key` on cache/proxy events (a hedged handler sketch follows this list).
6. **Testing**: Add shared test harness for modules, update integration tests to cover mixed legacy + modular hubs, and document commands in README/quickstart.
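Referenced from item 5 above, a hedged sketch of the diagnostics route. The Fiber v3 handler signature and the exact wiring should be checked against the version pinned in `go.mod`; the JSON field names mirror the OpenAPI contract, and everything else is an assumption.

```go
package routes

import (
	"github.com/gofiber/fiber/v3"
)

// ModuleInfo and HubBindingInfo mirror the diagnostics contract in
// contracts/module-registry.openapi.yaml (JSON field names follow that file).
type ModuleInfo struct {
	Key                string   `json:"key"`
	Description        string   `json:"description"`
	MigrationState     string   `json:"migration_state"`
	SupportedProtocols []string `json:"supported_protocols,omitempty"`
}

type HubBindingInfo struct {
	HubName     string `json:"hub_name"`
	ModuleKey   string `json:"module_key"`
	Domain      string `json:"domain"`
	Port        int    `json:"port"`
	RolloutFlag string `json:"rollout_flag,omitempty"`
}

// RegisterModuleRoutes wires GET /-/modules onto the Fiber app using data
// supplied by the registry and the hub bindings resolved at bootstrap.
func RegisterModuleRoutes(app *fiber.App, modules []ModuleInfo, hubs []HubBindingInfo) {
	app.Get("/-/modules", func(c fiber.Ctx) error {
		return c.JSON(fiber.Map{
			"modules": modules,
			"hubs":    hubs,
		})
	})
}
```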

View File

@@ -0,0 +1,28 @@
# Quickstart: Modular Proxy & Cache Segmentation
## 1. Prepare Workspace
1. Ensure Go 1.25+ toolchain is installed (`go version`).
2. From repo root, run `go mod tidy` (or `make deps` if defined) to sync modules.
3. Export `ANY_HUB_CONFIG` pointing to your working config (optional).
## 2. Create/Update Hub Module
1. Copy `internal/hubmodule/template/` to `internal/hubmodule/<module-key>/` and rename the package/types.
2. In the new package's `init()`, call `hubmodule.MustRegister(hubmodule.ModuleMetadata{Key: "<module-key>", ...})` to describe the supported protocols, cache strategy, and migration stage (see the sketch after this list).
3. Register runtime behavior (proxy handler) from your module by calling `proxy.RegisterModuleHandler("<module-key>", handler)` during initialization.
4. Add tests under the module directory and run `make modules-test` (delegates to `go test ./internal/hubmodule/...`).
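A minimal sketch of step 2 above; the import path, `MustRegister`, and the metadata field names are assumptions taken from this plan and the data model, not existing APIs.

```go
// Package examplehub is an illustrative module skeleton; replace "example-hub"
// with your real module key.
package examplehub

import (
	"time"

	"github.com/your-org/any-hub/internal/hubmodule" // import path is an assumption
)

func init() {
	hubmodule.MustRegister(hubmodule.ModuleMetadata{
		Key:                "example-hub",
		Description:        "proxy + cache module for the example ecosystem",
		MigrationState:     "beta",
		SupportedProtocols: []string{"HTTPS"},
		CacheStrategy: hubmodule.CacheStrategyProfile{
			TTL:                    6 * time.Hour,
			ValidationMode:         "etag",
			SupportsStreamingWrite: true,
		},
	})
}
```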
## 3. Bind Module via Config
1. Edit `config.toml` and set `Module = "<module-key>"` inside the target `[[Hub]]` block (omit to use `legacy`).
2. (Optional) Override cache behavior per hub using existing fields (`CacheTTL`, etc.).
3. Run `ANY_HUB_CONFIG=./config.toml go test ./...` to ensure loader validation passes.
## 4. Run and Verify
1. Start the binary: `go run ./cmd/any-hub --config ./config.toml`.
2. Send traffic to the hub's domain/port and watch logs for `module_key=<module-key>` tags.
3. Inspect `./storage/<hub>/` to confirm `.body` files are written by the module.
4. Exercise rollback by switching `Module` back to `legacy` if needed.
## 5. Ship
1. Commit module code + config docs.
2. Update release notes mentioning the module key, migration guidance, and related diagnostics.
3. Monitor cache hit/miss metrics post-deploy; adjust TTL overrides if necessary.

View File

@@ -0,0 +1,30 @@
# Research Log: Modular Proxy & Cache Segmentation
## Decision 1: Module Registry Location
- **Decision**: Introduce `internal/hubmodule/` as the root for module implementations plus a `registry.go` that exposes `Register(name ModuleFactory)` and `Resolve(hubType string)` helpers.
- **Rationale**: Keeps new hub-specific code outside `internal/proxy`/`internal/cache` core while still within internal tree; mirrors existing package layout expectations and eases discovery.
- **Alternatives considered**:
- Embed modules under `internal/proxy/<hub>`: rejected because cache + proxy concerns would blend with shared proxy infra, blurring ownership lines.
- Place modules under `pkg/`: rejected since repo avoids exported libraries and wants all runtime code under `internal`.
## Decision 2: Config Binding Field
- **Decision**: Add optional `Module` string field to each `[[Hub]]` block in `config.toml`, defaulting to `"legacy"` to preserve current behavior. Validation ensures the value matches a registered module key.
- **Rationale**: Minimal change to config schema, symmetric across hubs, and allows gradual opt-in by flipping a single field.
- **Alternatives considered**:
- Auto-detect module from `hub.Name`: rejected because naming conventions differ across users and would impede third-party forks.
- Separate `ProxyModule`/`CacheModule` fields: rejected per clarification outcome that modules encapsulate both behaviors.
## Decision 3: Fiber/Viper/Logrus Best Practices for Modular Architecture
- **Decision**: Continue to initialize Fiber/Viper/Logrus exactly once at process start; modules receive interfaces (logger, config handles) instead of initializing their own instances.
- **Rationale**: Prevents duplicate global state and adheres to constitution (single binary, centralized config/logging).
- **Alternatives considered**: Allow modules to spin up custom Fiber groups or loggers—rejected because it complicates shutdown hooks and breaks structured logging consistency.
## Decision 4: Storage Layout Compatibility
- **Decision**: Keep current `StoragePath/<Hub>/<path>.body` layout; modules may add subdirectories below `<path>` only when necessary but must expose migration hooks via registry metadata.
- **Rationale**: Recent cache fix established `.body` suffix to avoid file/dir conflicts; modules should reuse it to maintain operational tooling compatibility.
- **Alternatives considered**: Give each module a distinct root folder—rejected because it would fragment cleanup tooling and require per-module disk quotas.
## Decision 5: Testing Strategy
- **Decision**: For each module, enforce a shared test harness that spins a fake upstream using `httptest.Server`, writes to `t.TempDir()` storage, and asserts registry wiring end-to-end via integration tests.
- **Rationale**: Aligns with Technical Context testing guidance while avoiding bespoke harnesses per hub type.
- **Alternatives considered**: Rely solely on unit tests per module—rejected since regressions often arise from wiring/registry mistakes.
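A compact sketch of the harness shape this decision calls for; a real module test would route through the registry-resolved module rather than fetching and writing directly as done here.

```go
package hubmodule_test

import (
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"testing"
)

// TestFakeUpstreamAndTempStorage demonstrates the shared harness: a fake
// upstream via httptest and an isolated cache root via t.TempDir().
func TestFakeUpstreamAndTempStorage(t *testing.T) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, _ = w.Write([]byte("fake artifact body"))
	}))
	defer upstream.Close()

	storageRoot := t.TempDir()

	resp, err := http.Get(upstream.URL + "/pkg/example-1.0.0.tgz")
	if err != nil {
		t.Fatalf("fetch from fake upstream: %v", err)
	}
	defer resp.Body.Close()

	// Persist using the shared `.body` layout so assertions match production paths.
	target := filepath.Join(storageRoot, "example-hub", "pkg", "example-1.0.0.tgz.body")
	if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
		t.Fatal(err)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		t.Fatal(err)
	}
	if err := os.WriteFile(target, body, 0o644); err != nil {
		t.Fatal(err)
	}
	if string(body) != "fake artifact body" {
		t.Fatalf("unexpected body: %q", body)
	}
}
```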

View File

@@ -0,0 +1,106 @@
# Feature Specification: Modular Proxy & Cache Segmentation
**Feature Branch**: `004-modular-proxy-cache`
**Created**: 2025-11-14
**Status**: Draft
**Input**: User description: "The project currently routes proxy logic through a shared proxy/cache layer, so adding or changing an integration has to account for compatibility with existing types, which erodes maintainability. Split each proxy/cache type into its own module directory and define abstract, unified interfaces as functional constraints; different type modules will duplicate some code, but maintainability improves substantially."
> Constitution alignment (v1.0.0):
> - Preserve the "lightweight, anonymous, CLI multi-registry proxy" positioning: no web UI, account system, or scope unrelated to proxying may be introduced.
> - The solution must be a Go 1.25+ single binary; dependencies are limited to Fiber, Viper, Logrus/Lumberjack, and the necessary standard library.
> - All behavior is controlled by a single `config.toml`; any new config field must be specified with its name, default value, and migration strategy.
> - The design must maintain the cache-first + streaming path and describe logging/observability needs for hits, upstream fetches, and failures.
> - Acceptance must include tests for config parsing, cache read/write, and Host Header binding, plus the Chinese-comment delivery constraint.
## Clarifications
### Session 2025-11-14
- Q: Should each hub select proxy and cache modules separately or through a single combined module? → A: Single combined module per hub encapsulating proxy + cache behaviors.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Add A New Hub Type Without Regressions (Priority: P1)
As a platform maintainer, I can scaffold a dedicated proxy + cache module for a new hub type without touching existing hub implementations so I avoid regressions and lengthy reviews.
**Why this priority**: Unlocks safe onboarding of new ecosystems (npm, Docker, PyPI, etc.) which is the primary growth lever.
**Independent Test**: Provision a sample "testhub" type, wire it through config, and run integration tests showing legacy hubs still route correctly.
**Acceptance Scenarios**:
1. **Given** an empty module directory following the prescribed skeleton, **When** the maintainer registers the module via the unified interface, **Then** the hub becomes routable via config with no code changes in other hub modules.
2. **Given** existing hubs running in production, **When** the new hub type is added, **Then** regression tests confirm traffic for other hubs is unchanged and logs correctly identify hub-specific modules.
---
### User Story 2 - Tailor Cache Behavior Per Hub (Priority: P2)
As an SRE, I can choose a cache strategy module that matches a hub's upstream semantics (e.g., npm tarballs vs. metadata) and tune TTL/validation knobs without rewriting shared logic.
**Why this priority**: Cache efficiency and disk safety differ by artifact type; misconfiguration previously caused incidents like "not a directory" errors.
**Independent Test**: Swap cache strategies for one hub in staging and verify cache hit/miss, revalidation, and eviction behavior follow the new module's contract while others remain untouched.
**Acceptance Scenarios**:
1. **Given** a hub referencing cache strategy `npm-tarball`, **When** TTL overrides are defined in config, **Then** only that hub's cache files adopt the overrides and telemetry reports the chosen strategy.
2. **Given** a hub using a streaming proxy that forbids disk writes, **When** the hub switches to a cache-enabled module, **Then** the interface enforces required callbacks (write, validate, purge) before deployment passes.
---
### User Story 3 - Operate Mixed Generations During Migration (Priority: P3)
As a release manager, I can keep legacy shared modules alive while migrating hubs incrementally, with clear observability that highlights which hubs still depend on the old stack.
**Why this priority**: Avoids risky flag days and allows gradual cutovers aligned with hub traffic peaks.
**Independent Test**: Run a deployment where half the hubs use the modular stack and half remain on the legacy stack, verifying routing table, logging, and alerts distinguish both paths.
**Acceptance Scenarios**:
1. **Given** hubs split between legacy and new modules, **When** traffic flows through both, **Then** logs, metrics, and config dumps tag each request path with its module name for debugging.
2. **Given** a hub scheduled for migration, **When** the rollout flag switches it to the modular implementation, **Then** rollback toggles exist to return to legacy routing within one command.
---
### Edge Cases
- What happens when config references a hub type whose proxy/cache module has not been registered? System must fail fast during config validation with actionable errors.
- How does the system handle partial migrations where legacy cache files conflict with new module layouts? Must auto-migrate or isolate on first access to prevent `ENOTDIR`.
- How is observability handled when a module panics or returns invalid data? The interface must standardize error propagation so circuit breakers/logging stay consistent.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: Provide explicit proxy and cache interfaces describing the operations (request admission, upstream fetch, cache read/write/invalidation, observability hooks) that every hub-specific module must implement; see the illustrative sketch after this list.
- **FR-002**: Restructure the codebase so each hub type registers a single module directory that owns both proxy and cache behaviors (optional internal subpackages allowed) while sharing only the common interfaces; no hub-specific logic may leak into the shared adapters.
- **FR-003**: Implement a registry or factory that maps the `config.toml` hub definition to the corresponding proxy/cache module and fails validation if no module is found.
- **FR-004**: Allow hub-level overrides for cache behaviors (TTL, validation strategy, disk layout) that modules can opt in to, with documented defaults and validation of allowed ranges.
- **FR-005**: Maintain backward compatibility by providing a legacy adapter that wraps the existing shared proxy/cache until all hubs migrate, including feature flags to switch per hub.
- **FR-006**: Ensure runtime telemetry (logs, metrics, tracing spans) include the module identifier so operators can attribute failures or latency to a specific hub module.
- **FR-007**: Deliver migration guidance and developer documentation outlining how to add a new module, required tests, and expected directory structure.
- **FR-008**: Update automated tests (unit + integration) so each module can be exercised independently and regression suites cover mixed legacy/new deployments.
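Referenced from FR-001, a hedged Go sketch of what the unified, combined proxy+cache interface could look like; the method set and signatures are illustrative assumptions, not the final contract.

```go
package hubmodule

import (
	"context"
	"io"
	"net/http"
)

// HubModule is a single combined proxy+cache contract per hub type, per the
// clarification above. Names and signatures are illustrative only.
type HubModule interface {
	// Key returns the canonical module identifier used in config and logs.
	Key() string

	// Admit decides whether a request should be served by this module at all.
	Admit(r *http.Request) error

	// FetchUpstream streams the artifact from the upstream registry.
	FetchUpstream(ctx context.Context, path string) (io.ReadCloser, error)

	// CacheRead returns a cached artifact, or an error on a miss.
	CacheRead(ctx context.Context, path string) (io.ReadCloser, error)

	// CacheWrite persists an artifact using the shared `.body` layout while
	// optionally streaming to the client at the same time.
	CacheWrite(ctx context.Context, path string, body io.Reader) error

	// Invalidate purges a cached entry, e.g. after TTL expiry or validation failure.
	Invalidate(ctx context.Context, path string) error
}
```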
### Key Entities *(include if feature involves data)*
- **Hub Module**: Represents a cohesive proxy+cache implementation for a specific ecosystem; attributes include supported protocols, cache strategy hooks, telemetry tags, and configuration constraints.
- **Module Registry**: Describes the mapping between hub names/types in config and their module implementations; stores module metadata (version, status, migration flag) for validation and observability.
- **Cache Strategy Profile**: Captures the policy knobs a module exposes (TTL, validation method, disk layout, eviction rules) and the allowed override values defined per hub.
### Assumptions
- Existing hubs (npm, Docker, PyPI) will be migrated sequentially; legacy adapters remain available until the last hub switches.
- Engineers adding a new hub type can modify configuration schemas and documentation but not core runtime dependencies.
- Telemetry stack (logs/metrics) already exists and only requires additional tags; no new observability backend is needed.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: A new hub type can be added by touching only its module directory plus configuration (≤2 additional files) and passes the module's test suite within one working day.
- **SC-002**: Regression test suites show zero failing cases for unchanged hubs after enabling the modular architecture (baseline established before rollout).
- **SC-003**: Configuration validation rejects 100% of hubs that reference unregistered modules, preventing runtime panics in staging or production.
- **SC-004**: Operational logs for proxy and cache events include the module identifier in 100% of entries, enabling SREs to scope incidents in under 5 minutes.

View File

@@ -0,0 +1,108 @@
# Tasks: Modular Proxy & Cache Segmentation
**Input**: Design documents from `/specs/004-modular-proxy-cache/`
**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/, quickstart.md
**Tests**: Must cover config parsing (`internal/config`), cache read/write (`internal/cache` + modules), proxy hit/upstream-fetch paths (`internal/proxy`), and Host Header binding plus logging (`internal/server`).
## Phase 1: Setup (Shared Infrastructure)
- [X] T001 Scaffold `internal/hubmodule/` package with `doc.go` + `README.md` describing module contracts
- [X] T002 [P] Add `modules-test` target to `Makefile` running `go test ./internal/hubmodule/...` for future CI hooks
---
## Phase 2: Foundational (Blocking Prerequisites)
- [X] T003 Create shared module interfaces + registry in `internal/hubmodule/interfaces.go` and `internal/hubmodule/registry.go`
- [X] T004 Extend config schema with `[[Hub]].Module` defaults/validation plus sample configs in `internal/config/{types.go,validation.go,loader.go}` and `configs/*.toml`
- [X] T005 [P] Wire server bootstrap to resolve modules once and inject into proxy/cache layers (`internal/server/bootstrap.go`, `internal/proxy/handler.go`)
**Checkpoint**: Registry + config plumbing complete; user story work may begin.
---
## Phase 3: User Story 1 - Add A New Hub Type Without Regressions (Priority: P1) 🎯 MVP
**Goal**: Allow engineers to add a dedicated proxy+cache module without modifying existing hubs.
**Independent Test**: Register a `testhub` module, enable it via config, and run integration tests proving other hubs remain unaffected.
### Tests
- [X] T006 [P] [US1] Add registry unit tests covering register/resolve/list/dedup in `internal/hubmodule/registry_test.go`
- [X] T007 [P] [US1] Add integration test proving new module routing isolation in `tests/integration/module_routing_test.go`
### Implementation
- [X] T008 [US1] Implement `legacy` adapter module that wraps current shared proxy/cache in `internal/hubmodule/legacy/legacy_module.go`
- [X] T009 [US1] Refactor server/proxy wiring to resolve modules per hub (`internal/server/router.go`, `internal/proxy/forwarder.go`)
- [X] T010 [P] [US1] Create reusable module template with Chinese comments under `internal/hubmodule/template/module.go`
- [X] T011 [US1] Update quickstart + README to document module creation and config binding (`specs/004-modular-proxy-cache/quickstart.md`, `README.md`)
---
## Phase 4: User Story 2 - Tailor Cache Behavior Per Hub (Priority: P2)
**Goal**: Enable per-hub cache strategies/TTL overrides while keeping modules isolated.
**Independent Test**: Swap a hub to a cache strategy module, adjust TTL overrides, and confirm telemetry/logs reflect the new policy without affecting other hubs.
### Tests
- [ ] T012 [P] [US2] Add cache strategy override integration test validating TTL + revalidation paths in `tests/integration/cache_strategy_override_test.go`
- [ ] T013 [P] [US2] Add module-level cache strategy unit tests in `internal/hubmodule/npm/module_test.go`
### Implementation
- [ ] T014 [US2] Implement `CacheStrategyProfile` helpers and injection plumbing (`internal/hubmodule/strategy.go`, `internal/cache/writer.go`)
- [ ] T015 [US2] Bind hub-level overrides to strategy metadata via config/runtime structures (`internal/config/types.go`, `internal/config/runtime.go`)
- [ ] T016 [US2] Update existing modules (npm/docker/pypi) to declare strategies + honor overrides (`internal/hubmodule/{npm,docker,pypi}/module.go`)
---
## Phase 5: User Story 3 - Operate Mixed Generations During Migration (Priority: P3)
**Goal**: Support dual-path deployments with diagnostics/logging to track legacy vs. modular hubs.
**Independent Test**: Run mixed legacy/modular hubs, flip rollout flags, and confirm logs + diagnostics show module ownership and allow rollback.
### Tests
- [ ] T017 [P] [US3] Add dual-mode integration test covering rollout toggle + rollback in `tests/integration/legacy_adapter_toggle_test.go`
- [ ] T018 [P] [US3] Add diagnostics endpoint contract test for `/-/modules` in `tests/integration/module_diagnostics_test.go`
### Implementation
- [ ] T019 [US3] Implement `LegacyAdapterState` tracker + rollout flag parsing (`internal/hubmodule/legacy/state.go`, `internal/config/runtime_flags.go`)
- [ ] T020 [US3] Implement Fiber handler + routing for `/-/modules` diagnostics (`internal/server/routes/modules.go`, `internal/server/router.go`)
- [ ] T021 [US3] Add structured log fields (`module_key`, `rollout_flag`) across logging middleware (`internal/server/middleware/logging.go`, `internal/proxy/logging.go`)
- [ ] T022 [US3] Document operational playbook for phased migration (`docs/operations/migration.md`)
---
## Phase 6: Polish & Cross-Cutting Concerns
- [ ] T023 [P] Add Chinese comments + GoDoc for new interfaces/modules (`internal/hubmodule/**/*.go`)
- [ ] T024 Validate quickstart by running module creation flow end-to-end and capture sample logs (`specs/004-modular-proxy-cache/quickstart.md`, `logs/`)
---
## Dependencies & Execution Order
1. **Phase 1 → Phase 2**: Setup must finish before registry/config work begins.
2. **Phase 2 → User Stories**: Module registry + config binding are prerequisites for all stories.
3. **User Stories Priority**: US1 (P1) delivers MVP and unblocks US2/US3; US2 & US3 can run in parallel after US1 if separate modules/files.
4. **Tests before Code**: For each story, write failing tests (T006/T007, T012/T013, T017/T018) before implementation tasks in that story.
5. **Polish**: Execute after all targeted user stories complete.
## Parallel Execution Examples
- **Setup**: T001 (docs) and T002 (Makefile) can run concurrently.
- **US1**: T006 registry tests and T007 routing tests can run in parallel while separate engineers tackle T008/T010.
- **US2**: T012 integration test and T013 unit test proceed concurrently; T014/T015 can run in parallel once T012/T013 drafted.
- **US3**: T017 rollout test and T018 diagnostics test work independently before T019–T021 wiring.
## Implementation Strategy
1. Deliver MVP by completing Phases 1–3 (US1) and verifying new module onboarding works end-to-end.
2. Iterate with US2 for cache flexibility, ensuring overrides are testable independently.
3. Layer US3 for migration observability and rollback safety.
4. Finish with Polish tasks to document and validate the workflow.