Add Claude Router to docker-configs

Integrate Claude Router v1.1.0 as a new service in the main docker-compose stack:

Features:
- Smart API router with automatic failover between Claude Pro and Claude API
- Scheduled health checks every hour (0-4 minutes) to detect quota recovery
- Intelligent auto-switch back to Claude Pro when available
- Manual health check endpoint for immediate testing
- Complete documentation and Docker containerization
- Compatible with Claude Code CLI

Changes:
- Add router/ subdirectory with complete Claude Router project
- Integrate claude-router service into main docker-compose.yml
- Resolve port conflict (move SillyTavern to 8002, Claude Router uses 8000)
- Update .gitignore for router-specific exclusions

The router automatically detects when Claude Pro quota is restored and switches back to prioritize the premium service.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-14 19:08:12 -05:00
parent e2696447bf
commit 261ac9d563
8 changed files with 721 additions and 1 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -22,5 +22,11 @@
 HA/config/
 HA/db_data/
 # Router specific ignores
 router/__pycache__/
 router/venv/
 router/*.pyc
 router/*.log
 # Keep structure
 !.gitkeep
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -140,7 +140,7 @@ services:
    container_name: sillytavern
    restart: unless-stopped
    ports:
-      - "8000:8000"
+      - "8002:8000"  # Changed port to avoid conflict with claude-router
    environment:
      - TZ=America/Chicago
    volumes:
@@ -148,6 +148,26 @@ services:
    networks:
      - web_network
  # Claude Router - smart AI API router
  claude-router:
    build: ./router
    container_name: claude-router
    restart: unless-stopped
    ports:
      - "8000:8000"
    environment:
      - CLAUDE_API_KEY=${CLAUDE_API_KEY}
    volumes:
      - ./tokens.txt:/home/will/docker/tokens.txt:ro
    networks:
      - web_network
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
 volumes:
  discord_config:
  discord_logs:
--- a/router/Dockerfile
+++ b/router/Dockerfile
@@ -0,0 +1,32 @@
 FROM python:3.11-slim
 # Set working directory
 WORKDIR /app
 # Install system dependencies
 RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*
 # Copy requirements first for better caching
 COPY requirements.txt .
 # Install Python dependencies
 RUN pip install --no-cache-dir -r requirements.txt
 # Copy application code
 COPY . .
 # Create non-root user
 RUN useradd -m -u 1000 router && chown -R router:router /app
 USER router
 # Expose port
 EXPOSE 8000
 # Health check
 HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
 # Run the application
 CMD ["python", "app.py"]
--- a/router/README.md
+++ b/router/README.md
@@ -0,0 +1,238 @@
 # Claude Router
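 A smart Claude API router with automatic failover between Claude Pro and the Claude API. When Claude Pro reaches its usage limit, the router switches to the Claude API so that service continues without interruption.
 ## Features
 - **Automatic failover**: switches provider when a rate limit or usage limit is detected
 - **Scheduled health checks**: probes Claude Pro during the first 5 minutes of every hour to detect quota recovery
 - **Smart recovery**: switches back to Claude Pro automatically so the premium service is preferred
 - **Manual switching**: switch to a specific provider on demand
 - **Claude Code CLI compatible**: compatible with the Anthropic API format
 - **Dockerized deployment**: one-command deployment that works out of the box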
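 ## Quick Start
 ### 1. Deploy with Docker Compose
 ```bash
 # Clone the project, or change into its directory
 cd /home/will/docker/router
 # Build and start the service
 docker-compose up -d
 # Check the service status
 docker-compose ps
 ```
 ### 2. Verify that the service is running
 ```bash
 # Health check
 curl http://localhost:8000/health
 # Show the current status
 curl http://localhost:8000/v1/status
 ```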
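 ### 3. Configure Claude Code CLI
 Set an environment variable to point Claude Code CLI at the router:
 ```bash
 # Set the API endpoint to the router address
 export ANTHROPIC_BASE_URL="http://localhost:8000"
 # Add it to bashrc to make it permanent
 echo 'export ANTHROPIC_BASE_URL="http://localhost:8000"' >> ~/.bashrc
 # Test the configuration
 echo "Hello Claude Router" | claude --print
 ```
 **Note**: You do not need to change ANTHROPIC_API_KEY; the router handles the API key itself.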
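 ## API Endpoints
 ### Main endpoints
 - `POST /v1/messages` - Create a Claude API message (Anthropic API compatible)
 - `GET /health` - Health check
 - `GET /v1/status` - Get the router status
 - `POST /v1/switch-provider` - Manually switch provider
 - `POST /v1/health-check` - Manually trigger a Claude Pro health check
 ### Health check response example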
 ```json
 {
  "status": "healthy",
  "current_provider": "claude_pro", 
  "failover_count": 0,
  "last_failover": null,
  "last_health_check": "2025-07-14T19:00:00.000Z",
  "health_check_failures": 0,
  "providers": {
    "claude_pro": {"active": true},
    "claude_api": {"active": true}
  }
 }
 ```
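A monitoring script can use this response to tell whether the router is currently serving from its fallback provider. The following is a minimal sketch, assuming the field names shown in the example above; `is_on_fallback` and `active_providers` are hypothetical helpers and are not part of the router.

```python
import json

# Sample payload copied from the /health response example.
SAMPLE = """
{
  "status": "healthy",
  "current_provider": "claude_pro",
  "failover_count": 0,
  "last_failover": null,
  "last_health_check": "2025-07-14T19:00:00.000Z",
  "health_check_failures": 0,
  "providers": {
    "claude_pro": {"active": true},
    "claude_api": {"active": true}
  }
}
"""


def is_on_fallback(health: dict, preferred: str = "claude_pro") -> bool:
    """Return True when the router is serving from a non-preferred provider."""
    return health.get("current_provider") != preferred


def active_providers(health: dict) -> list[str]:
    """Return the names of all providers flagged as active, sorted by name."""
    providers = health.get("providers", {})
    return sorted(name for name, info in providers.items() if info.get("active"))


health = json.loads(SAMPLE)
print(is_on_fallback(health))
print(active_providers(health))
```

In a real check the payload would come from `GET /health` instead of the embedded sample.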
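 ## Configuration
 ### Environment variables
 - `CLAUDE_API_KEY`: Claude API key
 - `ROUTER_HOST`: address the service listens on (default: 0.0.0.0)
 - `ROUTER_PORT`: port the service listens on (default: 8000)
 - `MAX_RETRIES`: maximum number of retries (default: 3)
 - `RETRY_DELAY`: delay between retries (default: 1.0 seconds)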
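 ### Health check settings
 - `health_check_enabled`: whether scheduled health checks are enabled (default: true)
 - `health_check_cron`: cron expression for the check schedule (default: "0-4 * * * *", the first 5 minutes of every hour)
 - `health_check_message`: content of the test message (default: "ping")
 - `health_check_model`: model used for the check (default: claude-3-haiku-20240307)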
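To make the default schedule concrete: the first field of a five-field cron expression is the minute, so `0-4 * * * *` fires at minutes 0, 1, 2, 3 and 4 of every hour. The sketch below expands only that minute field using the standard library. `expand_minute_field` is a hypothetical helper that handles `*`, plain values, ranges and comma lists, not full cron syntax; the router itself delegates parsing to APScheduler's `CronTrigger.from_crontab`.

```python
def expand_minute_field(field: str) -> list[int]:
    """Expand a cron minute field such as '0-4' or '0,30' into sorted minutes."""
    minutes: set[int] = set()
    for part in field.split(","):
        if part == "*":
            minutes.update(range(60))
        elif "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            minutes.update(range(start, end + 1))  # cron ranges are inclusive
        else:
            minutes.add(int(part))
    return sorted(minutes)


cron = "0-4 * * * *"
minute_field = cron.split()[0]  # first of the five cron fields
print(expand_minute_field(minute_field))  # [0, 1, 2, 3, 4]
```

With the default expression, a recovered Claude Pro quota is therefore probed up to five times at the top of each hour and not again until the next hour.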
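 ### Token file
 If `CLAUDE_API_KEY` is not set, the router falls back to reading the API key from `/home/will/docker/tokens.txt`, so setting the environment variable by hand is optional.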
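The token file lookup in `config.py` scans for a line of the form `claude_API=<key>`. A minimal sketch of that lookup follows; `parse_token_lines` is a hypothetical helper, and the sample lines and key are made-up placeholders.

```python
from typing import Iterable, Optional


def parse_token_lines(lines: Iterable[str], prefix: str = "claude_API=") -> Optional[str]:
    """Return the value of the first line starting with `prefix`, or None."""
    for line in lines:
        if line.startswith(prefix):
            # Split on the first '=' only, so keys containing '=' stay intact.
            return line.split("=", 1)[1].strip()
    return None


# Placeholder content standing in for tokens.txt.
sample = [
    "# tokens for the docker stack\n",
    "other_service=abc123\n",
    "claude_API=sk-placeholder\n",
]
print(parse_token_lines(sample))
```

The match is case-sensitive and requires the prefix at the very start of the line, so a line with leading whitespace or a different key name would be skipped.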
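 ## Failover Mechanism
 When any of the following errors is detected, the router automatically switches to the next available provider:
 - 429 (Too Many Requests)
 - Rate limit errors
 - Usage limit reached
 - Errors mentioning "usage limit reached"
 **Priority order**: Claude Pro → Claude API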
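The detection is keyword based: the error text is lowercased and matched against a list of indicators, and on a match the router moves to the next active provider in priority order. Below is a condensed sketch of that logic, adapted from `app.py`. `next_provider` is a simplified stand-in for `failover_to_next_provider`, and the error strings are made-up examples.

```python
from typing import Optional

# Indicator substrings taken from should_failover() in app.py.
FAILOVER_INDICATORS = [
    "rate_limit",
    "usage limit",
    "quota exceeded",
    "429",
    "too many requests",
    "limit reached",
]


def should_failover(error: Exception) -> bool:
    """Return True if the error text contains any failover indicator."""
    text = str(error).lower()
    return any(indicator in text for indicator in FAILOVER_INDICATORS)


def next_provider(current: str, providers: dict) -> Optional[str]:
    """Return the next active provider after `current`, wrapping around."""
    names = list(providers)  # dict order is the priority order
    start = names.index(current)
    for offset in range(1, len(names)):
        candidate = names[(start + offset) % len(names)]
        if providers[candidate]["active"]:
            return candidate
    return None


providers = {"claude_pro": {"active": True}, "claude_api": {"active": True}}
error = RuntimeError("Error code: 429 - Too Many Requests")
if should_failover(error):
    print(next_provider("claude_pro", providers))
```

Substring matching is simple but coarse: any error message that happens to contain `429` triggers a failover, even if it is unrelated to rate limiting.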
 ## Usage Examples
 ### Basic API call
 ```bash
 curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'
 ```
 ### Manually switch provider
 ```bash
 curl -X POST http://localhost:8000/v1/switch-provider \
  -H "Content-Type: application/json" \
  -d '"claude_api"'
 ```
 ### Manual health check
 ```bash
 # Check immediately whether Claude Pro is available
 curl -X POST http://localhost:8000/v1/health-check
 # Show the detailed status
 curl http://localhost:8000/v1/status
 ```
 ## Development and Debugging
 ### Local development
 ```bash
 # Create a virtual environment
 python3 -m venv venv
 source venv/bin/activate
 # Install dependencies
 pip install -r requirements.txt
 # Run the application
 python app.py
 ```
 ### Viewing logs
 ```bash
 # Docker container logs
 docker-compose logs -f claude-router
 # Live logs
 docker logs -f claude-router
 ```
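 ## Troubleshooting
 ### Common issues
 1. **The service fails to start**
   - Check that tokens.txt exists and is correctly formatted
   - Make sure port 8000 is not already in use
 2. **API calls fail**
   - Verify that the API key is valid
   - Check network connectivity to api.anthropic.com
 3. **Automatic switching does not work**
   - Check the logs to confirm the error-detection logic
   - Confirm that the backup provider is configured correctly
 ### Monitoring
 - Health check: `http://localhost:8000/health`
 - Status monitoring: `http://localhost:8000/v1/status`
 - Docker health check: `docker inspect claude-router`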
 ## Architecture
 - **Framework**: FastAPI + Uvicorn
 - **HTTP client**: httpx
 - **AI library**: anthropic
 - **Containerization**: Docker + Docker Compose
 - **Configuration management**: pydantic + python-dotenv
 ## Version Information
 - Version: 1.1.0
 - Python: 3.11+
 - Supports: Claude 3 model family
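 ## Changelog
 ### v1.1.0 (2025-07-14)
 - ✅ Added scheduled health checks
 - ✅ Claude Pro quota recovery is checked automatically during the first 5 minutes of every hour
 - ✅ Smart automatic switch back to Claude Pro
 - ✅ Added a manual health check API
 - ✅ Improved logging and status monitoring
 ### v1.0.0 (2025-07-14)
 - ✅ Basic router functionality
 - ✅ Automatic failover from Claude Pro to Claude API
 - ✅ Docker containerized deployment
 - ✅ Claude Code CLI compatibility
 ## Roadmap
 - [ ] Add DeepSeek API support
 - [ ] Add ChatGPT API support
 - [ ] Implement request statistics and a monitoring dashboard
 - [ ] Add request caching
 - [ ] Support load balancing
 - [ ] Integrate the Kimi v2 API
 ## License
 MIT License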
--- a/router/app.py
+++ b/router/app.py
@@ -0,0 +1,338 @@
 import asyncio
 import json
 import logging
 from datetime import datetime
 from typing import Dict, Any, Optional
 from contextlib import asynccontextmanager
 import httpx
 from fastapi import FastAPI, Request, HTTPException
 from fastapi.responses import StreamingResponse, JSONResponse
 from anthropic import Anthropic
 from apscheduler.schedulers.asyncio import AsyncIOScheduler
 from apscheduler.triggers.cron import CronTrigger
 from config import config
 # Configure logging
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 class ClaudeRouter:
    def __init__(self):
        self.current_provider = "claude_pro"
        self.failover_count = 0
        self.last_failover = None
        self.last_health_check = None
        self.health_check_failures = 0
        self.scheduler = None
        self.providers = {
            "claude_pro": {
                "api_key": config.claude_pro_api_key,
                "base_url": config.claude_pro_base_url,
                "active": True
            },
            "claude_api": {
                "api_key": config.claude_api_key, 
                "base_url": config.claude_api_base_url,
                "active": True
            }
        }
    async def get_anthropic_client(self, provider: str) -> Anthropic:
        """Get Anthropic client for the specified provider"""
        if provider not in self.providers:
            raise ValueError(f"Unknown provider: {provider}")
        provider_config = self.providers[provider]
        return Anthropic(
            api_key=provider_config["api_key"],
            base_url=provider_config["base_url"]
        )
    async def should_failover(self, error: Exception) -> bool:
        """Determine if we should failover based on the error"""
        error_str = str(error).lower()
        # Check for rate limiting or usage limit errors
        failover_indicators = [
            "rate_limit",
            "usage limit",
            "quota exceeded",
            "429",
            "too many requests",
            "limit reached"
        ]
        return any(indicator in error_str for indicator in failover_indicators)
    async def failover_to_next_provider(self):
        """Switch to the next available provider"""
        providers_list = list(self.providers.keys())
        current_index = providers_list.index(self.current_provider)
        # Try next provider
        for i in range(1, len(providers_list)):
            next_index = (current_index + i) % len(providers_list)
            next_provider = providers_list[next_index]
            if self.providers[next_provider]["active"]:
                logger.info(f"Failing over from {self.current_provider} to {next_provider}")
                self.current_provider = next_provider
                self.failover_count += 1
                self.last_failover = datetime.now()
                return True
        logger.error("No active providers available for failover")
        return False
    async def make_request(self, request_data: Dict[str, Any]) -> Dict[str, Any]:
        """Make request with automatic failover"""
        max_attempts = len(self.providers)
        for attempt in range(max_attempts):
            try:
                client = await self.get_anthropic_client(self.current_provider)
                # Extract parameters from request
                messages = request_data.get("messages", [])
                model = request_data.get("model", "claude-3-sonnet-20240229")
                max_tokens = request_data.get("max_tokens", 4096)
                stream = request_data.get("stream", False)
                logger.info(f"Making request with provider: {self.current_provider}")
                # Make the API call
                if hasattr(client, 'messages'):
                    response = await asyncio.to_thread(
                        client.messages.create,
                        model=model,
                        max_tokens=max_tokens,
                        messages=messages,
                        stream=stream
                    )
                else:
                    # For older anthropic versions
                    response = await asyncio.to_thread(
                        client.completions.create,
                        model=model,
                        max_tokens_to_sample=max_tokens,
                        prompt=f"Human: {messages[0]['content']}\n\nAssistant:",
                        stream=stream
                    )
                return response
            except Exception as e:
                logger.error(f"Request failed with {self.current_provider}: {str(e)}")
                if await self.should_failover(e) and attempt < max_attempts - 1:
                    if await self.failover_to_next_provider():
                        continue
                # If this is the last attempt or failover failed, raise the error
                if attempt == max_attempts - 1:
                    raise HTTPException(status_code=500, detail=f"All providers failed. Last error: {str(e)}")
        raise HTTPException(status_code=500, detail="No providers available")
    async def health_check_claude_pro(self):
        """Check if Claude Pro is available again"""
        # Only check if we're not currently using Claude Pro
        if self.current_provider == "claude_pro":
            logger.debug("Skipping health check - already using Claude Pro")
            return
        logger.info("Running Claude Pro health check...")
        self.last_health_check = datetime.now()
        try:
            client = Anthropic(
                api_key=config.claude_pro_api_key,
                base_url=config.claude_pro_base_url
            )
            # Send a minimal test message
            if hasattr(client, 'messages'):
                response = await asyncio.to_thread(
                    client.messages.create,
                    model=config.health_check_model,
                    max_tokens=10,
                    messages=[{"role": "user", "content": config.health_check_message}]
                )
            else:
                # For older anthropic versions
                response = await asyncio.to_thread(
                    client.completions.create,
                    model=config.health_check_model,
                    max_tokens_to_sample=10,
                    prompt=f"Human: {config.health_check_message}\n\nAssistant:"
                )
            # If successful, switch back to Claude Pro
            old_provider = self.current_provider
            self.current_provider = "claude_pro"
            self.health_check_failures = 0
            logger.info(f"Claude Pro health check successful! Switched from {old_provider} to claude_pro")
        except Exception as e:
            self.health_check_failures += 1
            error_str = str(e).lower()
            if any(indicator in error_str for indicator in ["rate_limit", "usage limit", "quota exceeded", "429", "too many requests", "limit reached"]):
                logger.info(f"Claude Pro still rate limited: {str(e)}")
            else:
                logger.warning(f"Claude Pro health check failed (attempt {self.health_check_failures}): {str(e)}")
    def start_scheduler(self):
        """Start the health check scheduler"""
        if not config.health_check_enabled:
            logger.info("Health check disabled in config")
            return
        self.scheduler = AsyncIOScheduler()
        # Schedule health check using cron expression
        self.scheduler.add_job(
            self.health_check_claude_pro,
            trigger=CronTrigger.from_crontab(config.health_check_cron),
            id="claude_pro_health_check",
            name="Claude Pro Health Check",
            misfire_grace_time=60
        )
        self.scheduler.start()
        logger.info(f"Health check scheduler started with cron: {config.health_check_cron}")
    def stop_scheduler(self):
        """Stop the health check scheduler"""
        if self.scheduler:
            self.scheduler.shutdown()
            logger.info("Health check scheduler stopped")
 # Initialize router
 router = ClaudeRouter()
@asynccontextmanager
 async def lifespan(app: FastAPI):
    logger.info("Claude Router starting up...")
    logger.info(f"Current provider: {router.current_provider}")
    # Start health check scheduler
    router.start_scheduler()
    yield
    # Stop scheduler on shutdown
    router.stop_scheduler()
    logger.info("Claude Router shutting down...")
 app = FastAPI(
    title="Claude Router",
    description="Smart router for Claude API with automatic failover",
    version="1.1.0",
    lifespan=lifespan
 )
@app.get("/health")
 async def health_check():
    """Health check endpoint"""
    return {
        "status": "healthy",
        "current_provider": router.current_provider,
        "failover_count": router.failover_count,
        "last_failover": router.last_failover.isoformat() if router.last_failover else None,
        "providers": {
            name: {"active": provider_config["active"]} 
            for name, provider_config in router.providers.items()
        },
        "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None,
        "health_check_failures": router.health_check_failures
    }
@app.post("/v1/messages")
 async def create_message(request: Request):
    """Handle Claude API message creation with failover"""
    try:
        request_data = await request.json()
        stream = request_data.get("stream", False)
        if stream:
            # Handle streaming response
            async def generate_stream():
                try:
                    response = await router.make_request(request_data)
                    for chunk in response:
                        yield f"data: {json.dumps(chunk.model_dump())}\n\n"
                    yield "data: [DONE]\n\n"
                except Exception as e:
                    error_data = {"error": str(e)}
                    yield f"data: {json.dumps(error_data)}\n\n"
            return StreamingResponse(
                generate_stream(),
                media_type="text/event-stream",
                headers={
                    "Cache-Control": "no-cache",
                    "Connection": "keep-alive"
                }
            )
        else:
            # Handle non-streaming response
            response = await router.make_request(request_data)
            return response.model_dump()
    except HTTPException:
        # Preserve the status code and detail raised by the router itself
        raise
    except Exception as e:
        logger.error(f"Request processing failed: {str(e)}")
        raise HTTPException(status_code=500, detail=str(e))
@app.post("/v1/switch-provider")
 async def switch_provider(request: Request):
    """Manually switch to a specific provider"""
    provider = await request.json()
    if provider not in router.providers:
        raise HTTPException(status_code=400, detail=f"Unknown provider: {provider}")
    if not router.providers[provider]["active"]:
        raise HTTPException(status_code=400, detail=f"Provider {provider} is not active")
    old_provider = router.current_provider
    router.current_provider = provider
    logger.info(f"Manually switched from {old_provider} to {provider}")
    return {
        "message": f"Switched from {old_provider} to {provider}",
        "current_provider": router.current_provider
    }
@app.get("/v1/status")
 async def get_status():
    """Get current router status"""
    return {
        "current_provider": router.current_provider,
        "failover_count": router.failover_count,
        "last_failover": router.last_failover.isoformat() if router.last_failover else None,
        "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None,
        "health_check_failures": router.health_check_failures,
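        # Expose only the active flag; router.providers also holds the API keys
        "providers": {
            name: {"active": provider_config["active"]}
            for name, provider_config in router.providers.items()
        }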
    }
@app.post("/v1/health-check")
 async def manual_health_check():
    """Manually trigger Claude Pro health check"""
    try:
        await router.health_check_claude_pro()
        return {
            "message": "Health check completed",
            "current_provider": router.current_provider,
            "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Health check failed: {str(e)}")
 if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host=config.host, port=config.port)
--- a/router/config.py
+++ b/router/config.py
@@ -0,0 +1,54 @@
 import os
 from typing import Optional
 from pydantic import BaseModel
 class Config(BaseModel):
    # Claude API configurations
    claude_pro_api_key: str = ""
    claude_api_key: str = ""
    # Router settings
    port: int = 8000
    host: str = "0.0.0.0"
    # Retry settings
    max_retries: int = 3
    retry_delay: float = 1.0
    # API endpoints
    claude_pro_base_url: str = "https://api.anthropic.com"
    claude_api_base_url: str = "https://api.anthropic.com"
    # Health check settings
    health_check_enabled: bool = True
    health_check_cron: str = "0-4 * * * *"  # Every hour, first 5 minutes
    health_check_message: str = "ping"
    health_check_model: str = "claude-3-haiku-20240307"  # Use cheapest model for checks
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Load from environment or token file
        self.load_from_env()
    def load_from_env(self):
        """Load configuration from environment variables or token file"""
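        # Try environment variables first
        self.claude_api_key = os.getenv("CLAUDE_API_KEY", "")
        # Optional overrides listed in the README; defaults are kept when unset
        self.host = os.getenv("ROUTER_HOST", self.host)
        self.port = int(os.getenv("ROUTER_PORT", self.port))
        self.max_retries = int(os.getenv("MAX_RETRIES", self.max_retries))
        self.retry_delay = float(os.getenv("RETRY_DELAY", self.retry_delay))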
        # Load from tokens.txt if not found in env
        if not self.claude_api_key:
            try:
                with open("/home/will/docker/tokens.txt", "r") as f:
                    for line in f:
                        if line.startswith("claude_API="):
                            self.claude_api_key = line.split("=", 1)[1].strip()
                            break
            except FileNotFoundError:
                pass
        # For MVP, we'll use the same API key for both pro and regular
        # In practice, Claude Pro might use a different endpoint or key
        self.claude_pro_api_key = self.claude_api_key
 # Global config instance
 config = Config()
--- a/router/docker-compose.yml
+++ b/router/docker-compose.yml
@@ -0,0 +1,25 @@
 version: '3.8'
 services:
  claude-router:
    build: .
    container_name: claude-router
    ports:
      - "8000:8000"
    environment:
      - CLAUDE_API_KEY=${CLAUDE_API_KEY}
    volumes:
      - /home/will/docker/tokens.txt:/home/will/docker/tokens.txt:ro
    restart: unless-stopped
    networks:
      - router-network
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
 networks:
  router-network:
    driver: bridge
--- a/router/requirements.txt
+++ b/router/requirements.txt
@@ -0,0 +1,7 @@
 fastapi==0.104.1
 uvicorn==0.24.0
 httpx==0.25.2
 pydantic==2.5.0
 anthropic==0.7.8
 python-dotenv==1.0.0
 apscheduler==3.10.4