feat: Add scheduled health check and auto-recovery
Major enhancements to Claude Router v1.1.0:

- Add APScheduler for automated Claude Pro health checks
- Schedule checks every hour (0-4 minutes) to detect quota recovery
- Implement intelligent auto-switch back to Claude Pro when available
- Add manual health check endpoint for immediate testing
- Enhance status monitoring with health check metrics
- Improve API compatibility with older Anthropic client versions
- Update documentation with new features and usage examples
- Configure Claude Code CLI integration with environment variables

The router now automatically detects when Claude Pro quota is restored
and switches back to prioritize the premium service.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
51
README.md
@@ -5,7 +5,8 @@
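 ## Features
 
 - **Automatic failover**: switches providers automatically when a rate limit or usage limit is detected
-- **Health checks**: monitors the status of each provider in real time
+- **Scheduled health checks**: probes during the first 5 minutes of every hour to detect Claude Pro quota recovery
+- **Smart recovery**: switches back to Claude Pro automatically so the premium service is preferred
 - **Manual switching**: supports switching to a specific provider by hand
 - **Claude Code CLI compatible**: fully compatible with the Anthropic API format
 - **Dockerized deployment**: one-command deployment that works out of the box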
@@ -37,16 +38,21 @@ curl http://localhost:8000/v1/status
 
 ### 3. Configure Claude Code CLI
 
-Modify the Claude Code CLI configuration so that the API endpoint points at the router:
+Set environment variables to point Claude Code CLI at the router:
 
 ```bash
-# Set environment variables
+# Set the API endpoint to the router address
 export ANTHROPIC_API_URL="http://localhost:8000"
-export ANTHROPIC_API_KEY="your_claude_api_key"
 
-# Or edit the Claude Code CLI configuration file
+# Append to ~/.bashrc to make the setting permanent
+echo 'export ANTHROPIC_API_URL="http://localhost:8000"' >> ~/.bashrc
+
+# Test the configuration
+echo "Hello Claude Router" | claude --print
 ```
 
+**Note**: You do not need to change ANTHROPIC_API_KEY; the router handles API keys automatically.
+
 ## API Endpoints
 
 ### Main endpoints
@@ -55,6 +61,7 @@ export ANTHROPIC_API_KEY="your_claude_api_key"
 - `GET /health` - health check
 - `GET /v1/status` - get the router status
 - `POST /v1/switch-provider` - manually switch provider
+- `POST /v1/health-check` - manually trigger a Claude Pro health check
 
 ### Health check response example
 
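As a rough illustration of how a client might address the endpoints listed above, the following Python sketch builds (without sending) requests for three of them. The base URL is taken from the README examples; the helper name `build_request` and the `Content-Type` header are assumptions for illustration and are not part of the router.

```python
import json
import urllib.request

# Assumption: the router listens on localhost:8000, as in the README examples.
BASE_URL = "http://localhost:8000"


def build_request(path, method="GET", body=None):
    """Build (but do not send) a request for one of the router endpoints."""
    data = None
    headers = {}
    if body is not None:
        # The README's switch-provider example posts a bare JSON string
        # such as "claude_api", so the body is JSON-encoded as-is.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"  # assumed header
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


status_req = build_request("/v1/status")
check_req = build_request("/v1/health-check", method="POST")
switch_req = build_request("/v1/switch-provider", method="POST", body="claude_api")

# To actually send one against a running router:
#   urllib.request.urlopen(check_req).read()
```

Building the request objects separately from sending them keeps the sketch runnable without a live router.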
@@ -64,6 +71,8 @@ export ANTHROPIC_API_KEY="your_claude_api_key"
   "current_provider": "claude_pro",
   "failover_count": 0,
   "last_failover": null,
+  "last_health_check": "2025-07-14T19:00:00.000Z",
+  "health_check_failures": 0,
   "providers": {
     "claude_pro": {"active": true},
     "claude_api": {"active": true}
@@ -81,6 +90,13 @@ export ANTHROPIC_API_KEY="your_claude_api_key"
 - `MAX_RETRIES`: maximum number of retries (default: 3)
 - `RETRY_DELAY`: delay between retries (default: 1.0 seconds)
 
+### Health check configuration
+
+- `health_check_enabled`: whether scheduled health checks are enabled (default: true)
+- `health_check_cron`: cron expression for the check schedule (default: "0-4 * * * *", i.e. the first 5 minutes of every hour)
+- `health_check_message`: content of the test message (default: "ping")
+- `health_check_model`: model used for the check (default: claude-3-haiku-20240307)
+
 ### Token file
 
 The router automatically reads API keys from `/home/will/docker/tokens.txt`, so no environment variables need to be configured by hand.
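To make the default `health_check_cron` value concrete, here is a small sketch that expands the minute field of `0-4 * * * *` into the minutes at which a check would fire. It is a simplified stand-alone parser written for illustration (the function name `expand_minute_field` is hypothetical); it is not how APScheduler parses the expression, and it only handles the minute field.

```python
def expand_minute_field(field):
    """Expand a cron minute field such as "0-4", "*/15" or "0,30".

    Simplified for illustration: supports "*", single values, ranges,
    comma lists and "/step" on "*" or a range.
    """
    minutes = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = 0, 59
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


default_cron = "0-4 * * * *"
minute_field = default_cron.split()[0]
print(expand_minute_field(minute_field))  # [0, 1, 2, 3, 4]
```

With the default expression the job is scheduled five times per hour, at minutes 0 through 4. Because the check returns early when the router is already on `claude_pro`, probes are only sent while the router is failed over.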
@@ -121,6 +137,16 @@ curl -X POST http://localhost:8000/v1/switch-provider \
   -d '"claude_api"'
 ```
 
+### Manual health check
+
+```bash
+# Check right away whether Claude Pro is available
+curl -X POST http://localhost:8000/v1/health-check
+
+# View detailed status
+curl http://localhost:8000/v1/status
+```
+
 ## Development and debugging
 
 ### Local development
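The status fields added in this commit (`last_health_check`, `health_check_failures`) can be read alongside `current_provider` to tell whether the router is still failed over. The sketch below works on a sample payload assembled from the field names in the README response example; the values are illustrative rather than captured from a running router, and `summarize` is a hypothetical helper.

```python
import json

# Sample payload using the top-level field names from the README example.
# The values are made up for illustration.
sample = json.loads("""
{
  "current_provider": "claude_api",
  "failover_count": 1,
  "last_failover": "2025-07-14T18:30:00.000Z",
  "last_health_check": "2025-07-14T19:00:00.000Z",
  "health_check_failures": 2,
  "providers": {"claude_pro": {"active": true}, "claude_api": {"active": true}}
}
""")


def summarize(status):
    """Return a one-line summary of the router state."""
    if status["current_provider"] == "claude_pro":
        return "on claude_pro"
    return (
        f"failed over to {status['current_provider']}; "
        f"{status['health_check_failures']} failed recovery checks, "
        f"last at {status['last_health_check']}"
    )


print(summarize(sample))
```

In practice the dictionary would come from parsing the body of `GET /v1/status` instead of the inline sample.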
@@ -183,6 +209,21 @@ docker logs -f claude-router
 - Python: 3.11+
 - Supported: Claude-3 family models
 
+## Changelog
+
+### v1.1.0 (2025-07-14)
+- ✅ Added scheduled health checks
+- ✅ Automatic detection of Claude Pro quota recovery during the first 5 minutes of every hour
+- ✅ Smart automatic switch back to Claude Pro
+- ✅ New manual health check API
+- ✅ Improved logging and status monitoring
+
+### v1.0.0 (2025-07-14)
+- ✅ Basic router functionality
+- ✅ Automatic failover from Claude Pro to Claude API
+- ✅ Docker containerized deployment
+- ✅ Claude Code CLI compatibility
+
 ## Roadmap
 
 - [ ] Add DeepSeek API support
138
app.py
@@ -9,6 +9,8 @@ import httpx
 from fastapi import FastAPI, Request, HTTPException
 from fastapi.responses import StreamingResponse, JSONResponse
 from anthropic import Anthropic
+from apscheduler.schedulers.asyncio import AsyncIOScheduler
+from apscheduler.triggers.cron import CronTrigger
 
 from config import config
 
@@ -21,6 +23,9 @@ class ClaudeRouter:
         self.current_provider = "claude_pro"
         self.failover_count = 0
         self.last_failover = None
+        self.last_health_check = None
+        self.health_check_failures = 0
+        self.scheduler = None
         self.providers = {
             "claude_pro": {
                 "api_key": config.claude_pro_api_key,
@@ -98,13 +103,23 @@ class ClaudeRouter:
             logger.info(f"Making request with provider: {self.current_provider}")
 
             # Make the API call
-            response = await asyncio.to_thread(
-                client.messages.create,
-                model=model,
-                max_tokens=max_tokens,
-                messages=messages,
-                stream=stream
-            )
+            if hasattr(client, 'messages'):
+                response = await asyncio.to_thread(
+                    client.messages.create,
+                    model=model,
+                    max_tokens=max_tokens,
+                    messages=messages,
+                    stream=stream
+                )
+            else:
+                # For older anthropic versions
+                response = await asyncio.to_thread(
+                    client.completions.create,
+                    model=model,
+                    max_tokens_to_sample=max_tokens,
+                    prompt=f"Human: {messages[0]['content']}\n\nAssistant:",
+                    stream=stream
+                )
 
             return response
 
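The fallback branch above builds its prompt from `messages[0]['content']` only, so later turns of a multi-turn conversation would not reach the legacy completions call. The following sketch shows one way the whole message list could be flattened into a Human/Assistant text prompt. It is an illustration, not the commit's code: `to_legacy_prompt` is a hypothetical helper, it assumes every `content` is a plain string, and it assumes the legacy turn markers are `\n\nHuman:` and `\n\nAssistant:`.

```python
def to_legacy_prompt(messages):
    """Flatten Messages-style dicts into a Human/Assistant text prompt.

    Assumes each message is {"role": "user" | "assistant", "content": str}.
    """
    labels = {"user": "Human", "assistant": "Assistant"}
    turns = [f"\n\n{labels[m['role']]}: {m['content']}" for m in messages]
    # End with an open Assistant turn for the model to complete.
    return "".join(turns) + "\n\nAssistant:"


conversation = [
    {"role": "user", "content": "ping"},
    {"role": "assistant", "content": "pong"},
    {"role": "user", "content": "again"},
]
print(repr(to_legacy_prompt(conversation)))
# '\n\nHuman: ping\n\nAssistant: pong\n\nHuman: again\n\nAssistant:'
```

For a single user message this produces the same text as the f-string in the diff, apart from the leading blank lines.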
@@ -120,6 +135,81 @@ class ClaudeRouter:
                 raise HTTPException(status_code=500, detail=f"All providers failed. Last error: {str(e)}")
 
         raise HTTPException(status_code=500, detail="No providers available")
+
+    async def health_check_claude_pro(self):
+        """Check if Claude Pro is available again"""
+        # Only check if we're not currently using Claude Pro
+        if self.current_provider == "claude_pro":
+            logger.debug("Skipping health check - already using Claude Pro")
+            return
+
+        logger.info("Running Claude Pro health check...")
+        self.last_health_check = datetime.now()
+
+        try:
+            client = Anthropic(
+                api_key=config.claude_pro_api_key,
+                base_url=config.claude_pro_base_url
+            )
+
+            # Send a minimal test message
+            if hasattr(client, 'messages'):
+                response = await asyncio.to_thread(
+                    client.messages.create,
+                    model=config.health_check_model,
+                    max_tokens=10,
+                    messages=[{"role": "user", "content": config.health_check_message}]
+                )
+            else:
+                # For older anthropic versions
+                response = await asyncio.to_thread(
+                    client.completions.create,
+                    model=config.health_check_model,
+                    max_tokens_to_sample=10,
+                    prompt=f"Human: {config.health_check_message}\n\nAssistant:"
+                )
+
+            # If successful, switch back to Claude Pro
+            old_provider = self.current_provider
+            self.current_provider = "claude_pro"
+            self.health_check_failures = 0
+
+            logger.info(f"Claude Pro health check successful! Switched from {old_provider} to claude_pro")
+
+        except Exception as e:
+            self.health_check_failures += 1
+            error_str = str(e).lower()
+
+            if any(indicator in error_str for indicator in ["rate_limit", "usage limit", "quota exceeded", "429", "too many requests", "limit reached"]):
+                logger.info(f"Claude Pro still rate limited: {str(e)}")
+            else:
+                logger.warning(f"Claude Pro health check failed (attempt {self.health_check_failures}): {str(e)}")
+
+    def start_scheduler(self):
+        """Start the health check scheduler"""
+        if not config.health_check_enabled:
+            logger.info("Health check disabled in config")
+            return
+
+        self.scheduler = AsyncIOScheduler()
+
+        # Schedule health check using cron expression
+        self.scheduler.add_job(
+            self.health_check_claude_pro,
+            trigger=CronTrigger.from_crontab(config.health_check_cron),
+            id="claude_pro_health_check",
+            name="Claude Pro Health Check",
+            misfire_grace_time=60
+        )
+
+        self.scheduler.start()
+        logger.info(f"Health check scheduler started with cron: {config.health_check_cron}")
+
+    def stop_scheduler(self):
+        """Stop the health check scheduler"""
+        if self.scheduler:
+            self.scheduler.shutdown()
+            logger.info("Health check scheduler stopped")
 
 # Initialize router
 router = ClaudeRouter()
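The `except` branch of `health_check_claude_pro` decides between "still rate limited" and "failed for another reason" by substring matching on the lowercased error text. Pulled out as a stand-alone function, that check might look like the sketch below. The indicator strings are copied from the diff; the function name `is_rate_limited` and the sample error messages are assumptions for illustration.

```python
# Indicator substrings copied from the health check's except branch.
RATE_LIMIT_INDICATORS = (
    "rate_limit", "usage limit", "quota exceeded",
    "429", "too many requests", "limit reached",
)


def is_rate_limited(error):
    """Return True when the error text contains one of the indicators."""
    text = str(error).lower()
    return any(indicator in text for indicator in RATE_LIMIT_INDICATORS)


# Hypothetical error messages, not captured from the Anthropic client.
print(is_rate_limited(Exception("Error code: 429 - Too Many Requests")))  # True
print(is_rate_limited(Exception("Connection refused")))                   # False
```

Substring matching is simple but loose: any error text that happens to contain "429", such as a request ID, would also be classified as a rate limit.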
@@ -128,7 +218,14 @@ router = ClaudeRouter()
 async def lifespan(app: FastAPI):
     logger.info("Claude Router starting up...")
     logger.info(f"Current provider: {router.current_provider}")
+
+    # Start health check scheduler
+    router.start_scheduler()
+
     yield
+
+    # Stop scheduler on shutdown
+    router.stop_scheduler()
     logger.info("Claude Router shutting down...")
 
 app = FastAPI(
@@ -147,9 +244,11 @@ async def health_check():
         "failover_count": router.failover_count,
         "last_failover": router.last_failover.isoformat() if router.last_failover else None,
         "providers": {
-            name: {"active": config["active"]}
-            for name, config in router.providers.items()
-        }
+            name: {"active": provider_config["active"]}
+            for name, provider_config in router.providers.items()
+        },
+        "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None,
+        "health_check_failures": router.health_check_failures
     }
 
 @app.post("/v1/messages")
@@ -189,8 +288,10 @@ async def create_message(request: Request):
         raise HTTPException(status_code=500, detail=str(e))
 
 @app.post("/v1/switch-provider")
-async def switch_provider(provider: str):
+async def switch_provider(request: Request):
     """Manually switch to a specific provider"""
+    provider = await request.json()
+
     if provider not in router.providers:
         raise HTTPException(status_code=400, detail=f"Unknown provider: {provider}")
 
@@ -214,9 +315,24 @@ async def get_status():
         "current_provider": router.current_provider,
         "failover_count": router.failover_count,
         "last_failover": router.last_failover.isoformat() if router.last_failover else None,
+        "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None,
+        "health_check_failures": router.health_check_failures,
         "providers": router.providers
     }
 
+@app.post("/v1/health-check")
+async def manual_health_check():
+    """Manually trigger Claude Pro health check"""
+    try:
+        await router.health_check_claude_pro()
+        return {
+            "message": "Health check completed",
+            "current_provider": router.current_provider,
+            "last_health_check": router.last_health_check.isoformat() if router.last_health_check else None
+        }
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Health check failed: {str(e)}")
+
 if __name__ == "__main__":
     import uvicorn
     uvicorn.run(app, host=config.host, port=config.port)
@@ -19,6 +19,12 @@ class Config(BaseModel):
     claude_pro_base_url: str = "https://api.anthropic.com"
     claude_api_base_url: str = "https://api.anthropic.com"
 
+    # Health check settings
+    health_check_enabled: bool = True
+    health_check_cron: str = "0-4 * * * *"  # Every hour, first 5 minutes
+    health_check_message: str = "ping"
+    health_check_model: str = "claude-3-haiku-20240307"  # Use cheapest model for checks
+
     def __init__(self, **kwargs):
         super().__init__(**kwargs)
         # Load from environment or token file
@@ -3,4 +3,5 @@ uvicorn==0.24.0
 httpx==0.25.2
 pydantic==2.5.0
 anthropic==0.7.8
-python-dotenv==1.0.0
+python-dotenv==1.0.0
+apscheduler==3.10.4