feat: add dev-cicd skill + enhance dev-deploy
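
Add dev-cicd (CI/CD pipeline design, optimization, and troubleshooting):
- Gitea Actions templates (Go/iOS/Web/Docker)
- Pipeline optimization (shallow clone, caching, concurrency cancellation)
- Troubleshooting decision tree (20+ common errors)
- Security checklist + runner management

Enhance dev-deploy (deployment execution):
- Docker staging/production deployment templates
- Pre-deploy health checks (certificates, Docker, disk)
- Rollback strategies (TestFlight, Docker, database)
- Deployment monitoring (Feishu notifications, ASC API)

Total skills: 28 (dev category: 7)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>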
@@ -60,6 +60,19 @@
      ],
      "strict": false
    },
    {
      "name": "dev-cicd-plugin",
      "source": "./skills-dev/dev-cicd-plugin",
      "description": "Plugin for dev-cicd",
      "version": "1.0.0",
      "category": "development",
      "keywords": [
        "development",
        "coding",
        "workflow"
      ],
      "strict": false
    },
    {
      "name": "dev-coding-plugin",
      "source": "./skills-dev/dev-coding-plugin",

8
skills-dev/dev-cicd-plugin/.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,8 @@
{
  "name": "dev-cicd-plugin",
  "description": "Plugin for dev-cicd",
  "version": "1.0.0",
  "author": {
    "name": "qiudl"
  }
}

482
skills-dev/dev-cicd-plugin/skills/SKILL.md
Normal file
@@ -0,0 +1,482 @@
---
name: dev-cicd
description: CI/CD pipeline design, optimization, and troubleshooting. Targets a Gitea Actions + Go/Swift/Next.js/Docker stack. Activates automatically when the user mentions CI, CD, pipeline, workflow, build failures, or runner-related tasks.
---

# CI/CD Pipeline Skill (dev-cicd)

## Overview

Covers the design, optimization, and troubleshooting of Gitea Actions CI/CD pipelines. Target stack:

- **Git**: Gitea (self-hosted, GitHub Actions YAML compatible)
- **Backend**: Go (Gin + GORM)
- **iOS**: Swift 6 + SwiftUI + TCA
- **Web**: Next.js (React)
- **Container**: Docker + Docker Compose
- **Registry**: Aliyun ACR
- **Runners**: self-hosted (Linux) + macos-arm64 (iOS)

---

## Command Reference

| Command | Description |
|------|------|
| `/cicd analyze` | Analyze the current workflow for optimization opportunities |
| `/cicd troubleshoot` | Diagnose why a pipeline failed |
| `/cicd template [go\|ios\|web\|docker]` | Generate a workflow template |
| `/cicd status` | Show the status of recent workflow runs |

---

## 1. Pipeline Design

### 1.1 Monorepo Path Filtering

The repository contains several subprojects; use `paths` so that only the relevant builds are triggered:

```yaml
# .gitea/workflows/ci-cd.yml: Go + Web + Docker
on:
  push:
    branches: [develop, main]
    paths:
      - 'gateway/**'
      - 'web/**'
      - 'docker/**'
      - 'scripts/**'

# .gitea/workflows/ios-testflight.yml: iOS only, separate workflow
on:
  push:
    branches: [develop, main]
    paths:
      - 'ios/**'
```
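
When a push does not start the workflow you expected, checking the changed paths against the filters above is a quick first step. The sketch below is a hypothetical helper (`workflow_for_path` is not part of Gitea) that hard-codes the path prefixes from the two workflows shown above; adjust it if your filters differ.

```bash
# Hypothetical helper: map a changed file path to the workflow whose
# `paths` filter (as listed above) would match it.
workflow_for_path() {
  case "$1" in
    gateway/*|web/*|docker/*|scripts/*) echo "ci-cd.yml" ;;
    ios/*)                              echo "ios-testflight.yml" ;;
    *)                                  echo "none" ;;
  esac
}
```

For example, `git diff --name-only HEAD~1 HEAD | while read -r f; do workflow_for_path "$f"; done | sort -u` lists the workflows a commit should trigger. This needs at least two commits of history, so it will not work inside a `--depth 1` clone.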

### 1.2 Pipeline Structure Principles

```
Fast feedback first:
1. Static checks (lint/vet): seconds
2. Unit tests (test): 1-5 minutes
3. Build (build): 2-10 minutes
4. Integration tests (optional): 5-15 minutes
5. Release (deploy): 5-15 minutes
```

### 1.3 Go Backend Template

```yaml
jobs:
  ci:
    runs-on: self-hosted
    steps:
      - name: Checkout
        run: |
          cd ${{ github.workspace }}
          if [ -d .git ]; then
            # Reuse the persisted workspace and fetch only the latest commit.
            # FETCH_HEAD also works when origin/<branch> is not tracked by
            # the single-branch shallow clone.
            git fetch --depth 1 origin ${{ github.ref_name }}
            git reset --hard FETCH_HEAD
          else
            git clone --depth 1 --branch ${{ github.ref_name }} \
              http://xiaoqu:${{ secrets.REPO_TOKEN }}@localhost:3000/<org>/<repo>.git .
          fi

      - name: Go Vet
        run: cd gateway && go vet ./...

      - name: Go Test
        run: cd gateway && go test ./... -count=1 -timeout 120s

      - name: Go Build
        run: cd gateway && go build ./cmd/gateway/
```

### 1.4 iOS Template

```yaml
jobs:
  ios:
    runs-on: macos-arm64
    if: "!contains(github.event.head_commit.message, '[skip ci]')"
    steps:
      - name: Checkout
        run: git clone --depth 1 --branch ${{ github.ref_name }} <repo-url> .

      - name: xcodegen
        run: /opt/homebrew/bin/xcodegen generate
        working-directory: ios

      - name: Test
        run: |
          set -o pipefail
          swift test 2>&1 | tee /tmp/test.log | tail -20
        working-directory: ios

      - name: Deploy TestFlight
        env:
          KEYCHAIN_PASSWORD: ${{ secrets.KEYCHAIN_PASSWORD }}
          ASC_KEY_ID: ${{ secrets.ASC_KEY_ID }}
          ASC_ISSUER_ID: ${{ secrets.ASC_ISSUER_ID }}
        run: ./scripts/ios-testflight.sh
```

### 1.5 Web (Next.js) Template

```yaml
- name: Web Install
  run: cd web && npm ci --legacy-peer-deps

- name: Web Build
  run: cd web && npm run build

- name: Docker Build Web
  run: |
    docker build -t $REGISTRY/$WEB_IMAGE:${{ github.sha }} \
      -t $REGISTRY/$WEB_IMAGE:latest ./web
```

### 1.6 Single Job vs. Multiple Jobs

| Scenario | Choice | Reason |
|------|------|------|
| Runner capacity=1 | Single job | Multiple jobs run serially and each repeats the checkout, so the run is slower |
| Multiple runners available | Multiple jobs + `needs` | Parallel speed-up |
| Different OS (Linux + macOS) | Separate workflows | Different runner labels |

**Current recommendation**: a single job on the Linux runner (Go + Web + Docker) and a single job on the macOS runner (iOS).

---

## 2. Optimization

### 2.1 Shallow Clone

```bash
# First clone
git clone --depth 1 --branch ${{ github.ref_name }} <url> .

# Incremental fetch (FETCH_HEAD is always set, even for an untracked branch)
git fetch --depth 1 origin ${{ github.ref_name }}
git reset --hard FETCH_HEAD
```

**Effect**: when the repository contains many binary files, clone time drops from 30s+ to 3-5s.

**Note**: if a job needs to push, run `git fetch --unshallow` first.

### 2.2 Dependency Caching

This setup does not use `actions/cache`; a self-hosted runner can rely on its local disk instead:

```yaml
# Go modules: global cache on the runner
env:
  GOMODCACHE: /opt/runner-cache/go/mod
  GOCACHE: /opt/runner-cache/go/build

# npm: persist node_modules
# The self-hosted runner's workspace is kept between runs
- run: |
    if [ -f web/node_modules/.cache-hash ] && \
       [ "$(cat web/node_modules/.cache-hash)" = "$(md5sum web/package-lock.json | cut -d' ' -f1)" ]; then
      echo "npm cache hit, skip install"
    else
      cd web && npm ci --legacy-peer-deps
      md5sum package-lock.json | cut -d' ' -f1 > node_modules/.cache-hash
    fi

# SPM: Xcode caches packages in DerivedData automatically, and the self-hosted runner keeps it
```

### 2.3 Concurrency Cancellation

Prevent runs from queuing up when the same branch is pushed several times in a row:

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```

### 2.4 Conditional Skip

```yaml
# Skip automatic commits made by the CI bot
if: "!contains(github.event.head_commit.message, '[skip ci]')"

# Deploy only from the develop branch
if: github.ref == 'refs/heads/develop'
```

### 2.5 Reusing Build Artifacts

```yaml
# Build once, use in deploy
- name: Build
  run: go build -o /tmp/gateway ./cmd/gateway/

- name: Docker Build
  run: |
    # Use the prebuilt binary instead of recompiling inside Docker
    cp /tmp/gateway docker/
    docker build -f docker/gateway.prebuilt.Dockerfile -t $IMAGE .
```

---

## 3. Troubleshooting

### 3.1 Decision Tree

```
Pipeline failed
├── Workflow did not trigger
│   ├── Check the paths filter → changes are outside the matched paths
│   ├── Check the branch filter → branch name does not match
│   ├── Check for [skip ci] → commit message contains the skip marker
│   └── Runner offline → check status under Gitea Admin > Runners
│
├── Checkout failed
│   ├── "Authentication failed" → REPO_TOKEN secret expired or invalid
│   ├── "Connection refused :3000" → Gitea service is not running
│   └── Checkout is slow → add --depth 1 for a shallow clone
│
├── Go build failed
│   ├── "module not found" → check GOPROXY / run go mod tidy
│   ├── "cannot find package" → go.sum is incomplete
│   └── "go: version mismatch" → Go version on the runner does not match go.mod
│
├── iOS build failed
│   ├── "Macro must be enabled" → add -skipMacroValidation
│   ├── "cannot find type" → xcodegen generate was not run
│   ├── "errSecInternalComponent" → unlock-keychain + set-key-partition-list
│   ├── "No signing certificate" → sign in under Xcode > Accounts and download the certificate
│   ├── "Redundant Binary Upload" → increment CURRENT_PROJECT_VERSION
│   └── "Missing required icon" → Assets.xcassets lacks the 1024x1024 icon
│
├── Docker build failed
│   ├── "Cannot connect to daemon" → Docker Desktop is not running
│   ├── "unauthorized" → docker login credentials expired
│   └── "no space left" → docker system prune
│
└── Deployment failed
    ├── "Connection refused" (SSH) → check the target server's SSH port/key
    ├── "health check failed" → the app starts slowly; increase retries or wait time
    └── "port already in use" → run docker compose down to stop the old containers first
```
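
The tree above can be turned into a first-pass log triage. The following is a minimal sketch, not an existing tool: `diagnose` is a hypothetical helper that matches a single log line against a handful of the error strings listed above and prints the corresponding hint.

```bash
# Hypothetical helper: print a first hint for one log line, using error
# strings and fixes taken from the decision tree above.
diagnose() {
  case "$1" in
    *"Authentication failed"*)   echo "REPO_TOKEN secret expired or invalid" ;;
    *"errSecInternalComponent"*) echo "unlock-keychain + set-key-partition-list" ;;
    *"Redundant Binary Upload"*) echo "increment CURRENT_PROJECT_VERSION" ;;
    *"no space left"*)           echo "docker system prune" ;;
    *"port already in use"*)     echo "docker compose down first" ;;
    *)                           echo "no known match" ;;
  esac
}
```

Feeding it a saved log line by line, for example `while IFS= read -r line; do diagnose "$line"; done < /tmp/test.log | sort -u`, gives a short list of candidate causes; extending the `case` arms with the remaining entries of the tree is straightforward.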

### 3.2 Common Error Quick Reference

| Error | Cause | Fix |
|------|------|------|
| `errSecInternalComponent` | SSH session cannot access the Keychain | `security unlock-keychain` + `set-key-partition-list` |
| `Macro "X" must be enabled` | Swift Macros security restriction | `-skipMacroValidation` |
| `cannot find type 'Foo'` | xcodeproj does not include the new file | `xcodegen generate` |
| `Redundant Binary Upload` | Duplicate build number | Increment `CURRENT_PROJECT_VERSION` |
| `Cloud signing permission error` | API key lacks permission or the Issuer ID is wrong | Use manual signing + a local profile |
| `HTTP 401 Unauthorized` (ASC API) | JWT is missing the `kid` header | `headers={"kid": KEY_ID}` |
| `No profiles for bundle id` | No distribution profile | Create and install one in Apple Developer |
| `missing icon file 120x120` | No App Icon asset | Create Assets.xcassets + AppIcon |
| `UIInterfaceOrientation` iPad | Missing iPad orientation declarations | All four orientations + `UIRequiresFullScreen` |

### 3.3 Debugging Tips

```bash
# Check Gitea runner status
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners"

# List recent workflow runs (URL quoted so the shell does not glob the "?")
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5"

# Reproduce the CI environment locally
# Go
docker run -v "$(pwd)":/app -w /app golang:1.25 go build ./cmd/gateway/

# iOS: macOS only
ssh bjwework "cd ~/workspace/xiaoqu-ai/ios && swift test"
```

---

## 4. Security

### 4.1 Secrets Management

```bash
# Configure secrets through the Gitea API (do not edit workflow files by hand)
curl -X PUT -H "Authorization: token <ADMIN_TOKEN>" \
  -H "Content-Type: application/json" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/secrets/<NAME>" \
  -d '{"data": "<VALUE>"}'
```

**Required secrets**:

| Secret | Purpose | Rotation |
|--------|------|---------|
| `REPO_TOKEN` | Git clone authentication | As needed |
| `ACR_USERNAME` / `ACR_PASSWORD` | Docker image push | 90 days |
| `SSH_PRIVATE_KEY` | Server deployment | As needed |
| `KEYCHAIN_PASSWORD` | Unlocking the macOS signing keychain | When the password changes |
| `ASC_KEY_ID` / `ASC_ISSUER_ID` | App Store Connect | As needed |
| `FEISHU_WEBHOOK` | Notifications | Does not expire |

### 4.2 Leak-Prevention Checklist

- [ ] `.gitignore` includes `.env`, `*.p8`, `*.pem`, `*.mobileprovision`
- [ ] No hard-coded passwords or tokens in workflows (everything goes through `${{ secrets.* }}`)
- [ ] Scripts use `${VAR:?error}` to require environment variables (no default values that expose credentials)
- [ ] Docker images do not contain `.env` files (the build context has a `.dockerignore`)
- [ ] Git remote URLs contain no token (inject it from secrets)
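
The `${VAR:?error}` item can be illustrated with a short sketch. `require_env` is a hypothetical helper name, and the variable names in the usage line are taken from the secrets table above; the inline `${VAR:?}` form in the trailing comment is standard POSIX shell behavior.

```bash
# Hypothetical helper: fail fast when any named variable is unset or empty.
require_env() {
  for name in "$@"; do
    # Indirect lookup via eval keeps the function POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "ERROR: $name is required" >&2
      return 1
    fi
  done
}

# Inline form for a single variable; aborts the script with the message:
# : "${ACR_PASSWORD:?ACR_PASSWORD is required}"
```

A deploy script might start with `require_env ACR_USERNAME ACR_PASSWORD || exit 1`, so a missing secret stops the run before any image is built.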

### 4.3 Pre-Commit Check

```bash
# Scan the files about to be committed for possible secrets
# (-z / -0 keep file names with spaces intact; deleted files are skipped)
git diff --cached --name-only --diff-filter=ACM -z | xargs -0 grep -lE \
  '(PRIVATE KEY|password|secret|token|apikey)' 2>/dev/null
```
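
Scanning whole files also flags keywords that were already in the repository. A possible refinement, sketched here under the assumption that only newly added lines matter, is to scan the staged diff instead. `scan_added_lines` is a hypothetical helper that reads a unified diff on stdin.

```bash
# Hypothetical helper: read a unified diff on stdin and print the added
# lines that match the same keyword list as the grep above.
scan_added_lines() {
  # Keep only added lines ("+..."), then drop the "+++ b/file" headers.
  grep '^+' | grep -v '^+++ ' | \
    grep -Ei '(PRIVATE KEY|password|secret|token|apikey)'
}
```

In a pre-commit hook this could be used as `if git diff --cached -U0 | scan_added_lines; then echo "Possible secret in staged changes" >&2; exit 1; fi`. The keyword match is deliberately broad, so expect false positives.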

---

## 5. Monitoring

### 5.1 Checking Pipeline Status

```bash
# Recent runs
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5" | \
  python3 -c "
import json, sys
for r in json.load(sys.stdin).get('workflow_runs', []):
    print(f\"{r['id']} | {r['display_title'][:40]} | {r['status']} | {r['conclusion']}\")
"
```

### 5.2 Feishu Notification Template

```yaml
# Success/failure notification (last step of the workflow, with if: always())
# Note: a commit message containing quotes or newlines breaks this JSON;
# escape it before embedding it.
- name: Notify
  if: always()
  run: |
    STATUS="${{ job.status }}"
    EMOJI=$([ "$STATUS" = "success" ] && echo "✅" || echo "❌")
    COLOR=$([ "$STATUS" = "success" ] && echo "green" || echo "red")
    cat > /tmp/notify.json << EOF
    {
      "msg_type": "interactive",
      "card": {
        "header": {
          "title": {"tag": "plain_text", "content": "$EMOJI <App> $STATUS"},
          "template": "$COLOR"
        },
        "elements": [{
          "tag": "div",
          "text": {"tag": "lark_md", "content": "**Branch**: ${{ github.ref_name }}\n**Commit**: ${{ github.sha }}\n**Trigger**: ${{ github.event.head_commit.message }}"}
        }]
      }
    }
    EOF
    curl -s -X POST "${{ secrets.FEISHU_WEBHOOK }}" \
      -H "Content-Type: application/json" -d @/tmp/notify.json || true
```
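
Free text such as a commit message has to be escaped before it is placed inside a JSON string. The sketch below is one possible approach using only `sed` and `awk`; `json_escape` is a hypothetical helper, and it handles backslashes, double quotes, and newlines only (tabs and other control characters are left untouched).

```bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Handles backslash, double quote, and newline only.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}
```

A workflow step could then pass the message through an environment variable (a hypothetical `COMMIT_MSG`) and embed `$(json_escape "$COMMIT_MSG")` in the payload. This also keeps the raw message out of the script text itself.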

### 5.3 Build Time Tracking

Add timestamps at the start and end of the workflow:

```yaml
steps:
  - name: Start Timer
    run: echo "BUILD_START=$(date +%s)" >> $GITHUB_ENV

  # ... build steps ...

  - name: Report Duration
    if: always()
    run: |
      DURATION=$(( $(date +%s) - $BUILD_START ))
      echo "Build duration: ${DURATION}s"
```
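
Raw seconds become hard to read for longer builds. As a small illustration, the hypothetical helper `format_duration` below converts a number of seconds into minutes and seconds; it is a sketch and not part of the workflow above.

```bash
# Hypothetical helper: format a duration in seconds as "<minutes>m<seconds>s".
format_duration() {
  total=$1
  printf '%dm%02ds\n' $((total / 60)) $((total % 60))
}
```

In the Report Duration step this could be used as `echo "Build duration: $(format_duration "$DURATION")"`.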

---

## 6. Runner Management

### 6.1 Runner Types

| Runner | Label | Purpose | Location |
|--------|------|------|------|
| xiaoqu-runner | `self-hosted` | Go + Web + Docker | Alibaba Cloud 39.104.65.241 |
| bjwework-macos | `macos-arm64` | iOS + Swift | Tailscale 100.69.230.116 |

### 6.2 Adding a Runner

```bash
# 1. Get a registration token
curl -s -H "Authorization: token <ADMIN_TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners/registration-token"

# 2. Register
./act_runner register --no-interactive \
  --instance http://<gitea> \
  --token <TOKEN> \
  --name <NAME> \
  --labels <LABEL>:host

# 3. Start (launchd on macOS)
launchctl load ~/Library/LaunchAgents/com.gitea.act-runner.plist
```

### 6.3 Runner Health Check

```bash
# Check the runner process
ssh bjwework "launchctl list | grep act-runner"

# Check the runner log
ssh bjwework "tail -20 ~/act_runner/runner.log"

# Check runner status in Gitea
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners" | \
  python3 -c "import json,sys; [print(f\"{r['name']} | {r['status']}\") for r in json.load(sys.stdin)]"
```

---

## 7. Workflow Template Generation

### `/cicd template go`

Generates a Go backend CI workflow: vet → test → build → docker → deploy.

### `/cicd template ios`

Generates an iOS TestFlight workflow: xcodegen → test → archive → upload → notify.

### `/cicd template web`

Generates a Next.js CI workflow: install → build → docker → deploy.

### `/cicd template docker`

Generates a multi-service Docker build+push workflow: ACR login → multi-image build → SSH deploy.

---

## 8. Relationship to Other Skills

| Skill | Collaboration point |
|------|--------|
| `dev-deploy` | `/deploy ios` runs the TestFlight deployment; `/deploy docker` runs the container deployment |
| `dev-coding` | Triggers CI once development is finished |
| `req` | `/req deploy` for project-level batch deployment |
| `pull-request` | PRs trigger CI checks |
| `req-test-gate` | Test gate inside CI |

@@ -273,19 +273,468 @@ xcodebuild -exportArchive \

---

## Docker Staging/Production Deployment

### Architecture Overview

```
develop push → Build Image → Push ACR → SSH Deploy (staging) → Health Check
main push    → Build Image → Push ACR → manual approval → SSH Deploy (prod) → Health Check
```

| Component | Description |
|------|------|
| Server | 39.104.87.246 (Alibaba Cloud ECS) |
| Registry | Aliyun ACR: `crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com` |
| Images | `xiaoqu-gateway`, `xiaoqu-web` |
| SSH key | `~/.ssh/xiaoqu.pem` |
| Deployment method | Docker Compose |

### Full Deployment Flow

```
1. Build the image locally → docker build -t <image>:<tag>
2. Push to ACR → docker push <registry>/<image>:<tag>
3. SSH to the server → docker compose pull + up -d
4. Health check → curl /health
5. Notify → send the deployment result through the Feishu webhook
```

### Staging Deployment (triggered automatically from develop)

Pushing to the `develop` branch triggers a staging deployment automatically. The flow:

```bash
# 1. Build images (tag = first 8 characters of the commit SHA)
TAG=$(git rev-parse --short=8 HEAD)
REGISTRY=crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com

docker build -t $REGISTRY/xiaoqu-gateway:$TAG -f gateway/Dockerfile .
docker build -t $REGISTRY/xiaoqu-web:$TAG -f web/Dockerfile .

# 2. Push to ACR
docker push $REGISTRY/xiaoqu-gateway:$TAG
docker push $REGISTRY/xiaoqu-web:$TAG

# 3. Deploy over SSH
ssh -i ~/.ssh/xiaoqu.pem root@39.104.87.246 "
  cd /opt/xiaoqu/staging
  export IMAGE_TAG=$TAG
  docker compose pull
  docker compose up -d
"

# 4. Health check (fail the step instead of only printing a message)
sleep 10
curl -sf http://39.104.87.246:8080/health || { echo 'Health check failed!' >&2; exit 1; }
```

### Production Deployment (manual approval)

Production deployments require manual confirmation and are never triggered automatically:

```bash
# Use the build-and-push script
./scripts/build-and-push.sh prod --detect --deploy --wait --verify

# Or run the steps manually:
TAG=v1.2.3  # use a semantic version
REGISTRY=crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com

# Build + push
docker build -t $REGISTRY/xiaoqu-gateway:$TAG -f gateway/Dockerfile .
docker build -t $REGISTRY/xiaoqu-web:$TAG -f web/Dockerfile .
docker push $REGISTRY/xiaoqu-gateway:$TAG
docker push $REGISTRY/xiaoqu-web:$TAG

# Deploy (production directory)
ssh -i ~/.ssh/xiaoqu.pem root@39.104.87.246 "
  cd /opt/xiaoqu/production
  export IMAGE_TAG=$TAG
  docker compose pull
  docker compose up -d
"

# Verify
curl -sf http://39.104.87.246/health && echo 'Production deploy OK'
```

### build-and-push Script Template

```bash
#!/bin/bash
# scripts/build-and-push.sh
set -euo pipefail

ENV=${1:-staging}
REGISTRY=crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com
SERVER=39.104.87.246
SSH_KEY=~/.ssh/xiaoqu.pem
IMAGES=(xiaoqu-gateway xiaoqu-web)

# Determine the tag and the deploy directory.
# This template does not parse flags such as --detect, so a second
# argument is used as the tag only when it does not start with "--".
if [ "$ENV" = "prod" ]; then
  if [ -n "${2:-}" ] && [ "${2#--}" = "$2" ]; then
    TAG=$2
  else
    TAG=$(git describe --tags --abbrev=0)
  fi
  # "prod" maps to the production directory used elsewhere in this document
  DEPLOY_DIR=/opt/xiaoqu/production
else
  TAG=$(git rev-parse --short=8 HEAD)
  DEPLOY_DIR=/opt/xiaoqu/$ENV
fi

echo "=== Deploying to $ENV with tag $TAG ==="

# Build
for img in "${IMAGES[@]}"; do
  echo "Building $img..."
  docker build -t "$REGISTRY/$img:$TAG" -f "${img#xiaoqu-}/Dockerfile" .
done

# Push
for img in "${IMAGES[@]}"; do
  echo "Pushing $img..."
  docker push "$REGISTRY/$img:$TAG"
done

# Deploy
ssh -i "$SSH_KEY" "root@$SERVER" "
  cd $DEPLOY_DIR
  export IMAGE_TAG=$TAG
  docker compose pull
  docker compose up -d --remove-orphans
"

# Health check (3 attempts)
echo "Waiting for health check..."
for i in 1 2 3; do
  sleep 5
  if curl -sf "http://$SERVER/health" > /dev/null 2>&1; then
    echo "✓ Health check passed"
    exit 0
  fi
  echo "Attempt $i failed, retrying..."
done

echo "✗ Health check failed after 3 attempts"
exit 1
```
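
Because production tags are expected to be semantic versions such as `v1.2.3`, the script could reject anything else before building. The sketch below assumes the `vMAJOR.MINOR.PATCH` form used in this document; `is_release_tag` is a hypothetical helper, not part of the template above.

```bash
# Hypothetical helper: succeed only for tags of the form vMAJOR.MINOR.PATCH.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}
```

A guard such as `is_release_tag "$TAG" || { echo "Refusing to deploy non-release tag: $TAG" >&2; exit 1; }` placed in the `prod` branch would stop a short SHA or a stray flag from being pushed as a production tag.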

### Docker Compose Example

```yaml
# docker-compose.yml
version: "3.8"
services:
  gateway:
    image: crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com/xiaoqu-gateway:${IMAGE_TAG:-latest}
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://...
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped

  web:
    image: crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com/xiaoqu-web:${IMAGE_TAG:-latest}
    ports:
      - "3000:3000"
    depends_on:
      gateway:
        condition: service_healthy
    restart: unless-stopped
```

---

## Pre-Deploy Health Checks

Run preflight checks before deploying so that a deployment that is bound to fail does not waste time.

### iOS Preflight

```bash
preflight_ios() {
  local errors=0

  # Check for a Distribution certificate.
  # errors=$((errors + 1)) is used instead of ((errors++)), which returns a
  # non-zero status when errors is 0 and would abort a script under `set -e`.
  if ! security find-identity -v -p codesigning | grep -q "Apple Distribution"; then
    echo "ERROR: Apple Distribution certificate is not installed"
    errors=$((errors + 1))
  fi

  # Check provisioning profile expiry
  local profile_dir="$HOME/Library/MobileDevice/Provisioning Profiles"
  if [ -d "$profile_dir" ]; then
    for profile in "$profile_dir"/*.mobileprovision; do
      local expiry
      expiry=$(security cms -D -i "$profile" 2>/dev/null | plutil -extract ExpirationDate raw - 2>/dev/null)
      if [ -n "$expiry" ]; then
        local expiry_epoch
        expiry_epoch=$(date -j -f "%Y-%m-%dT%H:%M:%SZ" "$expiry" "+%s" 2>/dev/null)
        local now_epoch
        now_epoch=$(date "+%s")
        # Skip the comparison when the date could not be parsed
        if [ -n "$expiry_epoch" ] && [ "$expiry_epoch" -lt "$now_epoch" ]; then
          echo "WARNING: profile has expired: $(basename "$profile")"
          errors=$((errors + 1))
        fi
      fi
    done
  else
    echo "ERROR: Provisioning Profiles directory does not exist"
    errors=$((errors + 1))
  fi

  # Check the API key
  if [ ! -f "${API_KEY_PATH:-/dev/null}" ]; then
    echo "ERROR: ASC API key (.p8) file does not exist: ${API_KEY_PATH:-}"
    errors=$((errors + 1))
  fi

  # Check Xcode
  if ! xcode-select -p > /dev/null 2>&1; then
    echo "ERROR: Xcode Command Line Tools are not installed"
    errors=$((errors + 1))
  fi

  if [ "$errors" -gt 0 ]; then
    echo "iOS preflight failed: $errors problem(s)"
    return 1
  fi
  echo "iOS preflight passed"
  return 0
}
```

### Docker Preflight

```bash
preflight_docker() {
  local errors=0

  # Check the Docker daemon
  if ! docker info > /dev/null 2>&1; then
    echo "ERROR: Docker daemon is not running"
    errors=$((errors + 1))
  fi

  # Check that the ACR registry is reachable.
  # The dummy login is expected to fail, but not with "connection refused".
  local registry=crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com
  if docker login "$registry" --username dummy --password dummy 2>&1 | grep -qi "connection refused"; then
    echo "WARNING: ACR registry may be unreachable (will be verified at push time)"
  fi

  # Check SSH connectivity
  if ! ssh -i ~/.ssh/xiaoqu.pem -o ConnectTimeout=5 -o BatchMode=yes root@39.104.87.246 "echo ok" > /dev/null 2>&1; then
    echo "ERROR: cannot SSH to deployment server 39.104.87.246"
    errors=$((errors + 1))
  fi

  # Check disk space on the server
  local disk_usage
  disk_usage=$(ssh -i ~/.ssh/xiaoqu.pem root@39.104.87.246 "df -h / | tail -1 | awk '{print \$5}' | tr -d '%'" 2>/dev/null)
  if [ -n "$disk_usage" ] && [ "$disk_usage" -gt 85 ]; then
    echo "WARNING: server disk usage is ${disk_usage}% (consider docker system prune)"
  fi

  # Check local disk space
  local local_disk
  local_disk=$(df -h . | tail -1 | awk '{print $5}' | tr -d '%')
  if [ -n "$local_disk" ] && [ "$local_disk" -gt 90 ]; then
    echo "ERROR: local disk usage is ${local_disk}%, not enough space"
    errors=$((errors + 1))
  fi

  if [ "$errors" -gt 0 ]; then
    echo "Docker preflight failed: $errors problem(s)"
    return 1
  fi
  echo "Docker preflight passed"
  return 0
}
```
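
The disk checks above parse `df` output inline. As a sketch of how that parsing could be isolated and tested, the hypothetical helpers below read `df -P` output (the POSIX single-line format) from stdin and compare the result with a threshold; they are illustrations and not part of the function above.

```bash
# Hypothetical helper: read `df -P` output on stdin and print the use
# percentage of the last listed filesystem as a bare number.
disk_usage_percent() {
  awk 'NR > 1 { gsub(/%/, "", $5); pct = $5 } END { print pct }'
}

# Hypothetical helper: succeed when usage ($1) is above the threshold ($2).
disk_over_threshold() {
  [ "$1" -gt "$2" ]
}
```

Usage might look like `usage=$(df -P / | disk_usage_percent)` followed by `disk_over_threshold "$usage" 85 && echo "WARNING: disk usage is ${usage}%"`.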

---

## Rollback Strategy

### iOS TestFlight Rollback

TestFlight **cannot truly roll back** a version that is already installed, but the following emergency measures are available:

| Measure | Description | API |
|------|------|-----|
| Stop distribution | Remove the build from testing so users no longer receive the update | `PATCH /v1/builds/{id}` with `expired: true` |
| Expire the build | Force the faulty build to expire | Same as above |
| Emergency hotfix | Build a new version that supersedes the faulty one | Regular deployment flow |

```bash
# Stop distributing a build through the ASC API
curl -X PATCH "https://api.appstoreconnect.apple.com/v1/builds/$BUILD_ID" \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"data":{"type":"builds","id":"'"$BUILD_ID"'","attributes":{"expired":true}}}'
```

### Docker Rollback

Rolling back Docker is comparatively simple: pull the image of the last known-good version and redeploy.

```bash
# 1. Identify the last known-good tag
PREVIOUS_TAG=<previous-good-tag>
REGISTRY=crpi-q4nnuivosic0zc98.cn-beijing.personal.cr.aliyuncs.com

# 2. Roll back on the server
ssh -i ~/.ssh/xiaoqu.pem root@39.104.87.246 "
  cd /opt/xiaoqu/production  # or /opt/xiaoqu/staging
  export IMAGE_TAG=$PREVIOUS_TAG
  docker compose pull
  docker compose up -d
"

# 3. Verify the rollback
curl -sf http://39.104.87.246/health && echo 'Rollback OK'
```

### Database Rollback Notes

| Scenario | Strategy |
|------|------|
| Reversible migration (add column, add table) | After the deployment is rolled back the database needs no rollback; old code ignores the new columns |
| Irreversible migration (drop column, change type) | **Roll back the migration first, then the code**; otherwise the old code fails |
| Data migration | Assess whether a compensation script is needed; take a backup snapshot before migrating |

```bash
# Database migration rollback example (if golang-migrate is used).
# The command runs through `sh -c` so that DATABASE_URL is expanded inside
# the container, where the compose file defines it.
ssh -i ~/.ssh/xiaoqu.pem root@39.104.87.246 "
  cd /opt/xiaoqu/production  # or /opt/xiaoqu/staging
  docker compose exec -T gateway sh -c 'migrate -path /migrations -database \"\$DATABASE_URL\" down 1'
"
```

---

## Deployment Monitoring

### Post-Deploy Health Check Pattern

```bash
# Generic post-deploy verification function
post_deploy_verify() {
  local url=$1
  local max_retries=${2:-5}
  local interval=${3:-10}

  echo "Verifying deployment at $url ..."
  for i in $(seq 1 "$max_retries"); do
    local status
    # curl itself prints 000 when the connection fails, so no fallback echo
    # is appended (it would produce values such as "000000").
    status=$(curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null) || true
    if [ "$status" = "200" ]; then
      echo "Health check passed (attempt $i/$max_retries)"
      return 0
    fi
    echo "Attempt $i/$max_retries: status=$status, retrying in ${interval}s..."
    sleep "$interval"
  done
  echo "Health check FAILED after $max_retries attempts"
  return 1
}

# Usage example
post_deploy_verify "http://39.104.87.246/health" 5 10
```

### Feishu Notification Template

Send a notification through the Feishu webhook once the deployment finishes:

```bash
# Deployment notification
send_feishu_deploy_notification() {
  local env=$1       # staging / production
  local version=$2   # version number or tag
  local status=$3    # success / failure
  local detail=$4    # extra details

  local WEBHOOK_URL="<Feishu group webhook URL>"

  local color emoji
  if [ "$status" = "success" ]; then
    color="green"
    emoji="✅"
  else
    color="red"
    emoji="❌"
  fi

  curl -s -X POST "$WEBHOOK_URL" \
    -H "Content-Type: application/json" \
    -d '{
      "msg_type": "interactive",
      "card": {
        "header": {
          "title": {"tag": "plain_text", "content": "'"$emoji"' Deployment notification - '"$env"'"},
          "template": "'"$color"'"
        },
        "elements": [
          {"tag": "div", "text": {"tag": "lark_md", "content": "**Environment**: '"$env"'\n**Version**: '"$version"'\n**Status**: '"$status"'\n**Time**: '"$(date '+%Y-%m-%d %H:%M:%S')"'\n**Details**: '"$detail"'"}}
        ]
      }
    }'
}

# Usage examples
send_feishu_deploy_notification "production" "v1.2.3" "success" "Gateway + Web deployed"
send_feishu_deploy_notification "staging" "abc12345" "failure" "Health check timed out"
```

### iOS TestFlight Build Status Monitoring

Poll the ASC API to follow the build's processing state:

```bash
# Monitor the processing state of a TestFlight build
monitor_testflight_build() {
  local build_id=$1
  local jwt_token=$2
  local max_wait=600  # wait at most 10 minutes
  local elapsed=0

  while [ "$elapsed" -lt "$max_wait" ]; do
    local response
    response=$(curl -s "https://api.appstoreconnect.apple.com/v1/builds/$build_id" \
      -H "Authorization: Bearer $jwt_token")

    local state
    state=$(echo "$response" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['attributes']['processingState'])" 2>/dev/null)

    echo "[$(date '+%H:%M:%S')] Build $build_id: $state"

    case "$state" in
      VALID)
        echo "Build processed and available for testing"
        return 0
        ;;
      FAILED|INVALID)
        echo "Build processing failed: $state"
        return 1
        ;;
      PROCESSING)
        sleep 30
        elapsed=$((elapsed + 30))
        ;;
      *)
        sleep 15
        elapsed=$((elapsed + 15))
        ;;
    esac
  done

  echo "Build processing timed out (${max_wait}s)"
  return 1
}
```

---