ai-proj-helper/skills-dev/dev-cicd-plugin/skills/SKILL.md
dongliang a58dc39795 feat: add dev-cicd skill + enhance dev-deploy
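Added dev-cicd (CI/CD pipeline design / optimization / troubleshooting):
- Gitea Actions templates (Go/iOS/Web/Docker)
- Pipeline optimization (shallow clone / caching / concurrency cancellation)
- Troubleshooting decision tree (20+ common errors)
- Security checklist + runner management

Enhanced dev-deploy (deployment execution):
- Docker staging/production deployment templates
- Pre-deployment health checks (certificates / Docker / disk)
- Rollback strategies (TestFlight / Docker / database)
- Deployment monitoring (Feishu notifications / ASC API)

Total skills: 28 (dev category: 7)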

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:10:13 +09:30

---
name: dev-cicd
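description: CI/CD pipeline design, optimization, and troubleshooting for a Gitea Actions + Go/Swift/Next.js/Docker stack. Activates automatically when the user mentions CI, CD, pipeline, workflow, build failure, or runner-related tasks.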
---
# CI/CD Pipeline Skill (dev-cicd)
## Overview
Manages the design, optimization, and troubleshooting of Gitea Actions CI/CD pipelines. Target stack:
- **Git**: Gitea (self-hosted, compatible with GitHub Actions YAML)
- **Backend**: Go (Gin + GORM)
- **iOS**: Swift 6 + SwiftUI + TCA
- **Web**: Next.js (React)
- **Container**: Docker + Docker Compose
- **Registry**: Aliyun ACR
- **Runners**: self-hosted (Linux) + macos-arm64 (iOS)
---
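## Command Reference
| Command | Description |
|---------|-------------|
| `/cicd analyze` | Analyze the current workflows for optimization opportunities |
| `/cicd troubleshoot` | Diagnose why a pipeline failed |
| `/cicd template [go\|ios\|web\|docker]` | Generate a workflow template |
| `/cicd status` | Show the status of recent workflow runs |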
---
## 1. Pipeline Design
### 1.1 Monorepo Path Filtering
The repository contains several subprojects; use `paths` so that a push triggers only the relevant builds:
```yaml
# .gitea/workflows/ci-cd.yml: Go + Web + Docker
on:
  push:
    branches: [develop, main]
    paths:
      - 'gateway/**'
      - 'web/**'
      - 'docker/**'
      - 'scripts/**'

# .gitea/workflows/ios-testflight.yml: iOS only
on:
  push:
    branches: [develop, main]
    paths:
      - 'ios/**'
```
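To predict which workflow a given push would trigger, the path filters can be approximated locally. The sketch below is a rough illustration, not the runner's actual matcher: it uses Python's `fnmatch`, whose `*` also crosses `/`, and the names `WORKFLOW_PATHS` and `triggered_workflows` are hypothetical.

```python
from fnmatch import fnmatchcase

# Path filters copied from the two workflows above.
WORKFLOW_PATHS = {
    "ci-cd.yml": ["gateway/**", "web/**", "docker/**", "scripts/**"],
    "ios-testflight.yml": ["ios/**"],
}


def triggered_workflows(changed_files):
    """Return the workflows whose `paths` filter matches at least one changed file."""
    hits = []
    for workflow, patterns in WORKFLOW_PATHS.items():
        # fnmatch is only an approximation of the runner's glob semantics.
        if any(fnmatchcase(f, p) for f in changed_files for p in patterns):
            hits.append(workflow)
    return hits


if __name__ == "__main__":
    print(triggered_workflows(["gateway/cmd/gateway/main.go", "README.md"]))
```

A push that touches only files outside every filter (for example a top-level `README.md`) returns an empty list, which corresponds to the "workflow did not trigger" case.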
### 1.2 Pipeline Structure Principles
```
Fast feedback first:
1. Static checks (lint/vet): seconds
2. Unit tests (test): 1-5 minutes
3. Build (build): 2-10 minutes
4. Integration tests (optional): 5-15 minutes
5. Release (deploy): 5-15 minutes
```
### 1.3 Go Backend Template
```yaml
jobs:
  ci:
    runs-on: self-hosted
    steps:
      - name: Checkout
        run: |
          cd ${{ github.workspace }}
          if [ -d .git ]; then
            git fetch --depth 1 origin ${{ github.ref_name }}
            # FETCH_HEAD also works when the workspace was first cloned from another
            # branch and origin/<branch> does not exist in the shallow clone
            git reset --hard FETCH_HEAD
          else
            git clone --depth 1 --branch ${{ github.ref_name }} \
              http://xiaoqu:${{ secrets.REPO_TOKEN }}@localhost:3000/<org>/<repo>.git .
          fi
      - name: Go Vet
        run: cd gateway && go vet ./...
      - name: Go Test
        run: cd gateway && go test ./... -count=1 -timeout 120s
      - name: Go Build
        run: cd gateway && go build ./cmd/gateway/
```
### 1.4 iOS Template
```yaml
jobs:
  ios:
    runs-on: macos-arm64
    if: "!contains(github.event.head_commit.message, '[skip ci]')"
    steps:
      - name: Checkout
        run: git clone --depth 1 --branch ${{ github.ref_name }} <repo-url> .
      - name: xcodegen
        run: /opt/homebrew/bin/xcodegen generate
        working-directory: ios
      - name: Test
        run: |
          set -o pipefail
          swift test 2>&1 | tee /tmp/test.log | tail -20
        working-directory: ios
      - name: Deploy TestFlight
        env:
          KEYCHAIN_PASSWORD: ${{ secrets.KEYCHAIN_PASSWORD }}
          ASC_KEY_ID: ${{ secrets.ASC_KEY_ID }}
          ASC_ISSUER_ID: ${{ secrets.ASC_ISSUER_ID }}
        run: ./scripts/ios-testflight.sh
```
### 1.5 Web (Next.js) Template
```yaml
- name: Web Install
  run: cd web && npm ci --legacy-peer-deps
- name: Web Build
  run: cd web && npm run build
- name: Docker Build Web
  run: |
    docker build -t $REGISTRY/$WEB_IMAGE:${{ github.sha }} \
      -t $REGISTRY/$WEB_IMAGE:latest ./web
```
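### 1.6 Single Job vs. Multiple Jobs
| Scenario | Choice | Reason |
|----------|--------|--------|
| Runner capacity = 1 | Single job | Multiple jobs run serially and each repeats the checkout, so the pipeline gets slower |
| Multiple runners available | Multiple jobs + `needs` | Parallel execution speeds things up |
| Different OSes (Linux + macOS) | Separate workflows | Each needs a different runner label |

**Current recommendation**: a single job on the Linux runner (Go + Web + Docker) and a single job on the macOS runner (iOS).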
---
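## 2. Optimization
### 2.1 Shallow Clone
```yaml
# First clone
git clone --depth 1 --branch ${{ github.ref_name }} <url> .
# Incremental fetch (FETCH_HEAD is always set, even if origin/<branch> is not tracked)
git fetch --depth 1 origin ${{ github.ref_name }}
git reset --hard FETCH_HEAD
```
**Effect**: when the repository contains many binary files, clone time drops from 30s+ to 3-5s.

**Note**: if the job needs to push, run `git fetch --unshallow` first.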
### 2.2 Dependency Caching
This setup does not rely on `actions/cache`; a self-hosted runner can use its local disk instead:
```yaml
# Go modules: global cache on the runner
env:
  GOMODCACHE: /opt/runner-cache/go/mod
  GOCACHE: /opt/runner-cache/go/build
# npm: reuse the persisted node_modules
# (a self-hosted runner keeps its workspace between runs)
- run: |
    if [ -f web/node_modules/.cache-hash ] && \
       [ "$(cat web/node_modules/.cache-hash)" = "$(md5sum web/package-lock.json | cut -d' ' -f1)" ]; then
      echo "npm cache hit, skip install"
    else
      cd web && npm ci --legacy-peer-deps
      md5sum package-lock.json | cut -d' ' -f1 > node_modules/.cache-hash
    fi
# SPM: Xcode caches packages in DerivedData automatically, and the self-hosted runner keeps it
```
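The npm step above boils down to "reinstall only when the lockfile digest differs from the one recorded after the last install". The same check can be sketched outside the shell, for example to reuse it for other lockfiles. The function names here are hypothetical, and the MD5 digest serves only as a change detector, matching `md5sum` in the step above.

```python
import hashlib
from pathlib import Path


def lockfile_digest(lockfile: Path) -> str:
    """MD5 hex digest of the lockfile, the same value `md5sum` prints."""
    return hashlib.md5(lockfile.read_bytes()).hexdigest()


def needs_install(lockfile: Path, marker: Path) -> bool:
    """True when the marker is missing or records a different lockfile digest."""
    if not marker.is_file():
        return True
    return marker.read_text().strip() != lockfile_digest(lockfile)


def record_install(lockfile: Path, marker: Path) -> None:
    """Write the current digest to the marker after a successful install."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(lockfile_digest(lockfile) + "\n")
```

Writing the marker only after the install succeeds matters: a failed `npm ci` then leaves no marker behind, so the next run retries the install.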
### 2.3 Cancelling Superseded Runs
Avoid queuing one run per push when the same branch is pushed several times:
```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```
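The effect of the group key can be illustrated with a small simulation. It is a simplified model that assumes no run finishes before the next push arrives; `surviving_runs` is a hypothetical helper, not part of Gitea.

```python
def surviving_runs(pushes):
    """Return the indices of runs that are not cancelled.

    pushes: list of (workflow, ref) tuples in arrival order.
    Assumes each new run cancels any earlier run in the same group
    (cancel-in-progress: true) and that earlier runs are still in progress.
    """
    latest = {}
    for index, (workflow, ref) in enumerate(pushes):
        # Same key as the `group` expression: <workflow>-<ref>
        latest[f"{workflow}-{ref}"] = index
    return sorted(latest.values())


if __name__ == "__main__":
    pushes = [
        ("ci-cd", "refs/heads/develop"),
        ("ci-cd", "refs/heads/develop"),  # supersedes run 0
        ("ci-cd", "refs/heads/main"),     # different ref, separate group
    ]
    print(surviving_runs(pushes))
```

Because the key includes both the workflow and the ref, pushes to `main` never cancel runs on `develop`, and the iOS workflow never cancels the Go/Web workflow.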
### 2.4 Conditional Skips
```yaml
# Skip automated commits made by the CI bot
if: "!contains(github.event.head_commit.message, '[skip ci]')"
# Deploy only from the develop branch
if: github.ref == 'refs/heads/develop'
```
### 2.5 Reusing Build Artifacts
```yaml
# Build once, use in deploy
- name: Build
  run: go build -o /tmp/gateway ./cmd/gateway/
- name: Docker Build
  run: |
    # Use the prebuilt binary instead of recompiling inside Docker
    cp /tmp/gateway docker/
    docker build -f docker/gateway.prebuilt.Dockerfile -t $IMAGE .
```
---
## 3. Troubleshooting
### 3.1 Decision Tree
```
Pipeline failure
├── Workflow did not trigger
│   ├── Check the paths filter → the change is outside the matched paths
│   ├── Check the branch filter → the branch name does not match
│   ├── Check for [skip ci] → the commit message contains the skip marker
│   └── Runner offline → check status under Gitea Admin > Runners
├── Checkout failed
│   ├── "Authentication failed" → REPO_TOKEN secret expired or invalid
│   ├── "Connection refused :3000" → Gitea service is not running
│   └── Checkout is slow → add --depth 1 for a shallow clone
├── Go build failed
│   ├── "module not found" → check the GOPROXY setting / run go mod tidy
│   ├── "cannot find package" → go.sum is incomplete
│   └── "go: version mismatch" → Go version on the runner does not match go.mod
├── iOS build failed
│   ├── "Macro must be enabled" → add -skipMacroValidation
│   ├── "cannot find type" → xcodegen generate was not run
│   ├── "errSecInternalComponent" → unlock-keychain + set-key-partition-list
│   ├── "No signing certificate" → sign in under Xcode > Accounts and download the certificate
│   ├── "Redundant Binary Upload" → increment CURRENT_PROJECT_VERSION
│   └── "Missing required icon" → Assets.xcassets lacks a 1024x1024 icon
├── Docker build failed
│   ├── "Cannot connect to daemon" → Docker Desktop is not running
│   ├── "unauthorized" → docker login credentials expired
│   └── "no space left" → docker system prune
└── Deployment failed
    ├── "Connection refused" (SSH) → check the target server's SSH port and key
    ├── "health check failed" → the app starts slowly; increase the retry wait
    └── "port already in use" → run docker compose down to stop the old containers first
```
### 3.2 Common Errors Quick Reference
| Error | Cause | Fix |
|-------|-------|-----|
| `errSecInternalComponent` | The SSH session cannot access the Keychain | `security unlock-keychain` + `set-key-partition-list` |
| `Macro "X" must be enabled` | Swift Macros security restriction | `-skipMacroValidation` |
| `cannot find type 'Foo'` | New file not included in the xcodeproj | `xcodegen generate` |
| `Redundant Binary Upload` | Duplicate build number | Increment `CURRENT_PROJECT_VERSION` |
| `Cloud signing permission error` | API key lacks permission, or the Issuer ID is wrong | Use manual signing with a local profile |
| `HTTP 401 Unauthorized` (ASC API) | JWT is missing the `kid` header | `headers={"kid": KEY_ID}` |
| `No profiles for bundle id` | No distribution profile | Create one in Apple Developer and install it |
| `missing icon file 120x120` | No App Icon asset | Create Assets.xcassets + AppIcon |
| `UIInterfaceOrientation` iPad | Missing iPad orientation declarations | All four orientations + `UIRequiresFullScreen` |
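
A lookup like the one above is easy to automate against a saved job log. The following is a minimal sketch: it covers only a handful of the entries, matches case-insensitive substrings, and `KNOWN_ERRORS` and `suggest_fixes` are hypothetical names.

```python
# Substring -> suggested fix, taken from the table and decision tree above.
KNOWN_ERRORS = [
    ("errSecInternalComponent", "security unlock-keychain + set-key-partition-list"),
    ("must be enabled", "add -skipMacroValidation"),
    ("cannot find type", "run xcodegen generate"),
    ("Redundant Binary Upload", "increment CURRENT_PROJECT_VERSION"),
    ("No profiles for", "create and install a distribution profile"),
    ("no space left", "docker system prune"),
    ("Authentication failed", "check the REPO_TOKEN secret"),
]


def suggest_fixes(log_text):
    """Return the suggested fixes whose error substring appears in the log."""
    lowered = log_text.lower()
    return [fix for needle, fix in KNOWN_ERRORS if needle.lower() in lowered]


if __name__ == "__main__":
    print(suggest_fixes("error: cannot find type 'Foo' in scope"))
```

Substring matching is deliberately loose; a log that mentions one of these phrases in passing produces a false hit, so the output is a starting point for diagnosis rather than a verdict.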
### 3.3 Debugging Tips
```bash
# Check Gitea runner status
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners"
# List recent workflow runs (the URL is quoted so the shell does not glob the "?")
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5"
# Reproduce the CI environment locally
# Go
docker run --rm -v "$(pwd)":/app -w /app golang:1.25 go build ./cmd/gateway/
# iOS: macOS only
ssh bjwework "cd ~/workspace/xiaoqu-ai/ios && swift test"
```
---
## 4. Security
### 4.1 Secrets Management
```bash
# Configure secrets through the Gitea API (do not edit workflow files by hand)
curl -X PUT -H "Authorization: token <ADMIN_TOKEN>" \
  -H "Content-Type: application/json" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/secrets/<NAME>" \
  -d '{"data": "<VALUE>"}'
```
**Required secrets**:

| Secret | Purpose | Rotation |
|--------|---------|----------|
| `REPO_TOKEN` | Git clone authentication | As needed |
| `ACR_USERNAME` / `ACR_PASSWORD` | Docker image push | Every 90 days |
| `SSH_PRIVATE_KEY` | Server deployment | As needed |
| `KEYCHAIN_PASSWORD` | Unlocking the macOS keychain for signing | When the password changes |
| `ASC_KEY_ID` / `ASC_ISSUER_ID` | App Store Connect | As needed |
| `FEISHU_WEBHOOK` | Notifications | Does not expire |
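### 4.2 Leak-Prevention Checklist
- [ ] `.gitignore` includes `.env`, `*.p8`, `*.pem`, and `*.mobileprovision`
- [ ] Workflows contain no hard-coded passwords or tokens; everything goes through `${{ secrets.* }}`
- [ ] Scripts use `${VAR:?error}` to require environment variables (no default values that expose credentials)
- [ ] Docker images do not contain `.env` files (the Dockerfile is accompanied by a `.dockerignore`)
- [ ] Git remote URLs contain no tokens (inject them via secrets)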
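### 4.3 Pre-Commit Check
```bash
# Scan the files about to be committed for likely secrets
# (-z / -0 keep file names with spaces intact; ACM skips deleted files)
git diff --cached --name-only -z --diff-filter=ACM | xargs -0 grep -lE \
  '(PRIVATE KEY|password|secret|token|apikey)' 2>/dev/null
```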
```
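The same scan can be expressed as a small function, which makes it easier to test and to call from a hook script. This sketch reuses the alternation from the grep pattern above and is case-sensitive, as `grep -E` is without `-i`; `flag_suspicious` is a hypothetical name, and the caller is assumed to supply the file contents.

```python
import re

# Same alternation as the grep pattern above (case-sensitive).
SECRET_PATTERN = re.compile(r"(PRIVATE KEY|password|secret|token|apikey)")


def flag_suspicious(files):
    """Return the sorted paths whose content matches the secret pattern.

    files: mapping of path -> text content of the staged file.
    """
    return sorted(path for path, text in files.items() if SECRET_PATTERN.search(text))


if __name__ == "__main__":
    staged = {
        "config.go": 'password = "hunter2"',
        "main.go": "package main",
    }
    print(flag_suspicious(staged))
```

One way to obtain the staged content is `git show :<path>`, which prints the index version of a file rather than the working-tree copy. Expect false positives: words such as `token` appear in plenty of legitimate code.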
---
## 5. Monitoring
### 5.1 Checking Pipeline Status
```bash
# Recent runs
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5" | \
  python3 -c "
import json, sys
for r in json.load(sys.stdin).get('workflow_runs', []):
    print(f\"{r['id']} | {r['display_title'][:40]} | {r['status']} | {r['conclusion']}\")
"
```
### 5.2 Feishu Notification Template
```yaml
# Success/failure notification (last step of the workflow, with if: always())
- name: Notify
  if: always()
  run: |
    STATUS="${{ job.status }}"
    EMOJI=$([ "$STATUS" = "success" ] && echo "✅" || echo "❌")
    COLOR=$([ "$STATUS" = "success" ] && echo "green" || echo "red")
    cat > /tmp/notify.json << EOF
    {
      "msg_type": "interactive",
      "card": {
        "header": {
          "title": {"tag": "plain_text", "content": "$EMOJI <App> $STATUS"},
          "template": "$COLOR"
        },
        "elements": [{
          "tag": "div",
          "text": {"tag": "lark_md", "content": "**Branch**: ${{ github.ref_name }}\n**Commit**: ${{ github.sha }}\n**Trigger**: ${{ github.event.head_commit.message }}"}
        }]
      }
    }
    EOF
    curl -s -X POST "${{ secrets.FEISHU_WEBHOOK }}" \
      -H "Content-Type: application/json" -d @/tmp/notify.json || true
```
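The heredoc above pastes the commit message straight into the JSON text, so a message containing a double quote or a line break yields an invalid payload. One possible alternative is to build the card with a JSON encoder. The sketch below mirrors the card structure of the template; `build_card` and `card_json` are hypothetical helpers, and how the values reach the script (for example via environment variables) is left open.

```python
import json


def build_card(app, status, branch, sha, message):
    """Build the interactive-card payload with the same fields as the template above."""
    ok = status == "success"
    content = f"**Branch**: {branch}\n**Commit**: {sha}\n**Trigger**: {message}"
    return {
        "msg_type": "interactive",
        "card": {
            "header": {
                "title": {
                    "tag": "plain_text",
                    "content": f"{'✅' if ok else '❌'} {app} {status}",
                },
                "template": "green" if ok else "red",
            },
            "elements": [
                {"tag": "div", "text": {"tag": "lark_md", "content": content}}
            ],
        },
    }


def card_json(app, status, branch, sha, message):
    """Serialize the card; json.dumps escapes quotes and newlines in the message."""
    return json.dumps(build_card(app, status, branch, sha, message), ensure_ascii=False)


if __name__ == "__main__":
    print(card_json("demo", "failure", "develop", "abc123", 'fix "quoted" title'))
```

The resulting string can be written to `/tmp/notify.json` and posted with the same `curl` call as in the template.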
### 5.3 Build Time Tracking
Add timestamps at the start and end of the workflow:
```yaml
steps:
  - name: Start Timer
    run: echo "BUILD_START=$(date +%s)" >> "$GITHUB_ENV"
  # ... build steps ...
  - name: Report Duration
    if: always()
    run: |
      DURATION=$(( $(date +%s) - $BUILD_START ))
      echo "Build duration: ${DURATION}s"
```
---
## 6. Runner Management
### 6.1 Runner Types
| Runner | Label | Purpose | Location |
|--------|-------|---------|----------|
| xiaoqu-runner | `self-hosted` | Go + Web + Docker | Alibaba Cloud 39.104.65.241 |
| bjwework-macos | `macos-arm64` | iOS + Swift | Tailscale 100.69.230.116 |
### 6.2 Adding a Runner
```bash
# 1. Get a registration token
curl -s -H "Authorization: token <ADMIN_TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners/registration-token"
# 2. Register
./act_runner register --no-interactive \
  --instance http://<gitea> \
  --token <TOKEN> \
  --name <NAME> \
  --labels <LABEL>:host
# 3. Start it (on macOS, via launchd)
launchctl load ~/Library/LaunchAgents/com.gitea.act-runner.plist
```
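### 6.3 Runner Health Checks
```bash
# Check the runner process
ssh bjwework "launchctl list | grep act-runner"
# Check the runner log
ssh bjwework "tail -20 ~/act_runner/runner.log"
# Check runner status as reported by Gitea
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners" | \
  python3 -c "
import json, sys
data = json.load(sys.stdin)
# Accept either a bare list or an object that wraps the list under 'runners'
runners = data.get('runners', []) if isinstance(data, dict) else data
for r in runners:
    print(f\"{r['name']} | {r['status']}\")
"
```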
---
## 7. Workflow Template Generation
### `/cicd template go`
Generates a Go backend CI workflow: vet → test → build → docker → deploy.
### `/cicd template ios`
Generates an iOS TestFlight workflow: xcodegen → test → archive → upload → notify.
### `/cicd template web`
Generates a Next.js CI workflow: install → build → docker → deploy.
### `/cicd template docker`
Generates a Docker multi-service build-and-push workflow: ACR login → multi-image build → SSH deployment.
---
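## 8. Relationship to Other Skills
| Skill | Collaboration point |
|-------|---------------------|
| `dev-deploy` | `/deploy ios` runs the TestFlight deployment; `/deploy docker` runs the container deployment |
| `dev-coding` | Triggers CI once development is finished |
| `req` | `/req deploy` runs project-level batch deployments |
| `pull-request` | Pull requests trigger CI checks |
| `req-test-gate` | Test gate inside CI |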