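Add dev-cicd (CI/CD pipeline design, optimization, and troubleshooting):
- Gitea Actions templates (Go/iOS/Web/Docker)
- Pipeline optimization (shallow clone, caching, concurrency cancellation)
- Troubleshooting decision tree (20+ common errors)
- Security checklist and runner management
Enhance dev-deploy (deployment execution):
- Docker staging/production deployment templates
- Pre-deployment health checks (certificates, Docker, disk)
- Rollback strategies (TestFlight, Docker, database)
- Deployment monitoring (Feishu notifications, ASC API)
Total skills: 28 (dev category: 7)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>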
| name | description |
|---|---|
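| dev-cicd | CI/CD pipeline design, optimization, and troubleshooting for a Gitea Actions + Go/Swift/Next.js/Docker stack. Activates automatically when the user mentions CI, CD, pipeline, workflow, build failures, or runner-related tasks. |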
CI/CD Pipeline Skill (dev-cicd)
Overview
Covers the design, optimization, and troubleshooting of Gitea Actions CI/CD pipelines. Target stack:
- Git: Gitea (self-hosted, GitHub Actions YAML 兼容)
- Backend: Go (Gin + GORM)
- iOS: Swift 6 + SwiftUI + TCA
- Web: Next.js (React)
- Container: Docker + Docker Compose
- Registry: Aliyun ACR
- Runners: self-hosted (Linux) + macos-arm64 (iOS)
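Command Reference
| Command | Description |
|---|---|
| /cicd analyze | Analyze the current workflow for optimization opportunities |
| /cicd troubleshoot | Diagnose why a pipeline failed |
| /cicd template [go\|ios\|web\|docker] | Generate a workflow template |
| /cicd status | Show the status of recent workflow runs |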
1. Pipeline Design
1.1 Monorepo Path Filtering
The repository contains several subprojects; use paths so that only the relevant builds are triggered:
# .gitea/workflows/ci-cd.yml: Go + Web + Docker
on:
  push:
    branches: [develop, main]
    paths:
      - 'gateway/**'
      - 'web/**'
      - 'docker/**'
      - 'scripts/**'
# .gitea/workflows/ios-testflight.yml: iOS only
on:
  push:
    branches: [develop, main]
    paths:
      - 'ios/**'
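To check locally which workflow a change set would trigger, the two filters above can be approximated in a few lines of Python. This is only a sketch: `WORKFLOW_PATHS` and `triggered_workflows` are hypothetical names, and `fnmatch` merely approximates the glob rules the Actions runner applies (its `*` also matches `/`).

```python
from fnmatch import fnmatch

# Hypothetical mapping of workflow file -> path filters, copied from the examples above.
WORKFLOW_PATHS = {
    "ci-cd.yml": ["gateway/**", "web/**", "docker/**", "scripts/**"],
    "ios-testflight.yml": ["ios/**"],
}


def triggered_workflows(changed_files):
    """Return the workflows whose path filters match at least one changed file."""
    hits = []
    for workflow, patterns in WORKFLOW_PATHS.items():
        if any(fnmatch(path, pat) for path in changed_files for pat in patterns):
            hits.append(workflow)
    return hits


# README.md matches no filter, so only the iOS workflow is selected.
print(triggered_workflows(["ios/App/Main.swift", "README.md"]))  # ['ios-testflight.yml']
```

A change that touches none of the listed directories returns an empty list, which corresponds to the "workflow not triggered" case in the troubleshooting section.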
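1.2 Pipeline Structure Principles
Prioritize fast feedback by running the cheapest stages first:
1. Static checks (lint/vet): seconds
2. Unit tests (test): 1-5 minutes
3. Build (build): 2-10 minutes
4. Integration tests (optional): 5-15 minutes
5. Release (deploy): 5-15 minutes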
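The ordering above amounts to a fail-fast loop: each stage runs only if every cheaper stage before it passed. A minimal sketch, in which `run_stages` and the lambda stand-ins for real commands are hypothetical:

```python
def run_stages(stages):
    """Run (name, callable) pairs in order; stop at the first stage that returns False.

    Returns (completed_stage_names, failing_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical stand-ins for real commands; here the test stage fails.
stages = [
    ("lint", lambda: True),
    ("test", lambda: False),
    ("build", lambda: True),
    ("deploy", lambda: True),
]
done, failed = run_stages(stages)
print(done, failed)  # ['lint'] test
```

Because the failing unit test stops the loop, the slower build and deploy stages never start.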
1.3 Go Backend Template
jobs:
  ci:
    runs-on: self-hosted
    steps:
      - name: Checkout
        run: |
          cd ${{ github.workspace }}
          if [ -d .git ]; then
            git fetch --depth 1 origin ${{ github.ref_name }}
            git reset --hard origin/${{ github.ref_name }}
          else
            git clone --depth 1 --branch ${{ github.ref_name }} \
              http://xiaoqu:${{ secrets.REPO_TOKEN }}@localhost:3000/<org>/<repo>.git .
          fi
      - name: Go Vet
        run: cd gateway && go vet ./...
      - name: Go Test
        run: cd gateway && go test ./... -count=1 -timeout 120s
      - name: Go Build
        run: cd gateway && go build ./cmd/gateway/
1.4 iOS Template
jobs:
  ios:
    runs-on: macos-arm64
    if: "!contains(github.event.head_commit.message, '[skip ci]')"
    steps:
      - name: Checkout
        run: git clone --depth 1 --branch ${{ github.ref_name }} <repo-url> .
      - name: xcodegen
        run: /opt/homebrew/bin/xcodegen generate
        working-directory: ios
      - name: Test
        run: |
          set -o pipefail
          swift test 2>&1 | tee /tmp/test.log | tail -20
        working-directory: ios
      - name: Deploy TestFlight
        env:
          KEYCHAIN_PASSWORD: ${{ secrets.KEYCHAIN_PASSWORD }}
          ASC_KEY_ID: ${{ secrets.ASC_KEY_ID }}
          ASC_ISSUER_ID: ${{ secrets.ASC_ISSUER_ID }}
        run: ./scripts/ios-testflight.sh
1.5 Web (Next.js) Template
- name: Web Install
  run: cd web && npm ci --legacy-peer-deps
- name: Web Build
  run: cd web && npm run build
- name: Docker Build Web
  run: |
    docker build -t $REGISTRY/$WEB_IMAGE:${{ github.sha }} \
      -t $REGISTRY/$WEB_IMAGE:latest ./web
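1.6 Single Job vs. Multiple Jobs
| Scenario | Choice | Reason |
|---|---|---|
| Runner capacity=1 | Single job | Multiple jobs run serially and each repeats the checkout, so the pipeline gets slower |
| Multiple runners available | Multiple jobs + needs | Parallel speedup |
| Different OSes (Linux + macOS) | Separate workflows | Different runner labels |
Current recommendation: one job on the Linux runner (Go + Web + Docker) and one job on the macOS runner (iOS).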
2. Optimization
2.1 Shallow Clone
# First clone
git clone --depth 1 --branch ${{ github.ref_name }} <url> .
# Incremental fetch
git fetch --depth 1 origin ${{ github.ref_name }}
git reset --hard origin/${{ github.ref_name }}
Effect: when the repository contains many binary files, clone time drops from 30s+ to 3-5s.
Note: if the job needs to push, run git fetch --unshallow first.
2.2 Dependency Caching
This setup does not rely on actions/cache (on Gitea Actions it requires the runner's cache server to be set up); a self-hosted runner can use its local disk instead:
# Go modules: global cache on the runner
env:
  GOMODCACHE: /opt/runner-cache/go/mod
  GOCACHE: /opt/runner-cache/go/build
# npm: persist node_modules between runs
# The workspace of a self-hosted runner is kept between runs
- run: |
    if [ -f web/node_modules/.cache-hash ] && \
       [ "$(cat web/node_modules/.cache-hash)" = "$(md5sum web/package-lock.json | cut -d' ' -f1)" ]; then
      echo "npm cache hit, skip install"
    else
      cd web && npm ci --legacy-peer-deps
      md5sum package-lock.json | cut -d' ' -f1 > node_modules/.cache-hash
    fi
# SPM: Xcode caches packages in DerivedData automatically, and the self-hosted runner keeps them
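The npm step above is a stamp-file cache: reinstall only when the lockfile's hash differs from the hash recorded after the last install. The same logic as a Python sketch, where `lock_hash`, `needs_install`, and `record_install` are hypothetical helper names and MD5 is used only to mirror `md5sum`:

```python
import hashlib
from pathlib import Path


def lock_hash(lockfile: Path) -> str:
    """Hex MD5 of the lockfile, the same digest md5sum prints."""
    return hashlib.md5(lockfile.read_bytes()).hexdigest()


def needs_install(lockfile: Path, stamp: Path) -> bool:
    """True when no stamp exists or the recorded hash differs from the lockfile's hash."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != lock_hash(lockfile)


def record_install(lockfile: Path, stamp: Path) -> None:
    """Write the current lockfile hash to the stamp file after a successful install."""
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(lock_hash(lockfile) + "\n")
```

In the workflow above the stamp lives at web/node_modules/.cache-hash, so wiping node_modules also invalidates the cache.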
2.3 Concurrency Cancellation
Avoid queueing runs when the same branch is pushed several times:
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
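The effect of this setting can be pictured with a toy model: runs that share a group key replace each other, while runs in other groups are left alone. The `Scheduler` class below is purely illustrative and does not represent how Gitea schedules jobs internally.

```python
def concurrency_group(workflow: str, ref: str) -> str:
    """Mirror the group expression: <workflow>-<ref>."""
    return f"{workflow}-{ref}"


class Scheduler:
    """Toy model of cancel-in-progress: at most one active run per group."""

    def __init__(self):
        self.running = {}    # group key -> id of the active run
        self.cancelled = []  # ids of runs that were superseded

    def start(self, run_id, workflow, ref):
        group = concurrency_group(workflow, ref)
        previous = self.running.get(group)
        if previous is not None:
            self.cancelled.append(previous)  # a newer push supersedes the older run
        self.running[group] = run_id


s = Scheduler()
s.start(1, "ci-cd", "refs/heads/develop")
s.start(2, "ci-cd", "refs/heads/develop")  # same group: run 1 is cancelled
s.start(3, "ci-cd", "refs/heads/main")     # different group: nothing is cancelled
print(s.cancelled)  # [1]
```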
2.4 Conditional Skips
# Skip automated commits from the CI bot
if: "!contains(github.event.head_commit.message, '[skip ci]')"
# Deploy only from the develop branch
if: github.ref == 'refs/heads/develop'
2.5 Reusing Build Artifacts
# Build once, use in deploy
- name: Build
  run: go build -o /tmp/gateway ./cmd/gateway/
- name: Docker Build
  run: |
    # Use the prebuilt binary instead of recompiling inside Docker
    cp /tmp/gateway docker/
    docker build -f docker/gateway.prebuilt.Dockerfile -t $IMAGE .
3. Troubleshooting
3.1 Decision Tree
Pipeline failure
├── Workflow not triggered
│   ├── Check the paths filter → the change is outside the matched paths
│   ├── Check the branch filter → the branch name does not match
│   ├── Check for [skip ci] → the commit message contains the skip marker
│   └── Runner offline → check its status under Gitea Admin > Runners
│
├── Checkout fails
│   ├── "Authentication failed" → the REPO_TOKEN secret is expired or invalid
│   ├── "Connection refused :3000" → the Gitea service is not running
│   └── Checkout is slow → add --depth 1 for a shallow clone
│
├── Go build fails
│   ├── "module not found" → check GOPROXY / run go mod tidy
│   ├── "cannot find package" → go.sum is incomplete
│   └── "go: version mismatch" → the runner's Go version does not match go.mod
│
├── iOS build fails
│   ├── "Macro must be enabled" → add -skipMacroValidation
│   ├── "cannot find type" → xcodegen generate was not run
│   ├── "errSecInternalComponent" → unlock-keychain + set-key-partition-list
│   ├── "No signing certificate" → sign in under Xcode > Accounts and download the certificate
│   ├── "Redundant Binary Upload" → increment CURRENT_PROJECT_VERSION
│   └── "Missing required icon" → Assets.xcassets lacks the 1024x1024 icon
│
├── Docker build fails
│   ├── "Cannot connect to daemon" → Docker Desktop is not running
│   ├── "unauthorized" → the docker login credentials have expired
│   └── "no space left" → docker system prune
│
└── Deployment fails
    ├── "Connection refused" (SSH) → check the target server's SSH port and key
    ├── "health check failed" → the app starts slowly; increase the retry wait
    └── "port already in use" → run docker compose down to stop the old containers first
3.2 Common Error Quick Reference
| Error | Cause | Fix |
|---|---|---|
| errSecInternalComponent | The SSH session cannot access the Keychain | security unlock-keychain + set-key-partition-list |
| Macro "X" must be enabled | Swift Macros security restriction | -skipMacroValidation |
| cannot find type 'Foo' | The xcodeproj does not include the new file | xcodegen generate |
| Redundant Binary Upload | Duplicate build number | Increment CURRENT_PROJECT_VERSION |
| Cloud signing permission error | The API key lacks permissions or the Issuer ID is wrong | Use manual signing + a local profile |
| HTTP 401 Unauthorized (ASC API) | The JWT is missing the kid header | headers={"kid": KEY_ID} |
| No profiles for bundle id | No distribution profile | Create one in Apple Developer and install it |
| missing icon file 120x120 | No App Icon asset | Create Assets.xcassets + AppIcon |
| UIInterfaceOrientation iPad | Missing iPad orientation declarations | Declare all four orientations + UIRequiresFullScreen |
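A lookup table like this is easy to apply to a build log automatically. The sketch below matches a handful of the signatures above by substring; `KNOWN_ERRORS` and `suggest_fixes` are hypothetical names, and real logs may need looser matching than a plain `in` test.

```python
# (log substring, suggested fix) pairs taken from the quick-reference table above.
KNOWN_ERRORS = [
    ("errSecInternalComponent", "unlock the keychain, then run set-key-partition-list"),
    ("must be enabled", "pass -skipMacroValidation"),
    ("cannot find type", "run xcodegen generate"),
    ("Redundant Binary Upload", "increment CURRENT_PROJECT_VERSION"),
    ("No profiles for", "create and install a distribution profile"),
]


def suggest_fixes(log_text: str) -> list:
    """Return the fixes whose error signature appears anywhere in the log text."""
    return [fix for needle, fix in KNOWN_ERRORS if needle in log_text]


log = "error: cannot find type 'Foo' in scope\nerror: Redundant Binary Upload."
for fix in suggest_fixes(log):
    print(fix)
```

Fixes come back in table order rather than log order, which keeps the output stable across runs.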
3.3 Debugging Tips
# Check Gitea runner status
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners"
# List recent workflow runs (quote the URL so the shell does not interpret the ?)
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5"
# Reproduce the CI environment locally
# Go
docker run -v $(pwd):/app -w /app golang:1.25 go build ./cmd/gateway/
# iOS: macOS only
ssh bjwework "cd ~/workspace/xiaoqu-ai/ios && swift test"
4. Security
4.1 Secrets Management
# Configure secrets through the Gitea API (do not edit workflow files by hand)
curl -X PUT -H "Authorization: token <ADMIN_TOKEN>" \
  -H "Content-Type: application/json" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/secrets/<NAME>" \
  -d '{"data": "<VALUE>"}'
Required secrets:
| Secret | Purpose | Rotation |
|---|---|---|
| REPO_TOKEN | Git clone authentication | As needed |
| ACR_USERNAME / ACR_PASSWORD | Docker image push | 90 days |
| SSH_PRIVATE_KEY | Server deployment | As needed |
| KEYCHAIN_PASSWORD | Unlocking the macOS keychain for signing | When the password changes |
| ASC_KEY_ID / ASC_ISSUER_ID | App Store Connect | As needed |
| FEISHU_WEBHOOK | Notifications | Does not expire |
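4.2 Leak Prevention Checklist
- .gitignore includes .env, *.p8, *.pem, and *.mobileprovision
- Workflows contain no hard-coded passwords or tokens (everything goes through ${{ secrets.* }})
- Scripts use ${VAR:?error} to require environment variables (no default values that would expose credentials)
- Docker images do not contain .env files (the Dockerfile is paired with a .dockerignore)
- Git remote URLs contain no token (inject it from secrets)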
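For deployment helpers written in Python rather than shell, the `${VAR:?error}` rule has a direct counterpart: fail loudly when a variable is unset or empty instead of falling back to a default. A minimal sketch, with `require_env` as a hypothetical helper name:

```python
import os


def require_env(name: str) -> str:
    """Return the variable's value, or raise if it is unset or empty (like ${VAR:?error})."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is required but not set")
    return value
```

Calling `require_env("ACR_PASSWORD")` at the top of a script makes a missing secret stop the run before any step that would otherwise proceed with a blank credential.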
4.3 Pre-Commit Check
# Scan the files staged for commit for likely secrets
git diff --cached --name-only | xargs grep -lE \
  '(PRIVATE KEY|password|secret|token|apikey)' 2>/dev/null
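The same scan can be expressed as a small Python function, which is easier to extend with an allow-list later. This sketch reuses the exact pattern from the grep command and works on file contents that have already been read; `files_with_secrets` is a hypothetical name, and like the grep version it is case-sensitive and prone to false positives.

```python
import re

# Same alternation as the grep -E pattern above (case-sensitive).
SECRET_PATTERN = re.compile(r"(PRIVATE KEY|password|secret|token|apikey)")


def files_with_secrets(files: dict) -> list:
    """Given {file name: text content}, return the sorted names whose content matches."""
    return sorted(name for name, text in files.items() if SECRET_PATTERN.search(text))


staged = {
    "config.go": 'const password = "hunter2"',
    "main.go": "package main",
    "key.pem": "-----BEGIN PRIVATE KEY-----",
}
print(files_with_secrets(staged))  # ['config.go', 'key.pem']
```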
5. Monitoring
5.1 Checking Pipeline Status
# Recent runs
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5" | \
  python3 -c "
import json, sys
for r in json.load(sys.stdin).get('workflow_runs', []):
    print(f\"{r['id']} | {r['display_title'][:40]} | {r['status']} | {r['conclusion']}\")
"
5.2 Feishu Notification Template
# Success/failure notification (last step of the workflow, with if: always())
- name: Notify
  if: always()
  run: |
    STATUS="${{ job.status }}"
    EMOJI=$([ "$STATUS" = "success" ] && echo "✅" || echo "❌")
    COLOR=$([ "$STATUS" = "success" ] && echo "green" || echo "red")
    cat > /tmp/notify.json << EOF
    {
      "msg_type": "interactive",
      "card": {
        "header": {
          "title": {"tag": "plain_text", "content": "$EMOJI <App> $STATUS"},
          "template": "$COLOR"
        },
        "elements": [{
          "tag": "div",
          "text": {"tag": "lark_md", "content": "**Branch**: ${{ github.ref_name }}\n**Commit**: ${{ github.sha }}\n**Trigger**: ${{ github.event.head_commit.message }}"}
        }]
      }
    }
    EOF
    curl -s -X POST "${{ secrets.FEISHU_WEBHOOK }}" \
      -H "Content-Type: application/json" -d @/tmp/notify.json || true
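One weakness of building the card with a heredoc is that a commit message containing double quotes or line breaks yields invalid JSON. A possible alternative is to build the payload in Python and let `json.dumps` do the escaping. In this sketch, `build_card` and its parameters are hypothetical; the card structure is copied from the template above.

```python
import json


def build_card(app: str, status: str, branch: str, sha: str, message: str) -> dict:
    """Build the interactive card from the template above as a plain dict."""
    ok = status == "success"
    emoji = "✅" if ok else "❌"
    body = f"**Branch**: {branch}\n**Commit**: {sha}\n**Trigger**: {message}"
    return {
        "msg_type": "interactive",
        "card": {
            "header": {
                "title": {"tag": "plain_text", "content": f"{emoji} {app} {status}"},
                "template": "green" if ok else "red",
            },
            "elements": [{"tag": "div", "text": {"tag": "lark_md", "content": body}}],
        },
    }


# A commit message with quotes and a newline still serializes to valid JSON.
card = build_card("demo-app", "failure", "develop", "abc1234", 'fix "login"\nsecond line')
payload = json.dumps(card, ensure_ascii=False)
print(payload)
```

The resulting string can be written to /tmp/notify.json and posted with the same curl command as in the workflow step.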
5.3 Build Time Tracking
Add timestamps at the start and end of the workflow:
steps:
  - name: Start Timer
    run: echo "BUILD_START=$(date +%s)" >> $GITHUB_ENV
  # ... build steps ...
  - name: Report Duration
    if: always()
    run: |
      DURATION=$(( $(date +%s) - $BUILD_START ))
      echo "Build duration: ${DURATION}s"
6. Runner Management
6.1 Runner Types
| Runner | Label | Purpose | Location |
|---|---|---|---|
| xiaoqu-runner | self-hosted | Go + Web + Docker | Aliyun 39.104.65.241 |
| bjwework-macos | macos-arm64 | iOS + Swift | Tailscale 100.69.230.116 |
6.2 Adding a Runner
# 1. Get a registration token
curl -s -H "Authorization: token <ADMIN_TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners/registration-token"
# 2. Register
./act_runner register --no-interactive \
  --instance http://<gitea> \
  --token <TOKEN> \
  --name <NAME> \
  --labels <LABEL>:host
# 3. Start (on macOS, via launchd)
launchctl load ~/Library/LaunchAgents/com.gitea.act-runner.plist
6.3 Runner Health Check
# Check the runner process
ssh bjwework "launchctl list | grep act-runner"
# Check the runner log
ssh bjwework "tail -20 ~/act_runner/runner.log"
# Check the runner status reported by Gitea
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners" | \
  python3 -c "import json,sys; [print(f\"{r['name']} | {r['status']}\") for r in json.load(sys.stdin)]"
7. Workflow Template Generation
/cicd template go
Generates a Go backend CI workflow: vet → test → build → docker → deploy.
/cicd template ios
Generates an iOS TestFlight workflow: xcodegen → test → archive → upload → notify.
/cicd template web
Generates a Next.js CI workflow: install → build → docker → deploy.
/cicd template docker
Generates a Docker multi-service build+push workflow: ACR login → multi-image build → SSH deployment.
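8. Relationship to Other Skills
| Skill | Collaboration point |
|---|---|
| dev-deploy | /deploy ios runs the TestFlight deployment; /deploy docker runs the container deployment |
| dev-coding | Triggers CI once development is finished |
| req | /req deploy runs project-level batch deployments |
| pull-request | PRs trigger CI checks |
| req-test-gate | Test gate inside CI |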