---
name: dev-cicd
description: CI/CD pipeline design, optimization, and troubleshooting for a Gitea Actions + Go/Swift/Next.js/Docker stack. Activates automatically when the user mentions CI, CD, pipeline, workflow, build failure, or runner-related tasks.
---

# CI/CD Pipeline Skill (dev-cicd)

## Overview

Designs, optimizes, and troubleshoots Gitea Actions CI/CD pipelines. Target stack:

- **Git**: Gitea (self-hosted, compatible with GitHub Actions YAML)
- **Backend**: Go (Gin + GORM)
- **iOS**: Swift 6 + SwiftUI + TCA
- **Web**: Next.js (React)
- **Container**: Docker + Docker Compose
- **Registry**: Aliyun ACR
- **Runners**: self-hosted (Linux) + macos-arm64 (iOS)

---

## Command Reference

| Command | Description |
|------|------|
| `/cicd analyze` | Analyze the current workflows for optimization opportunities |
| `/cicd troubleshoot` | Diagnose why a pipeline failed |
| `/cicd template [go\|ios\|web\|docker]` | Generate a workflow template |
| `/cicd status` | Show the status of recent workflow runs |

---

## 1. Pipeline Design

### 1.1 Monorepo Path Filters

The repository contains several subprojects; use `paths` filters so that a push triggers only the relevant builds:

```yaml
# .gitea/workflows/ci-cd.yml: Go + Web + Docker
on:
  push:
    branches: [develop, main]
    paths:
      - 'gateway/**'
      - 'web/**'
      - 'docker/**'
      - 'scripts/**'

# .gitea/workflows/ios-testflight.yml: iOS, separate workflow
on:
  push:
    branches: [develop, main]
    paths:
      - 'ios/**'
```

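
When a push does not trigger the workflow you expected, it can help to replay the path filters locally. The sketch below is only an approximation under stated assumptions: `workflows_for_path` is a hypothetical helper, the workflow names are taken from the two file names above, and shell `case` globbing merely approximates the Actions `paths` matcher.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which workflow a changed file would trigger.
# Shell `case` patterns only approximate the Actions `paths` glob syntax;
# here `*` also matches `/`, so `gateway/*` covers nested files.
workflows_for_path() {
  case "$1" in
    gateway/*|web/*|docker/*|scripts/*) echo "ci-cd" ;;
    ios/*)                              echo "ios-testflight" ;;
    *)                                  echo "none" ;;
  esac
}

# Example: classify a few paths.
for f in gateway/cmd/gateway/main.go ios/App/Feature.swift README.md; do
  printf '%s -> %s\n' "$f" "$(workflows_for_path "$f")"
done
```

In a real checkout the loop could instead read from `git diff --name-only HEAD~1 HEAD`, which requires history deeper than a `--depth 1` clone.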
### 1.2 Pipeline Structure Principles

```
Fast feedback first:
1. Static checks (lint/vet): seconds
2. Unit tests (test): 1-5 minutes
3. Build (build): 2-10 minutes
4. Integration tests (optional): 5-15 minutes
5. Release (deploy): 5-15 minutes
```

### 1.3 Go Backend Template

```yaml
jobs:
  ci:
    runs-on: self-hosted
    steps:
      - name: Checkout
        run: |
          cd ${{ github.workspace }}
          if [ -d .git ]; then
            git fetch --depth 1 origin ${{ github.ref_name }}
            git reset --hard origin/${{ github.ref_name }}
          else
            git clone --depth 1 --branch ${{ github.ref_name }} \
              http://xiaoqu:${{ secrets.REPO_TOKEN }}@localhost:3000/<org>/<repo>.git .
          fi

      - name: Go Vet
        run: cd gateway && go vet ./...

      - name: Go Test
        run: cd gateway && go test ./... -count=1 -timeout 120s

      - name: Go Build
        run: cd gateway && go build ./cmd/gateway/
```

### 1.4 iOS Template

```yaml
jobs:
  ios:
    runs-on: macos-arm64
    if: "!contains(github.event.head_commit.message, '[skip ci]')"
    steps:
      - name: Checkout
        run: git clone --depth 1 --branch ${{ github.ref_name }} <repo-url> .

      - name: xcodegen
        run: /opt/homebrew/bin/xcodegen generate
        working-directory: ios

      - name: Test
        run: |
          set -o pipefail
          swift test 2>&1 | tee /tmp/test.log | tail -20
        working-directory: ios

      - name: Deploy TestFlight
        env:
          KEYCHAIN_PASSWORD: ${{ secrets.KEYCHAIN_PASSWORD }}
          ASC_KEY_ID: ${{ secrets.ASC_KEY_ID }}
          ASC_ISSUER_ID: ${{ secrets.ASC_ISSUER_ID }}
        run: ./scripts/ios-testflight.sh
```

### 1.5 Web (Next.js) Template

```yaml
- name: Web Install
  run: cd web && npm ci --legacy-peer-deps

- name: Web Build
  run: cd web && npm run build

- name: Docker Build Web
  run: |
    docker build -t $REGISTRY/$WEB_IMAGE:${{ github.sha }} \
      -t $REGISTRY/$WEB_IMAGE:latest ./web
```

### 1.6 Single Job vs. Multiple Jobs

| Scenario | Choice | Reason |
|------|------|------|
| Runner capacity=1 | Single job | Multiple jobs run serially and each repeats the checkout, so the total is slower |
| Multiple runners available | Multiple jobs + `needs` | Parallel speed-up |
| Different OS (Linux + macOS) | Separate workflows | Different runner labels |

**Current recommendation**: one job on the Linux runner (Go + Web + Docker) and one job on the macOS runner (iOS).

---

## 2. Optimization

### 2.1 Shallow Clone

```bash
# First clone
git clone --depth 1 --branch ${{ github.ref_name }} <url> .

# Incremental fetch
git fetch --depth 1 origin ${{ github.ref_name }}
git reset --hard origin/${{ github.ref_name }}
```

**Effect**: for a repository with many binary files, clone time drops from 30s+ to 3-5s.

**Note**: if the job needs to push, run `git fetch --unshallow` first.

### 2.2 Dependency Caching

This setup does not rely on `actions/cache`; a self-hosted runner can use its local disk directly:

```yaml
# Go modules: global cache on the runner
env:
  GOMODCACHE: /opt/runner-cache/go/mod
  GOCACHE: /opt/runner-cache/go/build

# npm: persist node_modules
# (the self-hosted runner's workspace is kept between runs)
- run: |
    if [ -f web/node_modules/.cache-hash ] && \
       [ "$(cat web/node_modules/.cache-hash)" = "$(md5sum web/package-lock.json | cut -d' ' -f1)" ]; then
      echo "npm cache hit, skip install"
    else
      cd web && npm ci --legacy-peer-deps
      md5sum package-lock.json | cut -d' ' -f1 > node_modules/.cache-hash
    fi

# SPM: Xcode caches to DerivedData automatically; the self-hosted runner keeps it
```

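
The lockfile-hash check can also be factored into small functions so that it is testable outside a workflow. This is a minimal sketch, not the original step: `lock_hash`, `needs_install`, and `write_stamp` are hypothetical names, and `cksum` is swapped in for `md5sum` purely because it is available on both Linux and macOS.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for the lockfile-hash cache check.
# cksum replaces md5sum here only for portability across Linux and macOS.

# Print a checksum of the given file.
lock_hash() {
  cksum < "$1" | cut -d' ' -f1
}

# Succeed (exit 0) when an install is needed:
# the stamp file is missing or its content differs from the current hash.
needs_install() {
  local lockfile=$1 stamp=$2
  [ -f "$stamp" ] || return 0
  [ "$(cat "$stamp")" != "$(lock_hash "$lockfile")" ]
}

# Record the current lockfile hash after a successful install.
write_stamp() {
  lock_hash "$1" > "$2"
}
```

A step could then run `needs_install web/package-lock.json web/node_modules/.cache-hash && (cd web && npm ci --legacy-peer-deps)` followed by `write_stamp` with the same two paths.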
### 2.3 Concurrency Cancellation

Cancel superseded runs so that repeated pushes to the same branch do not queue up:

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```

### 2.4 Conditional Skips

```yaml
# Skip automated commits made by the CI bot
if: "!contains(github.event.head_commit.message, '[skip ci]')"

# Deploy only from the develop branch
if: github.ref == 'refs/heads/develop'
```

### 2.5 Reusing Build Artifacts

```yaml
# Build once, use in deploy
- name: Build
  run: go build -o /tmp/gateway ./cmd/gateway/

- name: Docker Build
  run: |
    # Use the already compiled binary instead of recompiling inside Docker
    cp /tmp/gateway docker/
    docker build -f docker/gateway.prebuilt.Dockerfile -t $IMAGE .
```

### 2.6 Slimming the Docker Context

**Problem**: `docker build` sends the entire context directory to the daemon. Without a `.dockerignore`, `node_modules` (hundreds of MB), `.next/`, `.git/`, and similar directories are all sent, which shows up as `transferring context: 768MB` and takes 30s+.

**Diagnosis**:
```bash
# Check the context size (approximates what docker build sends)
du -sh --exclude=.git <project-dir>

# Check whether a .dockerignore exists
cat <project-dir>/.dockerignore 2>/dev/null || echo ".dockerignore is missing!"
```

**Required `/cicd analyze` check**: for every directory that has a Dockerfile, check whether `.dockerignore` exists. Warn if it is missing.

**Standard .dockerignore template**:

```
# Node.js
node_modules
.next
.turbo
coverage

# Common
.git
.gitignore
.env*
*.md
.vscode
.idea
```

**Effect**: the Web project's context shrinks from 768MB to ~10MB, and the Docker build becomes roughly 10x faster.

**Lesson learned**: `.gitignore` is not `.dockerignore`. Files that Git ignores can still exist in the runner workspace (for example, the `node_modules` cache that a self-hosted runner keeps), and `docker build` packs all of them into the context. Every subdirectory with a Dockerfile **must have a `.dockerignore`**.

---

## 3. Troubleshooting

### 3.1 Decision Tree

```
Pipeline failed
├── Workflow not triggered
│   ├── Check paths filter → change is outside the matched paths
│   ├── Check branch filter → branch name does not match
│   ├── Check [skip ci] → commit message contains the skip marker
│   └── Runner offline → check status under Gitea Admin > Runners
│
├── Checkout failed
│   ├── "Authentication failed" → REPO_TOKEN secret expired/invalid
│   ├── "Connection refused :3000" → Gitea service not running
│   └── Checkout is slow → use a shallow clone with --depth 1
│
├── Go build failed
│   ├── "module not found" → GOPROXY setting / go mod tidy
│   ├── "cannot find package" → go.sum incomplete
│   └── "go: version mismatch" → Go version on the runner does not match go.mod
│
├── iOS build failed
│   ├── "Macro must be enabled" → add -skipMacroValidation
│   ├── "cannot find type" → xcodegen generate was not run
│   ├── "errSecInternalComponent" → unlock-keychain + set-key-partition-list
│   ├── "No signing certificate" → sign in under Xcode > Accounts and download the certificates
│   ├── "Redundant Binary Upload" → increment CURRENT_PROJECT_VERSION
│   └── "Missing required icon" → Assets.xcassets lacks the 1024x1024 icon
│
├── Docker build failed/slow
│   ├── "Cannot connect to daemon" → Docker Desktop not started
│   ├── "unauthorized" / "denied" → docker login credentials expired, or ACR namespace missing
│   ├── "no space left" → docker system prune
│   ├── "transferring context: XXX MB" is slow → missing .dockerignore (node_modules is sent)
│   ├── build succeeds but push is denied → image path lacks namespace (registry/namespace/image)
│   ├── docker compose pull times out → without arguments it also pulls postgres/redis from Docker Hub; pull only the application images
│   └── docker compose up -d also pulls → add `--no-deps gateway web` to restart only the application containers
│
└── Deploy failed
    ├── "Connection refused" (SSH) → check the SSH port/key on the target server
    ├── "health check failed" → app starts slowly; add retries and a longer wait
    └── "port already in use" → run docker compose down to stop the old containers first
```

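
A few branches of the tree can be turned into a first-pass triage helper. This is a rough sketch under stated assumptions: `suggest_fix` is a hypothetical function, it covers only a handful of the messages listed above, and real logs may word these errors differently.

```bash
#!/usr/bin/env bash
# Hypothetical triage helper: map a log line to the first fix suggested by
# the decision tree. Only a few branches are covered.
suggest_fix() {
  case "$1" in
    *"Authentication failed"*)   echo "check the REPO_TOKEN secret" ;;
    *errSecInternalComponent*)   echo "unlock-keychain + set-key-partition-list" ;;
    *"Redundant Binary Upload"*) echo "increment CURRENT_PROJECT_VERSION" ;;
    *"transferring context"*)    echo "add a .dockerignore" ;;
    *"no space left"*)           echo "docker system prune" ;;
    *"port already in use"*)     echo "docker compose down first" ;;
    *)                           echo "no known match" ;;
  esac
}

# Example: triage a single line copied from a failed run.
suggest_fix "error: errSecInternalComponent"
```

Feeding it the first error line of a saved log, for example the output of `grep -m1 -iE 'error|failed|denied' build.log`, gives a quick starting point before walking the full tree.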
### 3.2 Common Error Quick Reference

| Error | Cause | Fix |
|------|------|------|
| `errSecInternalComponent` | SSH session cannot access the Keychain | `security unlock-keychain` + `set-key-partition-list` |
| `Macro "X" must be enabled` | Swift Macros security restriction | `-skipMacroValidation` |
| `cannot find type 'Foo'` | xcodeproj does not include the new file | `xcodegen generate` |
| `Redundant Binary Upload` | Duplicate build number | Increment `CURRENT_PROJECT_VERSION` |
| `Cloud signing permission error` | API key lacks permission or the Issuer ID is wrong | Use manual signing + a local profile |
| `HTTP 401 Unauthorized` (ASC API) | JWT is missing the `kid` header | `headers={"kid": KEY_ID}` |
| `No profiles for bundle id` | No distribution profile | Create one in Apple Developer and install it |
| `transferring context: 768MB` | Missing .dockerignore | Create a .dockerignore that excludes node_modules/.next/.git |
| `denied: requested access` (push) | ACR image path lacks the namespace | registry/**namespace**/image |
| `docker compose pull` times out | It pulled postgres/redis from Docker Hub | `docker compose pull gateway web` pulls only the application images |
| `docker compose up -d` also times out | `up` implicitly pulls every service | `docker compose up -d --no-deps gateway web` |
| `missing icon file 120x120` | No App Icon asset | Create Assets.xcassets + AppIcon |
| `UIInterfaceOrientation` iPad | Missing iPad orientation declarations | All four orientations + `UIRequiresFullScreen` |

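
The namespace problem behind `denied: requested access` can be caught before pushing. The sketch below is a crude check that only counts path segments; `has_namespace` is a hypothetical helper and `registry.example.com` is a placeholder host, not the real registry.

```bash
#!/usr/bin/env bash
# Hypothetical check: does an image reference look like
# registry/namespace/image[:tag]? It only verifies that the reference
# contains at least two slashes, so it is a heuristic, not a parser.
has_namespace() {
  case "$1" in
    */*/*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example with a placeholder registry host.
for ref in registry.example.com/team/gateway:latest registry.example.com/gateway:latest; do
  if has_namespace "$ref"; then
    echo "ok: $ref"
  else
    echo "missing namespace: $ref"
  fi
done
```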
### 3.3 Debugging Tips

```bash
# Check Gitea runner status
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners"

# List recent workflow runs (URL quoted so the shell does not glob the "?")
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5"

# Reproduce the CI environment locally
# Go
docker run -v "$(pwd)":/app -w /app golang:1.25 go build ./cmd/gateway/

# iOS: macOS only
ssh bjwework "cd ~/workspace/xiaoqu-ai/ios && swift test"
```

---

## 4. Security

### 4.1 Secrets Management

```bash
# Configure secrets through the Gitea API (never hand-edit them into workflow files)
curl -X PUT -H "Authorization: token <ADMIN_TOKEN>" \
  -H "Content-Type: application/json" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/secrets/<NAME>" \
  -d '{"data": "<VALUE>"}'
```

**Required secrets**:

| Secret | Purpose | Rotation |
|--------|------|---------|
| `REPO_TOKEN` | Git clone authentication | As needed |
| `ACR_USERNAME` / `ACR_PASSWORD` | Docker image push | 90 days |
| `SSH_PRIVATE_KEY` | Server deployment | As needed |
| `KEYCHAIN_PASSWORD` | Unlocking the macOS keychain for signing | When the password changes |
| `ASC_KEY_ID` / `ASC_ISSUER_ID` | App Store Connect | As needed |
| `FEISHU_WEBHOOK` | Notifications | Does not expire |

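### 4.2 Leak Prevention Checklist

- [ ] `.gitignore` includes `.env`, `*.p8`, `*.pem`, `*.mobileprovision`
- [ ] No hard-coded passwords/tokens in workflows (everything goes through `${{ secrets.* }}`)
- [ ] Scripts use `${VAR:?error}` to require environment variables (no default values that expose credentials)
- [ ] Docker images do not contain `.env` files (the Dockerfile's directory has a `.dockerignore`)
- [ ] Git remote URLs contain no literal token (inject it via secrets)
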
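
The `${VAR:?error}` item is easiest to see in a small script. This is a minimal sketch, assuming a deploy script that needs the `ACR_USERNAME` and `ACR_PASSWORD` secrets from the table in 4.1; `require_acr_credentials` is a hypothetical function name.

```bash
#!/usr/bin/env bash
# Sketch: fail fast when a required credential is missing.
# ${VAR:?message} aborts a non-interactive shell with the message
# when VAR is unset or empty, so no fallback value is ever used.
require_acr_credentials() {
  : "${ACR_USERNAME:?ACR_USERNAME must be set}"
  : "${ACR_PASSWORD:?ACR_PASSWORD must be set}"
  echo "ACR credentials present"
}
```

Calling `require_acr_credentials` at the top of a deploy script stops the run before any `docker login` attempt if the workflow forgot to pass a secret, and it never prints the secret values themselves.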
### 4.3 Pre-commit Check

```bash
# Scan the files about to be committed for possible secrets
git diff --cached --name-only | xargs grep -lE \
  '(PRIVATE KEY|password|secret|token|apikey)' 2>/dev/null
```

---

## 5. Monitoring

### 5.1 Checking Pipeline Status

```bash
# Recent runs
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runs?limit=5" | \
  python3 -c "
import json, sys
for r in json.load(sys.stdin).get('workflow_runs', []):
    print(f\"{r['id']} | {r['display_title'][:40]} | {r['status']} | {r['conclusion']}\")
"
```

### 5.2 Feishu Notification Template

```yaml
# Success/failure notification (last step of the workflow, with if: always())
- name: Notify
  if: always()
  run: |
    STATUS="${{ job.status }}"
    EMOJI=$([ "$STATUS" = "success" ] && echo "✅" || echo "❌")
    COLOR=$([ "$STATUS" = "success" ] && echo "green" || echo "red")
    cat > /tmp/notify.json << EOF
    {
      "msg_type": "interactive",
      "card": {
        "header": {
          "title": {"tag": "plain_text", "content": "$EMOJI <App> $STATUS"},
          "template": "$COLOR"
        },
        "elements": [{
          "tag": "div",
          "text": {"tag": "lark_md", "content": "**Branch**: ${{ github.ref_name }}\n**Commit**: ${{ github.sha }}\n**Trigger**: ${{ github.event.head_commit.message }}"}
        }]
      }
    }
    EOF
    curl -s -X POST "${{ secrets.FEISHU_WEBHOOK }}" \
      -H "Content-Type: application/json" -d @/tmp/notify.json || true
```

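
The notification template interpolates the commit message straight into JSON, so a message containing a double quote or a line break can produce an invalid payload. Below is a minimal sketch of an escaping helper, assuming only backslashes, double quotes, and newlines need handling; `json_escape` is a hypothetical name and is not part of the original workflow.

```bash
#!/usr/bin/env bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Handles backslash, double quote, and newline only; other control
# characters (tabs and so on) are passed through unchanged.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Example: escape a commit message before embedding it in the payload.
MSG=$(json_escape 'fix: handle "quoted" input')
echo "$MSG"
```

The escaped value can then be placed in the heredoc as `$MSG` instead of the raw `${{ github.event.head_commit.message }}` expression.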
### 5.3 Build Time Tracking

Add timestamps at the start and end of the workflow:

```yaml
steps:
  - name: Start Timer
    run: echo "BUILD_START=$(date +%s)" >> $GITHUB_ENV

  # ... build steps ...

  - name: Report Duration
    if: always()
    run: |
      DURATION=$(( $(date +%s) - $BUILD_START ))
      echo "Build duration: ${DURATION}s"
```

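
The `Report Duration` step prints raw seconds. For longer builds a small formatter may be easier to read; the sketch below assumes a non-negative whole number of seconds, and `fmt_duration` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: format a duration in seconds as <minutes>m<seconds>s.
fmt_duration() {
  local s=$1
  printf '%dm%02ds\n' $((s / 60)) $((s % 60))
}

# Example: 125 seconds is 2 minutes and 5 seconds.
fmt_duration 125
```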
---

## 6. Runner Management

### 6.1 Runner Types

| Runner | Label | Purpose | Location |
|--------|------|------|------|
| xiaoqu-runner | `self-hosted` | Go + Web + Docker | Aliyun 39.104.65.241 |
| bjwework-macos | `macos-arm64` | iOS + Swift | Tailscale 100.69.230.116 |

### 6.2 Adding a Runner

```bash
# 1. Get a registration token
curl -s -H "Authorization: token <ADMIN_TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners/registration-token"

# 2. Register
./act_runner register --no-interactive \
  --instance http://<gitea> \
  --token <TOKEN> \
  --name <NAME> \
  --labels <LABEL>:host

# 3. Start (macOS uses launchd)
launchctl load ~/Library/LaunchAgents/com.gitea.act-runner.plist
```

### 6.3 Runner Health Check

```bash
# Check the runner process
ssh bjwework "launchctl list | grep act-runner"

# Check the runner log
ssh bjwework "tail -20 ~/act_runner/runner.log"

# Check runner status in Gitea
curl -s -H "Authorization: token <TOKEN>" \
  "http://<gitea>/api/v1/repos/<org>/<repo>/actions/runners" | \
  python3 -c "import json,sys; [print(f\"{r['name']} | {r['status']}\") for r in json.load(sys.stdin)]"
```

---

## 7. Workflow Template Generation

### `/cicd analyze` Checklist

When run, the command automatically scans the following:

1. **Workflow YAML**: syntax check, path filters, concurrency cancellation, [skip ci]
2. **Docker context**: whether every directory with a Dockerfile has a `.dockerignore` (**required check**)
3. **Secrets**: whether workflows contain hard-coded credentials or paths
4. **Caching**: whether dependency caches are used (npm/Go/SPM)
5. **Shallow clone**: whether checkout uses `--depth 1`
6. **Image naming**: whether the ACR/registry path includes the namespace

```bash
# Quick scan commands
echo "=== .dockerignore check ==="
# Pass the path as $1 instead of embedding {} in the script, which is safer for unusual file names
find . -name Dockerfile -exec sh -c 'DIR=$(dirname "$1"); [ -f "$DIR/.dockerignore" ] && echo "✅ $DIR" || echo "❌ $DIR is missing .dockerignore"' _ {} \;

echo "=== Hard-coded credential check ==="
grep -rn 'password\|secret\|token' .gitea/workflows/ | grep -v 'secrets\.' | grep -v '#'
```

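
Items 1 and 5 of the checklist can be approximated with plain `grep`. The sketch below is a rough line-based heuristic rather than a YAML parser, and `lint_workflow` is a hypothetical helper, not the actual implementation of `/cicd analyze`.

```bash
#!/usr/bin/env bash
# Hypothetical lint: print one line per finding and return non-zero
# when a workflow file lacks a paths filter, lacks a top-level
# concurrency block, or contains a `git clone` line without --depth.
lint_workflow() {
  local file=$1 status=0
  if ! grep -q '^[[:space:]]*paths:' "$file"; then
    echo "$file: no paths filter"
    status=1
  fi
  if ! grep -q '^concurrency:' "$file"; then
    echo "$file: no top-level concurrency block"
    status=1
  fi
  # Select clone lines, then look for any that do not mention --depth.
  if grep 'git clone' "$file" | grep -qv -e '--depth'; then
    echo "$file: git clone without --depth"
    status=1
  fi
  return "$status"
}
```

A typical invocation would loop over the workflow directory, for example `for f in .gitea/workflows/*.yml; do lint_workflow "$f"; done`.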
### `/cicd template go`

Generates a Go backend CI workflow covering vet → test → build → docker → deploy.

### `/cicd template ios`

Generates an iOS TestFlight workflow covering xcodegen → test → archive → upload → notify.

### `/cicd template web`

Generates a Next.js CI workflow covering install → build → docker → deploy.

### `/cicd template docker`

Generates a Docker multi-service build+push workflow covering ACR login → multi-image build → SSH deploy.

---

## 8. Relationship to Other Skills

| Skill | Integration point |
|------|--------|
| `dev-deploy` | `/deploy ios` runs the TestFlight deployment; `/deploy docker` runs the container deployment |
| `dev-coding` | Triggers CI once development is complete |
| `req` | `/req deploy` performs project-level batch deployment |
| `pull-request` | PRs trigger CI checks |
| `req-test-gate` | Test gate inside CI |