# Database Backup and Restore Skill

**Parent skill**: ops-tools
**Scope**: Global (all project databases)
**Created**: 2026-01-15 07:30:00 ACDT
**Last updated**: 2026-02-02

---

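## Skill Overview

A global database backup skill for the PostgreSQL databases of all projects. It covers pre-migration backups, automated backups, data restore, and disaster-recovery strategy.

**Core principles**:
- ⚠️ **Always take a backup before any database migration**
- Retention: the last 7 days plus one permanent backup per month
- Storage location: the `/backup/` directory on the server's local disk

---
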
## Database Inventory

| Database | Server | Container | Purpose | Backup path |
|----------|--------|-----------|---------|-------------|
| ai_project_prod | tools_ai_proj | ai_postgres_prod | AI-Proj production | /backup/ai-project/database/ |
| ai_project_staging | singapore | ai_postgres_staging | AI-Proj staging | /backup/ai-project-staging/ |
| coolbuy_prod | coolbuy-dev | postgres | Coolbuy 3.0 | /backup/coolbuy/ |

---

## ⚡ Quick Pre-Migration Backup (Required Reading)

> **Important**: before running any `UPDATE`, `DELETE`, `ALTER`, or data migration, **you must take a backup first**.

### One-Command Backup

```bash
# AI-Proj production database: pre-migration backup
ssh tools_ai_proj 'REASON="pre_migration_$(date +%Y%m%d_%H%M%S)" && \
  docker exec ai_postgres_prod pg_dump -U ai_prod_user -Fc ai_project_prod \
  > /backup/ai-project/database/ai_project_${REASON}.dump && \
  echo "✓ Backup complete: /backup/ai-project/database/ai_project_${REASON}.dump"'

# AI-Proj staging database: pre-migration backup
ssh singapore 'REASON="pre_migration_$(date +%Y%m%d_%H%M%S)" && \
  sudo docker exec ai_postgres_staging pg_dump -U ai_staging_user -Fc ai_project_staging \
  > /backup/ai-project-staging/ai_project_staging_${REASON}.dump && \
  echo "✓ Backup complete"'
```

### Backup with a Reason (Recommended)

```bash
# Record why the backup was taken so it is easy to trace later
ssh tools_ai_proj 'REASON="migrate_project_165_to_167" && \
  docker exec ai_postgres_prod pg_dump -U ai_prod_user -Fc ai_project_prod \
  > /backup/ai-project/database/ai_project_$(date +%Y%m%d_%H%M%S)_${REASON}.dump && \
  ls -lh /backup/ai-project/database/ | tail -3'
```

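Both naming schemes above can come from one small helper. The following is a minimal sketch and not part of the existing scripts: `backup_name` is a hypothetical function that joins a prefix, a timestamp, and a sanitized reason, so that a free-text reason cannot put spaces or slashes into the dump path.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not in the original scripts):
# build a dump file name of the form <prefix>_<timestamp>_<reason>.dump.

backup_name() {
  local prefix="$1" reason="$2"
  # The optional third argument pins the timestamp (handy for tests).
  local stamp="${3:-$(date +%Y%m%d_%H%M%S)}"
  local safe
  # Turn every run of characters outside [A-Za-z0-9_] into one underscore.
  safe="$(printf '%s' "$reason" | tr -cs 'A-Za-z0-9_' '_')"
  safe="${safe#_}"; safe="${safe%_}"   # trim a leading/trailing underscore
  # Fall back to "manual" when no usable reason was given.
  printf '%s_%s_%s.dump\n' "$prefix" "$stamp" "${safe:-manual}"
}

# Example:
#   backup_name ai_project "migrate project 165->167"
```
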
### Post-Backup Verification

```bash
# List the most recent backup files
ssh tools_ai_proj 'ls -lh /backup/ai-project/database/ | tail -5'

# Check the size of the newest dump (it should be > 10MB)
ssh tools_ai_proj 'stat --printf="%s bytes\n" "$(ls -t /backup/ai-project/database/ai_project_*.dump | head -1)"'
```

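A size check alone will not catch a truncated archive. As a sketch under stated assumptions, the hypothetical `verify_dump` function below combines the size rule with `pg_restore --list`, which only reads the archive's table of contents and touches no database. `MIN_BYTES` is an assumed threshold derived from the "> 10MB" rule of thumb above.

```bash
#!/usr/bin/env bash
# Sketch: sanity-check a custom-format (-Fc) dump before relying on it.
# MIN_BYTES is an assumption based on the "> 10MB" guideline.
MIN_BYTES="${MIN_BYTES:-10485760}"

verify_dump() {
  local file="$1" size
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -lt "$MIN_BYTES" ]; then
    echo "too small: $file ($size bytes)" >&2
    return 1
  fi
  # When pg_restore is installed where this runs, let it read the table of
  # contents; this fails on truncated or corrupt archives.
  if command -v pg_restore >/dev/null 2>&1; then
    pg_restore --list "$file" >/dev/null || return 1
  fi
  echo "ok: $file ($size bytes)"
}
```
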
---

## Quick Restore Commands

### Restore from the Latest Backup

```bash
# 1. Find the latest backup
ssh tools_ai_proj 'ls -lt /backup/ai-project/database/*.dump | head -3'

# 2. Restore with pg_restore. The dump lives on the host, not inside the
#    container, so it is streamed to pg_restore over stdin.
ssh tools_ai_proj 'BACKUP_FILE="/backup/ai-project/database/ai_project_XXXXXXXX.dump" && \
  docker stop ai_backend_prod && \
  docker exec -i ai_postgres_prod pg_restore -U ai_prod_user -d ai_project_prod --clean --if-exists -Fc < "$BACKUP_FILE" && \
  docker start ai_backend_prod && \
  echo "✓ Restore complete"'

# 3. Verify
curl -s https://ai.pipexerp.com/api/v1/health | jq .
```

### Restore to a Specific Point in Time

```bash
# List all backups and pick the one closest to the target time
ssh tools_ai_proj 'ls -lht /backup/ai-project/database/*.dump'

# Restore the chosen backup (streamed over stdin because the file is on the host)
ssh tools_ai_proj 'docker exec -i ai_postgres_prod pg_restore \
  -U ai_prod_user -d ai_project_prod --clean --if-exists -Fc \
  < /backup/ai-project/database/ai_project_20260202_180000_migrate_project_165_to_167.dump'
```

---

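## Retention Policy

### Policy Overview

| Type | Retention | Cleanup rule |
|------|-----------|--------------|
| Daily backup | 7 days | Deleted automatically after 7 days |
| Monthly backup | Permanent | Backups taken on the 1st of each month are kept forever |
| Pre-migration backup | 30 days | Backups tagged `pre_migration` are kept for 30 days |
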
### Automatic Cleanup Script

```bash
#!/bin/bash
# /opt/scripts/cleanup-backups.sh
BACKUP_DIR="/backup/ai-project/database"
# Monthly backups: the date stamp in the file name ends in 01 (YYYYMM01)
MONTHLY='*_[0-9][0-9][0-9][0-9][0-9][0-9]01_*'

# Delete daily backups older than 7 days (keep monthly and pre-migration backups)
find "$BACKUP_DIR" -name "*.dump" -mtime +7 ! -name "$MONTHLY" ! -name "*pre_migration*" -delete
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 ! -name "$MONTHLY" ! -name "*pre_migration*" -delete

# Delete pre-migration backups older than 30 days
find "$BACKUP_DIR" -name "*pre_migration*" -mtime +30 -delete

echo "$(date): Cleanup completed" >> /var/log/backup-cleanup.log
```

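Before pointing `find ... -delete` at real backups, it can help to dry-run the retention rules against file names. The sketch below is an illustration only: `retention_action` is a hypothetical function that applies the three rules from the policy table to a file name and an age in days, and prints `keep` or `delete` without touching the filesystem.

```bash
#!/usr/bin/env bash
# Sketch: decide what the retention policy would do with one backup file.
# Hypothetical helper; it mirrors the policy table, not any existing script.

retention_action() {
  local name="$1" age_days="$2"
  case "$name" in
    *pre_migration*)
      # Pre-migration backups: keep for 30 days.
      if [ "$age_days" -gt 30 ]; then echo delete; else echo keep; fi ;;
    *_[0-9][0-9][0-9][0-9][0-9][0-9]01_*)
      # Date stamp YYYYMM01: monthly backup, kept forever.
      echo keep ;;
    *)
      # Ordinary daily backups: keep for 7 days.
      if [ "$age_days" -gt 7 ]; then echo delete; else echo keep; fi ;;
  esac
}

# Example:
#   retention_action ai_project_20260201_020001.sql.gz 90
```
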
### Cron Configuration

```cron
# Clean up old backups every day at 03:00
0 3 * * * /opt/scripts/cleanup-backups.sh
```

---

## Quick Reference

| Action | Command |
|--------|---------|
| Run a backup manually | `ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-database.sh"` |
| List local backups | `ssh tools_ai_proj "ls -lh /backup/ai-project/database/"` |
| Follow the backup log | `ssh tools_ai_proj "tail -f /var/log/ai-project-backup.log"` |
| Trigger an OSS sync | `ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-to-oss.sh"` |
| List OSS backups | `ssh tools_ai_proj "ossutil ls oss://fnos2026/ai-project/backups/ --config-file ~/.ossutilconfig"` |
| Download the latest backup | `ssh tools_ai_proj "ossutil cp oss://fnos2026/ai-project/backups/latest.sql.gz /tmp/ --config-file ~/.ossutilconfig"` |
| Verify backup integrity | `ssh tools_ai_proj "gzip -t /backup/ai-project/database/latest.sql.gz"` |

---

## Backup Architecture

### Two-Tier Backup Strategy

```
┌─────────────────────────────────────────────────────────┐
│                AI-Proj production server                │
│            (tools_ai_proj: 152.136.104.251)             │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  PostgreSQL database (ai_postgres_prod)                 │
│            │                                            │
│            │ Daily at 02:00 (cron)                       │
│            ▼                                            │
│  Local backup (/backup/ai-project/database/)            │
│            │  - gzip compression                        │
│            │  - 30-day retention                        │
│            │  - integrity check                         │
│            │  - symlink (latest.sql.gz)                 │
│            │                                            │
│            │ Daily at 02:30 (cron)                       │
│            ▼                                            │
│  OSS sync (backup-to-oss.sh)                            │
│            │                                            │
└────────────┼─────────────────────────────────────────────┘
             │
             │ Internet (623 KB/s)
             ▼
┌─────────────────────────────────────────────────────────┐
│       Alibaba Cloud Object Storage Service (OSS)        │
│                     Beijing region                      │
├─────────────────────────────────────────────────────────┤
│  Bucket: fnos2026                                       │
│  Path: /ai-project/backups/                             │
│                                                         │
│  ├── YYYYMMDD/                                          │
│  │   └── ai_project_YYYYMMDD_HHMMSS.sql.gz              │
│  └── latest.sql.gz (latest backup)                      │
│                                                         │
│  ✅ Off-site disaster recovery (99.9% availability)     │
│  ✅ 30-day automatic cleanup                            │
│  ✅ Cost: ~¥0.25/month                                  │
└─────────────────────────────────────────────────────────┘
```

### Backup Schedule

| Time | Operation | Script | Log file |
|------|-----------|--------|----------|
| 02:00 | Local database backup | `/opt/ai-project/deploy/scripts/backup-database.sh` | `/var/log/ai-project-backup.log` |
| 02:30 | Off-site sync to OSS | `/opt/ai-project/deploy/scripts/backup-to-oss.sh` | `/var/log/ai-project-oss-sync.log` |

---

## Automated Backup Configuration

### Local Backup

**Script location**: `/opt/ai-project/deploy/scripts/backup-database.sh`

**Features**:
- ✅ Full backup with PostgreSQL pg_dump
- ✅ gzip compression
- ✅ One directory per date
- ✅ Automatic cleanup after 30 days
- ✅ Backup integrity verification
- ✅ Symlink pointing to the latest backup

**Cron configuration**:
```cron
0 2 * * * /opt/ai-project/deploy/scripts/backup-database.sh >> /var/log/ai-project-backup.log 2>&1
```

**Run manually**:
```bash
ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-database.sh"
```

**View the log**:
```bash
ssh tools_ai_proj "tail -f /var/log/ai-project-backup.log"
```

**Backup directory layout**:
```
/backup/ai-project/database/
├── 20260115/
│   ├── ai_project_20260115_020001.sql.gz  (13M)
│   └── ai_project_20260115_120000.sql.gz  (13M)
├── 20260116/
│   └── ai_project_20260116_020001.sql.gz  (13M)
└── latest.sql.gz -> 20260116/ai_project_20260116_020001.sql.gz
```

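The integrity check and the `latest.sql.gz` symlink from the feature list fit together naturally: the link should only move once the new archive has passed `gzip -t`. The contents of `backup-database.sh` are not shown in this document, so the following is only a sketch of that one step, with `publish_backup` as a hypothetical function name.

```bash
#!/usr/bin/env bash
# Sketch (assumption: not the real backup-database.sh): verify a freshly
# written archive and only then repoint latest.sql.gz at it.

publish_backup() {
  local root="$1" file="$2"
  # Refuse to publish an archive that fails the gzip integrity test.
  gzip -t "$file" || { echo "corrupt archive: $file" >&2; return 1; }
  # Store the link target relative to $root, matching the layout above.
  local rel="${file#"$root"/}"
  ln -sfn "$rel" "$root/latest.sql.gz"
}

# Example:
#   publish_backup /backup/ai-project/database \
#     /backup/ai-project/database/20260116/ai_project_20260116_020001.sql.gz
```
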
---

## Alibaba Cloud OSS Off-Site Backup

**Configured**: 2026-01-15 01:12:00 CST
**First sync**: 2026-01-15 02:30:01 CST

### OSS Configuration

| Setting | Value |
|---------|-------|
| Endpoint | oss-cn-beijing.aliyuncs.com |
| Bucket | fnos2026 |
| Storage path | oss://fnos2026/ai-project/backups/ |
| Retention | Automatic cleanup after 30 days |
| Estimated cost | ~¥0.25/month |

### Credentials

**Location**: `~/.config/devops/credentials.env` (mode 600)

```bash
OSS_ENDPOINT="oss-cn-beijing.aliyuncs.com"
OSS_BUCKET="fnos2026"
# Real key values belong only in the credentials file, never in documentation
OSS_ACCESS_KEY_ID="<your-access-key-id>"
OSS_ACCESS_KEY_SECRET="<your-access-key-secret>"
```

**Load the credentials**:
```bash
source ~/.config/devops/credentials.env
```

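Since the file is expected to be mode 600, a loader can enforce that before sourcing it. This is a minimal sketch, assuming GNU `stat` (Linux); `load_credentials` is a hypothetical wrapper and not something the existing scripts are known to use.

```bash
#!/usr/bin/env bash
# Sketch: source the credentials file only if its permissions are 600.
# Assumes GNU stat (-c %a); BSD/macOS stat uses different flags.

load_credentials() {
  local file="${1:-$HOME/.config/devops/credentials.env}" mode
  [ -f "$file" ] || { echo "no credentials file: $file" >&2; return 1; }
  mode="$(stat -c %a "$file")" || return 1
  if [ "$mode" != "600" ]; then
    echo "refusing to load $file: mode $mode, expected 600" >&2
    return 1
  fi
  set -a        # export every variable the file defines
  # shellcheck disable=SC1090
  . "$file"
  set +a
}
```
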
### The ossutil Tool

**Version**: v1.7.15
**Install location**: `/usr/local/bin/ossutil`
**Installed**: 2026-01-15 00:45:00 CST

**Installation steps**:
```bash
wget https://gosspublic.alicdn.com/ossutil/1.7.15/ossutil64
sudo mv ossutil64 /usr/local/bin/ossutil
sudo chmod +x /usr/local/bin/ossutil
```

**Configuration**:
```bash
source ~/.config/devops/credentials.env
ossutil config -e ${OSS_ENDPOINT} \
  -i ${OSS_ACCESS_KEY_ID} \
  -k ${OSS_ACCESS_KEY_SECRET} \
  -L CH \
  --config-file ~/.ossutilconfig
```

**Test the connection**:
```bash
ossutil ls oss://${OSS_BUCKET}/
```

### Automatic Sync Script

**Script location**: `/opt/ai-project/deploy/scripts/backup-to-oss.sh`

**Features**:
- ✅ Syncs the current day's backup directory to OSS
- ✅ Uploads latest.sql.gz
- ✅ Automatically removes backups older than 30 days
- ✅ Backup statistics report
- ✅ Colored log output

**Cron configuration**:
```cron
30 2 * * * /opt/ai-project/deploy/scripts/backup-to-oss.sh >> /var/log/ai-project-oss-sync.log 2>&1
```

**Run manually**:
```bash
ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-to-oss.sh"
```

**View the log**:
```bash
ssh tools_ai_proj "tail -f /var/log/ai-project-oss-sync.log"
```

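The 30-day cleanup listed above comes down to comparing `YYYYMMDD` directory names against a cutoff date. How `backup-to-oss.sh` actually does this is not shown here, so the following is only a sketch: `expired_days` is a hypothetical filter that reads names on stdin and prints those older than the cutoff. Because the names are fixed-width dates, a plain string comparison is enough.

```bash
#!/usr/bin/env bash
# Sketch (assumption: not the real backup-to-oss.sh): pick the dated
# directories that fall outside the retention window.

expired_days() {
  local cutoff="$1" day
  while IFS= read -r day; do
    # Ignore anything that is not an 8-digit date, e.g. latest.sql.gz.
    [[ "$day" =~ ^[0-9]{8}$ ]] || continue
    # Fixed-width YYYYMMDD strings sort chronologically.
    if [[ "$day" < "$cutoff" ]]; then
      echo "$day"
    fi
  done
  return 0
}

# Example (GNU date):
#   cutoff="$(date -d '30 days ago' +%Y%m%d)"
#   printf '%s\n' 20251201 20260120 | expired_days "$cutoff"
```
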
### Common OSS Operations

```bash
# Load the credentials locally so that ${OSS_BUCKET} expands before the command is sent
source ~/.config/devops/credentials.env

# List all backup files
ssh tools_ai_proj "ossutil ls oss://${OSS_BUCKET}/ai-project/backups/ -r --config-file ~/.ossutilconfig"

# Show backup storage statistics
ssh tools_ai_proj "ossutil du oss://${OSS_BUCKET}/ai-project/backups/ --config-file ~/.ossutilconfig"

# Download the backups for a specific date
# (ossutil does not expand wildcards in object paths; filter with --include instead)
ssh tools_ai_proj "ossutil cp oss://${OSS_BUCKET}/ai-project/backups/20260115/ /tmp/ -r --include 'ai_project_20260115_*.sql.gz' --config-file ~/.ossutilconfig"

# Download the latest backup
ssh tools_ai_proj "ossutil cp oss://${OSS_BUCKET}/ai-project/backups/latest.sql.gz /tmp/ --config-file ~/.ossutilconfig"

# Show details of a backup file
ssh tools_ai_proj "ossutil stat oss://${OSS_BUCKET}/ai-project/backups/latest.sql.gz --config-file ~/.ossutilconfig"

# Manually delete the backups for a specific date
ssh tools_ai_proj "ossutil rm oss://${OSS_BUCKET}/ai-project/backups/20260101/ -r -f --config-file ~/.ossutilconfig"
```

### Backup Verification

```bash
# Confirm that the latest backup was uploaded
ssh tools_ai_proj "ossutil stat oss://${OSS_BUCKET}/ai-project/backups/latest.sql.gz --config-file ~/.ossutilconfig"

# Download the backup and test its integrity
ssh tools_ai_proj "
  ossutil cp oss://${OSS_BUCKET}/ai-project/backups/latest.sql.gz /tmp/test_restore.sql.gz --config-file ~/.ossutilconfig
  gzip -t /tmp/test_restore.sql.gz && echo '✓ Backup file is intact' || echo '✗ Backup file is corrupt'
  rm /tmp/test_restore.sql.gz
"
```

---

## Manual Backup

### Full Backup

```bash
# Connect to the production server
ssh tools_ai_proj

# Export the database
docker exec ai_postgres_prod pg_dump -U ai_prod_user ai_project_prod \
  --no-owner --no-acl --clean --if-exists \
  > /tmp/ai_project_backup_$(date +%Y%m%d_%H%M%S).sql

# Compress the backup
gzip /tmp/ai_project_backup_*.sql

# Verify the archive
gzip -t /tmp/ai_project_backup_*.sql.gz
```

### Download to a Local Machine

**Direct download** (when the network is good):
```bash
scp 'tools_ai_proj:/tmp/ai_project_backup_*.sql.gz' /tmp/
```

**Relay through a jump host** (high-latency links):
```bash
# Relay through the Singapore jump host (Tencent Cloud → Singapore → Australia).
# The first hop runs on the jump host itself so the data does not pass through the local machine.
ssh singapore "scp 'tools_ai_proj:/tmp/ai_project_backup_*.sql.gz' /tmp/"
scp 'singapore:/tmp/ai_project_backup_*.sql.gz' /tmp/

# Remove the temporary files from the jump host
ssh singapore "rm /tmp/ai_project_backup_*.sql.gz"
```

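After a multi-hop transfer it is worth confirming that the file arrived unchanged before it is used for a restore. The sketch below assumes `sha256sum` (GNU coreutils) is available; `same_checksum` is a hypothetical helper that compares two local files. To check a remote copy, run `sha256sum` on the remote side over `ssh` and compare the hex digests by hand.

```bash
#!/usr/bin/env bash
# Sketch: succeed only when two files have the same SHA-256 digest.
# Assumes sha256sum from GNU coreutils is installed.

same_checksum() {
  local a b
  a="$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)"
  b="$(sha256sum "$2" 2>/dev/null | cut -d' ' -f1)"
  # An empty digest means the file could not be read.
  [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}

# Example:
#   same_checksum /tmp/original.sql.gz /tmp/copy.sql.gz && echo "transfer ok"
```
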
---

## Database Restore

### Scenario 1: Restore from an OSS Backup (Recommended)

```bash
# 1. Download the latest backup from OSS
ssh tools_ai_proj "
  source ~/.config/devops/credentials.env
  ossutil cp oss://fnos2026/ai-project/backups/latest.sql.gz /tmp/restore.sql.gz --config-file ~/.ossutilconfig -f
"

# 2. Verify the archive
ssh tools_ai_proj "gzip -t /tmp/restore.sql.gz"

# 3. Stop the backend service
ssh tools_ai_proj "docker stop ai_backend_prod"

# 4. Restore the database
ssh tools_ai_proj "
  gunzip -c /tmp/restore.sql.gz | \
  docker exec -i ai_postgres_prod psql -U ai_prod_user ai_project_prod
"

# 5. Start the backend service
ssh tools_ai_proj "docker start ai_backend_prod"

# 6. Verify the service
curl -s https://ai.pipexerp.com/api/v1/health | jq .

# 7. Remove the temporary file
ssh tools_ai_proj "rm /tmp/restore.sql.gz"
```

### Scenario 2: Restore from a Local Backup

```bash
ssh tools_ai_proj

# Stop the backend service
docker stop ai_backend_prod

# Restore the database
gunzip -c /backup/ai-project/database/latest.sql.gz | \
  docker exec -i ai_postgres_prod psql -U ai_prod_user ai_project_prod

# Start the backend service
docker start ai_backend_prod
```

### Scenario 3: Restore Production from a Local Development Environment (Full Rebuild)

```bash
# 1. Export locally
pg_dump -U donglinlai ai_project_local \
  --no-owner --no-acl --clean --if-exists \
  --exclude-table=audit_logs \
  > /tmp/ai_project_clean.sql

# 2. Compress
gzip /tmp/ai_project_clean.sql

# 3. Transfer through the Singapore jump host (works around the high latency)
scp /tmp/ai_project_clean.sql.gz singapore:/tmp/
ssh singapore "scp /tmp/ai_project_clean.sql.gz tools_ai_proj:/tmp/"

# 4. Restore on production (the remaining commands run on the server)
ssh tools_ai_proj

# Stop the backend service
docker stop ai_backend_prod

# Rebuild the database from scratch (avoids dependency conflicts)
docker exec ai_postgres_prod psql -U ai_prod_user postgres \
  -c 'DROP DATABASE IF EXISTS ai_project_prod;'

docker exec ai_postgres_prod psql -U ai_prod_user postgres \
  -c 'CREATE DATABASE ai_project_prod OWNER ai_prod_user;'

# Restore the data
gunzip -c /tmp/ai_project_clean.sql.gz | \
  docker exec -i ai_postgres_prod psql -U ai_prod_user ai_project_prod

# Create tables that may be missing (audit_logs was excluded from the dump)
docker exec -i ai_postgres_prod psql -U ai_prod_user ai_project_prod << 'EOF'
CREATE TABLE IF NOT EXISTS audit_logs (
    id SERIAL PRIMARY KEY,
    user_id INTEGER,
    action VARCHAR(100),
    resource_type VARCHAR(100),
    resource_id VARCHAR(100),
    details TEXT,
    ip_address VARCHAR(50),
    user_agent TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
EOF

# Run the database migrations, if any (the glob expands in sorted order)
cd /opt/ai-project/backend/migrations
for file in *.sql; do
  case "$file" in *_down.sql) continue ;; esac   # skip rollback scripts
  echo "Running migration: $file"
  docker exec -i ai_postgres_prod psql -U ai_prod_user ai_project_prod < "$file"
done

# Start the backend service
docker start ai_backend_prod

# Remove the temporary file
rm /tmp/ai_project_clean.sql.gz
```

---

## Best Practices

### Safe Docker Volume Configuration

**Key rule**: every production data volume must be declared with `external: true`

**Correct configuration** (`deploy/tencent-cloud/docker-compose.dockerhub.yml`):
```yaml
volumes:
  postgres_prod_data:
    external: true
    name: ai-project_postgres_prod_data
  redis_prod_data:
    external: true
    name: ai-project_redis_prod_data
```

**Dangerous configuration** (Compose owns these volumes and can delete them, for example with `docker compose down -v`):
```yaml
volumes:
  postgres_prod_data:
  redis_prod_data:
```

**Verify**:
```bash
ssh tools_ai_proj "docker volume ls | grep ai-project"
# Expected output includes:
# ai-project_postgres_prod_data
# ai-project_redis_prod_data
```

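The manual `grep` above can be turned into a check that fails loudly, for instance as a pre-deploy step. This is a sketch under stated assumptions: `missing_volumes` is a hypothetical helper that reads volume names on stdin (for example from `docker volume ls -q`) and reports every required name that is absent.

```bash
#!/usr/bin/env bash
# Sketch: report required Docker volumes that are not present.
# stdin: one volume name per line, e.g. the output of `docker volume ls -q`.

missing_volumes() {
  local have required missing=0
  have="$(cat)"
  for required in "$@"; do
    # -x: whole-line match, -F: fixed string, so names are compared exactly.
    if ! grep -qxF -- "$required" <<< "$have"; then
      echo "missing volume: $required"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   ssh tools_ai_proj "docker volume ls -q" | \
#     missing_volumes ai-project_postgres_prod_data ai-project_redis_prod_data
```
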
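### Backup Strategy

1. **Daily automated backups**: scheduled with cron
2. **Two-tier backups**: local disk plus Alibaba Cloud OSS
3. **Regular verification**: test the restore procedure every week
4. **Retention**: automatic cleanup after 30 days
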
### Recommended pg_dump Options

```bash
# Cross-server migration
pg_dump --no-owner --no-acl --clean --if-exists --exclude-table=<problem_table>

# Option reference:
# --no-owner        Do not restore object ownership (avoids user-name conflicts)
# --no-acl          Do not restore access privileges (avoids permission problems)
# --clean           Include DROP statements (full replacement)
# --if-exists       Check for existence before DROP (avoids errors)
# --exclude-table   Skip problematic tables (for example, tables with malformed JSON)
```

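### Data Restore Checklist

Go through the following items for every restore:

- [ ] Confirm the backup file is intact (verify with `gzip -t`)
- [ ] Stop the dependent application services (avoids inconsistent data)
- [ ] Rebuild the database from scratch (DROP + CREATE, avoids dependency conflicts)
- [ ] Create any missing tables after the restore (such as excluded tables)
- [ ] Run the database migrations (brings the schema up to date)
- [ ] Verify data integrity (check row counts of key tables)
- [ ] Test the application (login and key business flows)
- [ ] Remove temporary files (backup archives, SQL files)
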
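The "check row counts of key tables" item is easier to follow if the counts are captured to a file before and after the restore and then compared mechanically. The sketch below assumes each file holds `table count` pairs, one per line (for example produced with `psql -At -F' '`); `compare_counts` and the table names in the example are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: compare two "table count" listings and report differences.
# Exit status is non-zero when a table is missing or its count changed.

compare_counts() {
  awk '
    NR == FNR { before[$1] = $2; next }   # first file: counts before restore
    { after[$1] = $2 }                    # second file: counts after restore
    END {
      for (t in before) {
        if (!(t in after)) {
          printf "missing: %s\n", t; bad = 1
        } else if (before[t] != after[t]) {
          printf "mismatch: %s %s -> %s\n", t, before[t], after[t]; bad = 1
        }
      }
      exit bad
    }' "$1" "$2"
}

# Example:
#   compare_counts /tmp/counts_before.txt /tmp/counts_after.txt
```
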
### Network Transfer Optimization

**Scenario**: cross-region, high-latency links (for example Australia → Tencent Cloud)

**Problem**: the direct route has 370ms+ latency, so large transfers are extremely slow

**Solution**: relay through a jump host located geographically in between

```bash
# Direct (slow): Australia → Tencent Cloud (370ms+)
scp file.gz tools_ai_proj:/tmp/

# Optimized (fast): Australia → Singapore → Tencent Cloud
scp file.gz singapore:/tmp/
ssh singapore "scp /tmp/file.gz tools_ai_proj:/tmp/"
```

**Singapore jump host**:
- Alias: singapore
- IP: 43.134.28.147
- User: ubuntu
- SSH key: ~/.ssh/singpore.pem

---

## Monitoring and Alerting

### Weekly Checklist

**Recommended frequency**: once a week

```bash
# 1. Check the local backup
#    (single quotes: the date is evaluated on the server, in the server's time zone)
ssh tools_ai_proj 'ls -lh /backup/ai-project/database/$(date +%Y%m%d)/'

# 2. Check the OSS backup
ssh tools_ai_proj "ossutil stat oss://fnos2026/ai-project/backups/latest.sql.gz --config-file ~/.ossutilconfig"

# 3. Check the cron logs
ssh tools_ai_proj "tail -20 /var/log/ai-project-backup.log"
ssh tools_ai_proj "tail -20 /var/log/ai-project-oss-sync.log"

# 4. Check the backup size (expected range: 10-20M)
ssh tools_ai_proj 'du -sh /backup/ai-project/database/$(date +%Y%m%d)/'

# 5. Test backup integrity
ssh tools_ai_proj "
  gzip -t /backup/ai-project/database/latest.sql.gz && \
  echo '✓ Local backup is intact' || echo '✗ Local backup is corrupt'
"
```

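A weekly manual check can miss a backup job that silently stopped running. As a sketch only, the hypothetical `backup_is_fresh` function below succeeds when a file was modified within a given number of minutes (26 hours by default, an assumed margin over the daily schedule), which makes it easy to wire into a cron alert. It relies on `find -mmin`, available in GNU and BSD find.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if the backup file was modified recently enough.
# The 1560-minute (26 h) default is an assumption, not a documented SLA.

backup_is_fresh() {
  local file="$1" max_minutes="${2:-1560}"
  [ -e "$file" ] || return 1
  # -L follows the latest.sql.gz symlink; find prints the path only when
  # the target was modified less than max_minutes ago.
  [ -n "$(find -L "$file" -maxdepth 0 -mmin "-$max_minutes" 2>/dev/null)" ]
}

# Example:
#   backup_is_fresh /backup/ai-project/database/latest.sql.gz || echo "backup is stale"
```
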
### Troubleshooting Failed Backups

If a backup or sync fails:

1. **Check disk space**:
   ```bash
   ssh tools_ai_proj "df -h"
   ```

2. **Check the PostgreSQL container status**:
   ```bash
   ssh tools_ai_proj "docker ps | grep postgres"
   ```

3. **Check the ossutil configuration**:
   ```bash
   ssh tools_ai_proj "cat ~/.ossutilconfig"
   ```

4. **Test the OSS connection**:
   ```bash
   ssh tools_ai_proj "ossutil ls oss://fnos2026/ --config-file ~/.ossutilconfig"
   ```

5. **Run the scripts by hand to see the detailed errors**:
   ```bash
   ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-database.sh"
   ssh tools_ai_proj "/opt/ai-project/deploy/scripts/backup-to-oss.sh"
   ```

---

## Cost Estimate

**Based on the current data volume** (13MB/day):

| Item | Calculation | Monthly cost |
|------|-------------|--------------|
| OSS storage | 13MB × 30 days = 390MB × ¥0.12/GB | ¥0.05 |
| OSS traffic | 13MB × 30 days = 390MB × ¥0.50/GB | ¥0.20 |
| **Total** | | **¥0.25** |

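The arithmetic in the table can be re-run when the daily backup size or the prices change. The sketch below is a worked version of the same calculation in integer shell arithmetic; `oss_monthly_cost` is a hypothetical helper, prices are passed in fen per GB (¥0.12 = 12, ¥0.50 = 50), and 1 GB is taken as 1000 MB. The unrounded total comes to about ¥0.24, in line with the table's ¥0.25, which rounds each item before adding.

```bash
#!/usr/bin/env bash
# Sketch: reproduce the monthly OSS cost estimate with integer arithmetic.
# Prices are in fen (1/100 yuan) per GB; 1 GB is treated as 1000 MB.

oss_monthly_cost() {
  local daily_mb="$1" days="$2" storage_fen_per_gb="$3" traffic_fen_per_gb="$4"
  local mb=$(( daily_mb * days ))
  # MB x fen/GB yields units of 1e-5 yuan (1/1000 GB x 1/100 yuan).
  local storage=$(( mb * storage_fen_per_gb ))
  local traffic=$(( mb * traffic_fen_per_gb ))
  local total=$(( storage + traffic ))
  printf 'stored=%dMB storage=%d.%05d traffic=%d.%05d total=%d.%05d\n' \
    "$mb" \
    $(( storage / 100000 )) $(( storage % 100000 )) \
    $(( traffic / 100000 )) $(( traffic % 100000 )) \
    $(( total / 100000 ))   $(( total % 100000 ))
}

oss_monthly_cost 13 30 12 50
# prints: stored=390MB storage=0.04680 traffic=0.19500 total=0.24180
```
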
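---

## Incident Record

### 2026-01-15: Production Database Loss

**Incident window**: 2026-01-15 00:00:00 - 00:46:00 CST

**What happened**: during a Jenkins deployment, `docker compose down` deleted the non-external volumes

**Impact**: the production database was wiped completely and no user could log in

**Recovery steps**:
1. Exported a full data set from the local development environment (41 users, 54 projects, 4,722 tasks)
2. Relayed the transfer through the Singapore jump host (to work around the 370ms+ latency)
3. Rebuilt the database from scratch to avoid dependency conflicts
4. Reset all administrator passwords

**Recovery completed**: 2026-01-15 00:46:00 CST

**Preventive measures**:
1. ✅ All data volumes marked `external: true` (completed: 2026-01-15 00:46:00)
2. ✅ Automatic database migration added to the Jenkinsfile (completed: 2026-01-15 00:46:00)
3. ✅ Webhook-triggered automatic deployment temporarily disabled (completed: 2026-01-15 00:46:00)
4. ✅ Automated backup strategy configured (local + OSS, two tiers) (completed: 2026-01-15 02:30:46)

**Detailed record**: see the SiYuan note `devops/运维记录/2026-01-15 AI-Proj生产数据库恢复记录`

---
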
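## User Management

For database user management (creating users, resetting passwords), see:
- **ops-tools/SKILL.md**, section "AI-Proj 用户管理" (AI-Proj user management)
- **ai-proj-deploy.md**, section "用户管理" (user management)

**Key points**:
- Password hashes use bcrypt with **cost 12** (`DefaultCost` in the backend's `utils/password.go`)
- bcrypt hashes contain `$`, which the shell would expand in an inline command, so the SQL must be transferred as a file and executed from that file

---
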
## Related Resources

- **Parent skill**: ops-tools/skill.md
- **Backup script**: `/opt/ai-project/deploy/scripts/backup-database.sh`
- **OSS sync script**: `/opt/ai-project/deploy/scripts/backup-to-oss.sh`
- **Credentials file**: `~/.config/devops/credentials.env`
- **SiYuan note**: `devops/运维记录/2026-01-15 AI-Proj生产数据库恢复记录`

---

## Standard Database Migration Procedure

> **Mandatory**: every database migration must follow the procedure below.

### Migration Checklist

```
┌─────────────────────────────────────────────────────────────┐
│            Standard database migration procedure            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Pre-migration backup   ⬅️ required                      │
│    ssh tools_ai_proj 'docker exec ai_postgres_prod \        │
│      pg_dump -U ai_prod_user -Fc ai_project_prod \          │
│      > /backup/ai-project/database/pre_migration.dump'      │
│                                                             │
│  2. Verify the backup file                                  │
│    ssh tools_ai_proj 'ls -lh /backup/.../pre_migration.dump'│
│                                                             │
│  3. Record the pre-migration state                          │
│    SELECT COUNT(*) FROM <table>;                            │
│                                                             │
│  4. Run the migration (inside a transaction)                │
│    BEGIN; ... COMMIT;                                       │
│                                                             │
│  5. Verify the migration result                             │
│    SELECT COUNT(*) FROM <table>;                            │
│                                                             │
│  6. If anything is wrong, restore the backup                │
│    pg_restore -Fc pre_migration.dump                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

### Migration SQL Template

```sql
-- ============================================
-- Migration script template
-- Take a backup before running this!
-- ============================================

BEGIN;

-- Counts before the migration
\echo '=== Counts before migration ==='
SELECT 'table_name' as info, COUNT(*) as count FROM table_name WHERE condition;

-- Run the migration
\echo '=== Running migration ==='
UPDATE table_name SET column = new_value WHERE condition;

-- Counts after the migration
\echo '=== Counts after migration ==='
SELECT 'table_name' as info, COUNT(*) as count FROM table_name WHERE condition;

-- Commit once the numbers look right
COMMIT;

\echo '=== Migration complete ==='
```

### Rolling Back a Failed Migration

```bash
# 1. Stop the backend service
ssh tools_ai_proj 'docker stop ai_backend_prod'

# 2. Restore the backup (streamed over stdin because the dump is on the host,
#    not inside the container)
ssh tools_ai_proj 'docker exec -i ai_postgres_prod pg_restore \
  -U ai_prod_user -d ai_project_prod --clean --if-exists -Fc \
  < /backup/ai-project/database/ai_project_XXXXXXXX_pre_migration.dump'

# 3. Start the backend service
ssh tools_ai_proj 'docker start ai_backend_prod'

# 4. Verify the service
curl -s https://ai.pipexerp.com/api/v1/health | jq .
```

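The `XXXXXXXX` placeholder in the rollback command has to be replaced with a real file name, and under pressure it is easy to pick the wrong dump. As a sketch, the hypothetical `latest_pre_migration` function below returns the most recently modified `*pre_migration*.dump` in a directory, which covers both naming schemes used in this document.

```bash
#!/usr/bin/env bash
# Sketch: print the newest pre-migration dump in a directory.
# Returns non-zero when the directory holds no such dump.

latest_pre_migration() {
  local dir="$1" newest="" f
  for f in "$dir"/*pre_migration*.dump; do
    [ -f "$f" ] || continue            # the glob matched nothing
    # -nt compares modification times.
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example (run on the database host):
#   latest_pre_migration /backup/ai-project/database
```
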
---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 2.0 | 2026-02-02 | Promoted to a global skill: added the pre-migration backup procedure, multi-database support, the 7-day + monthly retention policy, and quick restore commands |
| 1.0 | 2026-01-15 | Initial version: AI-Proj backup and OSS sync |

---

**Document created**: 2026-01-15 07:30:00 ACDT
**Last updated**: 2026-02-02
**Document status**: ✅ In operation