---
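name: ops-servers
description: Enterprise server management. Covers cloud server grouping, system monitoring, backup management, and troubleshooting. Activates automatically when the user mentions tasks involving cloud servers, production environments, Tencent Cloud, or Alibaba Cloud.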
---

# Enterprise Server Management Skill

> For home network devices, use the `ops-home` Skill

---

## Server Inventory

| Alias | IP | User | SSH key | Purpose | Specs | ISP | Account |
|-------|----|------|---------|---------|-------|-----|---------|
| prod-pipexerp | 192.144.137.14 | ubuntu | ~/.ssh/officialWebsite.pem | pipexerp official website | 2 cores, 2 GB RAM, 40 GB SSD | Tencent Cloud | 北京对丝 |
| prod-metaBI | 192.144.174.87 | ubuntu | ~/.ssh/prod_meta.pem | Metabase BI analytics | - | Tencent Cloud | 北京欢乐宿 |
| moltbot | 124.223.196.74 | root | ~/.ssh/moltbot | Moltbot service | - | Tencent Cloud | - |
| lazycat | 100.115.52.119 (haiqing.heiyu.space) | root | Password authentication (zhiyun2026) | AI/compute node | - | - | - |

### SSH Quick Connect

```bash
# pipexerp official website server
ssh prod-pipexerp

# Metabase BI analytics server
ssh prod-metaBI

# Moltbot server
ssh moltbot

# Lazycat AI compute node (password: zhiyun2026)
ssh root@haiqing.heiyu.space
# Or connect by IP
ssh root@100.115.52.119
```

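
The `ssh <alias>` commands in this section only work if each alias is defined in the local `~/.ssh/config`, a file the source does not show. The following is a minimal sketch, assuming the inventory table is the source of truth: the hypothetical helper `ssh_config_entry` prints one `Host` stanza per row so the output can be reviewed and appended by hand. `IdentitiesOnly yes` is an added assumption rather than something the source specifies, and `lazycat` is left out because it uses password authentication.

```bash
#!/usr/bin/env bash
# Print an ssh_config(5) Host stanza for one inventory row.
# Usage: ssh_config_entry ALIAS HOSTNAME USER [IDENTITY_FILE]
# (ssh_config_entry is a hypothetical helper, not part of the original skill.)
ssh_config_entry() {
  local host_alias=$1 host_name=$2 user=$3 key=${4:-}
  printf 'Host %s\n' "$host_alias"
  printf '    HostName %s\n' "$host_name"
  printf '    User %s\n' "$user"
  if [ -n "$key" ]; then
    printf '    IdentityFile %s\n' "$key"
    # Assumption: offer only the listed key instead of every key in the agent
    printf '    IdentitiesOnly yes\n'
  fi
}

# Rows taken from the server inventory table; review before appending
# the output to ~/.ssh/config.
ssh_config_entry prod-pipexerp 192.144.137.14 ubuntu '~/.ssh/officialWebsite.pem'
ssh_config_entry prod-metaBI 192.144.174.87 ubuntu '~/.ssh/prod_meta.pem'
ssh_config_entry moltbot 124.223.196.74 root '~/.ssh/moltbot'
```
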
---

## Server Grouping Architecture

Servers are grouped along two axes, **environment + service**:

### By Environment

| Environment | Prefix | Purpose |
|-------------|--------|---------|
| prod | prod- | Production |
| staging | stg- | Staging (pre-release) |
| test | test- | Testing |
| dev | dev- | Development |

### By Service

| Service group | Services | Notes |
|---------------|----------|-------|
| web | Nginx, front-end static assets | Load balancing, static assets |
| api | Go/Node back-end services | Business APIs |
| db | MySQL, PostgreSQL | Databases |
| cache | Redis | Caching |

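
The environment prefixes can be applied mechanically. The sketch below assumes the prefix is the only signal available, and `env_of_alias` is a hypothetical name. Aliases without a recognized prefix, such as `moltbot` and `lazycat` in the inventory, fall through to `unknown` rather than being guessed.

```bash
#!/usr/bin/env bash
# Map a server alias to its environment using the documented prefixes.
# (env_of_alias is a hypothetical helper, not part of the original skill.)
env_of_alias() {
  case $1 in
    prod-*) echo prod ;;
    stg-*)  echo staging ;;
    test-*) echo test ;;
    dev-*)  echo dev ;;
    *)      echo unknown ;;
  esac
}

# Classify the aliases from the server inventory
for host in prod-pipexerp prod-metaBI moltbot lazycat; do
  printf '%s\t%s\n' "$host" "$(env_of_alias "$host")"
done
```
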
---

## Common Operations Commands

### System Status Checks

```bash
# One-shot system overview: load, memory, and disk
ssh prod-pipexerp "echo '=== Load ===' && uptime && echo && echo '=== Memory ===' && free -h && echo && echo '=== Disk ===' && df -h"

# Processes using the most CPU (header line plus the top 9)
ssh prod-pipexerp "ps aux --sort=-%cpu | head -10"

# Processes using the most memory (header line plus the top 9)
ssh prod-pipexerp "ps aux --sort=-%mem | head -10"
```

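
The disk output from the overview command can also be checked automatically. This is a minimal sketch, assuming POSIX-format `df -P` output and mount points without spaces. `disk_over_threshold` and its default of 80 percent are hypothetical choices, not values from the source.

```bash
#!/usr/bin/env bash
# Read `df -P` output on stdin and print every mount point whose usage is
# at or above LIMIT percent (default 80, an assumed value).
# (disk_over_threshold is a hypothetical helper.)
disk_over_threshold() {
  local limit=${1:-80}
  awk -v limit="$limit" 'NR > 1 {
    pct = $5; sub(/%/, "", pct)          # column 5 is "Capacity", e.g. 90%
    if (pct + 0 >= limit) printf "%s %s%%\n", $6, pct
  }'
}

# Example against a real host:
#   ssh prod-pipexerp "df -P" | disk_over_threshold 80
```
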
### Docker Management

```bash
# List running containers
ssh prod-pipexerp "docker ps"

# List all containers, including stopped ones
ssh prod-pipexerp "docker ps -a"

# Follow a container's logs, starting from the last 100 lines
ssh prod-pipexerp "docker logs --tail 100 -f <container_name>"

# Restart a container
ssh prod-pipexerp "docker restart <container_name>"

# Remove unused resources. -a also deletes every image that no container
# uses, and -f skips the confirmation prompt
ssh prod-pipexerp "docker system prune -af"
```

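### Network Checks

```bash
# List listening TCP ports and the processes that own them
# (netstat comes from net-tools; `ss -tlnp` takes the same flags if it is missing)
ssh prod-pipexerp "sudo netstat -tlnp"

# Check firewall status
ssh prod-pipexerp "sudo ufw status"

# Test port connectivity from the local machine
nc -zv 192.144.137.14 80
nc -zv 192.144.137.14 443
```
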
### Viewing Logs

```bash
# Nginx error log
ssh prod-pipexerp "sudo tail -f /var/log/nginx/error.log"

# Nginx access log
ssh prod-pipexerp "sudo tail -f /var/log/nginx/access.log"

# System log
ssh prod-pipexerp "sudo tail -f /var/log/syslog"
```

---

## Batch Operations

### Running a Command on Multiple Servers

```bash
# Define the server list (space-separated aliases)
SERVERS="prod-pipexerp"

# Check status on each server
for host in $SERVERS; do
  echo "=== $host ==="
  ssh "$host" "uptime && free -h | head -2 && df -h / | tail -1"
done
```

### Health Check Script

```bash
# Check all servers
for host in prod-pipexerp; do
  echo "=== $host ==="
  ssh "$host" "
    echo '--- Load ---' && uptime
    echo '--- Memory ---' && free -h | head -2
    echo '--- Disk ---' && df -h / | tail -1
    echo '--- Docker ---' && docker ps --format 'table {{.Names}}\t{{.Status}}' 2>/dev/null || echo 'N/A'
  "
done
```

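
The loops in this section keep going when a host is unreachable, but they do not report which hosts failed. One possible extension, offered as a sketch: the hypothetical function `run_on_hosts` records failures and returns non-zero if any host failed. `SSH_BIN` is an invented override for dry runs, and the `BatchMode` and `ConnectTimeout=5` options are assumptions. `BatchMode` disables password prompts, so a password-only host such as `lazycat` would be reported as failed.

```bash
#!/usr/bin/env bash
# Run one command on several hosts and report the hosts that failed.
# Usage: run_on_hosts "COMMAND" HOST [HOST...]
# (run_on_hosts and SSH_BIN are hypothetical, not part of the original skill.)
run_on_hosts() {
  local cmd=$1; shift
  local host failed=()
  for host in "$@"; do
    echo "=== $host ==="
    # BatchMode fails instead of prompting; ConnectTimeout bounds dead hosts
    if ! "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" "$cmd"; then
      failed+=("$host")
    fi
  done
  if [ "${#failed[@]}" -gt 0 ]; then
    echo "FAILED: ${failed[*]}" >&2
    return 1
  fi
}

# Example:
#   run_on_hosts "uptime && df -h / | tail -1" prod-pipexerp prod-metaBI
```
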
---

## Backup Management

### Backup Strategy

| Backup type | Frequency | Retention | Location |
|-------------|-----------|-----------|----------|
| Full database | Daily at 02:00 | 7 days | /backup/mysql/ |
| Configuration files | Daily at 03:00 | 30 days | /backup/configs/ |
| Uploaded files | Daily at 04:00 | 30 days | /backup/uploads/ |

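### Manual Backup

```bash
# Back up the Nginx configuration (the escaped \$ runs date on the server)
ssh prod-pipexerp "sudo tar -czf /tmp/nginx-\$(date +%Y%m%d).tar.gz /etc/nginx/"

# Download the backup to the local machine (quoted so the glob expands remotely)
mkdir -p ./backups
scp 'prod-pipexerp:/tmp/nginx-*.tar.gz' ./backups/
```
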
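### Backup Cleanup

```bash
# Delete database backups older than 7 days. The path is limited to
# /backup/mysql/ because configs and uploads are retained for 30 days
ssh prod-pipexerp "sudo find /backup/mysql/ -name '*.tar.gz' -mtime +7 -delete"
```
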
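
The retention periods in the backup strategy table differ by backup type, so each directory needs its own cutoff. This is a minimal sketch, assuming every backup is a `*.tar.gz` file (the source confirms that only for the Nginx archive) and that the function runs on the server itself. `prune_backups` and the `DRY_RUN` switch are hypothetical names.

```bash
#!/usr/bin/env bash
# Delete *.tar.gz files under DIR that are older than DAYS days.
# With DRY_RUN=1 the matches are listed and nothing is deleted.
# (prune_backups and DRY_RUN are hypothetical, not part of the original skill.)
prune_backups() {
  local dir=$1 days=$2
  if [ "${DRY_RUN:-0}" = 1 ]; then
    find "$dir" -type f -name '*.tar.gz' -mtime +"$days" -print
  else
    find "$dir" -type f -name '*.tar.gz' -mtime +"$days" -delete
  fi
}

# Example, using the retention periods from the backup strategy table:
#   DRY_RUN=1 prune_backups /backup/mysql 7
#   prune_backups /backup/mysql   7
#   prune_backups /backup/configs 30
#   prune_backups /backup/uploads 30
```
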
---

## Troubleshooting

### Common Issues

1. **Service not responding**
   ```bash
   ssh prod-pipexerp "sudo systemctl status nginx"
   ssh prod-pipexerp "sudo journalctl -u nginx --since '10 minutes ago'"
   ```

2. **Disk space running low**
   ```bash
   ssh prod-pipexerp "df -h && sudo du -sh /* 2>/dev/null | sort -h | tail -10"
   # Clean up Docker
   ssh prod-pipexerp "docker system prune -af"
   # Trim the systemd journal
   ssh prod-pipexerp "sudo journalctl --vacuum-size=500M"
   ```

3. **Out of memory**
   ```bash
   ssh prod-pipexerp "free -h && ps aux --sort=-%mem | head -10"
   ```

4. **Website unreachable**
   ```bash
   # Check Nginx
   ssh prod-pipexerp "sudo systemctl status nginx"
   # Check that ports 80 and 443 are listening
   ssh prod-pipexerp "sudo netstat -tlnp | grep ':80\|:443'"
   # Test access from the server itself
   ssh prod-pipexerp "curl -I http://localhost"
   ```

5. **SSL certificate problems**
   ```bash
   # Check the certificate validity dates
   ssh prod-pipexerp "sudo openssl x509 -in /etc/nginx/ssl/cert.pem -noout -dates"
   ```

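
The certificate check prints raw dates, which still have to be compared with today by hand. As a sketch, assuming GNU `date` on the machine that runs it, the hypothetical helper `days_until_expiry` turns the `notAfter=` line from `openssl x509 -enddate` into whole days remaining. The result is negative once the certificate has expired, and the optional second argument exists only to make the calculation reproducible.

```bash
#!/usr/bin/env bash
# Convert an openssl "notAfter=..." line into whole days remaining.
# Usage: days_until_expiry "notAfter=<date>" [NOW_EPOCH]
# (days_until_expiry is a hypothetical helper; requires GNU date.)
days_until_expiry() {
  local line=$1 now=${2:-$(date +%s)}
  local end
  end=$(date -u -d "${line#notAfter=}" +%s) || return 1
  echo $(( (end - now) / 86400 ))
}

# Example against the server certificate:
#   line=$(ssh prod-pipexerp "sudo openssl x509 -in /etc/nginx/ssl/cert.pem -noout -enddate")
#   days_until_expiry "$line"
```
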
---

## Account Management

### System Users

| Username | Purpose | Privileges |
|----------|---------|------------|
| ubuntu | Default admin user | sudo |
| deploy | Deployment user | Deployment-related |

### Creating the Deploy User

```bash
# Create the user
ssh prod-pipexerp "sudo useradd -m -s /bin/bash deploy"

# Set up SSH keys (copies the login user's authorized_keys)
ssh prod-pipexerp "sudo mkdir -p /home/deploy/.ssh && sudo chmod 700 /home/deploy/.ssh"
ssh prod-pipexerp "sudo cp ~/.ssh/authorized_keys /home/deploy/.ssh/ && sudo chown -R deploy:deploy /home/deploy/.ssh"

# Grant passwordless sudo for docker and systemctl restart.
# Note: unrestricted docker access is effectively root access
ssh prod-pipexerp "echo 'deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart *, /usr/bin/docker *' | sudo tee /etc/sudoers.d/deploy"
# Validate the syntax of the new sudoers file
ssh prod-pipexerp "sudo visudo -cf /etc/sudoers.d/deploy"
```

---
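
## Security Notes

- SSH key files must have mode 600: `chmod 600 ~/.ssh/*.pem`
- Use `sudo` for commands that require root privileges
- Confirm the target server and the affected resource before any sensitive operation
- Production operations require a second confirmation
- Keep the system and its packages up to date
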
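
Two of the rules above can be checked in a script rather than by habit. The sketch below is one possible approach, not an established part of this skill: `key_mode_ok` and `confirm_prod` are hypothetical helpers, and the `stat -c` syntax assumes GNU coreutils. The `*.pem` glob in the first rule does not match `~/.ssh/moltbot`, so that key needs to be checked by name.

```bash
#!/usr/bin/env bash
# Succeed only if FILE has mode 600 (GNU stat syntax).
# (key_mode_ok is a hypothetical helper.)
key_mode_ok() {
  [ "$(stat -c %a "$1")" = 600 ]
}

# Require the operator to retype the alias before touching a prod- host.
# Aliases without the prod- prefix pass through without a prompt.
# (confirm_prod is a hypothetical helper.)
confirm_prod() {
  local host=$1 answer
  case $host in
    prod-*)
      printf 'About to operate on %s. Type the alias to continue: ' "$host" >&2
      read -r answer
      [ "$answer" = "$host" ]
      ;;
    *) return 0 ;;
  esac
}

# Example:
#   key_mode_ok ~/.ssh/moltbot || chmod 600 ~/.ssh/moltbot
#   confirm_prod prod-pipexerp && ssh prod-pipexerp "sudo systemctl restart nginx"
```
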