# OpenClaw Operations Skill

A complete guide to containerized deployment, operations monitoring, and troubleshooting for OpenClaw.

## Table of Contents

- [Server Deployment](#server-deployment)
- [Container Management](#container-management)
- [Performance Optimization](#performance-optimization)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)

---

## Server Deployment

### LazyCat Compute Box (懒猫算力仓, lazycat)

**Server information**:
- Hostname: haiqing.heiyu.space
- SSH aliases: lazycat, lanmao
- Purpose: OpenClaw compute service
- OS: Debian-based Linux
- Container platform: lzc-docker

**OpenClaw container information**:
- Container ID: 5f3bf33e090b
- Image: registry.lazycat.cloud/openclaw:1.1.5
- OpenClaw version: 2026.2.9
- Container name: iamxiaoelzcappopenclaw-openclaw-1

**Access**:

```bash
# Connect over SSH
ssh lazycat

# Enter the container (-t allocates the TTY that `exec -it` needs)
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b bash"

# Start the OpenClaw TUI
openclaw-tui  # uses the local shortcut script
```

---

## Container Management

### Shortcut Script

**~/bin/openclaw-tui**:

```bash
#!/bin/bash
# OpenClaw TUI shortcut script (starts the Gateway automatically)

set -e

echo "🦞 Connecting to the lobster server (LazyCat)..."
echo ""

# Check the Gateway and start it if needed
echo "Checking OpenClaw Gateway status..."
GATEWAY_STATUS=$(ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway status" 2>/dev/null | grep "RPC probe" || echo "failed")

if echo "$GATEWAY_STATUS" | grep -q "ok"; then
    echo "✅ Gateway is running"
else
    echo "🔧 Starting Gateway..."
    ssh lazycat "lzc-docker exec -d 5f3bf33e090b bash -c 'nohup openclaw gateway run > /tmp/gateway.log 2>&1 &'" 2>/dev/null
    sleep 2
    echo "✅ Gateway start requested"
fi

echo ""
echo "Starting OpenClaw TUI..."

# SSH into the LazyCat server, then enter the Docker container and start the OpenClaw TUI
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b bash -c 'openclaw tui'"
```

### Container Commands

```bash
# Show container status
ssh lazycat "lzc-docker ps | grep openclaw"

# Follow container logs
ssh lazycat "lzc-docker logs -f --tail 100 5f3bf33e090b"

# Restart the container
ssh lazycat "lzc-docker restart 5f3bf33e090b"

# Show container resource usage
ssh lazycat "lzc-docker stats --no-stream 5f3bf33e090b"

# Open a shell in the container (-t allocates the TTY that `exec -it` needs)
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b bash"
```

### Gateway Management

```bash
# Check Gateway status
ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway status"

# Start the Gateway (foreground)
ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway run"

# Start the Gateway (background)
ssh lazycat "lzc-docker exec -d 5f3bf33e090b bash -c 'nohup openclaw gateway run > /tmp/gateway.log 2>&1 &'"

# Stop the Gateway
ssh lazycat "lzc-docker exec 5f3bf33e090b pkill -f 'openclaw-gateway'"

# Follow the Gateway log (the date is expanded on the local machine)
ssh lazycat "lzc-docker exec 5f3bf33e090b tail -f /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log"
```

---

## Performance Optimization

### Resource Configuration

**Current configuration** (close to the system maximum, to ensure ample performance):
- Memory limit: 30 GB (97% of system memory)
- Memory + swap: 32 GB
- CPU limit: 8.0 cores (100% of the system)
- Process limit: 10,000

```bash
# Show the configured resource limits
ssh lazycat "lzc-docker inspect 5f3bf33e090b --format='
Memory limit: {{.HostConfig.Memory}} bytes
Memory + swap: {{.HostConfig.MemorySwap}} bytes
CPU quota: {{.HostConfig.CpuQuota}}
CPU period: {{.HostConfig.CpuPeriod}}
PID limit: {{.HostConfig.PidsLimit}}
'"

# Show live resource usage
ssh lazycat "lzc-docker stats --no-stream 5f3bf33e090b"
```

**Notes**:
- The main job of the LazyCat compute box is to serve OpenClaw
- Limits are set close to the system maximum so the service has ample resources
- The limits still provide basic protection against runaway processes

### Automated Optimization Measures

#### 1. Scheduled Automatic Restart (Sundays at 03:00)

**Purpose**: clear accumulated zombie processes and release resources

**Check status**:

```bash
# Show the timer status
ssh lazycat "systemctl status openclaw-restart.timer"

# Show the restart log
ssh lazycat "tail -50 /var/log/openclaw-restart.log"

# Run a restart manually
ssh lazycat "/root/restart-openclaw.sh"
```

**Configuration files**:
- Service: `/etc/systemd/system/openclaw-restart.service`
- Timer: `/etc/systemd/system/openclaw-restart.timer`
- Script: `/root/restart-openclaw.sh`

**Restart script** (`/root/restart-openclaw.sh`):

```bash
#!/bin/bash
# Periodic restart script for the OpenClaw container
# Runs every Sunday at 03:00

LOG_FILE='/var/log/openclaw-restart.log'

echo "[$(date '+%Y-%m-%d %H:%M:%S')] Restarting the OpenClaw container" >> "$LOG_FILE"

# Restart the container
/lzcsys/bin/lzc-docker restart 5f3bf33e090b >> "$LOG_FILE" 2>&1

# Wait for the container to start
sleep 10

# Check the health status
STATUS=$(/lzcsys/bin/lzc-docker inspect -f '{{.State.Health.Status}}' 5f3bf33e090b 2>/dev/null || echo 'unknown')
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Restart finished, health status: $STATUS" >> "$LOG_FILE"

# Count zombies by the STAT column only; grepping `ps aux` for 'Z' also matches the VSZ header
ZOMBIE_COUNT=$(/lzcsys/bin/lzc-docker exec 5f3bf33e090b ps -eo stat= | grep '^Z' | wc -l)
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Current zombie processes: $ZOMBIE_COUNT" >> "$LOG_FILE"
echo "----------------------------------------" >> "$LOG_FILE"
```

#### 2. Automatic Zombie Process Monitoring (hourly)

**Purpose**: watch the zombie process count and restart the container automatically when it exceeds the threshold

**Check status**:

```bash
# Show the monitor timer status
ssh lazycat "systemctl status openclaw-zombie-monitor.timer"

# Show the monitor log
ssh lazycat "tail -50 /var/log/openclaw-zombie-monitor.log"

# Run the zombie check manually
ssh lazycat "/root/monitor-openclaw-zombies.sh"

# Show the zombie count directly
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo stat= | grep '^Z' | wc -l"
```

**Monitoring parameters**:
- Check frequency: hourly
- Trigger threshold: 50 zombie processes
- Automatic action: restart the container

**Monitor script** (`/root/monitor-openclaw-zombies.sh`):

```bash
#!/bin/bash
# OpenClaw zombie process monitor
# Restarts the container automatically when more than 50 zombies exist

ZOMBIE_THRESHOLD=50
CONTAINER_ID='5f3bf33e090b'
LOG_FILE='/var/log/openclaw-zombie-monitor.log'

# Count zombies by the STAT column.
# grep -c prints 0 but exits 1 when nothing matches, hence `|| true` rather than `|| echo 0`.
ZOMBIE_COUNT=$(/lzcsys/bin/lzc-docker exec "$CONTAINER_ID" ps -eo stat= 2>/dev/null | grep -c '^Z' || true)

echo "[$(date '+%Y-%m-%d %H:%M:%S')] Zombie processes: $ZOMBIE_COUNT" >> "$LOG_FILE"

if [ "$ZOMBIE_COUNT" -gt "$ZOMBIE_THRESHOLD" ]; then
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] ⚠️  Zombie count exceeds the threshold ($ZOMBIE_THRESHOLD), restarting automatically" >> "$LOG_FILE"

    # Restart the container
    /lzcsys/bin/lzc-docker restart "$CONTAINER_ID" >> "$LOG_FILE" 2>&1

    # Wait for the container to start
    sleep 10

    # Check again
    NEW_ZOMBIE_COUNT=$(/lzcsys/bin/lzc-docker exec "$CONTAINER_ID" ps -eo stat= 2>/dev/null | grep -c '^Z' || true)
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] Zombie processes after restart: $NEW_ZOMBIE_COUNT" >> "$LOG_FILE"
    echo "----------------------------------------" >> "$LOG_FILE"
fi
```

#### 3. Full Health Check

```bash
# One-shot health check
ssh lazycat "
echo '=== System load ===' && uptime && echo '' &&
echo '=== Zombie processes ===' && lzc-docker exec 5f3bf33e090b ps -eo stat= | grep '^Z' | wc -l && echo '' &&
echo '=== Container resources ===' && lzc-docker stats --no-stream 5f3bf33e090b && echo '' &&
echo '=== Gateway status ===' && lzc-docker exec 5f3bf33e090b openclaw gateway status | grep 'RPC probe' && echo '' &&
echo '=== Container health ===' && lzc-docker inspect 5f3bf33e090b --format='Status: {{.State.Status}}, Health: {{.State.Health.Status}}'
"

# List all OpenClaw timers
ssh lazycat "systemctl list-timers | grep openclaw"
```

---

## Troubleshooting

### Tower Crash Loop (fixed 2026-02-16)

**Symptoms**:
- The Tower log shows repeated crashes: `[tower] OpenClaw crashed: exit status 1`
- The Gateway fails to start: `gateway already running (pid xxx); lock timeout`
- Zombie Gateway processes pile up and are never reaped
- Several zombie processes show up in the output: `[openclaw-gatewa] <defunct>`

**Typical error log**:

```
[22:19:39] [tower] OpenClaw crashed: exit status 1
[22:24:52] [tower] OpenClaw crashed: signal: killed
[22:27:33] Gateway failed to start: gateway already running (pid 2005)
[22:27:33] If the gateway is supervised, stop it with: openclaw gateway stop
```

**Root cause**:
- The container's PID 1 is `/usr/local/bin/tower`, which is not a dedicated init process
- Tower does not reap exited child processes, so they remain as zombies
- A zombie has already released its port (18789) and file locks, but its PID still exists, so the Gateway's lock check most likely treats the old PID as alive and refuses to start (`gateway already running`)

**Diagnostic commands**:

```bash
# Show the PID 1 process
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -p 1 -o pid,ppid,cmd"

# Show zombie process details
ssh lazycat "lzc-docker exec 5f3bf33e090b ps aux | grep 'defunct'"

# Check who is listening on the Gateway port
ssh lazycat "lzc-docker exec 5f3bf33e090b netstat -tlnp | grep 18789"

# Show the process tree
ssh lazycat "lzc-docker exec 5f3bf33e090b ps auxf | head -30"
```

**Permanent fix (applied)**:

Use **tini** as the container's PID 1 so that zombie processes are reaped automatically.

```bash
# 1. Install tini (a dedicated init process) in the container
ssh lazycat "lzc-docker exec 5f3bf33e090b bash -c 'apt-get update -qq && apt-get install -y tini'"

# 2. Change the entrypoint so that tini wraps tower
ssh lazycat "lzc-docker exec 5f3bf33e090b sed -i 's|exec /usr/local/bin/tower|exec /usr/bin/tini -- /usr/local/bin/tower|g' /usr/local/bin/clawdbot-entrypoint.sh"

# 3. Verify the change
ssh lazycat "lzc-docker exec 5f3bf33e090b grep 'exec.*tower' /usr/local/bin/clawdbot-entrypoint.sh"
# Expected: exec /usr/bin/tini -- /usr/local/bin/tower ...

# 4. Restart the container to apply the change
ssh lazycat "lzc-docker restart 5f3bf33e090b"

# 5. Verify that tini is now PID 1
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -p 1 -o pid,ppid,cmd"
# Expected: PID 1 -> /usr/bin/tini -- /usr/local/bin/tower ...

# 6. Check the process tree
ssh lazycat "lzc-docker exec 5f3bf33e090b ps auxf | head -15"
```

**Process tree after the fix**:

```
PID 1: /usr/bin/tini (dedicated init process, reaps zombies automatically)
└─ PID 58: tower
   └─ PID 64: openclaw
      └─ PID 72: openclaw-gateway
```

**Results**:
- ✅ tini runs as PID 1 and reaps zombie processes automatically
- ✅ The zombie count dropped from 5+ to 1-2 (a healthy level)
- ✅ Tower runs stably and no longer crash-loops
- ✅ The Gateway starts normally, with no lock file conflict
- ✅ The RPC probe consistently reports ok

**Caveats**:
- ⚠️ The change lives inside the running container. It survives a restart, but **must be reapplied whenever the container is recreated from the image** (repeat steps 1-4 above)
- 💡 Consider submitting a PR to the image maintainer (LazyCat Cloud, 懒猫云) that adds tini to the Dockerfile

**Image-level permanent fix** (suggested for LazyCat Cloud):

Add the following to the Dockerfile of the OpenClaw image:

```dockerfile
# Option A: install tini from the distribution packages
RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*

# Option B (lighter alternative to A): download the release binary instead
# ADD https://github.com/krallin/tini/releases/download/v0.19.0/tini /usr/bin/tini
# RUN chmod +x /usr/bin/tini

# Then wrap tower with tini in the entrypoint script
# (the running container's entrypoint has already been changed this way)
```

### Too Many Zombie Processes

**Symptoms**:
- More than 50 zombie processes
- The Gateway responds slowly
- Container memory usage rises

**Diagnosis**:

```bash
# Show zombie process details
ssh lazycat "lzc-docker exec 5f3bf33e090b ps aux | grep 'defunct'"

# Count zombie processes (STAT column only)
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo stat= | grep -c '^Z'"

# Show the parents of the zombie processes
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo pid,ppid,stat,comm | awk '\$3 ~ /^Z/'"
```

**Solutions**:

```bash
# Option 1: restart the container (recommended)
ssh lazycat "lzc-docker restart 5f3bf33e090b"

# Option 2: kill and restart the Gateway processes manually
ssh lazycat "lzc-docker exec 5f3bf33e090b pkill -9 -f 'openclaw-gateway'"
ssh lazycat "lzc-docker exec -d 5f3bf33e090b bash -c 'nohup openclaw gateway run > /tmp/gateway.log 2>&1 &'"

# Verify the result
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo stat= | grep -c '^Z'"
```

### Gateway Unresponsive

**Symptoms**:
- `RPC probe: failed` or a timeout
- The TUI cannot connect: `gateway not connected`
- The Dashboard is unreachable

**Diagnosis**:

```bash
# Check the Gateway processes
ssh lazycat "lzc-docker exec 5f3bf33e090b ps aux | grep gateway"

# Check the listening port
ssh lazycat "lzc-docker exec 5f3bf33e090b netstat -tlnp | grep 18789"

# Show the Gateway logs (sh -c makes the glob expand inside the container, not on the host)
ssh lazycat "lzc-docker exec 5f3bf33e090b sh -c 'tail -n 100 /tmp/openclaw/openclaw-*.log'"

# Test a local connection
ssh lazycat "lzc-docker exec 5f3bf33e090b curl -I http://127.0.0.1:18789"
```

**Solution**:

```bash
# 1. Kill all Gateway processes
ssh lazycat "lzc-docker exec 5f3bf33e090b pkill -9 -f 'openclaw-gateway'"

# 2. Start a new Gateway
ssh lazycat "lzc-docker exec -d 5f3bf33e090b bash -c 'nohup openclaw gateway run > /tmp/gateway.log 2>&1 &'"

# 3. Wait for it to start
sleep 5

# 4. Verify the status
ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway status"
```

### Multiple OpenClaw TUI Instances Running (fixed 2026-02-16)

**Symptoms**:
- `pkill -9 openclaw` is needed before every OpenClaw TUI start
- Startup fails or reports a port conflict
- Several `openclaw-tui` processes run in the background
- Container resource usage is abnormally high

**Diagnosis**:

```bash
# List running OpenClaw processes
ssh lazycat "lzc-docker exec 5f3bf33e090b ps aux | grep openclaw"

# Typically several openclaw-tui instances show up:
# PID 3041 - openclaw-tui (pts/0)
# PID 6338 - openclaw-tui (pts/1)
# PID 7223 - openclaw-tui (pts/2)

# Check who is listening on the Gateway port
ssh lazycat "lzc-docker exec 5f3bf33e090b netstat -tlnp | grep 18789"
```

**Root cause**:
- Every `openclaw tui` invocation starts a new process
- The process is not fully cleaned up when the TUI exits
- Several instances running at once compete for resources

**Permanent fix (applied)**:

**1. Create an automatic cleanup script** (in the container):

```bash
# Create /usr/local/bin/openclaw-clean inside the container
ssh lazycat "lzc-docker exec 5f3bf33e090b bash -c \"cat > /usr/local/bin/openclaw-clean << 'EOF'
#!/bin/bash
# OpenClaw clean-and-start script
# Kills leftover openclaw TUI processes that Tower does not manage

echo '🧹 Cleaning up old OpenClaw processes...'
pkill -9 -f 'openclaw-tui' || true
pkill -9 -f 'openclaw tui' || true

# Wait for the processes to exit
sleep 1

# Count leftover TUI processes. pgrep never matches itself, and this script's
# own command line (openclaw-clean) does not match the pattern either.
REMAINING=\\\$(pgrep -f 'openclaw[- ]tui' | wc -l)
if [ \\\$REMAINING -gt 0 ]; then
    echo '⚠️  Warning: '\\\$REMAINING' openclaw TUI processes remain'
else
    echo '✅ Cleanup complete'
fi

# Start the OpenClaw TUI
echo ''
echo '🦞 Starting OpenClaw TUI...'
exec openclaw tui
EOF
chmod +x /usr/local/bin/openclaw-clean\""
```

**2. Update the local openclaw-tui script**:

Change the last line of `~/bin/openclaw-tui`:

```bash
# Before
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b bash -c 'openclaw tui'"

# After
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b openclaw-clean"
```

**Results**:
- ✅ Old processes are cleaned up automatically on every start
- ✅ Manual `pkill -9 openclaw` is no longer needed
- ✅ No resources are wasted on duplicate instances
- ✅ A single `openclaw-tui` command does everything

**Usage**:

```bash
# Before (manual cleanup required)
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b bash -c 'pkill -9 openclaw && openclaw tui'"

# Now (automatic cleanup)
openclaw-tui  # one command does it all
```

**Manual cleanup** (if needed):

```bash
# Kill all openclaw-tui processes
ssh lazycat "lzc-docker exec 5f3bf33e090b pkill -9 -f 'openclaw-tui'"

# Verify the result
ssh lazycat "lzc-docker exec 5f3bf33e090b ps aux | grep openclaw | grep -v tower | grep -v openclaw-gateway"
```

### Container Out of Memory

**Symptoms**:
- Container memory usage above 90%
- OOM (Out of Memory) errors
- Processes get killed

**Diagnosis**:

```bash
# Check memory usage
ssh lazycat "lzc-docker stats --no-stream 5f3bf33e090b"

# Show the memory limit
ssh lazycat "lzc-docker inspect 5f3bf33e090b --format='{{.HostConfig.Memory}}'"

# Show total system memory
ssh lazycat "free -h"
```

**Solution**:

```bash
# Adjust the memory limit (only if the current limit is too low)
# Note: the LazyCat compute box is already set to 30 GB, so this is rarely needed
# If an adjustment is really required, use:
ssh lazycat "lzc-docker update 5f3bf33e090b --memory=30g --memory-swap=32g"

# Restart the container to apply the configuration
ssh lazycat "lzc-docker restart 5f3bf33e090b"
```

### Automatic Restart Fails

**Symptoms**:
- The systemd timer does not fire
- The restart script fails
- The log shows `lzc-docker: command not found`

**Diagnosis**:

```bash
# Check the timer status
ssh lazycat "systemctl status openclaw-restart.timer"

# Check the service status
ssh lazycat "systemctl status openclaw-restart.service"

# Show the service journal
ssh lazycat "journalctl -u openclaw-restart.service -n 50"

# Show the script log
ssh lazycat "tail -50 /var/log/openclaw-restart.log"

# Test the script manually with tracing
ssh lazycat "bash -x /root/restart-openclaw.sh"
```

**Solution**:

The usual cause is that the script cannot find the `lzc-docker` command, because `/lzcsys/bin` is not on the PATH that systemd gives the service.

```bash
# Confirm the lzc-docker path
ssh lazycat "which lzc-docker"
# Output: /lzcsys/bin/lzc-docker

# Make sure the scripts use the full path
ssh lazycat "grep 'lzc-docker' /root/restart-openclaw.sh"
# Expected: /lzcsys/bin/lzc-docker

# If a script calls the bare command name, rewrite it.
# The pattern skips occurrences already preceded by "/", so it is safe to run twice.
ssh lazycat "sed -i -E 's#(^|[^/])lzc-docker#\1/lzcsys/bin/lzc-docker#g' /root/restart-openclaw.sh"
ssh lazycat "sed -i -E 's#(^|[^/])lzc-docker#\1/lzcsys/bin/lzc-docker#g' /root/monitor-openclaw-zombies.sh"

# Reload the systemd configuration
ssh lazycat "systemctl daemon-reload"

# Test run
ssh lazycat "/root/restart-openclaw.sh"
```

---

## Best Practices

### 1. Regular Health Checks

Run a full health check once a day:

```bash
#!/bin/bash
# OpenClaw health check script

echo "🔍 OpenClaw health check - $(date)"
echo "================================"

# Container status
echo -e "\n📦 Container status:"
ssh lazycat "lzc-docker ps --filter id=5f3bf33e090b --format 'Status: {{.Status}}'"

# PID 1 process
echo -e "\n🏗️  PID 1 process:"
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -p 1 -o pid,ppid,cmd"

# Zombie count (STAT column only, so the ps header is never counted)
echo -e "\n👻 Zombie processes:"
ZOMBIE_COUNT=$(ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo stat= | grep -c '^Z'")
echo "Zombie processes: ${ZOMBIE_COUNT:-unknown}"
if [ "${ZOMBIE_COUNT:-0}" -gt 10 ]; then
    echo "⚠️  Warning: many zombie processes, consider restarting the container"
fi

# Resource usage
echo -e "\n💾 Resource usage:"
ssh lazycat "lzc-docker stats --no-stream 5f3bf33e090b"

# Gateway status
echo -e "\n🔌 Gateway status:"
ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway status | grep 'RPC probe'"

# System load
echo -e "\n📊 System load:"
ssh lazycat "uptime"

echo -e "\n================================"
echo "✅ Health check complete"
```

### 2. Log Management

```bash
# Show recent errors
ssh lazycat "lzc-docker logs 5f3bf33e090b --since 1h 2>&1 | grep -i error"

# Show Tower crash entries
ssh lazycat "lzc-docker logs 5f3bf33e090b 2>&1 | grep -i 'crashed\|failed'"

# Show the OpenClaw application log (the date is expanded on the local machine)
ssh lazycat "lzc-docker exec 5f3bf33e090b tail -100 /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log"

# Delete old logs (keep the last 7 days)
ssh lazycat "lzc-docker exec 5f3bf33e090b find /tmp/openclaw -name '*.log' -mtime +7 -delete"
```

### 3. Backup and Restore

```bash
# Back up the OpenClaw configuration by streaming a tarball to the local machine.
# (An archive written to /tmp inside the container is not visible to scp on the
# host unless that directory is bind-mounted, so the archive is streamed instead.)
mkdir -p ~/backups
ssh lazycat "lzc-docker exec 5f3bf33e090b tar czf - -C /home/node/.openclaw ." > ~/backups/openclaw-config-backup-$(date +%Y%m%d).tar.gz

# Restore: stream the chosen archive back into the container
# (replace YYYYMMDD with the date of the backup to restore)
ssh lazycat "lzc-docker exec -i 5f3bf33e090b tar xzf - -C /home/node/.openclaw" < ~/backups/openclaw-config-backup-YYYYMMDD.tar.gz
ssh lazycat "lzc-docker restart 5f3bf33e090b"
```

### 4. Monitoring and Alerts

Recommended alert thresholds:

- **Zombie processes** > 50: alert and restart automatically (implemented)
- **Memory usage** > 90%: alert
- **Gateway offline** > 5 minutes: alert
- **Container restarts** > 3 per day: alert

### 5. Recovery Checklist After Container Recreation

If the container is recreated from the image, reapply the following fixes. A recreated container gets a new ID, so substitute the ID reported by `lzc-docker ps` for `5f3bf33e090b` in the commands below.

```bash
# 1. Install tini
ssh lazycat "lzc-docker exec 5f3bf33e090b bash -c 'apt-get update -qq && apt-get install -y tini'"

# 2. Change the entrypoint
ssh lazycat "lzc-docker exec 5f3bf33e090b sed -i 's|exec /usr/local/bin/tower|exec /usr/bin/tini -- /usr/local/bin/tower|g' /usr/local/bin/clawdbot-entrypoint.sh"

# 3. Restart the container
ssh lazycat "lzc-docker restart 5f3bf33e090b"

# 4. Verify
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -p 1 -o cmd | grep tini"
```

---

## Appendix

### Related Documentation

- OpenClaw official documentation: https://docs.openclaw.ai/
- Troubleshooting guide: https://docs.openclaw.ai/troubleshooting
- Tini project: https://github.com/krallin/tini

### Contact

- LazyCat Cloud support: support@lazycat.cloud
- OpenClaw community: https://community.openclaw.ai/

### Version History

- 2026-02-16:
  - Created the document and recorded the Tower crash fix (using tini)
  - Added the multiple-TUI-instance problem and the openclaw-clean solution
- 2026-02-15: Implemented zombie process monitoring and automatic restart
- 2026-02-14: Raised the container resource limits close to the system maximum

---

## Quick Reference

### Common Commands

```bash
# Connect to the OpenClaw TUI
openclaw-tui

# Show container status
ssh lazycat "lzc-docker ps | grep openclaw"

# Restart the container
ssh lazycat "lzc-docker restart 5f3bf33e090b"

# Show the zombie process count
ssh lazycat "lzc-docker exec 5f3bf33e090b ps -eo stat= | grep -c '^Z'"

# Check Gateway status
ssh lazycat "lzc-docker exec 5f3bf33e090b openclaw gateway status | grep 'RPC probe'"

# Show resource usage
ssh lazycat "lzc-docker stats --no-stream 5f3bf33e090b"

# List timers
ssh lazycat "systemctl list-timers | grep openclaw"

# Kill leftover OpenClaw TUI processes
ssh lazycat "lzc-docker exec 5f3bf33e090b pkill -9 -f 'openclaw-tui'"

# Start OpenClaw (cleans up old processes first; the TUI needs a TTY)
ssh -t lazycat "lzc-docker exec -it 5f3bf33e090b openclaw-clean"

# Full health check
ssh lazycat "echo '=== Container ===' && lzc-docker ps | grep openclaw && echo '' && echo '=== Zombie processes ===' && lzc-docker exec 5f3bf33e090b ps -eo stat= | grep '^Z' | wc -l && echo '' && echo '=== Gateway ===' && lzc-docker exec 5f3bf33e090b openclaw gateway status | grep 'RPC probe'"
```

### Troubleshooting Cheat Sheet

| Problem | Quick fix |
|------|----------|
| Tower crash loop | See the "Tower Crash Loop" section and install tini |
| Multiple TUI instances | Use `openclaw-tui` (automatic cleanup) or run `pkill -9 -f openclaw-tui` manually |
| Gateway unresponsive | `ssh lazycat "lzc-docker restart 5f3bf33e090b"` |
| Too many zombie processes | `ssh lazycat "lzc-docker restart 5f3bf33e090b"` |
| Out of memory | Check the resource limits, then restart the container |
| Automatic restart fails | Check that the scripts use the full path `/lzcsys/bin/lzc-docker` |
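
### Counting Zombie Processes Reliably

The zombie count drives the hourly monitor, the restart log, and the health checks in this guide, so it is worth getting right. Grepping the full `ps aux` output for `Z` also matches the `VSZ` column in the header row and any command line that happens to contain a capital Z, so a reported count of 1 may be nothing but the header. The following is a minimal sketch of a stricter count, assuming the container ships procps `ps` (so that `ps -eo stat=` prints one state code per line with no header). The function names `count_zombies` and `remote_zombie_count` are illustrative and not part of OpenClaw or lzc-docker.

```bash
#!/usr/bin/env bash
# Sketch: count zombie processes from the STAT column only.
# Assumption: `ps -eo stat=` is available in the container (procps).
# The function names below are hypothetical helpers for this guide.

# Reads one process state per line on stdin (as printed by `ps -eo stat=`)
# and prints how many of them are zombies (state code starting with "Z").
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Counts zombies inside a container over SSH.
# $1: SSH host alias, $2: container ID or name.
remote_zombie_count() {
    ssh "$1" "lzc-docker exec $2 ps -eo stat=" | count_zombies
}

# Demonstration on canned input; no SSH connection is needed for this part.
sample_aux='USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2500 1200 ? Ss 10:00 0:00 /usr/bin/tini -- /usr/local/bin/tower
node 72 0.0 0.0 0 0 ? Z 10:01 0:00 [openclaw-gatewa] <defunct>'
sample_stat='Ss
Z'

# The header row inflates the naive count by one.
echo "grep over ps aux: $(printf '%s\n' "$sample_aux" | grep -c 'Z')"
echo "STAT column only: $(printf '%s\n' "$sample_stat" | count_zombies)"
```

On the canned input the naive grep reports 2 while the STAT-based count reports 1. Against the real server, `remote_zombie_count lazycat 5f3bf33e090b` would print the live count, and unlike `grep -c`, the awk version exits successfully even when the count is zero, which keeps it safe inside `&&` chains and scripts that use `set -e`.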