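refactor: merge claude-marketplace and restructure the directory layout into a single repository

- Rename plugins/ → skills/; move personal plugins to skills-personal/ (gitignored)
- Update generate-marketplace.py to read the config and scan skills-personal
- Add claude-config.yaml (skill enable/disable + MCP configuration)
- Add init.sh (interactive MCP initialization, supporting stdio/SSE modes)
- Add CLAUDE.md project description
- Rewrite README.md to reflect the new structure
- Remove obsolete scripts: PUSH.sh, generate-marketplace.sh, convert-skills.sh

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>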
2026-03-14 11:11:59 +10:30
parent f7f5428812
commit 99881e268a
191 changed files with 1131 additions and 492 deletions
--- a/skills/doubao-voice-plugin/.claude-plugin/plugin.json
+++ b/skills/doubao-voice-plugin/.claude-plugin/plugin.json
@@ -0,0 +1,14 @@
+{
+  "name": "doubao-voice-plugin",
+  "description": "Doubao (豆包) Voice API integration for TTS and ASR",
+  "version": "1.0.0",
+  "author": {
+    "name": "qiudl"
+  },
+  "skills": [
+    {
+      "name": "doubao-voice",
+      "path": "./skills/SKILL.md"
+    }
+  ]
+}
--- a/skills/doubao-voice-plugin/.gitignore
+++ b/skills/doubao-voice-plugin/.gitignore
@@ -0,0 +1,54 @@
+# Audio files (generated test output)
+*.mp3
+*.wav
+*.pcm
+
+# Test scripts (local use only)
+scripts/test_*.py
+scripts/check_credentials.py
+scripts/README_TEST.md
+
+# System files
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# Environment configuration (local files containing credentials)
+setup_env.local.sh
+.env
+.env.local
+
+# Files generated by tests
+*.log
+test_output/
--- a/skills/doubao-voice-plugin/DEPLOY.md
+++ b/skills/doubao-voice-plugin/DEPLOY.md
@@ -0,0 +1,201 @@
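+# Deployment Guide
+
+## Using this Skill on another computer
+
+### ✅ Can it be used as-is?
+
+**Most features work out of the box**, but a few simple configuration steps are required.
+
+---
+
+## 📋 Deployment steps
+
+### 1️⃣ Copy the plugin to the new computer
+
+```bash
+# Option 1: clone from Git
+git clone <repo-url> doubao-voice-plugin
+
+# Option 2: copy the folder
+cp -r doubao-voice-plugin /path/to/new/location
+```
+
+### 2️⃣ Install dependencies
+
+**Core dependency** (required):
+```bash
+pip3 install requests
+```
+
+**Optional dependencies** (`volcengine` only for voice_converter_sdk.py; `websockets` only for singing.py):
+```bash
+pip3 install volcengine websockets
+```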
+
+**Verify the installation**:
+```bash
+python3 -c "import requests; print('✅ requests is installed')"
+```
+
+### 3️⃣ Configure credentials
+
+Create a local configuration file:
+```bash
+# Run from the plugin root so that the paths in step 4 resolve
+cp scripts/setup_env.local.sh.example scripts/setup_env.local.sh
+```
+
+Edit `setup_env.local.sh` and fill in your Volcengine credentials:
+```bash
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+```
+
+### 4️⃣ Usage
+
+```bash
+# Load the environment variables
+source scripts/setup_env.local.sh
+
+# Text to speech
+python3 scripts/voice_converter.py tts "你好世界" -o hello.mp3
+
+# Speech to text (the ASR service must be enabled first)
+python3 scripts/voice_converter.py asr audio.mp3
+```
+
+---
+
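+## 🔧 System requirements
+
+| Requirement | Version | Status |
+|------|------|------|
+| **Python** | 3.6+ (3.7+ for `singing.py`, which calls `asyncio.run`) | ✅ Required |
+| **requests** | any version | ✅ Required |
+| **volcengine** | any version | ⚠️ Optional |
+| **Operating system** | Linux/Mac/Windows | ✅ All supported |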
+
+---
+
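+## 🚨 FAQ
+
+### Q: Error "ModuleNotFoundError: No module named 'requests'"
+**Fix**:
+```bash
+pip3 install requests
+```
+
+### Q: Error "DOUBAO_APP_ID not found"
+**Fix**:
+```bash
+# Check the environment variable
+echo $DOUBAO_APP_ID
+
+# If it is empty, reload the configuration (from the plugin root)
+source scripts/setup_env.local.sh
+```
+
+### Q: Why doesn't ASR work?
+**Cause**: the ASR service has to be enabled in the Volcengine console
+**Fix**: visit https://console.volcengine.com/speech/service and enable the speech recognition service
+
+### Q: Can it be used on Windows?
+**Yes**, but environment variables are set differently: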
+
+```batch
+REM Windows CMD
+set DOUBAO_APP_ID=your_app_id
+set DOUBAO_ACCESS_TOKEN=your_access_token
+python scripts\voice_converter.py tts "你好" -o hello.mp3
+```
+
+Or in PowerShell:
+```powershell
+$env:DOUBAO_APP_ID="your_app_id"
+$env:DOUBAO_ACCESS_TOKEN="your_access_token"
+python scripts/voice_converter.py tts "你好" -o hello.mp3
+```
+
+### Q: How do I use it in Docker?
+**Example Dockerfile**:
+```dockerfile
+FROM python:3.9-slim
+
+WORKDIR /app
+COPY . .
+
+RUN pip install requests
+
+# Credentials are supplied at run time via `docker run -e ...`.
+# An ENV line here would expand to an empty value at build time, since no ARG is declared.
+
+ENTRYPOINT ["python", "scripts/voice_converter.py"]
+```
+
+Run:
+```bash
+docker build -t doubao-voice .
+docker run -e DOUBAO_APP_ID=xxx -e DOUBAO_ACCESS_TOKEN=xxx doubao-voice tts "你好"
+```
+
+---
+
+## 📦 Three ways to use it
+
+### Option 1: Command line (recommended for simple use)
+```bash
+python3 scripts/voice_converter.py tts "text" -o output.mp3
+```
+
+### Option 2: Python module import
+```python
+import sys
+sys.path.insert(0, 'scripts')
+from voice_converter import DoubaoVoiceConverter
+
+converter = DoubaoVoiceConverter()
+converter.text_to_speech("你好世界", output_file="hello.mp3")
+```
+
+### Option 3: Claude Code Skill (automatic)
+When installed in Claude Code's plugins directory, it is recognized as a Skill automatically:
+```bash
+# User says: "把这段话转成语音：你好世界" ("Convert this sentence to speech: hello world")
+# → the TTS API is called automatically
+```
+
+---
+
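+## 🔐 Security notes
+
+✅ **Recommended**:
+- Store credentials in the `.local` file (kept out of Git)
+- Use environment variables instead of hard-coding
+- Rotate the Access Token regularly
+
+❌ **Avoid**:
+- Do not commit credentials to Git
+- Do not hard-code credentials in scripts
+- Do not share configuration files that contain credentials
+
+---
+
+## 📝 Minimal deployment checklist
+
+```text
+✅ Copy the folder
+✅ pip install requests
+✅ Copy and edit scripts/setup_env.local.sh
+✅ source scripts/setup_env.local.sh
+✅ python3 scripts/voice_converter.py tts "测试"
+✅ Done!
+```
+
+---
+
+## 🆘 If you need help
+
+1. Check README.md (user documentation)
+2. See skills/SKILL.md (API documentation)
+3. See STATUS.md (development status)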
+
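The `DOUBAO_APP_ID not found` troubleshooting step in DEPLOY.md comes down to checking that both variables are set and non-empty. A minimal sketch of that check in Python follows; the helper name `missing_credentials` is hypothetical and is not part of the plugin's scripts.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from DEPLOY.md
REQUIRED_VARS = ("DOUBAO_APP_ID", "DOUBAO_ACCESS_TOKEN")


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("Both credentials are set")
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise without touching the real environment.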
--- a/skills/doubao-voice-plugin/GIT_GUIDE.md
+++ b/skills/doubao-voice-plugin/GIT_GUIDE.md
@@ -0,0 +1,196 @@
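+# Git Commit Guide
+
+## 📋 Commit checklist
+
+### ✅ Files that should be committed
+
+```bash
+git add .
+git status  # confirm the files below are staged
+
+# Should include:
+# - .claude-plugin/plugin.json          # plugin configuration
+# - skills/SKILL.md                     # skill documentation
+# - scripts/voice_converter.py          # core tool
+# - scripts/voice_converter_v2.py       # alternative implementation
+# - scripts/voice_converter_sdk.py      # alternative implementation
+# - scripts/check_credentials.py        # diagnostic tool (matched by .gitignore; needs `git add -f`)
+# - scripts/test_services.py            # service test (matched by .gitignore; needs `git add -f`)
+# - scripts/test_v3_debug.py            # V3 debugging tool (matched by .gitignore; needs `git add -f`)
+# - scripts/setup_env.sh                # example script (placeholder version)
+# - scripts/setup_env.local.sh.example  # local configuration template
+# - README.md                           # user documentation
+# - STATUS.md                           # development status
+# - .gitignore                          # Git ignore rules
+# - GIT_GUIDE.md                        # this file
+```
+
+### ❌ Files ignored automatically (do not commit them manually)
+
+```bash
+# .gitignore is configured so that these files are not committed:
+# - *.mp3, *.wav, *.pcm                 # audio files
+# - .DS_Store                           # system files
+# - setup_env.local.sh                  # local credentials file
+# - .env, .env.local                    # environment variable files
+# - __pycache__/                        # Python cache
+# - .vscode/, .idea/                    # IDE configuration
+```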
+
+---
+
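+## 🔐 Credential management (important!)
+
+### Local workflow
+
+```bash
+# 1. Create a local configuration file from the template
+cd scripts
+cp setup_env.local.sh.example setup_env.local.sh
+
+# 2. Edit the local file and fill in your real credentials
+nano setup_env.local.sh  # or any editor you prefer
+
+# 3. For local use, source the local file
+source setup_env.local.sh
+
+# 4. Verify (setup_env.local.sh is listed in .gitignore)
+git status  # setup_env.local.sh should not appear
+```
+
+### Key security points
+
+✅ **Do**:
+- Keep credentials in the local `.local` file
+- Load credentials from environment variables (never hard-code them)
+- Keep only the placeholders `your_app_id` and `your_access_token` in public files
+- Check `git status` regularly to make sure no credentials are exposed
+
+❌ **Do not**:
+- Commit real credentials to Git
+- Hard-code credentials in Python files
+- Modify .gitignore so that sensitive files become tracked
+- Share shell scripts that contain credentials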
+
+---
+
+## 📝 Commit steps
+
+```bash
+# 1. Make sure you have created the local configuration file
+cd skills/doubao-voice-plugin/scripts  # relative to the repository root
+cp setup_env.local.sh.example setup_env.local.sh
+# Edit setup_env.local.sh and fill in your credentials
+
+# 2. Check the status
+cd ..
+git status
+
+# 3. Stage all files that should be committed
+git add .
+
+# 4. Check for leaked credentials
+git diff --cached | grep -i "DOUBAO_APP_ID\|DOUBAO_ACCESS_TOKEN\|AKLT\|VOLCENGINE"
+# Variable names also match placeholder lines; inspect each hit and unstage anything with a real value
+
+# 5. Commit
+git commit -m "feat: Add Doubao Voice plugin with TTS/ASR support"
+
+# 6. Check again
+git show HEAD  # confirm the committed content
+
+# 7. Push
+git push origin main
+```
+
+---
+
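+## 🔍 Verification checklist
+
+Before committing, run the following commands to confirm nothing sensitive is staged:
+
+```bash
+# Check whether your real credential values (from the loaded environment) appear in staged files
+git diff --cached | grep -F -e "$DOUBAO_APP_ID" -e "$DOUBAO_ACCESS_TOKEN"
+# Expect no output (source setup_env.local.sh first so the variables are non-empty)
+
+# Check that setup_env.local.sh is ignored
+git status | grep setup_env.local.sh
+# The file should not appear
+
+# Check that .gitignore is configured correctly
+grep "setup_env.local" .gitignore
+# This line should be printed
+
+# List the files staged for the next commit
+git diff --cached --name-only
+# Confirm the key files are listed and setup_env.local.sh is not
+```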
+
+---
+
+## Instructions for other users
+
+After you publish the plugin, other users should:
+
+```bash
+# 1. Clone the plugin
+git clone <repo-url> doubao-voice-plugin
+cd doubao-voice-plugin/scripts
+
+# 2. Create the local configuration
+cp setup_env.local.sh.example setup_env.local.sh
+
+# 3. Edit the configuration and fill in their own credentials
+vim setup_env.local.sh
+
+# 4. Load the environment variables
+source setup_env.local.sh
+
+# 5. Test
+python3 voice_converter.py tts "测试"
+
+# 6. setup_env.local.sh is not tracked by version control
+git status  # setup_env.local.sh does not appear ✅
+```
+
+---
+
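+## FAQ
+
+**Q: What if I accidentally committed credentials?**
+
+A: Act immediately:
+```bash
+# Untrack the file and rewrite the latest commit (enough only if it has not been pushed)
+git rm --cached scripts/setup_env.local.sh
+git commit --amend -m "Remove sensitive file"
+
+# Rotate your Volcengine Access Token in any case:
+# generate a new token in the console
+```
+
+**Q: Why is setup_env.local.sh.example needed?**
+
+A: It shows other users which environment variables the configuration file should contain, without exposing any real credentials.
+
+**Q: Can I put the credentials in ~/.bashrc?**
+
+A: Yes, but setup_env.local.sh is more flexible and keeps the configuration project-specific.
+
+**Q: How do I use sensitive credentials in CI/CD?**
+
+A: Use the Secrets/Variables feature of the CI/CD platform (GitHub Actions, GitLab CI, etc.); do not hard-code them.
+
+---
+
+## Summary
+
+✅ **Security measures in place**:
+1. ✓ .gitignore ignores the sensitive files
+2. ✓ setup_env.sh contains placeholders only
+3. ✓ A setup_env.local.sh.example template is provided
+4. ✓ All code files read credentials from environment variables
+5. ✓ Clear local configuration instructions are provided
+
+The plugin can now be committed to Git safely! 🎉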
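The leak check in GIT_GUIDE.md greps the staged diff for variable names, which also matches harmless placeholder lines. The sketch below narrows that idea: it looks only at added lines and flags an assignment when its value is neither a `your_...` placeholder nor a shell variable reference. The function name, the regular expression, and the placeholder convention are assumptions made for illustration, not part of the plugin.

```python
import re
from typing import List

# Assumption: credentials are assigned as NAME=value or NAME="value"
ASSIGNMENT = re.compile(
    r'(DOUBAO_APP_ID|DOUBAO_ACCESS_TOKEN)\s*=\s*["\']?([^"\'\s]*)'
)
PLACEHOLDER_PREFIX = "your_"  # matches your_app_id, your_access_token, ...


def suspicious_lines(diff_text: str) -> List[str]:
    """Return added diff lines that assign a non-placeholder credential value."""
    hits = []
    for line in diff_text.splitlines():
        # Only added lines matter; skip the "+++ b/path" file header
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for match in ASSIGNMENT.finditer(line):
            value = match.group(2)
            if value and not value.startswith((PLACEHOLDER_PREFIX, "$")):
                hits.append(line)
                break
    return hits
```

The input would be the text produced by `git diff --cached`; an empty result means no added line assigns a literal value to either variable, which is a narrower guarantee than a full secret scan.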
--- a/skills/doubao-voice-plugin/README.md
+++ b/skills/doubao-voice-plugin/README.md
@@ -0,0 +1,182 @@
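+# Doubao Voice Plugin (豆包语音插件)
+
+An integration plugin for the Volcengine Doubao voice API, supporting text to speech (TTS) and singing.
+
+## Features
+
+- **✅ Speech synthesis (TTS)**: text to speech with multiple voices - **tested and working**
+- **🎵 Singing**: ask Doubao to sing, with realtime voice interaction - **end-to-end model enabled**
+- **Easy to use**: a command-line tool that works with a single command
+- **Multiple voices**: basic female and male voices, among others
+- **Realtime interaction**: realtime conversation and singing with Doubao
+
+## Quick start
+
+### 1. Get API credentials
+
+Visit the [Volcengine console](https://console.volcengine.com/speech/app), create an application, and obtain:
+- **App ID** (numeric)
+- **Access Token** (long string)
+
+Enable the required service:
+1. Tick the **"语音合成" (speech synthesis)** service (TTS) in the console
+
+### 2. Configure environment variables
+
+**Option 1: configuration script (recommended)**
+```bash
+cd scripts
+source setup_env.local.sh  # your copy of setup_env.local.sh.example; setup_env.sh holds placeholders only
+```
+
+**Option 2: set them manually**
+```bash
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+```
+
+### 3. Install dependencies
+
+```bash
+pip3 install requests websockets --break-system-packages  # websockets is used only by singing.py
+```
+
+### 4. Check credentials
+
+```bash
+# Check the credential configuration (this script is matched by the plugin's .gitignore, so a fresh clone may lack it)
+python3 scripts/check_credentials.py
+```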
+
+### 5. Usage examples
+
+#### TTS text to speech (command line)
+
+```bash
+cd scripts
+
+# Basic usage - ✅ tested and working
+python3 voice_converter.py tts "你好，我是豆包语音助手" -o output.mp3
+
+# Use a different voice
+python3 voice_converter.py tts "测试男声" -o male.mp3 -v BV701_V2_streaming
+```
+
+#### Singing (command line) 🎵
+
+```bash
+cd scripts
+
+# Ask Doubao to sing (singing.py writes raw PCM, so this is saved as spring.pcm)
+python3 singing.py sing "请唱一首关于春天的歌" -o spring.mp3
+
+# Interactive mode is not implemented: singing.py in this commit defines only the `sing` subcommand
+# python3 singing.py interactive
+```
+
+#### From Python code
+
+```python
+# TTS - text to speech
+from scripts.voice_converter import DoubaoVoiceConverter
+
+converter = DoubaoVoiceConverter()
+audio_file = converter.text_to_speech("你好，欢迎使用豆包", output_file="hello.mp3")
+
+# Singing
+import asyncio
+from scripts.singing import DoubaoSinging
+
+async def sing():
+    singing = DoubaoSinging()
+    audio_file = await singing.sing("请唱一首情歌", output_file="love_song.mp3")
+
+asyncio.run(sing())
+```
+
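+## Natural-language invocation
+
+In Claude Code the plugin can be invoked in natural language:
+
+**TTS text to speech**:
+- "Convert this sentence to speech: hello world"
+- "Synthesize speech with a gentle female voice"
+- "Read this text aloud with a male voice"
+
+**Singing**:
+- "Please sing a song about spring"
+- "Sing a gentle lullaby"
+- "Start realtime voice conversation mode with Doubao"
+
+Example:
+```
+User: "Convert 'Welcome to Doubao Voice' to speech for me"
+Claude: calls the TTS service and generates output.mp3
+```
+
+## Pricing
+
+### TTS (speech synthesis)
+- Large-model concurrency plan: 2000 CNY per concurrent stream per month
+- Pay-as-you-go: billed by character count
+
+### Free trial
+New users receive a free quota after enabling the service.
+
+## Supported voices
+
+| Voice code | Description | Use case | Status |
+|---------|------|------|------|
+| BV700_V2_streaming | General female voice | General purpose | ✅ Available in V1 |
+| BV701_V2_streaming | General male voice | General purpose | ✅ Available in V1 |
+| BV406_streaming | Gentle female voice | Customer service, assistants | ✅ Available in V1 |
+| BV158_streaming | Lively female voice | Education, entertainment | ✅ Available in V1 |
+| BV115_streaming | Magnetic male voice | News, broadcasting | ✅ Available in V1 |
+
+**Note**: the Doubao 2.0 premium voices require the V3 API, which is still being debugged.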
+
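+## FAQ
+
+### TTS returns "requested resource not granted"
+**Fix**: tick the "语音合成" (speech synthesis) service option in the console
+
+### Authorization header format error
+Make sure to use the `Bearer;{token}` format (note the semicolon), not `Bearer {token}`
+
+### Environment variables not taking effect
+```bash
+# Check the environment variables
+echo $DOUBAO_APP_ID
+echo $DOUBAO_ACCESS_TOKEN
+
+# If they are empty, load your local configuration again
+source setup_env.local.sh
+```
+
+## API versions
+
+### V1 API (currently used) ✅
+- **Status**: tested and stable
+- **Authentication**: Bearer Token
+- **Voices**: basic voices
+- **Recommendation**: recommended for everyday use
+
+### V3 API (Doubao 2.0) ⚠️
+- **Status**: being debugged; fails with "get resource id empty"
+- **Authentication**: Bearer Token + Resource-Id
+- **Voices**: Doubao 2.0 premium voices
+- **Note**: contact Volcengine technical support for the correct configuration
+
+## Technical support
+
+- [Official documentation](https://www.volcengine.com/docs/6561/1359369)
+- [Console](https://console.volcengine.com/speech/app)
+- [Billing](https://www.volcengine.com/docs/6561/1359370)
+
+## License
+
+This plugin is released under the MIT License.
+
+## Author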
+
+qiudl @ zhiyuncai.com
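The README's note on the Authorization header is easy to get wrong, because `Bearer;{token}` uses a semicolon where most APIs use a space. A minimal sketch of building that header follows; the helper name `build_auth_header` is hypothetical, and only the header format itself comes from the README.

```python
from typing import Dict


def build_auth_header(access_token: str) -> Dict[str, str]:
    """Build the Authorization header in the `Bearer;{token}` form the README describes."""
    if not access_token:
        raise ValueError("access token is empty")
    # Note the semicolon: the README states that `Bearer {token}` is rejected
    return {"Authorization": f"Bearer;{access_token}"}
```

Keeping the format in one helper means a stray space cannot creep into individual call sites.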
--- a/skills/doubao-voice-plugin/STATUS.md
+++ b/skills/doubao-voice-plugin/STATUS.md
@@ -0,0 +1,200 @@
+# Doubao Voice Plugin - Development Status
+
+**Updated**: 2026-02-07
+**Version**: 1.0.0
+
+---
+
+## ✅ Completed features
+
+### 1. TTS (text to speech) - fully working ✅
+
+**Test status**: passed
+**API version**: V1
+**Available voices**:
+- BV700_V2_streaming (general female voice)
+- BV701_V2_streaming (general male voice)
+- BV406_streaming (gentle female voice)
+- BV158_streaming (lively female voice)
+- BV115_streaming (magnetic male voice)
+
+**Test command**:
+```bash
+source scripts/setup_env.local.sh
+python3 scripts/voice_converter.py tts "你好世界" -o hello.mp3
+```
+
+**Test results**:
+- ✅ HTTP 200 OK
+- ✅ Code 3000 Success
+- ✅ MP3 file generated successfully
+- ✅ Audio quality is normal
+
+---
+
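+## ⚠️ Pending features
+
+### 2. ASR (speech to text) - service not yet enabled
+
+**Problem**: Code 1001 - "requested resource not granted"
+
+**Cause**: the ASR service is not correctly enabled in the Volcengine console
+
+**Steps to resolve**:
+1. Visit: https://console.volcengine.com/speech/service
+2. Find the "语音识别 (ASR)" (speech recognition) service
+3. Make sure the service is enabled and the required options are ticked
+4. Wait for the service to take effect (this may take a few minutes)
+5. Test again
+
+**Test command** (after the service is enabled):
+```bash
+python3 scripts/voice_converter.py asr audio.mp3
+```
+
+---
+
+### 3. V3 API / Doubao 2.0 voices - being debugged
+
+**Problem**: Code 45000000 - "get resource id empty"
+
+**Approaches already tried**:
+- [x] Resource-Id header
+- [x] X-Resource-Id header
+- [x] resource_id query parameter
+- [x] resource_id in app config
+- [x] Several resource_id values: volc.bigmodel.tts, volc.seed-tts.default, volc.tts.default
+
+**Current status**: every approach returns the same error
+
+**Possible causes**:
+1. The V3 API may require a different authentication scheme (IAM signature)
+2. A special service instance configuration may be required
+3. The Resource-Id may be obtained or configured incorrectly
+
+**Suggestions**:
+- Contact Volcengine technical support for the correct V3 API configuration
+- Or keep using the V1 API (it already covers the basic needs)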
+
+---
+
+## 📁 Project file structure
+
+```
+skills/doubao-voice-plugin/
+├── .claude-plugin/
+│   └── plugin.json                    # plugin metadata
+├── skills/
+│   └── SKILL.md                       # skill definition and documentation
+├── scripts/
+│   ├── voice_converter.py             # main conversion tool (V1 API, working)
+│   ├── voice_converter_v2.py          # manual-signature version (untested)
+│   ├── voice_converter_sdk.py         # SDK version (untested)
+│   ├── check_credentials.py           # credential check tool
+│   ├── test_services.py               # service status test
+│   ├── test_v3_debug.py               # V3 API debugging script
+│   ├── setup_env.sh                   # environment variable setup script
+│   └── README_TEST.md                 # test report
+├── README.md                          # user documentation
+└── STATUS.md                          # this file (development status)
+```
+
+---
+
+## 🔧 Diagnostic tools
+
+### Check credential configuration
+```bash
+python3 scripts/check_credentials.py
+```
+Shows the current state of the environment variable configuration
+
+### Test service status
+```bash
+python3 scripts/test_services.py
+```
+Tests whether the TTS and ASR services are available
+
+### V3 API debugging
+```bash
+python3 scripts/test_v3_debug.py
+```
+Tests several V3 API configuration variants
+
+---
+
+## 📊 Current credential configuration
+
+```bash
+DOUBAO_APP_ID="your_app_id"
+DOUBAO_ACCESS_TOKEN="your_access_token"
+
+# Optional V3 configuration (not usable yet)
+# DOUBAO_USE_V3="true"
+# DOUBAO_RESOURCE_ID="volc.bigmodel.tts"
+```
+
+**Access Key information** (for signature authentication, not used yet):
+- Access Key ID: your_access_key_id
+- Secret Access Key: your_secret_access_key
+
+---
+
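+## 🎯 Next steps
+
+### Ready now
+1. ✅ **Use the TTS feature**
+   - Integrate it into applications
+   - Test different voices
+   - Deploy to production
+
+### Short term (1-3 days)
+2. ⚠️ **Enable the ASR service**
+   - Enable the service in the console
+   - Test speech recognition
+   - Improve error handling
+
+### Long term (optional)
+3. 🔄 **V3 API support**
+   - Contact Volcengine technical support
+   - Obtain the correct Resource-Id configuration
+   - Support the Doubao 2.0 premium voices
+
+---
+
+## 📞 Technical support
+
+### Volcengine
+- Documentation: https://www.volcengine.com/docs/6561/1329505
+- Console: https://console.volcengine.com/speech/app
+- Service management: https://console.volcengine.com/speech/service
+
+### Common problems
+1. **TTS works but ASR does not**
+   - Check whether the ASR service is enabled in the console
+   - Confirm the "语音识别" (speech recognition) option is ticked
+
+2. **V3 API keeps failing**
+   - Use the V1 API for now
+   - Contact Volcengine technical support
+
+3. **Authentication fails**
+   - Check that the environment variables are set correctly
+   - Confirm the Access Token format is correct
+   - Note that the Authorization header uses `Bearer;{token}` (with a semicolon)
+
+---
+
+## ✨ Summary
+
+**Available now**: the TTS (text to speech) feature is fully working and ready for use
+
+**Open items**:
+1. Enable the ASR service in the console
+2. (Optional) Resolve the V3 API configuration issue
+
+**Recommendation**: start with the V1 API's TTS feature, which covers basic speech synthesis. ASR becomes usable once the service is enabled in the console. The Doubao 2.0 voices of the V3 API are optional and can be addressed later.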
+
+---
+
+*Generated by Claude Code on 2026-02-07*
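STATUS.md mentions three numeric response codes: 3000 for success, 1001 for a service that is not enabled, and 45000000 for the V3 resource-id failure. A small lookup like the sketch below can turn them into readable hints. The table contains only the codes quoted in STATUS.md, and the function name `describe_code` is hypothetical.

```python
# Codes and meanings as reported in STATUS.md; any other code is unknown here
KNOWN_CODES = {
    3000: "success",
    1001: "requested resource not granted (enable the service in the console)",
    45000000: "get resource id empty (V3 API configuration issue)",
}


def describe_code(code: int) -> str:
    """Map a response code to the explanation recorded in STATUS.md."""
    return KNOWN_CODES.get(code, f"unknown code {code}")
```

For example, `describe_code(1001)` points directly at the console step that STATUS.md lists for ASR.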
--- a/skills/doubao-voice-plugin/scripts/README.md
+++ b/skills/doubao-voice-plugin/scripts/README.md
@@ -0,0 +1,186 @@
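+# Doubao Voice Tool Usage Guide
+
+An easy-to-use command-line tool for Doubao voice, supporting **text to speech (TTS)** and **singing**.
+
+## Quick start
+
+### 1. Configure environment variables
+
+```bash
+# Add to ~/.zshrc or ~/.bashrc
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+
+# Apply the configuration
+source ~/.zshrc
+```
+
+### 2. Install dependencies
+
+```bash
+pip install requests websockets  # websockets is used only by singing.py
+```
+
+## Usage
+
+### 📝 Text to speech (TTS)
+
+**Basic usage:**
+```bash
+python voice_converter.py tts "你好，我是豆包语音助手"
+```
+
+**Specify the output file and voice:**
+```bash
+python voice_converter.py tts "欢迎使用豆包语音" -o welcome.mp3 -v BV701_V2_streaming
+```
+
+**Available voices:**
+- `BV700_V2_streaming` - general female voice (default, recommended)
+- `BV701_V2_streaming` - general male voice
+- `BV406_streaming` - gentle female voice
+- `BV158_streaming` - lively female voice
+- `BV115_streaming` - magnetic male voice
+
+### 🎵 Singing
+
+**Basic usage:**
+```bash
+python singing.py sing "请唱一首关于春天的歌"
+```
+
+**Specify the output file:**
+```bash
+python singing.py sing "唱一个温柔的摇篮曲" -o lullaby.mp3  # saved as lullaby.pcm (raw PCM)
+```
+
+**Interactive mode (not implemented in this commit's `singing.py`, which defines only the `sing` subcommand):**
+```bash
+# python singing.py interactive
+```
+
+The intended interactive mode would let you talk to Doubao naturally and ask it to sing, tell stories, and so on; typing `quit` would exit.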
+
+## Calling from Python
+
+```python
+# TTS - text to speech
+from voice_converter import DoubaoVoiceConverter
+
+converter = DoubaoVoiceConverter()
+audio_file = converter.text_to_speech(
+    "你好，欢迎使用豆包语音",
+    output_file="hello.mp3",
+    voice_type="BV700_V2_streaming"
+)
+print(f"Generated speech: {audio_file}")
+
+# Singing
+import asyncio
+from singing import DoubaoSinging
+
+async def main():
+    singing = DoubaoSinging()
+
+    # Ask Doubao to sing (the audio is written as raw PCM to love_song.pcm)
+    audio_file = await singing.sing(
+        "请唱一首情歌",
+        output_file="love_song.mp3",
+        language="zh-CN"
+    )
+    print(f"Singing finished: {audio_file}")
+
+    # DoubaoSinging in this commit defines no interactive_singing() method,
+    # so only sing() is shown here.
+
+asyncio.run(main())
+```
+
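+## Complete examples
+
+### Example 1: generate notification audio
+
+```bash
+# Female-voice notification
+python voice_converter.py tts "您有一条新消息，请注意查收" -o notification.mp3
+
+# Male-voice notification
+python voice_converter.py tts "系统将在5分钟后进行维护" -o maintenance.mp3 -v BV701_V2_streaming
+```
+
+### Example 2: singing
+
+```bash
+# Ask Doubao to sing a love song
+python singing.py sing "请唱一首温柔的情歌" -o love_song.mp3
+
+# Ask Doubao to sing a children's song
+python singing.py sing "唱一首欢快的儿歌" -o kids_song.mp3
+
+# Interactive mode is not implemented in this commit's singing.py
+# python singing.py interactive
+```
+
+
+## Error handling
+
+### Common errors
+
+**1. Environment variables not set**
+```
+❌ Error: Set these environment variables first:
+export DOUBAO_APP_ID='your_app_id'
+export DOUBAO_ACCESS_TOKEN='your_access_token'
+```
+**Fix:** make sure the environment variables are set correctly, then run `source ~/.zshrc`
+
+**2. API call failed**
+```
+❌ 错误: TTS 失败 (code: 4001): Invalid token
+```
+**Fix:** check whether the Access Token is correct or has expired
+
+## Technical parameters
+
+### Audio format
+
+**TTS output:**
+- Format: MP3
+- Sample rate: 16000 Hz
+- Channels: mono
+
+### API limits
+
+- **TTS**: at most 5000 characters per request
+- **Concurrency**: depends on the purchased concurrency quota
+
+## Using it in Claude Code
+
+In Claude Code the tool can be invoked directly in natural language:
+
+**TTS - text to speech**:
+```
+"Convert this sentence to speech: hello world"
+"Synthesize with a gentle female voice: welcome"
+```
+
+**Singing**:
+```
+"Please sing a song about spring"
+"Sing a gentle lullaby"
+"Start realtime voice conversation mode with Doubao"
+```
+
+## Getting API credentials
+
+1. Visit the [Volcengine console](https://console.volcengine.com/speech/app)
+2. Create an application
+3. Obtain the App ID and Access Token
+4. Enable the required service:
+   - 豆包语音合成模型2.0 (Doubao speech synthesis model 2.0)
+
+## Reference links
+
+- [Volcengine Doubao voice documentation](https://www.volcengine.com/docs/6561)
+- [API reference](https://www.volcengine.com/docs/6561/1096680)
+- [Billing](https://www.volcengine.com/docs/6561/1359370)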
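The 5000-character limit per TTS request means longer texts have to be split before they are sent. The sketch below shows one way to do that, preferring to cut after sentence-ending punctuation. It assumes the limit counts Python string characters (the README does not say whether characters or bytes are counted), and `split_text` is a hypothetical helper, not a function from `voice_converter.py`.

```python
from typing import List

MAX_CHARS = 5000  # per-request TTS limit stated in scripts/README.md
SENTENCE_ENDS = "。！？.!?\n"


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        window = remaining[:limit]
        # Cut just after the last sentence end inside the window, if any
        cut = max(window.rfind(mark) for mark in SENTENCE_ENDS) + 1
        if cut <= 0:
            cut = limit  # no sentence end found: fall back to a hard cut
        chunks.append(remaining[:cut])
        remaining = remaining[cut:]
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk could then be passed to `text_to_speech` in turn; joining the resulting audio files is a separate step that this sketch does not cover.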
--- a/skills/doubao-voice-plugin/scripts/setup_env.local.sh.example
+++ b/skills/doubao-voice-plugin/scripts/setup_env.local.sh.example
@@ -0,0 +1,21 @@
+#!/bin/bash
+# Doubao voice API environment variables (local version)
+#
+# Usage:
+# 1. Copy this file: cp setup_env.local.sh.example setup_env.local.sh
+# 2. Edit setup_env.local.sh and fill in your real credentials
+# 3. Run: source setup_env.local.sh
+# 4. .gitignore ignores setup_env.local.sh, so your credentials are not committed to Git
+
+# ⚠️ Important: fill in your real credentials below (local use only)
+export DOUBAO_APP_ID="your_app_id_here"
+export DOUBAO_ACCESS_TOKEN="your_access_token_here"
+
+# V3 API configuration (optional, needed for the Doubao 2.0 voices)
+# export DOUBAO_USE_V3="true"
+# export DOUBAO_RESOURCE_ID="volc.bigmodel.tts"
+
+echo "✅ Doubao voice API environment variables set (local configuration)"
+echo ""
+echo "App ID: ${DOUBAO_APP_ID:0:10}..."
+echo "Access Token: ${DOUBAO_ACCESS_TOKEN:0:20}..."
--- a/skills/doubao-voice-plugin/scripts/setup_env.sh
+++ b/skills/doubao-voice-plugin/scripts/setup_env.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+# Doubao voice API environment variables (example)
+#
+# ⚠️ Important: this is an example script containing placeholders.
+# For local use, create setup_env.local.sh from setup_env.local.sh.example
+# and fill in your real credentials there. .gitignore ignores setup_env.local.sh.
+
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+
+# V3 API configuration (optional, needed for the Doubao 2.0 voices)
+# export DOUBAO_USE_V3="true"
+# export DOUBAO_RESOURCE_ID="volc.bigmodel.tts"
+
+echo "✅ Doubao voice API environment variables set"
+echo ""
+echo "App ID: $DOUBAO_APP_ID"
+echo "Access Token: ${DOUBAO_ACCESS_TOKEN:0:20}..."
+echo ""
+echo "You can now run:"
+echo "  python3 voice_converter.py tts \"你好世界\" -o hello.mp3"
+echo "  python3 voice_converter.py asr audio.mp3  # the ASR service must be enabled first"
--- a/skills/doubao-voice-plugin/scripts/singing.py
+++ b/skills/doubao-voice-plugin/scripts/singing.py
@@ -0,0 +1,327 @@
+#!/usr/bin/env python3
+"""
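+Doubao singing tool
+Built on the Doubao end-to-end realtime speech model; asks Doubao to sing
+Uses a WebSocket for the realtime dialogue and audio generation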
+"""
+
+import os
+import sys
+import json
+import asyncio
+import websockets
+import struct
+import uuid
+from typing import Optional
+
+
+# Connection-level events (no session_id required)
+CONNECTION_EVENTS = {1, 2, 50, 51, 52}
+
+
+class DoubaoSinging:
+    """Doubao singing helper class"""
+
+    def __init__(self):
+        # Read configuration from environment variables
+        self.app_id = os.environ.get("DOUBAO_APP_ID")
+        self.access_token = os.environ.get("DOUBAO_ACCESS_TOKEN")
+
+        if not self.app_id or not self.access_token:
+            raise ValueError(
+                "Set these environment variables first:\n"
+                "export DOUBAO_APP_ID='your_app_id'\n"
+                "export DOUBAO_ACCESS_TOKEN='your_access_token'"
+            )
+
+        # WebSocket endpoint of the end-to-end realtime speech service
+        self.ws_url = "wss://openspeech.bytedance.com/api/v3/realtime/dialogue"
+        self.app_key = "PlgvMymc7f3tQnJ6"  # fixed value
+        self.resource_id = "volc.speech.dialog"  # fixed value
+
+    def _build_message(self, event_id: int, payload: Optional[dict] = None, session_id: Optional[str] = None) -> bytes:
+        """
+        Build a binary message
+
+        Protocol layout:
+        - header (4 bytes)
+        - event_id (4 bytes, big-endian)
+        - [session_id_len (4 bytes) + session_id (variable)] -- non-connection-level events only
+        - payload_len (4 bytes, big-endian)
+        - payload (variable, JSON)
+        """
+        buf = bytearray()
+
+        # Header (4 bytes)
+        buf.append(0x11)  # version=1, header_size=1
+        buf.append(0x14)  # FULL_CLIENT_REQUEST(0x1) + WITH_EVENT(0x4)
+        buf.append(0x10)  # JSON serialization, no compression
+        buf.append(0x00)  # reserved
+
+        # Event ID
+        buf.extend(struct.pack('>I', event_id))
+
+        # Session ID (required for non-connection events)
+        if event_id not in CONNECTION_EVENTS:
+            sid_bytes = (session_id or "").encode('utf-8')
+            buf.extend(struct.pack('>I', len(sid_bytes)))
+            buf.extend(sid_bytes)
+
+        # Payload
+        if payload:
+            payload_bytes = json.dumps(payload, ensure_ascii=False).encode('utf-8')
+        else:
+            payload_bytes = b'{}'
+        buf.extend(struct.pack('>I', len(payload_bytes)))
+        buf.extend(payload_bytes)
+
+        return bytes(buf)
+
+    def _parse_response(self, data: bytes) -> dict:
+        """
+        Parse a binary message from the server
+
+        Returns:
+            dict with keys: msg_type, event_id, session_id, payload, payload_bytes
+        """
+        result = {"raw": data}
+        if len(data) < 4:
+            return result
+
+        # Header
+        msg_type = (data[1] >> 4) & 0x0F
+        flags = data[1] & 0x0F
+        result["msg_type"] = msg_type
+
+        offset = 4
+
+        # Event ID (if WITH_EVENT flag)
+        if flags & 0x04 and len(data) >= offset + 4:
+            event_id = struct.unpack('>I', data[offset:offset + 4])[0]
+            result["event_id"] = event_id
+            offset += 4
+
+            # Connect ID for connection events (50, 51, 52)
+            if event_id in {50, 51, 52} and len(data) >= offset + 4:
+                cid_len = struct.unpack('>I', data[offset:offset + 4])[0]
+                offset += 4
+                if len(data) >= offset + cid_len:
+                    result["connect_id"] = data[offset:offset + cid_len].decode('utf-8', errors='ignore')
+                    offset += cid_len
+            # Session ID for session-level events
+            elif event_id not in CONNECTION_EVENTS and len(data) >= offset + 4:
+                sid_len = struct.unpack('>I', data[offset:offset + 4])[0]
+                offset += 4
+                if len(data) >= offset + sid_len:
+                    result["session_id"] = data[offset:offset + sid_len].decode('utf-8', errors='ignore')
+                    offset += sid_len
+
+        # Payload
+        if len(data) >= offset + 4:
+            payload_len = struct.unpack('>I', data[offset:offset + 4])[0]
+            offset += 4
+            if len(data) >= offset + payload_len:
+                payload_raw = data[offset:offset + payload_len]
+                result["payload_bytes"] = payload_raw
+                # Audio-only responses (msg_type 0xB) have raw audio
+                if msg_type == 0x0B:
+                    result["is_audio"] = True
+                else:
+                    try:
+                        result["payload"] = json.loads(payload_raw.decode('utf-8'))
+                    except ValueError:  # covers UnicodeDecodeError and json.JSONDecodeError
+                        result["payload_text"] = payload_raw.decode('utf-8', errors='ignore')
+
+        return result
+
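+    async def sing(
+        self,
+        song_request: str,
+        output_file: str = "singing_output.mp3",
+        language: str = "zh-CN",
+        model: str = "1.2.1.0"
+    ) -> Optional[str]:
+        """
+        Ask Doubao to sing
+
+        Args:
+            song_request: the singing request, e.g. "请唱一首关于春天的歌"
+            output_file: output audio file path (a trailing .mp3 is replaced with .pcm)
+            language: language code (zh-CN/en-US); currently not sent to the server
+            model: model version
+
+        Returns:
+            Optional[str]: the output file path, or None if no audio was received
+        """
+        print("🎵 Doubao is singing...")
+        print(f"   Request: {song_request}")
+        print(f"   Model: {model}")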
+
+        try:
+            audio_data = bytearray()
+            session_id = str(uuid.uuid4())
+
+            # WebSocket连接头
+            headers = {
+                "X-Api-App-ID": self.app_id,
+                "X-Api-Access-Key": self.access_token,
+                "X-Api-Resource-Id": self.resource_id,
+                "X-Api-App-Key": self.app_key,
+                "X-Api-Connect-Id": str(uuid.uuid4()),
+            }
+
+            async with websockets.connect(self.ws_url, additional_headers=headers) as websocket:
+                print("✅ WebSocket connected")
+
+                # 1. StartConnection (event_id=1, no session_id needed)
+                await websocket.send(self._build_message(1))
+                response = await asyncio.wait_for(websocket.recv(), timeout=5)
+                resp = self._parse_response(response)
+                if resp.get("event_id") == 50:
+                    print("✅ Connection established")
+                else:
+                    print(f"⚠️  Connection response: {resp}")
+
+                # 2. StartSession (event_id=100, session_id required)
+                start_session_payload = {
+                    "tts": {
+                        "audio_config": {
+                            "channel": 1,
+                            "format": "pcm",
+                            "sample_rate": 24000
+                        }
+                    },
+                    "dialog": {
+                        "extra": {
+                            "enable_music": True,
+                            "input_mod": "text",
+                            "model": model
+                        }
+                    }
+                }
+                await websocket.send(self._build_message(100, start_session_payload, session_id))
+                response = await asyncio.wait_for(websocket.recv(), timeout=5)
+                resp = self._parse_response(response)
+                if resp.get("event_id") == 150:
+                    print("✅ Session established")
+                elif resp.get("payload", {}).get("error"):
+                    print(f"❌ Session error: {resp['payload']['error']}")
+                    return None
+                else:
+                    print(f"📋 Session response: {resp}")
+
+                # 3. SayHello/ChatTextQuery (event_id=300, session_id required)
+                chat_payload = {"content": song_request}
+                await websocket.send(self._build_message(300, chat_payload, session_id))
+                print("📤 Singing request sent")
+
+                # 4. Receive the audio stream (a receive timeout marks the end)
+                print("\n📋 Receiving audio stream...")
+                tts_started = False
+                recv_timeout = 5  # treat 5 seconds without data as the end of the stream
+
+                while True:
+                    try:
+                        message = await asyncio.wait_for(websocket.recv(), timeout=recv_timeout)
+                    except asyncio.TimeoutError:
+                        break
+                    except websockets.exceptions.ConnectionClosed:
+                        break
+
+                    if isinstance(message, bytes) and len(message) >= 4:
+                        resp = self._parse_response(message)
+                        msg_type = resp.get("msg_type", 0)
+                        flags = message[1] & 0x0F
+
+                        # Audio-only response (0xB = 11)
+                        if resp.get("is_audio") and resp.get("payload_bytes"):
+                            audio_data.extend(resp["payload_bytes"])
+                            if not tts_started:
+                                print("   Receiving audio...", end="", flush=True)
+                                tts_started = True
+                            else:
+                                print(".", end="", flush=True)
+
+                            # NEG_SEQUENCE flag = last packet
+                            if flags & 0x02:
+                                break
+
+                        # Server error (0xF = 15)
+                        elif msg_type == 0x0F:
+                            error = resp.get("payload", {}).get("error", "unknown")
+                            print(f"\n❌ Server error: {error}")
+                            break
+
+                        # Full server response (0x9) - session finished
+                        elif msg_type == 0x09:
+                            event_id = resp.get("event_id", 0)
+                            if event_id in {152, 52}:
+                                break
+
+                # 5. Save the audio file
+                if audio_data:
+                    # The stream is raw PCM, so swap a trailing .mp3 extension for .pcm
+                    actual_output = output_file
+                    if output_file.endswith('.mp3'):
+                        actual_output = output_file[:-len('.mp3')] + '.pcm'
+
+                    with open(actual_output, "wb") as f:
+                        f.write(audio_data)
+
+                    file_size = len(audio_data) / 1024
+                    print("\n\n✅ Singing finished!")
+                    print(f"   Output: {actual_output} ({file_size:.1f} KB)")
+                    print("   Format: PCM (24000Hz, mono)")
+                    return actual_output
+                else:
+                    print("\n⚠️ No audio data received; please check:")
+                    print("   1. Whether the credentials are correct")
+                    print("   2. Whether the end-to-end realtime speech model is enabled")
+                    print("   3. Whether the network connection is working")
+                    return None
+
+        except websockets.exceptions.WebSocketException as e:
+            raise RuntimeError(f"WebSocket connection error: {e}") from e
+        except Exception as e:
+            raise RuntimeError(f"Singing request failed: {e}") from e
+
+
+def main():
+    """Command-line entry point."""
+    import argparse
+
+    parser = argparse.ArgumentParser(description="Doubao singing tool")
+    subparsers = parser.add_subparsers(dest="command", help="Command to run")
+
+    # "sing" command
+    sing_parser = subparsers.add_parser("sing", help="Ask Doubao to sing")
+    sing_parser.add_argument("request", help="Singing request, e.g. '请唱一首关于春天的歌'")
+    sing_parser.add_argument(
+        "-o", "--output", default="singing_output.mp3",
+        help="Output audio file (default: singing_output.mp3; a .mp3 suffix is saved as .pcm)"
+    )
+    sing_parser.add_argument(
+        "-l", "--language", default="zh-CN", help="Language code (default: zh-CN)"
+    )
+    sing_parser.add_argument(
+        "-m", "--model", default="1.2.1.0", help="Model version (default: 1.2.1.0 = O2.0)"
+    )
+
+    args = parser.parse_args()
+
+    if not args.command:
+        parser.print_help()
+        return
+
+    try:
+        singing = DoubaoSinging()
+
+        if args.command == "sing":
+            asyncio.run(singing.sing(args.request, args.output, args.language, args.model))
+
+    except Exception as e:
+        print(f"❌ Error: {e}", file=sys.stderr)
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
--- a/skills/doubao-voice-plugin/scripts/voice_converter.py
+++ b/skills/doubao-voice-plugin/scripts/voice_converter.py
@@ -0,0 +1,171 @@
+#!/usr/bin/env python3
+"""
+Doubao voice conversion tool.
+Supports: text-to-speech (TTS)
+"""
+
+import os
+import sys
+import json
+import base64
+import requests
+from pathlib import Path
+
+
+class DoubaoVoiceConverter:
+    """Doubao voice conversion helper."""
+
+    def __init__(self):
+        # Read configuration from environment variables
+        self.app_id = os.environ.get("DOUBAO_APP_ID")
+        self.access_token = os.environ.get("DOUBAO_ACCESS_TOKEN")
+
+        if not self.app_id or not self.access_token:
+            raise ValueError(
+                "Please set the environment variables first:\n"
+                "export DOUBAO_APP_ID='your_app_id'\n"
+                "export DOUBAO_ACCESS_TOKEN='your_access_token'"
+            )
+
+        # API version: V1 (default, basic voices) or V3 (Doubao 2.0, needs extra configuration)
+        self.use_v3 = os.environ.get("DOUBAO_USE_V3", "false").lower() == "true"
+
+        if self.use_v3:
+            self.tts_url = "https://openspeech.bytedance.com/api/v3/tts/unidirectional"
+            self.resource_id = os.environ.get("DOUBAO_RESOURCE_ID", "volc.bigmodel.tts")
+        else:
+            # V1 API: stable, supports the basic voices
+            self.tts_url = "https://openspeech.bytedance.com/api/v1/tts"
+
+    def text_to_speech(
+        self,
+        text: str,
+        output_file: str = "output.mp3",
+        voice_type: str = "BV700_V2_streaming"
+    ) -> str:
+        """
+        Text-to-speech (TTS).
+
+        Args:
+            text: Text to convert
+            output_file: Output audio file path
+            voice_type: Voice type
+                - BV700_V2_streaming: general female voice (recommended)
+                - BV701_V2_streaming: general male voice
+                - BV406_streaming: gentle female voice
+                - BV158_streaming: lively female voice
+                - BV115_streaming: magnetic male voice
+
+        Returns:
+            str: Output file path
+        """
+        print("📝 Converting text to speech...")
+        print(f"   Text: {text[:50]}{'...' if len(text) > 50 else ''}")
+        print(f"   Voice: {voice_type}")
+
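+        headers = {
+            # Note: the scheme and token are joined with ";" here, not a space
+            "Authorization": f"Bearer;{self.access_token}",
+            "Content-Type": "application/json"
+        }
+
+        # The V3 API also requires a Resource-Id header (when enabled)
+        if self.use_v3:
+            headers["Resource-Id"] = self.resource_id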
+
+        payload = {
+            "app": {
+                "appid": self.app_id,
+                "token": self.access_token,
+                "cluster": "volcano_tts"
+            },
+            "user": {
+                "uid": "user_001"
+            },
+            "audio": {
+                "voice_type": voice_type,
+                "encoding": "mp3",
+                "speed_ratio": 1.0,
+                "volume_ratio": 1.0,
+                "pitch_ratio": 1.0
+            },
+            "request": {
+                "reqid": f"tts_{os.urandom(8).hex()}",
+                "text": text,
+                "text_type": "plain",
+                "operation": "query"
+            }
+        }
+
+        try:
+            response = requests.post(self.tts_url, headers=headers, json=payload, timeout=30)
+
+            # Print response header details
+            print("\n📋 Response info:")
+            print(f"   HTTP status: {response.status_code}")
+            if 'X-Tt-Logid' in response.headers:
+                print(f"   RequestId: {response.headers['X-Tt-Logid']}")
+            if 'X-Request-Id' in response.headers:
+                print(f"   X-Request-Id: {response.headers['X-Request-Id']}")
+
+            data = response.json()
+
+            # Print the response, leaving out the base64 audio payload to keep the log readable
+            printable = {k: v for k, v in data.items() if k != "data"}
+            print("\n📄 Response body (audio data omitted):")
+            print(json.dumps(printable, indent=2, ensure_ascii=False))
+            print()
+
+            if data.get("code") == 3000:
+                # Success: decode and save the audio
+                audio_data = base64.b64decode(data["data"])
+                with open(output_file, "wb") as f:
+                    f.write(audio_data)
+
+                file_size = len(audio_data) / 1024  # KB
+                print("✅ Speech synthesis succeeded!")
+                print(f"   Output: {output_file} ({file_size:.1f} KB)")
+                return output_file
+            else:
+                error_msg = data.get("message", "unknown error")
+                reqid = data.get("reqid", "unknown")
+                raise Exception(f"TTS failed\n   Code: {data.get('code')}\n   Message: {error_msg}\n   RequestId: {reqid}")
+
+        except requests.exceptions.Timeout:
+            raise Exception("Request timed out; please check the network connection")
+        except Exception as e:
+            raise Exception(f"TTS request failed: {e}") from e
+
+
+
+def main():
+    """Command-line entry point."""
+    import argparse
+
+    parser = argparse.ArgumentParser(description="Doubao voice conversion tool")
+    subparsers = parser.add_subparsers(dest="command", help="Command to run")
+
+    # "tts" command
+    tts_parser = subparsers.add_parser("tts", help="Text to speech")
+    tts_parser.add_argument("text", help="Text to convert")
+    tts_parser.add_argument("-o", "--output", default="output.mp3", help="Output audio file (default: output.mp3)")
+    tts_parser.add_argument("-v", "--voice", default="BV700_V2_streaming",
+                           help="Voice type (default: BV700_V2_streaming, general female voice)")
+
+    args = parser.parse_args()
+
+    if not args.command:
+        parser.print_help()
+        return
+
+    try:
+        converter = DoubaoVoiceConverter()
+
+        if args.command == "tts":
+            converter.text_to_speech(args.text, args.output, args.voice)
+
+    except Exception as e:
+        print(f"❌ Error: {e}", file=sys.stderr)
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
--- a/skills/doubao-voice-plugin/skills/SKILL.md
+++ b/skills/doubao-voice-plugin/skills/SKILL.md
@@ -0,0 +1,508 @@
+---
+name: doubao-voice
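+description: Doubao voice API integration. Supports speech synthesis (TTS) and singing. Activates automatically when the user mentions speech synthesis, text-to-speech, singing, or Doubao voice tasks.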
+---
+
+# Doubao Voice API Skill
+
+Calls the Volcengine Doubao voice API for text-to-speech (TTS) and singing.
+
+## Core Features ⭐
+
+### 1. Text-to-Speech (TTS)
+
+```bash
+# 1. Set the environment variables
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+
+# 2. Convert text to speech
+python scripts/voice_converter.py tts "你好世界"
+```
+
+### 2. Singing 🎵
+
+```bash
+# Ask Doubao to sing
+python scripts/singing.py sing "请唱一首关于春天的歌"
+
+# Note: the singing.py CLI defines only the `sing` subcommand;
+# for interactive mode, use the Python API shown in section 2.2
+```
+
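+## Feature Overview
+
+| Module | Capabilities | Recommended model |
+|------|------|---------|
+| **Text-to-speech (TTS)** | Text to speech, multiple voices | Doubao speech synthesis model 2.0 |
+| **Singing** | Realtime voice interaction, singing, role play | Doubao end-to-end realtime voice model |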
+
+---
+
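+## Environment Setup
+
+### 1. Get Volcengine Doubao voice credentials
+
+1. Visit the [Volcengine console](https://console.volcengine.com/)
+2. Enable the "Doubao Voice" (豆包语音) service
+3. Create an application to obtain the `App ID` and `Access Token`
+4. Enable the services you need:
+   - "Speech synthesis" permission: large-model speech synthesis
+   - "End-to-end realtime voice model": required by the singing feature
+
+### 2. Environment variables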
+
+```bash
+# ~/.zshrc or ~/.bashrc
+export DOUBAO_APP_ID="your_app_id"
+export DOUBAO_ACCESS_TOKEN="your_access_token"
+export DOUBAO_CLUSTER="volcano_tts"  # TTS service cluster
+```
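+
+The two required variables can be checked before any request is sent. A minimal sketch, assuming only the variable names above; `missing_credentials` is a hypothetical helper and not part of the plugin:
+
+```python
+import os
+
+# Variables that scripts/voice_converter.py refuses to run without.
+REQUIRED_VARS = ("DOUBAO_APP_ID", "DOUBAO_ACCESS_TOKEN")
+
+
+def missing_credentials(env=None):
+    """Return the names of required variables that are unset or empty."""
+    env = os.environ if env is None else env
+    return [name for name in REQUIRED_VARS if not env.get(name)]
+
+
+if __name__ == "__main__":
+    missing = missing_credentials()
+    if missing:
+        print("Missing environment variables: " + ", ".join(missing))
+```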
+
+### 3. Python dependencies
+
+```bash
+# Recommended: uv (websockets is used by scripts/singing.py)
+uv pip install requests websocket-client websockets
+
+# Or with pip
+pip install requests websocket-client websockets
+```
+
+---
+
+## API Basics
+
+### Base URL
+
+```
+TTS API: https://openspeech.bytedance.com/api/v1/tts
+```
+
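+### Authentication
+
+Authenticate with the Access Token by adding this request header (the scheme and token are separated by a semicolon, matching `scripts/voice_converter.py`):
+```
+Authorization: Bearer;{access_token}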
+```
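+
+The bundled `scripts/voice_converter.py` sends the header as `Bearer;<token>`, with a semicolon rather than a space after the scheme. A minimal sketch of a helper that produces the same header; `build_headers` is a hypothetical name, not a function in the plugin:
+
+```python
+def build_headers(access_token: str) -> dict:
+    """Build request headers in the format used by scripts/voice_converter.py."""
+    if not access_token:
+        raise ValueError("access_token must not be empty")
+    return {
+        # Scheme and token are joined with ";" (no space), as in the script.
+        "Authorization": f"Bearer;{access_token}",
+        "Content-Type": "application/json",
+    }
+```
+
+Keeping the header in one helper makes it harder for the examples and the script to drift apart.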
+
+---
+
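+## I. Speech Synthesis (TTS)
+
+### 1.1 Basic Speech Synthesis
+
+Convert text into an audio file.
+
+**Natural-language examples**:
+- "Convert this text to speech"
+- "Synthesize speech with Doubao"
+- "Generate speech: 你好，欢迎使用豆包语音"
+
+**Python implementation**: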
+
+```python
+import os
+import requests
+import json
+import base64
+
+def text_to_speech(text: str, voice_type: str = "BV700_V2_streaming", output_file: str = "output.mp3"):
+    """
+    Convert text to speech.
+
+    Args:
+        text: Text to synthesize
+        voice_type: Voice type (default: BV700_V2_streaming)
+        output_file: Output audio file path
+
+    Returns:
+        Path of the audio file
+    """
+    app_id = os.environ.get("DOUBAO_APP_ID")
+    access_token = os.environ.get("DOUBAO_ACCESS_TOKEN")
+    cluster = os.environ.get("DOUBAO_CLUSTER", "volcano_tts")
+
+    url = "https://openspeech.bytedance.com/api/v1/tts"
+
+    headers = {
+        # Same "Bearer;<token>" format as scripts/voice_converter.py
+        "Authorization": f"Bearer;{access_token}",
+        "Content-Type": "application/json"
+    }
+
+    payload = {
+        "app": {
+            "appid": app_id,
+            "token": access_token,
+            "cluster": cluster
+        },
+        "user": {
+            "uid": "user123"
+        },
+        "audio": {
+            "voice_type": voice_type,
+            "encoding": "mp3",
+            "speed_ratio": 1.0,
+            "volume_ratio": 1.0,
+            "pitch_ratio": 1.0
+        },
+        "request": {
+            "reqid": "req_" + os.urandom(8).hex(),
+            "text": text,
+            "text_type": "plain",
+            "operation": "query"
+        }
+    }
+
+    response = requests.post(url, headers=headers, json=payload, timeout=30)
+    data = response.json()
+
+    if data.get("code") == 3000:
+        # Decode the audio data
+        audio_data = base64.b64decode(data["data"])
+        with open(output_file, "wb") as f:
+            f.write(audio_data)
+        return output_file
+    else:
+        raise Exception(f"TTS failed: {data}")
+
+# Usage example
+audio_file = text_to_speech("你好，我是豆包语音助手")
+print(f"Audio generated: {audio_file}")
+```
+
+### 1.2 Streaming Speech Synthesis
+
+Suited to long text: audio can be played while it is still being generated.
+
+```python
+import base64
+import json
+import os
+
+import websocket  # provided by the websocket-client package
+
+def stream_tts(text: str, voice_type: str = "BV700_V2_streaming"):
+    """
+    Streaming speech synthesis.
+
+    Args:
+        text: Text to synthesize
+        voice_type: Voice type
+    """
+    app_id = os.environ.get("DOUBAO_APP_ID")
+    access_token = os.environ.get("DOUBAO_ACCESS_TOKEN")
+
+    ws_url = f"wss://openspeech.bytedance.com/api/v1/tts/ws?appid={app_id}&token={access_token}"
+
+    def on_message(ws, message):
+        data = json.loads(message)
+        if "audio" in data:
+            # Handle the audio data
+            audio_chunk = base64.b64decode(data["audio"])
+            # Play or save the audio chunk
+            print(f"Received audio chunk: {len(audio_chunk)} bytes")
+
+    def on_open(ws):
+        payload = {
+            "app": {
+                "appid": app_id,
+                "token": access_token,
+                "cluster": "volcano_tts"
+            },
+            "user": {
+                "uid": "user123"
+            },
+            "audio": {
+                "voice_type": voice_type,
+                "encoding": "mp3"
+            },
+            "request": {
+                "reqid": "stream_" + os.urandom(8).hex(),
+                "text": text,
+                "text_type": "plain",
+                "operation": "submit"
+            }
+        }
+        ws.send(json.dumps(payload))
+
+    ws = websocket.WebSocketApp(
+        ws_url,
+        on_message=on_message,
+        on_open=on_open
+    )
+    ws.run_forever()
+
+# Usage example
+stream_tts("这是一段很长的文本，使用流式合成可以边生成边播放...")
+```
+
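+### 1.3 Voice Selection
+
+Doubao Voice offers multiple voices:
+
+| Voice code | Description | Use case |
+|---------|------|------|
+| BV700_V2_streaming | General female voice | General purpose |
+| BV701_V2_streaming | General male voice | General purpose |
+| BV406_streaming | Gentle female voice | Customer service, assistants |
+| BV158_streaming | Lively female voice | Education, entertainment |
+| BV115_streaming | Magnetic male voice | News, broadcasting |
+
+**List available voices**: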
+
+```bash
+TOKEN="${DOUBAO_ACCESS_TOKEN}"
+APP_ID="${DOUBAO_APP_ID}"
+
+curl -s "https://openspeech.bytedance.com/api/v1/tts/voices?appid=$APP_ID" \
+  -H "Authorization: Bearer;$TOKEN"
+```
+
+---
+
+## Complete Utility Class
+
+```python
+import os
+import requests
+import base64
+import json
+from typing import Optional
+
+class DoubaoVoice:
+    """Doubao voice API helper."""
+
+    BASE_URL = "https://openspeech.bytedance.com/api/v1"
+
+    def __init__(self, app_id: Optional[str] = None, access_token: Optional[str] = None):
+        self.app_id = app_id or os.environ.get("DOUBAO_APP_ID")
+        self.access_token = access_token or os.environ.get("DOUBAO_ACCESS_TOKEN")
+        self.cluster_tts = os.environ.get("DOUBAO_CLUSTER", "volcano_tts")
+
+    @property
+    def headers(self):
+        return {
+            # Same "Bearer;<token>" format as scripts/voice_converter.py
+            "Authorization": f"Bearer;{self.access_token}",
+            "Content-Type": "application/json"
+        }
+
+    def text_to_speech(
+        self,
+        text: str,
+        voice_type: str = "BV700_V2_streaming",
+        output_file: str = "output.mp3"
+    ) -> str:
+        """Convert text to speech."""
+        url = f"{self.BASE_URL}/tts"
+
+        payload = {
+            "app": {
+                "appid": self.app_id,
+                "token": self.access_token,
+                "cluster": self.cluster_tts
+            },
+            "user": {"uid": "user123"},
+            "audio": {
+                "voice_type": voice_type,
+                "encoding": "mp3",
+                "speed_ratio": 1.0,
+                "volume_ratio": 1.0,
+                "pitch_ratio": 1.0
+            },
+            "request": {
+                "reqid": "req_" + os.urandom(8).hex(),
+                "text": text,
+                "text_type": "plain",
+                "operation": "query"
+            }
+        }
+
+        response = requests.post(url, headers=self.headers, json=payload, timeout=30)
+        data = response.json()
+
+        if data.get("code") == 3000:
+            audio_data = base64.b64decode(data["data"])
+            with open(output_file, "wb") as f:
+                f.write(audio_data)
+            return output_file
+        else:
+            raise Exception(f"TTS failed: {data}")
+
+    def list_voices(self) -> list:
+        """Return the list of available voices."""
+        url = f"{self.BASE_URL}/tts/voices"
+        params = {"appid": self.app_id}
+
+        response = requests.get(url, headers=self.headers, params=params, timeout=30)
+        data = response.json()
+
+        if data.get("code") == 0:
+            return data["voices"]
+        else:
+            raise Exception(f"Failed to list voices: {data}")
+
+
+# ==================== Usage examples ====================
+if __name__ == "__main__":
+    voice = DoubaoVoice()
+
+    # Example 1: text to speech
+    audio_file = voice.text_to_speech("你好，我是豆包语音助手")
+    print(f"Audio generated: {audio_file}")
+
+    # Example 2: list available voices
+    voices = voice.list_voices()
+    for v in voices[:5]:
+        print(f"{v['voice_type']}: {v['description']}")
+```
+
+---
+
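+## II. Singing (Doubao End-to-End Realtime Voice Model)
+
+### 2.1 Basic Singing
+
+Ask Doubao to sing; any song theme is supported.
+
+**Natural-language examples**:
+- "Please sing a song about spring"
+- "Sing a gentle lullaby"
+- "Give me a cheerful children's song"
+
+**Python implementation**: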
+
+```python
+import asyncio
+from scripts.singing import DoubaoSinging
+
+async def main():
+    singing = DoubaoSinging()
+
+    # Ask Doubao to sing
+    audio_file = await singing.sing(
+        "请唱一首关于春天的歌",
+        output_file="spring_song.mp3",
+        language="zh-CN"
+    )
+    # sing() writes raw PCM, so a .mp3 name comes back as spring_song.pcm
+    print(f"Singing complete: {audio_file}")
+
+asyncio.run(main())
+```
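+
+`scripts/singing.py` reports its output as raw PCM at 24000 Hz, mono, which most players cannot open directly. A minimal sketch of wrapping that file in a WAV container with the standard-library `wave` module. The 16-bit sample width is an assumption (the script does not state it), and `pcm_to_wav` is a hypothetical helper, not part of the plugin:
+
+```python
+import wave
+
+
+def pcm_to_wav(pcm_path, wav_path, sample_rate=24000, channels=1, sample_width=2):
+    """Wrap raw PCM bytes in a WAV header.
+
+    sample_width=2 (16-bit samples) is an assumption; adjust it if the
+    service returns a different sample format.
+    """
+    with open(pcm_path, "rb") as f:
+        pcm_bytes = f.read()
+
+    with wave.open(wav_path, "wb") as wav_file:
+        wav_file.setnchannels(channels)
+        wav_file.setsampwidth(sample_width)
+        wav_file.setframerate(sample_rate)
+        wav_file.writeframes(pcm_bytes)
+    return wav_path
+```
+
+For example, `pcm_to_wav("spring_song.pcm", "spring_song.wav")` would produce a file that ordinary audio players can open, provided the sample-width assumption holds.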
+
+### 2.2 Interactive Singing
+
+Hold a realtime conversation with Doubao; you can ask it to sing, tell stories, and more.
+
+**Python implementation**:
+
+```python
+import asyncio
+from scripts.singing import DoubaoSinging
+
+async def main():
+    singing = DoubaoSinging()
+
+    # Start interactive mode
+    await singing.interactive_singing(language="zh-CN")
+
+asyncio.run(main())
+```
+
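+**Example interaction**:
+```
+You: Please sing a love song
+Doubao: [generates audio] I'll sing you a gentle love song...
+
+You: Can you add some dialect?
+Doubao: [sings again in dialect]
+
+You: quit
+Goodbye!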
+```
+
+---
+
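+## Natural-Language Examples
+
+### TTS operations
+
+| User says | Action |
+|--------|----------|
+| "Convert this to speech: 你好世界" | Call the TTS API to generate audio |
+| "Synthesize speech with a gentle female voice" | Use the BV406_streaming voice |
+| "Generate a news clip in a broadcaster's voice" | Use the magnetic male voice |
+
+### Singing operations
+
+| User says | Action |
+|--------|----------|
+| "Please sing a song about spring" | Call the end-to-end realtime voice model to generate singing audio |
+| "Sing a lullaby" | Generate a gentle lullaby |
+| "Sing and tell a story at the same time" | Sing and tell a story within an interactive conversation |
+| "Start interactive singing mode" | Start realtime voice interaction |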
+
+---
+
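+## Billing
+
+### TTS billing
+
+- **Concurrency plan**: 2000 CNY per concurrent connection per month (billed purely by concurrency, with no per-character fee)
+- **Pay-as-you-go**: billed by the number of characters synthesized
+
+### Free trial
+
+New users receive a free quota after enabling the service; the exact amount is shown in the console.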
+
+---
+
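+## Notes
+
+1. **Audio format**: TTS supports mp3/wav/pcm; `scripts/singing.py` saves raw PCM (24000 Hz, mono)
+2. **Text length**: a single TTS request supports at most 5000 characters
+3. **Concurrency limits**: mind the API rate and concurrency limits
+4. **Token security**: keep the Access Token in environment variables; do not hard-code it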
+
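+Text longer than the 5000-character limit has to be split across several requests. A minimal sketch that packs whole sentences into chunks no longer than the limit and hard-splits any single sentence that is still too long. `split_text` is a hypothetical helper, and the set of sentence-ending characters is an assumption:
+
+```python
+import re
+
+
+def split_text(text: str, limit: int = 5000) -> list:
+    """Split text into chunks of at most `limit` characters."""
+    # Split after sentence-ending punctuation; the delimiters stay attached.
+    pieces = [p for p in re.split(r"(?<=[。！？.!?\n])", text) if p]
+
+    chunks, current = [], ""
+    for piece in pieces:
+        # A single sentence longer than the limit is cut at the limit.
+        while len(piece) > limit:
+            if current:
+                chunks.append(current)
+                current = ""
+            chunks.append(piece[:limit])
+            piece = piece[limit:]
+        if len(current) + len(piece) > limit:
+            chunks.append(current)
+            current = piece
+        else:
+            current += piece
+    if current:
+        chunks.append(current)
+    return chunks
+```
+
+Each chunk can then be sent as its own TTS request and the resulting audio files played or joined in order.
+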
+---
+
+## Error Handling
+
+```python
+def safe_tts(text: str):
+    """TTS with basic error handling."""
+    try:
+        voice = DoubaoVoice()
+        return voice.text_to_speech(text)
+    except Exception as e:
+        if "401" in str(e):
+            print("Authentication failed; please check the Access Token")
+        elif "429" in str(e):
+            print("Too many requests; please retry later")
+        else:
+            print(f"Synthesis failed: {e}")
+        return None
+```
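+
+When a request fails because of rate limiting, retrying after a pause is often enough. A minimal sketch of a generic retry wrapper with exponential backoff. `with_retry` is a hypothetical helper, and for simplicity it retries on any exception rather than only on rate-limit errors:
+
+```python
+import time
+
+
+def with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
+    """Call func(), retrying with exponential backoff when it raises."""
+    for attempt in range(attempts):
+        try:
+            return func()
+        except Exception:
+            # Give up after the final attempt and surface the error.
+            if attempt == attempts - 1:
+                raise
+            # Wait base_delay, then 2x, 4x, ... before the next try.
+            sleep(base_delay * (2 ** attempt))
+```
+
+It could wrap a call such as `with_retry(lambda: voice.text_to_speech("你好"))`; the `sleep` parameter exists so the delay can be replaced in tests.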
+
+---
+
+## Common Scenarios
+
+### Scenario 1: Multilingual speech
+
+```python
+voice = DoubaoVoice()
+
+# Chinese
+voice.text_to_speech("你好", voice_type="BV700_V2_streaming", output_file="zh.mp3")
+
+# English
+voice.text_to_speech("Hello", voice_type="EN_001", output_file="en.mp3")
+```
+
+
+---
+
+## References
+
+- [Volcengine Doubao Voice documentation](https://www.volcengine.com/docs/6561/1359369)
+- [Doubao Voice console](https://console.volcengine.com/speech/app)
+- [API reference](https://www.volcengine.com/docs/6561/1359370)
+- [Billing](https://www.volcengine.com/docs/6561/1359370)