Nightingale Karaoke - A local karaoke app built with Rust and AI, with vocal separation and real-time scoring

nightingale-karaoke by aradotso/trending-skills

433 weekly installs

15 GitHub Stars

Install command

npx skills add https://github.com/aradotso/trending-skills --skill nightingale-karaoke

AI/Machine Learning, Audio Processing, Rust


Nightingale Karaoke Skill

Skill by ara.so — Daily 2026 Skills collection.

Nightingale is a self-contained, ML-powered karaoke application written in Rust (Bevy engine). It scans a local music folder, separates vocals from instrumentals (UVR Karaoke model or Demucs), transcribes lyrics with word-level timestamps (WhisperX), and plays back with synchronized highlighting, real-time pitch scoring, player profiles, and GPU shader / video backgrounds. Everything — ffmpeg, Python, PyTorch, ML models — is bootstrapped automatically on first launch.

Installation

Pre-built Binary (Recommended)

Download the latest release from the Releases page for your platform and run it.

macOS only — remove quarantine after extracting:

xattr -cr Nightingale.app

Build from Source

Prerequisites:

Rust 1.85+ (edition 2024)
Linux additionally needs: libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev

git clone https://github.com/rzru/nightingale
cd nightingale

# Build with the release profile
cargo build --release

# Run directly
./target/release/nightingale

Release Packaging

# Linux / macOS
scripts/make-release.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File scripts/make-release.ps1

Outputs a .tar.gz (Linux/macOS) or .zip (Windows) ready for distribution.

First Launch / Bootstrap

On first run, Nightingale downloads and configures:

ffmpeg binary
uv (Python package manager)
Python 3.10 via uv
PyTorch + WhisperX + audio-separator in a virtual environment
UVR Karaoke ONNX model and WhisperX large-v3 model

This takes 2–10 minutes depending on network speed. A progress screen is shown in-app.

To force re-bootstrap at any time:

./nightingale --setup

Bootstrap completion is marked by ~/.nightingale/vendor/.ready.
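
As a minimal sketch of how a tool could test for that marker, the helper below checks whether `vendor/.ready` exists under a given data root. The function name `bootstrap_ready` and the choice to pass the data root as a parameter are assumptions made for illustration; Nightingale's own check may be implemented differently.

```rust
use std::path::Path;

/// Returns true when the bootstrap marker file exists under `data_root`
/// (for example `~/.nightingale`). Hypothetical helper, not part of Nightingale's API.
pub fn bootstrap_ready(data_root: &Path) -> bool {
    data_root.join("vendor").join(".ready").is_file()
}
```

Passing the root explicitly keeps the check easy to exercise against a temporary directory instead of the real home folder.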

CLI Flags

Flag	Description
`--setup`	Force re-run of the first-launch bootstrap (re-downloads vendor deps)

Keyboard & Gamepad Controls

Navigation

Action	Keyboard	Gamepad
Move	Arrow keys	D-pad / Left stick
Confirm	Enter	A (South)
Back	Escape	B (East) / Start
Switch panel	Tab	—
Search	Type to filter	—

Playback

Action	Keyboard	Gamepad
Pause / Resume	Space	Start
Exit to menu	Escape	B (East)
Toggle guide vocals	G	—
Guide volume up/down	+ / -	—
Cycle background	T	—
Cycle video flavor	F	—
Toggle microphone	M	—
Next microphone	N	—
Toggle fullscreen	F11	—

Configuration

Main Config

Located at ~/.nightingale/config.json. Edit directly or via in-app settings.

{
  "music_folder": "/home/user/Music",
  "separator": "uvr",
  "guide_vocal_volume": 0.3,
  "background_theme": "plasma",
  "video_flavor": "nature",
  "default_profile": "Alice"
}

separator options: "uvr" (default, preserves backing vocals) | "demucs"

background_theme options: "plasma", "aurora", "waves", "nebula", "starfield", "video", "source_video"

video_flavor options: "nature", "underwater", "space", "city", "countryside"
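
When editing config.json by hand, it can help to validate option strings before writing them. The sketch below does this for `background_theme` using only the values listed above. The `BackgroundTheme` enum and its `FromStr` implementation are hypothetical and written for illustration; they are not taken from Nightingale's source.

```rust
use std::str::FromStr;

/// The `background_theme` values listed in the configuration section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackgroundTheme {
    Plasma,
    Aurora,
    Waves,
    Nebula,
    Starfield,
    Video,
    SourceVideo,
}

impl FromStr for BackgroundTheme {
    type Err = String;

    /// Accepts exactly the documented strings; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "plasma" => Ok(Self::Plasma),
            "aurora" => Ok(Self::Aurora),
            "waves" => Ok(Self::Waves),
            "nebula" => Ok(Self::Nebula),
            "starfield" => Ok(Self::Starfield),
            "video" => Ok(Self::Video),
            "source_video" => Ok(Self::SourceVideo),
            other => Err(format!("unknown background_theme: {other}")),
        }
    }
}
```

With this in place, `"aurora".parse::<BackgroundTheme>()` succeeds and a misspelled value returns an error that names the offending string.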

Profiles

Located at ~/.nightingale/profiles.json:

{
  "profiles": [
    {
      "name": "Alice",
      "scores": {
        "blake3_hash_of_song": {
          "stars": 4,
          "score": 87250,
          "played_at": "2026-03-18T21:00:00Z"
        }
      }
    }
  ]
}

Pixabay Video Backgrounds (Dev)

API key is embedded in release builds. For local development, create .env at project root:

# .env
PIXABAY_API_KEY=<your-pixabay-api-key>

The release script (make-release.sh) sources .env automatically.

Data Storage Layout

~/.nightingale/
├── cache/              # Per-song stems, transcripts, lyrics (keyed by blake3 hash)
├── config.json         # App settings
├── profiles.json       # Player profiles and per-song scores
├── videos/             # Pre-downloaded Pixabay video backgrounds
├── sounds/             # Sound effects
├── vendor/
│   ├── ffmpeg          # ffmpeg binary
│   ├── uv              # uv binary
│   ├── python/         # Python 3.10
│   ├── venv/           # ML virtualenv (WhisperX, Demucs, audio-separator)
│   ├── analyzer/       # Python analyzer scripts
│   └── .ready          # Bootstrap completion marker
└── models/
    ├── torch/          # Demucs model weights
    ├── huggingface/    # WhisperX large-v3 weights
    └── audio_separator/ # UVR Karaoke ONNX model

Cache keys are blake3 hashes of the source file — re-analysis only triggers if the file changes or is manually invalidated.
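
The lookup this implies can be sketched as follows, assuming each song's artifacts live in `cache/<hash>/` under the data root (the layout the troubleshooting commands also use). The helper names `cache_dir_for` and `needs_analysis` are hypothetical, and the hex hash is taken as an input rather than computed here.

```rust
use std::path::{Path, PathBuf};

/// Directory that would hold the cached analysis for a song with this hex hash.
pub fn cache_dir_for(data_root: &Path, song_hash: &str) -> PathBuf {
    data_root.join("cache").join(song_hash)
}

/// A song needs analysis when no cache directory exists for its current hash.
pub fn needs_analysis(data_root: &Path, song_hash: &str) -> bool {
    !cache_dir_for(data_root, song_hash).is_dir()
}
```

Because the key is derived from file contents, an edited file hashes to a different directory name, so it looks uncached without any explicit invalidation step.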

Supported File Formats

Audio: .mp3, .flac, .ogg, .wav, .m4a, .aac, .wma

Video: .mp4, .mkv, .avi, .webm, .mov, .m4v

Video files: the audio track is extracted and its vocals are separated, and the original video plays as the background automatically.
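
A folder scanner could sort files into these two groups by extension. The sketch below is one way to do that; the `MediaKind` enum, the `classify` function, and the case-insensitive match are assumptions for illustration, not Nightingale's actual scanner.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaKind {
    Audio,
    Video,
}

// Extensions taken from the supported formats listed above.
const AUDIO_EXTS: [&str; 7] = ["mp3", "flac", "ogg", "wav", "m4a", "aac", "wma"];
const VIDEO_EXTS: [&str; 6] = ["mp4", "mkv", "avi", "webm", "mov", "m4v"];

/// Classifies a path by its extension, ignoring ASCII case (an assumption).
/// Returns None for unsupported or missing extensions.
pub fn classify(path: &Path) -> Option<MediaKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    if AUDIO_EXTS.contains(&ext.as_str()) {
        Some(MediaKind::Audio)
    } else if VIDEO_EXTS.contains(&ext.as_str()) {
        Some(MediaKind::Video)
    } else {
        None
    }
}
```

Files classified as `MediaKind::Video` are the ones that would get the extra audio-extraction step described above.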

Hardware Acceleration

PyTorch backend is auto-detected:

Backend	Device	Notes
CUDA	NVIDIA GPU	Fastest; ~2–5 min/song
MPS	Apple Silicon	macOS; WhisperX alignment falls back to CPU
CPU	Any	Always works; ~10–20 min/song

UVR Karaoke model uses ONNX Runtime with CUDA (NVIDIA) or CoreML (Apple Silicon) automatically.
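
The selection order in the table can be pictured with the small sketch below. It is only a model of the decision: the real detection happens inside PyTorch in the Python analyzer, and the CUDA, then MPS, then CPU priority, along with the `Backend` and `pick_backend` names, are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    Mps,
    Cpu,
}

/// Picks the fastest available backend, falling back to CPU.
/// The availability flags would come from the ML runtime; here they are plain inputs.
pub fn pick_backend(cuda_available: bool, mps_available: bool) -> Backend {
    if cuda_available {
        Backend::Cuda
    } else if mps_available {
        Backend::Mps
    } else {
        Backend::Cpu
    }
}
```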

Processing Pipeline

Audio/Video file
       │
       ▼
 UVR Karaoke (ONNX) or Demucs (PyTorch)
       │  vocals.ogg + instrumental.ogg
       ▼
 LRCLIB API  ──▶  Synced lyrics fetch (if available)
       │
       ▼
 WhisperX large-v3  ──▶  Transcription + word-level timestamps
       │
       ▼
 Bevy App (Rust)
   - Plays instrumental audio
   - Synchronized word highlighting
   - Real-time pitch detection & scoring
   - GPU shader / video backgrounds
   - Scoreboards per profile

Code Patterns

Adding a New Background Theme (Bevy System)

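// In your Bevy plugin, register a new background variant.
// `AppState` is not defined in this snippet; it is assumed to be the app's
// state enum with a `Playing` variant and must be imported from the crate.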
use bevy::prelude::*;

#[derive(Component)]
pub struct MyCustomBackground;

pub fn spawn_custom_background(mut commands: Commands) {
    commands.spawn((
        MyCustomBackground,
        // ... your background components
    ));
}

pub struct CustomBackgroundPlugin;

impl Plugin for CustomBackgroundPlugin {
    fn build(&self, app: &mut App) {
        app.add_systems(OnEnter(AppState::Playing), spawn_custom_background);
    }
}

Extending Config Deserialization

use serde::{Deserialize, Serialize};

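#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NightingaleConfig {
    pub music_folder: String,
    #[serde(default = "default_separator")]
    pub separator: StemSeparator,
    #[serde(default = "default_guide_volume")]
    pub guide_vocal_volume: f32,
}

// `load_config` calls `unwrap_or_default()`, which needs a `Default` impl.
// Reuse the serde default functions so both paths produce the same values.
impl Default for NightingaleConfig {
    fn default() -> Self {
        Self {
            music_folder: String::new(),
            separator: default_separator(),
            guide_vocal_volume: default_guide_volume(),
        }
    }
}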

#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "lowercase")]
pub enum StemSeparator {
    #[default]
    Uvr,
    Demucs,
}

fn default_guide_volume() -> f32 { 0.3 }
fn default_separator() -> StemSeparator { StemSeparator::Uvr }

// Load config
fn load_config() -> NightingaleConfig {
    let path = dirs::home_dir()
        .unwrap()
        .join(".nightingale/config.json");
    let raw = std::fs::read_to_string(&path).unwrap_or_default();
    serde_json::from_str(&raw).unwrap_or_default()
}

Triggering Re-analysis Programmatically

use std::fs;

/// Remove cached stems/transcript for a song to force re-analysis
fn invalidate_song_cache(song_hash: &str) {
    let cache_dir = dirs::home_dir()
        .unwrap()
        .join(".nightingale/cache")
        .join(song_hash);

    if cache_dir.exists() {
        fs::remove_dir_all(&cache_dir)
            .expect("Failed to remove cache directory");
        println!("Cache invalidated for {}", song_hash);
    }
}

Computing a Song's Blake3 Hash (for Cache Lookup)

use blake3::Hasher;
use std::fs::File;
use std::io::{BufReader, Read};

fn hash_file(path: &std::path::Path) -> String {
    let file = File::open(path).expect("Cannot open file");
    let mut reader = BufReader::new(file);
    let mut hasher = Hasher::new();
    let mut buf = [0u8; 65536];
    loop {
        let n = reader.read(&mut buf).unwrap();
        if n == 0 { break; }
        hasher.update(&buf[..n]);
    }
    hasher.finalize().to_hex().to_string()
}

Profile Score Update Pattern

use serde::{Deserialize, Serialize};
use std::collections::HashMap;

#[derive(Debug, Serialize, Deserialize)]
pub struct SongScore {
    pub stars: u8,
    pub score: u32,
    pub played_at: String,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Profile {
    pub name: String,
    pub scores: HashMap<String, SongScore>, // key = blake3 hash
}

fn update_score(profile: &mut Profile, song_hash: &str, stars: u8, score: u32) {
    profile.scores.insert(song_hash.to_string(), SongScore {
        stars,
        score,
        played_at: chrono::Utc::now().to_rfc3339(),
    });
}

Troubleshooting

Bootstrap Fails / Stuck on Setup Screen

# Force re-bootstrap
./nightingale --setup

# Or manually remove the vendor directory and restart
rm -rf ~/.nightingale/vendor
./nightingale

Song Analysis Hangs or Errors

# Check the analyzer venv is healthy
~/.nightingale/vendor/venv/bin/python -c "import whisperx; print('ok')"

# Re-bootstrap if broken
./nightingale --setup

macOS "App is damaged" Error

xattr -cr Nightingale.app

GPU Not Being Used

NVIDIA: Ensure CUDA drivers are installed and nvidia-smi shows your GPU.
Apple Silicon: MPS is used automatically on macOS with Apple Silicon; WhisperX alignment falls back to CPU (normal behavior).
Check ~/.nightingale/vendor/venv — if PyTorch installed the CPU-only build, re-bootstrap after installing CUDA drivers.

Cache Corruption / Wrong Lyrics

# Find the blake3 hash of your file (build a small tool or use b3sum)
b3sum /path/to/song.mp3

# Remove that song's cache
rm -rf ~/.nightingale/cache/<hash>

Then re-open the song in Nightingale to re-analyze.

Audio Playback Issues (Linux)

Ensure ALSA/PulseAudio/PipeWire is running. Install missing deps:

sudo apt install libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev

Video Backgrounds Not Loading

Video backgrounds are pre-downloaded during setup via the Pixabay API. For development builds, ensure .env contains a valid PIXABAY_API_KEY. If videos are missing in a release build, run --setup to re-trigger the download.

Platform Targets

Platform	Target Triple
Linux x86_64	`x86_64-unknown-linux-gnu`
Linux aarch64	`aarch64-unknown-linux-gnu`
macOS ARM	`aarch64-apple-darwin`
macOS Intel	`x86_64-apple-darwin`
Windows x86_64	`x86_64-pc-windows-msvc`

Cross-compile with:

rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu

License

GPL-3.0-or-later. See LICENSE.

Weekly Installs

217

Repository

aradotso/trending-skills

First Seen

5 days ago

Security Audits

Gen Agent Trust Hub: Warn
Socket: Pass
Snyk: Pass

Installed on

github-copilot: 216
codex: 216
warp: 216
amp: 216
cline: 216
kimi-cli: 216
