Audio Analyzer - Python audio analysis toolkit that detects BPM, key, frequency, and loudness and generates visualizations

audio-analyzer by dkyazzentwatwa/chatgpt-skills

73 weekly installs

38 GitHub Stars

Install command

npx skills add https://github.com/dkyazzentwatwa/chatgpt-skills --skill audio-analyzer

Python, Web frameworks, Data analysis, Audio processing

Audio Analyzer

A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics from a file and generates visualizations of the results.

Quick Start

from scripts.audio_analyzer import AudioAnalyzer

# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()

# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")

# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")

# Full report
analyzer.save_report("analysis_report.json")

Features

Tempo/BPM Detection : Accurate beat tracking with confidence score
Key Detection : Musical key and mode (major/minor) identification
Frequency Analysis : Spectrum, dominant frequencies, frequency bands
Loudness Metrics : RMS, peak, LUFS, dynamic range
Waveform Visualization : Multi-channel waveform plots
Spectrogram : Time-frequency visualization with customization
Chromagram : Pitch class visualization for harmonic analysis
Beat Grid : Visual beat markers overlaid on waveform
Export Formats : JSON report, PNG/SVG visualizations

API Reference

Initialization

# From file
analyzer = AudioAnalyzer("audio.mp3")

# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)

Analysis Methods

# Run full analysis
analyzer.analyze()

# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range

Results Access

# Get all results as dict
results = analyzer.get_results()

# Individual results
tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}

Visualization Methods

# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)

# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",           # viridis, plasma, inferno, magma
    freq_scale="log",       # linear, log, mel
    max_freq=8000           # Hz
)

# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)

# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)

# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)
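
The chromagram groups spectral energy by pitch class (C, C#, D, and so on). As a rough illustration of what a pitch class is, the sketch below maps a frequency to its nearest equal-tempered pitch class. `pitch_class` is a hypothetical helper written for this example, not part of the `AudioAnalyzer` API, and it assumes A4 = 440 Hz tuning.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Name of the equal-tempered pitch class nearest to freq_hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: A4 (440 Hz) is note 69, with 12 semitones per octave
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return PITCH_CLASSES[midi_note % 12]


print(pitch_class(440.0))   # A
print(pitch_class(261.63))  # C
print(pitch_class(880.0))   # A
```

Octaves collapse onto the same name, which is why a chromagram has only 12 rows regardless of the audio's frequency range.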

Export

# JSON report with all analysis
analyzer.save_report("report.json")

# Summary text
summary = analyzer.get_summary()
print(summary)

Analysis Details

Tempo Detection

Uses a beat-tracking algorithm to detect:

BPM : Beats per minute (tempo)
Beat positions : Timestamps of detected beats
Confidence : Reliability score (0-1)

tempo = analyzer.get_tempo()

{
    'bpm': 128.0,
    'confidence': 0.89,
    'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
    'beat_count': 256
}
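
The relation between the `beats` timestamps and the `bpm` value can be illustrated with a minimal sketch. This is not the toolkit's own implementation, which is not shown here. `bpm_from_beats` is a hypothetical helper that assumes a steady tempo and uses the median inter-beat interval so that a few mistimed beats do not skew the estimate.

```python
from statistics import median


def bpm_from_beats(beats):
    """Estimate tempo from beat timestamps (seconds) via the median inter-beat interval."""
    if len(beats) < 2:
        raise ValueError("need at least two beat timestamps")
    intervals = [later - earlier for earlier, later in zip(beats, beats[1:])]
    return 60.0 / median(intervals)


# The first four beat positions from the sample output
print(round(bpm_from_beats([0.0, 0.469, 0.938, 1.406]), 1))
```

With beats roughly 0.469 s apart, the estimate lands close to the 128 BPM reported in the sample output.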

Key Detection

Analyzes harmonic content to identify:

Key : Root note (C, C#, D, etc.)
Mode : Major or minor
Confidence : Detection confidence
Key profile : Correlation with each key

key = analyzer.get_key()

{
    'key': 'A',
    'mode': 'minor',
    'confidence': 0.76,
    'profile': {'C': 0.12, 'C#': 0.08, ...}
}
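
One common way to turn pitch-class energy into a key estimate is to correlate it with a reference profile for each of the 24 major and minor keys. The sketch below uses the Krumhansl-Kessler profiles for that purpose. Whether the toolkit uses these profiles is an assumption; `detect_key` and `pearson` are hypothetical helpers, and the chroma vector is made-up input.

```python
from math import sqrt

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, index 0 = tonic
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def detect_key(chroma):
    """Return (key, mode, correlation) for a 12-element pitch-class energy vector."""
    best = None
    for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
        for tonic in range(12):
            # Rotate the profile so that its tonic lands on pitch class `tonic`
            rotated = [profile[(i - tonic) % 12] for i in range(12)]
            score = pearson(chroma, rotated)
            if best is None or score > best[2]:
                best = (PITCH_CLASSES[tonic], mode, score)
    return best


# Hypothetical chroma vector with most energy on A, C and E
chroma = [0.8, 0.0, 0.1, 0.0, 0.7, 0.1, 0.0, 0.2, 0.0, 1.0, 0.0, 0.1]
key, mode, score = detect_key(chroma)
print(key, mode, round(score, 2))
```

Percussive material spreads energy almost evenly across pitch classes, so every correlation is weak; this is one reason key detection is less reliable on drums.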

Loudness Metrics

Comprehensive loudness analysis:

RMS dB : Root mean square level
Peak dB : Maximum sample level
LUFS : Integrated loudness (broadcast standard)
Dynamic Range : Difference between loud and quiet sections

loudness = analyzer.get_loudness()

{
    'rms_db': -14.2,
    'peak_db': -0.3,
    'lufs': -14.0,
    'dynamic_range_db': 12.5,
    'crest_factor': 8.2
}
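
RMS and peak levels are simple to compute from raw samples, as the sketch below shows. It is an illustration under stated assumptions rather than the toolkit's code: `level_metrics` is a hypothetical helper, samples are assumed to be floats in [-1.0, 1.0], and crest factor is assumed to mean peak level minus RMS level in dB. LUFS is left out because it additionally requires the frequency weighting and gating described in ITU-R BS.1770.

```python
import math


def level_metrics(samples):
    """Compute RMS and peak levels in dBFS for samples in [-1.0, 1.0]."""
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    if rms == 0.0:
        raise ValueError("signal is silent; levels in dB are undefined")
    rms_db = 20.0 * math.log10(rms)
    peak_db = 20.0 * math.log10(peak)
    return {
        "rms_db": rms_db,
        "peak_db": peak_db,
        # Assumption: crest factor expressed in dB as peak level minus RMS level
        "crest_factor_db": peak_db - rms_db,
    }


# One second of a full-scale 440 Hz sine at 22050 Hz
sr = 22050
sine = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
metrics = level_metrics(sine)
print({name: round(value, 2) for name, value in metrics.items()})
```

A full-scale sine has an RMS level about 3 dB below its peak, which makes it a convenient sanity check for any level meter.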

Frequency Analysis

Spectrum analysis including:

Dominant frequency : Strongest frequency component
Frequency bands : Energy in six bands, from sub-bass to high
Spectral centroid : "Brightness" of the audio (magnitude-weighted mean frequency)
Spectral rolloff : Frequency below which 85% of the spectral energy lies

freq = analyzer.get_frequency()

{
    'dominant_freq': 440.0,
    'spectral_centroid': 2150.3,
    'spectral_rolloff': 4200.5,
    'bands': {
        'sub_bass': -28.5,  # 20-60 Hz
        'bass': -18.2,      # 60-250 Hz
        'low_mid': -12.1,   # 250-500 Hz
        'mid': -10.8,       # 500-2000 Hz
        'high_mid': -14.3,  # 2000-4000 Hz
        'high': -22.1       # 4000-20000 Hz
    }
}
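
The band layout, centroid, and rolloff can be made concrete with a small sketch. The band edges come from the comments in the sample output; everything else is an assumption for illustration: the helper names are hypothetical, band intervals are treated as inclusive at the lower edge and exclusive at the upper edge, and rolloff is computed on cumulative magnitude.

```python
# Band edges in Hz, taken from the comments in the sample output
BANDS = [
    ("sub_bass", 20, 60),
    ("bass", 60, 250),
    ("low_mid", 250, 500),
    ("mid", 500, 2000),
    ("high_mid", 2000, 4000),
    ("high", 4000, 20000),
]


def band_for(freq_hz):
    """Return the band name containing freq_hz, or None if outside 20-20000 Hz."""
    for name, low, high in BANDS:
        # Assumption: lower edge inclusive, upper edge exclusive
        if low <= freq_hz < high:
            return name
    return None


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency."""
    total = sum(mags)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs, mags)) / total


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency at which the cumulative magnitude reaches `fraction` of the total."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


# Hypothetical four-bin spectrum
freqs = [100.0, 200.0, 300.0, 400.0]
mags = [4.0, 3.0, 2.0, 1.0]
print(band_for(440.0))                 # low_mid
print(spectral_centroid(freqs, mags))  # 200.0
print(spectral_rolloff(freqs, mags))   # 300.0
```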

CLI Usage

# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/

# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json

# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png

# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png

# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/

CLI Arguments

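Argument	Description	Default
`--input`	Input audio file	Required
`--input-dir`	Directory of audio files	-
`--output`	Output file path	-
`--output-dir`	Output directory	`.`
`--analyze`	Analysis types: tempo, key, loudness, frequency, all	`all`
`--plot`	Plot type: waveform, spectrogram, chromagram, beats, dashboard	-
`--format`	Output format: json, txt	`json`
`--sr`	Sample rate for analysis	`22050`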
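
For readers who want to see how such flags fit together, here is a minimal `argparse` sketch covering the flags documented on this page and used in the CLI examples. It is not the script's actual parser. Two points are assumptions: that exactly one of `--input` and `--input-dir` must be supplied (the batch example omits `--input`), and that `--dashboard` is a plain on/off switch.

```python
import argparse


def build_parser():
    """Build a parser for the documented flags (a sketch, not the script's own parser)."""
    parser = argparse.ArgumentParser(prog="audio_analyzer.py")
    # Assumption: exactly one of --input / --input-dir must be given
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")
    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze", nargs="+", default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
    )
    # Assumption: --dashboard is a boolean switch
    parser.add_argument("--dashboard", action="store_true", help="Render the dashboard")
    parser.add_argument("--format", choices=["json", "txt"], default="json")
    parser.add_argument("--sr", type=int, default=22050, help="Sample rate for analysis")
    return parser


args = build_parser().parse_args(
    ["--input", "song.mp3", "--analyze", "tempo", "key", "--output", "report.json"]
)
print(args.analyze, args.sr, args.output_dir)
```

Using `nargs="+"` lets `--analyze tempo key` collect several analysis types in one flag, matching the usage shown in the CLI examples.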

Examples

Song Analysis

from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()

# Fetch each result once and reuse it
tempo = analyzer.get_tempo()
key = analyzer.get_key()
loudness = analyzer.get_loudness()

print(f"Tempo: {tempo['bpm']:.1f} BPM")
print(f"Key: {key['key']} {key['mode']}")
print(f"Loudness: {loudness['lufs']:.1f} LUFS")

analyzer.plot_dashboard("track_analysis.png")

Podcast Quality Check

from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()

# Compare integrated loudness against the -20 to -16 LUFS window
loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet for podcast standards")
else:
    print("Loudness is within podcast standards (-20 to -16 LUFS)")

Batch Analysis

import os
from scripts.audio_analyzer import AudioAnalyzer

AUDIO_EXTENSIONS = ('.mp3', '.wav', '.flac')

results = []
for filename in os.listdir("./songs"):
    # Match extensions case-insensitively so "Track.MP3" is not skipped
    if filename.lower().endswith(AUDIO_EXTENSIONS):
        analyzer = AudioAnalyzer(os.path.join("./songs", filename))
        analyzer.analyze()
        key = analyzer.get_key()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{key['key']} {key['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })

# Sort by BPM for a DJ set
results.sort(key=lambda x: x['bpm'])

Supported Formats

Input formats (via librosa/soundfile):

MP3
WAV
FLAC
OGG
M4A/AAC
AIFF

Output formats:

JSON (analysis report)
PNG (visualizations)
SVG (visualizations)
TXT (summary)

Dependencies

librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0

Limitations

Key detection works best with melodic content (less accurate for drums/percussion)
BPM detection may struggle with free-tempo or complex time signatures
Very short clips (<5 seconds) may have reduced accuracy
LUFS calculation is simplified (not full ITU-R BS.1770-4)

Repository

dkyazzentwatwa/…t-skills

First Seen

Jan 24, 2026

Security Audits

Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass

Installed on

opencode: 61
codex: 59
gemini-cli: 59
cursor: 59
github-copilot: 57
amp: 51
