Implementation Guide to Source Separation and Audio Preprocessing (Demucs / UVR5 / Vocal Extraction / ASR Preprocessing)

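Source separation is the technique of decomposing a single audio track into its components: vocals, drums, bass, and accompaniment. Its applications are broad: karaoke generation, multilingual dubbing of videos that keeps the background music, better transcription accuracy on noisy audio, remixing, and transcribing music by ear. This cluster centers on Demucs v4, which is SOTA-class among publicly available models, and UVR5 (MDX-Net), which specializes in vocal separation. It covers choosing a tool from your requirements, ASR preprocessing pipelines, quality evaluation with SDR / museval, and a production architecture built on GPU worker × job queue × idempotency. Throughout, the focus is on designs that make source separation pay off in production, judged by type safety, resilience, observability, and cost.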
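To make the SDR metric mentioned above concrete, here is a minimal sketch of the plain energy-ratio definition, SDR = 10·log10(‖s‖² / ‖s − ŝ‖²), using only the Python standard library. The function name `sdr_db` and the `eps` guard are illustrative assumptions, not part of any library. museval computes BSS Eval-style metrics that are more elaborate than this, so its numbers will not necessarily match.

```python
import math
from typing import Sequence


def sdr_db(reference: Sequence[float], estimate: Sequence[float], eps: float = 1e-10) -> float:
    """Plain signal-to-distortion ratio in dB (hypothetical helper, not museval's API).

    reference: samples of the ground-truth stem (e.g. the true vocal track)
    estimate:  samples of the separated stem produced by a model
    eps:       small constant that avoids division by zero and log(0)
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    # Energy of the true signal.
    signal_energy = sum(r * r for r in reference)
    # Energy of the residual between the true signal and the estimate.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))

    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    true_vocals = [1.0, -1.0, 1.0, -1.0]
    separated = [0.9, -0.9, 0.9, -0.9]  # estimate with a 10% amplitude error
    print(f"SDR: {sdr_db(true_vocals, separated):.2f} dB")
```

A higher value means the estimate is closer to the reference, which is what makes the metric usable as a threshold in an automated quality check.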

12 articles in total

Foundational guide (start here)

Source separation

How to choose a source-separation tool: selecting Demucs / UVR5(MDX-Net) / Spleeter / Open-Unmix by requirements

A cross-comparison of the major music-source-separation OSS — Demucs v4, UVR5(MDX-Net), Spleeter, Open-Unmix — by quality, speed, license, setup difficulty, and memory. It explains, with real code, a decision framework you can reverse-look-up from requirements ('which to choose for which project') and the license pitfalls you must always confirm for commercial use.

6/25/2026 · 12 min read

Related practical articles

Scaling audio source separation in production on AWS: a GPU batch-processing platform (SQS × ECS/Batch × S3)

Complete guide to BS-RoFormer / Mel-Band RoFormer: using 2026's highest-quality source separation in production

Demucs v4 Complete Guide: Running Meta's Source-Separation Model (HT Demucs) in Production, Faithful to the Official Docs

Turning source separation into a production API: the design of GPU worker × job queue × idempotency

Measuring source-separation quality in numbers: SDR / museval and a CI quality gate

Is real-time source separation possible: the design and limits of low latency (the reality of streaming processing)

Raising Whisper transcription accuracy with source separation: designing an audio-preprocessing pipeline

Building TTS/ASR training data with source separation: a preprocessing pipeline for clean speech datasets

Complete UVR5 / audio-separator troubleshooting guide (GPU not used, CUDA, OOM, installation)

Complete guide to making karaoke tracks and a cappella with UVR5: instrumental extraction / vocal extraction / harmony removal

UVR5 (MDX-Net) Complete Guide: Separating Vocals/Accompaniment with High Accuracy and Automating It in Production, Faithful to Official Sources