mirror of
https://github.com/harivansh-afk/sep.git
synced 2026-04-16 16:01:05 +00:00
# Audio Separator API

REST API for separating audio into vocal and instrumental stems using ML models.

## Quick Start

```bash
# Clone and install
git clone <repo-url>
cd sep
chmod +x install.sh test.sh
sudo ./install.sh

# Run tests
./test.sh

# Start the API
.venv/bin/uvicorn app:app --host 0.0.0.0 --port 8000
```

## Requirements

- Python 3.10+
- FFmpeg
- 10GB+ disk space (for models)
- NVIDIA GPU with CUDA (optional, but recommended)

## API Endpoints

### Health Check

```bash
curl http://localhost:8000/health
```

Response:

```json
{
  "status": "healthy",
  "cuda_available": true,
  "cuda_device": "NVIDIA GeForce RTX 5090"
}
```

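
The same check can be scripted. The following is a minimal sketch that assumes the response has exactly the fields shown above and that the API listens on `http://localhost:8000`; the helper names `summarize_health` and `check_health` are hypothetical and not part of this project.

```python
import json
import urllib.request

# Assumption: same address as the curl examples in this README.
BASE_URL = "http://localhost:8000"


def summarize_health(payload: dict) -> str:
    """Turn a /health response body into a one-line summary."""
    if payload.get("status") != "healthy":
        return f"unhealthy: status={payload.get('status')!r}"
    if payload.get("cuda_available"):
        return f"healthy, GPU: {payload.get('cuda_device', 'unknown')}"
    return "healthy, CPU only"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Fetch /health and summarize it (requires the API to be running)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return summarize_health(json.load(resp))
```

Calling `check_health()` against a running instance returns a string such as `healthy, CPU only` on a machine without a GPU.
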
### Separate Audio

```bash
curl -X POST http://localhost:8000/separate \
  -F "file=@song.mp3" \
  -F "output_format=mp3"
```

Response:

```json
{
  "job_id": "a1b2c3d4",
  "status": "completed",
  "vocals_url": "/download/song_(Vocals)_model_bs_roformer.mp3",
  "instrumental_url": "/download/song_(Instrumental)_model_bs_roformer.mp3"
}
```

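
For clients that cannot shell out to curl, the upload can be sketched with the Python standard library alone. This assumes the endpoint accepts the same two multipart fields as the curl call (`file` and `output_format`) and returns the JSON shown above; `build_multipart` and `separate` are hypothetical helper names, not part of this project.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Formats listed under "Output Formats" in this README.
ALLOWED_FORMATS = {"mp3", "wav", "flac"}


def build_multipart(fields: dict, file_field: str, filename: str, data: bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {ctype}\r\n\r\n'.encode()
    )
    parts.append(data)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def separate(path: str, output_format: str = "mp3",
             base_url: str = "http://localhost:8000") -> dict:
    """POST an audio file to /separate and return the parsed JSON response."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    src = Path(path)
    body, content_type = build_multipart(
        {"output_format": output_format}, "file", src.name, src.read_bytes()
    )
    req = urllib.request.Request(
        f"{base_url}/separate",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Separation can be slow, so no timeout is set here.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the API running, `separate("song.mp3")["vocals_url"]` would give the path to pass to the download endpoint.
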
### Download Stems

Quote the URL so the shell does not interpret the parentheses in the filename:

```bash
curl -O "http://localhost:8000/download/song_(Vocals)_model_bs_roformer.mp3"
```

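
Output filenames contain parentheses, which are safer percent-encoded when a URL is built in code. A small sketch follows, assuming the server decodes percent-encoded paths (typical for ASGI servers, but not confirmed for this project); `download_url` and `download` are hypothetical helpers.

```python
import urllib.request
from urllib.parse import quote


def download_url(path: str, base_url: str = "http://localhost:8000") -> str:
    """Build a full URL from a path such as `vocals_url` in the /separate response."""
    # quote() leaves "/" intact and percent-encodes "(" and ")".
    return base_url.rstrip("/") + quote(path)


def download(path: str, dest: str, base_url: str = "http://localhost:8000") -> None:
    """Save a stem to `dest` (requires the API to be running)."""
    urllib.request.urlretrieve(download_url(path, base_url), dest)
```

For example, `download_url("/download/song_(Vocals)_model_bs_roformer.mp3")` yields `http://localhost:8000/download/song_%28Vocals%29_model_bs_roformer.mp3`.
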
### List Models

```bash
curl http://localhost:8000/models
```

## Configuration

### Output Formats

- `mp3` (default) - Good compression, iOS compatible
- `wav` - Lossless, larger files
- `flac` - Lossless compression

### Models

| Model | Quality | Speed | Best For |
|-------|---------|-------|----------|
| BS-RoFormer (default) | Highest | Slow | Production use |
| UVR_MDXNET_KARA_2 | Good | Fast | Karaoke |
| Kim_Vocal_2 | Good | Medium | Vocal isolation |

## VM Deployment

### Using systemd (Linux)

The install script creates a systemd service:

```bash
sudo systemctl enable audio-separator
sudo systemctl start audio-separator
sudo systemctl status audio-separator
```

### Manual Start

```bash
.venv/bin/uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1
```

Note: Use `--workers 1`. The ML model is not thread-safe, and each additional uvicorn worker is a separate process that would hold its own copy of the model.

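
To illustrate what "not thread-safe" implies inside a single worker, the sketch below serializes calls with a lock. It is not this project's implementation: `FakeModel` is a hypothetical stand-in that records how many calls overlap, and `separate_serialized` is an assumed wrapper name.

```python
import threading
import time

# One lock per process; it does not coordinate separate uvicorn workers.
_model_lock = threading.Lock()


class FakeModel:
    """Hypothetical stand-in for a non-thread-safe separator."""

    def __init__(self):
        self.active = 0
        self.max_active = 0

    def separate(self, name: str) -> str:
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # simulate work
        self.active -= 1
        return f"{name}: done"


def separate_serialized(model: FakeModel, name: str) -> str:
    """Allow only one thread at a time to use the model."""
    with _model_lock:
        return model.separate(name)


def run_concurrently(model: FakeModel, names: list) -> int:
    """Call the model from several threads and return the peak overlap."""
    threads = [
        threading.Thread(target=separate_serialized, args=(model, n))
        for n in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return model.max_active
```

With the lock in place, `run_concurrently(FakeModel(), ["a", "b", "c"])` reports a peak overlap of 1, meaning requests are processed one after another.
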
## GPU Support

The API automatically detects CUDA GPUs. To verify:

```bash
./test.sh
```

Look for:

```
[PASS] CUDA available: NVIDIA GeForce RTX 5090 (32.0GB VRAM)
```

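
A comparable check can be run directly from Python. This sketch assumes PyTorch is the backend (as the troubleshooting section suggests) and is not taken from `test.sh`; `cuda_summary` is a hypothetical helper, and reporting VRAM in GiB is an assumption about how the figure above is computed.

```python
def cuda_summary() -> dict:
    """Report whether PyTorch can see a CUDA device, without raising."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False,
                "device": None, "vram_gb": None}
    if not torch.cuda.is_available():
        return {"torch_installed": True, "cuda_available": False,
                "device": None, "vram_gb": None}
    props = torch.cuda.get_device_properties(0)  # first visible GPU
    return {
        "torch_installed": True,
        "cuda_available": True,
        "device": props.name,
        "vram_gb": round(props.total_memory / 1024**3, 1),
    }
```

Run it with the project's interpreter (for example `.venv/bin/python`) so that it inspects the same PyTorch build the API uses.
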
### CUDA Installation (Ubuntu)

```bash
# Add NVIDIA repo (this URL targets Ubuntu 22.04 on x86_64)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit-12-1
```

## iOS Integration

The API returns MP3 files by default, which are natively supported on iOS.

Example Swift code:

```swift
func separateAudio(fileURL: URL) async throws -> (vocals: URL, instrumental: URL) {
    var request = URLRequest(url: URL(string: "http://your-vm:8000/separate")!)
    request.httpMethod = "POST"

    // Upload file and get response with download URLs
    // ...
}
```

## File Cleanup

Uploaded and output files are automatically deleted after 5 minutes.

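
One way such a retention policy can work is a periodic sweep that deletes files older than the limit. The sketch below is an illustration under that assumption, not the project's actual cleanup code; `sweep_old_files` and `MAX_AGE_SECONDS` are hypothetical names.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 5 * 60  # the 5-minute retention stated above


def sweep_old_files(directory, max_age: float = MAX_AGE_SECONDS, now=None) -> list:
    """Delete regular files whose modification time is older than `max_age`.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in Path(directory).iterdir():
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Clients should therefore download both stems soon after `/separate` returns rather than storing the URLs for later.
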
## Troubleshooting

### "CUDA not available"

1. Check NVIDIA drivers: `nvidia-smi`
2. Reinstall PyTorch with CUDA:
   ```bash
   uv pip install torch --index-url https://download.pytorch.org/whl/cu121
   ```

### "Model download failed"

Check network access to huggingface.co and github.com.

### "Out of memory"

Reduce batch size or use a smaller model like `UVR_MDXNET_KARA_2`.