Merge Fast

POST /audio/v1/merge-fast

What it does

Upload 2–20 audio files (MP3 or AAC) and get back a single merged file in the same response. Tracks are joined at the stream level with no re-encoding, so there is no quality loss and processing takes milliseconds regardless of file duration.

How it works

| Step | What happens |
| --- | --- |
| 1. Upload | Send files as multipart form data. Optionally include silence spacing between tracks. |
| 2. Validate | Server verifies all files share the same codec, sample rate, and channel count. |
| 3. Merge | Streams are concatenated losslessly. Silence frames inserted if spacing requested. |
| 4. Stream back | By default the API returns responseFormat=file with raw audio bytes. Opt into responseFormat=json when you want a JSON wrapper instead. |
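The check in step 2 can be mirrored on the client to fail early, before any bytes are uploaded. The following is a minimal sketch, assuming you have already probed each file locally; `TrackInfo` and `find_mismatches` are hypothetical names and not part of the API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TrackInfo:
    """Properties of one input file, obtained by whatever local probe you use."""
    name: str
    codec: str        # e.g. "mp3" or "aac"
    sample_rate: int  # Hz
    channels: int     # 1 = mono, 2 = stereo


def find_mismatches(tracks: list[TrackInfo]) -> list[str]:
    """Return the names of tracks whose codec, sample rate, or channel count
    differ from the first track. An empty list means the set looks mergeable."""
    if not tracks:
        return []
    ref = tracks[0]
    key = (ref.codec, ref.sample_rate, ref.channels)
    return [t.name for t in tracks[1:]
            if (t.codec, t.sample_rate, t.channels) != key]


tracks = [
    TrackInfo("intro.mp3", "mp3", 44100, 2),
    TrackInfo("content.mp3", "mp3", 44100, 2),
    TrackInfo("outro.mp3", "mp3", 48000, 2),  # different sample rate
]
print(find_mismatches(tracks))  # ['outro.mp3']
```

The server remains the authority on what it accepts; this only catches the obvious mismatches listed in step 2.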

Response modes

merge-fast defaults to responseFormat=file: if you do not specify a response format, the merged audio is returned as the raw response body. If you need a JSON wrapper with base64 data and metadata in meta, send responseFormat=json inside the multipart metadata object.

  • audio/mpeg or audio/aac → default file-mode success
  • application/json with success: true → explicit responseFormat=json success
  • application/json with success: false → error in either mode
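The three cases above can be handled with a single branch on Content-Type. This is a minimal sketch rather than an official client: `interpret_response` and `MergeFastError` are hypothetical names, and the only JSON field it relies on is `success`, as described above.

```python
import json


class MergeFastError(Exception):
    """Raised when the API answers with success: false."""


def interpret_response(content_type: str, body: bytes):
    """Map a merge-fast response to ("file", bytes) or ("json", dict)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("audio/"):
        return "file", body            # default file-mode success
    if media_type == "application/json":
        payload = json.loads(body)
        if payload.get("success") is True:
            return "json", payload     # explicit responseFormat=json success
        raise MergeFastError(payload)  # error in either mode
    raise ValueError(f"unexpected Content-Type: {content_type!r}")
```

Raising on `success: false` keeps error handling in one place, whichever response mode was requested.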

Why use it?

Built for production audio pipelines

  • Bit-perfect output. Every sample is preserved exactly as-is. No generational loss, no artifacts, no re-encoding. Your listeners hear exactly what you produced.
  • Millisecond merges. A 2-hour audiobook merges as fast as a 10-second jingle — there's no re-encoding step, so duration doesn't affect latency.
  • Programmable silence gaps. Insert precise silence between tracks — 500 ms pause after an intro, 2 seconds between chapters, custom per-boundary. No need to create silent audio files yourself.
  • One request, one response. No job queues, no polling, no webhook callbacks, no pre-signed download URLs. Send files, get merged audio back in the same HTTP call.

Common use cases

  • Podcast production — stitch intro + ad + episode + outro in your CI/CD pipeline
  • Audiobook assembly — combine chapter recordings with timed silence breaks
  • Music playlists — create seamless mixes or DJ sets from individual tracks
  • Voice-over — merge narration segments for e-learning or video production
  • Automated workflows — batch merge thousands of files from a queue or cron job

Examples

Basic merge (cURL)

curl -X POST 'https://api.creatornode.io/audio/v1/merge-fast' \
  -H 'X-API-Key: YOUR_KEY' \
  -F 'files=@intro.mp3' \
  -F 'files=@content.mp3' \
  -F 'files=@outro.mp3' \
  --output merged.mp3

Merge with silence spacing

# spacing: 500 ms gap after chapter 1, 1000 ms gap after chapter 2
curl -X POST 'https://api.creatornode.io/audio/v1/merge-fast' \
  -H 'X-API-Key: YOUR_KEY' \
  -F 'files=@chapter1.mp3' \
  -F 'files=@chapter2.mp3' \
  -F 'files=@chapter3.mp3' \
  -F 'metadata={"format":"auto","spacing":[500,1000]}' \
  --output audiobook.mp3

Explicit JSON mode

curl -X POST 'https://api.creatornode.io/audio/v1/merge-fast' \
  -H 'X-API-Key: YOUR_KEY' \
  -F 'files=@intro.mp3' \
  -F 'files=@outro.mp3' \
  -F 'metadata={"responseFormat":"json","format":"auto"}'
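When the metadata field is generated by a script instead of typed by hand, a small builder helps keep it valid. The sketch below uses only the keys shown in the cURL examples (format, spacing, responseFormat). `build_metadata` is a hypothetical helper, and the rule that spacing needs exactly one entry per boundary between files is an assumption inferred from the three-file example, not a documented constraint.

```python
import json


def build_metadata(file_count, spacing_ms=None, response_format=None, fmt="auto"):
    """Return the JSON string for the multipart "metadata" field."""
    metadata = {"format": fmt}
    if spacing_ms is not None:
        # Assumption: one gap per boundary between consecutive files.
        if len(spacing_ms) != file_count - 1:
            raise ValueError(
                f"expected {file_count - 1} spacing values, got {len(spacing_ms)}"
            )
        if any(ms < 0 for ms in spacing_ms):
            raise ValueError("spacing values must be non-negative milliseconds")
        metadata["spacing"] = list(spacing_ms)
    if response_format is not None:
        metadata["responseFormat"] = response_format
    # Compact separators reproduce the form used in the cURL examples.
    return json.dumps(metadata, separators=(",", ":"))


print(build_metadata(3, [500, 1000]))
# {"format":"auto","spacing":[500,1000]}
```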

Response headers

| Header | Description |
| --- | --- |
| X-Audio-Format | Output format (mp3 or aac) |
| X-Audio-Codec | Audio codec name |
| X-Audio-Sample-Rate | Sample rate in Hz (e.g. 44100) |
| X-Audio-Channels | Channel count (1 mono, 2 stereo) |
| X-Audio-Duration-Ms | Total output duration in milliseconds |
| X-Audio-File-Count | Number of input files merged |
| X-Audio-Spacing-Count | Number of silence gaps inserted |
| X-Output-Size-Bytes | Output file size in bytes |
| X-Processing-Time-Ms | Server-side processing time |
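The headers listed above can be collected into one typed record for logging or storage. This is a minimal sketch: `MergeStats` and `parse_merge_headers` are hypothetical names, the sample values are invented for illustration, and the numeric formats (integers, with processing time read as a float) are assumptions.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MergeStats:
    audio_format: str
    codec: str
    sample_rate_hz: int
    channels: int
    duration_ms: int
    file_count: int
    spacing_count: int
    output_size_bytes: int
    processing_time_ms: float


def parse_merge_headers(headers: Mapping[str, str]) -> MergeStats:
    """Build a MergeStats record from response headers (case-insensitive names)."""
    h = {name.lower(): value for name, value in headers.items()}
    return MergeStats(
        audio_format=h["x-audio-format"],
        codec=h["x-audio-codec"],
        sample_rate_hz=int(h["x-audio-sample-rate"]),
        channels=int(h["x-audio-channels"]),
        duration_ms=int(h["x-audio-duration-ms"]),
        file_count=int(h["x-audio-file-count"]),
        spacing_count=int(h["x-audio-spacing-count"]),
        output_size_bytes=int(h["x-output-size-bytes"]),
        # Assumption: may be fractional, so parse as float.
        processing_time_ms=float(h["x-processing-time-ms"]),
    )


# Illustrative values only, not captured from a real response.
sample_headers = {
    "X-Audio-Format": "mp3",
    "X-Audio-Codec": "mp3",
    "X-Audio-Sample-Rate": "44100",
    "X-Audio-Channels": "2",
    "X-Audio-Duration-Ms": "185000",
    "X-Audio-File-Count": "3",
    "X-Audio-Spacing-Count": "2",
    "X-Output-Size-Bytes": "2960000",
    "X-Processing-Time-Ms": "12",
}
stats = parse_merge_headers(sample_headers)
print(stats.duration_ms / 1000, "seconds from", stats.file_count, "files")
```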

Tips & tricks

  • Check Content-Type before reading the body. audio/* means default file-mode success, while application/json can be either explicit JSON success or an error.
  • Normalize your sources first. All files must share the same codec, sample rate, and channel count. Use the X-Audio-* headers from a previous merge to check what format you're working with.
  • Let auto-detection do the work. Use format: "auto" (or omit the field entirely) — the server reads the codec from the first file. Only set mp3/aac explicitly to enforce strict format validation.
  • Read metadata from headers, not the file. Duration, file count, and format are in the X-Audio-* response headers and output size is in X-Output-Size-Bytes, so there is no need to probe the binary to get stats for your UI or database.
  • Use file mode for streaming pipelines. The default raw body is ideal for piping straight to disk or object storage without buffering the whole file in memory.

Cost & Limits

| Feature | Detail |
| --- | --- |
| Base cost | 2 credits (first 50 MB) |
| Extra cost | +1 credit per additional 50 MB block |
| Max cost | 5 credits per request (200 MB cap on premium tier) |
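The pricing rule above can be estimated before sending a request. This is a rough sketch under two assumptions that the table does not state: 1 MB is taken as 1024 × 1024 bytes, and a partially used extra block is billed as a full block. `estimate_credits` is a hypothetical helper.

```python
BLOCK_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
BASE_CREDITS = 2                # covers the first 50 MB
MAX_CREDITS = 5                 # per-request cap


def estimate_credits(total_bytes: int) -> int:
    """Estimate the credit cost of a merge from the total payload size."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Number of 50 MB blocks touched, rounding up (assumption).
    blocks = (total_bytes + BLOCK_BYTES - 1) // BLOCK_BYTES
    extra_blocks = max(0, blocks - 1)
    return min(BASE_CREDITS + extra_blocks, MAX_CREDITS)


print(estimate_credits(120 * 1024 * 1024))  # 4
```

For 120 MB the payload touches three blocks: the base block plus two extra, giving 2 + 2 = 4 credits.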

Tier Limits

| Limit | Free | Premium |
| --- | --- | --- |
| Max files per request | 3 | 20 |
| Max file size | 10 MB | 50 MB |
| Max total payload | 25 MB | 200 MB |
| Silence spacing | | |
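A pre-flight check against the tier limits avoids uploading a payload that would be rejected. The sketch below reads the limits as 3 files on Free and 20 on Premium, with 10 MB / 50 MB per file and 25 MB / 200 MB in total. It also assumes 1 MB = 1024 × 1024 bytes and takes the 2-file minimum from the endpoint description; `check_limits` is a hypothetical helper.

```python
from __future__ import annotations

MB = 1024 * 1024  # assumption: binary megabytes

TIER_LIMITS = {
    "free": {"max_files": 3, "max_file_mb": 10, "max_total_mb": 25},
    "premium": {"max_files": 20, "max_file_mb": 50, "max_total_mb": 200},
}


def check_limits(file_sizes: list[int], tier: str = "free") -> list[str]:
    """Return human-readable problems; an empty list means the request fits."""
    limits = TIER_LIMITS[tier]
    problems = []
    if len(file_sizes) < 2:
        problems.append("at least 2 files are required")
    if len(file_sizes) > limits["max_files"]:
        problems.append(f"too many files: {len(file_sizes)} > {limits['max_files']}")
    for index, size in enumerate(file_sizes):
        if size > limits["max_file_mb"] * MB:
            problems.append(f"file {index} exceeds {limits['max_file_mb']} MB")
    if sum(file_sizes) > limits["max_total_mb"] * MB:
        problems.append(f"total payload exceeds {limits['max_total_mb']} MB")
    return problems


print(check_limits([5 * MB, 5 * MB, 5 * MB, 5 * MB], tier="free"))
# ['too many files: 4 > 3']
```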
