stable-audio-2.0
Stable Audio 2.0 generates audio up to 3 minutes long from text prompts, supporting text-to-audio and audio-to-audio transformations with customizable settings like duration, steps, CFG scale, and more. It is ideal for creative professionals seeking detailed and extended outputs from simple prompts.

Note: Audio-to-audio mode requires a prompt alongside an uploaded audio file for generation.

Parameter controls available:

1. Basic
   - Default: text-to-audio (no `--mode` needed)
   - If transforming uploaded audio: `--mode audio-to-audio`
   - `--output_format wav` (for high quality; omit for mp3)
2. Timing and Randomness
   - `--duration [1-190 seconds]`: controls how long the generated audio is
   - `--random_seed false --seed [0-4294967294]`: disables random seed generation
3. Advanced
   - `--cfg_scale [1-25]`: Higher = closer to prompt (recommended 7-15)
   - `--steps [30-100]`: Higher = better quality (recommended 50-80)
4. Transformation control (only for audio-to-audio)
   - `--strength [0-1]`: How much to change/transform (0.3-0.7 typical)
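The parameter controls above can be assembled programmatically. The following is a minimal sketch, not an official client: `build_stable_audio_prompt` is a hypothetical helper, and it assumes the flags are simply appended to the prompt text in the same message. It also assumes `--duration`, `--steps`, and `--seed` take whole numbers. The flag names and ranges are taken from the list above.

```python
from typing import Optional


def build_stable_audio_prompt(
    prompt: str,
    *,
    audio_to_audio: bool = False,
    wav: bool = False,
    duration: Optional[int] = None,
    seed: Optional[int] = None,
    cfg_scale: Optional[float] = None,
    steps: Optional[int] = None,
    strength: Optional[float] = None,
) -> str:
    """Hypothetical helper: append documented flags to a text prompt.

    Assumption: flags follow the prompt text in the same message.
    """

    def check(name: str, value: float, low: float, high: float) -> None:
        # Reject values outside the documented range for each flag.
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    parts = [prompt.strip()]

    if audio_to_audio:
        # The uploaded audio file itself is attached separately, not in the text.
        parts.append("--mode audio-to-audio")
    elif strength is not None:
        raise ValueError("--strength only applies to audio-to-audio mode")

    if wav:
        parts.append("--output_format wav")  # omit the flag to get mp3
    if duration is not None:
        check("duration", duration, 1, 190)
        parts.append(f"--duration {duration}")
    if seed is not None:
        check("seed", seed, 0, 4294967294)
        parts.append(f"--random_seed false --seed {seed}")
    if cfg_scale is not None:
        check("cfg_scale", cfg_scale, 1, 25)
        parts.append(f"--cfg_scale {cfg_scale}")
    if steps is not None:
        check("steps", steps, 30, 100)
        parts.append(f"--steps {steps}")
    if strength is not None:
        check("strength", strength, 0, 1)
        parts.append(f"--strength {strength}")

    return " ".join(parts)


if __name__ == "__main__":
    print(build_stable_audio_prompt(
        "warm lo-fi piano loop", duration=45, seed=1234, cfg_scale=10, steps=60
    ))
```

Validating ranges before sending keeps out-of-range values (for example `--steps 10`) from reaching the model. How the service itself treats such values is not stated here, so the sketch raises an error rather than guessing.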
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| poe | poe | Input: - | Output: - | Stable Audio 2.0 generates audio up to 3 minutes long from text prompts, supporting text-to-audio and audio-to-audio transformations with customizable settings like duration, steps, CFG scale, and more. |