API
You can call the API from the programming language of your choice; the example below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/ace-step-music"

# Request payload
data = {
    "genres": "funk, pop, soul, rock, energetic, groovy, 105 BPM",
    "lyrics": "[Verse] \n No brush, no pen, just sparks and code \n You lit the path I’ve never known \n [Chorus] \n You make my mind \n Where chaos turns to perfect lines \n No hands, no rules \n Just visions breaking every rule \n [Outro] \n You’re not a tool, you’re the design \n You’re the storm — you’re Segmind",
    "lyrics_strength": 1,
    "output_seconds": 60,
    "shift": 4,
    "seed": 69822014,
    "steps": 50,
    "cfg": 4,
    "base64": False
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated audio
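With base64 set to False, the response body is raw audio bytes, so you will usually want to write it to a file rather than print it. A minimal sketch of persisting the result, using stand-in bytes in place of a live request (the .wav extension is an assumption; check the response's Content-Type header for the actual format):

```python
def save_audio(content: bytes, path: str) -> str:
    """Write raw audio bytes from the API response to a file."""
    with open(path, "wb") as f:
        f.write(content)
    return path

# Stand-in bytes for illustration; in practice pass response.content
saved = save_audio(b"RIFF....WAVEfmt ", "output.wav")
```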
Attributes
genres
Select musical genres and instruments to inspire the composition. Try 'jazz' for mellow or 'rock' for energetic tunes.
lyrics
Input lyrics for the track.
lyrics_strength
Set how much the lyrics influence the music. Use a higher value for a strong lyrical connection or a lower one for subtle influence.
min: 0.1, max: 10
output_seconds
Decide the track length in seconds. Ideal for short clips at 30 seconds or full-length tracks at 180 seconds.
min: 10, max: 240
shift
Alter the pitch of the music for effects. A setting of 2 shifts moderately, while 5 offers dramatic alterations.
min: 0, max: 10
seed
Use a seed for repeatable results. Changing the seed creates new variations of the music.
steps
Higher step counts may increase quality. Try 50 for balance or 100 for enhanced detail.
min: 10, max: 150
cfg
Controls adherence to the genres prompt. Set to 3 for more freedom or 7 for stronger genre influence.
min: 1, max: 15
base64
Choose base64 output for easy embedding in applications. Useful for web development.
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
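A small sketch of reading that header after a request. The parser works on any dict-like headers object (requests' case-insensitive headers included); the sample value here is made up:

```python
def remaining_credits(headers):
    """Parse the x-remaining-credits response header, if present."""
    value = headers.get("x-remaining-credits")
    return int(value) if value is not None else None

# In practice pass response.headers; a plain dict behaves the same
credits = remaining_credits({"x-remaining-credits": "412"})
```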
Discovering the ACE-Step Music AI Model
The ACE-Step Music AI Model stands as a cutting-edge tool for music generation, empowering developers, creators, and executives to leverage its unique capabilities effectively. Designed as an open-source foundational model, ACE-Step emphasizes speed, quality, and flexible control, making it an ideal choice for a wide array of creative applications.
Harnessing Speed and Efficiency
For fast-paced music production environments, ACE-Step's ability to generate up to 4 minutes of coherent music in just 20 seconds, especially when using an A100 GPU, revolutionizes the creative workflow. This rapid generation is invaluable for developers and creators engaged in real-time prototyping and iterations, enabling swift adaptation to feedback during the production process.
Emphasizing Musical Quality
The model ensures music outputs with remarkable coherence across melody, harmony, and rhythm, preserving detailed acoustics. Creators benefit from its multilingual capabilities, producing high-quality music in up to 19 languages, including English, Spanish, and Japanese, facilitating a global creative reach.
Leveraging Control and Flexibility
Developers and music producers can exploit ACE-Step’s advanced control features, such as lyric-to-vocal conversion, text-to-sample music generation, and voice cloning. These tasks are ideal for creating demo vocals, remixing existing tracks, and crafting customized music arrangements, all while utilizing the model’s user-friendly API and editing tools.
Empowering Effective Use
Taking advantage of ACE-Step requires tailoring prompts to specific needs, such as mood, genre, and style. By integrating with digital audio workstations (DAWs), users can streamline the post-production process, ensuring that ACE-Step becomes a seamless part of their creative pipeline. This model serves as a robust platform for those aiming to enhance their creative capabilities with generative AI.
Discovering the ACE-Step Music AI Model: An Effective User Guide
ACE-Step accelerates high-quality music generation with intuitive controls and lightning-fast performance. Whether you’re prototyping electronic soundscapes, composing demo vocals, or scoring cinematic scenes, these best practices and parameter presets will help you achieve stunning results.
Defining Your Creative Goal
- Energetic Dance Track: Set genres to “electronic, house, 128 BPM, energetic, driving.”
- Ambient or Film Score: Use “ambient, cinematic, orchestral, 60 BPM, evolving, atmospheric.”
- Pop Demo with Lyrics: Combine “pop, acoustic, 100 BPM, bright, intimate” plus uplifting lyrics.
Core Parameters
- genres (required): Clearly list styles, BPM, and mood. Aim for 3–5 descriptors.
- output_seconds (required): 30–60 sec for trailers and ads, 120–180 sec for full demos.
- seed: Fix to a number (e.g., 123456) for reproducibility, or leave blank for endless variety.
Fine-Tuning Quality vs. Speed
- steps (advanced): Increase to 100–150 for polished audio; lower to 10–50 for rapid prototyping.
- cfg (advanced): 5–7 for strong adherence to genre prompts; 3–5 for more creative freedom.
- lyrics_strength (advanced): 7–10 to force melody around provided words; 1–3 for subtle vocal hints.
Specialized Controls
- lyrics: Paste verse and chorus text to generate vocalized tracks. Best for demo vocals and jingles.
- shift (pitch_shift, advanced): 0–2 semitones for natural vocal variation; 3–10 for creative sound design.
- base64 (advanced): Enable if you need to embed audio directly into web pages or JSON APIs.
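When base64 is enabled, the audio arrives as a base64 string instead of raw bytes, so it must be decoded before playback or saving. A round-trip sketch with stand-in bytes (the exact shape of the base64 response is an assumption; adjust to the field the API actually returns):

```python
import base64

def decode_base64_audio(b64_string: str) -> bytes:
    """Decode a base64-encoded audio payload back to raw bytes."""
    return base64.b64decode(b64_string)

# Round-trip demo: encode stand-in bytes, then decode them back
encoded = base64.b64encode(b"fake-audio-bytes").decode("utf-8")
audio = decode_base64_audio(encoded)
```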
Typical Presets by Use Case
- Fast Prototyping (20 sec demo): output_seconds=20, steps=30, cfg=5, seed=random.
- Full-Length Demo (120 sec track): output_seconds=120, steps=100, cfg=7, seed=123456.
- Lyric-Driven Pop Demo: genres="pop, mid-tempo, warm", lyrics_strength=8, lyrics=your text, steps=120.
- Cinematic Underscore: genres="orchestral, evolving, ambient", output_seconds=180, cfg=6, shift=1.
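The presets above translate directly into request payloads. A sketch of keeping them as reusable dicts with per-request overrides (parameter names mirror the API example earlier; the genre strings are illustrative):

```python
FAST_PROTOTYPE = {
    "genres": "electronic, house, 128 BPM, energetic",
    "output_seconds": 20,
    "steps": 30,
    "cfg": 5,
}

FULL_DEMO = {
    "genres": "pop, acoustic, 100 BPM, bright, intimate",
    "output_seconds": 120,
    "steps": 100,
    "cfg": 7,
    "seed": 123456,
}

def build_payload(preset, **overrides):
    """Merge a preset with per-request overrides (overrides win)."""
    return {**preset, **overrides}

payload = build_payload(FULL_DEMO, seed=99)
```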
Workflow Integration
Seamlessly import ACE-Step outputs into your DAW for layering and mixing. For collaborative projects, share the JSON parameter block to ensure consistent results across your team.
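Sharing a parameter block amounts to serializing the request payload to JSON and loading it back on the other end; a sketch of the round trip with an illustrative payload:

```python
import json

params = {"genres": "orchestral, ambient", "output_seconds": 180, "cfg": 6, "seed": 123456}

# Serialize for sharing with teammates...
shared = json.dumps(params, indent=2, sort_keys=True)
# ...and load it back to reproduce the exact same request
restored = json.loads(shared)
```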
By combining precise prompting with the right balance of advanced settings, ACE-Step empowers you to generate music that aligns with any creative vision—faster and with greater musical coherence than ever before. Experiment with seeds, CFG scales, and lyric strengths until you find your perfect sound.