API
You can call the API from the programming language of your choice; the example below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/ace-step-music"

# Request payload
data = {
    "genres": "funk, pop, soul, rock, energetic, groovy, 105 BPM",
    "lyrics": "[Verse] \n No brush, no pen, just sparks and code \n You lit the path I’ve never known \n [Chorus] \n You make my mind \n Where chaos turns to perfect lines \n No hands, no rules \n Just visions breaking every rule \n [Outro] \n You’re not a tool, you’re the design \n You’re the storm — you’re Segmind",
    "lyrics_strength": 1,
    "output_seconds": 60,
    "shift": 4,
    "seed": 69822014,
    "steps": 50,
    "cfg": 4,
    "base64": False
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated audio
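With base64 set to False, the response body is raw audio bytes, so you will usually want to write it to a file rather than print it. A minimal sketch of persisting the result, using stand-in bytes in place of a live request (the .wav extension is an assumption; check the response's Content-Type header for the actual format):

```python
def save_audio(content: bytes, path: str) -> str:
    """Write raw audio bytes from the API response to a file."""
    with open(path, "wb") as f:
        f.write(content)
    return path

# Stand-in bytes for illustration; in practice pass response.content
saved = save_audio(b"RIFF....WAVEfmt ", "output.wav")
```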
Attributes
genres
Select musical genres and instruments to inspire the composition. Try 'jazz' for mellow or 'rock' for energetic tunes.
lyrics
Input lyrics for the track.
lyrics_strength
Set how much the lyrics influence the music. Use a higher value for a strong lyrical connection or a lower one for subtle influence.
min: 0.1, max: 10
output_seconds
Decide the track length in seconds. Ideal for short clips at 30 seconds or full-length tracks at 180 seconds.
min: 10, max: 240
shift
Alter the pitch of the music for effects. A setting of 2 shifts moderately, while 5 offers dramatic alterations.
min: 0, max: 10
seed
Use a seed for repeatable results. Changing the seed creates new variations of the music.
steps
Higher step counts may increase quality. Try 50 for balance or 100 for enhanced detail.
min: 10, max: 150
cfg
Controls adherence to the genres prompt. Set to 3 for more freedom or 7 for stronger genre influence.
min: 1, max: 15
base64
Choose base64 output for easy embedding in applications. Useful for web development.
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
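A small sketch of reading that header after a request. The parser works on any dict-like headers object (requests' case-insensitive headers included); the sample value here is made up:

```python
def remaining_credits(headers):
    """Parse the x-remaining-credits response header, if present."""
    value = headers.get("x-remaining-credits")
    return int(value) if value is not None else None

# In practice pass response.headers; a plain dict behaves the same
credits = remaining_credits({"x-remaining-credits": "412"})
```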
Discovering the ACE-Step Music AI Model
The ACE-Step Music AI Model stands as a cutting-edge tool for music generation, empowering developers, creators, and executives to leverage its unique capabilities effectively. Designed as an open-source foundational model, ACE-Step emphasizes speed, quality, and flexible control, making it an ideal choice for a wide array of creative applications.
Harnessing Speed and Efficiency
For fast-paced music production environments, ACE-Step's ability to generate up to 4 minutes of coherent music in just 20 seconds, especially when using an A100 GPU, revolutionizes the creative workflow. This rapid generation is invaluable for developers and creators engaged in real-time prototyping and iterations, enabling swift adaptation to feedback during the production process.
Emphasizing Musical Quality
The model ensures music outputs with remarkable coherence across melody, harmony, and rhythm, preserving detailed acoustics. Creators benefit from its multilingual capabilities, producing high-quality music in up to 19 languages, including English, Spanish, and Japanese, facilitating a global creative reach.
Leveraging Control and Flexibility
Developers and music producers can exploit ACE-Step’s advanced control features, such as lyric-to-vocal conversion, text-to-sample music generation, and voice cloning. These tasks are ideal for creating demo vocals, remixing existing tracks, and crafting customized music arrangements, all while utilizing the model’s user-friendly API and editing tools.
Empowering Effective Use
Taking advantage of ACE-Step requires tailoring prompts to specific needs, such as mood, genre, and style. By integrating with digital audio workstations (DAWs), users can streamline the post-production process, ensuring that ACE-Step becomes a seamless part of their creative pipeline. This model serves as a robust platform for those aiming to enhance their creative capabilities with generative AI.
Discovering the ACE-Step Music AI Model: An Effective User Guide
ACE-Step accelerates high-quality music generation with intuitive controls and lightning-fast performance. Whether you’re prototyping electronic soundscapes, composing demo vocals, or scoring cinematic scenes, these best practices and parameter presets will help you achieve stunning results.
Defining Your Creative Goal
- Energetic Dance Track: Set genres to “electronic, house, 128 BPM, energetic, driving.”
- Ambient or Film Score: Use “ambient, cinematic, orchestral, 60 BPM, evolving, atmospheric.”
- Pop Demo with Lyrics: Combine “pop, acoustic, 100 BPM, bright, intimate” plus uplifting lyrics.
Core Parameters
- genres (required): Clearly list styles, BPM, and mood. Aim for 3–5 descriptors.
- output_seconds (required): 30–60 sec for trailers and ads, 120–180 sec for full demos.
- seed: Fix to a number (e.g., 123456) for reproducibility, or leave blank for endless variety.
Fine-Tuning Quality vs. Speed
- steps (advanced): Increase to 100–150 for polished audio; lower to 10–50 for rapid prototyping.
- cfg (advanced): 5–7 for strong adherence to genre prompts; 3–5 for more creative freedom.
- lyrics_strength (advanced): 7–10 to force melody around provided words; 1–3 for subtle vocal hints.
Specialized Controls
- lyrics: Paste verse and chorus text to generate vocalized tracks. Best for demo vocals and jingles.
- shift (pitch_shift, advanced): 0–2 semitones for natural vocal variation; 3–10 for creative sound design.
- base64 (advanced): Enable if you need to embed audio directly into web pages or JSON APIs.
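When base64 is enabled, the audio arrives as a base64 string instead of raw bytes, so it must be decoded before playback or saving. A round-trip sketch with stand-in bytes (the exact shape of the base64 response is an assumption; adjust to the field the API actually returns):

```python
import base64

def decode_base64_audio(b64_string: str) -> bytes:
    """Decode a base64-encoded audio payload back to raw bytes."""
    return base64.b64decode(b64_string)

# Round-trip demo: encode stand-in bytes, then decode them back
encoded = base64.b64encode(b"fake-audio-bytes").decode("utf-8")
audio = decode_base64_audio(encoded)
```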
Typical Presets by Use Case
- Fast Prototyping (20 sec demo): output_seconds=20, steps=30, cfg=5, seed=random.
- Full-Length Demo (120 sec track): output_seconds=120, steps=100, cfg=7, seed=123456.
- Lyric-Driven Pop Demo: genres="pop, mid-tempo, warm", lyrics_strength=8, lyrics=your text, steps=120.
- Cinematic Underscore: genres="orchestral, evolving, ambient", output_seconds=180, cfg=6, shift=1.
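The presets above translate directly into request payloads. A sketch of keeping them as reusable dicts with per-request overrides (parameter names mirror the API example earlier; the genre strings are illustrative):

```python
FAST_PROTOTYPE = {
    "genres": "electronic, house, 128 BPM, energetic",
    "output_seconds": 20,
    "steps": 30,
    "cfg": 5,
}

FULL_DEMO = {
    "genres": "pop, acoustic, 100 BPM, bright, intimate",
    "output_seconds": 120,
    "steps": 100,
    "cfg": 7,
    "seed": 123456,
}

def build_payload(preset, **overrides):
    """Merge a preset with per-request overrides (overrides win)."""
    return {**preset, **overrides}

payload = build_payload(FULL_DEMO, seed=99)
```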
Workflow Integration
Seamlessly import ACE-Step outputs into your DAW for layering and mixing. For collaborative projects, share the JSON parameter block to ensure consistent results across your team.
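Sharing a parameter block amounts to serializing the request payload to JSON and loading it back on the other end; a sketch of the round trip with an illustrative payload:

```python
import json

params = {"genres": "orchestral, ambient", "output_seconds": 180, "cfg": 6, "seed": 123456}

# Serialize for sharing with teammates...
shared = json.dumps(params, indent=2, sort_keys=True)
# ...and load it back to reproduce the exact same request
restored = json.loads(shared)
```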
By combining precise prompting with the right balance of advanced settings, ACE-Step empowers you to generate music that aligns with any creative vision—faster and with greater musical coherence than ever before. Experiment with seeds, CFG scales, and lyric strengths until you find your perfect sound.