Viral Talking Baby AI Video Maker

Instantly create realistic talking baby videos, powered by Google's Veo 3 Fast

If you're looking for an API, here is some sample Node.js code to help you out.

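const axios = require('axios');

const api_key = "YOUR API KEY";
const url = "https://api.segmind.com/workflows/68a56739c49c91c2edbbb81e-v1";

// The workflow takes a single input: a text description of the scene
const data = {
  Scene_Details: "the user input string"
};

// Submit the request; the response contains a poll_url for the result
axios.post(url, data, {
  headers: {
    'x-api-key': api_key,
    'Content-Type': 'application/json'
  }
}).then((response) => {
  console.log(response.data);
}).catch((error) => {
  // Log the response body if the server replied, otherwise the error message
  console.error(error.response ? error.response.data : error.message);
});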

Response

application/json

{
  "poll_url": "<base_url>/requests/<some_request_id>",
  "request_id": "some_request_id",
  "status": "QUEUED"
}

You can poll the above link to get the status and output of your request.

Response

application/json

{
  "Video_Output": "any user input string"
}
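
The submit-then-poll flow described above could be sketched roughly as follows. This is a minimal illustration, not the official client. The helper name pollUntilDone, the "PROCESSING" status, the polling interval and the attempt limit are all assumptions; the only status the page confirms is "QUEUED". The HTTP call is passed in as a function, so any client can be used.

```javascript
// Sketch: poll a status URL until the request is no longer pending.
// ASSUMPTIONS: "PROCESSING" as a second pending status, the 5 s interval
// and the 60-attempt limit are illustrative, not documented values.
const PENDING_STATUSES = new Set(["QUEUED", "PROCESSING"]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// getJson: async (url) => parsed JSON body; injected so any HTTP client works.
async function pollUntilDone(pollUrl, getJson, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const body = await getJson(pollUrl);
    // Any body without a known pending status is treated as the final result
    if (!PENDING_STATUSES.has(body.status)) {
      return body;
    }
    // Wait before the next attempt, but not after the last one
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Gave up after ${maxAttempts} polls of ${pollUrl}`);
}

// Possible usage with axios (assumes the poll endpoint accepts the same
// x-api-key header as the submit call):
//   const getJson = (u) =>
//     axios.get(u, { headers: { 'x-api-key': api_key } }).then((r) => r.data);
//   pollUntilDone(response.data.poll_url, getJson)
//     .then((result) => console.log(result.Video_Output));
```

Passing the HTTP call in as a parameter keeps the retry loop independent of any particular library and makes it easy to exercise with canned responses.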

Attributes

Scene_Details (str, required)

To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
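
As a rough illustration of reading that header, the helper below takes a headers object (such as response.headers from axios, which exposes lower-cased header names) and returns the value as a number. The function name remainingCredits is hypothetical, and the sketch assumes the header value is numeric.

```javascript
// Sketch: extract the remaining-credit count from a headers object.
// ASSUMPTION: the x-remaining-credits value parses as a number.
function remainingCredits(headers) {
  const raw = headers ? headers["x-remaining-credits"] : undefined;
  if (raw === undefined || raw === null || raw === "") {
    return null; // header not present
  }
  const value = Number(raw);
  return Number.isFinite(value) ? value : null; // null if not numeric
}

// Possible usage inside the .then() of the sample request:
//   const credits = remainingCredits(response.headers);
//   if (credits !== null) console.log(`Credits left: ${credits}`);
```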

Viral Baby Talking video generator

The internet is suddenly filled with kids and toddlers giving interviews and speaking on podcasts. Recent developments in AI video generators have enabled creators to generate these cute, funny videos to make your day. This workflow lets you generate such videos from a single prompt.

We use Google's text-to-video model Veo 3 Fast to generate talking baby videos from a simple prompt. The only required input is a prompt that contains the child's age and gender, the scene, and the dialogue. We use Claude Sonnet to generate the prompts for two videos of 8 seconds each, and then use a node to merge the two videos together.
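
One way to make sure the prompt carries all four pieces of information is to assemble the Scene_Details string from separate fields. The sketch below is only an illustration: the function name buildSceneDetails and the "Age: ... Gender: ..." layout are assumptions, since the workflow accepts free-form text and does not document a required format.

```javascript
// Sketch: build a Scene_Details string from age, gender, scene and dialogue.
// ASSUMPTION: the sentence layout is illustrative; any free-form prompt
// containing the same information should serve equally well.
function buildSceneDetails(details) {
  const fields = ["age", "gender", "scene", "dialogue"];
  const clean = {};
  for (const name of fields) {
    const value = details[name] == null ? "" : String(details[name]).trim();
    if (value === "") {
      throw new Error(`Missing scene detail: ${name}`);
    }
    clean[name] = value;
  }
  return `Age: ${clean.age}. Gender: ${clean.gender}. ` +
    `Scene: ${clean.scene}. Dialogue: "${clean.dialogue}"`;
}

// Possible usage with the sample request:
//   const data = { Scene_Details: buildSceneDetails({ age: "2 years",
//     gender: "girl", scene: "podcast studio", dialogue: "Hello!" }) };
```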

How this works

We ask the user for the basic details and pass them to the Claude Sonnet LLM, which generates two prompts, one for each scene. Each prompt includes all the context, including the dialogue, that Veo 3 needs to generate the video. We enable the "Generate Audio" feature so that the audio is generated automatically and synced to the video. Once we have the two videos, we merge them into a single 720p, 16-second video.

Please note that Veo 3 may throw errors when it detects content that does not meet Google's policy. We have seen this happen more often when an image is fed to the Veo 3 model, so this workflow generates the video directly from a prompt.

How to extend this workflow

You can add ESRGAN Video or Topaz Video Upscaler to improve the detail and resolution of the video.
Replace Veo 3 Fast with Veo 3 to get higher-quality videos. Veo 3 Fast is a more affordable version of Veo 3 and should be good enough, but Veo 3 gives you the option of higher-quality output.

Models Used in the Pixelflow

veo-3-fast

Veo 3 Fast rapidly creates high-quality, 8-second videos with synchronized audio for diverse content needs.

llama4-scout-instruct-basic

Unlock powerful multimodal AI with Llama 4 Scout basic, a model with 17 billion active parameters offering leading text and image understanding.

claude-3.7-sonnet

Claude 3.7 Sonnet is a large language model (LLM) launched by Anthropic. It is considered state-of-the-art, outperforming previous versions of Claude and competing models in a variety of tasks.

video-stitch

Revolutionize your video editing with the Video Stitch Model. Seamlessly stitch clips, add captivating audio, and create professional-looking videos in minutes.