Overview

AI Voice Cloner creates realistic speech audio from text by cloning voices from sample audio files. The API analyzes the unique characteristics of a voice sample and produces high-quality voice synthesis with natural intonation, emotion, and speaking patterns that closely match the original voice.

How It Works

  1. Upload voice sample - Provide an audio file of the voice you want to clone
  2. Write the text - Enter the text you want spoken in the cloned voice
  3. API clones and generates - AI analyzes the voice and synthesizes speech
  4. Download the result - Retrieve your audio file with the cloned voice

Use Cases

  • Content creation - Clone your own voice for consistent narration
  • Voice preservation - Preserve voices of loved ones or historical figures
  • Multilingual content - Create content in your voice across languages
  • Personalized assistants - Custom voice for virtual assistants and chatbots
  • Accessibility - Text-to-speech with your own voice
  • Entertainment - Create audio content with specific voice characteristics

Voice Sample Requirements

For best results, your voice sample should:
  • Be clear and high-quality - Minimal background noise
  • Be 10-30 seconds long - Enough audio for the AI to learn the voice
  • Contain natural speech - Conversational tone works best
  • Be single-speaker - Only one person speaking in the sample
  • Have good enunciation - Clear pronunciation of words
Quality tip - Use a quiet environment and a good microphone for recording your voice sample.
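
The 10-30 second guideline above can be checked locally before uploading a sample. The sketch below is a minimal example that assumes the sample is an uncompressed WAV file, since Python's standard `wave` module does not read MP3. `check_sample_duration` is a hypothetical helper name and is not part of the Magic Hour SDK.

```python
import wave


def check_sample_duration(path: str, min_seconds: float = 10.0, max_seconds: float = 30.0) -> list:
    """Return a list of problems with the sample's length (empty if none)."""
    # Duration = number of audio frames divided by frames per second.
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()

    problems = []
    if duration < min_seconds:
        problems.append(f"Sample is {duration:.1f}s; aim for at least {min_seconds:.0f}s.")
    elif duration > max_seconds:
        problems.append(f"Sample is {duration:.1f}s; aim for at most {max_seconds:.0f}s.")
    return problems
```

An empty list means the length falls inside the recommended range. The check says nothing about noise, enunciation, or the number of speakers, which still need a listen.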

Best Practices

Voice Sample Guidelines

Record naturally - Speak in your natural tone and pace for the most authentic cloning.
  • Natural speech - Read naturally, not robotically
  • Good audio quality - Use proper recording equipment
  • Consistent volume - Maintain steady speaking volume
  • No background noise - Record in a quiet environment
  • Varied intonation - Include different emotions and tones

Text Guidelines

  • Natural punctuation - Use periods, commas, and question marks for natural pauses
  • Avoid abbreviations - Write out words fully (e.g., “Mister” not “Mr.”)
  • Short sentences - Break long text into shorter, digestible sentences
  • Read it aloud - Test how text sounds before generating
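
The text guidelines above can be applied with a small preprocessing step before the text is sent as the prompt. This is a rough sketch, not an SDK feature: the abbreviation map and the helper names `expand_abbreviations` and `split_sentences` are assumptions made for illustration, and the map would need extending for real content.

```python
import re

# Hypothetical abbreviation map; extend it for your own content.
ABBREVIATIONS = {
    "Mr.": "Mister",
    "Mrs.": "Missus",
    "Dr.": "Doctor",
    "vs.": "versus",
}


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation with its spelled-out form."""
    for short, full in ABBREVIATIONS.items():
        # (?<!\w) stops the abbreviation from matching inside a longer word.
        text = re.sub(r"(?<!\w)" + re.escape(short), full, text)
    return text


def split_sentences(text: str) -> list:
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
```

Expanding abbreviations first keeps a string such as "Mr. Smith" from being split in two at the period. An abbreviation that ends a sentence loses its period in this sketch, so review the output before generating.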

Optimizing Output Quality

Technique            | Why It Helps
High-quality sample  | Better voice characteristics captured
Natural punctuation  | Creates appropriate pauses and intonation
Phonetic spelling    | Helps with unusual words or names
Sentence breaks      | Improves rhythm and comprehension

Text Length Considerations

  • Short clips - 1-2 sentences work great for social media
  • Medium content - Paragraphs work well for narration
  • Long content - Consider breaking into multiple clips for variety
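
One way to break long content into multiple clips is to group whole sentences into chunks and generate one clip per chunk. The sketch below assumes a character budget per clip; the 300-character default and the name `chunk_text` are illustrative choices, not limits documented for the API.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed as the prompt of a separate generation call, so no clip is cut mid-sentence.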

Code Examples

Basic Voice Cloning

from magic_hour import Client
from os import getenv

client = Client(token=getenv("API_TOKEN"))

result = client.v1.ai_voice_cloner.generate(
    assets={
        "audio_file_path": "path/to/voice_sample.mp3"
    },
    style={
        "prompt": "Hello! This is a test of the Magic Hour voice cloner. Pretty cool, right?"
    },
    name="Voice Cloner audio",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="."
)

if result.status == "complete":
    print("✅ Voice cloning complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, 'error_message'):
        print(f"Error: {result.error_message}")

Clone Voice with URL

# Reuses the `client` created in the Basic Voice Cloning example
result = client.v1.ai_voice_cloner.generate(
    assets={
        "audio_file_path": "https://example.com/voice_sample.mp3"
    },
    style={
        "prompt": "Welcome to the show! Today we're going to talk about something really exciting."
    },
    name="Voice Cloner audio",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="."
)

if result.status == "complete":
    print("✅ Voice cloning complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, 'error_message'):
        print(f"Error: {result.error_message}")

Professional Narration

# Reuses the `client` created in the Basic Voice Cloning example
result = client.v1.ai_voice_cloner.generate(
    assets={
        "audio_file_path": "path/to/narrator_voice.mp3"
    },
    style={
        "prompt": "In a world where technology meets creativity, Magic Hour brings your ideas to life."
    },
    name="Voice Cloner audio",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="."
)

if result.status == "complete":
    print("✅ Voice cloning complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, 'error_message'):
        print(f"Error: {result.error_message}")

Combining with Lip Sync

Clone a voice, then sync it to a video:

from os import getenv
from pathlib import Path

from magic_hour import Client

client = Client(token=getenv("API_TOKEN"))

# Make sure download dirs exist
Path("temp").mkdir(parents=True, exist_ok=True)
Path("outputs").mkdir(parents=True, exist_ok=True)

# Step 1: Clone the voice
voice_result = client.v1.ai_voice_cloner.generate(
    assets={
        "audio_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/you-are-just-a-line-of-code.mp3"
    },
    style={
        "prompt": "This is my custom voiceover for the video."
    },
    name="Voice Cloner audio",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="temp",  # intermediate audio goes in the temp dir created above
)

# Step 2: Use the generated audio for lip sync
lip_sync_result = client.v1.lip_sync.generate(
    assets={
        "video_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/sideeyegirl.mp4",
        "audio_file_path": voice_result.downloaded_paths[0],  # local path now exists
        "video_source": "file",  # see note below
    },
    start_seconds=0,
    end_seconds=2,
    max_fps_limit=30,
    style={"generation_mode": "lite"},
    name="Lip Synced Video",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="outputs",  # final video goes in the outputs dir created above
)

print("voice_status:", voice_result.status, "paths:", getattr(voice_result, "downloaded_paths", None))
print("lip_status:", lip_sync_result.status, "paths:", getattr(lip_sync_result, "downloaded_paths", None))

Pricing

Voice cloning uses credits based on text length and processing:

Text Length                 | Approximate Credits
Short (1-2 sentences)       | ~10 credits
Medium (paragraph)          | ~20-30 credits
Long (multiple paragraphs)  | ~50+ credits

Try this in our Google Colab Cookbook: Run this API with sample code. Just add your API key.
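
For rough budgeting, the tiers in the pricing table can be turned into a lookup. This is only a sketch: the tier boundaries are inferred from the table labels (sentence and paragraph counts), `estimate_credits` is a hypothetical helper, and the authoritative figure is the `credits_charged` value returned with each result.

```python
import re
from typing import Optional, Tuple


def estimate_credits(text: str) -> Tuple[int, Optional[int]]:
    """Return an approximate (low, high) credit range; high is None when open-ended."""
    stripped = text.strip()
    # Paragraphs are separated by blank lines; sentences end in '.', '!' or '?'.
    paragraphs = [p for p in re.split(r"\n\s*\n", stripped) if p.strip()]
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", stripped) if s]

    if len(paragraphs) > 1:
        return (50, None)  # Long: multiple paragraphs, ~50+ credits
    if len(sentences) <= 2:
        return (10, 10)    # Short: 1-2 sentences, ~10 credits
    return (20, 30)        # Medium: a single paragraph, ~20-30 credits
```

Comparing the estimate against `credits_charged` on a few real jobs is the easiest way to see how well these inferred tiers match actual billing.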

API Reference

  • AI Voice Cloner API Reference - View full API specification
  • Voice Generator - Generate speech with celebrity and character voices
  • Lip Sync - Sync generated audio with video lip movements
  • Animation - Create animated videos with audio