Overview

AI Talking Photo brings static photos to life by animating faces to speak with realistic lip-sync and natural facial movements. The API analyzes facial features and synchronizes mouth movements, head poses, and expressions with provided audio or generated speech.

How It Works

  1. Provide a photo - Upload an image with a clear face
  2. Add audio - Upload audio or provide text for speech generation
  3. API animates - AI creates realistic lip-sync and facial movements
  4. Download video - Retrieve your animated talking photo

Use Cases

  • Marketing videos - Create spokesperson videos from headshots
  • Educational content - Animate historical figures or characters
  • Personalized messages - Send video messages from static photos
  • Social media - Create engaging content from profile pictures
  • Presentations - Add dynamic talking heads to slides

Best Practices

Photo Selection

Use clear, front-facing photos - Best results come from high-quality headshots with visible facial features.
  • Good lighting - Well-lit faces produce better animations
  • Front-facing angles - Avoid extreme profile shots
  • Clear features - Eyes, nose, and mouth should be unobstructed
  • High resolution - At least 512x512 pixels recommended
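
The resolution guideline above can be checked locally before uploading. The following is a minimal sketch that reads the width and height from a PNG header using only the standard library. The helper names `png_dimensions` and `meets_min_resolution` are illustrative and not part of the Magic Hour SDK, and the sketch handles PNG files only.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 512  # recommended minimum side length from the guideline above


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian unsigned 32-bit integers at bytes 16-23.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_min_resolution(data: bytes, min_side: int = MIN_SIDE) -> bool:
    """True when both sides are at least min_side pixels (hypothetical helper)."""
    width, height = png_dimensions(data)
    return width >= min_side and height >= min_side
```

To use it, read the first 24 bytes of a local file (`open(path, "rb").read(24)`) and pass them to `meets_min_resolution`. Other image formats would need their own header parsing or an imaging library.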

Audio Guidelines

Audio Type | Best Practice
Voice recording | Clear speech without background noise
Generated speech | Use natural-sounding text prompts
Music/songs | Works best with clear vocals
Length | Keep under 30 seconds for best results
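
One way to apply the length guideline is to compute the trim window before submitting a job. The sketch below caps the selected audio at 30 seconds and returns values that could be passed as `start_seconds` and `end_seconds`, the parameters used in the code examples in this document. The `clip_window` helper and the `MAX_CLIP_SECONDS` constant are hypothetical names, and the audio duration is assumed to be known already.

```python
from typing import Tuple

MAX_CLIP_SECONDS = 30.0  # assumption: the 30-second guideline treated as a hard cap


def clip_window(
    audio_seconds: float,
    start_seconds: float = 0.0,
    max_seconds: float = MAX_CLIP_SECONDS,
) -> Tuple[float, float]:
    """Return (start_seconds, end_seconds) covering at most max_seconds of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if not 0 <= start_seconds < audio_seconds:
        raise ValueError("start_seconds must fall inside the audio")
    # End at whichever comes first: the end of the audio or the length cap.
    end_seconds = min(audio_seconds, start_seconds + max_seconds)
    return start_seconds, end_seconds
```

For example, `clip_window(45.0)` yields `(0.0, 30.0)`, while a 12.5-second recording is left untrimmed.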

Code Examples

Basic Talking Photo with Trimmed Audio

from os import getenv

from magic_hour import Client

# Read the API key from the API_TOKEN environment variable.
client = Client(token=getenv("API_TOKEN"))

result = client.v1.ai_talking_photo.generate(
    assets={
        "image_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/tomcruise.png",
        "audio_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/you-are-just-a-line-of-code.mp3",
    },
    name="Talking Photo",
    # Use only the first 2 seconds of the audio.
    start_seconds=0,
    end_seconds=2,
    wait_for_completion=True,  # block until the job finishes
    download_outputs=True,  # save the result locally
    download_directory="outputs",
)

if result.status == "complete":
    print("✅ Talking photo complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, "error_message"):
        print(f"Error: {result.error_message}")

With Audio File

# Reuses the `client` created in the previous example.
# Unlike that example, no start_seconds/end_seconds are passed here.
result = client.v1.ai_talking_photo.generate(
    assets={
        "image_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/tomcruise.png",
        "audio_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/you-are-just-a-line-of-code.mp3",
    },
    name="Talking Photo",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="outputs",
)

if result.status == "complete":
    print("✅ Talking photo complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, "error_message"):
        print(f"Error: {result.error_message}")

Pricing

Talking Photo pricing varies by video length and resolution:

Configuration | Credits per Second
720p or lower | ~10-15 credits/sec
Higher resolution | ~20-30 credits/sec
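
As a rough illustration of the per-second rates above, the sketch below estimates a credit range for a clip of a given length. The rates are the approximate "720p or lower" figures from the pricing table, the `estimate_credits` helper is a hypothetical name, and rounding up to whole credits is an assumption. The actual charge is reported by `result.credits_charged` after a job completes.

```python
import math
from typing import Tuple

# Approximate "720p or lower" rates from the pricing table, in credits per second.
LOW_RATE = 10
HIGH_RATE = 15


def estimate_credits(
    seconds: float, low_rate: float = LOW_RATE, high_rate: float = HIGH_RATE
) -> Tuple[int, int]:
    """Return an estimated (minimum, maximum) credit cost for a clip."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Assumption: partial credits are rounded up to the next whole credit.
    return math.ceil(seconds * low_rate), math.ceil(seconds * high_rate)
```

Under these assumptions, a 2-second clip like the one in the first code example would fall between 20 and 30 credits, and a 30-second clip between 300 and 450.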

Resolution Limits

AI Talking Photo has a maximum resolution of 720p across all subscription tiers due to computational requirements.

Try this in our Google Colab Cookbook: run this API with sample code. Just add your API key.

API Reference

AI Talking Photo API Reference: view the full API specification.