One Shot Audio to Animated Video Generation

N Kumar, S Goel, A Narang, B Lall, M Hasan, P Agarwal, Dipankar Sarkar

arXiv preprint arXiv:2102.09737

We present a novel approach for generating animated videos from a single image, using audio as the driving signal. The method produces realistic talking-head animations by combining one source image with an audio input. This work bridges audio processing and computer animation, with applications in virtual avatars, content creation, and human-computer interaction.

The system uses deep learning to analyze speech patterns and facial movements, translating audio features into natural-looking animation. It requires only one shot (a single image) of the target subject, which makes it practical for real-world applications where multiple images or video of the subject are not available.
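
To make the idea of "translating audio features into animation" concrete, audio-driven animation generally needs an audio feature for each output video frame. The sketch below illustrates only that alignment step, not the paper's actual pipeline: the function name `frame_audio_features`, the 16 kHz sample rate, the 25 fps frame rate, and the choice of log-energy as the feature are all assumptions made for illustration.

```python
import math


def frame_audio_features(samples, sample_rate=16000, fps=25):
    """Split a mono waveform into one window per video frame and
    return a log-energy value for each window.

    Hypothetical helper: the sample rate, frame rate, and log-energy
    feature are illustrative assumptions, not taken from the paper.
    """
    hop = sample_rate // fps          # audio samples per video frame
    n_frames = len(samples) // hop    # drop any trailing partial window
    features = []
    for i in range(n_frames):
        window = samples[i * hop:(i + 1) * hop]
        energy = sum(x * x for x in window) / hop
        # Small constant avoids log(0) on silent windows.
        features.append(math.log(energy + 1e-10))
    return features


# One second of a 220 Hz tone followed by one second of silence.
sr = 16000
tone = [math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]
audio = tone + [0.0] * sr
feats = frame_audio_features(audio, sample_rate=sr, fps=25)
print(len(feats))  # one feature value per video frame
```

In a learned system, a per-frame feature sequence like this would typically be fed to a network that predicts facial motion for the corresponding frame. A real implementation would likely use richer features than log-energy.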

Key contributions:

  • Single-image animation synthesis driven by audio input
  • End-to-end deep learning framework for audio-visual mapping
  • Real-time capable animation generation
  • Preservation of identity and facial features from source image
