Transform any audio—music, podcasts, stories, documentaries, ads—into stunning synchronized videos with AI-generated visuals.
- Universal Transcription: Transcribes any audio content with word-level timing
- AI Visual Generation: Creates cinematic imagery using Google Gemini or DeAPI
- Smart Storyboarding: Analyzes content structure for emotionally-paced visuals
- Multiple Export Options: Cloud render (fast) or browser-based (private)
- Cross-Platform: Web, Android, and iOS support via Capacitor
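
To give a feel for how word-level timing can drive visual pacing, here is a minimal sketch that groups timed words into scenes of bounded length. The `TimedWord` and `Scene` shapes and the `groupIntoScenes` helper are hypothetical names made up for illustration; they are not taken from this project's code, and the actual storyboarding logic may work quite differently.

```typescript
// Hypothetical shape of one transcribed word (times in seconds).
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Hypothetical scene: a time span plus the words spoken during it.
interface Scene {
  start: number;
  end: number;
  text: string;
}

// Group consecutive words into scenes. A new scene begins whenever adding
// the next word would make the current scene longer than maxSeconds.
function groupIntoScenes(words: TimedWord[], maxSeconds: number): Scene[] {
  const scenes: Scene[] = [];
  let current: Scene | null = null;

  for (const word of words) {
    // Close the current scene if this word would push it past the limit.
    if (current !== null && word.end - current.start > maxSeconds) {
      scenes.push(current);
      current = null;
    }
    if (current === null) {
      current = { start: word.start, end: word.end, text: word.text };
    } else {
      current.end = word.end;
      current.text += " " + word.text;
    }
  }

  if (current !== null) {
    scenes.push(current);
  }
  return scenes;
}
```

Each resulting scene could then be paired with one generated image, so that visuals change roughly every `maxSeconds` seconds and always on a word boundary.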
Prerequisites: Node.js v18+, FFmpeg (for server export)
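
If you want to verify the Node.js requirement from a script, a small check along these lines may help. `meetsNodeRequirement` is a hypothetical helper written for this README, not part of the project; it only compares the major version.

```typescript
// Returns true when a Node.js version string such as "18.17.0" or "v20.1.0"
// has a major version of at least minMajor (18 by default).
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  // parseInt stops at the first non-digit, so "18.17.0" yields 18.
  const major = parseInt(version.replace(/^v/, ""), 10);
  return !isNaN(major) && major >= minMajor;
}
```

In a Node.js script it could be called as `meetsNodeRequirement(process.versions.node)`.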
# Install dependencies
npm install
# Set up environment
cp .env.example .env.local
# Edit .env.local and add your GEMINI_API_KEY
# Run the app (frontend + backend)
npm run dev:all

See GEMINI.md for detailed architecture, services, and development conventions.
| Command | Description |
|---|---|
| `npm run dev:all` | Run frontend + backend server |
| `npm run dev` | Frontend only |
| `npm run server` | Backend only |
| `npm run build` | Production build |
| `npm run cap:android` | Open Android Studio |
License: MIT
