Kling 3.0 Omni I2V API
All models are called through the Unified Async API endpoint, POST /v1/tasks; only the input fields differ (see the request parameters below).
Model summary
| Model name | kling-3.0-omni/image-to-video |
|---|---|
| Type | Video generation (image-to-video) |
| Endpoint | POST /v1/tasks |
| Pricing | See HiAPI Pricing |
Kling 3.0 Omni image-to-video API by Kuaishou Kling. Generate cinematic clips from a first-frame image (or a first/last frame pair) with native audio, at 720p, 1080p, or 4K and durations of 3-15 seconds.
Production guidance
- For production, pass callback.url at the top level of the request body so HiAPI can notify your service when the task reaches a terminal state.
- GET /v1/tasks/:id is better for local debugging, low-volume jobs, or fallback reconciliation if a callback is missed.
- Set callback.when=final. Both success and fail are terminal states, so your service should handle both and deduplicate notifications by taskId.
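The deduplication advice above can be sketched as a small idempotent handler. This is a minimal illustration, not HiAPI's documented callback contract: it assumes the callback body carries `taskId` and `status` keys, and the `handle_callback` name and the in-memory set are hypothetical stand-ins for your own handler and a durable store.

```python
from typing import Any, Dict, Set

# Terminal states named in this document.
TERMINAL_STATES = {"success", "fail"}


def handle_callback(payload: Dict[str, Any], seen_task_ids: Set[str]) -> str:
    """Process one callback delivery and report what was done with it.

    Assumption: the payload exposes `taskId` and `status` at the top level.
    """
    task_id = payload.get("taskId")
    status = payload.get("status")

    # Ignore anything that is not a recognizable terminal notification.
    if not task_id or status not in TERMINAL_STATES:
        return "ignored"

    # A repeated delivery for the same task is acknowledged but not reprocessed.
    if task_id in seen_task_ids:
        return "duplicate"

    seen_task_ids.add(task_id)
    return "processed"
```

In a real service the set would likely be replaced by a database table or cache keyed by taskId, so that deduplication survives restarts.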
Best suited for
- Generate motion video from a single first-frame image: style, composition, and subject come from the image. Great for animating static posters or illustrations. (image_urls)
- The first frame sets the scene; the prompt describes motion and camera moves to precisely steer the animation. (image_urls, prompt)
- Pass two images as first and last frames to generate a smooth transition between the two shots. (image_urls)
- Enable native audio and pick 720p / 1080p / 4K to fit placements from social to big screen (billed per second). (sound, resolution)
Request parameters
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | required | Fixed value kling-3.0-omni/image-to-video. |
| input | object | required | Model-specific parameters for Kling 3.0 Omni I2V. |
| input.image_urls | string[] | required | 1-2 image URLs. One URL is the first frame; two URLs are the [first, last] frames. JPEG/PNG/WEBP. |
| input.prompt | string | optional | Text prompt to guide the generated motion and style. |
| input.resolution | enum | optional | Output resolution: 720p, 1080p, or 4K. Higher resolution costs more. |
| input.duration | integer | optional | Clip length in seconds (3-15, default 5). Cost scales with duration. |
| input.sound | boolean | optional | Generate synchronized audio (effects/ambience). Costs more when enabled. |
| callback | object | optional | Callback configuration. When set, HiAPI notifies your service when the task reaches a terminal state. |
| callback.url | string | required if callback is set | HTTPS URL that receives terminal task notifications. |
| callback.when | enum | optional | Callback trigger timing. Use final. |
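The constraints listed above can be checked client-side before a task is submitted. The following is a minimal sketch that assembles a request body and validates it against the limits stated in this document; `build_request` is a hypothetical helper name, and the server may apply further validation that is not covered here.

```python
from typing import Any, Dict, List, Optional

MODEL = "kling-3.0-omni/image-to-video"
RESOLUTIONS = {"720p", "1080p", "4K"}


def build_request(
    image_urls: List[str],
    prompt: Optional[str] = None,
    resolution: Optional[str] = None,
    duration: Optional[int] = None,
    sound: Optional[bool] = None,
    callback_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a POST /v1/tasks body, raising ValueError on invalid input."""
    if not 1 <= len(image_urls) <= 2:
        raise ValueError("image_urls must contain 1 or 2 URLs")
    if resolution is not None and resolution not in RESOLUTIONS:
        raise ValueError("resolution must be one of 720p, 1080p, 4K")
    if duration is not None and not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if callback_url is not None and not callback_url.startswith("https://"):
        raise ValueError("callback.url must be an HTTPS URL")

    # Only include optional fields the caller actually set.
    input_fields: Dict[str, Any] = {"image_urls": list(image_urls)}
    optional = {
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "sound": sound,
    }
    input_fields.update({k: v for k, v in optional.items() if v is not None})

    body: Dict[str, Any] = {"model": MODEL, "input": input_fields}
    if callback_url is not None:
        # callback sits at the top level of the body, next to model and input.
        body["callback"] = {"url": callback_url, "when": "final"}
    return body
```

Omitting unset optional fields leaves the defaults to the service instead of hard-coding them in the client.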
Example requests
First-frame image + 1080p / 5s with native audio: turn one image into a sound-on clip.
{
  "model": "kling-3.0-omni/image-to-video",
  "input": {
    "image_urls": [
      "https://static.hiapi.ai/example/input.jpg"
    ],
    "prompt": "Gentle waves roll in, light sweeps slowly across the scene, cinematic motion",
    "resolution": "1080p",
    "duration": 5,
    "sound": true
  }
}
Two images as first and last frames, 720p / 5s: a smooth transition between shots.
{
  "model": "kling-3.0-omni/image-to-video",
  "input": {
    "image_urls": [
      "https://static.hiapi.ai/example/subject-1.jpg",
      "https://static.hiapi.ai/example/subject-2.jpg"
    ],
    "resolution": "720p",
    "duration": 5,
    "sound": false
  }
}
First-frame image + 4K / 5s ultra-high-resolution clip for big screens and HD delivery.
{
  "model": "kling-3.0-omni/image-to-video",
  "input": {
    "image_urls": [
      "https://static.hiapi.ai/example/input.jpg"
    ],
    "prompt": "Slow camera push-in, clouds drifting, shifting light, cinematic",
    "resolution": "4K",
    "duration": 5,
    "sound": false
  }
}
Getting the result
- The response returns a taskId immediately without waiting for generation to finish.
- In production, rely on the notification delivered to callback.url when the task reaches a terminal state. For local debugging, poll GET /v1/tasks/:id.
- When status=success, download the generated video from output[].url.
- When status=fail, fix the request based on the returned error instead of retrying the same invalid payload.
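For the polling path described above, a loop along these lines may be a reasonable starting point. It is a sketch under stated assumptions: `fetch_task` is a hypothetical caller-supplied function that performs GET /v1/tasks/:id (with whatever base URL and authentication your account uses) and returns the parsed JSON as a dict; the `status` and `output[].url` fields follow this document, while the `error` key is an assumed name for the returned error.

```python
import time
from typing import Any, Callable, Dict, List

TERMINAL_STATES = ("success", "fail")


def wait_for_task(
    fetch_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal state or attempts run out."""
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")


def video_urls(task: Dict[str, Any]) -> List[str]:
    """Return the download URLs from a successful task."""
    if task.get("status") != "success":
        # Do not blindly resubmit; inspect the error and fix the request first.
        raise RuntimeError(f"task did not succeed: {task.get('error')}")
    return [item["url"] for item in task.get("output", [])]
```

Passing the fetch function in keeps the loop independent of any particular HTTP client and makes it easy to exercise with a stub.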
FAQ
How do I pass images, and how many?
image_urls is an array of image URLs: one image is the first frame; two images are [first, last] frames for a transition. JPEG / PNG / WEBP are supported.
Which resolutions and durations does Kling 3.0 Omni image-to-video support?
Resolutions 720p / 1080p / 4K and durations of 3-15 seconds (default 5). Billing depends on resolution, duration (per second), and whether native audio is enabled; 4K costs the most, and enabling sound costs more. See the live pricing page for current rates.
How do I get the generated video?
The response returns a taskId immediately. When the task reaches a terminal state, download the video from output[].url. In production, pass callback.url at the top level to receive terminal notifications and avoid polling.
Does it support text-to-video?
This model requires a first-frame image. For text-only generation, use kling-3.0-omni/text-to-video.