
HappyHorse 1.1 R2V API

POST /v1/tasks

All models are called through the Unified Async API POST /v1/tasks endpoint; only the input fields differ (see input parameters below).

Model summary

Model name: happyhorse-1.1/reference-to-video
Type: Video generation (reference-to-video)
Endpoint: POST /v1/tasks
Pricing: See HiAPI Pricing

HappyHorse 1.1 reference-to-video API by Alibaba. Generate short videos with native audio from up to 9 reference images. Refer to each image in the prompt as [Image 1], [Image 2], and so on to keep subject, scene, and style consistent across shots.

Production guidance

  • For production, pass callback.url at the top level of the request body so HiAPI can notify your service when the task reaches a terminal state.
  • GET /v1/tasks/:id is better for local debugging, low-volume jobs, or fallback reconciliation if a callback is missed.
  • Use callback.when=final. Both success and fail are terminal states, so your service should deduplicate by taskId.
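The deduplication advice above can be sketched as a small handler. This is a minimal illustration, not HiAPI's callback contract: it assumes the notification body carries `taskId` and `status` fields like the task object does, and `handle_callback` and the in-memory `seen_task_ids` set are hypothetical names. A real service would likely back the set with a database or cache.

```python
from typing import Any, Dict, Set

TERMINAL_STATES = {"success", "fail"}


def handle_callback(payload: Dict[str, Any], seen_task_ids: Set[str]) -> str:
    """Process a terminal notification at most once per taskId.

    Assumption: the callback body exposes "taskId" and "status".
    """
    status = payload.get("status")
    if status not in TERMINAL_STATES:
        # Not a terminal state; nothing to record yet.
        return "ignored"

    task_id = payload["taskId"]
    if task_id in seen_task_ids:
        # A retry or duplicate delivery of a notification already handled.
        return "duplicate"

    seen_task_ids.add(task_id)
    return "completed" if status == "success" else "failed"
```

Keeping the dedupe key on `taskId` rather than on the whole payload means a redelivered notification is skipped even if other fields differ between deliveries.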

Best suited for

Precise multi-reference control

Up to 9 reference images for precise control of subject, wardrobe, and style.

reference_image
Name references in the prompt

Use [Image N] in the prompt to point at a specific reference image and compose new scenes.

prompt, reference_image
Cross-shot consistency

Keep character and style stable across shots, ideal for series and continuous narrative.

reference_image
Landscape and portrait ratios

Nine aspect ratios let one set of assets adapt to landscape and vertical placements.

aspect_ratio

Request parameters

model string required

Fixed value happyhorse-1.1/reference-to-video.

example happyhorse-1.1/reference-to-video
input object required

Business parameters. Put HappyHorse 1.1 R2V-specific configuration here.

prompt string required

Text prompt. Reference images in the prompt as [Image 1], [Image 2] ... in the same order as reference_image.

reference_image string[] required

1-9 reference image URLs. Supported formats are JPEG, PNG, and WEBP; each image must have a shortest side >= 400px and be <= 20MB.
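A client-side check can catch mismatched markers before a task is submitted. The sketch below assumes markers are written exactly as `[Image N]` with a single space; `check_references` is a hypothetical helper, and it does not verify image format, dimensions, or file size.

```python
import re
from typing import List

# Matches markers such as [Image 1]; the exact spelling is taken from the docs.
IMAGE_MARKER = re.compile(r"\[Image (\d+)\]")


def check_references(prompt: str, reference_image: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the pair looks consistent."""
    problems = []
    count = len(reference_image)
    if not 1 <= count <= 9:
        problems.append(f"reference_image must contain 1-9 URLs, got {count}")
    for match in IMAGE_MARKER.finditer(prompt):
        index = int(match.group(1))
        # [Image N] is 1-based and follows the order of reference_image.
        if not 1 <= index <= count:
            problems.append(f"[Image {index}] has no matching entry in reference_image")
    return problems
```

For example, a prompt that mentions `[Image 3]` alongside a two-element `reference_image` array yields one problem string instead of a failed task.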

resolution enum optional

Output video resolution. Higher resolution costs more.

default 1080p enum: 720p, 1080p
aspect_ratio enum optional

Output video aspect ratio.

default 16:9 enum: 16:9, 9:16, 3:4, 4:3, 4:5, 5:4, 1:1, 9:21, 21:9
duration integer optional

Clip length in seconds (3-15). Cost scales with duration.

default 5
callback object optional

Optional callback configuration. When set, HiAPI notifies your service when the task reaches a terminal state.

url string required

Required when callback is set; HTTPS URL that receives terminal task notifications.

example https://your-domain.com/hiapi/callback
when enum optional

Callback trigger timing. Use final.

default final enum: final
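The parameters above can be assembled and validated in one place. The following is a sketch under the documented defaults and ranges; `build_task_body` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, List, Optional

MODEL = "happyhorse-1.1/reference-to-video"
RESOLUTIONS = {"720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "3:4", "4:3", "4:5", "5:4", "1:1", "9:21", "21:9"}


def build_task_body(
    prompt: str,
    reference_image: List[str],
    resolution: str = "1080p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    callback_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a POST /v1/tasks body, rejecting values outside the documented ranges."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= len(reference_image) <= 9:
        raise ValueError("reference_image must contain 1-9 URLs")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")

    body: Dict[str, Any] = {
        "model": MODEL,
        "input": {
            "prompt": prompt,
            "reference_image": list(reference_image),
            "resolution": resolution,
            "aspect_ratio": aspect_ratio,
            "duration": duration,
        },
    }
    if callback_url is not None:
        if not callback_url.startswith("https://"):
            raise ValueError("callback.url must be an HTTPS URL")
        # callback sits at the top level of the body, next to model and input.
        body["callback"] = {"url": callback_url, "when": "final"}
    return body
```

Failing fast on the client avoids spending a round trip on a payload the documented constraints already rule out.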

Example requests

Character into a scene

1080p / 16:9 / 5s, 2 reference images, cinematic camera.

Request body
{
  "model": "happyhorse-1.1/reference-to-video",
  "input": {
    "prompt": "The character in [Image 1] walking through a scene styled like [Image 2], cinematic camera",
    "reference_image": [
      "https://static.hiapi.ai/example/ref-1.jpg",
      "https://static.hiapi.ai/example/ref-2.jpg"
    ],
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "duration": 5
  }
}
Preserve subject design

720p / 9:16 / 5s vertical — lower cost.

Request body
{
  "model": "happyhorse-1.1/reference-to-video",
  "input": {
    "prompt": "Keep the subject design from [Image 1], moving through the mood of [Image 2]",
    "reference_image": [
      "https://static.hiapi.ai/example/char.jpg",
      "https://static.hiapi.ai/example/mood.jpg"
    ],
    "resolution": "720p",
    "aspect_ratio": "9:16",
    "duration": 5
  }
}
Blend subject and style

1080p / 16:9 / 8s, longer duration for a coherent clip.

Request body
{
  "model": "happyhorse-1.1/reference-to-video",
  "input": {
    "prompt": "Blend the subject of [Image 1] with the style of [Image 2] into a coherent clip",
    "reference_image": [
      "https://static.hiapi.ai/example/subject.jpg",
      "https://static.hiapi.ai/example/style.jpg"
    ],
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "duration": 8
  }
}
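Any of the request bodies above can be sent with the Python standard library. The sketch below only prepares the request object. The base URL and the `Authorization: Bearer` header are assumptions, since this page does not state the API host or the authentication scheme; `make_task_request` is a hypothetical helper.

```python
import json
import urllib.request


def make_task_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a POST /v1/tasks request without sending it.

    Assumptions: base_url is your HiAPI host, and the API accepts a
    Bearer token in the Authorization header.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/tasks",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` performs the call; the response body should then contain the `taskId` described under "Getting the result".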

Getting the result

  1. The response returns a taskId immediately without waiting for generation to finish.
  2. In production, prefer waiting for callback.url to receive the terminal notification. For local debugging, poll GET /v1/tasks/:id.
  3. When status=success, download the generated video from output[].url.
  4. When status=fail, fix the request based on the returned error instead of retrying the same invalid payload.
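For the local-debugging path, the steps above can be sketched as a polling loop. The HTTP call is injected as `fetch_task` so the loop stays independent of host and authentication details. The `error` field name, the polling interval, and the attempt limit are assumptions, and `wait_for_task` is a hypothetical helper.

```python
import time
from typing import Any, Callable, Dict, List


def wait_for_task(
    fetch_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the task is terminal and return the generated video URLs.

    fetch_task is expected to perform GET /v1/tasks/:id and return the parsed JSON.
    """
    for _ in range(max_attempts):
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "success":
            # Each output entry is documented to expose a url field.
            return [item["url"] for item in task.get("output", [])]
        if status == "fail":
            # Surface the error instead of retrying the same payload.
            raise RuntimeError(f"task {task_id} failed: {task.get('error')}")
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Raising on `fail` rather than looping keeps an invalid payload from being resubmitted unchanged.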

FAQ

How many reference images does reference-to-video support?

reference_image takes 1-9 reference images (JPEG/PNG/WEBP, shortest side >= 400px, <= 20MB). Reference them in the prompt as [Image 1] ... [Image N].

How do I reference images in the prompt?

Use markers like [Image 1], [Image 2] in the prompt. The index matches the order of the reference_image array, so [Image 1] refers to the first URL, and the model draws on the corresponding image when generating the output.

Which resolutions and durations are supported?

Supported resolutions are 720p and 1080p; durations range from 3 to 15 seconds (default 5). Billing is per second and depends on resolution: 1080p costs more than 720p. See the live pricing page for current rates.

How is this different from image-to-video?

Image-to-video is driven by a single first-frame image. Reference-to-video uses up to 9 reference images for subject and style control, which makes it better suited to multi-subject composition and series consistency. For pure text generation, use happyhorse-1.1/text-to-video.

Next steps