
feat(ai): support fal.ai and make it the default provider for image/video generation #876

Description

@softmarshmallow

Summary

Add fal.ai as a first-class generation provider and make it the default provider for image and video generation.

Today fal.ai exists in our catalogue only as video provider bindings (metadata) — there is no runtime fal client, no @fal-ai/client dependency, and no fal route for image generation at all. Image generation runs exclusively through the Vercel AI Gateway. So both "support fal.ai" and "make it the default" require new runtime wiring, not just flipping a config value.

Related: the /ai-models skill (.claude/skills/ai-models) and the @grida/ai-models catalogue. Sibling precedent: #809 (move Lyria 3 provider).

Current state (grounded)

Catalogue — packages/grida-ai-models/src/models.ts

  • models.video already models a multi-provider world: VideoProvider = "vercel" | "fal" | "openrouter", with per-provider VideoProviderBindings. fal bindings already exist for Veo 3.1 i2v (fal-ai/veo3.1/image-to-video) and Grok Imagine Video 1.5 i2v. These are catalogue data only — there is no code that calls fal.
  • models.image cards are single-provider (provider: "vercel") — OpenAI (gpt-image-*), Google Gemini image, BFL Flux. There is no provider-binding shape for images, and no fal image entries.
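
To make the contrast above concrete, here is a minimal sketch of the two catalogue shapes. The interface names, the `route` field, the `veo-3.1-i2v` card id, and the `providersOf` helper are hypothetical illustrations, not the actual definitions in packages/grida-ai-models/src/models.ts; only the `VideoProvider` union, the fal route string, and the image model id come from this issue.

```typescript
// Provider union as described in the catalogue.
type VideoProvider = "vercel" | "fal" | "openrouter";

// Hypothetical binding: the per-provider route a card maps to.
interface VideoProviderBinding {
  route: string;
}

// Video-style card: one model, several possible providers.
interface VideoCard {
  id: string;
  providers: Partial<Record<VideoProvider, VideoProviderBinding>>;
}

// Image-style card today: exactly one provider, no binding map.
interface ImageCard {
  id: string;
  provider: "vercel";
}

// Lists the providers a card can be served by, whichever shape it has.
function providersOf(card: VideoCard | ImageCard): string[] {
  if ("providers" in card) return Object.keys(card.providers);
  return [card.provider];
}

const veoI2v: VideoCard = {
  id: "veo-3.1-i2v", // hypothetical card id
  providers: { fal: { route: "fal-ai/veo3.1/image-to-video" } },
};

const gptImageMini: ImageCard = {
  id: "openai/gpt-image-1-mini",
  provider: "vercel",
};
```

With these shapes, `providersOf(veoI2v)` yields the fal binding while `providersOf(gptImageMini)` can only ever yield `"vercel"`, which is the gap this issue describes.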

Runtime

  • Image generation: editor/lib/ai/actions/image-generate.ts calls methods.getSDKImageModel(input.model), which resolves via the Vercel AI Gateway only. There is no fal path.
  • Default image model "openai/gpt-image-1-mini" is hardcoded in two places:
    • editor/grida-canvas-hosted/ai/tools/canvas-use.ts:139 (agent canvas image tool)
    • editor/app/(tools)/(playground)/playground/image/_page.tsx:104 (playground)
  • Video generation: no runtime call path exists yet. Provider selection is deliberately deferred to runtime (models.ts), so there is no default-provider mechanism and nothing actually invokes a video model.
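
Because the default image model is duplicated in two files, changing the default currently means two edits. A minimal sketch of one way to centralize it, assuming a shared module both call sites could import; `DEFAULT_IMAGE_MODEL` and `resolveImageModel` are hypothetical names, not existing exports.

```typescript
// Single source of truth for the default image model.
// The value is the one currently hardcoded in both call sites.
const DEFAULT_IMAGE_MODEL = "openai/gpt-image-1-mini";

// Returns the requested model id, or the default when none is given.
function resolveImageModel(requested?: string): string {
  const trimmed = requested?.trim();
  return trimmed ? trimmed : DEFAULT_IMAGE_MODEL;
}
```

Pointing the default at a fal model would then be a one-line change to the constant.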

Provider seam / secrets — editor/lib/ai/models.ts, editor/lib/ai/server.ts

  • Hosted/billed path authenticates the Vercel AI Gateway via server env vars; usage cost is sent as mills to Metronome.
  • BYOK is env-gated (BYOK_OPENROUTER_API_KEY, BYOK_AI_GATEWAY_API_KEY) and bypasses the billing seam. There is no fal equivalent.
  • Budget/rate limiting: editor/app/(api)/private/ai/ratelimit.ts (Upstash sliding window, mills). Note the skill's existing caveat: a single video clip can exceed the current per-window budget.
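
The caveat about video clips can be illustrated with a simplified in-memory sliding-window budget. This is a sketch only: it is not the Upstash algorithm or the code in ratelimit.ts, and the class name, window length, and mills figures are hypothetical.

```typescript
interface Spend {
  at: number; // timestamp in ms
  mills: number; // cost charged at that time
}

// Simplified sliding-window budget denominated in mills.
class SlidingWindowBudget {
  private spends: Spend[] = [];
  private readonly windowMs: number;
  private readonly limitMills: number;

  constructor(windowMs: number, limitMills: number) {
    this.windowMs = windowMs;
    this.limitMills = limitMills;
  }

  // Records the charge and returns true if it fits in the window budget.
  tryCharge(mills: number, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop spends that have slid out of the window.
    this.spends = this.spends.filter((s) => s.at > cutoff);
    const used = this.spends.reduce((sum, s) => sum + s.mills, 0);
    if (used + mills > this.limitMills) return false;
    this.spends.push({ at: now, mills });
    return true;
  }
}

// Hypothetical numbers: a 500-mill budget per 60-second window.
const budget = new SlidingWindowBudget(60_000, 500);
const smallImageOk = budget.tryCharge(200, 0);
// A single 1200-mill clip can never fit, even in an empty window.
const videoClipOk = budget.tryCharge(1200, 200_000);
```

In this model a request whose cost exceeds the whole window limit is rejected regardless of prior usage, which is why the budget needs revisiting before video is wired in.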

Why fal.ai

  • Broadest pay-per-use image + video catalogue (Flux family, nano-banana, Kling, Veo, etc.), much of which the Vercel gateway doesn't serve.
  • Per-model metered pricing (per-image / per-megapixel / per-second) retrievable from the fal pricing API — fits our mills/Metronome model.
  • We already lean on fal for video metadata; promoting it to the default consolidates image+video onto one provider with the widest model coverage.

Proposed scope

This is a feature spike — exact shapes to be decided (see open questions). Rough work breakdown:

  • Runtime fal client — add the fal SDK/dependency and a server-side call path (queue/subscribe semantics), behind the existing billing seam.
  • Image catalogue shape for fal — decide how a non-Vercel image provider is modeled. Either a provider label on image cards (like audio's "replicate") or a video-style providers binding map. Add fal image models.
  • Defaults → fal — point the two hardcoded image defaults at the chosen fal image model; define the default fal video provider/model.
  • Secrets / BYOK — add FAL_KEY handling for the hosted path and a BYOK equivalent.
  • Billing — compute real cost from fal's returned usage (per-image / per-MP / per-second) → mills → Metronome.
  • Rate limit — revisit ratelimit.ts budget before wiring video (a single clip can exceed the current window).
  • Docs — update docs/models/index.md and the /ai/models page; reflect fal as a provider.
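
As a rough illustration of the first work item, the sketch below shows a server-side image call using subscribe semantics. It assumes the `@fal-ai/client` package, whose `fal.subscribe(endpoint, { input })` resolves to an object with `data` and `requestId`; the client is passed in through a structural interface so the example has no dependency. `FalLikeClient`, `generateImageViaFal`, the `fal-ai/flux/dev` endpoint choice, and the `images[].url` output shape are assumptions to verify against the chosen model's schema.

```typescript
// Structural stand-in for the fal client (assumed to match fal.subscribe).
interface FalLikeClient {
  subscribe(
    endpoint: string,
    options: { input: Record<string, unknown> }
  ): Promise<{ data: unknown; requestId: string }>;
}

interface GeneratedImages {
  requestId: string;
  urls: string[];
}

// Submits a prompt and waits for the queued job to finish.
async function generateImageViaFal(
  client: FalLikeClient,
  endpoint: string,
  prompt: string
): Promise<GeneratedImages> {
  const result = await client.subscribe(endpoint, { input: { prompt } });
  // Assumed output shape; real models may return something different.
  const data = result.data as { images?: { url: string }[] };
  const urls = (data.images ?? []).map((image) => image.url);
  if (urls.length === 0) {
    throw new Error(`fal request ${result.requestId} returned no images`);
  }
  return { requestId: result.requestId, urls };
}

// Stub client for local experimentation; a real call would pass `fal`.
const stubClient: FalLikeClient = {
  subscribe: async () => ({
    requestId: "req_stub",
    data: { images: [{ url: "https://example.test/out.png" }] },
  }),
};
```

In the hosted path the real client would be configured with the server-side FAL_KEY and the call would sit behind the existing billing seam, so the `requestId` can be tied to the usage record.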

Open questions / decisions needed

  1. Which fal image model becomes the default? fal serves many (Flux dev/pro, nano-banana, etc.). We need to pick the specific default for the agent tool and playground, and settle the quality/cost tradeoff.
  2. Which fal video model/route is the default, and do we ship a runtime video call path as part of this (none exists today)?
  3. Image catalogue shape: provider label vs providers binding map. Aligning images with the video binding model is more uniform but a bigger change.
  4. Billing parity — does fal's hosted billing go through the same Metronome mills path, and how do we map per-MP / per-second usage to mills accurately?
  5. Fallback — if fal is default, do we keep Vercel/OpenAI as a fallback provider, or fully switch?
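
For question 4, one possible mapping from metered usage to mills is sketched below. It assumes 1 mill = 0.001 USD and that a model is priced in exactly one unit (per image, per megapixel, or per second); the type names, the round-up policy, and all prices are hypothetical placeholders, not values from the fal pricing API.

```typescript
type PricingUnit = "image" | "megapixel" | "second";

interface UnitPrice {
  unit: PricingUnit;
  usd: number; // price per unit in USD (hypothetical values below)
}

interface Usage {
  images?: number;
  width?: number; // pixels
  height?: number; // pixels
  seconds?: number;
}

// Converts reported usage into a billable quantity for the pricing unit.
function quantityFor(unit: PricingUnit, usage: Usage): number {
  switch (unit) {
    case "image":
      return usage.images ?? 0;
    case "megapixel": {
      const pixels = (usage.width ?? 0) * (usage.height ?? 0);
      return ((usage.images ?? 1) * pixels) / 1_000_000;
    }
    case "second":
      return usage.seconds ?? 0;
  }
}

// Cost in whole mills, rounded up so fractional mills are never under-billed.
function costInMills(price: UnitPrice, usage: Usage): number {
  const rawMills = quantityFor(price.unit, usage) * price.usd * 1000;
  // Small tolerance so float noise does not push an exact value up a mill.
  return Math.ceil(rawMills - 1e-9);
}

const perImage = costInMills({ unit: "image", usd: 0.04 }, { images: 2 });
const perMegapixel = costInMills(
  { unit: "megapixel", usd: 0.025 },
  { width: 1024, height: 1024 }
);
const perSecond = costInMills({ unit: "second", usd: 0.4 }, { seconds: 8 });
```

Rounding up is a design choice; rounding to nearest or carrying fractional mills would change the totals reported to Metronome and should be decided alongside the billing-parity question.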

References

  • packages/grida-ai-models/src/models.ts — catalogue (image / video bindings)
  • editor/lib/ai/actions/image-generate.ts, editor/lib/ai/server.ts, editor/lib/ai/models.ts — runtime seam
  • editor/app/(api)/private/ai/ratelimit.ts — budget
  • .claude/skills/ai-models — model research/update workflow
  • fal.ai docs: https://fal.ai/models · pricing API: https://fal.ai/docs/documentation/model-apis/pricing

Labels: ai (AI models, prompts, and pricing), enhancement (New feature or request)
