Gemini 3.5 Flash is now available on Vercel AI Gateway, accessible by setting the model to google/gemini-3.5-flash in the AI SDK.

The model ships with improved coding proficiency, parallel agentic execution loops, and stronger multi-turn coherence versus prior Flash versions. Thinking mode defaults to medium level, trading some depth for faster, cheaper generation. Four parameters are explicitly unsupported: temperature, topP, topK, and thinking_budget. That last omission matters if you were planning to control reasoning depth programmatically.

The full changelog is worth reading for how AI Gateway layers on top of this: unified API routing, automatic retries, failover, custom reporting, observability, and Bring Your Own Key support. The architecture is designed to exceed single-provider uptime, which changes the reliability calculus for production agentic workloads.

[READ ORIGINAL →]