What you need to know
- Following user complaints, Google is capping how much quota a single Gemini prompt can consume.
- Failed Gemini requests will no longer count against your usage limits or available quota.
- Google will add detailed usage breakdowns so users can better understand quota consumption.
After users complained about Gemini’s new usage limits, Google has announced changes based on that feedback.
One of the bigger announcements from Google I/O 2026 was the shift away from fixed message limits toward a new compute-based usage system. Under the new setup, Gemini usage is capped within rolling five-hour windows, alongside a broader weekly usage limit.
However, users quickly started complaining on Reddit and X that they were hitting those limits much sooner than expected. One user claimed they exhausted a significant portion of their quota with a single prompt, while another reported hitting the limit before a Gemini Omni video generation request had even finished processing.
Google has now confirmed that it is adjusting the system. In a post, Gemini lead Josh Woodward said the company is “capping the amount of quota a single prompt can use so you get more out of the Pro model.”
Woodward also clarified that users won’t be charged quota for failed requests. According to him, only successfully completed requests will count toward usage limits, meaning errors and failed generations shouldn’t consume any of your allowance.
He also acknowledged that heavier tasks, particularly things like Deep Research, require significantly more compute resources. To make this clearer, Google says it will introduce more detailed usage breakdowns and notifications so users can better understand where their quota is going.
Google is also making another small but welcome change. Gemini will now remember which model you prefer to use across sessions. As Woodward explained, once you select a model, Gemini will continue using it unless you manually switch models or hit a usage cap that forces an automatic fallback to a lighter model.

