Velvet latency benchmarks
Plus, improve response times by 50% with caching
The Velvet AI gateway is a proxy, so it’s critical that we don’t add unnecessary latency to requests. We ran an experiment to test average and p99 latency.
We found that Velvet's latency is nominal: between 200 and 300ms per request on average, with minimums as low as 85ms. With caching, response times improve by 50% or more. At this level, the latency should be imperceptible to end users.
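To make the terms concrete, here is a minimal sketch of how minimum, average, and p99 latency can be computed from a list of per-request timings. It is not the script behind the benchmark in this post: the function name `summarize_latencies`, the nearest-rank percentile method, and the sample values are all assumptions made for illustration.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return the min, mean, and p99 of latency samples given in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Nearest-rank p99: the smallest sample with at least 99% of all
    # samples at or below it. Integer math computes ceil(0.99 * n).
    rank = (99 * len(ordered) + 99) // 100
    return {
        "min_ms": ordered[0],
        "mean_ms": statistics.fmean(ordered),
        "p99_ms": ordered[rank - 1],
    }


# Hypothetical samples for illustration only, not measured Velvet data.
samples = [210.0] * 97 + [85.0, 900.0, 950.0]
summary = summarize_latencies(samples)
print(summary)
```

The average describes the typical request, while p99 shows the slow tail that a small share of requests hit, which is why both are worth reporting.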
What we’ll cover in this article
Overview of latency when leveraging AI
Velvet’s latency benchmarks (average and p99)
How to decrease response times with caching