Velvet latency benchmarks

Plus, improve response times by 50% with caching

The Velvet AI gateway is a proxy, so it’s critical that we don’t add unnecessary latency to requests. We ran an experiment to measure average and p99 latency.

We found that Velvet's latency is low: between 200 and 300 ms per request on average, with minimums as low as 85 ms. With caching, response times improve by 50% or more. Latency at this level is likely imperceptible to end users.
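For readers who want to compute the same kind of summary on their own traffic, here is a minimal sketch of how average, p99, and minimum latency can be derived from a list of per-request timings. The function name `summarize_latency`, the nearest-rank percentile method, and the sample values are all assumptions made for illustration; they are not Velvet's benchmark code or data.

```python
import statistics


def summarize_latency(samples_ms):
    """Return the mean, p99 (nearest-rank), and minimum of latency samples in ms."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p99: the smallest sample with at least 99% of samples
    # at or below it. Integer arithmetic computes ceil(0.99 * n) exactly.
    rank = -(-99 * n // 100)
    return {
        "mean_ms": statistics.fmean(ordered),
        "p99_ms": ordered[rank - 1],
        "min_ms": ordered[0],
    }


# Hypothetical timings for illustration only, not measured results.
samples = [85, 210, 230, 250, 260, 275, 290, 300, 310, 900]
summary = summarize_latency(samples)
print(summary)
```

With a small sample like this one, p99 is simply the slowest request, which is why tail percentiles are usually reported over many requests: a single outlier dominates p99 while barely moving the average.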

What we’ll cover in this article

  • Overview of latency when leveraging AI

  • Velvet’s latency benchmarks (average and p99)

  • How to decrease response times with caching
