Velvet latency benchmarks

Plus, improve response times by 50% with caching

August 16, 2024

The Velvet AI gateway is a proxy, so it’s critical that we don’t add unnecessary latency to requests. We ran an experiment to test average and p99 latency.

We found that Velvet's latency is low: 200 to 300 ms per request on average, with minimums as low as 85 ms. With caching, response times improve by 50% or more. Latency at this level is unlikely to be noticeable to end users.
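For readers who want to reproduce this kind of summary on their own request logs, here is a minimal sketch of how average and p99 latency can be computed from a list of per-request timings. The function name `summarize_latencies` and the nearest-rank definition of p99 are assumptions for illustration, not a description of how Velvet ran its experiment.

```python
import statistics


def summarize_latencies(samples_ms):
    """Summarize per-request latencies (in milliseconds).

    Hypothetical helper for illustration; not part of Velvet's tooling.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")

    ordered = sorted(samples_ms)
    n = len(ordered)

    # Nearest-rank p99: the smallest sample such that at least 99% of all
    # samples are less than or equal to it. Integer math avoids float rounding.
    rank = (99 * n + 99) // 100  # equivalent to ceil(0.99 * n)

    return {
        "mean_ms": statistics.fmean(ordered),
        "p99_ms": ordered[rank - 1],
        "min_ms": ordered[0],
    }
```

The average describes the typical request, while p99 shows the slow tail that the average hides, which is why the two are usually reported together.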

What we’ll cover in this article

Overview of latency when leveraging AI
Velvet’s latency benchmarks (average and p99)
How to decrease response times with caching
