
How Vapi Works
Orchestration Models
All the fancy stuff Vapi does on top of the core models.
Vapi also runs a suite of audio and text models that make its latency-optimized Speech-to-Text (STT), Large Language Model (LLM), and Text-to-Speech (TTS) pipeline feel human.
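To make the STT, LLM, and TTS flow concrete, here is a minimal sketch of one conversational turn passing through three stages. It is an illustration only, not Vapi's implementation or API: `VoicePipeline`, `handle_turn`, and the `fake_*` stand-ins are hypothetical names, and the stubs just pass text through so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical three-stage voice pipeline (not Vapi's actual API)."""
    transcribe: Callable[[bytes], str]   # STT: caller audio -> text
    respond: Callable[[str], str]        # LLM: user text -> reply text
    synthesize: Callable[[str], bytes]   # TTS: reply text -> audio

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one turn: transcribe, generate a reply, synthesize speech."""
        user_text = self.transcribe(audio_in)
        reply_text = self.respond(user_text)
        return self.synthesize(reply_text)


# Stand-in stages (assumptions for illustration). A real system would call
# actual STT, LLM, and TTS providers here.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts)
audio_out = pipeline.handle_turn(b"hello")
```

Keeping each stage behind a plain callable means a provider can be swapped without touching the turn logic. In this sketch, orchestration models would sit around and between these stages rather than inside them.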
Here’s a high-level overview of the Vapi architecture:
These are some of the models that are part of the Orchestration suite. Many more models are in development and will be added to the suite soon. The ultimate goal is to achieve human-level performance.


