Every recorded call gets an AI summary: three to five sentences covering what the call was about and what happened. It appears in the call journey next to the audio player, ready to read before you press play.
Summaries save you from replaying a ten-minute call just to remember what was discussed. Skim the summary, and if you need the detail, play the recording or read the transcription.
How they are generated
The pipeline runs automatically. After a call ends and the recording is merged, the audio file is queued for transcription. Whisper transcribes it on our own NVIDIA T4 GPUs running in us-west4-a. GPT-4o-mini reads the transcription and writes the summary.
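The order of those stages can be sketched in a few lines. This is only an illustration of the flow described above, not our production code: `CallArtifacts`, `run_pipeline`, and the two callables standing in for Whisper and GPT-4o-mini are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CallArtifacts:
    """Everything produced for one call (hypothetical container)."""
    audio_path: str
    transcription: Optional[str] = None
    summary: Optional[str] = None


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> CallArtifacts:
    """Run the stages in order: merged audio -> transcription -> summary.

    `transcribe` and `summarize` are placeholders for the speech-to-text
    and summarization steps; each stage consumes the previous stage's output.
    """
    artifacts = CallArtifacts(audio_path=audio_path)
    artifacts.transcription = transcribe(audio_path)
    artifacts.summary = summarize(artifacts.transcription)
    return artifacts
```

Because each stage only needs the output of the one before it, the summary step never touches the audio itself, only the transcription text.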
Nothing leaves our infrastructure unless it has to. The transcription engine runs on GPUs we operate ourselves, and the summarization model is called through a private endpoint.
Where they appear
Summaries show up inside the call journey in admin.vocatech.com. Open a call and you see the audio player, the summary next to it, and the full transcription underneath.
Summaries are also delivered in the daily email report and available through the REST API. If you want them in your CRM, the API returns them as a plain string field on every call record.
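As a rough sketch of the CRM case, the snippet below pulls the summary string out of a call record that has already been fetched as JSON. The field name `summary` and the helper `summary_from_call_record` are assumptions made for illustration; check the API reference for the actual record schema.

```python
import json
from typing import Optional


def summary_from_call_record(raw_json: str, field: str = "summary") -> Optional[str]:
    """Return the summary text from one call record, or None if it is absent.

    `field` defaults to "summary", which is an assumed field name.
    """
    record = json.loads(raw_json)
    if not isinstance(record, dict):
        return None
    value = record.get(field)
    # Treat a missing, non-string, or blank value as "no summary yet".
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None
```

Returning `None` instead of raising lets the CRM sync skip calls that have no summary, for example calls on extensions without recording.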
When they are ready
The recording is merged within about 10 seconds of the call ending. Transcription and summarization run on a priority queue and usually complete within another minute or two for a standard-length call.
The portal auto-refreshes while it waits. Open a call that just ended and you will see the summary populate without reloading. If transcription has to be retried, the portal keeps checking in the background.
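If you read summaries through the API rather than the portal, a similar wait-and-check loop is one way to handle a call that just ended. The sketch below is not the portal's implementation; `wait_for_summary`, its defaults, and the `fetch_summary` callable (which you would wire to your own API request) are all hypothetical.

```python
import time
from typing import Callable, Optional


def wait_for_summary(
    fetch_summary: Callable[[], Optional[str]],
    timeout_s: float = 180.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until a summary is available or the timeout passes.

    `fetch_summary` should return the summary text, or None while the
    call is still being transcribed and summarized.
    """
    deadline = clock() + timeout_s
    while True:
        summary = fetch_summary()
        if summary:
            return summary
        if clock() >= deadline:
            return None  # Give up for now; the caller can try again later.
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are enough.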
Priority queue
Transcription runs in three priority lanes: high, normal, and low. Your group assignment determines the lane. Most accounts run in the normal lane, with turnaround measured in seconds to minutes.
- High priority for accounts that need near-instant summaries
- Normal priority for the default experience
- Low priority for bulk backfills or historical processing
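To make the lane idea concrete, here is a minimal sketch of a three-lane queue. It assumes strict priority (every high job before any normal job, every normal job before any low job) with first-in, first-out order inside a lane. That ordering rule, the `TranscriptionQueue` class, and the call IDs are assumptions for illustration; the real scheduler may behave differently.

```python
import heapq
import itertools
from typing import List, Optional, Tuple

# Lower rank is served first. The lane names come from the list above.
LANE_RANK = {"high": 0, "normal": 1, "low": 2}


class TranscriptionQueue:
    """Hypothetical three-lane queue: strict priority, FIFO within a lane."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = itertools.count()  # Tie-breaker that preserves arrival order.

    def enqueue(self, call_id: str, lane: str = "normal") -> None:
        heapq.heappush(self._heap, (LANE_RANK[lane], next(self._counter), call_id))

    def next_job(self) -> Optional[str]:
        """Return the next call to transcribe, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, call_id = heapq.heappop(self._heap)
        return call_id
```

For example, if a low-lane backfill job is queued before a normal-lane call and a high-lane call, `next_job()` still returns the high-lane call first, then the normal one, and the backfill last.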
Recording requirement
AI summaries require recording to be enabled on the extension. No recording, no audio. No audio, no transcription. No transcription, no summary.
Recording is off by default on every Vocatech account. You can turn it on per extension by contacting support. See enabling recording for the consent and activation details.
Accuracy and edge cases
Summaries are as good as the transcription they are built on. A clean call with two speakers and good audio produces a strong summary. A noisy cell-phone call with background conversation produces a weaker one.
Very short calls, under about 6 seconds, generally do not generate a recording, so no summary is produced. See transcription accuracy for what the engine handles well and where it needs help.