Text-to-Speech
If you run a business that takes appointments by phone, you already know the pain. Missed calls, voicemail black holes, and customers who never call back. 62% of calls to small businesses go unanswered, and 85% of those callers never try again. The average SMB loses $126,000 per year to missed calls. That is not a minor leak — that is a revenue hemorrhage.
In this guide, I am going to show you exactly how I built an AI voice receptionist that picks up every call 24/7, books appointments by checking Google Calendar in real-time, and sends SMS confirmations — all automatically. No human needed.
The entire project is open-source. You can clone it, deploy it, and customize it for any business.
What This AI Voice Receptionist Does
Here is the thing — this is not a basic IVR system that says “press 1 for appointments.” This is a conversational AI agent that talks like a real receptionist.
When someone calls your business number:
- The AI picks up instantly — no hold music, no voicemail
- It has a natural conversation — understands intent, asks follow-up questions
- Checks your Google Calendar in real-time for available slots
- Books the appointment and creates a calendar event
- Sends an SMS confirmation to the caller with all the details
- Logs the entire call with a summary and booking status
The average call takes about 90 seconds. The response time is under 1 second.
Architecture Overview
Here is how the system is designed:
Phone Call → Vapi (STT/TTS) → Django Backend → Claude API (Tool Use)
↓
┌─────────┴──────────┐
│ │
Google Calendar Twilio SMS
(Check/Book) (Confirmation)
│ │
└─────────┬──────────┘
↓
PostgreSQL (Call Logs)
The key design decision: Vapi is only the voice transport layer. It handles speech-to-text and text-to-speech. All intelligence, business logic, and tool execution runs on your own Django backend. This means you have full control over the AI behavior, and you are not locked into any platform.
Vapi sends transcribed speech to your Django server. Django passes it to Claude with access to 4 tools. Claude decides which tools to call, executes them, and returns a response. Vapi converts the response back to speech.
Tech Stack Breakdown
| Component | Technology | Role |
|---|---|---|
| Backend | Django | API server, tool execution, business logic |
| AI Brain | Claude API (Tool Use) | Natural language understanding + agentic decisions |
| Voice Layer | Vapi | Speech-to-text, text-to-speech, phone number |
| Calendar | Google Calendar API | Real-time availability checks and booking |
| SMS | Twilio | Appointment confirmation messages |
| Database | PostgreSQL | Call logs, analytics, summaries |
| Infrastructure | Docker + AWS EC2 | Deployment and scaling |
Why These Choices?
Claude API with Tool Use — This is what makes the AI agent truly agentic. Claude does not just generate text. It decides which tools to call, in what order, and handles multi-step reasoning in a single conversation turn. For example, in one turn it can: check calendar → find a slot → book it → send SMS → summarize the call. That is 4 tool calls in one response.
Vapi over building STT/TTS from scratch — Vapi handles all the hard parts of voice: low-latency transcription, natural-sounding speech, phone number provisioning, and call management. Building this yourself would take months. Vapi lets you focus on the intelligence layer.
Django over FastAPI — Django gives you the admin dashboard, ORM, and authentication out of the box. When you need to review call logs, search by phone number, or filter by booking status — Django admin handles it with zero extra code.
How the Agentic Tool Loop Works
This is the most interesting part of the architecture. Here is where it gets interesting.
Claude operates in an agentic tool loop. This means the AI does not just respond to messages — it makes multiple decisions per conversation turn.
Here is a typical call flow:
# Caller says: "I'd like to book an appointment for tomorrow afternoon"
# Claude's internal reasoning:
# 1. The caller wants an appointment tomorrow afternoon
# 2. I need to check what's available
# 3. Tool call: check_calendar_availability(date="2026-03-11")
# Tool returns: ["2:00 PM", "2:30 PM", "3:00 PM", "3:30 PM", "4:00 PM"]
# Claude responds: "I have several openings tomorrow afternoon.
# I have 2 PM, 2:30, 3, 3:30, and 4 PM available.
# Which time works best for you?"
# Caller says: "3 PM works"
# Claude's internal reasoning:
# 1. The caller wants 3 PM tomorrow
# 2. I need to ask if they're a new or returning patient
# 3. (asks follow-up question)
# After getting all info, Claude makes multiple tool calls:
# Tool call 1: book_appointment(date="2026-03-11", time="15:00", type="new_patient")
# Tool call 2: send_sms(phone="+1234567890", message="Your appointment is confirmed...")
# Tool call 3: log_call(summary="New patient booked for March 11 at 3 PM", status="booked")
The key is that Claude decides the tool execution flow dynamically. It is not following a fixed script. If a caller asks a question instead of booking, Claude handles it conversationally. If the requested time is not available, Claude suggests alternatives.
Step-by-Step Setup Guide
Prerequisites
- Python 3.11+
- Docker and Docker Compose
- A Vapi account (for voice)
- An Anthropic API key (for Claude)
- A Google Cloud service account (for Calendar)
- A Twilio account (for SMS)
1. Clone the Repository
git clone https://github.com/codewithmuh/ai-voice-agent.git
cd ai-voice-agent
2. Configure Environment Variables
cp .env.example .env
Open .env and add your API keys:
# AI
ANTHROPIC_API_KEY=sk-ant-...
# Vapi
VAPI_API_KEY=your-vapi-key
# Google Calendar
GOOGLE_CALENDAR_ID=your-calendar@gmail.com
GOOGLE_CREDENTIALS_JSON=path/to/credentials.json
# Twilio
TWILIO_ACCOUNT_SID=your-sid
TWILIO_AUTH_TOKEN=your-token
TWILIO_PHONE_NUMBER=+1234567890
# Database
POSTGRES_DB=voice_agent
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your-password
3. Set Up Google Calendar
This step requires a Google Cloud service account:
- Go to Google Cloud Console
- Create a new project (or use an existing one)
- Enable the Google Calendar API
- Create a Service Account and download the credentials JSON
- Share your Google Calendar with the service account email (it looks like
name@project.iam.gserviceaccount.com)
This is the part most people get stuck on. The service account needs “Make changes to events” permission on your calendar.
4. Start with Docker
docker compose up --build
This starts:
- Django on port 8000
- PostgreSQL on port 5432
5. Expose Your Server with ngrok
ngrok http 8000
Copy the HTTPS URL (something like https://abc123.ngrok.io).
6. Connect Vapi
Now pay attention to this part — this is where everything comes together:
- Log into the Vapi Dashboard
- Create a new assistant
- Set the Custom LLM URL to:
your-ngrok-url/api/vapi/chat/completions/ - Set the Server URL to:
your-ngrok-url/api/vapi/webhook/ - Assign a phone number to the assistant
- Call the number to test
If everything is configured correctly, you should hear the AI receptionist pick up and start a conversation.
The 4 AI Tools the Agent Uses
Claude has access to 4 tools that it can call during any conversation turn:
1. check_calendar_availability
Queries Google Calendar for available 30-minute slots on a given date. Returns a list of open times.
def check_calendar_availability(date: str) -> list[str]:
"""Check available appointment slots for a given date."""
# Queries Google Calendar API
# Returns available 30-min slots between business hours
# Handles new patient (60 min) vs returning (30 min) slots
2. book_appointment
Creates a Google Calendar event with the caller’s details.
def book_appointment(
date: str,
time: str,
patient_name: str,
patient_type: str, # "new" or "returning"
phone: str
) -> dict:
"""Book an appointment and create a calendar event."""
# Creates event on Google Calendar
# Sets duration based on patient type (60 or 30 min)
# Returns confirmation details
3. send_sms_confirmation
Sends an SMS via Twilio with the appointment details.
def send_sms_confirmation(
phone: str,
appointment_date: str,
appointment_time: str,
appointment_type: str
) -> bool:
"""Send SMS confirmation to the caller."""
# Uses Twilio API to send confirmation
# Includes date, time, and appointment type
4. log_call
Logs the call details to PostgreSQL for analytics.
def log_call(
caller_phone: str,
duration: int,
summary: str,
booking_status: str # "booked", "cancelled", "inquiry"
) -> None:
"""Log call details for analytics and review."""
# Stores in PostgreSQL via Django ORM
# Available in Django Admin for search/filter
How Google Calendar Integration Works
The calendar integration is what makes this actually useful for businesses. Here is how it works:
- Service Account Authentication — The agent authenticates with Google Calendar using a service account. No OAuth flow needed. The service account has direct access to the shared calendar.
- Real-Time Availability — When Claude calls
check_calendar_availability, it queries the Google Calendar API for a specific date, checks all existing events, and returns the gaps as available slots. - Slot Duration Logic — New patient appointments are 60 minutes. Returning patient appointments are 30 minutes. The system accounts for this when showing available slots.
- Instant Booking — When Claude calls
book_appointment, it creates a calendar event immediately. The business owner sees it on their calendar in real-time.
SMS Confirmation with Twilio
After booking, the agent automatically sends an SMS. Here is what the caller receives:
Hi [Name]! Your appointment at [Business Name] is confirmed
for March 11, 2026 at 3:00 PM (New Patient - 60 min).
We're located at [Address]. Please arrive 10 min early.
Reply CANCEL to cancel. See you soon!
The SMS is sent via Twilio’s API. You configure your Twilio phone number and credentials in the .env file, and the agent handles the rest.
Call Logging and Analytics
Every call is logged in PostgreSQL with:
- Caller phone number
- Call duration
- AI-generated summary of the conversation
- Booking status (booked, cancelled, inquiry, no-show)
- Timestamp
All of this is accessible through the Django Admin dashboard. You can search by phone number, filter by booking status, and see call trends over time. No extra code needed — Django Admin gives you this for free.
Real-World Use Cases
This AI voice receptionist is built for any business that books time-based appointments by phone:
| Industry | Use Case |
|---|---|
| Dental Clinics | 24/7 appointment booking, new patient vs returning, emergency slot handling |
| Medical Practices | After-hours coverage, appointment scheduling, basic FAQ |
| Hair Salons & Spas | Service booking, stylist selection, availability checks |
| Law Firms | Consultation scheduling, intake screening, callback requests |
| Real Estate | Property viewing bookings, agent scheduling |
| Auto Repair Shops | Service appointments, drop-off scheduling |
| Restaurants | Reservation booking (with modification for table sizes) |
| Veterinary Clinics | Pet appointment booking, emergency triage |
The cool part is — the system prompt is fully customizable. You can change the business name, services, hours, and personality without touching any code.
Cost Breakdown: AI vs Human Receptionist
Let me break down the real numbers:
| Human Receptionist | AI Voice Receptionist | |
|---|---|---|
| Monthly Cost | $3,500+ (salary + benefits) | ~$100/month |
| Availability | 8 hours/day, 5 days/week | 24/7/365 |
| Missed Calls | Lunch breaks, sick days, holidays | Zero — always answers |
| Simultaneous Calls | 1 at a time | Unlimited |
| Response Time | Variable | < 1 second |
| SMS Follow-up | Manual | Automatic |
| Call Logging | Manual or forgotten | Automatic with AI summary |
The AI receptionist is not meant to replace your entire front desk. It is meant to catch every call your team misses — after hours, during lunch, on weekends, and when all lines are busy.
Here is the thing — 42% of SMBs lose at least $500 per month to missed calls. A single missed call can mean $1,000 in lost business. At ~$100/month, even catching one extra appointment per month makes this worth it for most businesses.
Compare that to SaaS AI receptionist tools: My AI Front Desk starts at $79/month, Smith.ai charges $140+/month, and RingCentral’s AI receptionist is enterprise-priced. This open-source solution gives you full control at a fraction of the cost.
Cost Per Component
| Service | Cost |
|---|---|
| Vapi | ~$0.05/min (STT + TTS) |
| Claude API | ~$0.01-0.03/call (tool use) |
| Twilio SMS | ~$0.0079/message |
| Google Calendar API | Free |
| AWS EC2 (t3.small) | ~$15/month |
| Total per call | ~$0.05-0.10 |
For an average business handling 500 calls/month, that is roughly $25-50/month in variable costs plus $15 for hosting.
What I Would Build Next
If I were extending this project, here is what I would add:
- Multi-language support — Vapi supports multiple languages. Adding Spanish or French would expand the use case significantly.
- Call transfer — If the AI cannot handle a request, transfer to a human with full context.
- CRM integration — Push call data to HubSpot, Salesforce, or any CRM via webhooks.
- Custom voice cloning — Use ElevenLabs to match your brand’s voice.
- Waitlist management — If no slots are available, offer to add the caller to a waitlist and notify when a cancellation happens.
Takeaways
Here is what you should take away from this:
- An AI voice receptionist is not science fiction anymore. You can build one today with open-source tools and deploy it for under $100/month.
- The agentic tool loop is the key. Claude does not just respond — it reasons, decides which tools to use, and executes multi-step workflows in a single conversation turn.
- Vapi + Django + Claude is a powerful stack. Vapi handles voice, Django handles logic, Claude handles intelligence. Each does what it is best at.
- The ROI is clear. If your business misses even a few calls per week, this pays for itself instantly.
For context, SaaS AI receptionist tools like My AI Front Desk ($79/month), Goodcall, and Smith.ai charge $79-200+/month — and you have zero control over the AI logic. This open-source project gives you the same capabilities with full customization for a fraction of the cost.
The virtual receptionist market is now worth $6.26 billion. Real businesses are seeing real results:
- Image Orthodontics handled 10,865 AI calls over 3 months, capturing revenue that would have been lost
- Medbelle saw 2.5x more booked appointments and a 60% scheduling efficiency boost
- One healthcare implementation projected $1.7M in additional revenue from AI receptionist deployment
- A service business reported: “We used to miss 10-15 calls a day. Now every call gets answered, weekly lead count up 40%.”
The full source code is available on GitHub. Clone it, deploy it, customize it.
Want to see this in action? Watch the full tutorial on my YouTube channel where I build this from scratch.
Need help deploying this for your business? Book a free demo call and I will walk you through the setup.
Found this helpful? Share it with a developer friend who builds AI agents.
Written by Muhammad Rashid (CodeWithMuh) — I build AI agents, automate developer workflows, and deploy them to production. Follow me on LinkedIn for daily AI insights.
Comments
No comments yet. Be the first to comment!
