Text-to-Speech

Hiring is broken. Recruiters spend 23 hours screening resumes for a single hire. First-round phone screens eat up entire days. And most companies still do this manually in 2026.
So I built an AI agent that handles the entire pipeline: resume screening, voice interviews, transcript analysis, and candidate ranking — all automated, all open-source.
In this article, I'll walk you through exactly how I built it, the architecture decisions, and how you can deploy it yourself with one Docker Compose command.
What It Does
The AI Hiring Agent automates the first two stages of any hiring pipeline:
- Resume Screening — Upload a PDF resume, Claude reads it natively (no parser needed), and scores it 0-100 in under 5 seconds
- Voice Interview — Candidates scoring 60+ automatically receive a phone call from an AI interviewer that asks 4 structured questions
- Transcript Analysis — After the call, Claude analyzes the transcript and scores communication, technical ability, enthusiasm, and experience
- Smart Ranking — Overall score = 60% resume + 40% interview. Score 75+ = shortlisted, under 50 = rejected
The entire flow is automated. A candidate applies, gets screened in seconds, receives a phone call, and gets ranked — all without a human touching anything.
The Tech Stack
| Component | Technology | Why |
|---|---|---|
| Backend | Django 5 + DRF | Battle-tested, great ORM, built-in admin |
| Frontend | Next.js 16 + React 19 | App Router, server components, fast builds |
| AI (Screening) | Claude Sonnet | Native PDF reading, structured JSON output |
| AI (Interviews) | Claude via Vapi | Custom LLM endpoint with agentic tool use |
| Voice | Vapi + ElevenLabs | Outbound calls with natural-sounding TTS |
| Database | PostgreSQL 16 | Reliable, UUID primary keys, JSON fields |
| Infra | Docker Compose | One command deployment, all services orchestrated |
How Resume Screening Works
Most resume screeners use PDF parsers like PyPDF2 or pdfplumber, then feed the extracted text to an LLM. This loses formatting, tables, and layout — all things that matter in a resume.
Instead, I send the raw PDF as a base64-encoded document directly to Claude's document mode. Claude reads the PDF natively — headers, tables, bullet points, everything preserved.
The screening prompt asks Claude to score on 4 weighted dimensions:
- Relevant Experience (40%) — Years and depth of relevant work
- Skills Match (30%) — How well skills align with job requirements
- Education (15%) — Degree relevance and institution quality
- Career Trajectory (15%) — Growth pattern, promotions, trajectory
Claude returns a structured JSON response with the score, 3-5 strengths, red flags (if any), and a recommendation: interview (60+), maybe (40-59), or reject (<40).
The whole screening takes under 5 seconds. Compare that to the 7 minutes a human recruiter spends per resume.
How the AI Voice Interview Works
This is the most interesting part. When a candidate scores 60+, the system automatically triggers an outbound phone call using Vapi.
Here's the flow:
- Django calls Vapi's API to initiate an outbound call to the candidate's phone number
- Vapi handles the voice transport — speech-to-text (STT) and text-to-speech (TTS via ElevenLabs Rachel voice)
- For the AI brain, Vapi calls my custom LLM endpoint at
/api/vapi/chat/completions/ - This endpoint converts Vapi's OpenAI-format messages to Claude format, runs the conversation through Claude, and returns the response
- Claude has access to an
end_calltool — it uses this to hang up after all 4 questions are answered
The Interview Questions
Claude asks 4 structured questions, one at a time:
- "Tell me about your experience with {key_skill}" — Tailored to the candidate's strongest skill from resume screening
- "Describe a challenging project you worked on recently"
- "Why are you interested in this role?"
- "What is your availability and salary expectation?"
The system prompt instructs Claude to keep responses to 1-2 sentences (it's a phone call, not a chatbot), be conversational, and end the call after all questions are answered. The whole interview takes about 5 minutes.
The Agentic Loop
The custom LLM endpoint implements a proper agentic pattern:
def handle_conversation_turn(messages, candidate_name, key_skill):
# Convert OpenAI format to Claude format
claude_messages = openai_messages_to_claude(messages)
# Call Claude with end_call tool available
response = client.messages.create(
model="claude-sonnet-4-20250514",
system=get_interview_system_prompt(candidate_name, key_skill),
messages=claude_messages,
tools=[END_CALL_TOOL],
)
# Check if Claude wants to end the call
for block in response.content:
if block.type == "tool_use" and block.name == "end_call":
return text_response, True # Signal Vapi to hang up
return text_response, False
When Claude calls the end_call tool, the endpoint sets an X-Vapi-End-Call header that tells Vapi to terminate the call gracefully.
Transcript Analysis
After the call ends, Vapi sends a webhook to /api/webhooks/vapi/ with the full transcript. Claude then analyzes it across 4 dimensions, each scored 0-25:
- Communication (0-25) — Clarity, articulation, coherence
- Technical (0-25) — Domain knowledge, problem-solving ability
- Enthusiasm (0-25) — Interest level, energy, cultural fit
- Experience (0-25) — Depth and relevance of background
Total interview score = sum of all 4 dimensions (0-100). Claude also returns highlights (positive moments), concerns (red flags), and a final recommendation: strong_yes, yes, maybe, or no.
The Scoring Formula
The final candidate ranking combines both stages:
Overall Score = (Resume Score × 0.6) + (Interview Score × 0.4) If score >= 75 → SHORTLISTED If score < 50 → REJECTED If 50-75 → INTERVIEWING (needs human decision)
Why 60/40? Resumes show qualifications, but interviews reveal communication and fit. Weighting resumes slightly higher prevents a charismatic but unqualified candidate from gaming the system.
The Dashboard
The Next.js frontend includes a full recruitment dashboard:
- Stats Overview — Total candidates, average score, interviewed count, shortlisted count (all with animated counters)
- Candidate Table — Filterable by status, sortable by score, with status badges and action links
- Candidate Detail Page — Resume score donut chart, interview score breakdown (4 donut charts), pipeline progress visualization, full transcript with chat-bubble styling, strengths/red flags lists
- Job Board — Browse open positions, filter by department, see applicant counts
- Multi-Step Application — Personal info → resume upload → time slot picker → review → submit
The frontend also has a mock data fallback — if the Django API is down, it renders demo data so you can still see the UI.
Deploy It Yourself
The entire project is open-source. Here's how to get it running:
# Clone the repo git clone https://github.com/codewithmuh/ai-hiring-agent.git cd ai-hiring-agent # Configure environment cp backend/.env.example backend/.env # Add: ANTHROPIC_API_KEY, VAPI_API_KEY, ELEVENLABS keys, DB credentials # Start everything docker compose up -d --build # Seed demo data (5 jobs, 10 candidates, 35 interview slots) docker compose exec backend python manage.py migrate docker compose exec backend python manage.py seed_data # For voice interviews, expose backend publicly ngrok http 8000 # Set BACKEND_PUBLIC_URL in .env to ngrok URL
Frontend runs on localhost:3001, backend API on localhost:8000. Total infrastructure cost for a small team: roughly $50/month (Claude API + Vapi calls + a small server).
What I Learned Building This
Claude's PDF reading is underrated. Sending base64 PDFs directly eliminates an entire category of parsing bugs. No more PyPDF2 extraction failures on complex layouts.
Voice AI needs short responses. The biggest mistake in voice agent design is having the AI talk too much. Phone conversations need 1-2 sentence responses, not paragraphs.
The agentic tool pattern matters for call control. Without the end_call tool, there's no clean way to terminate a Vapi call from the AI side. The tool-use pattern gives Claude explicit control over call flow.
Weighted scoring prevents gaming. If you only score interviews, a charming candidate with no qualifications can pass. If you only score resumes, you miss communication red flags. The 60/40 blend catches both.
What's Next
This is a v1. Some things I'd like to add:
- Email notifications to candidates at each stage
- Slack/Teams integration for hiring manager alerts
- Multi-language interview support
- Interview recording playback in the dashboard
- Analytics: conversion funnel, time-to-hire, score distributions
The full source code is on GitHub. Star it if you find it useful, and feel free to fork and customize for your own hiring needs.
Found this helpful? Share it with a developer friend who's building AI agents.
Written by Muhammad Rashid (CodeWithMuh) — I build AI agents, automate developer workflows, and deploy them to production. Follow me on LinkedIn for daily AI insights.
Comments
No comments yet. Be the first to comment!
