The average phone interpretation call costs $1.50 to $3.50 per minute. A 20-minute discharge conversation costs your clinic up to $70 — per patient, per encounter, every time.
At scale, across a patient panel with many limited English proficiency (LEP) patients, that's a budget line that makes clinic managers wince. But the answer isn't to do less interpreting. It's to do it smarter.
Language access in healthcare has evolved through three distinct generations: in-person interpreters, remote human interpreters reached by telephone or by video remote interpreting (VRI), and now AI-powered real-time interpretation. Each has a legitimate use case. Each has real limitations. This comparison is designed to help clinic leaders make an honest, data-informed decision rather than chase the cheapest option.
What it is: A trained, often certified medical interpreter physically present in the clinical setting.
Cost: Industry estimates range from $50–$150/hour for contract interpreters, often with minimum-hour billing (National Council on Interpreting in Health Care (NCIHC) cost guidance). For rare languages, costs can be higher. Staff interpreter positions in large hospital systems may carry annual salaries of $45,000–$65,000, plus benefits.
Wait time: Scheduling-dependent. Booking typically requires hours to days of advance notice, though large systems may keep interpreters on call for high-demand languages.
Compliance documentation: Generally excellent when paired with proper charting. Interpreter identity and credentials can be logged.
Quality ceiling: Highest for complex conversations — informed consent, mental health, end-of-life discussions. A trained human interpreter picks up on non-verbal cues, can clarify cultural nuance, and can flag when a patient seems confused even if they say they understand.
Key limitation: Geography and scheduling. A rural clinic, a community health center, a multispecialty practice — most cannot staff in-person interpreters across the 20–40+ languages a diverse patient panel may require. Even large systems don't have on-demand access for Somali, Pashto, or Haitian Creole at 9 PM.
What it is: A remote human interpreter accessed by telephone, a service known as over-the-phone interpretation (OPI), typically through a language services company (LanguageLine Solutions, Voiance, Certified Languages International, etc.).
Cost: $1.50–$3.50 per minute for most major services; some contracts secure lower rates at volume (industry pricing benchmarks, AAMC Language Services Survey data). Per-minute billing means a 30-minute appointment costs $45–$105 in interpretation alone.
Wait time: Connection typically within 1–3 minutes for high-demand languages; longer for rare languages.
Compliance documentation: Variable — most enterprise contracts include usage reporting, but granular session-level documentation requires manual logging by clinical staff.
Quality considerations: Phone interpretation removes visual cues for both the interpreter and the provider. Audio quality in clinical environments (background noise, speakerphones, handsets passed from hand to hand) degrades accuracy. Studies have found that telephone interpretation, while better than no interpretation, performs below in-person interpretation for complex clinical conversations (Crossman et al., 2010; Journal of General Internal Medicine).
Key limitation: Per-minute billing creates economic pressure to rush. Staff abbreviate conversations. Providers skip teach-back. The financial structure of OPI is directly at odds with thorough clinical communication.
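To make the per-minute arithmetic concrete, here is a minimal sketch that reproduces the figures quoted above from the $1.50–$3.50 rate range. The function and constant names are illustrative assumptions, not part of any vendor's billing system, and real contracts may add connection fees or minimums that this ignores.

```python
# Per-minute rate range quoted above for phone interpretation (USD).
OPI_RATE_LOW = 1.50
OPI_RATE_HIGH = 3.50


def interpretation_cost(minutes: float, rate_per_minute: float) -> float:
    """Return the interpretation charge for one encounter, in USD."""
    if minutes < 0 or rate_per_minute < 0:
        raise ValueError("minutes and rate_per_minute must be non-negative")
    return round(minutes * rate_per_minute, 2)


def cost_range(minutes: float) -> tuple[float, float]:
    """Return the (low, high) charge for an encounter at the quoted rate range."""
    return (
        interpretation_cost(minutes, OPI_RATE_LOW),
        interpretation_cost(minutes, OPI_RATE_HIGH),
    )


if __name__ == "__main__":
    for minutes in (10, 20, 30):
        low, high = cost_range(minutes)
        print(f"{minutes}-minute call: ${low:.2f} to ${high:.2f}")
```

Every extra minute of teach-back shows up directly in the result, which is the pressure described above.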
What it is: A remote human interpreter accessed via video connection, typically on a tablet or dedicated VRI cart.
Cost: $1.00–$3.00 per minute for most services, plus hardware costs (VRI carts run $1,500–$5,000; tablets require mounting, charging infrastructure, and IT management). Some vendors charge monthly platform fees as well.
Wait time: 1–5 minutes for connection; longer for rare languages. Technical failures add unpredictable delays.
Compliance documentation: Platform-level session logs available through most enterprise vendors.
Quality considerations: VRI outperforms phone by restoring visual cues: the interpreter, the provider, and the patient can all see one another. The Joint Commission considers VRI an acceptable modality for most clinical encounters (The Joint Commission, Advancing Effective Communication, 2010). For Deaf patients who use American Sign Language (ASL), VRI is often the standard of care. For LEP patients who use a spoken language, it is a meaningful upgrade over phone.
Key limitation: Hardware dependency and technical fragility. VRI carts need charging, connectivity, and regular maintenance. In high-volume environments, carts are often tied up or unavailable. The economic model — still per-minute — carries the same perverse incentives as OPI.
What it is: Machine learning-based interpretation running in real time, integrated into clinical workflow via software — no hardware cart required, no per-minute billing, no connection delay.
Cost: Subscription-based pricing, typically per provider or per location, which is a fundamentally different economic model from per-minute services. Because an additional conversation carries no marginal cost, there is no financial incentive to shorten patient interactions.
Wait time: Zero. Interpretation begins immediately when the conversation starts.
Compliance documentation: Automated — full conversation transcripts, language identified, interpretation log tied to encounter. No manual logging required.
Quality considerations: AI interpretation has advanced rapidly. For the broad middle ground of clinical encounters — intake, medication counseling, follow-up, discharge education — current models perform at a level that meaningfully supports accurate communication. For the most complex conversations (informed consent for major procedures, psychiatric assessment, detailed surgical counseling), human backup remains best practice.
Key limitation: Not a replacement for human interpretation in every scenario. High-stakes, nuanced conversations benefit from trained human interpreters who can exercise judgment beyond linguistic translation.
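One way a clinic might encode the split described above is a simple default-routing rule. The sketch below is an assumption about how such a policy could look, not a description of any product's behavior: the encounter labels are taken from the text, while the `Modality` enum and `choose_modality` function are hypothetical names. Anything the rule does not recognize falls back to a human interpreter.

```python
from enum import Enum


class Modality(Enum):
    AI = "AI interpretation"
    HUMAN = "human interpreter"


# Encounter types the text describes as the broad middle ground.
ROUTINE_ENCOUNTERS = {
    "intake",
    "medication counseling",
    "follow-up",
    "discharge education",
}

# Encounter types the text flags as needing human backup.
HIGH_STAKES_ENCOUNTERS = {
    "informed consent",
    "psychiatric assessment",
    "surgical counseling",
}


def choose_modality(encounter_type: str) -> Modality:
    """Pick a default modality for an encounter type (hypothetical policy)."""
    key = encounter_type.strip().lower()
    if key in HIGH_STAKES_ENCOUNTERS:
        return Modality.HUMAN
    if key in ROUTINE_ENCOUNTERS:
        return Modality.AI
    # Unrecognized encounter types go to a human interpreter, on the
    # assumption that the more cautious option is the right fallback.
    return Modality.HUMAN
```

A provider should always be able to override the default and request a human interpreter mid-conversation; the rule only sets the starting point.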
| Factor | In-Person | Phone (OPI) | VRI | AI |
|---|---|---|---|---|
| Cost model | Hourly/salary | Per minute | Per minute | Subscription |
| Availability | Scheduled | 24/7, ~1–3 min wait | 24/7, ~1–5 min wait | Instant |
| Languages | Limited to available staff | 200+ | 200+ | Growing rapidly |
| Documentation | Manual | Manual | Platform logs | Automated |
| Visual cues | Full | None | Partial | Varies by implementation |
| Best for | Complex, high-stakes encounters | Urgent access when no alternatives | Routine encounters, ASL | High-volume routine touchpoints |
| Economic incentive | Time = cost | Shorter = cheaper | Shorter = cheaper | Thorough = same cost |
No single modality wins across every scenario. But for clinic administrators trying to make language access consistent and economically sustainable, the math increasingly favors AI for the high-volume, routine communication that dominates clinical workflow — with human interpretation maintained as a resource for complex and high-stakes situations.
The key insight is economic: per-minute billing creates perverse incentives that compromise care quality. A model where thorough communication costs the same as abbreviated communication is structurally better for patients.
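To see where the two pricing models cross over, a clinic can compute a break-even point in interpreted minutes per month. The sketch below assumes a flat monthly subscription; the $500 figure is a placeholder chosen purely for illustration and is not a quoted price from any vendor, and the function names are likewise hypothetical.

```python
def break_even_minutes(monthly_subscription: float, rate_per_minute: float) -> float:
    """Minutes per month at which per-minute spend equals the subscription fee."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    return monthly_subscription / rate_per_minute


def monthly_per_minute_spend(
    encounters: int, avg_minutes: float, rate_per_minute: float
) -> float:
    """Total monthly spend under per-minute billing, in USD."""
    return encounters * avg_minutes * rate_per_minute


if __name__ == "__main__":
    HYPOTHETICAL_SUBSCRIPTION = 500.00  # placeholder value, not a real price
    for rate in (1.50, 3.50):  # per-minute range quoted in the article
        minutes = break_even_minutes(HYPOTHETICAL_SUBSCRIPTION, rate)
        print(f"At ${rate:.2f}/min, break-even is about {minutes:.0f} minutes per month")
```

Plugging in your own contract rate, encounter volume, and a real subscription quote turns this into a quick sanity check before any vendor conversation.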
SpeeTch AI brings real-time AI interpretation into your clinical workflow — instant, documented, and priced to make thorough communication the default, not the exception. See it in action with a free trial at speetch.ai.
Sources:
- National Council on Interpreting in Health Care (NCIHC), interpreter compensation guidance
- AAMC Language Services Survey (reference for phone interpretation rate benchmarks)
- Crossman KL et al. "Interpreters: telephonic, in-person interpretation and bilingual providers." Journal of General Internal Medicine, 2010; 25(Suppl 2): 292–298
- The Joint Commission, Advancing Effective Communication, Cultural Competence, and Patient- and Family-Centered Care: A Roadmap for Hospitals, 2010
- CMS, Interpreter Services under Title VI guidance, 2014