Building multilingual AI chatbots in English, Arabic, and Kurdish

TL;DR: Multilingual AI chatbots are mostly a solved problem in 2026 — except for Kurdish. Here's what works, what breaks, and the four-layer pattern I use in production.

The state of Kurdish in modern LLMs

GPT-4o, Claude 3.5+, and Gemini 2 Pro all handle Kurmanji and Sorani well enough for everyday business tasks. Where they break:

Idiomatic phrases — they pattern-match Arabic or Persian.
Mixed-script content — Sorani Arabic-script vs. Kurmanji Latin-script switching.
Domain-specific terminology — hospitality, medical, legal.
Tone — defaults to formal even when your business is casual.

The four-layer pattern

Detection. Don't trust the user-supplied locale; detect script and language from the message itself. A user might type Sorani in Arabic letters, Kurmanji in Latin letters, or mix English, Arabic, and Kurdish in one conversation.
Normalization. Convert dialects, scripts, and numerals to a consistent internal representation before sending to the LLM.
Prompt engineering with few-shot examples. Show the model 4–6 example interactions in the dialect, tone, and terminology your business uses. This is where roughly 80% of the quality comes from.
Output validation. Check that the model's response is actually in the requested locale (yes, models occasionally answer Sorani questions in Persian). Reject and retry if it doesn't match.
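
Layers 1, 2, and 4 can be sketched in a few dozen lines of Python. This is a minimal sketch under loud assumptions, not a production language detector: the letter sets are rough heuristics I picked for illustration, every function name is hypothetical, and `generate` stands in for whatever LLM client you call. Anything in Arabic script without Sorani-specific letters is labeled "ar", which lumps Persian in with Arabic; that is enough to catch a Persian reply to a Sorani question, but a short Sorani reply that happens to lack those letters will be rejected too.

```python
import unicodedata

# Hypothetical heuristics, not a real language-ID model.
# Letters used in Sorani but not in standard Arabic or Persian: ڕ ڵ ۆ ێ
SORANI_ONLY = set("\u0695\u06B5\u06C6\u06CE")
# Letters common in Latin-script Kurmanji but absent from plain English.
KURMANJI_HINTS = set("êîûçş")

# Layer 2 tables: Arabic-Indic and Extended Arabic-Indic digits to ASCII,
# plus Arabic kaf/yeh mapped to the forms used in Sorani orthography.
DIGITS = {0x0660 + i: str(i) for i in range(10)}
DIGITS.update({0x06F0 + i: str(i) for i in range(10)})
SORANI_LETTERS = {0x0643: "\u06A9", 0x064A: "\u06CC"}


def detect_script(text: str) -> str:
    """Layer 1a: classify by dominant script, ignoring digits and punctuation."""
    arabic = latin = 0
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            arabic += name.startswith("ARABIC")
            latin += name.startswith("LATIN")
    total = arabic + latin
    if total == 0:
        return "unknown"
    if arabic / total >= 0.8:
        return "arabic"
    if latin / total >= 0.8:
        return "latin"
    return "mixed"


def detect_locale(text: str) -> str:
    """Layer 1b: coarse guess: 'ckb', 'kmr', 'ar', 'en', 'mixed', or 'unknown'."""
    script = detect_script(text)
    chars = set(text.lower())
    if script == "arabic":
        return "ckb" if chars & SORANI_ONLY else "ar"
    if script == "latin":
        return "kmr" if chars & KURMANJI_HINTS else "en"
    return script


def normalize(text: str, locale: str) -> str:
    """Layer 2: one internal form for composed characters, digits, and letters."""
    text = unicodedata.normalize("NFC", text).translate(DIGITS)
    if locale == "ckb":
        text = text.translate(SORANI_LETTERS)
    return text


def answer_in_locale(generate, prompt: str, locale: str, max_attempts: int = 3):
    """Layer 4: retry until the reply is detected as the requested locale."""
    for _ in range(max_attempts):
        reply = generate(prompt)
        if detect_locale(reply) == locale:
            return reply
    return None  # caller decides on a fallback, e.g. a canned message
```

The 0.8 threshold is arbitrary: it lets a Sorani reply mention a Latin-script brand name without being flagged as mixed. In practice you would probably swap `detect_locale` for a trained language identifier and keep the surrounding retry loop unchanged.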

RTL and bidi text rendering — the boring part that breaks everything

Half the bugs in multilingual chatbots aren't AI bugs at all. They're bidirectional text rendering bugs. When a user sends "Order #1234 من فضلك", the visual order depends on the base direction of the element that renders it, and the result can differ between browsers. Always test with mixed scripts in production browsers, not just on macOS Safari.
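
One way to make the result less dependent on the surrounding page is to decide the base direction per message and isolate user text before embedding it in other strings. The sketch below assumes a first-strong-character heuristic, which is the same idea HTML's dir="auto" uses; the function names are hypothetical, and the only library call is Python's standard `unicodedata.bidirectional`.

```python
import unicodedata

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def base_direction(text: str) -> str:
    """Return 'rtl' or 'ltr' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like or Arabic-script letter
            return "rtl"
        if bidi_class == "L":  # Latin and other left-to-right letters
            return "ltr"
    return "ltr"  # only digits, spaces, or punctuation: fall back to ltr


def isolate(text: str) -> str:
    """Wrap text so its direction cannot reorder the text around it."""
    return f"{FSI}{text}{PDI}"
```

With this, "Order #1234 من فضلك" resolves to ltr because it starts with a Latin letter, while the same words with the Arabic phrase first resolve to rtl. The message bubble can take its dir attribute from `base_direction`, and `isolate` is useful when a user-supplied name or order reference is interpolated into a template written in another script.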

Cost and latency math

Multilingual LLM calls cost the same per token as English calls — but Arabic and Kurdish typically use 30–50% more tokens for the same meaning. Budget accordingly. Mitigation: use a smaller, cheaper model for triage and routing, and only escalate to the larger model when the question warrants it. This pattern saves 60–80% on token cost in production.
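
To see how the routing saving comes about, here is a back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (prices, request volume, a 40% token overhead, a 20% escalation rate), and it ignores the extra tokens an escalated request spends on the triage call's output.

```python
def monthly_cost(requests: float, tokens_per_request: float,
                 price_per_1k: float, inflation: float = 1.0) -> float:
    """Cost of sending every request to one model.

    inflation is the token overhead relative to English (1.4 means 40% more).
    """
    return requests * tokens_per_request * inflation * price_per_1k / 1000


def routed_cost(requests: float, tokens_per_request: float,
                small_price: float, large_price: float,
                escalation_rate: float, inflation: float = 1.0) -> float:
    """Every request hits the small model; a fraction is escalated to the large one."""
    triage = monthly_cost(requests, tokens_per_request, small_price, inflation)
    escalated = monthly_cost(requests * escalation_rate, tokens_per_request,
                             large_price, inflation)
    return triage + escalated


# Hypothetical inputs: 100k requests/month, 500 English-equivalent tokens each,
# large model at 0.01 per 1k tokens, small model at 0.001 per 1k tokens.
baseline = monthly_cost(100_000, 500, 0.01, inflation=1.4)
routed = routed_cost(100_000, 500, 0.001, 0.01,
                     escalation_rate=0.2, inflation=1.4)
savings = 1 - routed / baseline  # about 0.70 with these inputs
```

With these inputs the baseline is about 700 per month and the routed setup about 210. Note that the token overhead cancels out of the ratio: under this simple model the saving is 1 - (small_price / large_price + escalation_rate), so it depends only on how much cheaper the small model is and how often you escalate.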

What I won't do

Train a Kurdish-specific model from scratch for an SMB. The cost is five figures minimum, the resulting quality is rarely better than prompt-engineered GPT-4o, and the maintenance burden is real. Stick to fine-tuning, prompt engineering, and good evaluations.
