Senior Voice AI Engineer
Brussels - Hybrid (50% on-site)
Freelance Contract: Apr 2026 - Apr 2027
English mandatory (Dutch/French a plus)
We're looking for a Senior Voice AI Engineer to build a real-time voice bot using a hybrid stack (open-source + vendor tools) for ASR, TTS, and VAD.
You will design and optimise end-to-end streaming voice pipelines and ensure low-latency, production-grade performance.
Role & Responsibilities
- Design and build end-to-end real-time voice pipelines (audio ingestion, streaming STT, orchestration, TTS)
- Implement and optimise turn-taking mechanisms (barge-in, interruption handling, endpointing, silence detection)
- Integrate with telephony and communication systems (SIP, WebRTC, CPaaS platforms)
- Ensure low-latency, high-availability performance in production environments
- Develop resilient systems with retries, fallback strategies, and backpressure handling
- Collaborate with Data Scientists to industrialise and deploy ML models
- Contribute to CI/CD pipelines, automated testing, deployment, and monitoring
- Support continuous improvement of model performance and system reliability
Requirements & Experience
- Strong experience in Python, with proficiency in a systems language (Go, Rust, or C++)
- Solid background in streaming systems and real-time audio processing
- Experience with speech technologies (ASR, TTS, VAD pipelines)
- Practical knowledge of telephony and audio constraints (SIP, WebRTC, codecs, 8 kHz streams)
- Experience working with WebSockets, gRPC, or similar streaming protocols
- Ability to evaluate performance using metrics such as p95 latency and word error rate (WER)
- Experience with containerisation (Docker), CI/CD (GitLab), and PostgreSQL
- Proven experience delivering production-grade voice or speech AI systems
Experience Level
- Minimum 4 years in software engineering
- At least 2 years in production voice or speech AI systems
- Experience working in Agile environments
- Exposure to production monitoring, deployment, and system reliability practices
Nice to Have
- Knowledge of speaker diarization and echo cancellation
- Experience with semantic VAD or endpointing models
- Experience in regulated industries (banking, insurance, healthcare)
- Experience integrating with complex or legacy systems
Apply now or reach out directly to Lydia.wills@church-int.com