Senior Voice AI Engineer
Brussels - Hybrid (50% on-site)
Freelance contract: April 2026 - April 2027
English mandatory (Dutch/French a plus)
We're looking for a Senior Voice AI Engineer to build a real-time voice bot using a hybrid stack (open-source + vendor tools) for ASR, TTS, and VAD.
You will design and optimise end-to-end streaming voice pipelines and ensure low-latency, production-grade performance.
Role & responsibilities
- Design and build end-to-end real-time voice pipelines (audio ingestion, streaming STT, orchestration, TTS)
- Implement and optimise turn-taking mechanisms (barge-in, interruption handling, endpointing, silence detection)
- Integrate with telephony and communication systems (SIP, WebRTC, CPaaS platforms)
- Ensure low-latency, high-availability performance in production environments
- Develop resilient systems with retries, fallback strategies, and backpressure handling
- Collaborate with Data Scientists to industrialise and deploy ML models
- Contribute to CI/CD pipelines, automated testing, deployment, and monitoring
- Support continuous improvement of model performance and system reliability
Requirements & experience
- Strong experience in Python, with proficiency in a systems language (Go, Rust, or C++)
- Solid background in streaming systems and real-time audio processing
- Experience with speech technologies (ASR, TTS, VAD pipelines)
- Practical knowledge of telephony and audio constraints (SIP, WebRTC, codecs, 8 kHz streams)
- Experience working with WebSockets, gRPC, or similar streaming protocols
- Ability to evaluate performance using metrics such as p95 latency and word error rate (WER)
- Experience with containerization (Docker), CI/CD (GitLab), and PostgreSQL
- Proven experience delivering production-grade voice or speech AI systems
Experience level:
- Minimum 4 years in software engineering
- At least 2 years in production voice or speech AI systems
- Experience working in Agile environments
- Exposure to production monitoring, deployment, and system reliability practices
Nice to have:
- Knowledge of speaker diarization and echo cancellation
- Experience with semantic VAD or endpointing models
- Experience in regulated industries (banking, insurance, healthcare)
- Experience integrating with complex or legacy systems