Hilbi IQ only earns clinical trust if its answers are relevant and provably safe — you build the retrieval and evaluation that make that real, not aspirational.
Research & Development · Open
AI / LLM Engineer
"AI that supports clinicians — measured, EU-first."
Why this role matters
What you'll work on
- RAG over clinical knowledge (retrieval, re-ranking, embeddings)
- Prompts with pseudonymisation and minimum context
- A production eval harness
- EU-resident inference
Skills needed
- Hands-on with production LLM systems
- You evaluate with real eval sets
- You minimise what you send to a model
Valuable extras
- Orchestration
- Tool calling
What we evaluate
- RAG design and eval rigour
- Seriousness about data minimisation
- Awareness of retrieval quality, cost and latency
The assignment
Design a RAG pipeline over a small clinical dataset and define an eval set with metrics.
Full brief is shared after a short intro call.
