Research & Development · Open

AI / LLM Engineer

"AI that supports clinicians — measured, EU-first."

Why this role matters

Hilbi IQ only earns clinical trust if its answers are relevant and provably safe — you build the retrieval and evaluation that make that real, not aspirational.

What you'll work on

RAG over clinical knowledge (retrieval, re-ranking, embeddings)
Prompts with pseudonymisation and minimum context
A production eval harness
EU-resident inference

Skills needed

Hands-on with production LLM systems
You evaluate with real eval sets
You minimise what you send to a model

Valuable extras

Orchestration
Tool calling

What we evaluate

RAG design and eval rigor
Seriousness about data minimisation
Retrieval quality, cost and latency awareness

The assignment

Design a RAG pipeline over a small clinical dataset and define an eval set with metrics.

Full brief is shared after a short intro call.
