Research & Development · Open

AI / LLM Engineer

"AI that supports clinicians — measured, EU-first."

Why this role matters

Hilbi IQ only earns clinical trust if its answers are relevant and provably safe — you build the retrieval and evaluation that make that real, not aspirational.

What you'll work on

  • RAG over clinical knowledge (retrieval, re-ranking, embeddings)
  • Prompts with pseudonymisation and minimum context
  • A production eval harness
  • EU-resident inference

Skills needed

  • Hands-on experience with production LLM systems
  • You evaluate with real eval sets
  • You minimise what you send to a model

Valuable extras

  • Orchestration
  • Tool calling

What we evaluate

  • RAG design and eval rigor
  • Seriousness about data minimisation
  • Awareness of retrieval quality, cost and latency

The assignment

Design a RAG pipeline over a small clinical dataset and define an eval set with metrics.

Full brief is shared after a short intro call.