- Company Name: Nabla
- Job Title: Prompt Engineer
- Job Description:
**Job title:** Prompt Engineer
**Role Summary:**
Design, iterate, and evaluate high‑quality prompts for large language models used in clinical note generation and internal quality assessment (“judge AI”). Own the quality layer of the LLM pipeline, ensuring outputs are accurate, safe, and compliant with clinical standards. Work closely with product, machine learning, and clinical teams to translate user needs into precise, testable prompt instructions.
**Expectations:**
* Deliver production‑ready prompt artifacts that are versioned, documented, and rigorously tested.
* Define and monitor measurable quality criteria (accuracy, completeness, style, clinical relevance).
* Continuously iterate on prompts based on real‑world failure modes, edge cases, and stakeholder feedback.
* Bridge the gap between clinical expectations and LLM behavior, maintaining a focus on safety and reliability.
**Key Responsibilities:**
1. **Prompt Design & Iteration** – Create and refine prompts for:
* Clinical note generation.
* Judge models evaluating hallucinations, recall, structure, tone, and adherence to clinical standards.
2. **Evaluation & Quality Management** –
* Establish clear, measurable quality metrics for each output type.
* Build prompt‑based evaluation frameworks to score and compare outputs across models and prompt versions.
* Detect blind spots, regressions, and trade‑offs in model behavior.
3. **Reliability & Rigor** – Treat prompts as production artifacts: version, document, test against known failure cases, and produce reusable patterns and guidelines.
4. **Cross‑Functional Collaboration** –
* Partner with ML engineers to surface model limitations and propose prompt‑level mitigations.
* Work with product and clinical teams to translate qualitative expectations into explicit, testable instructions.
5. **Continuous Improvement** – Revisit prompts proactively and with attention to detail, so that prompt artifacts stay aligned with evolving model capabilities and clinical workflows.
**Required Skills:**
* Proven experience in prompt engineering for production LLM systems (not just research demos).
* Proficiency with LLM APIs (OpenAI, Anthropic, Gemini, etc.).
* Strong scripting or tooling skills (Python, notebooks) for prompt testing and evaluation.
* Excellent written communication, attention to detail, and ability to define criteria and edge cases.
* Analytical mindset to reason about hallucinations, recall failures, ambiguity, and instruction‑following limits.
* Autonomous, self‑directed, comfortable operating with minimal process in a fast‑moving environment.
* Knowledge of, or interest in, healthcare, clinical workflows, or safety‑critical AI systems is a plus.
**Required Education & Certifications:**
* Bachelor’s degree or higher in Computer Science, Engineering, Linguistics, or a related field.
* No specific certifications required, but experience with LLMs and prompt design is mandatory.