Founding AI Wrangler
Artificer Health - Remote (US hours overlap required)
Why this role exists
Prior authorization is broken in a specific, infuriating way. It's not broken because nobody smart has looked at it. It's broken because the people who designed it had no incentive to fix it. Payers built a system optimized for friction, and for thirty years, that friction has been measured in denied claims, delayed procedures, and staff hours spent on hold.
We're building Artificer Health to end that. Not improve it. End it.
AI is central to how we do that. Not as a selling point on a deck slide - as the actual mechanism. The prior authorization process is a machine that takes clinical information in one form and demands it in another, cross-referenced against criteria that change constantly, payer by payer, code by code. That is a machine we can build something smarter against.
But here's the honest version: healthcare AI is also a space full of people who've convinced themselves a model that's right 92% of the time is a product. In a domain where the other 8% is a patient who didn't get their MRI, that is not a product. That's a liability.
This role exists because we need someone who is genuinely excited about what AI can do in this space and genuinely skeptical of every claim - including their own. Someone who has shipped models into production, watched them fail in ways nobody predicted, and built the guardrails and validation layers that make them safe to keep running. Someone who knows when the answer is a transformer and when the answer is a lookup table.
We are a small team. We are pre-revenue. We are deliberately lean. This is a founding seat, which means you will define how AI is built and used at Artificer from the ground up. There is no existing stack to inherit, no previous decisions to defend, and no department to hide in.
What you'll actually do
Extract signal from clinical noise. Medical documents - clinical notes, lab results, imaging reports, discharge summaries - are written for humans reading in context. You will build systems that read them at scale, pull out the clinically relevant pieces, and structure that information in a way the rest of the system can use. You'll work directly with our Clinical Reviewer to make sure what you're extracting is what actually matters.
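To give the flavor (and only the flavor - `call_llm` is a stand-in for whatever model interface we end up using, and the schema is invented for illustration), here is the shape of extraction plus the guardrail that makes it trustworthy:

```python
import json
from dataclasses import dataclass

@dataclass
class Extraction:
    """The structured output we want from an unstructured clinical note."""
    diagnosis_codes: list[str]   # e.g. ICD-10 codes stated in the note
    failed_therapies: list[str]  # prior treatments that didn't work
    evidence_spans: list[str]    # verbatim quotes supporting the above

def extract(note: str, call_llm) -> Extraction:
    """Ask a model for JSON, then refuse to trust it blindly."""
    prompt = (
        "Extract diagnosis_codes, failed_therapies, and evidence_spans "
        "from the clinical note below. Return JSON only. Every extracted "
        "item must be backed by a verbatim quote in evidence_spans.\n\n"
        + note
    )
    extraction = Extraction(**json.loads(call_llm(prompt)))
    # Guardrail: every claimed evidence span must actually appear in the
    # source document, or the whole extraction is rejected.
    for span in extraction.evidence_spans:
        if span not in note:
            raise ValueError(f"unsupported claim: {span!r}")
    return extraction
```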
Match patient data to payer criteria. Every payer has criteria for what they'll approve, and most of those criteria are not machine-readable in any useful sense. You'll build the infrastructure to ingest payer policies, represent them in a form the system can reason over, and evaluate patient records against them. When there's a match, the system should know. When there's a gap, the system should know that too.
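A toy sketch of one possible representation, with an invented policy - real payer criteria are far messier, and how to represent them is one of the design problems you'd own:

```python
from dataclasses import dataclass
from typing import Callable

PatientRecord = dict[str, object]  # a record flattened to checkable fields

@dataclass(frozen=True)
class Criterion:
    """One machine-checkable requirement from a payer policy."""
    id: str
    description: str
    check: Callable[[PatientRecord], bool]

# Invented example: an imaging policy with two deterministic requirements.
lumbar_mri_policy = [
    Criterion("age", "Patient is 18 or older",
              lambda p: p.get("age", 0) >= 18),
    Criterion("conservative", "At least 6 weeks of failed conservative therapy",
              lambda p: p.get("failed_conservative_weeks", 0) >= 6),
]

def evaluate(policy: list[Criterion], patient: PatientRecord):
    """Return met and unmet criteria, so gaps are first-class, not silent."""
    met = [c for c in policy if c.check(patient)]
    unmet = [c for c in policy if not c.check(patient)]
    return met, unmet
```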
Generate clinical justification narratives. When a prior auth submission goes in, it often needs to tell a story: why this patient, why this treatment, why now. You'll build systems that draft that narrative from structured clinical data, and you'll work with the Clinical Reviewer to make sure those drafts are accurate, appropriate, and defensible. You will build evaluation frameworks to catch hallucinations before they leave the building.
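A deliberately crude sketch of that contract - a substring check standing in for whatever entailment or citation checking the real framework would use:

```python
def ungrounded_claims(draft: str, source: dict[str, str]) -> list[str]:
    """Flag sentences in a drafted narrative that trace to nothing in the
    structured record. The contract is the point: every claim is grounded
    in source data, or the draft doesn't go out."""
    flagged = []
    for sentence in draft.split(". "):
        if not any(v.lower() in sentence.lower() for v in source.values()):
            flagged.append(sentence)
    return flagged

draft = ("Patient completed 8 weeks of physical therapy without improvement. "
         "Patient is allergic to penicillin.")
source = {"therapy": "8 weeks of physical therapy without improvement"}
assert ungrounded_claims(draft, source) == ["Patient is allergic to penicillin."]
```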
Build the feedback loop. Approvals and denials are signals. Over time, the system should get smarter about what works with which payer, in which clinical context, for which codes. You'll design the learning infrastructure that makes this possible, and you'll do it in a way that is HIPAA-compliant from the start, not bolted on later.
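A sketch of the kind of signal we mean, with invented field names - the de-identification and compliant storage are the actual work:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class SubmissionOutcome:
    """One observation for the learning loop - de-identified by design,
    so the system learns from features, never from PHI."""
    payer: str
    cpt_code: str
    approved: bool
    denial_reason: str | None
    submitted: date

def approval_rate(outcomes: list[SubmissionOutcome],
                  payer: str, cpt_code: str) -> float | None:
    """What has historically worked with this payer, for this code?"""
    relevant = [o for o in outcomes
                if o.payer == payer and o.cpt_code == cpt_code]
    if not relevant:
        return None  # no signal yet - and the system should know that too
    return sum(o.approved for o in relevant) / len(relevant)
```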
Decide when not to use AI. CPT code lookups do not need a language model. Some payer rules are deterministic. Some validation steps are better as assertions than as inferences. Part of this job is having the judgment to write the if-statement instead of training the model, and being willing to say so out loud.
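Concretely (with an illustrative table entry):

```python
# The wrong tool: asking a language model what CPT code 72148 means.
# The right tool: a table. It costs nothing, it is never creative,
# and it fails loudly instead of confabulating.
CPT_DESCRIPTIONS = {
    "72148": "MRI, lumbar spine, without contrast",  # illustrative entry
}

def describe_cpt(code: str) -> str:
    try:
        return CPT_DESCRIPTIONS[code]
    except KeyError:
        raise LookupError(f"unknown CPT code: {code}") from None  # no guessing
```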
Make the system explainable. When Artificer recommends an approach to a prior auth submission, the clinic needs to understand why. The Clinical Reviewer needs to understand why. "The model said so" is not an answer we accept. You'll build explainability into the system, not as an afterthought, but as a design constraint.
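One sketch of what "design constraint" could mean in code - types invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    criterion: str   # which payer criterion this addresses
    evidence: str    # verbatim quote from the patient record
    source_doc: str  # where the Clinical Reviewer can go verify it

@dataclass(frozen=True)
class Recommendation:
    action: str  # e.g. "submit" or "gather more documentation first"
    findings: tuple[Finding, ...]

    def __post_init__(self):
        # "The model said so" is structurally impossible here: a
        # recommendation with no supporting findings cannot be constructed.
        if not self.findings:
            raise ValueError("a recommendation must carry its evidence")
```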
What you bring
You have shipped AI/ML systems into production in a high-stakes domain. Healthcare experience is a genuine advantage here, not a checkbox. You understand what it means when wrong outputs have consequences beyond a bad metric.
You have hands-on experience with LLMs in production - prompt engineering that holds up under variance, RAG architectures that retrieve accurately, fine-tuning when it's worth it, and evaluation frameworks that tell you honestly whether the system is working. You've dealt with hallucination in contexts where hallucination is dangerous and you've built the layers that catch it.
You are fluent in Python. You know when to use PyTorch versus a hosted API versus a rules engine versus none of the above. You have opinions about vector databases that are grounded in having used them, not in having read about them.
You understand clinical data - HL7, FHIR, and the realities of clinical NLP - well enough to work with it. You don't need to have spent a decade in healthcare, but you need to care enough to learn what you don't know.
You can talk to a clinician and understand what they mean, not just what they say. You can take that conversation and turn it into system behavior. You can then sit with that clinician and verify that what you built is what they meant.
You operate well without a playbook. At Artificer, the role does not come with a defined process for every situation. When something is your problem, you work it until it is resolved. When you hit a wall, you say so clearly and ask for what you need. You don't wait for someone to hand you a plan.
You communicate without jargon. When you say the model is performing well, you can say exactly what that means in terms the rest of the team can evaluate. When there's a risk, you name it plainly.
You are the kind of person who reads a new patio11 essay about the hidden complexity of some mundane-sounding system and thinks "yes, this is exactly why I do what I do." You care about how things actually work underneath, not just how they appear to work.
What we bring
A problem that is genuinely hard and genuinely important. Prior authorization costs the US healthcare system billions of dollars a year in administrative waste, and the cost in delayed and denied care is harder to count but worse. We are not building a productivity tool. We are building infrastructure that changes how care gets authorized.
A founding role with real scope. You will make technical decisions that define how this company operates for years. That is not a figure of speech.
A team that ships. We do not spend months in planning cycles. We build things, learn what does not work, and rebuild them. If you prefer to work that way, this is the right place.
Honest equity in a pre-revenue company. We will tell you what the equity means, what it requires, and what has to go right for it to matter. We will not oversell it.
A culture that will not tolerate spectators. Everyone on this team works the problem in front of them, regardless of whether it is technically in their job description. If that sounds exhausting, this is probably the wrong place. If that sounds like the kind of team you've been looking for, we should talk.
How to apply
Send an email to [email protected] with the subject line AI Wrangler.
Tell us about a production AI system you built that failed in a way you did not predict. What broke, what the consequences were, and what you did about it. No cover letter required. Skip the list of tools you've used. We want to know how you think when things go wrong, because in this domain, things will go wrong, and how you handle that is the job.