Open Position

Senior QA Engineer
Full-Stack Automation & LLM Evaluation

Remote (US) | Full-time | Early-Stage

🏢About XY.AI Labs

XY.AI Labs is a venture-backed startup on a mission to remove friction from healthcare. Our agentic AI platform combines purpose-built agents and intelligence layers to streamline complex operational workflows—giving healthcare providers back what matters most: time for care.

💼About the Role

Build an AI-powered, fully automated quality, evaluation, and release platform at a rocket-ship health-tech startup, in support of our rapidly evolving AI agent orchestration platform.

You'll contribute to shaping our QA strategy while driving hands-on execution—designing self-healing test harnesses, spinning up LLM-driven evaluation pipelines, and weaving everything into a one-click GitHub Actions release flow.

We're hunting for a 10x, QA-obsessed full-stack engineer who already wields AI tools (Copilot, auto-test agents, code-gen) to amplify output and who thrives on turning blank slates into world-class systems. This is your canvas to experiment, take smart risks, and help scale quality from day 0 to Unicorn.

Key Responsibilities

🔧AI-Enhanced Full-Stack Test Automation

  • Build and maintain test infrastructure in Python (back-end) and Node/Next.js (front-end), instrumented with Pydantic models for type-safe data flows.
  • Architect GitHub Actions pipelines that run unit, API, UI (Selenium/Playwright) and performance suites on every PR, auto-deploying to GCP or Azure test environments.
  • Implement container-first test environments (Docker/K8s) and infrastructure-as-code for repeatable staging and blue/green releases.

📊Tooling & QA Best-Practice Leadership

  • Codify modern QA processes—shift-left test design, flaky-test quarantine, branch-based quality gates.
  • Craft living dashboards that surface coverage, MTTR, and AI performance drift metrics to Engineering & Product leadership.

🤖LLM Evaluation & AI Quality

  • Design prompt-stress suites, reference-answer graders, and automated hallucination/bias checks to guarantee safe, deterministic agent behavior.
  • Stay tool-agnostic so we can adopt or sunset frameworks quickly as the LLM landscape evolves.

🚀Autonomous Release Governance

  • Act as final quality gatekeeper—own the go/no-go call, publish release notes, and champion progressive rollout strategies (canary, feature flags).

👥Growing the QA Practice

  • Contribute to shaping our QA strategy and help establish peer code-review standards that propagate a culture of testability across all services.
  • As the team grows, help onboard and mentor new QA team members.

🔍What We're Looking For

1. Full-Stack Automation Mastery

5-7 years writing tests and production code; expert in Python (PyTest, FastAPI) and JavaScript/TypeScript (Jest, Playwright). Deep experience with GitHub Actions, Selenium, and Docker/K8s on GCP and Azure.

2. QA Methodologies & CI/CD

Proven design of end-to-end automation frameworks, shift-left testing, and zero-downtime releases. Experience mentoring dev teams in quality practices.

3. AI / LLM Evaluation

Hands-on testing of ML or LLM features—prompt engineering, response grading, drift detection. Understanding of AI safety, bias, and data privacy.

4. Collaboration & Autonomy

Comfortable owning a charter, setting priorities with VP Eng/Product, and communicating trade-offs in plain English. Startup or 0→1 experience is strongly preferred.

You Might Be a Great Fit If...

You Are:

  • A Builder-Owner: you see constraints as creative fuel and bias toward shipping.
  • A Data-Driven Experimenter: you design hypotheses, measure, and iterate.
  • A Mentor at Heart: eager to uplevel peers and cultivate a blameless, high-trust culture.

📈The Growth Path

Join as a key early member of the QA function, with opportunities to grow into senior technical or leadership roles as the team scales. You'll have the chance to explore emerging LLM-Ops tooling and help position XY as an industry pioneer in AI quality engineering.

🚀Why Join Us

🛠️Build quality infrastructure from scratch for a production-grade AI agent system
🏥Build in a space with real-world urgency — billing errors = lost care
🧪Pioneer LLM evaluation techniques and define industry benchmarks
🚀Collaborate with a world-class team — we ship fast
Your quality gates will protect healthcare operations used every single day

Compensation

High-growth opportunity with equity in an early-stage startup.

Equal Opportunity

XY.AI Labs is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Ready to redefine healthcare with AI?

Let's talk! Apply now and join us in building the next-generation AI-powered healthcare platform.

or email us at careers@xy.ai