Envision Technology Solutions

Sr. AI Platform Engineer (Guardrails, Observability & Evaluation Infrastructure)

Hybrid

Charlotte, United States

Freelance

16-03-2026


Skills

Communication, Python, CI/CD, Kubernetes, Monitoring, Testing, Training, Databases, GCP, FastAPI, LangChain

Job Specifications

Dear Applicant,

Please let me know if you are interested.

Title – Sr. AI Platform Engineer (Guardrails, Observability & Evaluation Infrastructure)

Location: Charlotte, NC, USA (3 days onsite)

Hire Type – Long-term Contract

Requirement:

Role Overview

We are looking for an AI Platform Engineer to design and build the foundational components that power enterprise‑scale GenAI applications, including data guardrails, model safety tooling, observability pipelines, evaluation harnesses, and standardized logging/monitoring frameworks. This role is critical for enabling safe, reliable, and compliant AI development across multiple use cases, teams, and business units. The goal is to create the common platform services that AI teams will build upon.

Key Responsibilities

1. Guardrails, Safety & Governance

- Design and implement data guardrail frameworks (pre‑processing, redaction, PII/PHI filtering, DLP integration, prompt defenses).
- Build “Model Armor” components such as:
  - Input validation & sanitization
  - Prompt‑injection defenses
  - Harmful content detection & policy enforcement
  - Output filtering, fact‑checking, grounding checks
- Integrate safety tooling (policy engines, classifiers, DLP APIs, safety models).
- Collaborate with Security, Compliance, and Data Privacy teams to ensure frameworks meet enterprise governance requirements.
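To illustrate the kind of guardrail component this role would build, here is a minimal sketch of a pre‑processing PII redaction step. The regex patterns and names are illustrative assumptions only; a production system would integrate a DLP API or trained classifier rather than hand-rolled patterns.

```python
import re

# Illustrative patterns only; real deployments would use a DLP service or NER model.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace matched PII with placeholder tokens; report which kinds were found."""
    found = []
    for kind, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(kind)
            text = pattern.sub(f"[{kind.upper()}_REDACTED]", text)
    return text, found
```

The `found` list feeds the safety flags described in the logging section below, so downstream dashboards can count redaction events per use case.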

2. Observability Frameworks

- Build and maintain observability pipelines using tools like Arize AI (tracing, quality metrics, dataset drift/hallucination tracking, embedding monitoring).
- Define and enforce platform‑wide standards for:
  - Tracing LLM calls
  - Token usage and cost monitoring
  - Latency and reliability metrics
  - Prompt/model version tracking
- Provide reusable SDKs or middleware for engineering teams to adopt observability with minimal friction.
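The low-friction middleware described above can be sketched as a small tracing decorator. The `usage` dict shape and field names are assumptions for illustration, not any vendor's API:

```python
import time
import functools

def trace_llm_call(sink: list):
    """Decorator that records latency and token counts for any function
    returning a dict with a 'usage' key (shape assumed here for illustration)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            latency_ms = (time.perf_counter() - start) * 1000
            usage = result.get("usage", {})
            sink.append({
                "call": fn.__name__,
                "latency_ms": round(latency_ms, 2),
                "prompt_tokens": usage.get("prompt_tokens"),
                "completion_tokens": usage.get("completion_tokens"),
            })
            return result
        return wrapper
    return decorator

trace_log: list[dict] = []

@trace_llm_call(trace_log)
def fake_completion(prompt: str) -> dict:
    # Stand-in for a real model call.
    return {"text": "ok", "usage": {"prompt_tokens": 5, "completion_tokens": 1}}
```

Engineering teams adopt observability by adding one decorator; the sink would be an exporter to Arize AI or a similar platform rather than an in-memory list.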

3. Logging, Monitoring & Telemetry

- Design standardized LLM-specific logging schemas, including:
  - Inputs/outputs
  - Model metadata
  - Retrieval metadata
  - Safety flags
  - User context and attribution
- Build monitoring dashboards for performance, cost, anomalies, errors, and safety events.
- Implement alerting and SLOs/SLIs for LLM inference systems.
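A minimal sketch of such a logging record, using a stdlib dataclass (field names here are illustrative assumptions, not a prescribed standard; in practice this would likely be a Pydantic model validated at the API boundary):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional

@dataclass
class LLMCallRecord:
    """One log line per LLM call: inputs/outputs, model and retrieval
    metadata, safety flags, and user attribution."""
    model: str
    model_version: str
    prompt: str
    response: str
    retrieval_doc_ids: list = field(default_factory=list)
    safety_flags: list = field(default_factory=list)
    user_id: Optional[str] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```

A fixed schema like this is what makes cross-team dashboards and alerting possible: every use case emits the same fields, so cost, error, and safety-event queries work platform-wide.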

4. Evaluation Infrastructure

- Architect and maintain evaluation harnesses for GenAI systems, including:
  - RAG evaluation (faithfulness, relevance, hallucination risk)
  - Summarization/QA evaluation
  - Human-in-the-loop review workflows
  - Automated eval pipelines integrated into CI/CD
- Support frameworks such as RAGAS, G‑Eval, rubric scoring, pairwise comparisons, and test case generation.
- Build reusable tooling for teams to write, run, and track model evaluations.
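A minimal sketch of such a harness: test cases scored by pluggable metric functions, with results collected for tracking. The metric functions and test-case shape are invented for illustration; real harnesses would plug in RAGAS or G‑Eval scorers here.

```python
def exact_match(expected: str, actual: str) -> float:
    """1.0 if the output matches the reference exactly (case-insensitive)."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0

def keyword_coverage(expected: str, actual: str) -> float:
    """Fraction of expected keywords that appear in the output."""
    keywords = expected.lower().split()
    if not keywords:
        return 0.0
    hits = sum(1 for k in keywords if k in actual.lower())
    return hits / len(keywords)

def run_eval(cases, system, metrics):
    """Run every test case through the system and score it with every metric."""
    results = []
    for case in cases:
        output = system(case["input"])
        scores = {name: fn(case["expected"], output) for name, fn in metrics.items()}
        results.append({"input": case["input"], "output": output, **scores})
    return results
```

Because `system` is just a callable and `metrics` a dict of functions, the same harness runs in CI/CD against a stub model or in staging against a live endpoint.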

5. Platform Engineering & Reusable Components

- Develop shared libraries, APIs, and services for:
  - Prompt management/versioning
  - Embedding pipelines and model wrappers
  - Retrieval adapters
  - Common data loaders and document preprocessing
  - Tool/function schemas
- Drive consistency across teams with standards, reference architectures, and best practices.
- Review system designs across use cases to ensure alignment to platform patterns.

6. Collaboration & Enablement

- Partner with AI engineers, product teams, and data scientists to understand cross‑cutting needs and convert them into reusable platform features.
- Create documentation, onboarding guides, examples, and developer tooling.
- Provide internal training (brown bags, workshops) on guardrails, observability, and evaluation frameworks.

Required Qualifications

Technical Skills

- 5–10+ years of software engineering or ML infrastructure experience.
- Strong Python engineering fundamentals (FastAPI, async, typing/Pydantic, testing).
- Experience with model safety/guardrails approaches (prompt-injection defense, PII redaction, toxicity filters, policy enforcement).
- Hands‑on experience with Arize AI, LangSmith, or similar LLM observability platforms.
- Experience creating evaluation frameworks using RAGAS, G‑Eval, or custom rubric systems.
- Strong familiarity with vector databases (Pinecone, Weaviate, Milvus), embeddings, and retrieval pipelines.
- Solid understanding of LLM architectures, tokenization, embeddings, context limits, and RAG patterns.
- Experience with cloud (GCP preferred), Kubernetes/GKE, containers, and CI/CD.
- Strong understanding of security, governance, DLP, data privacy, RBAC, and enterprise compliance requirements.

Soft Skills

- Strong documentation and communication skills.
- Ability to influence engineering teams and standardize best practices.
- Comfortable working across multiple stakeholders—platform, security, ML engineering, product.

Nice to Have

- Experience with LangChain/LangGraph or LlamaIndex orchestration.
- Experience with Guardrails.ai, Rebuff, Protect AI, or similar LLM security tooling.
- Experience with GCP Vertex AI pipelines, Model Monitoring, and Vector Search.
- Familiarity with knowledge graphs, grounding models, and fact‑checking models.
- Experience building SDKs or developer frameworks adopted across multiple teams.
- On‑prem or hybrid AI deployment experience.

About the Company

Envision Technology Solutions (ETS) is a leading staffing and recruitment firm specializing in providing top-tier talent and workforce solutions across industries. With a proven track record, we connect exceptional candidates with exceptional opportunities, helping businesses thrive and individuals achieve their career goals. ETS has a team of highly skilled recruiters and industry experts with in-depth knowledge of different sectors. Leveraging our expertise, we identify, attract, and select the best candidates for clie...