2 results for tag "phoenix-evals"
Builds and validates code-first and LLM-as-judge evaluators for AI/LLM applications using Phoenix, with reference workflows for error analysis, axial coding, RAG faithfulness, batch DataFrame evaluation, and experiment runs. Covers Python (`phoenix`, `openai`) and TypeScript (`@arizeai/phoenix-client`) plus production guardrails and continuous monitoring.
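To make the batch DataFrame / LLM-as-judge workflow concrete, here is a minimal sketch using the documented `phoenix.evals` API (`llm_classify` with a built-in judge template). The single example row, the `gpt-4o` model choice, and the assumption that `OPENAI_API_KEY` is set in the environment are all illustrative, not part of the tagged skill itself.

```python
# Minimal LLM-as-judge sketch with phoenix.evals
# (pip install arize-phoenix-evals openai pandas).
import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Each row pairs a query ("input"), its retrieved context ("reference"),
# and the model's answer ("output") -- the column names the built-in
# hallucination template expects. The row content here is made up.
df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "reference": ["Phoenix is an open-source AI observability platform."],
        "output": ["Phoenix is an observability platform for LLM apps."],
    }
)

# llm_classify runs the judge prompt over every row and snaps each
# response onto one of the allowed rails ("factual" / "hallucinated").
results = llm_classify(
    dataframe=df,
    template=HALLUCINATION_PROMPT_TEMPLATE,
    model=OpenAIModel(model="gpt-4o"),  # assumes OPENAI_API_KEY is set
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,  # adds an explanation column for error analysis
)
print(results[["label", "explanation"]])
```

The `explanation` column produced by `provide_explanation=True` is what makes downstream error analysis and axial coding possible: each judge verdict comes with a rationale you can cluster and review.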