// AI / Machine Learning Engineer

Production AI for document intelligence.

I design and ship generative-AI and ML systems end to end — extraction, RAG, agents, and classification — right-sized for cost, latency, and reliability, and held to hard accuracy metrics.

About

Gabriel Ott de Medeiros
West Des Moines, Iowa

Hi, I'm Gabriel — an AI/ML engineer who designs and ships production generative-AI and machine-learning systems end to end. My focus is document intelligence: turning unstructured documents into structured, verified data that teams can actually act on, through extraction, RAG, agents, and classification.

What I care about most is rigor over buzzwords. I right-size the model, whether classical ML or an LLM, to the cost, latency, and reliability a problem actually needs, then prove it works with real evaluation: benchmarking against human-verified files, LLM-as-judge, and post-generation citation checks. Most of my work runs on AWS (Bedrock, AgentCore, SageMaker, Redshift) behind REST APIs and event-driven pipelines.

If you're building something where the answer has to be right, not just plausible, I'd love to talk.

Selected work — Merchants Bonding Company

Production generative-AI and ML systems I designed, shipped, and own — each held to a measured result.

1,600+ docs / mo
3 production workflows

Reusable document-AI platform

Architected the end-to-end pipeline — OCR → Bedrock/Claude via AgentCore → Strands structured output → human-review APIs — now powering three production workflows across the business.

AWS Bedrock · AgentCore · Strands · OCR · REST APIs

98% field accuracy
85% less manual entry

Financial-statement extraction service

Captures key line items at 98% field-level accuracy, tracked in production against underwriter corrections. Scaled throughput from ~700 to ~1,200 statements/month; per the company's annual report, the platform drove an 85% reduction in manual data entry and 50% growth in instant endorsements.

LLM extraction · structured output · model benchmarking
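
As an illustration of how a field-level accuracy figure like the one above can be computed, here is a minimal sketch that scores one extracted record against its human-corrected version. It is not the production code: the function name `field_accuracy`, the field names, and the values are all hypothetical, and exact-match comparison is an assumption.

```python
from typing import Mapping


def field_accuracy(extracted: Mapping[str, object], corrected: Mapping[str, object]) -> float:
    """Fraction of fields in the corrected record that the extraction matched exactly."""
    if not corrected:
        raise ValueError("corrected record has no fields to score")
    matches = sum(
        1
        for name, value in corrected.items()
        if name in extracted and extracted[name] == value
    )
    return matches / len(corrected)


# Hypothetical records: field names and amounts are illustrative only.
extracted = {"total_assets": 1250000, "total_liabilities": 730000, "net_income": 98000, "cash": 41000}
corrected = {"total_assets": 1250000, "total_liabilities": 735000, "net_income": 98000, "cash": 41000}

score = field_accuracy(extracted, corrected)  # 0.75 here: 3 of 4 fields match
```

Averaging this score over all corrected statements in a period gives a running accuracy number that can be tracked over time.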

400+ contracts / mo
~60 pages each

Contract summarization with verifiable citations

Automated summaries of 400+ sixty-page contracts a month with page-and-quote citations, grounded by post-generation string-match verification against the source PDF.

RAG · grounding · citation verification
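
Post-generation string-match verification can be sketched roughly as below: every cited quote must appear on the page it cites, after whitespace and case are normalized. This is a simplified illustration, not the shipped implementation. The `page`/`quote` citation shape, the helper names, and the assumption that page text has already been extracted from the PDF are all mine.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so PDF line breaks do not cause false misses."""
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_citations(citations, pages):
    """Return the citations whose quote is not found on the cited page.

    citations: list of dicts with 'page' (1-based) and 'quote' (hypothetical shape)
    pages: list of page texts, where index 0 holds page 1
    """
    failures = []
    for citation in citations:
        index = citation["page"] - 1
        page_text = normalize(pages[index]) if 0 <= index < len(pages) else ""
        quote = normalize(citation["quote"])
        if not quote or quote not in page_text:
            failures.append(citation)
    return failures


# Illustrative data only.
pages = [
    "The Contractor shall complete\nthe Work within 180 days.",
    "Retainage of 5% applies.",
]
citations = [
    {"page": 1, "quote": "complete the Work within 180 days"},
    {"page": 2, "quote": "Retainage of 10% applies"},
]
unverified = verify_citations(citations, pages)  # only the second citation fails
```

A summary whose `unverified` list is non-empty can be regenerated or sent to human review rather than shown as-is.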

F1 0.98–0.99
2,500+ files / mo routed

Document classifier (classical ML)

Built an 8-class classifier — TF-IDF + one-class SVM for OOD detection + logistic regression — routing 2,500+ files a month. Chose classical ML over an LLM for cost, latency, and interpretability.

TF-IDF · SVM · logistic regression · OOD detection
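
The three-stage design above (TF-IDF features, a one-class SVM as an out-of-distribution gate, logistic regression for the in-distribution label) could look roughly like this in scikit-learn. It is a sketch under assumptions: the class name `DocumentRouter`, the `"unknown"` label, the linear kernel, the `nu` value, and the two-class toy data are all illustrative, and the real system has eight classes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import OneClassSVM


class DocumentRouter:
    """Routes a document to a known class, or to 'unknown' if it looks out of distribution."""

    def __init__(self, nu: float = 0.1):
        self.vectorizer = TfidfVectorizer()
        # nu bounds the fraction of training documents treated as outliers (assumed value).
        self.ood_detector = OneClassSVM(kernel="linear", nu=nu)
        self.classifier = LogisticRegression(max_iter=1000)

    def fit(self, texts, labels):
        features = self.vectorizer.fit_transform(texts)
        self.ood_detector.fit(features)          # learns what in-distribution text looks like
        self.classifier.fit(features, labels)    # learns the class boundaries
        return self

    def route(self, text: str) -> str:
        features = self.vectorizer.transform([text])
        if self.ood_detector.predict(features)[0] == -1:  # -1 marks an outlier
            return "unknown"
        return str(self.classifier.predict(features)[0])


# Toy training data, illustrative only.
texts = [
    "balance sheet total assets total liabilities equity",
    "income statement revenue expenses net income",
    "contract agreement parties scope of work payment terms",
    "contract bond obligee principal surety agreement",
]
labels = ["financial_statement", "financial_statement", "contract", "contract"]

router = DocumentRouter().fit(texts, labels)
decision = router.route("statement of total assets and liabilities")
```

Putting the gate before the classifier means an unfamiliar document is routed to a person instead of being forced into the nearest known class.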

3,200+ documents
solo-built & led

Underwriter-opinion reconciliation system

Built a system matching 3,200+ underwriter-opinion documents to database records, scoring section similarity and flagging discrepancies (red/yellow/green) to establish a single source of truth.

similarity scoring · data reconciliation
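
To make the red/yellow/green flagging concrete, here is a minimal sketch that scores each section of a document against the matching database record and assigns a flag. The similarity measure (`difflib.SequenceMatcher` from the standard library), the thresholds, and the section names are stand-ins chosen for illustration. The scoring method actually used is not described here.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; real cut-offs would be tuned on reviewed examples.
GREEN_MIN = 0.95
YELLOW_MIN = 0.80


def section_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical text."""
    return SequenceMatcher(None, a, b).ratio()


def flag(score: float) -> str:
    if score >= GREEN_MIN:
        return "green"
    if score >= YELLOW_MIN:
        return "yellow"
    return "red"


def reconcile(document_sections: dict, record_sections: dict) -> dict:
    """Map each document section to (similarity, flag) against the database record."""
    report = {}
    for name, doc_text in document_sections.items():
        score = section_similarity(doc_text, record_sections.get(name, ""))
        report[name] = (round(score, 3), flag(score))
    return report


# Illustrative data only.
document = {"summary": "Account approved with conditions.", "limits": "Single limit 2M"}
record = {"summary": "Account approved with conditions."}
report = reconcile(document, record)
```

In this example the identical `summary` section comes back green, while `limits`, which is missing from the record, comes back red.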

LLM-as-judge
evaluated in production

Agentic internal analytics chatbot

Built a RAG + tool-calling assistant over dashboards and live database metrics, letting staff find dashboards and pull numbers in plain English.

RAG · tool calling · LLM-as-judge

model → rubric
VP-level analysis

Financial Risk Indicator redesign

Presented analysis to VPs showing no correlation between financial data and claims, driving the pivot from a predictive model to an LLM-with-rubric system applying underwriter guidelines.

data analysis · LLM-with-rubric

~70% token cut
≈3M → 900K

Agent workflow re-architecture (mentored)

Mentored a developer redesigning a quarterly-analysis LLM workflow into parallel task-specific agents — cutting token cost ~70% and eliminating threshold hallucinations.

agent design · cost optimization · mentorship

Projects

Open-source work. Replace each placeholder below with a real project.

[ project-name ] [ One line on the problem it solves ]

[ 2–3 sentences: what you built, the key technical decision, and the result or metric. ]

[ tag ] · [ tag ] · [ tag ] repo demo

[ project-name ] [ One line on the problem it solves ]

[ 2–3 sentences: what you built, the key technical decision, and the result or metric. ]

[ tag ] · [ tag ] · [ tag ] repo demo

[ project-name ] [ One line on the problem it solves ]

[ 2–3 sentences: what you built, the key technical decision, and the result or metric. ]

[ tag ] · [ tag ] · [ tag ] repo demo

Skills

languages & data
Python · SQL · ETL · data modeling
genAI & ML
RAG · agents · structured output (Strands) · prompt engineering · grounding & citation verification · LLM-as-judge eval · time-series forecasting (Prophet) · classification · TF-IDF · SVM · logistic regression
AWS & MLOps
Bedrock · AgentCore · SageMaker · Redshift · S3 · Step Functions · Lambda · REST API design · event-driven pipelines · model deployment
BI & reporting
dashboard development · Logi · SQL Server (SSMS)