Q3 2026 — 1 engagement slot open

Production AI engineering, without the partner-shop markup.

The same caliber of work you’d get from an Anthropic or OpenAI implementation partner — advanced RAG, Graph RAG, fine-tuning, multi-step agents — at boutique pricing and without the enterprise sales process.

Fully async. No discovery calls, no demos, no Zoom marathons. You send a brief → we send a written proposal in 48 hours → you decide.

20+ years in practice
68 systems shipped
14 LLM apps in prod
Engagements from $35k

Positioning

Partner-shop expertise.
Boutique-shop pricing.

Anthropic and OpenAI implementation partners deliver excellent work. They also charge $150k+ for engagements that start with a six-week sales process. We deliver the same caliber of production AI work, fully async, at one-third the price.

Criterion | Partner shops | Mini Trends
Engagement floor | $150k – $500k | $35k
Sales process | 4 – 8 weeks of calls | 48-hour written proposal
Senior engineer on the keyboard | Sometimes (juniors hidden in rate) | Always (senior-led, no juniors)
AI capability ceiling | Frontier (RAG, agents, fine-tuning) | Frontier (RAG, Graph RAG, fine-tuning, agents)
Code & infra ownership | Yours, eventually | Yours, day one
Async delivery | No | Yes (entire engagement)

Mini Trends is independent. We are not affiliated with Anthropic or OpenAI; the comparison is on capability and price.

AI capabilities

Anything AI. Done at the production bar.

The full surface area of modern AI engineering, delivered by a senior-led team. If it is shipping in production somewhere in 2026, we have shipped it.

Advanced RAG

Hybrid search (BM25 + dense), cross-encoder re-ranking, query rewriting, document-aware chunking, citation tracking, metadata filtering, multi-tenant retrieval.
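
For readers who want to see what combining BM25 and dense results can look like, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is one common way to merge two ranked lists; it is our illustrative choice here, not a statement of how any particular engagement was built. The document ids and the two hit lists are hypothetical stand-ins for real index results.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a BM25 index and a dense (embedding) index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Documents that rank well in both lists rise to the top, and a cross-encoder re-ranker would typically run over the head of the fused list afterwards.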

Graph RAG

Knowledge graph construction over unstructured data (LangChain, LlamaIndex, custom). Entity extraction, relationship modeling, graph-aware retrieval for multi-hop reasoning.

Fine-tuning & adaptation

LoRA, QLoRA, and full fine-tunes on Llama, Mistral, Qwen, and other open-weight models. Distillation from frontier models to smaller production targets. DPO and preference tuning where it pays.

AI agents

Single-agent and multi-agent systems with tool use, function calling, durable state (Temporal, Inngest), human-in-the-loop checkpoints, and replayable execution.

Evals & observability

Eval harnesses with LLM-as-judge and deterministic scoring. Production tracing with LangSmith, Braintrust, Phoenix. Continuous shadow evals on real traffic. Drift detection.
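
As a rough illustration of the deterministic-scoring half of an eval harness, the sketch below runs a model callable over a fixed set of cases and reports the mean score. Everything named here (`EvalCase`, `exact_match`, `run_eval`, `fake_model`) is hypothetical and exists only for this example; a real harness would call an actual model and usually add an LLM-as-judge scorer alongside.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    expected: str

def exact_match(output: str, expected: str) -> float:
    """Deterministic scorer: 1.0 if the normalized strings match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def run_eval(model: Callable[[str], str], cases: list[EvalCase],
             scorer: Callable[[str, str], float] = exact_match) -> float:
    """Return the mean score of `model` over `cases`."""
    if not cases:
        raise ValueError("need at least one eval case")
    return sum(scorer(model(c.prompt), c.expected) for c in cases) / len(cases)

# Stand-in for a real model call so the sketch runs offline.
def fake_model(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "paris"}.get(prompt, "unknown")

cases = [
    EvalCase("2+2?", "4"),
    EvalCase("Capital of France?", "Paris"),
    EvalCase("Largest planet?", "Jupiter"),
]
score = run_eval(fake_model, cases)
```

Running the same cases on every prompt change and failing the build when the score drops is the simplest form of the regression check described above.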

Multi-modal

Vision (Claude, GPT-4V, Gemini), document understanding, OCR pipelines, audio (Whisper, Deepgram), and voice agents (LiveKit, Pipecat, Vapi).

Embeddings & vector infra

Vector database selection and tuning (pgvector, Qdrant, Pinecone, Weaviate, Turbopuffer). Embedding model selection. Hybrid index design at scale.

Cost & latency optimization

Prompt caching, model routing, speculative decoding, structured output for reliability, response streaming, batch APIs. Bills cut 40–80% on most engagements.
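
To make model routing concrete, here is a minimal sketch that sends short, simple requests to a cheaper tier and everything else to a stronger one. The tier names, prices, and the routing rule are all invented for illustration; they are not real model names or quotes, and the resulting numbers do not substantiate any savings figure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical identifier, not a real model name
    usd_per_1k_tokens: float   # illustrative price, not a real quote

CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.015)

def route(prompt: str, needs_reasoning: bool, max_cheap_chars: int = 2000) -> ModelTier:
    """Send short, simple requests to the cheap tier; everything else to the strong tier."""
    if needs_reasoning or len(prompt) > max_cheap_chars:
        return STRONG
    return CHEAP

def blended_cost(requests: list[tuple[str, bool]], tokens_per_request: int = 1000) -> float:
    """Total illustrative cost of a batch after routing each request."""
    return sum(
        route(prompt, hard).usd_per_1k_tokens * tokens_per_request / 1000
        for prompt, hard in requests
    )

# Hypothetical batch: eight easy requests, two that need multi-step reasoning.
batch = [("summarize this ticket", False)] * 8 + [("plan a multi-step refund", True)] * 2
routed = blended_cost(batch)
all_strong = len(batch) * STRONG.usd_per_1k_tokens
```

In practice the routing signal would come from a classifier or from eval results per task type rather than a hand-set flag, and the comparison between `routed` and `all_strong` is what gets tracked over time.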

Frontier-model integration

Claude (Sonnet 4.6, Opus 4.7), GPT-5, Gemini 2.5 Pro, Llama 4. Provider-agnostic abstractions where it pays; provider-specific features where it does not.

Cloud & infrastructure

We know the clouds, deeply.

AI is half the job. The other half is the infrastructure that runs it. We are equally fluent in AWS, GCP, Azure, and the modern PaaS layer — and equally opinionated about when each is the right call.

AWS — production-deep

Bedrock, ECS Fargate, Lambda, RDS Aurora, S3, EventBridge, Step Functions, OpenSearch, IAM. Well-Architected reviews. Cost optimization on six- and seven-figure monthly bills.

Google Cloud

Vertex AI, Cloud Run, BigQuery, Pub/Sub, Cloud SQL, Firestore, Cloud Functions, Workflows, IAM. Multi-region deployments and BigQuery ML pipelines.

Azure

Azure OpenAI Service, Container Apps, Cosmos DB, Functions, Service Bus, AKS, Key Vault. Comfortable when the enterprise has already standardized here.

Modern PaaS

Vercel, Fly.io, Railway, Render, Cloudflare Workers + R2 + D1, Supabase, Neon. The right pick for early-stage products that need to ship without a DevOps team.

Infrastructure as code

Terraform, Pulumi, AWS CDK, SST. Reproducible environments, drift detection, secrets management, multi-account and multi-tenant patterns.

Observability & SRE

Datadog, Honeycomb, Grafana, OpenTelemetry, Sentry, CloudWatch. SLOs, error budgets, on-call runbooks. Production-grade from day one.

Services

Three lines of work, done well.

We're deliberately narrow. We are greenfield specialists: building net-new AI systems and platforms from zero is what we do best. Brownfield (existing codebase) work is welcome too, especially Engineering Rescue.

Production AI Applications

LLM-native products, advanced RAG, Graph RAG, agents, and fine-tuned models that survive evaluation.

  • Frontier-model integration (Claude, GPT, Gemini)
  • Advanced RAG + Graph RAG over your data
  • Multi-step agents with guardrails
  • Fine-tuning open-weight models
  • Eval harness + observability

Bespoke Software Platforms

Custom SaaS, internal tooling, and B2B platforms built for teams that have outgrown templates.

  • Typed end-to-end (TypeScript)
  • Postgres + your cloud of choice
  • Auth, billing, webhooks, integrations
  • Documented for your team

Engineering Rescue

Inherited a broken codebase from a previous shop, contractor, or AI-assisted build? We diagnose, stabilize, and ship the plan.

  • 10-day technical audit
  • Written stabilization plan
  • Rebuild-vs-rescue recommendation
  • Hands-on remediation sprint
Selected work

Shipped, with metrics.

Most clients prefer not to be named. Industries, architectures, and outcomes are real. References available under NDA on request.

Series B fintech · Production AI

Real-time fraud-triage agent

Problem
Manual review queue creating 4-hour decision lag on flagged transactions.
What we built
Claude-powered triage agent over Postgres + pgvector with 50k labeled cases, eval harness on every prompt change.

94% auto-resolution

p95 latency 12s · est. $1.4M annual savings · NPS 9.6

Healthcare startup · RAG

HIPAA-aware clinical search

Problem
Clinicians spending 40% of charting time searching across 90k unstructured documents.
What we built
Hybrid retrieval (BM25 + dense) over partitioned multi-tenant pgvector, citation tracking, on-prem Bedrock deployment for HIPAA.

37% time saved

Sub-second p50 retrieval · zero PHI egress · expanded to 4 hospital systems

Logistics · Bespoke platform

Replaced 3 vendor tools with one

Problem
Ops team paying $180k/yr for three SaaS tools that didn’t talk to each other.
What we built
Custom internal platform on AWS (ECS + RDS + EventBridge), typed end-to-end, integrated with carriers and ERP.

$160k/yr saved

Shipped in 11 weeks · 4× faster ops cycle · zero downtime cutover

Why companies choose us

The difference between shipping and shipped.

A working demo isn't a working system. Here's how we close that gap.

Senior-only delivery

Every line of code shipped by a senior engineer. No juniors hidden in the rate. Our principals do the work, not subordinates.

AI-augmented, not AI-careless

We use AI tooling aggressively to move faster. We do not ship its first draft. That distinction is the entire job.

Fixed-scope, fixed-price

Tiered engagements with a published floor. No hourly drift, no scope-creep games. You know the cost before we start.

You own everything

Code in your repo from day one. Infrastructure in your cloud account. Documentation for your next hire.

What clients say

On the record, off the record.

Quotes anonymized at the client’s request. Names available under NDA after a brief is exchanged.

We went from "is this even possible" to a working production system in six weeks. The architecture memo alone was worth the engagement fee.

VP Engineering

Series B B2B SaaS

Hired Mini Trends after two failed attempts with offshore vendors. They diagnosed our codebase in two weeks, gave us a written rebuild plan, and shipped it in eight.

CTO

Healthcare startup

The async model worked exactly as advertised. We never had a standing meeting. Their work product was better than agencies that demanded weekly demos.

Director of Engineering

Logistics platform

Process

A complete engagement, zero meetings.

Every step happens in writing. You will never be asked to schedule a discovery call, sit through a demo, or attend a standing weekly. The site does the selling. The work product does the rest.

  1. Phase 01 · Day 0

    You send a written brief

    Use the structured form on this site. Ten minutes, no calls. You describe the problem, budget range, and timeline; we do the rest.

  2. Phase 02 · Day 1 – 2

    We respond in 48 hours, in writing

    Either a written proposal (scope, timeline, fixed price, deliverables) or a plain "no, here is why, here is who we would call instead". Always written. Never a sales call.

  3. Phase 03 · Day 2 – 5

    You sign asynchronously

    Standard SOW, e-sign, deposit invoice. No follow-up calls. If you have questions, email them. We answer in writing within one business day.

  4. Phase 04 · Day 5+

    We ship — async by default

    Code lands in your repo from day one. Weekly written updates with demo links. Slack channel for questions. Optional Loom walkthroughs of major milestones. No standing meetings.

Want to see what an async engagement looks like before committing?

Browse the articles for examples of how we write up architecture decisions, eval results, and shipping plans.

Read sample writing
Engagement & pricing

Three doors. All priced honestly. All published.

Fixed-scope engagements with a published floor. If your problem doesn't fit a tier, we'll write you back and tell you so.

Most common entry point

Discovery Sprint

$35,000

2 weeks · fixed

A new AI feature, a stalled MVP, or a build-vs-buy question that needs a real answer in writing.

  • Architecture audit + written plan
  • Working prototype of one critical path
  • Risk register & dependency map
  • Cost & timeline for the full build
Send a brief
Most chosen

Most engagements land here

Build Engagement

From $75,000

6 – 16 weeks · milestone

Ship a production system. Advanced RAG, an LLM agent, a fine-tuned model, or a full bespoke platform.

  • Senior-led delivery, async by default
  • Eval harness & observability
  • Deployed to your infra; you own it
  • Documentation for your next hire
Send a brief

Fractional CTO

Embedded Principal

$30,000 / mo

3-month minimum

Force-multiplier for an existing engineering team. Fractional CTO with hands on the keyboard.

  • ~60% hands-on engineering
  • Architecture & hiring review
  • Async access in your Slack
  • Optional board / investor reporting
Send a brief

All engagements are fully async. No discovery calls, no demos, no standing weeklies. Send a brief, get a written proposal in 48 hours.

Procurement & payment

Boutique pricing. Enterprise-ready terms.

Forward this section to procurement. Every common question is answered up front so AP and legal can sign off without back-and-forth.

Fixed-scope, fixed-price

All engagements priced upfront in USD. No hourly drift, no scope-creep change orders. The number on the SOW is the number on the invoice.

Standard payment schedule

50% on signing, 50% on delivery for fixed-scope. Milestone-based invoicing for builds. Monthly in advance for retainers.

Wire / ACH preferred

Wire transfer (USD) or US ACH for engagements over $10k. Credit card accepted under $10k via Stripe. Tax invoices issued for international clients.

Net 15 standard, Net 30 on request

Default invoice terms are Net 15. Net 30 available for enterprise procurement on request — no questions asked.

Vendor docs ready

W-9, W-8BEN, COI ($2M general liability + $1M E&O), MSA template, mutual NDA — provided before contract on request.

Procurement-system friendly

We onboard via Coupa, SAP Ariba, Workday, and similar. Standard vendor packets accepted. Tax-exempt billing handled.

Need something custom? Net 60, milestone variations, tax-exempt billing, escrow, or specific SOW templates — note it in the brief and we’ll accommodate within reason.

Field notes, weekly

One essay a week. No promo.

A single field note from production AI engineering: one opinionated take, two or three links worth reading. Unsubscribe with a one-word reply.

FAQ

Common questions, answered up front.

The questions procurement, legal, and AP usually ask. If yours is not here, send a brief and ask in the body — we will answer in writing within one business day.

What does an engagement cost?
Discovery Sprint from $35,000 (2 weeks, fixed). Build engagements from $75,000 (6–16 weeks, milestone). Embedded Principal $30,000/month with a 3-month minimum.
Do you offer discovery calls?
No. Our entire engagement is async: no discovery calls, no demos, no standing meetings. You send a written brief, we respond in writing within 48 hours.
Who actually does the work?
A senior-led practice. The principal answers your brief, runs the engagement, and writes the code. Specialized collaborators are brought in for fine-tuning, security audits, or on-prem deployment phases — never juniors hidden in the rate.
How is this different from an Anthropic / OpenAI partner shop?
Same caliber of work, ~one-third the price ($35k floor vs $150k+), 48-hour written proposal vs 4–8 weeks of sales calls. We are deliberately small and async-first; partner shops are deliberately large and account-team-driven. Both serve real markets.
What payment terms do you support?
Fixed-scope, fixed-price USD. 50% on signing, 50% on delivery for sprints. Wire / ACH preferred over $10k; credit card under $10k. Net 15 standard, Net 30 on request. W-9, COI ($2M GL + $1M E&O), MSA, and NDA available before contract.
Can I get a refund if it does not work out?
Discovery Sprints are first-week refundable. If you do not like the direction by end of week one, we return the deposit, no questions asked.
Why send a brief

Three reasons this is risk-free.

The downside of sending a brief is nothing. The upside is a written, opinionated, senior-engineer take on what you’re building — even if you never engage us.

01

Free written quote in 48 hours

Send a brief, get a fixed-scope SOW in two business days. Costs nothing. Commits to nothing.

02

Honest "no" if it’s not us

If your project isn’t the right fit, we say so in writing — and refer you to two or three shops who would be. We turn down work most weeks.

03

First week refundable

On Discovery Sprints, the first 5 business days are refundable if you don’t like the direction. Full deposit returned, no awkward conversation.

The team

Senior bench. No juniors. No agencies inside the agency.

Mini Trends is a senior-led practice with a small bench of trusted collaborators we bring in for specialized phases — fine-tuning runs, security audits, design sprints, on-prem deployment work. The bench is short and deliberately so. Every engagement has one principal accountable from brief to handover.

Our principal has been writing software professionally since 2004 — through Rails, into React, and now firmly into the age of LLM-native products. Eight years inside production AI before that phrase existed in pitch decks. Past lives include staff roles at venture-backed startups, principal-track consulting for Fortune 500 platform teams, and three of his own companies (one acquired, two graveyard, all instructive).

We are deliberately small and deliberately async. We say no to most engagements because the bar is high and the bench is finite. If you want a fifty-person agency, this is the wrong page. If you want a senior team that delivers in writing and ships production AI without hand-holding — send a brief.

2004
Principal since
4 + 1
Bench size
3 / qtr
Active engagements

Have a real problem and a real budget?
Send a brief.

Ten minutes of writing on your end. Forty-eight hours of consideration on ours. Either a written proposal or an honest no — never a sales call.

  • No discovery calls, ever
  • Written proposal in 48 hours
  • Fixed scope, fixed price
  • You own the code from day one