Live Cohort · 17 Weeks · 2026 Edition

Applied GenAI
& Agentic AI
Engineering

The developer bootcamp that takes you from LLM fundamentals to a deployed, production-grade agentic system — with open-source models, multimodal AI, adversarial evals, responsible AI, MCP + A2A protocols, and a portfolio that gets you hired.

Reserve Your Seat → Explore Curriculum
17
Live Weeks
170+
Hands-On Hours
16
Named Projects
5+
AI Providers
4
System Design Cases
1
Deployed Capstone
5+ AI Providers & Models
OpenAI GPT-4o
Anthropic Claude
Google Gemini
Mistral AI
Cohere
Llama 3 (Meta)
Mistral 7B / Mixtral
Phi-3 (Microsoft)
SLM Routing
PEFT / LoRA Fine-Tuning
Why This Course Exists

Most AI courses teach
topics. This builds engineers.

📚
What Other Courses Give You
Fifty video modules. Jupyter notebooks. A certificate PDF. No open-source models. No system design. No evaluation science. No portfolio that survives contact with a real hiring manager.
What CoreSmart Gives You
17 weeks of live engineering. 16 named projects. Eval introduced in Week 2 — not Week 7. Evals, reliability, and production practices woven through every module. Open-source models alongside frontier APIs. You graduate with a deployed system and real metrics.
⚠️
The Depth Problem Other Bootcamps Have
Most bootcamps stack too many mental models per week and call it "comprehensive." Students cargo-cult patterns instead of understanding tradeoffs. This course is explicitly paced to avoid that — critical weeks get the space they need.
🔭
Built for 2026 Production Reality
MCP + A2A protocols. Open-source fine-tuning. Multimodal embeddings. PII detection. Stateful memory. Reliability engineering. FinOps. System design case studies. When agents don't make sense. All of it.
What You'll Be Able to Do

Eight skills that
get you hired.

Not topics you've heard of. Things you can prove you've done — with repos, evals, and deployed systems to show for it.

01
🏗
Build production LLM applications
Streaming APIs, structured outputs, tool calling, multimodal ingestion, Pydantic enforcement, and cost-aware design across proprietary and open-source models.
02
🔍
Design and evaluate RAG systems
Context engineering, multimodal embeddings, reranking, groundedness evals, adversarial testing — with evaluation instinct from Week 2 onward.
03
🤖
Engineer agentic systems end-to-end
Tool loops, state management, multi-agent orchestration, stateful memory, MCP servers, A2A-compatible services — and knowing when NOT to use agents.
04
🛡
Secure, govern, and align AI systems
Prompt injection defense, PII detection, RBAC, HITL design, responsible AI frameworks, Constitutional AI concepts, and bias awareness — in production.
05
🧪
Build AI reliability systems
Hallucination debugging decision tree, prompt versioning, model upgrade control, Pydantic enforcement, parsing fallbacks, and user feedback loops.
06
🔩
Work with open-source and small models
Fine-tune with PEFT/LoRA, run local inference with Ollama, route between frontier and SLMs, and know when open-source wins on cost, latency, or data control.
07
🚀
Deploy, monitor, and optimise
Docker, CI/CD, OpenTelemetry, cost dashboards, prompt caching, SLM routing, versioning, async pipelines, and event-driven self-healing patterns.
08
📐
Design AI systems from first principles
Four system design case studies. You'll answer "walk me through how you'd design X" for real-world AI systems — not just the model call, but the entire architecture.
Free Prep Track — Modular, Self-Paced, Included

Start from where
you actually are.

Three independent modules. Take only what you need. Students with ML/maths backgrounds can skip ahead and push harder on Weeks 4–7 and 11–12.

Prep Module A
Python, APIs & Dev Tooling
  • Python: functions, classes, async/await, type hints, testing
  • REST APIs, JSON, HTTP, FastAPI basics, Pydantic v2 intro
  • Git, GitHub, branching, PRs, CLI workflows
  • Virtual environments, pip, Docker basics
Project
DocParser API
FastAPI service that accepts a document and returns validated JSON metadata — title, word count, language, named entities.
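To make the target concrete, here is a minimal sketch of a DocParser-style endpoint; the route, field names, and placeholder heuristics are illustrative, not the official project spec.

```python
# Illustrative DocParser-style endpoint (route and fields are assumptions,
# not the official project spec).
from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()

class DocMetadata(BaseModel):
    title: str
    word_count: int
    language: str
    named_entities: list[str]

@app.post("/parse", response_model=DocMetadata)
async def parse(file: UploadFile) -> DocMetadata:
    text = (await file.read()).decode("utf-8", errors="ignore")
    words = text.split()
    return DocMetadata(
        title=words[0] if words else "untitled",  # placeholder heuristic
        word_count=len(words),
        language="en",            # a real version would detect the language
        named_entities=[],        # a real version would run NER here
    )
```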
Prep Module B
Data, Docker & ML Concepts
  • Pandas, regex, text preprocessing, similarity concepts
  • Docker: images, containers, Compose, local pipelines
  • Transformers, embeddings, vectors — ML for builders, not researchers
  • Notebooks for exploration; the notebook/production divide
Project
ChunkForge Pipeline
Containerised pipeline that cleans, chunks with metadata, and structures documents — runs fully in Docker.
Prep Module C
AI-Assisted Developer Workflow
  • AI coding tools: Cursor, GitHub Copilot, Claude Code — practical differences
  • AGENTS.md: writing repo context files for AI systems
  • Prompt discipline; review habits for AI-generated code
  • Repo hygiene: linting, tests, structured commits, CI baseline
Project
DevReady Kit
Repo setup pack with linting, test structure, CI baseline, AI tool prompts, and a working AGENTS.md.
Full 17-Week Curriculum

Every week builds
on the last.

From LLM foundations to deployed, governed, cost-optimised agentic systems. Critical weeks get the breathing room they need. Evaluation instinct starts in Week 2.

🧪
Evaluation Instinct Starts in Week 2 — Not Week 7
Every module from Week 2 onward asks: "How do you know this is working?" A lightweight eval primer in Week 2 establishes the instinct before students build their first system. BreakRAG™ in Week 7 is the mature harness — not the first introduction to the concept.
System Design Case Studies — Woven Across the Course
Four 90-minute architecture sessions. Each produces a one-page architecture diagram and decision memo students use in interviews. AI is not just an LLM call — these sessions prove it.
Week 01
How Would You Design ChatGPT?
Load balancing, context management, streaming infra, abuse detection, cost attribution, multi-tenant architecture
Week 06
How Would You Design GitHub Copilot?
Code embedding, retrieval pipeline, latency constraints, large-repo context management, feedback loops
Week 11
Production Multi-Agent System Design
Orchestration at scale, agent isolation, handoff protocols, trace architecture, failure recovery, cost visibility
Week 14
AI Observability System Design
Token-level cost tracking, latency percentiles, model drift detection, eval regression, incident response
Phase 1 — Foundations, RAG & Evaluation (Weeks 1–8)
Week 01
Modern AI Product Anatomy, LLM Internals & RLHF Concepts
Opens with live code in hour one — streaming API call, structured output, basic tool call. RLHF/alignment is a 30-min async reading, not a live lecture slot.
Hands-On: First LLM App + Streaming · Model Layer Architecture · Retrieval + Tool + Memory Layers · Latency-Cost-Quality Triangle · RLHF & Alignment (Conceptual) · Constitutional AI Concepts · Why Models Behave Differently
Case Study: How Would You Design ChatGPT?
ReleaseBot — Release Intelligence Assistant. Converts raw update notes into structured changelog + announcement via streaming FastAPI. Week ends with a one-page architecture diagram from the case study.
Foundation
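The "live code in hour one" moment looks roughly like this; a minimal streaming sketch using the OpenAI Python SDK, where the model name and prompt are placeholders.

```python
# Minimal streaming chat completion with the OpenAI Python SDK.
# Requires OPENAI_API_KEY in the environment; model and prompt are placeholders.
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise these release notes: ..."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # tokens arrive incrementally
```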
Week 02
LLM Mechanics, Model Choice & Evaluation Primer
Eval primer introduced here. Every project from now on asks: "How do I know this is working?" Students build a golden dataset and pass/fail threshold for IntentIQ this week.
Transformer Behaviour in Practice · Context Windows · Prompting vs RAG vs Fine-Tuning vs Agents · Model Selection Framework · Eval Primer: Golden Datasets + Thresholds · Classification Accuracy + Pass/Fail · Frontier vs Open-Source Decision Criteria · SLM vs LLM
IntentIQ — Issue Classification Engine. Categorises messages with typed confidence scores. Benchmarks two providers + one open-source model. Includes first golden dataset and accuracy measurement — eval starts here.
Foundation + Eval
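A minimal sketch of the Week 2 eval primer idea: a tiny golden dataset, an accuracy measurement, and a pass/fail threshold. The classifier here is a stand-in for the real IntentIQ model call.

```python
# Golden-dataset check in the spirit of the Week 2 eval primer.
# `classify_intent` is a placeholder for the real IntentIQ model call.
GOLDEN_SET = [
    ("The app crashes when I upload a PDF", "bug"),
    ("Please add dark mode", "feature_request"),
    ("How do I reset my password?", "question"),
]
PASS_THRESHOLD = 0.9  # illustrative; the real threshold is set per project

def classify_intent(message: str) -> str:
    # Placeholder heuristic; in the course this is an LLM or SLM call.
    if "crash" in message.lower():
        return "bug"
    if "add" in message.lower():
        return "feature_request"
    return "question"

def run_eval() -> float:
    correct = sum(classify_intent(msg) == label for msg, label in GOLDEN_SET)
    return correct / len(GOLDEN_SET)

if __name__ == "__main__":
    accuracy = run_eval()
    print(f"accuracy={accuracy:.2f}")
    assert accuracy >= PASS_THRESHOLD, "eval regression: below pass/fail threshold"
```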
Week 03
Prompt Engineering, Structured Outputs, Tool Calling & Streaming
The week runs as three progressive sessions: Day 1 — prompting + structured outputs. Day 2 — tool calling. Day 3 — streaming + retry logic. TicketStream ties all three together at week's end.
System Prompt Design · Few-Shot Patterns · Pydantic v2 Schemas · Output Enforcement Progression: Prompt → JSON Mode → Pydantic → Tool Calling → Structured API · SSE Streaming · Retry & Validation Logic · Parsing Unformatted LLM Responses
TicketStream — Structured Intake Bot with Streaming. Converts messy messages into validated Pydantic objects via SSE streaming. Teaches the full output-enforcement progression from soft prompt to strict schema.
Core Skill
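The output-enforcement progression ends in code like this; a minimal sketch, assuming an illustrative Ticket schema rather than the official TicketStream spec.

```python
# Validate an LLM response against a Pydantic v2 schema, then fall back to
# extracting the first embedded JSON object if strict validation fails.
import re
from pydantic import BaseModel, ValidationError

class Ticket(BaseModel):
    summary: str
    severity: str
    component: str

def parse_ticket(raw: str) -> Ticket:
    try:
        return Ticket.model_validate_json(raw)        # strict path
    except ValidationError:
        match = re.search(r"\{.*\}", raw, re.DOTALL)   # fallback: pull out embedded JSON
        if match:
            return Ticket.model_validate_json(match.group(0))
        raise

raw_response = 'Sure! Here is the ticket: {"summary": "Login fails", "severity": "high", "component": "auth"}'
print(parse_ticket(raw_response))
```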
Week 04
Multimodal Ingestion, Embeddings & Vector Databases
Embedding Models · Chunking Strategies · Metadata Design · Hybrid Retrieval · GPT-4o Vision + Claude Vision · PDF + Image + Table Ingestion · CLIP / OpenCLIP Embeddings · Nomic Embed (Multimodal) · Index Maintenance
KnowledgeVault — Multimodal Runbook Search. Semantic search over docs, PDFs with figures, tables, and screenshots. Uses CLIP for visual content alongside text embeddings. Most RAG courses only cover plain text.
RAG
Week 05
RAG Foundations & Grounded Answers with Citations
RAG Pipeline Architecture · Citation Generation · Fallback Behaviour · Answer Confidence · Hallucination Controls · Conversation State
CitationRAG — Engineering Knowledge Assistant. Answers with citations, handles "I don't know" gracefully. Groundedness measured on golden dataset from Week 2 — students see the improvement arc.
RAG
Week 06
Advanced RAG & Context Engineering
Query Transformation (HyDE) · Step-Back Prompting · Reranking · Multi-Query Retrieval · Context Compression · Context Engineering Principles · Before/After Benchmarking
Case Study: How Would You Design GitHub Copilot?
RAGOptimizer — Retrieval Quality Lab. Improve Week 5 measurably — query transformation, reranking, compression. Document before/after on latency, recall, groundedness. Copilot case study architecture feeds the optimisation decisions.
Context Eng.
Week 07
Evaluation Science, Adversarial Testing & Experimentation
BreakRAG™ is the mature eval harness. Students already have eval instinct from Week 2. This week adds: inter-rater reliability, evaluator bias in LLM judges, prompt sensitivity, statistical significance in A/B tests, and dataset leakage prevention.
Inter-Rater Reliability · Evaluator Bias in LLM Judges · Prompt Sensitivity of Judges · Statistical Significance in A/B Tests · Dataset Leakage Prevention · Golden Datasets · Groundedness Metrics · Regression Testing · Red-Teaming · LLM-as-Judge Pipeline · Multi-Model Scoring (5 Providers) · User Feedback → Eval Loop
BreakRAG™ — Adversarial Eval & Red-Team Harness. Full evaluation pipeline with adversarial queries, multi-model LLM-as-judge scoring across 5 providers, regression CI, and user feedback wiring. The rigorous version of what started in Week 2.
Eval Science
Week 08
Fine-Tuning, Open-Source LLMs, LoRA Mechanics & SLM Specialisation
Includes a 30-min "what LoRA is actually doing" explainer: low-rank decomposition, which weight matrices, rank 8 vs rank 64 tradeoffs. Not full maths — genuine mechanical intuition before touching the code.
When Fine-Tuning Wins (Decision Framework) · Dataset Curation · LoRA Mechanics (What's Actually Happening) · PEFT / LoRA on Llama 3 / Mistral · Hugging Face Training Pipeline · Ollama Local Inference · OpenAI Fine-Tuning API (GPT-4o-mini) · SLM Routing: When Smaller Wins · Proprietary vs Open-Source Cost/Control/Privacy
SpecialistTuner — Dual Fine-Tuning Lab + Decision Memo. Fine-tune via PEFT/LoRA on Llama 3 AND OpenAI fine-tuning API. Benchmark both against base prompting on accuracy, latency, cost, and data privacy. Decision memo: "when does each approach win?" — the rarest deliverable in any AI course.
Open Source
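What the PEFT/LoRA setup looks like in practice; a minimal sketch with Hugging Face PEFT, where the model id, rank, and target modules are illustrative choices rather than the lab's exact configuration.

```python
# LoRA adapter setup sketch with Hugging Face PEFT. Model id, rank, and target
# modules are illustrative assumptions, not the official lab configuration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

lora = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # a small fraction of the base model's weights
```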
Phase 2 — Agentic AI Engineering (Weeks 9–13)
Week 09
Agentic AI Foundations & Agent State Management
Students build the agent loop from first principles before touching any framework. The "when NOT to use agents" case study is weighted equally to the build exercise — the rejection decision is practised explicitly.
Workflows vs Agents (With Explicit "Reject" Exercise) · Tool Loops & Stop Conditions · Agent Design Patterns (ReAct, Reflection) · Short-Term State (Session Context, Tool History) · Workflow State (Step Tracking, Checkpoints) · Persistent State (Resume After Failure) · State Diagrams as Deliverables
OpsAssist — Tool-Using Ops Agent + State Diagram. Single agent with defined tool schemas, stop conditions, and retry logic. Full state diagram showing short-term/workflow/persistent layers is a required deliverable. Includes the "reject agents" decision exercise.
Agentic
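The "agent loop from first principles" is small enough to sketch in a few lines: a tool registry, a bounded loop, and an explicit stop condition. The model call here is a placeholder.

```python
# Bare agent loop: bounded iterations, explicit stop condition, tool registry.
# `call_llm` is a placeholder for a real model call that returns either a tool
# request or a final answer.
import json

TOOLS = {
    "lookup_runbook": lambda query: f"runbook entry for {query!r}",
}
MAX_STEPS = 5

def call_llm(messages: list[dict]) -> dict:
    # Placeholder: a real implementation decides between a tool call and a final answer.
    return {"type": "final", "content": "No further action needed."}

def run_agent(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(MAX_STEPS):                           # hard stop condition
        step = call_llm(messages)
        if step["type"] == "final":
            return step["content"]
        result = TOOLS[step["tool"]](step["arguments"])  # execute the requested tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped: step budget exhausted."

print(run_agent("Why is the deploy failing?"))
```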
Week 10
Multi-Agent Orchestration, Long-Term Memory & When NOT to Over-Engineer
Explicit "reject multi-agent" case study: students receive a scenario and must argue against multi-agent as the right solution. The complexity-rejection decision is practised as rigorously as the complexity-adoption decision.
Routing & Handoffs · Manager-Worker Patterns · Graph Orchestration (LangGraph) · When NOT to Use Multi-Agent (Case Study) · OpenTelemetry Tracing · In-Context Memory · External Memory (Vector Store) · Episodic Memory (Conversation Summaries) · Procedural Memory (Learned Preferences) · Memory Lifecycle Management
TriageFlow™ — Incident Multi-Agent Workflow + Memory. Triage → knowledge → action agents with full trace replay, human approval gate, and four-layer memory (Redis session, pgvector long-term, episodic summaries, procedural preferences).
Agentic
Week 11
Agent Interoperability — MCP, A2A & AGENTS.md
MCP Architecture & Tool Design · Publishing MCP Servers · A2A Protocol (Agent Cards) · Client-Remote Architecture · Task Lifecycle & SSE Streaming · JWT/OIDC Security · AGENTS.md Spec Files
Case Study: Production Multi-Agent System Design
AgentMesh™ — A2A Remote Specialist + Shared MCP Server. Publish a signed Agent Card, expose A2A-compatible service with SSE + JWT, AND publish one MCP server the cohort consumes. Live inter-student A2A network during the session.
MCP + A2A
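Publishing an MCP tool server can be as small as this sketch using the official Python SDK's FastMCP helper; the tool itself is a toy stand-in for the Week 11 runbook tools.

```python
# Minimal MCP tool server sketch using the Python SDK's FastMCP helper.
# The tool is a toy placeholder for the real runbook-search tools.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("runbook-tools")

@mcp.tool()
def search_runbooks(query: str) -> str:
    """Return the most relevant runbook snippet for a query (placeholder logic)."""
    return f"No runbook matched {query!r} in this demo index."

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```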
Week 12
Guardrails, PII Detection, Responsible AI & HITL
Prompt Injection Defense · Tool Misuse Prevention · Approval Checkpoints + Audit Logging · PII Detection (Microsoft Presidio) · NER-Based Entity Scrubbing · Output PII Filtering · PII in RAG Pipelines · Responsible AI Framework · Bias & Fairness Awareness · HITL Design Patterns · GDPR/CCPA in AI Systems
GuardianAI™ — Adversarial Lab + PII Shield + Responsible AI Checklist. Students attack and defend in pairs (prompt injection, exfiltration, tool misuse). Add Presidio PII scrubbing on inputs + outputs. Complete formal responsible AI checklist with HITL design, bias assessment, GDPR notes.
Security + Ethics
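The PII shield in GuardianAI™ builds on Microsoft Presidio; a minimal sketch of the detect-then-anonymise step (Presidio also needs a spaCy language model installed).

```python
# PII scrubbing sketch with Microsoft Presidio: detect entities, then anonymise
# the text before it reaches the model or the logs.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Contact Jane Doe at jane.doe@example.com or +1 555 0100."
findings = analyzer.analyze(text=text, language="en")       # PERSON, EMAIL_ADDRESS, PHONE_NUMBER, ...
scrubbed = anonymizer.anonymize(text=text, analyzer_results=findings)

print(scrubbed.text)  # e.g. "Contact <PERSON> at <EMAIL_ADDRESS> or <PHONE_NUMBER>."
```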
Week 13
Developer Tooling Agents, Streaming Operator UI & UX Feedback Loops
Repo Intelligence Agents · CI/CD Context + PR Assistance · Streaming Chat UI (SSE + React) · Citation Rendering · Tool Trace Display · Approval Action UI · Thumbs Up/Down + Correction Flows · Feedback Storage → Eval Harness · Online vs Offline Eval Loop
WorkbenchAI™ — Streaming Developer UI + Feedback Loop. Portfolio-quality UI: live citations, tool traces, approval actions. Wires the full user feedback loop — thumbs/corrections → feedback database → BreakRAG™ eval harness. Closes the online/offline eval cycle.
Dev Tools
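The streaming UI consumes a Server-Sent Events endpoint along these lines; a minimal FastAPI sketch with a stubbed token generator in place of the real model call.

```python
# SSE endpoint sketch of the kind a streaming chat UI consumes.
# The token generator is a stub; in practice it wraps a streaming model call.
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def token_stream(prompt: str):
    for token in ["Deploy ", "failed ", "because ", "of ", "a ", "missing ", "secret."]:
        yield f"data: {token}\n\n"      # SSE frame: "data: <payload>\n\n"
        await asyncio.sleep(0.05)       # simulate generation latency

@app.get("/chat/stream")
async def chat_stream(prompt: str):
    return StreamingResponse(token_stream(prompt), media_type="text/event-stream")
```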
Phase 3 — Production, Reliability & FinOps (Weeks 14–16)
Week 14
Deployment, Observability, Versioning & Async Pipelines
Docker + CI/CD · Cloud Deployment · OpenTelemetry + Tracing · Monitoring + Alerting · Model + Prompt Versioning · Dataset Versioning · Async Background Jobs · Batch Processing Pipelines
Case Study: AI Observability System Design
DeployCore — Production Deployment with Background Jobs. Cloud-deployed agent with OpenTelemetry, async ingestion/eval pipelines, and versioning controls. Observability case study architecture grounds the design decisions.
LLMOps
Week 15
AI Application Reliability Engineering
This week gets its own space — not compressed with deployment. The unglamorous work that takes up 30% of production AI engineering time and almost no course teaches properly.
Hallucination Debugging Decision Tree · Prompt Versioning in Git · Model Upgrade Control (A/B, Frozen Test Sets) · Pydantic Response Enforcement Patterns · Parsing Fallback Strategies · Retry + Backoff Architecture · When Structured Output Breaks
ReliabilityKit™ — AI Reliability Engineering Toolkit. Hallucination debugging decision tree, Git-based prompt versioning workflow with regression checks, model upgrade protocol with frozen test sets and A/B rollout, Pydantic enforcement patterns, and fallback strategies for when structured output breaks entirely.
Reliability
Week 16
FinOps, Prompt Caching, SLM Routing & Event-Driven Self-Healing
You won't find cost-reduction targets in this brochure. Students learn to measure and optimise their actual costs on their actual workload — the measurement skill, not a promised percentage.
Prompt Caching (Anthropic/OpenAI) · Intelligent SLM/LLM Routing · Per-Request Cost Budgets · Cost Dashboards + Measurement · Event-Driven Webhook Triggers · Agentic SRE Patterns · Pause/Resume + Human Approval
CostGuard™ — FinOps Router + Self-Healing SRE Bot. Cost-aware routing layer with prompt caching and a weekly cost measurement report. Plus an event-driven SRE agent that listens for webhook alerts, diagnoses root cause, and proposes remediation with human approval.
FinOps
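A cost-aware router in miniature: classify the request, pick a small or frontier model, and respect a per-request budget. Model names, prices, and the complexity heuristic are illustrative assumptions.

```python
# Cost-aware routing sketch: cheap requests go to a small model, everything else
# to a frontier model, with a per-request budget guard. Prices and heuristics
# are placeholders, not real rates.
FRONTIER_MODEL = "gpt-4o"
SMALL_MODEL = "llama-3-8b-instruct"
PRICE_PER_1K_TOKENS = {FRONTIER_MODEL: 0.005, SMALL_MODEL: 0.0002}  # placeholder prices
REQUEST_BUDGET_USD = 0.01

def estimate_cost(model: str, prompt: str, max_output_tokens: int = 500) -> float:
    tokens = len(prompt) / 4 + max_output_tokens        # rough token estimate
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]

def route(prompt: str) -> str:
    simple = len(prompt) < 400 and "analyse" not in prompt.lower()  # toy complexity check
    model = SMALL_MODEL if simple else FRONTIER_MODEL
    if estimate_cost(model, prompt) > REQUEST_BUDGET_USD:
        model = SMALL_MODEL                              # degrade rather than blow the budget
    return model

print(route("Summarise this short changelog entry."))
```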
Phase 4 — Capstone Hardening & Demo Day (Week 17)
Week 17
Final Capstone Hardening, Demo Day & Hiring Pack
Production Polish · Full Red-Team Review · Eval + Reliability Report · Architecture Case Study · Responsible AI Checklist · Cost Analysis · PortfolioAgent (Career Digital Twin) · Technical Storytelling
AgentForge™ — Final Demo Day. Live demo, architecture review, eval report, ReliabilityKit™ walkthrough, security checklist audit, cost analysis. Bonus: PortfolioAgent — A2A-compatible, knows your capstone, answers architecture questions, deployed as a shareable link for hiring managers.
Demo Day
All 16 Named Projects — Prep Track + 17 Weekly Deliverables
Prep A
DocParser API
FastAPI Metadata Service
Prep B
ChunkForge Pipeline
Docker Chunking Pipeline
Prep C
DevReady Kit
Repo Setup + AGENTS.md
Week 01
ReleaseBot
Release Intelligence Assistant
Week 02 ★
IntentIQ
Classifier + First Eval Dataset
Week 03
TicketStream
Structured Intake + Streaming
Week 04
KnowledgeVault
Multimodal Runbook Search
Week 05
CitationRAG
Engineering Knowledge Assistant
Week 06
RAGOptimizer
Retrieval Quality Lab
Week 07 ★
BreakRAG™
Adversarial Eval Harness
Week 08 ★
SpecialistTuner
Dual Fine-Tuning Lab + Memo
Week 09
OpsAssist
Agent + State Diagram
Week 10 ★
TriageFlow™
Multi-Agent + Long-Term Memory
Week 11 ★
AgentMesh™
A2A Network + MCP Server
Week 12 ★
GuardianAI™
Security + PII + Responsible AI
Week 13 ★
WorkbenchAI™
Streaming UI + Feedback Loop
Week 14
DeployCore
Deployment + Async Jobs
Week 15 ★
ReliabilityKit™
Hallucination Debug + Versioning
Week 16 ★
CostGuard™
FinOps Router + SRE Bot
Week 17
AgentForge™
Demo Day + Portfolio Agent
Signature Projects

Six builds that make
your GitHub stand out.

Real engineering problems. Real architectures. Real things that break. Each one a named tool you own after graduation.

★ Differentiator
Week 11 — Agent Interoperability
AgentMesh™
A2A Remote Specialist + Live Cohort Network + MCP Server
Publish a signed Agent Card, expose your agent as A2A-compatible with SSE streaming and JWT auth, AND publish one MCP tool server. During the live lab, one student's orchestrator discovers and delegates to another's remote agent in real time. The system design case study shows how this scales to enterprise. No other bootcamp runs a live inter-student A2A network.
A2A Protocol v0.3 · MCP SDK · JSON-RPC · SSE · JWT/OIDC · FastAPI
✦ Resume: "Built A2A-compatible remote agent service in live 40-student cohort network — signed Agent Cards, SSE streaming, JWT auth"
★ Differentiator
Week 15 — Reliability
ReliabilityKit™
Hallucination Debugging + Prompt Versioning + Model Upgrade Control
The unglamorous work that takes up 30% of production AI engineering time and almost no course teaches. A hallucination debugging decision tree (retrieval gap? prompt ambiguity? model mismatch? parsing failure?), a Git-based prompt versioning workflow with regression checks, a model upgrade protocol with frozen test sets and A/B rollout, Pydantic enforcement patterns, and fallback strategies for when structured output breaks entirely.
Prompt Versioning · Model A/B Testing · Frozen Test Sets · Pydantic v2 · Parsing Fallbacks
✦ Resume: "Built AI reliability toolkit: hallucination debugging framework, prompt versioning workflow, model upgrade control with frozen test sets"
★ Differentiator
Week 12 — Security + Ethics
GuardianAI™
Adversarial Lab + PII Shield + Responsible AI Checklist
Students attack and defend in pairs — prompt injection, data exfiltration, tool misuse. Add Microsoft Presidio PII scrubbing on inputs and outputs, NER-based entity detection, and PII handling in RAG. Complete a formal responsible AI checklist: HITL design, bias assessment, GDPR/CCPA compliance. Security and ethics as one integrated discipline.
Presidio · spaCy NER · Prompt Injection · RBAC · HITL · Responsible AI
✦ Resume: "Implemented PII detection, prompt injection defense, and HITL governance — validated through live peer red-teaming and formal responsible AI checklist"
Week 7 — Evaluation Science
BreakRAG™
Adversarial Eval + Evaluation Science + User Feedback Loop
Full eval pipeline with inter-rater reliability, evaluator bias controls, statistical significance checks, and dataset leakage prevention — on top of adversarial queries, multi-model LLM-as-judge scoring across 5 providers, and regression CI. User feedback from WorkbenchAI™ feeds in automatically. The mature version of the eval instinct planted in Week 2.
Golden Datasets · LLM-as-Judge · RAGAS · Regression CI · 5 Providers · Eval Science
✦ Resume: "Built adversarial eval harness with evaluation science rigour (inter-rater reliability, evaluator bias controls) — runs as CI check wired to live user feedback"
Week 8 — Open Source
SpecialistTuner
Dual Fine-Tuning Lab — PEFT/LoRA + OpenAI API + Decision Memo
Fine-tune the same task via PEFT/LoRA on Llama 3 AND via OpenAI fine-tuning API. 30-minute LoRA mechanics explainer first — students understand what low-rank decomposition is actually doing before touching the code. Benchmark both against base prompting on accuracy, latency, cost, and data privacy. Decision memo: when does each approach win?
PEFT / LoRA · Llama 3 · Hugging Face · Ollama · OpenAI Fine-Tuning
✦ Resume: "Fine-tuned Llama 3 with PEFT/LoRA and GPT-4o-mini via API — benchmarked both against base prompting across accuracy, latency, cost, and privacy"
Week 16 — FinOps
CostGuard™
FinOps SLM Router + Event-Driven Self-Healing SRE Bot
Cost-aware routing layer that classifies tasks and routes frontier vs. SLM, applies prompt caching to repeated context, and generates a weekly cost measurement report (no promised percentages — actual measurement on your actual workload). Plus an event-driven SRE agent that responds to webhook alerts, diagnoses root cause, and proposes remediation with approval before execution.
SLM Routing · Prompt Caching · Cost Dashboards · Webhooks · Agentic SRE
✦ Resume: "Built cost-aware LLM routing system with SLM selection, prompt caching, and weekly cost measurement dashboard — documented actual savings on real workload"
The Flagship Capstone

One system. Built
across 17 weeks.

Every module milestone adds one layer to the same system. You graduate with one cohesive, deployed, production-style agent — not seventeen unrelated demos.

★ Flagship Capstone
Agentic Engineering Workspace Copilot
AgentForge™ — Production Agentic System
An AI workspace that ingests internal docs, runbooks, CI/CD context, and screenshots — answers questions with cited multimodal sources, converts requests into structured tickets, routes work between specialist agents, governs every sensitive action through approval gates, detects PII, and runs with full observability, versioning, and cost controls.
Five Specialist Agents
📄
Knowledge Agent
Grounded retrieval, multimodal embeddings, PII-scrubbed citations
🚨
Triage Agent
Request severity, stateful routing, long-term memory
⚙️
Action Agent
Draft actions, ticket ops, HITL approval gates, audit log
💻
Repo Agent
CI/PR context, test assist, code review, AGENTS.md-aware
🛠
SRE Agent
Alert diagnosis, self-healing proposals, webhook-triggered
Weekly Capstone Milestones
Wks 1–2
Domain, users, workflows, data, success metrics, model selection rules, architecture + first eval dataset
Wks 3–4
Typed intake + streaming UI + multimodal ingestion pipeline with CLIP embeddings
Wks 5–6
Grounded Q&A + retrieval improvements with before/after benchmarks
Week 7
Full eval scorecard, regression suite, adversarial test cases, user feedback wiring
Week 8
One fine-tuned component only where it clearly beats base prompting
Week 9
Bounded tool loop with state diagram — three-layer state architecture
Week 10
Multi-agent handoffs, trace replay, long-term memory, human approval gate
Week 11
MCP tool server published, capstone exposed as A2A-compatible remote agent
Week 12
PII scrubbing, RBAC, approval gates, audit logging, responsible AI checklist
Week 13
Repo/CI features + WorkbenchAI™ UI + feedback loop → eval harness
Week 14
Cloud deploy, observability, async pipelines, versioning controls
Week 15
ReliabilityKit™ applied — debugging tree, prompt versioning, model upgrade control
Week 16
SLM routing, prompt caching, cost dashboard, event-driven SRE alert handling
Week 17
Polish, red-team, demo day, full hiring pack, PortfolioAgent deployed
What You Submit at Week 17
FastAPI backend + Docker + CI/CD + AGENTS.md
Deployable, tested, versioned
WorkbenchAI™ streaming frontend
Citations, tool traces, approval actions, feedback UI
Multimodal ingestion pipeline
PDFs, images, tables — CLIP + text embeddings
Tool layer (3+ business actions) + PII scrubbing
Approval gates, RBAC, audit logging
Multi-agent system + state diagram
5 specialist agents, long-term memory, full trace replay
MCP integration + A2A service
Signed Agent Card, SSE streaming, JWT auth
Eval dataset + ReliabilityKit™
Groundedness report, hallucination debug tree, prompt versioning log
Observability + cost measurement dashboard
OpenTelemetry traces, actual cost report on real workload
Responsible AI checklist + governance
HITL design, bias assessment, GDPR/CCPA notes
4× System design architecture diagrams
One per case study — interview-ready
PortfolioAgent + demo video (3–5 min)
A2A-compatible, shareable link, live walkthrough
Case study PDF + resume bullet pack
5 pre-written bullets with your actual metrics
Choose Your Domain at Week 1
Engineering Ops Copilot
Customer Support Agent
HR & Policy Copilot
Legal & Compliance Assistant
Developer Tools Agent
Domain locked at Week 1. Every project feeds into your domain — one cohesive system, not a collection of demos.
The 2026 Agent Protocol Stack

Three layers.
One production architecture.

All three taught with hands-on labs. Most bootcamps still teach none of them.

🔌
MCP
Model Context Protocol · Anthropic → Linux Foundation
  • Connects agents to tools, APIs, and data sources
  • 97M monthly downloads — supported by every major AI provider
  • You build and publish an MCP server in Week 11
  • Your capstone both consumes and exposes MCP capabilities
🤝
A2A
Agent-to-Agent Protocol · Google → Linux Foundation (AAIF)
  • Connects agents to other agents as discoverable peers
  • Agent Cards, task lifecycle, SSE streaming, JWT/OIDC
  • 150+ enterprise organisations standardising on this now
  • Live inter-student A2A network built in Week 11
🌐
WebMCP
Web Access Layer · AAIF — Emerging Standard
  • Connects agents to live web context and browser actions
  • Browser automation and computer-use patterns
  • Covered as advanced content in Week 17 bonus material
  • Completes the three-layer production stack
Both MCP and A2A are governed by the Linux Foundation's AAIF — co-founded by OpenAI, Anthropic, Google, Microsoft, AWS, and Block. Understanding this stack is now as essential as understanding REST APIs.
Technology Stack

The tools you'll
use at work.

Languages & Frameworks
Python 3.12+
FastAPI + Pydantic v2
Async / Await patterns
React (streaming UI)
TypeScript (basics)
AI / Models — Frontier
OpenAI GPT-4o + Agents SDK
Anthropic Claude + MCP SDK
Google Gemini (Vision + Text)
Mistral AI / Cohere
A2A Protocol SDK
AI / Models — Open Source & SLM
Llama 3 (Meta) via Hugging Face
Mistral 7B / Mixtral
Phi-3 (Microsoft)
Ollama (local inference)
PEFT / LoRA fine-tuning
CLIP / Nomic Embed (multimodal)
Agent / RAG / Eval Stack
LangGraph
LlamaIndex
RAGAS (evals)
Microsoft Presidio (PII)
PromptLayer / LangSmith
Data & Storage
Pinecone / Qdrant
PostgreSQL + pgvector
Redis (session state)
Weaviate
S3 / Cloud Storage
Infra & Observability
Docker + Compose
GitHub Actions (CI/CD)
AWS / GCP / Azure
OpenTelemetry
Prometheus + Grafana
Is This For You?

Built for developers
who build things.

✅ Great Fit
Software engineers moving into AI roles
You write production code and want to build LLM-powered systems, not just use them.
Backend developers adding AI features
You're shipping AI and want the full stack — open-source models, evals, reliability, and cost controls.
ML engineers expanding to LLM applications
You know models. You want systems, orchestration, evals, deployment, and the MCP/A2A protocol layer.
Freelancers building AI products
You want production-grade skills and a portfolio that sells itself — open-source to frontier, end-to-end.
Target Roles After Graduation
GenAI Engineer
Agent Engineer
Applied AI Engineer
LLM App Developer
AI Platform Engineer
AI Solutions Engineer
Prerequisites
✓ 1+ year professional programming experience
✓ Comfortable with Python (or complete free prep)
✓ Familiar with REST APIs and JSON
✗ No ML/AI background required
✗ No prior GenAI experience required
Interview Readiness

Know the answers
that actually matter.

Every week includes a "What interviewers ask about this module" callout. Here are eight that come up constantly.

"When would you use RAG vs fine-tuning vs agents?"
Decision framework built across Weeks 2, 5–6, and 8. Backed by real benchmark data from SpecialistTuner — your own PEFT/LoRA vs OpenAI API comparisons, not a theoretical matrix.
↑ Weeks 2, 5–6, 8
"What do you do when your LLM hallucinates in production?"
ReliabilityKit™ decision tree: retrieval gap? prompt ambiguity? model mismatch? parsing failure? Four root causes, four different fixes. You built the tree — you describe it from experience.
↑ Week 15 — ReliabilityKit™
"How do you evaluate an LLM application rigorously?"
Inter-rater reliability, evaluator bias controls, statistical significance, dataset leakage prevention — plus golden datasets, groundedness, adversarial testing. The eval science angle most candidates can't answer.
↑ Week 7 — BreakRAG™
"How do you handle PII in an AI system?"
Input scrubbing with Presidio, NER-based detection, output filtering, tiered RAG access, GDPR implications. You built GuardianAI™ — not just described the concept.
↑ Week 12 — GuardianAI™
"Explain MCP and A2A — when would you use each?"
MCP is agent-to-tool. A2A is agent-to-agent. You built and published both in Week 11. Your capstone exposes a live A2A service. Very few candidates answer this from genuine hands-on experience.
↑ Week 11 — AgentMesh™
"Walk me through designing a system like GitHub Copilot."
You have a one-page architecture diagram from Week 6's case study: code embedding strategy, retrieval pipeline, latency constraints, large-repo context management, feedback loops.
↑ Week 6 — System Design Case Study
"When would you choose an open-source model over a proprietary API?"
You benchmarked both in SpecialistTuner and wrote the decision memo. You can cite real numbers from your own experiments: accuracy, latency, cost, data privacy implications.
↑ Week 8 — SpecialistTuner
"When is multi-agent worth the added complexity?"
Week 10's explicit "reject multi-agent" case study. You argued against it for a specific scenario — you have the reasoning on record, not just the "when to use it" answer.
↑ Week 10 — TriageFlow™
Your Resume After Graduation

Bullets you can defend
in any interview.

Your Name
GenAI / Agent Engineer · CoreSmart Certified · Applied GenAI & Agentic AI Engineering
Projects & Portfolio
Built and deployed AgentForge™ — Agentic Engineering Workspace Copilot — using Python, FastAPI, LangGraph, CLIP multimodal embeddings, and 5 specialist agents. Handles 500+ knowledge queries/day with citations, PII scrubbing, long-term episodic memory, and full OpenTelemetry observability.
Exposed a production A2A-compatible remote agent with signed Agent Cards, SSE streaming, and JWT auth — integrated in a live 40-student cohort agent network. Published one MCP tool server consumed by all cohort agents.
Fine-tuned Llama 3 with PEFT/LoRA and GPT-4o-mini via OpenAI API. Benchmarked both against base prompting across accuracy, latency, cost, and data privacy. Produced decision memo used as reference for model selection decisions throughout capstone.
Built BreakRAG™ — adversarial eval harness with evaluation science rigour: inter-rater reliability controls, evaluator bias mitigation, statistical significance testing, and dataset leakage prevention. Runs as CI check on every deployment, wired to live user feedback signals.
Built ReliabilityKit™ — AI reliability toolkit covering hallucination debugging decision tree, Git-based prompt versioning workflow with regression CI, model upgrade protocol with frozen test sets, and Pydantic enforcement patterns. Implemented GuardianAI™ with Presidio PII detection validated through live adversarial red-teaming.
Technical Skills
Python, FastAPI, LangGraph, OpenAI/Anthropic APIs, PEFT/LoRA, Llama 3, Hugging Face, Ollama, MCP SDK, A2A Protocol, CLIP, Nomic Embed, Presidio, Pinecone/Qdrant, PostgreSQL/pgvector, Redis, Docker, GitHub Actions, AWS, OpenTelemetry, RAGAS, Grafana
What Our Students Say

From people who've
been through it.

The eval-first approach changed how I think about AI engineering. Most courses build first, measure never. Starting with a golden dataset in Week 2 meant every project I shipped had a measurement attached to it.
John Anderson
Software Developer
The system design case studies were the part I didn't know I needed. I'd been calling LLM APIs for a year. Week 1's ChatGPT architecture session was the first time I understood what I was actually building on top of.
Priya Sharma
Backend Engineer
The open-source fine-tuning week genuinely surprised me. I expected to just run a script. The LoRA mechanics explainer before the lab meant I understood why rank 8 vs rank 64 matters — not just how to run the code.
David Johnson
ML Engineer
Pricing & Enrollment

An investment that pays
for itself in one role.

Early Bird
$999
Save $400 · Limited seats · Closes 30 days before cohort
17-week live cohort
3-module free prep track
16 named weekly projects
AgentForge™ end-to-end capstone
4 system design case studies
Live A2A cohort network (Week 11)
CoreSmart certification
6-month community access
✗ 1-on-1 mentoring
✗ Week 18 Advanced Lab
Enroll Early
Standard
$1399
Full enrollment · All cohort dates
17-week live cohort
3-module free prep track
16 named projects + ReliabilityKit™
AgentForge™ + PortfolioAgent
4 system design case studies
Live A2A cohort network
CoreSmart certification
12-month community access
2× 1-on-1 mentoring sessions
✗ Week 18 Advanced Lab
Enroll Now
Premium
$1899
For serious career changers & consultants
Everything in Standard
Week 18: Computer Use + Multimodal Lab
5× 1-on-1 mentoring sessions
Mock technical interview
LinkedIn + resume + portfolio review
Lifetime community access
Priority cohort placement
Apply for Premium
Corporate Training
Custom cohorts for engineering teams of 5–50. Industry-specific capstone domains, bespoke case studies, and group pricing. Open-source LLM fine-tuning and responsible AI governance tracks included.
Talk to Us →
Frequently Asked Questions

Everything else
you want to know.

Why 17 weeks instead of 16?
Week 14 (deployment/observability) and Week 15 (reliability engineering) each need room to breathe. Senior engineers spend months on each of these topics. Compressing them produces shallow coverage. Week 17 is dedicated entirely to capstone hardening and demo day — not new content.
Why does RLHF appear in Week 1 but without a full curriculum?
RLHF as a concept is essential for understanding why different models behave differently and making model selection decisions. The full RL curriculum (MDPs, Policy Gradients, PPO) is a specialist track for ML researchers training foundation models — a different audience. Week 1's RLHF session is a 30-min async reading, not a live lecture slot.
Will I work with both proprietary and open-source models?
Yes — explicitly. Week 2 introduces the frontier vs. open-source decision framework. Week 8 has you fine-tune Llama 3 with PEFT/LoRA alongside OpenAI fine-tuning API, with a LoRA mechanics explainer before the lab. Week 16 covers SLM routing in production.
Is the eval harness introduced in Week 2 or Week 7?
Both. A lightweight eval primer in Week 2 establishes the instinct — golden dataset, pass/fail threshold, accuracy measurement for IntentIQ. BreakRAG™ in Week 7 is the mature, production-grade version of that habit: adversarial testing, evaluation science rigour, multi-model scoring, regression CI.
What makes AgentMesh™ work if a peer's agent is down?
CoreSmart hosts a reference fallback agent that's always available. Students test against peers first, then fall back to the hosted reference agent to complete the A2A lab exercises. The fallback is part of the design — cohort membership is not a hard dependency.
Is responsible AI taught hands-on or just as a lecture?
Hands-on — in Week 12 as part of GuardianAI™. Students complete a formal checklist covering HITL design, bias assessment, and GDPR/CCPA notes. It's a deliverable in the capstone, not a slide deck. The responsible AI checklist is submitted at Week 17 as one of the capstone artefacts.
Next Cohort Starting Soon — 40 Seats Maximum

Stop watching AI happen.
Build it.

17 weeks. 16 named projects. 4 system design cases. One deployed agentic system. A career you can actually talk about.

Reserve Your Seat → Download Syllabus
Cohorts are capped at 40 students per session to maintain live session quality and the inter-student A2A network.
CORESMART.AI
Innovate with Intelligence
Phone
+1 650 499 6634
Email
info@coresmart.ai
Web
www.coresmart.ai
Location
39159 Paseo Padre Pkwy, Ste 311, Fremont, CA 94538