Live Cohort · 17 Weeks · 2026 Edition

Applied GenAI
& Agentic AI
Engineering

The developer bootcamp that takes you from LLM fundamentals to a deployed, production-grade agentic system — with open-source models, multimodal AI, adversarial evals, responsible AI, MCP + A2A protocols, and a portfolio that gets you hired.

Reserve Your Seat → Explore Curriculum
17
Live Weeks
170+
Hands-On Hours
16
Named Projects
5+
AI Providers
4
System Design Cases
1
Deployed Capstone
5+ AI Providers & Models
OpenAI GPT-4o
Anthropic Claude
Google Gemini
Mistral AI
Cohere
Llama 3 (Meta)
Mistral 7B / Mixtral
Phi-3 (Microsoft)
SLM Routing
PEFT / LoRA Fine-Tuning
Why This Course Exists

Most AI courses teach
topics. This builds engineers.

📚
What Other Courses Give You
Fifty video modules. Jupyter notebooks. A certificate PDF. No open-source models. No system design. No evaluation science. No portfolio that survives contact with a real hiring manager.
What CoreSmart Gives You
17 weeks of live engineering. 16 named projects. Eval introduced in Week 2 — not Week 7. Evals, reliability, and production practices woven through every module. Open-source models alongside frontier APIs. You graduate with a deployed system and real metrics.
⚠️
The Depth Problem Other Bootcamps Have
Most bootcamps stack too many mental models per week and call it "comprehensive." Students cargo-cult patterns instead of understanding tradeoffs. This course is explicitly paced to avoid that — critical weeks get the space they need.
🔭
Built for 2026 Production Reality
MCP + A2A protocols. Open-source fine-tuning. Multimodal embeddings. PII detection. Stateful memory. Reliability engineering. FinOps. System design case studies. When agents don't make sense. All of it.
What You'll Be Able to Do

Eight skills that
get you hired.

Not topics you've heard of. Things you can prove you've done — with repos, evals, and deployed systems to show for it.

01
🏗
Build production LLM applications
Streaming APIs, structured outputs, tool calling, multimodal ingestion, Pydantic enforcement, and cost-aware design across proprietary and open-source models.
02
🔍
Design and evaluate RAG systems
Context engineering, multimodal embeddings, reranking, groundedness evals, adversarial testing — with evaluation instinct from Week 2 onward.
03
🤖
Engineer agentic systems end-to-end
Tool loops, state management, multi-agent orchestration, stateful memory, MCP servers, A2A-compatible services — and knowing when NOT to use agents.
04
🛡
Secure, govern, and align AI systems
Prompt injection defense, PII detection, RBAC, HITL design, responsible AI frameworks, Constitutional AI concepts, and bias awareness — in production.
05
🧪
Build AI reliability systems
Hallucination debugging decision tree, prompt versioning, model upgrade control, Pydantic enforcement, parsing fallbacks, and user feedback loops.
06
🔩
Work with open-source and small models
Fine-tune with PEFT/LoRA, run local inference with Ollama, route between frontier and SLMs, and know when open-source wins on cost, latency, or data control.
07
🚀
Deploy, monitor, and optimise
Docker, CI/CD, OpenTelemetry, cost dashboards, prompt caching, SLM routing, versioning, async pipelines, and event-driven self-healing patterns.
08
📐
Design AI systems from first principles
Four system design case studies. You'll answer "walk me through how you'd design X" for real-world AI systems — not just the model call, but the entire architecture.
Free Prep Track — Modular, Self-Paced, Included

Start from where
you actually are.

Three independent modules. Take only what you need. Students with ML/maths backgrounds can skip ahead and push harder on Weeks 4–7 and 11–12.

Prep Module A
Python, APIs & Dev Tooling
  • Python: functions, classes, async/await, type hints, testing
  • REST APIs, JSON, HTTP, FastAPI basics, Pydantic v2 intro
  • Git, GitHub, branching, PRs, CLI workflows
  • Virtual environments, pip, Docker basics
Project
DocParser API
FastAPI service that accepts a document and returns validated JSON metadata — title, word count, language, named entities.
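To make the target concrete, here is a minimal sketch of a DocParser-style endpoint; the route, field names, and placeholder heuristics are illustrative, not the official project spec.

```python
# Illustrative DocParser-style endpoint (route and fields are assumptions,
# not the official project spec).
from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()

class DocMetadata(BaseModel):
    title: str
    word_count: int
    language: str
    named_entities: list[str]

@app.post("/parse", response_model=DocMetadata)
async def parse(file: UploadFile) -> DocMetadata:
    text = (await file.read()).decode("utf-8", errors="ignore")
    words = text.split()
    return DocMetadata(
        title=words[0] if words else "untitled",  # placeholder heuristic
        word_count=len(words),
        language="en",            # a real version would detect the language
        named_entities=[],        # a real version would run NER here
    )
```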
Prep Module B
Data, Docker & ML Concepts
  • Pandas, regex, text preprocessing, similarity concepts
  • Docker: images, containers, Compose, local pipelines
  • Transformers, embeddings, vectors — ML for builders, not researchers
  • Notebooks for exploration; the notebook/production divide
Project
ChunkForge Pipeline
Containerised pipeline that cleans, chunks with metadata, and structures documents — runs fully in Docker.
Prep Module C
AI-Assisted Developer Workflow
  • AI coding tools: Cursor, GitHub Copilot, Claude Code — practical differences
  • AGENTS.md: writing repo context files for AI systems
  • Prompt discipline; review habits for AI-generated code
  • Repo hygiene: linting, tests, structured commits, CI baseline
Project
DevReady Kit
Repo setup pack with linting, test structure, CI baseline, AI tool prompts, and a working AGENTS.md.
Full 17-Week Curriculum

Every week builds
on the last.

From LLM foundations to deployed, governed, cost-optimised agentic systems. Critical weeks get the breathing room they need. Evaluation instinct starts in Week 2.

🧪
Evaluation Instinct Starts in Week 2 — Not Week 7
Every module from Week 2 onward asks: "How do you know this is working?" A lightweight eval primer in Week 2 establishes the instinct before students build their first system. BreakRAG™ in Week 7 is the mature harness — not the first introduction to the concept.
System Design Case Studies — Woven Across the Course
Four 90-minute architecture sessions. Each produces a one-page architecture diagram and decision memo students use in interviews. AI is not just an LLM call — these sessions prove it.
Week 01
How Would You Design ChatGPT?
Load balancing, context management, streaming infra, abuse detection, cost attribution, multi-tenant architecture
Week 06
How Would You Design GitHub Copilot?
Code embedding, retrieval pipeline, latency constraints, large-repo context management, feedback loops
Week 11
Production Multi-Agent System Design
Orchestration at scale, agent isolation, handoff protocols, trace architecture, failure recovery, cost visibility
Week 14
AI Observability System Design
Token-level cost tracking, latency percentiles, model drift detection, eval regression, incident response
Phase 1 — Foundations, RAG & Evaluation (Weeks 1–8)
Week 01
Modern AI Product Anatomy, LLM Internals & RLHF Concepts
Opens with live code in hour one — streaming API call, structured output, basic tool call. RLHF/alignment is a 30-min async reading, not a live lecture slot.
Hands-On: First LLM App + Streaming · Model Layer Architecture · Retrieval + Tool + Memory Layers · Latency-Cost-Quality Triangle · RLHF & Alignment (Conceptual) · Constitutional AI Concepts · Why Models Behave Differently
Case Study: How Would You Design ChatGPT?
ReleaseBot — Release Intelligence Assistant. Converts raw update notes into structured changelog + announcement via streaming FastAPI. Week ends with a one-page architecture diagram from the case study.
Foundation
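The "live code in hour one" moment looks roughly like this; a minimal streaming sketch using the OpenAI Python SDK, where the model name and prompt are placeholders.

```python
# Minimal streaming chat completion with the OpenAI Python SDK.
# Requires OPENAI_API_KEY in the environment; model and prompt are placeholders.
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise these release notes: ..."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # tokens arrive incrementally
```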
Week 02
LLM Mechanics, Model Choice & Evaluation Primer
Eval primer introduced here. Every project from now on asks: "How do I know this is working?" Students build a golden dataset and pass/fail threshold for IntentIQ this week.
Transformer Behaviour in Practice · Context Windows · Prompting vs RAG vs Fine-Tuning vs Agents · Model Selection Framework · Eval Primer: Golden Datasets + Thresholds · Classification Accuracy + Pass/Fail · Frontier vs Open-Source Decision Criteria · SLM vs LLM
IntentIQ — Issue Classification Engine. Categorises messages with typed confidence scores. Benchmarks two providers + one open-source model. Includes first golden dataset and accuracy measurement — eval starts here.
Foundation + Eval
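A minimal sketch of the Week 2 eval primer idea: a tiny golden dataset, an accuracy measurement, and a pass/fail threshold. The classifier here is a stand-in for the real IntentIQ model call.

```python
# Golden-dataset check in the spirit of the Week 2 eval primer.
# `classify_intent` is a placeholder for the real IntentIQ model call.
GOLDEN_SET = [
    ("The app crashes when I upload a PDF", "bug"),
    ("Please add dark mode", "feature_request"),
    ("How do I reset my password?", "question"),
]
PASS_THRESHOLD = 0.9  # illustrative; the real threshold is set per project

def classify_intent(message: str) -> str:
    # Placeholder heuristic; in the course this is an LLM or SLM call.
    if "crash" in message.lower():
        return "bug"
    if "add" in message.lower():
        return "feature_request"
    return "question"

def run_eval() -> float:
    correct = sum(classify_intent(msg) == label for msg, label in GOLDEN_SET)
    return correct / len(GOLDEN_SET)

if __name__ == "__main__":
    accuracy = run_eval()
    print(f"accuracy={accuracy:.2f}")
    assert accuracy >= PASS_THRESHOLD, "eval regression: below pass/fail threshold"
```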
Week 03
Prompt Engineering, Structured Outputs, Tool Calling & Streaming
The week runs as three progressive sessions: Day 1 — prompting + structured outputs. Day 2 — tool calling. Day 3 — streaming + retry logic. TicketStream ties all three together at week's end.
System Prompt Design · Few-Shot Patterns · Pydantic v2 Schemas · Output Enforcement Progression: Prompt → JSON Mode → Pydantic → Tool Calling → Structured API · SSE Streaming · Retry & Validation Logic · Parsing Unformatted LLM Responses
TicketStream — Structured Intake Bot with Streaming. Converts messy messages into validated Pydantic objects via SSE streaming. Teaches the full output-enforcement progression from soft prompt to strict schema.
Core Skill
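The output-enforcement progression ends in code like this; a minimal sketch, assuming an illustrative Ticket schema rather than the official TicketStream spec.

```python
# Validate an LLM response against a Pydantic v2 schema, then fall back to
# extracting the first embedded JSON object if strict validation fails.
import re
from pydantic import BaseModel, ValidationError

class Ticket(BaseModel):
    summary: str
    severity: str
    component: str

def parse_ticket(raw: str) -> Ticket:
    try:
        return Ticket.model_validate_json(raw)        # strict path
    except ValidationError:
        match = re.search(r"\{.*\}", raw, re.DOTALL)   # fallback: pull out embedded JSON
        if match:
            return Ticket.model_validate_json(match.group(0))
        raise

raw_response = 'Sure! Here is the ticket: {"summary": "Login fails", "severity": "high", "component": "auth"}'
print(parse_ticket(raw_response))
```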
Week 04
Multimodal Ingestion, Embeddings & Vector Databases
Embedding Models · Chunking Strategies · Metadata Design · Hybrid Retrieval · GPT-4o Vision + Claude Vision · PDF + Image + Table Ingestion · CLIP / OpenCLIP Embeddings · Nomic Embed (Multimodal) · Index Maintenance
KnowledgeVault — Multimodal Runbook Search. Semantic search over docs, PDFs with figures, tables, and screenshots. Uses CLIP for visual content alongside text embeddings. Most RAG courses only cover plain text.
RAG
Week 05
RAG Foundations & Grounded Answers with Citations
RAG Pipeline Architecture · Citation Generation · Fallback Behaviour · Answer Confidence · Hallucination Controls · Conversation State
CitationRAG — Engineering Knowledge Assistant. Answers with citations, handles "I don't know" gracefully. Groundedness measured on golden dataset from Week 2 — students see the improvement arc.
RAG
Week 06
Advanced RAG & Context Engineering
Query Transformation (HyDE) · Step-Back Prompting · Reranking · Multi-Query Retrieval · Context Compression · Context Engineering Principles · Before/After Benchmarking
Case Study: How Would You Design GitHub Copilot?
RAGOptimizer — Retrieval Quality Lab. Improve Week 5 measurably — query transformation, reranking, compression. Document before/after on latency, recall, groundedness. Copilot case study architecture feeds the optimisation decisions.
Context Eng.
Week 07
Evaluation Science, Adversarial Testing & Experimentation
BreakRAG™ is the mature eval harness. Students already have eval instinct from Week 2. This week adds: inter-rater reliability, evaluator bias in LLM judges, prompt sensitivity, statistical significance in A/B tests, and dataset leakage prevention.
Inter-Rater Reliability · Evaluator Bias in LLM Judges · Prompt Sensitivity of Judges · Statistical Significance in A/B Tests · Dataset Leakage Prevention · Golden Datasets · Groundedness Metrics · Regression Testing · Red-Teaming · LLM-as-Judge Pipeline · Multi-Model Scoring (5 Providers) · User Feedback → Eval Loop
BreakRAG™ — Adversarial Eval & Red-Team Harness. Full evaluation pipeline with adversarial queries, multi-model LLM-as-judge scoring across 5 providers, regression CI, and user feedback wiring. The rigorous version of what started in Week 2.
Eval Science
Week 08
Fine-Tuning, Open-Source LLMs, LoRA Mechanics & SLM Specialisation
Includes a 30-min "what LoRA is actually doing" explainer: low-rank decomposition, which weight matrices, rank 8 vs rank 64 tradeoffs. Not full maths — genuine mechanical intuition before touching the code.
When Fine-Tuning Wins (Decision Framework) · Dataset Curation · LoRA Mechanics (What's Actually Happening) · PEFT / LoRA on Llama 3 / Mistral · Hugging Face Training Pipeline · Ollama Local Inference · OpenAI Fine-Tuning API (GPT-4o-mini) · SLM Routing: When Smaller Wins · Proprietary vs Open-Source Cost/Control/Privacy
SpecialistTuner — Dual Fine-Tuning Lab + Decision Memo. Fine-tune via PEFT/LoRA on Llama 3 AND OpenAI fine-tuning API. Benchmark both against base prompting on accuracy, latency, cost, and data privacy. Decision memo: "when does each approach win?" — the rarest deliverable in any AI course.
Open Source
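What the PEFT/LoRA setup looks like in practice; a minimal sketch with Hugging Face PEFT, where the model id, rank, and target modules are illustrative choices rather than the lab's exact configuration.

```python
# LoRA adapter setup sketch with Hugging Face PEFT. Model id, rank, and target
# modules are illustrative assumptions, not the official lab configuration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

lora = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # a small fraction of the base model's weights
```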
Phase 2 — Agentic AI Engineering (Weeks 9–13)
Week 09
Agentic AI Foundations & Agent State Management
Students build the agent loop from first principles before touching any framework. The "when NOT to use agents" case study is weighted equally to the build exercise — the rejection decision is practised explicitly.
Workflows vs Agents (With Explicit "Reject" Exercise) · Tool Loops & Stop Conditions · Agent Design Patterns (ReAct, Reflection) · Short-Term State (Session Context, Tool History) · Workflow State (Step Tracking, Checkpoints) · Persistent State (Resume After Failure) · State Diagrams as Deliverables
OpsAssist — Tool-Using Ops Agent + State Diagram. Single agent with defined tool schemas, stop conditions, and retry logic. Full state diagram showing short-term/workflow/persistent layers is a required deliverable. Includes the "reject agents" decision exercise.
Agentic
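The "agent loop from first principles" is small enough to sketch in a few lines: a tool registry, a bounded loop, and an explicit stop condition. The model call here is a placeholder.

```python
# Bare agent loop: bounded iterations, explicit stop condition, tool registry.
# `call_llm` is a placeholder for a real model call that returns either a tool
# request or a final answer.
import json

TOOLS = {
    "lookup_runbook": lambda query: f"runbook entry for {query!r}",
}
MAX_STEPS = 5

def call_llm(messages: list[dict]) -> dict:
    # Placeholder: a real implementation decides between a tool call and a final answer.
    return {"type": "final", "content": "No further action needed."}

def run_agent(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(MAX_STEPS):                           # hard stop condition
        step = call_llm(messages)
        if step["type"] == "final":
            return step["content"]
        result = TOOLS[step["tool"]](step["arguments"])  # execute the requested tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped: step budget exhausted."

print(run_agent("Why is the deploy failing?"))
```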
Week 10
Multi-Agent Orchestration, Long-Term Memory & When NOT to Over-Engineer
Explicit "reject multi-agent" case study: students receive a scenario and must argue against multi-agent as the right solution. The complexity-rejection decision is practised as rigorously as the complexity-adoption decision.
Routing & Handoffs · Manager-Worker Patterns · Graph Orchestration (LangGraph) · When NOT to Use Multi-Agent (Case Study) · OpenTelemetry Tracing · In-Context Memory · External Memory (Vector Store) · Episodic Memory (Conversation Summaries) · Procedural Memory (Learned Preferences) · Memory Lifecycle Management
TriageFlow™ — Incident Multi-Agent Workflow + Memory. Triage → knowledge → action agents with full trace replay, human approval gate, and four-layer memory (Redis session, pgvector long-term, episodic summaries, procedural preferences).
Agentic
Week 11
Agent Interoperability — MCP, A2A & AGENTS.md
MCP Architecture & Tool Design · Publishing MCP Servers · A2A Protocol (Agent Cards) · Client-Remote Architecture · Task Lifecycle & SSE Streaming · JWT/OIDC Security · AGENTS.md Spec Files
Case Study: Production Multi-Agent System Design
AgentMesh™ — A2A Remote Specialist + Shared MCP Server. Publish a signed Agent Card, expose A2A-compatible service with SSE + JWT, AND publish one MCP server the cohort consumes. Live inter-student A2A network during the session.
MCP + A2A
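Publishing an MCP tool server can be as small as this sketch using the official Python SDK's FastMCP helper; the tool itself is a toy stand-in for the Week 11 runbook tools.

```python
# Minimal MCP tool server sketch using the Python SDK's FastMCP helper.
# The tool is a toy placeholder for the real runbook-search tools.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("runbook-tools")

@mcp.tool()
def search_runbooks(query: str) -> str:
    """Return the most relevant runbook snippet for a query (placeholder logic)."""
    return f"No runbook matched {query!r} in this demo index."

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```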
Week 12
Guardrails, PII Detection, Responsible AI & HITL
Prompt Injection Defense · Tool Misuse Prevention · Approval Checkpoints + Audit Logging · PII Detection (Microsoft Presidio) · NER-Based Entity Scrubbing · Output PII Filtering · PII in RAG Pipelines · Responsible AI Framework · Bias & Fairness Awareness · HITL Design Patterns · GDPR/CCPA in AI Systems
GuardianAI™ — Adversarial Lab + PII Shield + Responsible AI Checklist. Students attack and defend in pairs (prompt injection, exfiltration, tool misuse). Add Presidio PII scrubbing on inputs + outputs. Complete formal responsible AI checklist with HITL design, bias assessment, GDPR notes.
Security + Ethics
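The PII shield in GuardianAI™ builds on Microsoft Presidio; a minimal sketch of the detect-then-anonymise step (Presidio also needs a spaCy language model installed).

```python
# PII scrubbing sketch with Microsoft Presidio: detect entities, then anonymise
# the text before it reaches the model or the logs.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Contact Jane Doe at jane.doe@example.com or +1 555 0100."
findings = analyzer.analyze(text=text, language="en")       # PERSON, EMAIL_ADDRESS, PHONE_NUMBER, ...
scrubbed = anonymizer.anonymize(text=text, analyzer_results=findings)

print(scrubbed.text)  # e.g. "Contact <PERSON> at <EMAIL_ADDRESS> or <PHONE_NUMBER>."
```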
Week 13
Developer Tooling Agents, Streaming Operator UI & UX Feedback Loops
Repo Intelligence Agents · CI/CD Context + PR Assistance · Streaming Chat UI (SSE + React) · Citation Rendering · Tool Trace Display · Approval Action UI · Thumbs Up/Down + Correction Flows · Feedback Storage → Eval Harness · Online vs Offline Eval Loop
WorkbenchAI™ — Streaming Developer UI + Feedback Loop. Portfolio-quality UI: live citations, tool traces, approval actions. Wires the full user feedback loop — thumbs/corrections → feedback database → BreakRAG™ eval harness. Closes the online/offline eval cycle.
Dev Tools
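The streaming UI consumes a Server-Sent Events endpoint along these lines; a minimal FastAPI sketch with a stubbed token generator in place of the real model call.

```python
# SSE endpoint sketch of the kind a streaming chat UI consumes.
# The token generator is a stub; in practice it wraps a streaming model call.
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def token_stream(prompt: str):
    for token in ["Deploy ", "failed ", "because ", "of ", "a ", "missing ", "secret."]:
        yield f"data: {token}\n\n"      # SSE frame: "data: <payload>\n\n"
        await asyncio.sleep(0.05)       # simulate generation latency

@app.get("/chat/stream")
async def chat_stream(prompt: str):
    return StreamingResponse(token_stream(prompt), media_type="text/event-stream")
```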
Phase 3 — Production, Reliability & FinOps (Weeks 14–16)
Week 14
Deployment, Observability, Versioning & Async Pipelines
Docker + CI/CD · Cloud Deployment · OpenTelemetry + Tracing · Monitoring + Alerting · Model + Prompt Versioning · Dataset Versioning · Async Background Jobs · Batch Processing Pipelines
Case Study: AI Observability System Design
DeployCore — Production Deployment with Background Jobs. Cloud-deployed agent with OpenTelemetry, async ingestion/eval pipelines, and versioning controls. Observability case study architecture grounds the design decisions.
LLMOps
Week 15
AI Application Reliability Engineering
This week gets its own space — not compressed with deployment. The unglamorous work that takes up 30% of production AI engineering time and almost no course teaches properly.
Hallucination Debugging Decision Tree · Prompt Versioning in Git · Model Upgrade Control (A/B, Frozen Test Sets) · Pydantic Response Enforcement Patterns · Parsing Fallback Strategies · Retry + Backoff Architecture · When Structured Output Breaks
ReliabilityKit™ — AI Reliability Engineering Toolkit. Hallucination debugging decision tree, Git-based prompt versioning workflow with regression checks, model upgrade protocol with frozen test sets and A/B rollout, Pydantic enforcement patterns, and fallback strategies for when structured output breaks entirely.
Reliability
Week 16
FinOps, Prompt Caching, SLM Routing & Event-Driven Self-Healing
You won't find cost-reduction targets in this brochure. Students learn to measure and optimise their actual costs on their actual workload — the measurement skill, not a promised percentage.
Prompt Caching (Anthropic/OpenAI) · Intelligent SLM/LLM Routing · Per-Request Cost Budgets · Cost Dashboards + Measurement · Event-Driven Webhook Triggers · Agentic SRE Patterns · Pause/Resume + Human Approval
CostGuard™ — FinOps Router + Self-Healing SRE Bot. Cost-aware routing layer with prompt caching and a weekly cost measurement report. Plus an event-driven SRE agent that listens for webhook alerts, diagnoses root cause, and proposes remediation with human approval.
FinOps
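A cost-aware router in miniature: classify the request, pick a small or frontier model, and respect a per-request budget. Model names, prices, and the complexity heuristic are illustrative assumptions.

```python
# Cost-aware routing sketch: cheap requests go to a small model, everything else
# to a frontier model, with a per-request budget guard. Prices and heuristics
# are placeholders, not real rates.
FRONTIER_MODEL = "gpt-4o"
SMALL_MODEL = "llama-3-8b-instruct"
PRICE_PER_1K_TOKENS = {FRONTIER_MODEL: 0.005, SMALL_MODEL: 0.0002}  # placeholder prices
REQUEST_BUDGET_USD = 0.01

def estimate_cost(model: str, prompt: str, max_output_tokens: int = 500) -> float:
    tokens = len(prompt) / 4 + max_output_tokens        # rough token estimate
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]

def route(prompt: str) -> str:
    simple = len(prompt) < 400 and "analyse" not in prompt.lower()  # toy complexity check
    model = SMALL_MODEL if simple else FRONTIER_MODEL
    if estimate_cost(model, prompt) > REQUEST_BUDGET_USD:
        model = SMALL_MODEL                              # degrade rather than blow the budget
    return model

print(route("Summarise this short changelog entry."))
```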
Phase 4 — Capstone Hardening & Demo Day (Week 17)
Week 17
Final Capstone Hardening, Demo Day & Hiring Pack
Production Polish · Full Red-Team Review · Eval + Reliability Report · Architecture Case Study · Responsible AI Checklist · Cost Analysis · PortfolioAgent (Career Digital Twin) · Technical Storytelling
AgentForge™ — Final Demo Day. Live demo, architecture review, eval report, ReliabilityKit™ walkthrough, security checklist audit, cost analysis. Bonus: PortfolioAgent — A2A-compatible, knows your capstone, answers architecture questions, deployed as a shareable link for hiring managers.
Demo Day
All 16 Named Projects — Prep Track + 17 Weekly Deliverables
Prep A
DocParser API
FastAPI Metadata Service
Prep B
ChunkForge Pipeline
Docker Chunking Pipeline
Prep C
DevReady Kit
Repo Setup + AGENTS.md
Week 01
ReleaseBot
Release Intelligence Assistant
Week 02 ★
IntentIQ
Classifier + First Eval Dataset
Week 03
TicketStream
Structured Intake + Streaming
Week 04
KnowledgeVault
Multimodal Runbook Search
Week 05
CitationRAG
Engineering Knowledge Assistant
Week 06
RAGOptimizer
Retrieval Quality Lab
Week 07 ★
BreakRAG™
Adversarial Eval Harness
Week 08 ★
SpecialistTuner
Dual Fine-Tuning Lab + Memo
Week 09
OpsAssist
Agent + State Diagram
Week 10 ★
TriageFlow™
Multi-Agent + Long-Term Memory
Week 11 ★
AgentMesh™
A2A Network + MCP Server
Week 12 ★
GuardianAI™
Security + PII + Responsible AI
Week 13 ★
WorkbenchAI™
Streaming UI + Feedback Loop
Week 14
DeployCore
Deployment + Async Jobs
Week 15 ★
ReliabilityKit™
Hallucination Debug + Versioning
Week 16 ★
CostGuard™
FinOps Router + SRE Bot
Week 17
AgentForge™
Demo Day + Portfolio Agent
Signature Projects

Six builds that make
your GitHub stand out.

Real engineering problems. Real architectures. Real things that break. Each one a named tool you own after graduation.

★ Differentiator
Week 11 — Agent Interoperability
AgentMesh™
A2A Remote Specialist + Live Cohort Network + MCP Server
Publish a signed Agent Card, expose your agent as A2A-compatible with SSE streaming and JWT auth, AND publish one MCP tool server. During the live lab, one student's orchestrator discovers and delegates to another's remote agent in real time. The system design case study shows how this scales to enterprise. No other bootcamp runs a live inter-student A2A network.
A2A Protocol v0.3 · MCP SDK · JSON-RPC · SSE · JWT/OIDC · FastAPI
✦ Resume: "Built A2A-compatible remote agent service in live 40-student cohort network — signed Agent Cards, SSE streaming, JWT auth"
★ Differentiator
Week 15 — Reliability
ReliabilityKit™
Hallucination Debugging + Prompt Versioning + Model Upgrade Control
The unglamorous work that takes up 30% of production AI engineering time and almost no course teaches. A hallucination debugging decision tree (retrieval gap? prompt ambiguity? model mismatch? parsing failure?), a Git-based prompt versioning workflow with regression checks, a model upgrade protocol with frozen test sets and A/B rollout, Pydantic enforcement patterns, and fallback strategies for when structured output breaks entirely.
Prompt Versioning · Model A/B Testing · Frozen Test Sets · Pydantic v2 · Parsing Fallbacks
✦ Resume: "Built AI reliability toolkit: hallucination debugging framework, prompt versioning workflow, model upgrade control with frozen test sets"
★ Differentiator
Week 12 — Security + Ethics
GuardianAI™
Adversarial Lab + PII Shield + Responsible AI Checklist
Students attack and defend in pairs — prompt injection, data exfiltration, tool misuse. Add Microsoft Presidio PII scrubbing on inputs and outputs, NER-based entity detection, and PII handling in RAG. Complete a formal responsible AI checklist: HITL design, bias assessment, GDPR/CCPA compliance. Security and ethics as one integrated discipline.
Presidio · spaCy NER · Prompt Injection · RBAC · HITL · Responsible AI
✦ Resume: "Implemented PII detection, prompt injection defense, and HITL governance — validated through live peer red-teaming and formal responsible AI checklist"
Week 7 — Evaluation Science
BreakRAG™
Adversarial Eval + Evaluation Science + User Feedback Loop
Full eval pipeline with inter-rater reliability, evaluator bias controls, statistical significance checks, and dataset leakage prevention — on top of adversarial queries, multi-model LLM-as-judge scoring across 5 providers, and regression CI. User feedback from WorkbenchAI™ feeds in automatically. The mature version of the eval instinct planted in Week 2.
Golden Datasets · LLM-as-Judge · RAGAS · Regression CI · 5 Providers · Eval Science
✦ Resume: "Built adversarial eval harness with evaluation science rigour (inter-rater reliability, evaluator bias controls) — runs as CI check wired to live user feedback"
Week 8 — Open Source
SpecialistTuner
Dual Fine-Tuning Lab — PEFT/LoRA + OpenAI API + Decision Memo
Fine-tune the same task via PEFT/LoRA on Llama 3 AND via OpenAI fine-tuning API. 30-minute LoRA mechanics explainer first — students understand what low-rank decomposition is actually doing before touching the code. Benchmark both against base prompting on accuracy, latency, cost, and data privacy. Decision memo: when does each approach win?
PEFT / LoRA · Llama 3 · Hugging Face · Ollama · OpenAI Fine-Tuning
✦ Resume: "Fine-tuned Llama 3 with PEFT/LoRA and GPT-4o-mini via API — benchmarked both against base prompting across accuracy, latency, cost, and privacy"
Week 16 — FinOps
CostGuard™
FinOps SLM Router + Event-Driven Self-Healing SRE Bot
Cost-aware routing layer that classifies tasks and routes frontier vs. SLM, applies prompt caching to repeated context, and generates a weekly cost measurement report (no promised percentages — actual measurement on your actual workload). Plus an event-driven SRE agent that responds to webhook alerts, diagnoses root cause, and proposes remediation with approval before execution.
SLM Routing · Prompt Caching · Cost Dashboards · Webhooks · Agentic SRE
✦ Resume: "Built cost-aware LLM routing system with SLM selection, prompt caching, and weekly cost measurement dashboard — documented actual savings on real workload"
The Flagship Capstone

One system. Built
across 17 weeks.

Every module milestone adds one layer to the same system. You graduate with one cohesive, deployed, production-style agent — not seventeen unrelated demos.

★ Flagship Capstone
Agentic Engineering Workspace Copilot
AgentForge™ — Production Agentic System
An AI workspace that ingests internal docs, runbooks, CI/CD context, and screenshots — answers questions with cited multimodal sources, converts requests into structured tickets, routes work between specialist agents, governs every sensitive action through approval gates, detects PII, and runs with full observability, versioning, and cost controls.
Five Specialist Agents
📄
Knowledge Agent
Grounded retrieval, multimodal embeddings, PII-scrubbed citations
🚨
Triage Agent
Request severity, stateful routing, long-term memory
⚙️
Action Agent
Draft actions, ticket ops, HITL approval gates, audit log
💻
Repo Agent
CI/PR context, test assist, code review, AGENTS.md-aware
🛠
SRE Agent
Alert diagnosis, self-healing proposals, webhook-triggered
Weekly Capstone Milestones
Wks 1–2
Domain, users, workflows, data, success metrics, model selection rules, architecture + first eval dataset
Wks 3–4
Typed intake + streaming UI + multimodal ingestion pipeline with CLIP embeddings
Wks 5–6
Grounded Q&A + retrieval improvements with before/after benchmarks
Week 7
Full eval scorecard, regression suite, adversarial test cases, user feedback wiring
Week 8
One fine-tuned component only where it clearly beats base prompting
Week 9
Bounded tool loop with state diagram — three-layer state architecture
Week 10
Multi-agent handoffs, trace replay, long-term memory, human approval gate
Week 11
MCP tool server published, capstone exposed as A2A-compatible remote agent
Week 12
PII scrubbing, RBAC, approval gates, audit logging, responsible AI checklist
Week 13
Repo/CI features + WorkbenchAI™ UI + feedback loop → eval harness
Week 14
Cloud deploy, observability, async pipelines, versioning controls
Week 15
ReliabilityKit™ applied — debugging tree, prompt versioning, model upgrade control
Week 16
SLM routing, prompt caching, cost dashboard, event-driven SRE alert handling
Week 17
Polish, red-team, demo day, full hiring pack, PortfolioAgent deployed
What You Submit at Week 17
FastAPI backend + Docker + CI/CD + AGENTS.md
Deployable, tested, versioned
WorkbenchAI™ streaming frontend
Citations, tool traces, approval actions, feedback UI
Multimodal ingestion pipeline
PDFs, images, tables — CLIP + text embeddings
Tool layer (3+ business actions) + PII scrubbing
Approval gates, RBAC, audit logging
Multi-agent system + state diagram
5 specialist agents, long-term memory, full trace replay
MCP integration + A2A service
Signed Agent Card, SSE streaming, JWT auth
Eval dataset + ReliabilityKit™
Groundedness report, hallucination debug tree, prompt versioning log
Observability + cost measurement dashboard
OpenTelemetry traces, actual cost report on real workload
Responsible AI checklist + governance
HITL design, bias assessment, GDPR/CCPA notes
4× System design architecture diagrams
One per case study — interview-ready
PortfolioAgent + demo video (3–5 min)
A2A-compatible, shareable link, live walkthrough
Case study PDF + resume bullet pack
5 pre-written bullets with your actual metrics
Choose Your Domain at Week 1
Engineering Ops Copilot
Customer Support Agent
HR & Policy Copilot
Legal & Compliance Assistant
Developer Tools Agent
Domain locked at Week 1. Every project feeds into your domain — one cohesive system, not a collection of demos.
The 2026 Agent Protocol Stack

Three layers.
One production architecture.

All three taught with hands-on labs. Most bootcamps still teach none of them.

🔌
MCP
Model Context Protocol · Anthropic → Linux Foundation
  • Connects agents to tools, APIs, and data sources
  • 97M monthly downloads — supported by every major AI provider
  • You build and publish an MCP server in Week 11
  • Your capstone both consumes and exposes MCP capabilities
🤝
A2A
Agent-to-Agent Protocol · Google → Linux Foundation (AAIF)
  • Connects agents to other agents as discoverable peers
  • Agent Cards, task lifecycle, SSE streaming, JWT/OIDC
  • 150+ enterprise organisations standardising on this now
  • Live inter-student A2A network built in Week 11
🌐
WebMCP
Web Access Layer · AAIF — Emerging Standard
  • Connects agents to live web context and browser actions
  • Browser automation and computer-use patterns
  • Covered as advanced content in Week 17 bonus material
  • Completes the three-layer production stack
Both MCP and A2A are governed by the Linux Foundation's AAIF — co-founded by OpenAI, Anthropic, Google, Microsoft, AWS, and Block. Understanding this stack is now as essential as understanding REST APIs.
Technology Stack

The tools you'll
use at work.

Languages & Frameworks
Python 3.12+
FastAPI + Pydantic v2
Async / Await patterns
React (streaming UI)
TypeScript (basics)
AI / Models — Frontier
OpenAI GPT-4o + Agents SDK
Anthropic Claude + MCP SDK
Google Gemini (Vision + Text)
Mistral AI / Cohere
A2A Protocol SDK
AI / Models — Open Source & SLM
Llama 3 (Meta) via Hugging Face
Mistral 7B / Mixtral
Phi-3 (Microsoft)
Ollama (local inference)
PEFT / LoRA fine-tuning
CLIP / Nomic Embed (multimodal)
Agent / RAG / Eval Stack
LangGraph
LlamaIndex
RAGAS (evals)
Microsoft Presidio (PII)
PromptLayer / LangSmith
Data & Storage
Pinecone / Qdrant
PostgreSQL + pgvector
Redis (session state)
Weaviate
S3 / Cloud Storage
Infra & Observability
Docker + Compose
GitHub Actions (CI/CD)
AWS / GCP / Azure
OpenTelemetry
Prometheus + Grafana
Is This For You?

Built for developers
who build things.

✅ Great Fit
Software engineers moving into AI roles
You write production code and want to build LLM-powered systems, not just use them.
Backend developers adding AI features
You're shipping AI and want the full stack — open-source models, evals, reliability, and cost controls.
ML engineers expanding to LLM applications
You know models. You want systems, orchestration, evals, deployment, and the MCP/A2A protocol layer.
Freelancers building AI products
You want production-grade skills and a portfolio that sells itself — open-source to frontier, end-to-end.
Target Roles After Graduation
GenAI Engineer
Agent Engineer
Applied AI Engineer
LLM App Developer
AI Platform Engineer
AI Solutions Engineer
Prerequisites
✓ 1+ year professional programming experience
✓ Comfortable with Python (or complete free prep)
✓ Familiar with REST APIs and JSON
✗ No ML/AI background required
✗ No prior GenAI experience required
Interview Readiness

Know the answers
that actually matter.

Every week includes a "What interviewers ask about this module" callout. Here are eight that come up constantly.

"When would you use RAG vs fine-tuning vs agents?"
Decision framework built across Weeks 2, 5–6, and 8. Backed by real benchmark data from SpecialistTuner — your own PEFT/LoRA vs OpenAI API comparisons, not a theoretical matrix.
↑ Weeks 2, 5–6, 8
"What do you do when your LLM hallucinates in production?"
ReliabilityKit™ decision tree: retrieval gap? prompt ambiguity? model mismatch? parsing failure? Four root causes, four different fixes. You built the tree — you describe it from experience.
↑ Week 15 — ReliabilityKit™
"How do you evaluate an LLM application rigorously?"
Inter-rater reliability, evaluator bias controls, statistical significance, dataset leakage prevention — plus golden datasets, groundedness, adversarial testing. The eval science angle most candidates can't answer.
↑ Week 7 — BreakRAG™
"How do you handle PII in an AI system?"
Input scrubbing with Presidio, NER-based detection, output filtering, tiered RAG access, GDPR implications. You built GuardianAI™ — not just described the concept.
↑ Week 12 — GuardianAI™
"Explain MCP and A2A — when would you use each?"
MCP is agent-to-tool. A2A is agent-to-agent. You built and published both in Week 11. Your capstone exposes a live A2A service. Very few candidates answer this from genuine hands-on experience.
↑ Week 11 — AgentMesh™
"Walk me through designing a system like GitHub Copilot."
You have a one-page architecture diagram from Week 6's case study: code embedding strategy, retrieval pipeline, latency constraints, large-repo context management, feedback loops.
↑ Week 6 — System Design Case Study
"When would you choose an open-source model over a proprietary API?"
You benchmarked both in SpecialistTuner and wrote the decision memo. You can cite real numbers from your own experiments: accuracy, latency, cost, data privacy implications.
↑ Week 8 — SpecialistTuner
"When is multi-agent worth the added complexity?"
Week 10's explicit "reject multi-agent" case study. You argued against it for a specific scenario — you have the reasoning on record, not just the "when to use it" answer.
↑ Week 10 — TriageFlow™
Your Resume After Graduation

Bullets you can defend
in any interview.

Your Name
GenAI / Agent Engineer · CoreSmart Certified · Applied GenAI & Agentic AI Engineering
Projects & Portfolio
Built and deployed AgentForge™ — Agentic Engineering Workspace Copilot — using Python, FastAPI, LangGraph, CLIP multimodal embeddings, and 5 specialist agents. Handles 500+ knowledge queries/day with citations, PII scrubbing, long-term episodic memory, and full OpenTelemetry observability.
Exposed a production A2A-compatible remote agent with signed Agent Cards, SSE streaming, and JWT auth — integrated in a live 40-student cohort agent network. Published one MCP tool server consumed by all cohort agents.
Fine-tuned Llama 3 with PEFT/LoRA and GPT-4o-mini via OpenAI API. Benchmarked both against base prompting across accuracy, latency, cost, and data privacy. Produced decision memo used as reference for model selection decisions throughout capstone.
Built BreakRAG™ — adversarial eval harness with evaluation science rigour: inter-rater reliability controls, evaluator bias mitigation, statistical significance testing, and dataset leakage prevention. Runs as CI check on every deployment, wired to live user feedback signals.
Built ReliabilityKit™ — AI reliability toolkit covering hallucination debugging decision tree, Git-based prompt versioning workflow with regression CI, model upgrade protocol with frozen test sets, and Pydantic enforcement patterns. Implemented GuardianAI™ with Presidio PII detection validated through live adversarial red-teaming.
Technical Skills
Python, FastAPI, LangGraph, OpenAI/Anthropic APIs, PEFT/LoRA, Llama 3, Hugging Face, Ollama, MCP SDK, A2A Protocol, CLIP, Nomic Embed, Presidio, Pinecone/Qdrant, PostgreSQL/pgvector, Redis, Docker, GitHub Actions, AWS, OpenTelemetry, RAGAS, Grafana
What Our Students Say

From people who've
been through it.

The eval-first approach changed how I think about AI engineering. Most courses build first, measure never. Starting with a golden dataset in Week 2 meant every project I shipped had a measurement attached to it.
John Anderson
Software Developer
The system design case studies were the part I didn't know I needed. I'd been calling LLM APIs for a year. Week 1's ChatGPT architecture session was the first time I understood what I was actually building on top of.
Priya Sharma
Backend Engineer
The open-source fine-tuning week genuinely surprised me. I expected to just run a script. The LoRA mechanics explainer before the lab meant I understood why rank 8 vs rank 64 matters — not just how to run the code.
David Johnson
ML Engineer
Pricing & Enrollment

An investment that pays
for itself in one role.

Early Bird
$999
Save $400 · Limited seats · Closes 30 days before cohort
17-week live cohort
3-module free prep track
16 named weekly projects
AgentForge™ end-to-end capstone
4 system design case studies
Live A2A cohort network (Week 11)
CoreSmart certification
6-month community access
✗ 1-on-1 mentoring
✗ Week 18 Advanced Lab
Enroll Early
Standard
$1399
Full enrollment · All cohort dates
17-week live cohort
3-module free prep track
16 named projects + ReliabilityKit™
AgentForge™ + PortfolioAgent
4 system design case studies
Live A2A cohort network
CoreSmart certification
12-month community access
2× 1-on-1 mentoring sessions
✗ Week 18 Advanced Lab
Enroll Now
Premium
$1899
For serious career changers & consultants
Everything in Standard
Week 18: Computer Use + Multimodal Lab
5× 1-on-1 mentoring sessions
Mock technical interview
LinkedIn + resume + portfolio review
Lifetime community access
Priority cohort placement
Apply for Premium
Corporate Training
Custom cohorts for engineering teams of 5–50. Industry-specific capstone domains, bespoke case studies, and group pricing. Open-source LLM fine-tuning and responsible AI governance tracks included.
Talk to Us →
Frequently Asked Questions

Everything else
you want to know.

Why 17 weeks instead of 16?
Week 14 (deployment/observability) and Week 15 (reliability engineering) each need room to breathe. Senior engineers spend months on each of these topics. Compressing them produces shallow coverage. Week 17 is dedicated entirely to capstone hardening and demo day — not new content.
Why does RLHF appear in Week 1 but without a full curriculum?
RLHF as a concept is essential for understanding why different models behave differently and making model selection decisions. The full RL curriculum (MDPs, Policy Gradients, PPO) is a specialist track for ML researchers training foundation models — a different audience. Week 1's RLHF session is a 30-min async reading, not a live lecture slot.
Will I work with both proprietary and open-source models?
Yes — explicitly. Week 2 introduces the frontier vs. open-source decision framework. Week 8 has you fine-tune Llama 3 with PEFT/LoRA alongside OpenAI fine-tuning API, with a LoRA mechanics explainer before the lab. Week 16 covers SLM routing in production.
Is the eval harness introduced in Week 2 or Week 7?
Both. A lightweight eval primer in Week 2 establishes the instinct — golden dataset, pass/fail threshold, accuracy measurement for IntentIQ. BreakRAG™ in Week 7 is the mature, production-grade version of that habit: adversarial testing, evaluation science rigour, multi-model scoring, regression CI.
What makes AgentMesh™ work if a peer's agent is down?
CoreSmart hosts a reference fallback agent that's always available. Students test against peers first, then fall back to the hosted reference agent to complete the A2A lab exercises. The fallback is part of the design — cohort membership is not a hard dependency.
Is responsible AI taught hands-on or just as a lecture?
Hands-on — in Week 12 as part of GuardianAI™. Students complete a formal checklist covering HITL design, bias assessment, and GDPR/CCPA notes. It's a deliverable in the capstone, not a slide deck. The responsible AI checklist is submitted at Week 17 as one of the capstone artefacts.
Next Cohort Starting Soon — 40 Seats Maximum

Stop watching AI happen.
Build it.

17 weeks. 16 named projects. 4 system design cases. One deployed agentic system. A career you can actually talk about.

Reserve Your Seat → Download Syllabus
Cohorts are capped at 40 students per session to maintain live session quality and the inter-student A2A network.
CORESMART.AI
Innovate with Intelligence
Phone
+1 650 499 6634
Email
info@coresmart.ai
Web
www.coresmart.ai
Location
39159 Paseo Padre Pkwy, Ste 311, Fremont, CA 94538