Author: aiwithpe

  • Building Your Mobile Edge: An iOS Lab with Angular, Ionic, and Capacitor

    Modern IT work no longer lives only in browsers and back‑office systems; the most visible and valuable experiences now run in people’s hands, on phones they carry everywhere.

    Whether you work in data, web, cloud, or traditional enterprise development, employers increasingly expect you to understand how ideas become mobile apps that feel native, perform well, and integrate securely with the rest of the stack.

    A portfolio that includes concrete mobile work is one of the clearest signals that you can deliver end‑to‑end solutions instead of just isolated code fragments.

    • Build an Ionic + Angular + TypeScript app.
    • Create a Fahrenheit → Celsius converter with a slider.
    • Run it in the iOS Simulator.
    • Deploy it onto a physical iPhone.
    • Reflect on why Angular + Ionic + Capacitor is a powerful combination.



    Lab: iOS Temperature Converter with Angular, Ionic, and Capacitor

    https://docs.google.com/presentation/d/18yrkWTm5JRBYm6NOeYdgYdssBTuVpEnE/edit?usp=sharing&ouid=103411675731117310047&rtpof=true&sd=true

    Lab Overview

    In this lab you will:

    1. Set up a Mac + Node + Xcode environment.
    2. Create an Ionic + Angular project in VS Code.
    3. Implement a Fahrenheit → Celsius converter using an Ionic slider.
    4. Wrap the app with Capacitor and run it in the iOS Simulator.
    5. Deploy the app to a physical iPhone for testing.
    6. Understand the advantages of Angular + Ionic + Capacitor.

    Each step ends with a “Success target” so you know what you should see before moving on.


    Step 0 – Prerequisites and Setup (Mac)

    0.1 – Required hardware and OS

    You must have:

    • A Mac running a recent version of macOS that can install:
      • Xcode (from the Mac App Store).
      • Node.js (LTS version).
    • Optional but recommended:
      • An actual iPhone and a Lightning/USB‑C cable, or wireless debugging enabled.

iOS development requires macOS and Xcode; you cannot run the iOS Simulator on Windows or Linux.

    0.2 – Install Xcode

    1. Open the App Store on macOS.
    2. Install Xcode.
    3. After installation, open Xcode once:
      • Accept the license.
      • Allow any additional components to install.
    4. Close Xcode.

Xcode provides the iOS Simulator you’ll use to run your app.

    0.3 – Install Node.js and Ionic CLI

    1. Install Node.js (LTS) from the official website if it’s not already installed.
2. Open Terminal and run:

   node -v
   npm -v

   Both should print version numbers.
3. Install the Ionic CLI:

   npm install -g @ionic/cli
   ionic --version

   You should see a version number for ionic.

    0.4 – Install Visual Studio Code

    • Download and install Visual Studio Code.
    • Open VS Code once to let it register with the system.

    Step 0 – Success target

    You can open Terminal and run:

node -v
    npm -v
    ionic --version

    All three commands print version numbers, and Xcode is installed on your Mac.


    Step 1 – Create the Ionic Angular Project

You’ll create a new Ionic project using the Angular framework and a blank starter template.

    1.1 – Create project folder

    In Terminal:

cd ~
    ionic start temp-converter blank --type=angular
    cd temp-converter
    • temp-converter is the project folder.
    • Choose Yes if the CLI asks to integrate Capacitor. If not, we’ll add it later.

    1.2 – Open project in VS Code

code .

    VS Code should open with the temp-converter project.

    1.3 – Run in the browser

    In the same project folder, run:

ionic serve
    • Ionic will start a development server and open your default browser.
    • You should see a basic Ionic starter page.

    Step 1 – Success target

    In your browser you see a starter Ionic page (e.g., “Ionic App”) running at a local address (like http://localhost:8100). You can edit files in VS Code, and the browser reloads.


    Step 2 – Build the Temperature Converter UI

You’ll use Ionic’s range slider (ion-range) to select a Fahrenheit temperature and compute Celsius.

    2.1 – Locate the main page

    In VS Code, open:

    • src/app/home/home.page.html
    • src/app/home/home.page.ts
    • src/app/home/home.module.ts

    If the starter has a different main page, find the equivalent “Home” page referenced in app-routing.module.ts.

    2.2 – Replace home.page.html

    Replace the entire content of home.page.html with:

<ion-header [translucent]="true">
      <ion-toolbar>
        <ion-title>
          Fahrenheit → Celsius
        </ion-title>
      </ion-toolbar>
    </ion-header>
    
    <ion-content class="ion-padding">
    
      <ion-card>
        <ion-card-header>
          <ion-card-title>Temperature Converter</ion-card-title>
          <ion-card-subtitle>
            Use the slider to choose a Fahrenheit value
          </ion-card-subtitle>
        </ion-card-header>
    
        <ion-card-content>
          <ion-item>
            <ion-label position="stacked">
              Fahrenheit: {{ fahrenheit | number:'1.0-0' }} °F
            </ion-label>
            <ion-range
              min="0"
              max="212"
              step="1"
              [(ngModel)]="fahrenheit">
              <ion-label slot="start">0°F</ion-label>
              <ion-label slot="end">212°F</ion-label>
            </ion-range>
          </ion-item>
    
          <ion-item lines="none" class="result-row">
            <ion-label>
              Celsius:
            </ion-label>
            <ion-note slot="end" color="primary">
              {{ celsius | number:'1.1-1' }} °C
            </ion-note>
          </ion-item>
        </ion-card-content>
      </ion-card>
    
    </ion-content>

This is standard HTML with Ionic’s ion- components and Angular bindings.

    2.3 – Implement logic in home.page.ts

    Replace the contents of home.page.ts with:

import { Component } from '@angular/core';
    
    @Component({
      selector: 'app-home',
      templateUrl: 'home.page.html',
      styleUrls: ['home.page.scss'],
    })
    export class HomePage {
    
      // Slider value in Fahrenheit
      fahrenheit = 32; // default to freezing point
    
      constructor() {}
    
      // Computed Celsius value
      get celsius(): number {
        return (this.fahrenheit - 32) * 5 / 9;
      }
    
    }

    This is pure TypeScript logic backing the Ionic UI.

    2.4 – Ensure FormsModule is imported (for ngModel)

    Open src/app/home/home.module.ts. Make sure it looks like this (key part is FormsModule in imports):

import { NgModule } from '@angular/core';
    import { CommonModule } from '@angular/common';
    import { FormsModule } from '@angular/forms';
    import { IonicModule } from '@ionic/angular';
    
    import { HomePageRoutingModule } from './home-routing.module';
    import { HomePage } from './home.page';
    
    @NgModule({
      imports: [
        CommonModule,
        FormsModule,
        IonicModule,
        HomePageRoutingModule
      ],
      declarations: [HomePage]
    })
    export class HomePageModule {}

FormsModule is required for [(ngModel)] to work.

    2.5 – Test in the browser

    If ionic serve is still running, it should auto‑reload. If not, start it again:

ionic serve

    In the browser:

    • You should see a header “Fahrenheit → Celsius”.
    • You should see a card with a slider from 0°F to 212°F.
    • As you move the slider:
      • The Fahrenheit label updates.
      • The Celsius value updates live and correctly.

    Try these test points:

    • 32°F → 0°C
    • 212°F → 100°C
    • 68°F → approximately 20°C
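
If you want to verify these reference points programmatically, the starter project ships with Karma/Jasmine. A minimal spec (e.g., in src/app/home/home.page.spec.ts) might look like the sketch below; run it with ng test:

import { HomePage } from './home.page';

describe('HomePage temperature conversion', () => {
  it('converts known Fahrenheit reference points to Celsius', () => {
    const page = new HomePage();

    page.fahrenheit = 32;
    expect(page.celsius).toBeCloseTo(0, 1);   // freezing point

    page.fahrenheit = 212;
    expect(page.celsius).toBeCloseTo(100, 1); // boiling point

    page.fahrenheit = 68;
    expect(page.celsius).toBeCloseTo(20, 1);  // room temperature
  });
});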

    Step 2 – Success target

    In the browser:

    • Moving the slider changes the Fahrenheit number and the Celsius result live.
    • The formula behaves correctly for known reference temperatures.

    Step 3 – Add Capacitor and iOS Platform

Now you’ll wrap the web app in a native iOS shell using Capacitor, which generates an Xcode project and a WKWebView container.

    3.1 – Initialize Capacitor (if not already done)

    In the project root (temp-converter):

npx cap init

    When prompted:

    • App name: Temp Converter
    • App ID: something like com.yourname.tempconverter

This creates a capacitor.config.ts or capacitor.config.json file that tells Capacitor about your app.
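
For reference, a minimal capacitor.config.ts for this lab might look like the following sketch; the appId and appName values echo the answers above, and webDir is www for Ionic Angular builds:

import { CapacitorConfig } from '@capacitor/cli';

const config: CapacitorConfig = {
  appId: 'com.yourname.tempconverter',
  appName: 'Temp Converter',
  webDir: 'www',
};

export default config;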

    3.2 – Add iOS platform

    Add the iOS platform once:

npx cap add ios

This generates an ios/ folder with a complete native iOS project that loads your Ionic web app.

    3.3 – Build and sync

    Whenever you change your app and want to run on iOS, do:

ionic build
    npx cap sync ios
    • ionic build generates the production web bundle in www/.
• npx cap sync ios copies www/ into the native project and updates plugins.

    Step 3 – Success target

    In your project folder you now have an ios directory. Running npx cap sync ios completes without errors.


    Step 4 – Run in iOS Simulator via Xcode

You’ll now open the native iOS project in Xcode and run it on a simulated iPhone.

    4.1 – Open Xcode project

npx cap open ios
    • Xcode opens with a workspace named something like App.xcworkspace under the ios/App folder.

    4.2 – Choose a Simulator

    In Xcode:

    1. At the top toolbar, find the device selector (next to the “Run” ▶ button).
    2. Choose a simulator (e.g., iPhone 15 Pro).

    4.3 – Run the app

    Click the Run ▶ button.

    • The iOS Simulator launches automatically.
    • Your app appears and shows the same temperature converter UI, now running in a native iOS container.

    Step 4 – Success target

    In the iOS Simulator:

    • You can move the slider with the mouse/trackpad.
    • Fahrenheit and Celsius values update live, just as in the browser.
    • The app has an iOS status bar and behaves like a native app.

    Step 5 – Run on a Physical iPhone

Now you’ll install the app onto a real iPhone directly from Xcode.

    5.1 – Prepare your device

    On your iPhone:

    • Connect the iPhone to your Mac via USB (or configure wireless debugging).
    • Unlock the device and accept any “Trust This Computer?” prompts.

    In Xcode:

    1. Add your Apple ID (if not already):
      • Xcode → Settings → Accounts → Add Apple ID.
    2. Ensure your device appears in the device list at the top of Xcode.

    5.2 – Configure signing

    Back in the Xcode project:

    1. Select the App target in the Project Navigator.
    2. Click the Signing & Capabilities tab.
    3. Check Automatically manage signing.
4. Choose your Team (Apple ID or developer team).

    Xcode should create a development provisioning profile.

    5.3 – Run on your iPhone

    1. In the device selector at the top of Xcode, choose your physical iPhone.
    2. Click the Run ▶ button.

    First run:

• iOS may show a warning about an untrusted developer.
    • If so, go to Settings → General → VPN & Device Management and trust the certificate.

    After that:

    • The app will install and launch on your iPhone home screen.

    Step 5 – Success target

    On your physical iPhone:

    • You see the Temp Converter app icon.
    • Tapping it opens the Ionic temperature converter.
    • The slider works, and values update live, just like in the browser and simulator.

    Step 6 – Reflection: Why Angular + Ionic + Capacitor?

    To close the lab, reflect on the advantages:

    1. Single codebase for multiple platforms
      • You wrote the UI and logic once in TypeScript + Angular + Ionic, and it runs:
        • In a browser (ionic serve).
        • In iOS Simulator.
        • On a physical iPhone.
  • Capacitor can also target Android using the same code.
    2. Modern, component‑based UI
  • Ionic provides a full set of mobile‑style components (ion-range, ion-card, ion-toolbar, etc.) that automatically adapt to iOS and Android look‑and‑feel.
      • Angular components and templates keep UI and logic organized and testable.
    3. Native capabilities via plugins
  • Capacitor’s plugin system gives you access to camera, geolocation, filesystem, notifications, and more from TypeScript (see the sketch after this list).
  • You can still drop into Swift/Objective‑C to create custom native plugins when needed.
    4. Standard iOS toolchain
      • The generated iOS project is a normal Xcode project, so all native tooling still works:
        • iOS Simulator.
        • Device deployment.
        • Profiling and debugging.
    • App Store / TestFlight distribution.
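
To see what a plugin call looks like in practice, here is a minimal sketch using the official Geolocation plugin. It assumes you have installed @capacitor/geolocation (npm install @capacitor/geolocation followed by npx cap sync); the function name is illustrative.

import { Geolocation } from '@capacitor/geolocation';

// Read the device's current position; works in the browser,
// the iOS Simulator, and on a physical device.
export const printPosition = async (): Promise<void> => {
  const position = await Geolocation.getCurrentPosition();
  console.log(`lat ${position.coords.latitude}, lng ${position.coords.longitude}`);
};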

    Hard Targets (Checklist)

    By the time you finish this lab, you should be able to say “yes” to all of the following:

    1. Mac setup
      • I can run node -v, npm -v, and ionic --version on my Mac, and I have Xcode installed.
    2. Web app working
      • I can run ionic serve and see a working Fahrenheit → Celsius converter with a slider in the browser.
    3. Capacitor/iOS project generated
      • I have run npx cap init, npx cap add ios, and npx cap sync ios without errors.
      • There is an ios folder in my project.
    4. iOS Simulator working
      • I have run npx cap open ios, selected a simulator, and launched the app in the iOS Simulator using Xcode.
    5. Physical device deployment
      • I have configured signing in Xcode, selected my iPhone, and deployed the app to my device.
      • I can open the Temp Converter app and use it on my iPhone.

    First, instead of wiring up Storyboards and dragging connections from UI widgets to IBOutlets and IBActions in Xcode, you designed your interface as code‑first UI using Ionic’s HTML templates and TypeScript. This approach is much friendlier to modern practices like version control, code review, and CI/CD pipelines, because your entire UI lives in readable, diff‑able source files rather than opaque Storyboard XML. It also lets you reuse your web skills across platforms and keep your layout logic close to your business logic, instead of split between Interface Builder and code.

    Second, you experienced how navigation and screen transitions are handled declaratively with Angular’s router, not with ad‑hoc view controller wiring. Routes define which component shows for each URL, and Ionic’s ion-router-outlet adds mobile‑style transitions and a navigation stack on top. This means that pushing, popping, and passing data between screens follows clear, testable patterns you already know from Angular, while still feeling native on iOS.
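
For reference, the blank starter wires this up in src/app/app-routing.module.ts, which looks roughly like the sketch below (exact contents can vary slightly by Ionic version):

import { NgModule } from '@angular/core';
import { PreloadAllModules, RouterModule, Routes } from '@angular/router';

const routes: Routes = [
  {
    path: 'home',
    // Lazy-load the Home page module on first navigation
    loadChildren: () => import('./home/home.module').then(m => m.HomePageModule),
  },
  {
    path: '',
    redirectTo: 'home',
    pathMatch: 'full',
  },
];

@NgModule({
  imports: [RouterModule.forRoot(routes, { preloadingStrategy: PreloadAllModules })],
  exports: [RouterModule],
})
export class AppRoutingModule {}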

    Finally, by wrapping your Angular/Ionic app with Capacitor, you were able to run exactly the same code:

    • In a browser for fast iteration.
    • Inside a native iOS shell on the Simulator.
    • On a physical iPhone as a real app.

    You now have a working example of:

    • Code‑first UI with Ionic components.
    • Navigation powered by Angular’s router.
    • Cross‑platform delivery using Capacitor and Xcode.

    These are the key wins that make Angular + Ionic + Capacitor a compelling option for modern iOS development and for integrating iOS into CI/CD‑driven, multi‑platform projects.

  • Android Studio Panda 4 and the Rise of AI-First Kotlin Development

    Why students should learn to build AI-enabled Android apps now

    Mobile Development Is Not Declining — It Is Becoming the Edge of AI

There is a persistent myth floating around that mobile application development is somehow a declining technology, no longer worth learning.

    Nothing could be further from the truth.

    With edge computing, the Internet of Things, and the rising need to make AI-powered intelligent applications available everywhere, learning to build apps with Android is one of the fastest lanes for new developers to learn the software engineering principles that put information intelligence in everyone’s pocket.

Back in IBM’s heyday in the early 1990s, everyone was talking about “ubiquitous computing.” We did not fully know what it meant then, but it had that cool technical panache. It sounded like the future was coming, even if the shape of that future was still hidden in the fog.

    Now the fog has lifted.

    We are all living on the edge now.

    Our phones are not just communication devices. They are sensors, cameras, wallets, identity systems, learning tools, business dashboards, AI clients, and personal command centres. The mobile platform is where cloud intelligence, local data, human attention, and real-world context all meet.

    That is why Android development matters.

    Mobile platform computing is not yesterday’s skill. It is the next enabler.

    And for students who want to become serious developers in the AI age, Android Kotlin development offers something rare: a practical, hands-on way to learn user interface design, APIs, cloud integration, databases, secure architecture, edge-aware thinking, and AI-powered business logic — all inside one platform that people actually carry with them every day.

    The future of AI will not live only in research labs, enterprise dashboards, or browser windows.

    It will live in apps.

    It will live in pockets.

    And the developers who understand how to build those apps will be the ones who help bring intelligence to the edge of everyday life.

    Android development has entered a new era.

    https://docs.google.com/presentation/d/1crk3WAhsI5V9iRj7OXPBgMmvWXtxTank/edit?usp=sharing&ouid=103411675731117310047&rtpof=true&sd=true

    Now Android Studio Panda 4 adds something new to that picture: AI is becoming part of the development environment itself.

    Android Studio Panda 4 is now stable and includes major AI-assisted development features such as Planning Mode, Next Edit Prediction, Ask Mode, and Agent Web Search. Google describes Planning Mode as a way for the agent to create a detailed project plan before making code changes, while Next Edit Prediction is designed to suggest related edits even away from the current cursor position. (Android Developers)

    That matters deeply for students.

    Because the winning student of the next few years will not merely know how to “use AI.” The winning student will know how to build AI into applications.

    And in Android Kotlin development, that means learning to place AI where it belongs: not as a toy chatbot pasted onto the side of the app, but as a properly designed service inside the business logic layer.


    The new Android developer is an AI systems builder

    Here is the shift I want my students to understand:

    The app is no longer just a user interface connected to a database.
    The modern app is a user interface connected to memory, reasoning, retrieval, workflow, and AI services.

    That is why AI should now sit at the centre of your development efforts.

    Not because AI writes all the code for you.

    That is the cheap interpretation.

    The serious interpretation is this:

    A modern Kotlin Android app may now include:

For each layer, the traditional purpose is shown first, then the AI-first purpose:

• Compose UI: display screens and receive input → let users interact with intelligent workflows
• ViewModel: manage state and events → coordinate AI calls, loading states, retrieved context, and generated responses
• Repository: fetch and store data → retrieve documents, notes, embeddings, and AI outputs
• Business logic: apply rules → decide when to call Gemini, ChatGPT, Grok, or another model
• Backend/Firebase: authentication, storage, functions → secure key management, model routing, AI service orchestration
• MongoDB / vector store: store documents → support retrieval-augmented generation, or RAG

    This is where the gold is.

    Students who learn this early can build apps that do more than display information. They can build apps that reason over information.


    1. Planning Mode

    Planning Mode is the big one.

    Instead of asking AI to immediately produce code, students can ask the agent to create an implementation plan first. This supports the teaching principle we have been developing for Android classes: deliberation before coding.

    That lines up directly with our Planning Mode teaching module: students should read the specification, identify UI responsibilities, data responsibilities, navigation, state, and risks before touching the code. The teaching material emphasizes that Planning Mode is a “no code edits yet” phase where students produce a written implementation plan before implementation begins.

    This is exactly how we stop students from treating AI as a vending machine.

    The student should not say:

    “Build me the app.”

    The student should say:

    “Here is the app specification. Create a plan showing screens, state, repositories, API calls, data models, error handling, and testing steps. Do not write code yet.”

    That is the difference between AI dependency and AI-augmented engineering.


    2. Next Edit Prediction

    Next Edit Prediction, or NEP, is especially useful for Kotlin students because Android development often involves related changes across multiple files.

    Change a data class, and you may need to update:

    • a ViewModel
    • a repository
    • a mapper
    • a Compose screen
    • a test
    • a Firebase DTO
    • a serialization model

    Google describes NEP as an evolution of code completion that anticipates edits away from the current cursor position, not just at the line where you are typing. (Android Developers)

    For teaching, this is beautiful.

    It helps students see that professional code is connected. A change in one file has consequences elsewhere. NEP becomes a kind of “codebase radar.”


    3. Agent Web Search

    Agent Web Search lets the Gemini agent pull current documentation for third-party libraries directly into Android Studio. Google’s release notes describe this as expanding Gemini beyond the Android knowledge base so it can fetch current reference material from the web for external libraries such as Coil, Koin, or Moshi. (Android Developers)

    This matters because students often work from outdated tutorials.

    Agent Web Search helps keep the student closer to current practice.


    The real win: AI inside the app, not just inside the IDE

    The IDE is only half the story.

    The more important teaching move is this:

    Use AI to build the app, then build AI into the app.

    That means teaching students to integrate model APIs into Kotlin apps through a clean architecture.

    Do not hardwire “Gemini” or “ChatGPT” all over the UI.

    Instead, teach a stable abstraction:

    interface AIClient {
        suspend fun complete(prompt: String): String
    }
    

    Then you can have different implementations:

    class GeminiClient : AIClient {
        override suspend fun complete(prompt: String): String {
            // Call Gemini through Firebase AI Logic
            return "Gemini response"
        }
    }
    
    class OpenAIClient : AIClient {
        override suspend fun complete(prompt: String): String {
            // Call OpenAI Responses API through backend or secure service
            return "OpenAI response"
        }
    }
    
    class GrokClient : AIClient {
        override suspend fun complete(prompt: String): String {
            // Call xAI Grok API through backend or secure service
            return "Grok response"
        }
    }
    

    This teaches students one of the most valuable professional patterns in AI application development:

    Your app should depend on an AI capability, not on a single vendor.

    Firebase AI Logic supports Gemini model access from mobile and web apps, including Kotlin and Java SDKs for Android. (Firebase) OpenAI’s platform exposes APIs for text, structured output, multimodal workflows, tools, and stateful interactions through the Responses API. (OpenAI Developers) xAI also provides API access for integrating Grok models into applications. (xAI Docs)
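
As one concrete illustration, the GeminiClient stub above could be filled in along the following lines using the Firebase Kotlin SDK. Treat this as a sketch: the artifact, package, and model names are assumptions that shift between SDK releases, so check the current Firebase docs.

import com.google.firebase.Firebase
import com.google.firebase.vertexai.vertexAI

class GeminiClient : AIClient {
    // Model name is an example; pick one from the current Firebase docs.
    private val model = Firebase.vertexAI.generativeModel("gemini-1.5-flash")

    override suspend fun complete(prompt: String): String {
        val response = model.generateContent(prompt) // suspend call, off the main thread
        return response.text ?: ""
    }
}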

    That means students can learn a vendor-neutral design:

    Compose UI
       ↓
    ViewModel
       ↓
    AIUseCase / Business Logic
       ↓
    AIClient interface
       ↓
    Gemini / ChatGPT / Grok / other model provider
    

    That is serious architecture.

    That is employable knowledge.
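
To make the middle of that diagram concrete, here is a minimal sketch of an AI use case that depends only on the AIClient interface; the class name and prompt shape are illustrative:

// Business-logic layer: knows how to frame the prompt,
// but not which vendor answers it.
class AskStudyQuestionUseCase(private val client: AIClient) {

    suspend operator fun invoke(question: String, context: String): String {
        val prompt = """
            You are a study coach. Use only the context below.

            Context:
            $context

            Question:
            $question
        """.trimIndent()
        return client.complete(prompt)
    }
}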


    Firebase as the RAD backbone for AI apps

    Firebase is now one of the fastest ways to teach students how to build serious AI-enabled mobile apps without forcing them to become backend infrastructure engineers on day one.

    Firebase AI Logic is designed to let developers build generative AI features into mobile and web apps using Gemini models, with Android support through Kotlin and Java SDKs. (Firebase) Firebase also provides a Gemini API template through Firebase Studio for building apps with the Gemini API pre-loaded. (Firebase)

    For students, Firebase can serve as a RAD environment: Rapid Application Development.

    It gives them a practical path to:

    • authenticate users
    • store app data
    • call cloud functions
    • manage AI access more safely
    • avoid embedding raw API keys directly into the Android app
    • connect app logic to Gemini-powered features

    This is a major professionalism point.

    One of the pitfalls in AI-assisted Android development is leaking sensitive data or keys to third-party APIs, or sending user data without proper masking and consent. Our AI-assisted Android pitfall guide explicitly flags weak privacy handling, bad key practices, and poor review/testing habits as recurring problems students must learn to avoid.

    So the classroom message is simple:

    Do not build “AI toy apps.”
    Build AI apps with architecture, privacy, testing, and secure backend thinking.


    Lab: Build an AI Study Coach with Android Studio Panda 4, Kotlin, Firebase, Gemini, and MongoDB RAG

    Project theme

    Students will build a simple AI-powered Android app called:

    StudyForge AI

    The app helps a student save study notes and then ask questions about those notes.

    Example:

    The student saves notes like:

    Kotlin coroutines let us run asynchronous work without blocking the main thread.
    

    Then the student asks:

    Why should I not make network calls on the main thread?
    

    The app retrieves relevant notes from MongoDB, sends them as context to an AI model, and returns a study explanation.

    That gives students a practical introduction to RAG: Retrieval-Augmented Generation.

    MongoDB Atlas Vector Search supports semantic search by storing vector representations of data and retrieving relevant documents for generative AI applications. (MongoDB) MongoDB’s own RAG tutorials show how to create vector search indexes, store embeddings, and retrieve relevant documents for LLM-powered applications. (MongoDB)

    For a student lab, I would keep MongoDB on the backend side rather than embedding database credentials directly into the Android app. The Android app should call Firebase or a small backend endpoint, and that backend should talk to MongoDB.

    That keeps the app cleaner and safer.


    What students will build

    The app will include:

• Add study note: user saves short study notes
• View saved notes: Compose displays a list
• Ask AI: user asks a question
• Retrieve context: backend searches MongoDB for relevant notes
• Generate answer: Gemini, ChatGPT, Grok, or another model answers using retrieved notes
• Display answer: Compose UI shows the AI response

Following current development trends, we showcase the modern, Compose-first way of doing Android.

The benefit? Lots, but mainly this: if you want to play in this space, you need to get on board with Docker and especially the CI/CD way of generating your apps directly from Git. Git handles code well; XML declaration files for UI, not so much. Compose keeps the entire UI in Kotlin code, which is exactly what those pipelines want.

    Architecture

    Android Kotlin App
       ↓
    Jetpack Compose UI
       ↓
    StudyCoachViewModel
       ↓
    StudyCoachRepository
       ↓
    Firebase Callable Function or HTTPS endpoint
       ↓
    MongoDB notes collection + vector search
       ↓
    AI model provider: Gemini / ChatGPT / Grok
       ↓
    Answer returned to Android app
    

    This is the key teaching point:

    The Android app is not “the whole system.”
    The Android app is the mobile front end of an AI-enabled system.

    That is how modern apps increasingly work.


    Step 1: Create the Android Studio Panda 4 project

    1. Open Android Studio Panda 4.
    2. Create a new project.
    3. Choose a Kotlin + Jetpack Compose project.
    4. Use the Gemini API Starter template where available.
    5. Run the starter app on an emulator.

    Now pause.

    Before coding, students must use Planning Mode.

    Prompt:

    I am building an Android Kotlin Jetpack Compose app called StudyForge AI.
    
    The app lets users save short study notes, view them in a list, ask a question, retrieve relevant notes from a MongoDB-backed RAG service, and send the question plus retrieved notes to an AI model.
    
    Create an implementation plan only. Do not write code yet.
    
    Include:
    - screens
    - composables
    - ViewModel state
    - repository methods
    - backend API calls
    - data models
    - loading and error states
    - testing steps
    

    Students should save the plan as part of the assignment.

    This matches the teaching strategy from our earlier Planning Mode module: students should submit not only working code, but also the plan, prompts, AI responses, and their own edits to the plan.


    Step 2: Create the core data model

    Create a Kotlin data class:

    data class StudyNote(
        val id: String,
        val text: String,
        val createdAt: Long
    )
    

    Then create a second model for AI answers:

    data class StudyAnswer(
        val question: String,
        val answer: String,
        val sources: List<StudyNote>
    )
    

    Teaching note:

    This is a good moment to use Next Edit Prediction. After changing the data model, students should watch how Android Studio suggests related updates in ViewModels, repositories, or UI files.


    Step 3: Build the Compose screen

    Create a simple Compose screen:

// Imports assume Material 3 and the lifecycle-viewmodel-compose artifact.
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.lazy.LazyColumn
import androidx.compose.foundation.lazy.items
import androidx.compose.material3.Button
import androidx.compose.material3.OutlinedTextField
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.collectAsState
import androidx.compose.runtime.getValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp
import androidx.lifecycle.viewmodel.compose.viewModel

@Composable
    fun StudyForgeScreen(
        viewModel: StudyForgeViewModel = viewModel()
    ) {
        val notes by viewModel.notes.collectAsState()
        val newNote by viewModel.newNote.collectAsState()
        val question by viewModel.question.collectAsState()
        val answer by viewModel.answer.collectAsState()
        val isLoading by viewModel.isLoading.collectAsState()
    
        Column(modifier = Modifier.padding(16.dp)) {
            Text("StudyForge AI")
    
            OutlinedTextField(
                value = newNote,
                onValueChange = viewModel::onNewNoteChanged,
                label = { Text("Add a study note") }
            )
    
            Button(onClick = viewModel::saveNote) {
                Text("Save Note")
            }
    
            LazyColumn {
                items(notes) { note ->
                    Text(note.text)
                }
            }
    
            OutlinedTextField(
                value = question,
                onValueChange = viewModel::onQuestionChanged,
                label = { Text("Ask a question") }
            )
    
            Button(onClick = viewModel::askQuestion) {
                Text("Ask AI")
            }
    
            if (isLoading) {
                Text("Thinking...")
            }
    
            if (answer.isNotBlank()) {
                Text("AI Answer")
                Text(answer)
            }
        }
    }
    

    This is not meant to be visually perfect.

    It is meant to teach structure.

    Students can improve the UI later.


    Step 4: Create the ViewModel

import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.launch
import java.util.UUID

class StudyForgeViewModel(
        private val repository: StudyForgeRepository = StudyForgeRepository()
    ) : ViewModel() {
    
        private val _notes = MutableStateFlow<List<StudyNote>>(emptyList())
        val notes: StateFlow<List<StudyNote>> = _notes
    
        private val _newNote = MutableStateFlow("")
        val newNote: StateFlow<String> = _newNote
    
        private val _question = MutableStateFlow("")
        val question: StateFlow<String> = _question
    
        private val _answer = MutableStateFlow("")
        val answer: StateFlow<String> = _answer
    
        private val _isLoading = MutableStateFlow(false)
        val isLoading: StateFlow<Boolean> = _isLoading
    
        fun onNewNoteChanged(value: String) {
            _newNote.value = value
        }
    
        fun onQuestionChanged(value: String) {
            _question.value = value
        }
    
        fun saveNote() {
            val text = _newNote.value.trim()
            if (text.isBlank()) return
    
            val note = StudyNote(
                id = UUID.randomUUID().toString(),
                text = text,
                createdAt = System.currentTimeMillis()
            )
    
            _notes.value = _notes.value + note
            _newNote.value = ""
    
            viewModelScope.launch {
                repository.saveNote(note)
            }
        }
    
        fun askQuestion() {
            val currentQuestion = _question.value.trim()
            if (currentQuestion.isBlank()) return
    
            viewModelScope.launch {
                _isLoading.value = true
                _answer.value = repository.askAI(currentQuestion)
                _isLoading.value = false
            }
        }
    }
    

    Teaching note:

    Students must understand why the AI call runs inside viewModelScope.launch.

    One of the common Android AI pitfalls is running inference or network calls on the main thread, causing freezes or ANRs. Our pitfall guide specifically recommends lifecycle-aware background work such as coroutines, WorkManager, and lifecycle-aware scopes for AI integration labs.


    Step 5: Create the repository

    class StudyForgeRepository(
        private val apiClient: StudyForgeApiClient = StudyForgeApiClient()
    ) {
        suspend fun saveNote(note: StudyNote) {
            apiClient.saveNote(note)
        }
    
        suspend fun askAI(question: String): String {
            return apiClient.askQuestion(question)
        }
    }
    

    The repository keeps the ViewModel clean.

    This is where students learn separation of concerns.

    The UI should not know whether the answer came from Gemini, ChatGPT, Grok, or a future model that has not been invented yet.


    Step 6: Connect to Firebase or backend endpoint

    For teaching, keep this part simple.

    The Android app calls:

    POST /saveNote
    POST /askQuestion
    

    The backend handles:

    1. storing notes in MongoDB
    2. embedding the note
    3. retrieving relevant notes
    4. calling the selected AI model
    5. returning the answer

    A simplified Android API client might look like:

    class StudyForgeApiClient {
    
        suspend fun saveNote(note: StudyNote) {
            // Send note to Firebase function or backend endpoint
        }
    
        suspend fun askQuestion(question: String): String {
            // Send question to Firebase function or backend endpoint
            // Receive AI answer as String
            return "AI answer will appear here"
        }
    }
    

    In a production-quality version, students should use Retrofit, Ktor Client, Firebase Functions, or Firebase AI Logic depending on the teaching path.
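
For example, a Firebase-based implementation could use Callable Functions. The sketch below assumes the firebase-functions-ktx and kotlinx-coroutines-play-services dependencies, plus deployed functions named saveNote and askQuestion (both names are assumptions):

import com.google.firebase.functions.ktx.functions
import com.google.firebase.ktx.Firebase
import kotlinx.coroutines.tasks.await

class StudyForgeApiClient {

    private val functions = Firebase.functions

    suspend fun saveNote(note: StudyNote) {
        // Calls the deployed Callable Function named "saveNote" (assumed name).
        functions.getHttpsCallable("saveNote")
            .call(mapOf("id" to note.id, "text" to note.text, "createdAt" to note.createdAt))
            .await()
    }

    suspend fun askQuestion(question: String): String {
        // Calls the deployed Callable Function named "askQuestion" (assumed name).
        val result = functions.getHttpsCallable("askQuestion")
            .call(mapOf("question" to question))
            .await()
        val data = result.data as? Map<*, *>
        return data?.get("answer") as? String ?: "No answer returned"
    }
}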


Step 7: Implement the backend RAG sequence

The backend should perform this sequence:

    Receive question
       ↓
    Generate embedding for the question
       ↓
    Search MongoDB for similar note embeddings
       ↓
    Retrieve top 3–5 relevant notes
       ↓
    Build prompt with retrieved notes
       ↓
    Call Gemini / ChatGPT / Grok
       ↓
    Return answer to Android app
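
On the MongoDB side, the retrieval step typically runs an Atlas Vector Search aggregation. Here is a sketch in mongosh syntax, where the collection, index, and field names are assumptions:

// questionEmbedding is the embedding computed for the student's question.
db.notes.aggregate([
  {
    $vectorSearch: {
      index: "notes_vector_index",   // Atlas Vector Search index (assumed name)
      path: "embedding",             // field storing each note's embedding
      queryVector: questionEmbedding,
      numCandidates: 100,
      limit: 5
    }
  },
  { $project: { text: 1, score: { $meta: "vectorSearchScore" } } }
])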
    

    Example prompt sent to the model:

    You are a helpful study coach.
    
    Use only the notes below as your source material.
    If the answer is not present in the notes, say what is missing.
    
    Student question:
    {question}
    
    Relevant notes:
    {retrieved_notes}
    
    Answer in clear student-friendly language.
    

    This teaches students that RAG is not magic.

    It is a workflow:

    Store knowledge. Retrieve relevant knowledge. Add it to the prompt. Ask the model to answer from that context.


    Step 8: Add a model switcher

    Once the Gemini path works, students can add a provider setting:

    enum class AIProvider {
        GEMINI,
        OPENAI,
        GROK
    }
    

    Then the backend can route the request:

    if provider == GEMINI → call Gemini
    if provider == OPENAI → call OpenAI
    if provider == GROK → call xAI Grok
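
In Kotlin, that routing collapses to a single expression over the AIClient interface defined earlier:

fun clientFor(provider: AIProvider): AIClient = when (provider) {
    AIProvider.GEMINI -> GeminiClient()
    AIProvider.OPENAI -> OpenAIClient()
    AIProvider.GROK -> GrokClient()
}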
    

    This reinforces vendor-neutral architecture.

    The lesson is not “learn one AI API.”

    The lesson is:

    Learn how AI APIs fit into application architecture.

    That is a much more durable skill.


    Student deliverables

    Students submit:

    1. Screenshot of the running app
    2. Kotlin data models
    3. Compose screen
    4. ViewModel
    5. Repository/API client
    6. Planning Mode document
    7. AI prompts used
    8. Short reflection: “What did AI help with, and what did I have to verify?”

    Assessment should not reward blind copying. Our prior Android teaching outline stresses that students should be graded on planning, AI prompt quality, edits, final code clarity, and their ability to critique AI output.


    Common warnings for students

    Do not put raw API keys in your Android app

    Mobile apps can be inspected. Secrets embedded in APKs are not truly secret.

    Use Firebase, backend functions, or secure server-side routing.

    Do not paste private user data into prompts without thinking

    AI apps must be designed with privacy awareness.

    Do not accept generated code blindly

    AI can create code that looks professional but contains lifecycle mistakes, outdated APIs, bad threading, or weak error handling.

    Do not start with multi-agent complexity

    For student projects, begin with one clean API call.

    Then add retrieval.

    Then add model switching.

    Then add advanced orchestration.

    In that order.


    Conclusion: this is the moment for AI-enabled Android students

    Android Studio Panda 4 is not just another IDE update.

    It is a signal.

    The development environment is becoming AI-assisted. The applications are becoming AI-enabled. The student who understands both sides of that equation has a real advantage.

    This is why I am bringing this into my teaching practice.

    Students should not graduate knowing only how to build static screens and simple CRUD apps. They should graduate understanding how to build apps where AI is part of the reasoning layer, the business logic layer, and the user value proposition.

    The next wave of Android apps will not merely ask:

    “What button did the user press?”

    They will ask:

    “What does the user need to understand, decide, retrieve, summarize, automate, or create?”

    That is the opportunity.

    Android Studio Panda 4 gives us the development environment.

    Kotlin gives us the app architecture.

    Firebase gives us the rapid backend.

    MongoDB gives us the memory and retrieval layer.

    Gemini, ChatGPT, Grok, and other models give us the reasoning engines.

    Now the job of the student is to learn how to connect them intelligently.

That is where the next generation of AI-enabled Android developers will win.

    Show Me The Money: The Android Job Scene in Toronto

    Let’s be blunt: you are not studying late at night and grinding through labs for a gold star sticker. You want a career, rent money, travel money, and—yes—some room for fun. Android development in Toronto/GTA can absolutely give you that.

Right now there are around a hundred Android‑focused roles and many more “mobile developer (iOS/Android)” postings in the Toronto area, across banks, consultancies, and product companies. That means real demand, not just hype. Companies like General Motors, TD, Tangerine, and dozens of startups and fintechs list Android and Kotlin as core skills for their mobile teams.

The pay is serious even at the junior level. Glassdoor data for Toronto shows Android developers earning a typical base range of about 66,000–101,000 CAD, with an average around 88,000 CAD once you have some experience under your belt. PayScale puts an entry‑level Android developer (less than one year) around 51,000 CAD and early‑career (1–4 years) around 73,000 CAD in Toronto. In other words, if you put a couple of focused years into building skills and a portfolio, seeing numbers in the 70k–90k range is realistic—not a fantasy.

As you level up, the ceiling gets much higher. Senior and staff Android roles in Toronto regularly advertise six‑figure salaries, with some postings showing 140,000–160,000 CAD or more for specialized Android work. Crypto, fintech, and big‑tech‑adjacent companies sometimes push even higher, with some data sources reporting averages above 120,000 CAD for experienced Android developers in the city.


    Why This Matters For Your Life (Not Just Your Resume)

    Money isn’t everything, but it changes your options. A solid Android or mobile developer salary in Toronto can mean:

    • Moving out sooner and choosing where you want to live, instead of taking whatever is cheapest.
    • Paying off OSAP or other loans on your terms.
    • Having the budget for travel, festivals, hobbies, and the kind of social life that makes your twenties and thirties memorable.
    • The confidence that comes from being in demand—recruiters reach out to you, not the other way around.

    Whether you’re a pragmatic young woman who wants independence and career security, or a young guy who wants enough income to impress himself and everyone around him, the equation is the same: tech skills that employers actually pay for. Android is one of those skills.


    How George Brown Full Stack Leads Into These Jobs

    Here’s the good news: the George Brown Full Stack program already teaches most of the building blocks Toronto employers are paying for.

    Job ads for Android and junior mobile developers in the GTA consistently mention:

• Kotlin or Java, plus Android Studio, as the main programming environment.
• REST APIs, JSON, and cloud platforms like Firebase or AWS.
• Databases and data modeling—skills you practice in your back‑end and SQL courses.
• Version control with Git and working in agile teams.

    When you add a couple of focused Android projects on top of your Full Stack coursework—especially AI‑powered apps built in Android Studio Panda 4—you suddenly match the wish list in real Toronto job postings. The difference between “I took some courses” and “I can show you a working Android app that talks to a cloud backend and uses AI” is the difference between hoping for a job and walking into interviews with leverage.


    Android + AI: An Edge In a Crowded Market

    Toronto is competitive, which means you want something that makes your resume jump out of the pile. Right now, that “something” is clearly AI.

    Employers are already asking mobile teams to integrate chatbots, recommendation systems, and smart in‑app assistants. When you can say, “I’ve built an AI‑powered Android app in Kotlin using Jetpack Compose, Firebase, and an external model like Gemini or ChatGPT,” you are no longer just another junior dev—you are the person who can help them ship the next generation of their product.

That’s exactly what we practice in my labs: Android Studio Panda 4, AI agents in the IDE, Firebase for secure backends, and MongoDB/RAG for intelligent data retrieval. It’s not just a cool classroom exercise; it’s training for the job descriptions that are live in Toronto right now.


    Bottom Line

    If your goal is financial independence, career flexibility, and the ability to build things people actually use every day, Android development is a very pragmatic path—especially when combined with the George Brown Full Stack program. The market is there, the salary bands are real, and the skills you learn in class map directly to what Toronto employers are hiring for.

Toronto Android developer roles expect a mix of solid Kotlin/Android fundamentals, modern architecture, cloud/API skills, and collaboration practices.


    Core Android & Kotlin skills

• Strong Kotlin (and often some Java) with Android Studio and the Android SDK.
• Experience building screens with Jetpack Compose and modern UI toolkits.
• Understanding of Android components (activities, fragments, services), app lifecycle, and manifest configuration.
• Familiarity with design patterns like MVVM, MVP, or Clean Architecture.

    Architecture, data, and networking

• Comfortable using coroutines and Flow or other reactive patterns for async work.
• Consuming RESTful APIs and JSON, including authentication and error handling.
• Local data storage with Room/SQLite or similar, and awareness of caching strategies.
• Basic understanding of app performance, memory, and responsiveness on mobile devices.

    Testing, tooling, and DevOps habits

• Unit and UI testing using tools like JUnit and Espresso; some roles mention test automation and TDD.
• Git proficiency (branches, pull requests, code review) and experience with CI/CD is commonly requested.
• Ability to debug, troubleshoot crashes, and stay on top of security updates and vulnerabilities.

    Cloud, cross‑platform, and AI‑adjacent expectations

• Experience with cloud services such as Firebase or AWS (auth, analytics, serverless functions, etc.).
• Many “mobile developer” postings want Android plus iOS or React Native, so awareness of Swift/Objective‑C or cross‑platform frameworks is a plus.
• Increasingly, job ads mention AI‑enhanced workflows or modern tooling, and some junior roles (e.g., at Intuit) explicitly reference AI‑assisted coding and UX‑focused Android development.

    Professional and soft skills

• Ability to understand a mobile app end‑to‑end: from UI, through business logic, to backend integration.
• Collaboration with designers, product owners, and other developers in agile teams, often using Jira/Confluence.
• Clear written and verbal communication; a portfolio of apps or Play Store contributions is frequently listed as “strongly preferred.”

    If you can:

    • build a Kotlin/Compose app,
    • talk to a cloud backend (e.g., Firebase),
    • integrate REST APIs,
    • write basic tests, and
• work in Git with a team,

then you already cover the core requirements in live Toronto postings. See for yourself:

https://ca.indeed.com/q-android-kotlin-l-toronto,-on-jobs.html?vjk=ef5b5150148db027

  • From Stone Tablets to Grok Connectors: 5 Best Practices to 1000x Your Productivity


    In the beginning, there was the stone tablet—and it was great.

    It allowed persistent memory across generations.

    Somebody who figured out a cool way to capture a mastodon could share it with the young learners of the tribe.

But chisels are difficult to wield and stone tablets are expensive, so after a while a smart guy named Gutenberg said, “I have an idea. Let’s make a movable-type press, ink, and paper.”

Suddenly we could mass-produce tracts, spark the Renaissance in Europe, and give everyone access to knowledge.

    If your thinking is still stuck in the stone-tablet-and-chisel era, you’re missing massive leverage. Today we’re re-gearing our minds to take full advantage of Grok’s live connectors to Gmail, Google Calendar, Notion, and more.

    Best Practice: Stop reading every email. Ask Grok to search, summarize, and extract action items instead.

    Example prompts that work incredibly well:

    • “Search my Gmail for everything about the Q3 budget and give me a decision-ready summary with open questions.”
    • “Show me all unread emails from clients in the last 48 hours, ranked by urgency.”

    Why it 1000x’s you: You go from reactive inbox slave to strategic commander.

    Best Practice: Treat your calendar as a living database, not a static list.

    Powerful examples:

    • “Find 90 minutes of deep work blocks I can protect this week.”
    • “Scan my calendar and Gmail for anything related to ‘Toronto real estate’ and surface conflicts.”

    Best Practice: Stop copying and pasting. Create a unified knowledge layer.

    Try this:

    • “Pull the last three emails about the CTO from Gmail, the relevant Notion pages, and my calendar notes—then write a one-page briefing.”

    Best Practice: Never let important relationships fall through the cracks.

    • “Find every email I sent to prospects in the last 30 days that hasn’t received a reply. Draft friendly follow-ups.”

    Best Practice: Build simple recurring commands that replace entire productivity systems.

    • Morning: “Good morning briefing — unread high-priority emails, today’s calendar, top 3 Notion tasks.”
    • End-of-day: “Close the loop — what got done, what needs follow-up tomorrow.”

    The New Renaissance Starts Now

    We’re living through another Gutenberg moment—except this time the “printing press” thinks, searches, summarizes, and acts across all your tools in real time.

    Start small. Pick one of the five practices above and try it today. Then come back and tell the community what changed for you.

    What’s the first prompt you’re going to try? Drop it in the comments!

    — Peter & Grok
    🚀


    Productivity Workbook: 10 Grok Connector Prompts to Amplify Your Information Universe

    Print this, copy it into Notion, or keep it in this chat. Work through one prompt per day for the next 10 days. After each exercise, note what surprised you, what you saved, and how much time you reclaimed.

    These prompts assume your Gmail, Google Calendar, and Notion are connected to Grok. Just paste them directly into our chat and watch the magic happen.

    Day 1: Inbox Zero with Intelligence

    Prompt: “Search my Gmail for all unread emails from the last 7 days. Group them by priority (high/medium/low), extract key action items and deadlines, and give me a 5-bullet decision dashboard.”

    Goal: Move from overwhelm to clarity in under 60 seconds.

    Follow-up you can add: “Draft replies to the top 3 high-priority ones.”

    Day 2: Calendar Audit & Protection

    Prompt: “Analyze my Google Calendar for the next 14 days. Identify conflicts, back-to-back meetings, and unprotected deep work time. Suggest an optimized schedule with at least 2 hours of focused blocks per day and move or cancel low-value items.”

    Goal: Reclaim your time and energy. Follow-up: “Block the suggested deep work slots on my calendar.”

    Day 3: Cross-Tool Knowledge Synthesis

    Prompt: “Pull the last 5 emails about [specific topic, e.g., ‘CTO’ or ‘Q3 budget’] from Gmail, relevant pages from my Notion workspace, and any related calendar events. Synthesize everything into a one-page briefing with risks, opportunities, and recommended next actions.”

    Goal: Stop hunting across apps. Pro tip: Replace the topic with whatever you’re working on.

    Day 4: Automatic Follow-Up Engine

    Prompt: “Find every email I sent in the last 30 days that has no reply. List the top 5 most important ones with context from previous threads, then draft personalized, friendly follow-up messages for each.”

    Goal: Never lose momentum on relationships or deals.

    Day 5: Morning Chief-of-Staff Briefing

    Prompt: “Good morning briefing: Show unread high-priority emails, today’s calendar events with prep notes, top 3 open Notion tasks, and one powerful focus question for the day.”

    Goal: Start every day like you have a world-class assistant. Variation: Change to “End-of-day close the loop” at night.

    Day 6: Meeting Superpowers

    Prompt: “After my [meeting name/time] today, summarize the key decisions, action items with owners and deadlines, and create corresponding Notion tasks plus calendar reminders for follow-ups in 7 and 30 days.”

    Goal: Never lose what was said in a meeting again.

    Day 7: Weekly Review on Steroids

    Prompt: “Run my full weekly review: Major accomplishments this week from email + calendar + Notion, open loops or risks, calendar conflicts for next week, and three key lessons or insights.”

    Goal: Turn reflection into rocket fuel.

    Day 8: Idea Capture & Development

    Prompt: “Anytime I have a new idea, create a new Notion page in my ‘Ideas’ database. Pull any related emails or calendar context, expand the idea with pros/cons, and suggest first 3 action steps.”

    Goal: Capture and grow ideas instead of losing them.

    Day 9: Relationship Nurturing

    Prompt: “Who in my key network (personal or professional) have I not connected with in over 45 days? Pull context from past emails or meetings and suggest short, warm outreach messages or coffee catch-up invites.”

    Goal: Strengthen your network without extra effort.

    Day 10: Custom System Builder

    Prompt: “Help me design a custom productivity system. I want [describe your needs, e.g., ‘daily email triage + weekly Notion review + automatic client follow-ups’]. Suggest 3–5 recurring Grok prompts I can use and show me how to combine Gmail, Calendar, and Notion.”

    Goal: Build your own personalized AI operating system.


    1. Do it live — Paste the prompts exactly as written, then refine them in follow-up messages.
    2. Track results — After each prompt, write down: Time saved / Insight gained / Output created.
    3. Scale it — Once comfortable, combine prompts (e.g., morning briefing + follow-up engine).
    4. Share wins — Reply here or on X with your favorite prompt and results. We’ll feature the best ones.

    You now have everything you need to move from stone-tablet thinking to commanding an intelligent, connected productivity universe.

    Which prompt are you starting with today?

    Paste it here and I’ll run it with you right now, or refine it for your exact needs.

  • The AI Agent Accessibility Imperative: Don’t Be the Sears of the Agentic Web


    The web is bifurcating.

    The time to build for the new channel is before your competitors realize the channel exists.

    Before We Talk About AI Agents, Let’s Talk About a Catalogue

    If you grew up in Canada before the millennium, you probably remember the Sears catalogue. Not as a historical artifact — as furniture. It sat on the kitchen table, the coffee table, the shelf beside the phone. It was how Canadians shopped for everything from refrigerators to hockey equipment to school clothes. For generations, Sears wasn’t just a retailer. It was infrastructure.

    At its peak, Sears Canada operated over 100 full-line department stores and more than 1,700 catalogue pick-up locations across the country. It employed approximately 17,000 people. It was, by any reasonable measure, one of the most trusted and deeply embedded commercial institutions in Canadian life. The Sears name carried the weight of reliability, range, and reach that no competitor could match.

    It closed permanently in 2017.

    The bankruptcy filing cited the usual suspects — pension shortfalls, declining foot traffic, aggressive competition — but the forensic cause of death was something more specific and more instructive:

    Two failures compounded each other with fatal efficiency.

    First, Sears entered e-commerce late — not slightly late, but strategically late, in the way that signals an organization that treated the new channel as an experiment rather than an existential imperative. Amazon, Best Buy, and a generation of born-digital retailers had already built the logistics networks, the customer trust, and the user experience standards that would define what “shopping online” meant. Sears arrived at that table after the food was gone.

    Second, and more quietly devastating, Sears underinvested in mobile. As the smartphone became the primary device through which Canadians browsed, compared, and purchased, Sears’s digital presence remained optimized for a desktop experience that fewer and fewer people were using.

    They were building for the audience of five years ago while their competitors were building for the audience of five years ahead.

    The lesson is not that Sears was incompetent. They were not.

    They were large, experienced, and resource-rich.

    The lesson is that competence accumulated under one set of channel assumptions does not automatically transfer when the channel itself transforms. Sears knew retail. They never fully learned the new retail.

This history matters now because the channel is transforming again — and most web developers are making exactly the same category of error Sears made.

    Not the technical error.

    The attitudinal error.

    The assumption that the primary consumer of your web content is a human being sitting at a browser, making deliberate navigational choices.

    That assumption is becoming less true every quarter. And the rate at which it is becoming less true is accelerating.

    This series is about not making that mistake.

    Sears had two fatal blind spots: they assumed the channel for commerce was still stores, and they assumed the device for web interaction was still a desktop.

    By the time they corrected both assumptions, the market had already structurally reorganized around their competitors.

    Web developers today face an identical structural risk — and the analogues map precisely:

Sears Failure | 2026 Equivalent
Ignored e-commerce as a channel | Ignoring AI agents as primary web consumers
Built for desktop, not mobile | Building for humans, not AI agent parsing
“We’ll get to it later” | “SEO and structured data can wait”
Too many entrenched stakeholders to pivot fast | Legacy JS-heavy SPA architecture that breaks agent crawlers

    The signal is already in the data.

Perplexity, ChatGPT, Google AI Overviews, Claude, Copilot, and dozens of enterprise AI agents are already replacing direct human browsing for information retrieval.

    When a user asks an AI agent “compare the top three project management tools,” no human opens five tabs.

    The agent does — or more likely, it never opens tabs at all. It synthesizes from indexed, structured, accessible content.

    The sites that get cited are the ones that were built to be parseable.


    What AI Agents Actually Need From Your Web Content

    This is the technical literacy gap. Most developers instinctively think “accessibility” means screen readers and WCAG compliance. AI agent accessibility is a different surface of concerns entirely, though it overlaps in important ways.

    1. llms.txt — The New robots.txt You Aren’t Implementing Yet

llms.txt is an emerging standard (championed by Answer.AI’s Jeremy Howard, among others) that provides a structured, Markdown-formatted summary of what your site contains and how an LLM should navigate it.

    Think of it as a machine-readable table of contents and intent declaration for your site.

    # YourSite
    > A platform for full-stack developer education
    
    ## Core Content
    - [Course Catalog](/courses): All available courses with descriptions
    - [Documentation](/docs): Technical reference for all tools
    - [Blog](/blog): Weekly articles on web development
    
    ## Key Concepts
    This site covers React, Node.js, PostgreSQL, and DevOps for working developers.
    

    Place it at yourdomain.com/llms.txt. It’s to AI agents what sitemap.xml was to Google crawlers in 2005. Early movers will benefit disproportionately.

    2. JSON-LD Structured Data / Schema.org — The Semantic Layer You’re Probably Underusing

Search engines have relied on it for years to power rich snippets. AI agents use it to understand entity relationships, not just index keywords. Every page on your site should have appropriate Schema markup:

    • Article / BlogPosting for editorial content
    • Course for educational content
    • FAQPage for knowledge bases (this one is especially powerful for RAG systems)
    • HowTo for procedural content
    • Product / Service for commercial offerings
    • Organization and Person for entity disambiguation

    The developer who implements FAQPage schema today is creating structured training signal that AI agents will preferentially surface when answering user questions in their domain.
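
As a sketch, a minimal FAQPage block embedded in a page (question and answer text are placeholders):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is AI agent accessibility?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Structuring web content so AI agents can parse, retrieve, and cite it accurately."
    }
  }]
}
</script>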

    3. Server-Side Rendering (SSR) / Static Site Generation (SSG) — Not Just a Performance Win

    Here’s the dirty secret of the React/Next.js/Vue ecosystem: most AI agents and crawlers cannot execute JavaScript.

    A client-side rendered SPA returns essentially an empty <div id="root"> to a crawler. Your content is invisible.

    The shift to Next.js App Router, Astro, Nuxt, and SvelteKit isn’t just about Core Web Vitals.

    It’s about ensuring your content exists in the initial HTML payload that any agent, crawler, or parser receives.

    Action today: Audit your site with JavaScript disabled.

What you see with JavaScript disabled is roughly what an AI agent sees.
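
A command-line proxy for the same audit (URL and phrase are placeholders): fetch the raw HTML, which is what a non-JavaScript agent receives, and check that your content is actually in it:

curl -s https://example.com/your-article | grep -c "a phrase from your article"

A count of 0 means the phrase only appears after client-side rendering, which most agents never execute.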

    4. Semantic HTML — The Foundation That Still Gets Ignored

    AI agents parse document structure. A page where everything is <div> and <span> is informationally flat. A page with proper <article>, <section>, <h1> through <h3> hierarchy, <nav>, <main>, <aside>, and <figure> with <figcaption> gives an agent a navigable knowledge structure.

    This is what your full-stack tech stack diagram should be teaching under “Frontend Basics” — not as an accessibility checkbox, but as agent-legibility infrastructure.
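
A minimal contrast, with placeholder content:

<!-- Informationally flat: nothing for an agent to navigate -->
<div class="post">
  <div class="title">Understanding Qubits</div>
  <div class="body">...</div>
</div>

<!-- Agent-legible: hierarchy, landmarks, captioned figures -->
<article>
  <h1>Understanding Qubits</h1>
  <section>
    <h2>What a qubit is</h2>
    <p>...</p>
  </section>
  <figure>
    <img src="bloch-sphere.png" alt="Bloch sphere diagram">
    <figcaption>The Bloch sphere represents a single qubit's state.</figcaption>
  </figure>
</article>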

    5. The Model Context Protocol (MCP) — The API Layer for Agentic Integration

    Anthropic’s MCP is rapidly becoming the standard by which AI agents interact with external services and data sources. If your web application exposes functionality — booking, querying, transacting, retrieving — building an MCP server for it means AI agents can use your service, not just read about it.

    This is the mobile-first moment for agentic integration. The platforms that built MCP endpoints in 2025 will be the ones that appear in “use this tool” recommendations by AI assistants in 2026-2027. Shopify, Stripe, Linear, and others are already there.

    6. Content Architecture for Retrieval-Augmented Generation (RAG)

    AI agents that power enterprise tools don’t just crawl the open web — they ingest, chunk, and embed your content into vector databases for retrieval.

    Content that is modular, clearly scoped, and self-contained at the section level embeds well and retrieves accurately.

    Practically this means:

    • Write headings that are complete declarative statements, not clever one-word labels
    • Each section should answer one question fully without requiring adjacent context
    • Avoid pronouns that reference content from a previous section (“As we discussed above…”)
    • Use definition-first writing: state the concept, then elaborate
    • Explicit summaries and conclusions at section and page level

    This is writing for chunking. An AI agent slicing your article into 512-token windows will either surface coherent, useful segments — or it will surface confusing fragments. The architecture of your prose determines which.
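
As a before/after sketch (the topic and wording are invented for illustration):

Instead of:

## Performance

As we discussed above, this matters for agents too.

Write:

## Server-side rendering makes content visible to non-JavaScript crawlers

Server-side rendering (SSR) means the server returns complete HTML for each request. Because the content exists in the initial payload, crawlers that cannot execute JavaScript can still read, index, and cite the page.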

    7. Metadata Completeness — OpenGraph, Twitter Cards, and Beyond

    When an AI agent synthesizes a response and needs to attribute or recommend a source, it reads metadata to understand what the page is, who wrote it, when it was published, and whether it’s authoritative. Incomplete metadata = lower confidence = lower citation frequency.

Every page needs the following (a minimal example follows the list):

    • og:title, og:description, og:image, og:type
    • article:author, article:published_time, article:modified_time
    • Canonical URLs (duplicate content confuses agent indexing)
    • <meta name="description"> that accurately summarizes the page
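
A minimal head block covering these fields (all values are placeholders):

<head>
  <title>What Is Agent Accessibility?</title>
  <meta name="description" content="A practical definition of AI agent accessibility for web developers.">
  <link rel="canonical" href="https://example.com/agent-accessibility">
  <meta property="og:title" content="What Is Agent Accessibility?">
  <meta property="og:description" content="A practical definition of AI agent accessibility.">
  <meta property="og:image" content="https://example.com/cover.png">
  <meta property="og:type" content="article">
  <meta property="article:author" content="https://example.com/about">
  <meta property="article:published_time" content="2026-01-15T09:00:00Z">
</head>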

    8. Explicit robots.txt Governance for AI Crawlers

    The AI crawler landscape is fragmented. GPTBot, ClaudeBot, PerplexityBot, Bytespider, and dozens of others follow robots.txt conventions — to varying degrees. You need a deliberate policy:

    • Decide which AI crawlers you want to allow and which to block
    • Be aware that blocking all AI crawlers means invisibility in AI-powered search
    • Selectively expose high-value content and protect proprietary/paywalled material

    Not having a policy is a policy — one made by default and likely not in your interest.
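
A deliberate starting point might look like this (the user agents are real crawler names from above; the allow/block split is illustrative, not a recommendation):

# Answer engines that cite sources: allow
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Bulk scraper with no citation value to you: block
User-agent: Bytespider
Disallow: /

# Keep proprietary or paywalled material out of every crawler's corpus
User-agent: *
Disallow: /premium/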


    Sears-mode thinking has a characteristic internal monologue:

    “Our core users are still human.

    AI agents are a niche.

    We’ll address it in a future sprint. It’s not urgent yet. Let’s not over-engineer.”

    This is precisely the logic that Sears used about mobile commerce in 2010. It wasn’t urgent. Until it was catastrophically late.

    The structural difference this time is the speed of channel adoption.

    Mobile adoption took roughly a decade to become dominant.

    AI agent-mediated information retrieval is moving in an 18-to-36-month window.

    The S-curve is steeper.

    The developers and teams building for agent-accessibility now are not over-engineering — they are future-proofing their distribution channel.


    The Pragmatic Starting Checklist for Today

    For the developer who wants to start this week, not after a full architectural review:

    1. Add llms.txt to your domain root — 30 minutes, zero dependencies
2. Audit with JS disabled — Chrome DevTools → Settings → Disable JavaScript. Screenshot what you see.
    3. Add FAQPage JSON-LD to your highest-traffic content pages — immediate RAG pickup
    4. Verify SSR — if you’re on Create React App with no SSR, plan your Next.js or Astro migration
    5. Review heading hierarchy — use a browser extension like HeadingsMap to visualize your <h> structure
    6. Complete your OpenGraph metadata — use opengraph.xyz to preview what agents see
    7. Set a deliberate robots.txt AI crawler policy — even if that policy is “allow all for now”
    8. Write one page explicitly for chunking — restructure your best-performing article using the RAG writing principles above, then monitor its citation frequency in AI tools

    The web is bifurcating into:

    (1) content that AI agents cite and surface to their users — and

    (2) content that exists on the web but never appears in the answers those users actually see.


  • Google Willow: This Week’s Quantum Computing Breakthrough

    AI With Peter: Business AI Literacy

    Here’s what you need to know about Google’s Willow quantum processor — without the hype, without the science fiction, and without pretending this is going to replace your data center next quarter.

    What Google Actually Built

    Google Quantum AI has built a 105-qubit superconducting quantum processor called Willow. The breakthrough is not that it’s big. The breakthrough is that it works better as it gets bigger.

    That sentence might not sound revolutionary, but it solves one of the fundamental problems that has kept quantum computing in the lab for decades.

    The Real Achievement: Error Correction That Scales

    Classical computers are reliable. You can store a bit, read it back, copy it a million times, and it stays the same.

    That’s why your laptop doesn’t randomly corrupt your files.

    Every previous attempt to scale up quantum systems ran into the same wall: adding more qubits meant adding more noise. The system got worse, not better.

    Google’s published results show below-threshold quantum error correction.

    In plain English: as they increased the size of their error-correcting quantum memory systems, the logical error rate improved rather than deteriorating.

    If errors decrease as you scale up, you have a path to building quantum computers that can actually complete useful calculations before they fall apart.

    How Willow’s Error Correction Works (The Business Version)

    Think of a regular qubit like a single employee trying to remember a complex instruction while sitting in a noisy restaurant. They’re going to make mistakes.

    Error correction is like having a team of people who cross-check each other. But in previous quantum systems, adding more people to the team just meant more confusion — more chances for someone to mishear, more coordination overhead, more chaos.

    Willow’s breakthrough is that the cross-checking team actually reduces errors as the team grows. More qubits, properly configured, means less noise in the final answer.

    That’s counterintuitive. It’s also essential.

    What This Means for Business Today

    Short Answer: Nothing immediate. Willow is not a product you can buy. It’s a research milestone.

    Slightly Longer Answer: This is the foundation for everything that comes next.

    Google has opened a Willow Early Access Program for selected researchers.

    Scientific proposals are due May 15, 2026, with selection notifications planned for July 1, 2026.

    The hardware is being made available to serious researchers who want to run experiments on circuits, quantum simulations, and error-correction protocols.

    The Business Implications That Matter

    If you’re a business leader, manager, or investor trying to understand where quantum computing fits in your strategic horizon, here’s the framework.

    Timeline Reality Check

Timeframe | What’s Happening | What Business Can Do
2026-2027 | Research-grade quantum processors available to select institutions | Track developments; build quantum literacy in technical teams
2028-2030 | Early specialized applications in pharma, materials science, optimization | Identify high-value use cases; establish partnerships with quantum vendors
2031-2035 | Quantum advantage in specific domains; hybrid classical-quantum workflows | Pilot programs for applicable problems; infrastructure planning
Beyond 2035 | Potentially transformative quantum computing for chemistry, cryptography, AI | Strategic integration; competitive positioning

    Where Quantum Computing May Actually Help

    Quantum computers are not faster classical computers. They solve different kinds of problems using different physics.

    The business applications most likely to benefit are:

    1. Drug Discovery and Materials Science
    Simulating molecular interactions and chemical reactions is exponentially hard for classical computers. Quantum systems can model quantum chemistry natively.

    Implication: Pharmaceutical companies, materials manufacturers, and energy companies should track quantum simulation capabilities.

    2. Optimization Problems
    Portfolio optimization, logistics routing, supply chain configuration, network design — problems where you’re searching massive solution spaces for optimal configurations.

    Implication: Financial services, logistics companies, and manufacturing operations may see early quantum advantage here.

    3. Cryptography and Security
    Quantum computers will eventually break current encryption standards. That’s a threat and an opportunity.

    Implication: IT security teams need post-quantum cryptography roadmaps now. The NSA and NIST have already published quantum-resistant standards.

    4. Machine Learning and AI
    Quantum machine learning is speculative, but certain optimization and pattern-recognition tasks may benefit from quantum acceleration.

    Implication: AI-heavy companies should watch this space but not count on it for current roadmaps.

Let’s clear the air on what quantum computers — even breakthrough ones like Willow — cannot and will not do:

• Replace your cloud infrastructure: Classical computers will remain dominant for almost everything.
• Run your ERP system faster: Quantum computers are not general-purpose speed machines.
• Solve NP-complete problems instantly: Quantum advantage is real but bounded. It’s not magic.
• Work at room temperature in your data center: Willow operates at millikelvin temperatures in specialized quantum facilities.
• Deliver immediate ROI for typical business software: This is scientific and engineering infrastructure, not enterprise SaaS.

    If you’re evaluating quantum computing as an investment opportunity, ask these three questions:

    1. Is the company solving a real problem or selling quantum buzzwords?

    Real: “We are developing quantum algorithms for molecular simulation in drug discovery.”
    Buzzword fog: “Our quantum-powered AI will revolutionize all industries with quantum advantage.”

    2. What is the error correction strategy?

    Willow’s milestone matters because error correction is the hard problem. Any quantum computing company that doesn’t have a credible error-correction roadmap is not serious.

    3. What is the classical baseline?

    Quantum advantage only matters if the quantum approach actually beats the best classical algorithm.

    Many “quantum advantage” claims dissolve when compared to optimized classical computing.

    A company that can’t clearly articulate their classical baseline doesn’t understand their own value proposition.

    What You Should Do This Week

    Here’s the practical move for business leaders and technology managers.

    Step 1: Build Quantum Literacy

    You don’t need a physics PhD. You need to understand:

    • What quantum computers are good at (simulation, certain optimization problems, cryptography)
    • What they’re not good at (everything else)
    • Where your business intersects quantum-relevant problems

    Step 2: Audit Your Cryptography

    Even if you never use a quantum computer, quantum computers will affect you through post-quantum cryptography.

    Action item: Ask your security team if your encryption systems are quantum-resistant. If they don’t know, that’s your answer.

    Step 3: Identify Your Quantum-Relevant Problems

    Make a list of hard computational problems in your business:

    • Molecular simulations?
    • Complex optimization?
    • Cryptographic security?
    • High-dimensional pattern matching?

    If you have problems on that list, quantum computing might eventually matter to you. If you don’t, you can watch the field develop without panic.

    Step 4: Track, Don’t Chase

    Quantum computing is advancing. Willow proves that. But it’s advancing from research milestones toward engineering challenges toward eventual commercial applications.

    That journey takes years, sometimes decades.

    The winning strategy is not to throw money at quantum projects because they sound futuristic. The winning strategy is to understand the trajectory, identify where it intersects your domain, and position yourself to adopt when the technology actually delivers advantage.

    The Bottom Line

    Google’s Willow chip is a genuine breakthrough in quantum error correction. It demonstrates that quantum systems can become more reliable as they scale up — which is the opposite of what happened in every previous generation of quantum hardware.

    That’s important.

    It’s also not ready to run your business.

    What business leaders should understand is this:

    Quantum computing is real. It’s advancing. It will eventually matter for specific, valuable problems in chemistry, materials science, optimization, and cryptography. But it’s not replacing classical computing, and it’s not a magic solution to generic business challenges.

The practical skill is learning to distinguish between:

• Real capability and science fiction
• Useful application and buzzword marketing
• Strategic positioning and premature commitment

    And frankly, that’s a more interesting story than the hype would suggest.

    Because the real revolution isn’t that quantum computers will be magic.

    The real revolution is that we’re learning how to make them work.


    About AI With Peter: Practical technology analysis for business leaders who want to understand what’s real, what’s hype, and what to do about it. None of the noise. Just the signal.

  • Beyond the Chat Window: Why Grok 4.3’s API Changes the Cognition Economics for Business Users

    In the past 24 hours, xAI released Grok 4.3—a base LLM with December 2025 knowledge cutoff, 1M token context, and aggressive pricing ($1.25/M input, $2.50/M output).

    AI with Peter | May 1, 2026

    The headlines focus on benchmark scores: #1 on CaseLaw v2 (79.3%), 1500 ELO on GDPval-AA agentic tasks, 98% on τ²-Bench Telecom.

    But the real story isn’t what Grok 4.3 can do—it’s what happens when you stop treating AI as a conversation partner and start treating it as programmable infrastructure.

    The chat interface is a demonstration environment.

    The API is the production environment.

    And for business users—project leaders managing organizational memory, data analysts automating insight pipelines, educators scaling personalized learning—this distinction is the difference between experimenting with AI and embedding intelligence into operational workflows.

    The Chat Trap: Why Conversational AI Doesn’t Scale

    Here’s the problem with chat interfaces: they optimize for single-session convenience at the expense of cross-session composability. Every conversation starts from zero. There’s no state persistence, no workflow integration, no programmatic control over temperature, top-k sampling, or system prompts. You can’t A/B test prompt strategies. You can’t batch-process 500 customer support tickets overnight. You can’t version-control your inference logic or deploy it behind authenticated endpoints.

    Chat is fine for prototyping. But it’s the computational equivalent of writing production code in a REPL—useful for exploration, catastrophic for operations.

    The API, by contrast, lets you:

    1. Separate concerns: Decouple prompt engineering from delivery UI
    2. Compose workflows: Chain LLM calls with deterministic logic, external data sources, and validation layers
    3. Control cost: Run batch jobs at 50% lower rates, cache system prompts, dynamically adjust token limits
    4. Monitor quality: Log input/output pairs, track latency/cost per request, build feedback loops
    5. Scale horizontally: Process concurrent requests, integrate with existing CI/CD pipelines, deploy multi-tenant solutions

    This is cognition as infrastructure—not as dialogue, but as a computational primitive you can orchestrate alongside databases, message queues, and business logic.

    Grok 4.3’s pricing model makes this particularly compelling: $1.25/M input tokens is 20% cheaper than previous versions while delivering better performance on agentic benchmarks. For high-volume workflows—legal document review, financial report generation, curriculum personalization—this shifts the ROI calculation from “interesting experiment” to “operational necessity.”


    Three Winning Use Cases: API-First Cognition in Practice

    Use Case 1: Business/Project Leaders — Organizational Memory as Code

    The Problem: Project leaders manage fragmented institutional knowledge—meeting notes, decision logs, technical documentation, Slack threads. This knowledge degrades over time: context gets lost, decisions are re-litigated, new hires can’t find the “why” behind legacy systems.

    The API Solution: Build a queryable organizational memory that ingests artifacts (meeting transcripts, technical specs, product roadmaps), embeds them in vector space, and exposes a REST API for natural language retrieval.

    Grok 4.3’s 1M token context window means you can stuff entire project histories into a single prompt without chunking/retrieval fragility. The agentic performance (1500 ELO on GDPval-AA) means it can reason over multi-document sets to synthesize answers like:

    • “What were the trade-offs we considered when choosing React over Vue in Q2 2025?”
    • “Show me all decisions that assumed our Series B would close by October 2025.”
    • “Generate an onboarding doc for the payments team that explains our fraud detection pipeline.”

    This isn’t just search—it’s institutional reasoning. And because it’s API-driven, you can:

    • Integrate with Slack/Teams to answer questions inline
    • Trigger weekly summary emails of key decisions
    • Version-control prompt templates as your organizational knowledge evolves
    • Enforce access controls (Finance team gets finance docs, Engineering gets technical specs)

    Cost Analysis: Assume 500 queries/month, averaging 50K input tokens (context) + 1K output tokens (response):

    • Input: 500 × 50K × $1.25 = $31.25
    • Output: 500 × 1K × $2.50 = $1.25
    • Total: $32.50/month for a system that replaces 10+ hours of “digging through Notion/Confluence” labor per employee.

    Use Case 2: Data Analysts — Automated Insight Pipelines

    The Problem: Analysts spend 60% of their time on data janitorial work—cleaning CSVs, normalizing column names, writing SQL to join disparate sources—and only 40% on insight generation. The cognition-intensive parts (pattern detection, anomaly explanation, stakeholder reporting) are bottlenecked by manual preprocessing.

    The API Solution: Build a self-serve analytics pipeline where non-technical stakeholders upload raw data, describe what they need, and receive publication-ready reports.

    Grok 4.3’s multimodal capabilities + domain specialization (98% on τ²-Bench Telecom, #1 on CorpFin) mean it can:

    1. Ingest messy data: Parse Excel files with merged cells, footnotes, and color-coded categories
    2. Generate analysis code: Write Python/SQL to clean, transform, and visualize data
    3. Explain findings: Produce executive summaries that connect statistical patterns to business decisions

    Example workflow:

    User uploads: Q1_sales_messy.xlsx
    User prompt: "Which regions underperformed, and why?"
    
    API pipeline:
    1. Grok reads Excel → identifies structure issues (e.g., "Region" column has typos: "North East" vs "Northeast")
    2. Generates pandas code to normalize, aggregate by region, compute variance
    3. Runs code, produces chart + insights:
       - "Northeast underperformed by 18% vs forecast due to delayed product launch in Feb"
       - "Southwest overperformed by 12%, driven by Q1 marketing campaign"
    4. Returns Markdown report + chart PNG
    

    Why API > Chat: Analysts can batch-process 50 datasets overnight, log all transformations for auditability, and integrate this into existing BI dashboards (e.g., Tableau’s Python API). The chat interface forces manual uploads, non-reproducible interactions, and zero version control.

    Cost Analysis: 100 datasets/month, 20K input tokens (data preview + prompt) + 5K output tokens (code + report):

    • Input: 100 × 20K × $1.25 = $2.50
    • Output: 100 × 5K × $2.50 = $1.25
    • Total: $3.75/month to eliminate 20-30 hours of manual data prep.

    Use Case 3: Content Creators (Teachers/Educators) — Personalized Learning at Scale

    The Problem: Educators want to personalize instruction (adaptive problem sets, differentiated reading levels, targeted feedback), but doing this manually is cognitively expensive. Creating 5 difficulty tiers for a single algebra problem set takes hours. Grading 30 essays with individualized feedback is a weekend job.

    The API Solution: Build a learning orchestration platform where the API generates:

    1. Adaptive assessments: Student answers Question 1 incorrectly → API generates a scaffolded follow-up at lower difficulty
    2. Multi-tier content: One lesson plan → API produces versions for grade levels 6-12
    3. Personalized feedback: Batch-grade essays, flagging conceptual gaps and suggesting resources

    Grok 4.3’s domain specialization + fast inference (197 tokens/second) makes this feasible at classroom scale. Example: A teacher uploads a biology unit on cellular respiration. The API:

    • Generates 3 versions: Honors (college-level terminology), Standard (age-appropriate), Remedial (ELL-friendly)
    • Creates 15 formative assessment questions per tier
    • Provides answer keys + worked explanations
    • Flags common misconceptions based on simulated student errors

    Cost Analysis: Generate 50 lessons/semester, 30K input tokens (source material + instructions) + 15K output tokens (3 tiers × 5K tokens each):

    • Input: 50 × 30K × $1.25 = $1.88
    • Output: 50 × 15K × $2.50 = $1.88
    • Total: $3.76/semester to create content that would take 40+ hours manually.

    Code Lab: Building a Production-Ready Grok 4.3 Integration

    This lab walks through building an organizational knowledge API (Use Case 1) with Node.js, covering:

    1. API authentication and basic completion
    2. Document chunking for 1M token context
    3. Streaming responses for UX
    4. Error handling and rate limiting
    5. Prompt versioning and A/B testing

    Prerequisites

    • Node.js 18+
    • xAI API key (get from console.x.ai)
    • Basic familiarity with Express.js

    Step 1: Project Setup

    mkdir grok-knowledge-api
    cd grok-knowledge-api
    npm init -y
    npm install express dotenv axios
    

    Create .env:

    XAI_API_KEY=your_api_key_here
    PORT=3000
    

    Step 2: Basic API Client

    Create lib/grokClient.js:

    const axios = require('axios');
    
    class GrokClient {
      constructor(apiKey) {
        this.apiKey = apiKey;
        this.baseURL = 'https://api.x.ai/v1';
        this.model = 'grok-4.3-2025-12';
      }
    
      async complete(messages, options = {}) {
        const {
          temperature = 0.7,
          max_tokens = 4000,
          stream = false,
        } = options;
    
        try {
          const response = await axios.post(
            `${this.baseURL}/chat/completions`,
            {
              model: this.model,
              messages,
              temperature,
              max_tokens,
              stream,
            },
            {
              headers: {
                'Authorization': `Bearer ${this.apiKey}`,
                'Content-Type': 'application/json',
              },
              responseType: stream ? 'stream' : 'json',
            }
          );
    
          return stream ? response.data : response.data.choices[0].message.content;
        } catch (error) {
          console.error('Grok API Error:', error.response?.data || error.message);
          throw new Error(`API request failed: ${error.response?.status || 'Unknown'}`);
        }
      }
    
      // Calculate cost for a given request
      calculateCost(inputTokens, outputTokens) {
        const inputCost = (inputTokens / 1_000_000) * 1.25;
        const outputCost = (outputTokens / 1_000_000) * 2.50;
        return {
          inputCost: inputCost.toFixed(4),
          outputCost: outputCost.toFixed(4),
          totalCost: (inputCost + outputCost).toFixed(4),
        };
      }
    }
    
    module.exports = GrokClient;
    
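Before wiring up the server, a quick smoke test (hypothetical smoke-test.js at the project root; assumes XAI_API_KEY is set in .env) verifies auth, connectivity, and the cost math:

require('dotenv').config();
const GrokClient = require('./lib/grokClient');

(async () => {
  const grok = new GrokClient(process.env.XAI_API_KEY);

  // Minimal round-trip: one short completion
  const answer = await grok.complete(
    [{ role: 'user', content: 'Reply with the single word: ok' }],
    { max_tokens: 10 }
  );
  console.log('Response:', answer);

  // Cost check against the use-case estimates:
  // 50K input + 1K output ≈ $0.0650 per query at $1.25/$2.50 per M tokens
  console.log(grok.calculateCost(50_000, 1_000));
})();
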

    Step 3: Document Chunking Strategy

    Grok 4.3 supports 1M tokens, but you still want smart chunking for:

    • Cost control: Only send relevant context
    • Latency optimization: Smaller prompts = faster responses
    • Logical boundaries: Preserve semantic coherence

    Create lib/documentProcessor.js:

    class DocumentProcessor {
      constructor() {
        // Rough heuristic: 1 token ≈ 4 characters for English
        this.charsPerToken = 4;
      }
    
      // Estimate token count (rough approximation)
      estimateTokens(text) {
        return Math.ceil(text.length / this.charsPerToken);
      }
    
      // Chunk document by semantic boundaries (paragraphs/sections)
      chunkByParagraphs(text, maxTokensPerChunk = 50000) {
        const paragraphs = text.split(/\n\s*\n/);
        const chunks = [];
        let currentChunk = [];
        let currentTokens = 0;
    
        for (const para of paragraphs) {
          const paraTokens = this.estimateTokens(para);
          
          if (currentTokens + paraTokens > maxTokensPerChunk && currentChunk.length > 0) {
            chunks.push(currentChunk.join('\n\n'));
            currentChunk = [para];
            currentTokens = paraTokens;
          } else {
            currentChunk.push(para);
            currentTokens += paraTokens;
          }
        }
    
        if (currentChunk.length > 0) {
          chunks.push(currentChunk.join('\n\n'));
        }
    
        return chunks;
      }
    
      // Prepare context for a query (retrieve top-k relevant chunks)
      prepareContext(allDocuments, query, maxContextTokens = 100000) {
        // Simple keyword-based relevance (in production, use embeddings + vector search)
        const queryTerms = query.toLowerCase().split(/\s+/);
        
        const scoredChunks = allDocuments.map(doc => {
          const docLower = doc.content.toLowerCase();
          const score = queryTerms.reduce((acc, term) => {
            // Escape regex metacharacters so queries like "vue?" match literally
            const safeTerm = term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
            const matches = (docLower.match(new RegExp(safeTerm, 'g')) || []).length;
            return acc + matches;
          }, 0);
    
          return { ...doc, relevanceScore: score };
        });
    
        // Sort by relevance, take top chunks within token budget
        scoredChunks.sort((a, b) => b.relevanceScore - a.relevanceScore);
        
        const selectedDocs = [];
        let totalTokens = 0;
    
        for (const doc of scoredChunks) {
          // Skip documents with no keyword overlap so irrelevant content
          // doesn't consume the context budget
          if (doc.relevanceScore === 0) continue;

          const docTokens = this.estimateTokens(doc.content);
          if (totalTokens + docTokens <= maxContextTokens) {
            selectedDocs.push(doc);
            totalTokens += docTokens;
          } else {
            break;
          }
        }
    
        return {
          documents: selectedDocs,
          totalTokens,
          coverage: `${selectedDocs.length}/${allDocuments.length} documents`,
        };
      }
    }
    
    module.exports = DocumentProcessor;
    
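A quick sanity check of the processor in isolation (hypothetical script; sample strings invented; assumes the zero-relevance filter above):

const DocumentProcessor = require('./lib/documentProcessor');

const processor = new DocumentProcessor();

const text = 'First paragraph about payments.\n\nSecond paragraph about fraud rules.';

// Rough count from the 4-chars-per-token heuristic
console.log(processor.estimateTokens(text));

// A tiny budget forces a split at the paragraph boundary (2 chunks)
console.log(processor.chunkByParagraphs(text, 10));

// Only documents sharing query terms are selected for context
const docs = [
  { id: 'a', content: 'fraud detection pipeline rules' },
  { id: 'b', content: 'mobile roadmap redesign' },
];
console.log(processor.prepareContext(docs, 'fraud rules', 1000).coverage); // '1/2 documents'
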

    Step 4: Knowledge Query API

    Create server.js:

    require('dotenv').config();
    const express = require('express');
    const GrokClient = require('./lib/grokClient');
    const DocumentProcessor = require('./lib/documentProcessor');
    
    const app = express();
    app.use(express.json());
    
    const grok = new GrokClient(process.env.XAI_API_KEY);
    const processor = new DocumentProcessor();
    
    // Mock document store (in production: use vector DB like Pinecone, Weaviate)
    const knowledgeBase = [
      {
        id: 'doc_001',
        title: 'Q2 2025 Product Roadmap',
        content: `Our Q2 2025 roadmap focuses on three pillars:\n\n1. Mobile-first redesign: Complete React Native migration by June 15\n2. AI-powered search: Integrate semantic search using vector embeddings\n3. Enterprise SSO: Support Okta, Auth0, and Azure AD\n\nKey trade-offs:\n- Delayed Android tablet support to prioritize iPhone parity\n- Chose React Native over Flutter due to team expertise\n- SSO implementation blocks v2.0 launch by 3 weeks`,
      },
      {
        id: 'doc_002',
        title: 'Tech Stack Decision Log - Feb 2025',
        content: `React vs Vue Debate (Resolved 2025-02-12):\n\nDecision: React\n\nRationale:\n- 4/5 senior engineers have production React experience\n- Better ecosystem for mobile (React Native)\n- Hiring pool 2x larger (per LinkedIn data)\n\nDissent (from @alice): Vue has better DX for junior devs\nCounter: Training cost < hiring risk in current market\n\nDependencies: This assumes Series B closes Q3 2025 (approved headcount: 8 engineers)`,
      },
      {
        id: 'doc_003',
        title: 'Payments Architecture RFC',
        content: `Fraud Detection Pipeline:\n\nWe use a 3-tier approach:\n1. Rule-based filtering (blocks 80% of obvious fraud)\n2. ML model (XGBoost, retrained weekly on labeled data)\n3. Manual review queue (human analysts for edge cases)\n\nPerformance:\n- False positive rate: 2.3% (industry avg: 5%)\n- False negative rate: 0.8% (acceptable per CFO)\n\nKnown limitations:\n- Model doesn't handle cryptocurrency transactions well\n- Manual queue SLA is 4 hours (compliance requires <2 hours)`,
      },
    ];
    
    // Endpoint: Query knowledge base
    app.post('/api/query', async (req, res) => {
      const { question, max_tokens = 2000 } = req.body;
    
      if (!question) {
        return res.status(400).json({ error: 'Missing required field: question' });
      }
    
      try {
        // Step 1: Retrieve relevant documents
        const { documents, totalTokens, coverage } = processor.prepareContext(
          knowledgeBase,
          question,
          100000 // Use up to 100K tokens for context (well under 1M limit)
        );
    
        if (documents.length === 0) {
          return res.json({
            answer: "I couldn't find relevant information in the knowledge base for that question.",
            sources: [],
            cost: null,
          });
        }
    
        // Step 2: Build prompt
        const systemPrompt = `You are an organizational knowledge assistant. Your job is to answer questions based ONLY on the provided internal documents. If the documents don't contain enough information, say so explicitly.
    
    When answering:
    - Cite specific documents by title
    - Highlight trade-offs or caveats mentioned in the source material
    - Flag outdated assumptions (e.g., "This decision assumed Series B by Q3 2025")`;
    
        const contextBlock = documents.map(doc => 
          `[Document: ${doc.title}]\n${doc.content}`
        ).join('\n\n---\n\n');
    
        const userPrompt = `Context:\n${contextBlock}\n\nQuestion: ${question}`;
    
        const messages = [
          { role: 'system', content: systemPrompt },
          { role: 'user', content: userPrompt },
        ];
    
        // Step 3: Call Grok API
        const startTime = Date.now();
        const answer = await grok.complete(messages, { max_tokens });
        const latency = Date.now() - startTime;
    
        // Step 4: Estimate cost
        const inputTokens = processor.estimateTokens(systemPrompt + userPrompt);
        const outputTokens = processor.estimateTokens(answer);
        const cost = grok.calculateCost(inputTokens, outputTokens);
    
        res.json({
          answer,
          sources: documents.map(d => ({ id: d.id, title: d.title })),
          metadata: {
            latency_ms: latency,
            context_coverage: coverage,
            input_tokens: inputTokens,
            output_tokens: outputTokens,
            cost_usd: cost.totalCost,
          },
        });
    
      } catch (error) {
        console.error('Query failed:', error);
        res.status(500).json({ error: error.message });
      }
    });
    
    // Endpoint: Health check
    app.get('/health', (req, res) => {
      res.json({ status: 'ok', model: 'grok-4.3-2025-12' });
    });
    
    const PORT = process.env.PORT || 3000;
    app.listen(PORT, () => {
      console.log(`Knowledge API running on port ${PORT}`);
      console.log(`Test with: curl -X POST http://localhost:${PORT}/api/query -H "Content-Type: application/json" -d '{"question":"Why did we choose React over Vue?"}'`);
    });
    

    Step 5: Test the API

    Start the server:

    node server.js
    

    Test query:

    curl -X POST http://localhost:3000/api/query \
      -H "Content-Type: application/json" \
      -d '{
        "question": "What were the trade-offs when we chose React over Vue?"
      }'
    

    Expected response:

    {
      "answer": "According to the Tech Stack Decision Log from February 2025, the team chose React over Vue with the following rationale:\n\n**Primary factors:**\n- Team expertise: 4 out of 5 senior engineers had production React experience\n- Mobile strategy: Better ecosystem support via React Native\n- Hiring advantage: LinkedIn data showed the React talent pool was 2x larger\n\n**Trade-off acknowledged:**\nAlice dissented, noting that Vue offers better developer experience for junior engineers. However, the team concluded that training costs were lower than hiring risks in the current market.\n\n**Key assumption:**\nThis decision depended on the Series B closing in Q3 2025, which would approve headcount for 8 additional engineers. If funding changes, this calculus might need revisiting.",
      "sources": [
        { "id": "doc_002", "title": "Tech Stack Decision Log - Feb 2025" }
      ],
      "metadata": {
        "latency_ms": 1247,
        "context_coverage": "1/3 documents",
        "input_tokens": 487,
        "output_tokens": 183,
        "cost_usd": "0.0011"
      }
    }
    

    Step 6: Add Prompt Versioning (A/B Testing)

    Create lib/promptTemplates.js:

    const PROMPT_VERSIONS = {
      v1_standard: {
        system: `You are an organizational knowledge assistant. Answer questions based on provided documents.`,
        format: (context, question) => 
          `Context:\n${context}\n\nQuestion: ${question}`,
      },
      
      v2_detailed: {
        system: `You are an organizational knowledge assistant. Your job is to answer questions based ONLY on the provided internal documents. If the documents don't contain enough information, say so explicitly.
    
    When answering:
    - Cite specific documents by title
    - Highlight trade-offs or caveats mentioned in the source material
    - Flag outdated assumptions (e.g., "This decision assumed Series B by Q3 2025")`,
        format: (context, question) => 
          `Context:\n${context}\n\nQuestion: ${question}`,
      },
    
      v3_socratic: {
        system: `You are an organizational knowledge assistant trained to surface decision context. Don't just answer questions—explain the "why" behind decisions, identify assumptions, and flag risks.`,
        format: (context, question) =>
          `INTERNAL DOCUMENTS:\n${context}\n\n---\n\nQUESTION: ${question}\n\nProvide:\n1. Direct answer\n2. Key assumptions in source material\n3. Risks if assumptions changed`,
      },
    };
    
    function getPrompt(version, context, question) {
      const template = PROMPT_VERSIONS[version] || PROMPT_VERSIONS.v2_detailed;
      return {
        system: template.system,
        user: template.format(context, question),
      };
    }
    
    module.exports = { PROMPT_VERSIONS, getPrompt };
    

    Update server.js to support versioning:

    const { getPrompt } = require('./lib/promptTemplates');
    
    app.post('/api/query', async (req, res) => {
      const { 
        question, 
        max_tokens = 2000,
        prompt_version = 'v2_detailed', // Default to v2
      } = req.body;
    
      // ... (document retrieval stays the same)
    
      // Use versioned prompt
      const prompt = getPrompt(prompt_version, contextBlock, question);
      const messages = [
        { role: 'system', content: prompt.system },
        { role: 'user', content: prompt.user },
      ];
    
      // ... (rest of implementation)
    });
    

    Now test different prompts:

    # Test v3_socratic (surfaces assumptions)
    curl -X POST http://localhost:3000/api/query \
      -H "Content-Type: application/json" \
      -d '{
        "question": "Why did we choose React?",
        "prompt_version": "v3_socratic"
      }'
    
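To compare versions head-to-head, a small driver script (hypothetical compare-versions.js; uses Node 18's built-in fetch) replays the same question across every version and logs cost, latency, and the answer:

const questions = ['Why did we choose React?'];
const versions = ['v1_standard', 'v2_detailed', 'v3_socratic'];

(async () => {
  for (const question of questions) {
    for (const prompt_version of versions) {
      // Hit the local knowledge API with each prompt version
      const res = await fetch('http://localhost:3000/api/query', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ question, prompt_version }),
      });
      const { answer, metadata } = await res.json();
      console.log(`[${prompt_version}] $${metadata.cost_usd}, ${metadata.latency_ms}ms`);
      console.log(answer, '\n');
    }
  }
})();
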

    Step 7: Add Rate Limiting and Monitoring

    Install dependencies:

    npm install express-rate-limit winston
    

    Create lib/logger.js:

    const winston = require('winston');
    
    const logger = winston.createLogger({
      level: 'info',
      format: winston.format.json(),
      transports: [
        new winston.transports.File({ filename: 'error.log', level: 'error' }),
        new winston.transports.File({ filename: 'combined.log' }),
      ],
    });
    
    if (process.env.NODE_ENV !== 'production') {
      logger.add(new winston.transports.Console({
        format: winston.format.simple(),
      }));
    }
    
    module.exports = logger;
    

    Update server.js:

    const rateLimit = require('express-rate-limit');
    const logger = require('./lib/logger');
    
    // Rate limiting: 10 requests per minute per IP
    const limiter = rateLimit({
      windowMs: 60 * 1000,
      max: 10,
      message: { error: 'Too many requests, please try again later.' },
    });
    
    app.use('/api/', limiter);
    
    // Update query endpoint to log metrics
    app.post('/api/query', async (req, res) => {
      const requestId = `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`;
      
      logger.info('Query received', {
        requestId,
        question: req.body.question,
        prompt_version: req.body.prompt_version || 'v2_detailed',
      });
    
      try {
        // ... (existing logic)
    
        logger.info('Query completed', {
          requestId,
          latency_ms: latency,
          input_tokens: inputTokens,
          output_tokens: outputTokens,
          cost_usd: cost.totalCost,
        });
    
        res.json({ answer, sources, metadata });
    
      } catch (error) {
        logger.error('Query failed', { requestId, error: error.message });
        res.status(500).json({ error: error.message });
      }
    });
    

    Step 8: Deploy Considerations

    Environment Variables (add to .env):

    NODE_ENV=production
    XAI_API_KEY=your_api_key
    PORT=3000
    RATE_LIMIT_WINDOW_MS=60000
    RATE_LIMIT_MAX_REQUESTS=10
    

    Production Checklist:

    1. Authentication: Add JWT/API key middleware:

       const authenticate = (req, res, next) => {
         const apiKey = req.headers['x-api-key'];
         if (apiKey !== process.env.INTERNAL_API_KEY) {
           return res.status(401).json({ error: 'Unauthorized' });
         }
         next();
       };
       app.use('/api/', authenticate);

    2. Vector Database: Replace mock knowledgeBase with Pinecone/Weaviate for semantic search
    3. Streaming: For long responses, use Grok’s streaming mode:

       const stream = await grok.complete(messages, { stream: true });
       stream.on('data', chunk => {
         res.write(`data: ${chunk}\n\n`);
       });

    4. Caching: Use Redis to cache frequent queries
    5. Monitoring: Integrate Datadog/Sentry for error tracking

    Deployment Example (Docker)

    Create Dockerfile:

    FROM node:18-alpine
    
    WORKDIR /app
    
    COPY package*.json ./
    RUN npm ci --only=production
    
    COPY . .
    
    EXPOSE 3000
    
    CMD ["node", "server.js"]
    

    Build and run:

    docker build -t grok-knowledge-api .
    docker run -p 3000:3000 --env-file .env grok-knowledge-api
    

    Key Takeaways

    1. APIs unlock composability: You can’t version-control chat conversations or A/B test prompts in a GUI
    2. Cost scales sub-linearly: Batch processing + caching means marginal cost per query drops as volume increases
    3. Observability is table stakes: Log every request/response pair for debugging, compliance, and model drift detection
    4. Prompt engineering is software engineering: Treat prompts as code—version them, test them, deploy them through CI/CD

    Grok 4.3’s aggressive pricing ($1.25/$2.50 per M tokens) makes API-first architectures economically viable for mid-market teams. The chat interface is where you prototype. The API is where you productionize cognition.


    Next Steps

    • Week 1: Deploy this knowledge API internally, seed with 10-20 key docs
    • Week 2: Instrument with analytics—track query patterns, identify gaps in knowledge base
    • Week 3: Integrate with Slack (answer questions inline) or email (automated weekly summaries)
    • Month 2: Expand to multi-modal (voice queries via speech-to-text, image-based documentation)

    The organizations that win in the cognition economy won’t be the ones with the best models—they’ll be the ones who operationalize intelligence fastest. Stop chatting with AI. Start building with it.


  • LangChain 2026 Lab

    From Prompt Wrapper to Agent Engineering Platform

    This lab demonstrates the core concepts of LangChain as an agent engineering platform, including model-agnostic orchestration, custom tool creation, memory management, and the agent execution loop.

    The research_agent.py code is available here: 
    https://github.com/computationalknowledge/langchain/blob/main/research_agent.py
    
    The full lab with work steps is available here: 
    https://docs.google.com/document/d/1Gves-hIagKFEgtEMjs2tzwMpMSOTUiD-thSkem4EnZ0/edit?usp=sharing

What you’ll learn:

    • How to initialize models with the unified LangChain interface
    • How to integrate external tool libraries
    • How to implement conversational memory
    • How to run the agent loop and inspect tool calls
    • The difference between declarative tool definition and imperative programming

    Prerequisites

    • Python 3.10 or higher
    • OpenAI API key (or Anthropic API key if using Claude)
    • Basic familiarity with Python and async/await

    LangChain 2026: From Prompt Wrapper to Agent Engineering Platform

    A Technical Deep-Dive with Hands-On Lab

    If you’re building with LangChain in 2026, you’re engineering production-grade autonomous systems that reason, recover from failures, manage state across database checkpoints, and execute multi-step workflows with the reliability of traditional software.

    This is not a minor version upgrade. This is a categorical shift.

    The framework has evolved from a convenience layer for LLM calls into a comprehensive agent orchestration platform with first-class support for:

    • Model-agnostic orchestration across 100+ providers
    • First-class tools and toolkits with parallel execution and retry logic
    • Dual-layer memory systems (short-term conversational context + long-term compressed knowledge stores)
    • Agentic RAG where the LLM decides when and how to retrieve, not just fetch-then-prompt
    • LangGraph for durable execution with checkpoint/resume semantics
    • Sandboxed agent deployment with pluggable remote/local/virtual execution backends
    • Production checklists for compliance, resilience, and observability

    If you’re still thinking of LangChain as “a library for calling OpenAI,” you’re operating with a 2023 mental model in a 2026 landscape. Let’s fix that.


    1. The Agent Loop: Call, Select, Execute

    At its heart, every LangChain agent implements a three-stage loop:

    1. CALL MODEL → Model reasons about the current state
    2. SELECT TOOL → Model decides which action to take  
    3. EXECUTE ACTION → Tool runs, result feeds back to step 1

    This loop continues until the objective is complete. The model isn’t just generating text—it’s making decisions about which tools to invoke, when to stop, and how to handle failures.

2. Model-Agnostic Unification

    The langchain-core library provides a unified interface for:

    • OpenAI (GPT-4, GPT-3.5)
    • Anthropic (Claude Opus, Sonnet, Haiku)
    • Google (Gemini)
    • Open-source models (Llama, Mistral, etc.)

    Why this matters: Your agent logic doesn’t couple to a specific vendor. You define tools once, swap models with a configuration change. Same code, different inference backend.

    # Same interface, different providers
    from langchain_openai import ChatOpenAI
    from langchain_anthropic import ChatAnthropic
    
    model = ChatOpenAI(model="gpt-4")  # or
    model = ChatAnthropic(model="claude-opus-4")

    Dynamic routing and multi-provider failover become trivial when the abstraction is clean.

3. Tools: The Hands of the Agent

    Tools are Python functions exposed to the LLM via structured schemas. The @tool decorator automatically:

    • Extracts argument metadata from type hints and docstrings
    • Generates JSON Schema for the model
    • Handles parallel tool calls with exponential backoff retries
    • Persists call history for debugging

    External capabilities library: LangChain maintains integrations for Wikipedia, SQL databases, code execution sandboxes, math engines, web search—over 160 pre-built tools.

    Key architectural decision: Tools are declared, not invoked. You hand the model a toolkit. The model decides what to call and when. This is the difference between RPA (you script the workflow) and agentic AI (the model scripts the workflow).

4. Dual-Layer Memory

    Short-term memory: The MessagesState object maintains conversational continuity and active planning context. This is the chat history you see in the UI.

    Long-term memory: Postgres/embedding stores provide semantic search over compressed historical interactions. The agent can “remember” facts from 1,000 previous conversations without blowing up the context window.

    Production pattern: Use middleware (PIIMiddleware, SummarizationMiddleware) to cap LLM context costs while preserving coherent agent responses across long interactions.

5. Agentic RAG: Dynamic Retrieval Under Agent Control

    Traditional RAG (Retrieval-Augmented Generation):

    Query → Retrieve Docs (k=3) → Prompt LLM → Answer

    Agentic RAG:

    Query → LLM decides:
      - Should I retrieve? (or answer from memory?)
      - Which retriever? (vector DB, SQL, web search?)
      - How many docs? (k=1 for precision, k=10 for coverage?)
      - Evaluate retrieved docs → refine query → retrieve again?

    The LLM orchestrates retrieval as a tool. Higher latency, deeper reasoning, better results for complex queries.

6. LangGraph: Durable Execution with Checkpointing

    LangGraph wraps agent workflows in a state machine where every step is saved to a database. If the agent crashes at Node B:

    Execution: Node A → Node B → [CRASH]
    Recovery:  Skip to Node B (state restored from checkpoint) → Node C

    This is time travel for AI workflows. You can:

    • Resume from failures without re-running expensive LLM calls
    • Implement human-in-the-loop approval gates (execution pauses, waits for input, resumes)
    • Replay and debug agent decisions from production logs

    Production requirement: If you’re running code-executing agents or multi-hour research workflows, you MUST use LangGraph with PostgresSaver checkpoints. Non-negotiable.

7. Sandboxed Execution: Deep Agents

    Never run code-executing agents on bare metal. The default in 2026 is Deep Agents remote sandboxing:

    • Pluggable backends (local Docker, remote VMs, virtual filesystems)
    • Per-user thread isolation with role-based access control (RBAC)
    • Network egress policies and syscall filtering

    Your agent can execute arbitrary Python, but it runs in a locked-down container with no access to your production database or internal networks.


    The 2026 Production Checklist

    If you’re deploying LangChain agents to production, validate these four pillars:

    1. Control Context
      Implement PIIMiddleware for compliance. Use summarization middleware to cap LLM context costs. Monitor token usage per session.
    2. Optimize Execution
      Use .stream(stream_mode='values') for zero-latency UX. Batch non-interactive queries to reduce compute.
    3. Sandbox Actions
      Never run code-executing agents on bare metal. Default to Deep Agents remote sandboxing with role-based access control.
    4. Ensure Resilience
      Wrap all critical workflows in LangGraph with PostgresSaver checkpoints. State recovery should be automatic, not manual.

    Hands-On Lab: Build Your First Production-Grade Agent

    Objective: Build a research assistant agent that can perform calculations, search Wikipedia, and maintain conversation memory. You’ll see model-agnostic initialization, tool declaration, memory integration, and the agent execution loop in action.

    Time: 30-45 minutes
    Prerequisites: Python 3.10+, basic familiarity with async/await
    What You’ll Learn:

    • How to initialize models with the unified interface
    • How to create custom tools with type-safe schemas
    • How to integrate external tool libraries (Wikipedia)
    • How to implement conversational memory
    • How to run the agent loop and inspect tool calls

    Step 1: Environment Setup

    Create a new directory and virtual environment:

    mkdir langchain_agent_lab
    cd langchain_agent_lab
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

    Install dependencies:

pip install langchain langchain-core langchain-community langchain-openai wikipedia python-dotenv

    Note: We’re using langchain-openai, but you could swap to langchain-anthropic or langchain-google-genai with zero code changes to the agent logic. That’s the model-agnostic unification in action.


    Step 2: Configure API Keys

    Create a .env file in your project directory:

    # .env
    OPENAI_API_KEY=your_openai_key_here

    If you’re using Anthropic Claude instead:

    ANTHROPIC_API_KEY=your_anthropic_key_here

    Important: Never commit API keys to version control. Add .env to your .gitignore.


    Step 3: Build the Agent (Complete Code)

    Create research_agent.py:

    """
    LangChain 2026 Research Agent Lab
    Demonstrates: Model-agnostic init, custom tools, external tools, memory, agent loop
    """
    
    import os
    from typing import Annotated
    from dotenv import load_dotenv
    
    # Core LangChain imports
    from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
    from langchain_core.tools import tool
    from langchain_community.tools import WikipediaQueryRun
    from langchain_community.utilities import WikipediaAPIWrapper
    
    # Model provider (swap to langchain_anthropic for Claude)
    from langchain_openai import ChatOpenAI
    
    # Agent framework
    from langchain.agents import create_tool_calling_agent, AgentExecutor
    from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
    
    # Load environment variables
    load_dotenv()
    
    
    # =============================================================================
    # STEP 1: CUSTOM TOOLS
    # =============================================================================
    
    @tool
    def calculate(expression: str) -> str:
        """
        Evaluates a mathematical expression and returns the result.
    
        Use this for any arithmetic, algebra, or mathematical computations.
    
        Args:
            expression: A valid Python mathematical expression (e.g., "2**10", "sqrt(144)")
    
        Returns:
            The computed result as a string
    
        Examples:
            - "2 + 2" → "4"
            - "10 * 5" → "50"
            - "2**16" → "65536"
        """
        try:
            # Safe evaluation for math expressions
            import math
            # Add safe math functions to namespace
            safe_namespace = {
                "__builtins__": {},
                "abs": abs, "round": round, "pow": pow,
                "sqrt": math.sqrt, "sin": math.sin, "cos": math.cos,
                "log": math.log, "exp": math.exp, "pi": math.pi
            }
            result = eval(expression, safe_namespace)
            return str(result)
        except Exception as e:
            return f"Error evaluating expression: {str(e)}"
    
    
    @tool
    def create_research_note(topic: str, key_facts: str) -> str:
        """
        Saves a research note about a topic to the agent's long-term memory.
    
        Use this to persist important information discovered during research.
    
        Args:
            topic: The subject of the research note
            key_facts: The important facts to remember
    
        Returns:
            Confirmation message
        """
        # In production, this would write to a vector database
        # For the lab, we'll just acknowledge the save
        return f"✓ Research note saved for '{topic}'. Key facts stored in long-term memory."
    
    
    # =============================================================================
    # STEP 2: EXTERNAL TOOL INTEGRATION (Wikipedia)
    # =============================================================================
    
    # Initialize Wikipedia tool
    wikipedia_api = WikipediaAPIWrapper(
        top_k_results=2,  # Return top 2 search results
        doc_content_chars_max=500  # Limit content length
    )
    wikipedia_tool = WikipediaQueryRun(api_wrapper=wikipedia_api)
    
    # Assemble toolkit
    tools = [calculate, create_research_note, wikipedia_tool]
    
    
    # =============================================================================
    # STEP 3: MODEL INITIALIZATION (Model-Agnostic)
    # =============================================================================
    
    def initialize_model(provider="openai", model_name=None):
        """
        Initialize a model with the unified LangChain interface.
    
        This demonstrates model-agnostic orchestration. Same agent logic
        works with OpenAI, Anthropic, Google, or open-source models.
        """
        if provider == "openai":
            model_name = model_name or "gpt-4"
            return ChatOpenAI(
                model=model_name,
                temperature=0,  # Deterministic for consistent tool use
                streaming=True
            )
        elif provider == "anthropic":
            # Uncomment if using Anthropic:
            # from langchain_anthropic import ChatAnthropic
            # model_name = model_name or "claude-opus-4"
            # return ChatAnthropic(model=model_name, temperature=0)
            raise NotImplementedError("Install langchain-anthropic to use this provider")
        else:
            raise ValueError(f"Unknown provider: {provider}")
    
    
    # =============================================================================
    # STEP 4: AGENT CONFIGURATION WITH MEMORY
    # =============================================================================
    
    # System prompt defines agent behavior
    SYSTEM_PROMPT = """You are a research assistant with access to calculation tools and Wikipedia.
    
    Your capabilities:
    - Perform mathematical calculations using the 'calculate' tool
    - Search Wikipedia for factual information using the 'Wikipedia' tool  
    - Save important research findings using 'create_research_note'
    
    Guidelines:
    - Always think step-by-step before acting
    - Use tools when appropriate rather than guessing
    - Cite sources when referencing Wikipedia information
    - Be concise but thorough in your responses
    
    Remember: You're helping users learn and research. Be accurate, helpful, and educational."""
    
    # Create prompt template with memory placeholder
    prompt = ChatPromptTemplate.from_messages([
        ("system", SYSTEM_PROMPT),
        MessagesPlaceholder(variable_name="chat_history", optional=True),
        ("human", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ])
    
    
    # =============================================================================
    # STEP 5: AGENT CONSTRUCTION AND EXECUTION
    # =============================================================================
    
    def create_research_agent():
        """Build the agent with model, tools, and memory."""
    
        # Initialize model (swap provider here for different models)
        model = initialize_model(provider="openai", model_name="gpt-4")
    
        # Create the agent
        agent = create_tool_calling_agent(
            llm=model,
            tools=tools,
            prompt=prompt
        )
    
        # Wrap in executor for handling the execution loop
        agent_executor = AgentExecutor(
            agent=agent,
            tools=tools,
            verbose=True,  # Shows tool calls and reasoning
            handle_parsing_errors=True,
            max_iterations=10  # Prevent infinite loops
        )
    
        return agent_executor
    
    
    def run_interactive_session():
        """
        Run an interactive session with the agent.
        Demonstrates conversational memory and multi-turn interactions.
        """
        print("=" * 70)
        print("🤖 LangChain 2026 Research Agent Lab")
        print("=" * 70)
        print("\nInitializing agent with tools: calculate, Wikipedia, research notes")
        print("Type 'quit' to exit\n")
    
        agent = create_research_agent()
        chat_history = []  # Stores conversation history (memory)
    
        while True:
            user_input = input("\n👤 You: ").strip()
    
            if user_input.lower() in ['quit', 'exit', 'q']:
                print("\n✨ Session ended. Check the output above to see tool calls!\n")
                break
    
            if not user_input:
                continue
    
            print("\n🔧 Agent thinking...\n")
    
            try:
                # Invoke agent with input and memory
                response = agent.invoke({
                    "input": user_input,
                    "chat_history": chat_history
                })
    
                # Extract response
                output = response.get("output", "No response generated.")
    
                print(f"\n🤖 Agent: {output}\n")
                print("-" * 70)
    
                # Update memory
                chat_history.append(HumanMessage(content=user_input))
                chat_history.append(AIMessage(content=output))
    
            except Exception as e:
                print(f"\n❌ Error: {str(e)}\n")
    
    
    def run_example_queries():
        """
        Run pre-defined queries to demonstrate capabilities.
        Use this to see the agent in action without manual input.
        """
        print("=" * 70)
        print("🧪 Running Example Queries")
        print("=" * 70)
    
        agent = create_research_agent()
        chat_history = []
    
        # Example queries that demonstrate different tools
        queries = [
            "What is 2 to the power of 16?",
            "Who was Alan Turing and what did he contribute to computer science?",
            "Create a research note about Alan Turing: British mathematician, Turing machine, Enigma code-breaker, father of computer science",
            "What's the square root of 144 multiplied by 5?"
        ]
    
        for i, query in enumerate(queries, 1):
            print(f"\n\n{'='*70}")
            print(f"Query {i}: {query}")
            print('='*70)
    
            try:
                response = agent.invoke({
                    "input": query,
                    "chat_history": chat_history
                })
    
                output = response.get("output", "No response")
                print(f"\n🤖 Response: {output}\n")
    
                # Update memory
                chat_history.append(HumanMessage(content=query))
                chat_history.append(AIMessage(content=output))
    
            except Exception as e:
                print(f"\n❌ Error: {str(e)}")
    
    
    # =============================================================================
    # MAIN EXECUTION
    # =============================================================================
    
    if __name__ == "__main__":
        import sys
    
        # Check for API key
        if not os.getenv("OPENAI_API_KEY"):
            print("❌ Error: OPENAI_API_KEY not found in environment")
            print("Create a .env file with: OPENAI_API_KEY=your_key_here")
            sys.exit(1)
    
        # Choose mode
        print("\nSelect mode:")
        print("1. Interactive session (manual input)")
        print("2. Example queries (automated demo)")
    
        choice = input("\nEnter 1 or 2: ").strip()
    
        if choice == "1":
            run_interactive_session()
        elif choice == "2":
            run_example_queries()
        else:
            print("Invalid choice. Running example queries...")
            run_example_queries()

    Step 4: Run the Agent

    Execute the script:

    python research_agent.py

    You’ll be prompted to choose between interactive mode (where you type queries) or example mode (automated demo).


    Step 5: Understanding the Output

    When you run the agent, watch for these key events:

    1. Tool Selection
    > Entering new AgentExecutor chain...
    Invoking: `calculate` with `{'expression': '2**16'}`

    The model decided to use the calculate tool. You didn’t write an if-statement. The model chose the action.

    2. Tool Execution
    65536

    The tool executed and returned a result.

    3. Agent Reasoning
    The result of 2 to the power of 16 is 65536.

    The model incorporated the tool result into its response.

    4. Multi-Step Workflows
      For complex queries, you’ll see multiple tool calls in sequence:
    Invoking: `Wikipedia` with `{'query': 'Alan Turing'}`
    Invoking: `create_research_note` with `{'topic': 'Alan Turing', ...}`

    This is the agent loop in action: call model → select tool → execute → repeat until done.
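
    If you want to see that loop without AgentExecutor hiding it, here is a hand-rolled sketch of the same cycle, reusing the calculate tool defined earlier. It is a simplification for intuition, not AgentExecutor’s actual internals:

    # The agent loop, unrolled by hand (conceptual sketch).
    from langchain_core.messages import HumanMessage, ToolMessage
    from langchain_openai import ChatOpenAI

    model_with_tools = ChatOpenAI(model="gpt-4").bind_tools([calculate])
    messages = [HumanMessage(content="What is 2 to the power of 16?")]

    while True:
        ai_msg = model_with_tools.invoke(messages)    # 1. call model
        messages.append(ai_msg)
        if not ai_msg.tool_calls:                     # done: no tool requested
            break
        for call in ai_msg.tool_calls:                # 2. model selected a tool
            result = calculate.invoke(call["args"])   # 3. execute it (only one tool bound here)
            messages.append(ToolMessage(content=str(result), tool_call_id=call["id"]))
        # 4. loop back to the model with the tool results appended

    print(messages[-1].content)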


    Step 6: Extend the Agent

    Once the basic agent works, try these modifications:

    Extension 1: Swap Models (Model-Agnostic Orchestration)

    Install the LangChain Anthropic integration:

    pip install langchain-anthropic

    Modify research_agent.py:

    # First uncomment the Anthropic branch inside initialize_model(),
    # then change the model line in create_research_agent():
    model = initialize_model(provider="anthropic", model_name="claude-sonnet-4")

    Same agent logic. Different model. Zero changes to tools or prompts. This is the power of unified interfaces.

    Extension 2: Add Custom Tools

    Create a new tool for web search:

    @tool
    def web_search(query: str) -> str:
        """
        Searches the web for current information.
        Use this when Wikipedia doesn't have the answer or you need recent data.
        """
        # Implementation: DuckDuckGo, Bing API, etc.
        return "Web search results would appear here"

    Add it to the tools list. The agent automatically learns how to use it from the docstring.
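
    Concretely, the only wiring change is the toolkit line (using the web_search stub above):

    tools = [calculate, create_research_note, wikipedia_tool, web_search]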

    Extension 3: Implement Long-Term Memory with Vector DB

    from langchain_community.vectorstores import Chroma  # pip install chromadb
    from langchain_openai import OpenAIEmbeddings

    # Initialize a persistent vector store for long-term memory
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma(embedding_function=embeddings, persist_directory="./agent_memory")
    
    @tool
    def search_memory(query: str) -> str:
        """Search the agent's long-term memory for relevant past conversations."""
        docs = vectorstore.similarity_search(query, k=3)
        return "\n".join([doc.page_content for doc in docs])

    Now your agent has semantic recall over all previous conversations. This is how production agents handle context that spans weeks or months.
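
    Recall is only half the loop, though: something has to write to the store. A minimal sketch of the write side, called wherever you decide a turn is worth remembering:

    # Persist a completed exchange so future sessions can recall it.
    # When to call this (every turn, only research notes, etc.) is a design choice.
    def remember_exchange(user_input: str, agent_output: str) -> None:
        vectorstore.add_texts([f"User: {user_input}\nAgent: {agent_output}"])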

    Extension 4: Add LangGraph for Durable Execution

    from typing import Any, TypedDict

    from langgraph.graph import StateGraph
    from langgraph.checkpoint.postgres import PostgresSaver  # pip install langgraph-checkpoint-postgres

    # Define state schema
    class ResearchState(TypedDict):
        input: str
        chat_history: list
        agent_outcome: Any
    
    # Build graph
    workflow = StateGraph(ResearchState)
    workflow.add_node("agent", run_agent_node)
    workflow.add_node("tools", run_tools_node)
    # ... add edges, checkpointing
    
    # Compile with checkpointing (in current langgraph releases,
    # from_conn_string is used as a context manager)
    with PostgresSaver.from_conn_string("postgresql://localhost/agent_checkpoints") as checkpointer:
        checkpointer.setup()  # create the checkpoint tables on first run
        app = workflow.compile(checkpointer=checkpointer)

    Now if your agent crashes during a multi-hour research task, it resumes from the last checkpoint. No wasted LLM calls. No lost progress.
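
    Resumption is keyed by a thread ID. Still inside the with block above, invoking the compiled graph looks like this (the thread ID value is arbitrary; reusing it is what lets a crashed run pick up where it left off):

    # Each thread_id identifies one durable workflow; re-invoking with the
    # same thread_id resumes from the latest checkpoint instead of restarting.
    config = {"configurable": {"thread_id": "research-task-42"}}
    result = app.invoke({"input": "Survey AI curricula", "chat_history": []}, config)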


    Key Takeaways: The Agent Engineering Mindset

    After completing this lab, you should understand:

    1. Declarative tool definition – You describe capabilities, the model figures out when to use them
    2. Model-agnostic architecture – Same code, different providers, zero refactoring
    3. Memory is dual-layer – Short-term (chat history) + long-term (vector stores)
    4. The agent loop is autonomous – The framework orchestrates, you don’t write the control flow
    5. Production means resilience – Checkpointing, sandboxing, and middleware aren’t optional

    The shift from 2023 to 2026: You’re not building prompt wrappers anymore. You’re engineering production-grade autonomous systems with reliability patterns borrowed from distributed systems (checkpointing, state machines, retries) and safety patterns borrowed from containerization (sandboxing, RBAC, network policies).

    If you treat LangChain as a library for calling OpenAI, you’re missing the entire architecture. It’s an agent orchestration platform. Master it, and you’re building the infrastructure for reliable AI-driven workflows at scale.


    Next Steps

    • Read the LangGraph documentation – This is where production-grade agentic workflows live
    • Explore LangSmith – Observability and debugging for agent workflows (trace every tool call, inspect model reasoning, replay production failures)
    • Study the agent framework ecosystem – Compare LangChain vs Anthropic Claude SDK vs OpenAI function calling. Each has strengths. Choose based on your constraints (model support, sandboxing requirements, license)
    • Build something real – The best learning is shipping. Take this lab agent and extend it into a research assistant for your domain (legal documents, medical literature, code documentation, etc.)

    About This Series
    AI with Peter explores the architecture of AI systems—not just what they do, but how they’re built, why they’re designed that way, and what it means for developers who need to ship reliable intelligent systems. Subscribe for weekly deep-dives on frameworks, patterns, and production war stories.

    We advocate for rigorous technical education that builds genuine competence, not credentialed compliance. When institutions fail to teach systems thinking, we teach the alternative path: learn by building, validate through shipping, and never mistake a certificate for understanding.


    Ready to go deeper? The full LangChain 2026 documentation and reference architecture diagrams are available at docs.langchain.com.

  • The Real Economics of Agentic AI: What I Learned About Search, Browser Automation, and Cost


    Most people still talk about AI as if the main question is:

    “Can it do the task?”

    That is no longer the most important question.

    A much more useful question is:

    What is the cost structure of asking AI to do the task this way?

    That is where serious AI work begins.

    I’ve been using AI agents to do competitive analysis of how colleges are integrating software engineering principles into building inference-oriented mobile and IoT applications. AI-assisted tooling such as Google Notebook LM gives learners their own experiential learning platforms, which simulate on-the-job workflows in ways never before attainable. Learners can experience delivering real-world-grade solutions with immediate feedback, and their comprehension is elevated in ways customized to their individual cognitive and learning styles. The confusion and cognitive drag created by the mediation layer of textbooks is flattened to nearly zero.

    This has taught me something important:

    AI is not just an intelligence layer. It is an economic layer.

    And if you do not understand the economics, you do not really understand the workflow.

    The shift from “chatbot thinking” to “workflow thinking”

    A lot of people still approach AI tools the way they approached early chat systems:

    • ask a question
    • get an answer
    • move on

    But once you start using agents for real work — market scanning, research, comparison, structured analysis, workflow support — you are no longer just chatting.

    You are designing processes.

    And processes have:

    • cost
    • speed
    • failure modes
    • architecture decisions
    • tradeoffs

    This is especially true when you start using tools that can:

    • search the web
    • browse live sites
    • navigate pages
    • extract structured information
    • synthesize results

    At that point, the prompt is no longer just a prompt.

    It becomes a work instruction.

    And work instructions have economics.

    One of the biggest lessons: not all AI actions cost the same

    A very important distinction emerged in my experimentation:

    There is a major difference between:

    1. Search-index style retrieval

    This is when the system uses its own search infrastructure to discover material broadly.

    And:

    2. Direct browser navigation

    This is when the system goes directly to known websites, clicks, reads pages, and extracts information the way a human operator would.

    From a user perspective, both may feel like “the AI is researching.”

    But operationally, they are not the same thing.

    The lesson is:

    Broad search and direct site navigation are different economic behaviors.

    That means you should not casually design a workflow as if all “AI research” is equivalent.

    It is not.

    The hidden trap: asking one agent to do everything

    One of the easiest mistakes to make is to build a single giant agent workflow that tries to do all of the following at once:

    • find the sources
    • search the market
    • deduplicate
    • rank opportunities
    • research companies
    • discover contacts
    • draft messages
    • explain the results

    At first glance, that sounds efficient.

    In practice, it is often the exact opposite.

    Why?

    Because the broader and fuzzier the task becomes, the more the system is forced into expensive, multi-stage behavior.

    This is the point where many people accidentally turn AI from a useful assistant into an uncontrolled cost center.

    The better pattern is usually:

    Bound the search → shortlist → deepen only on the winners

    That is not just good prompt design.

    That is good economics.

    A curriculum-research example

    Let’s say you are doing a competitive analysis of how colleges are teaching:

    • web development
    • mobile development
    • AI-assisted software workflows
    • modern toolchains shaped by inference-capable systems

    There are at least two ways to ask for that.

    Expensive and naive

    “Find how colleges across Canada and the U.S. are teaching web and mobile development in the age of AI.”

    This sounds impressive.

    But it is vague, broad, and computationally undisciplined.

    It invites:

    • too many schools
    • too many irrelevant pages
    • too much crawling
    • too much synthesis too early

    Economically sane

    “Review a limited set of target institutions. Extract evidence of whether web and mobile curricula explicitly reflect AI-assisted workflows, toolchain modernization, or inference-era development changes. Return only the top findings.”

    Now the system has:

    • bounded sources
    • a narrower mission
    • a more disciplined output structure

    That is the difference between:

    • asking for “AI magic”
    • and designing a usable research workflow

    This is exactly the kind of thinking institutions, businesses, and training organizations need much more of.

    Why this matters for organizations

    A lot of leaders are still being sold a fantasy version of AI:

    • faster
    • cheaper
    • smarter
    • automatic

    But real organizational AI adoption has to deal with:

    • task architecture
    • cost discipline
    • governance
    • hidden dependencies
    • manual vs automated division of labor
    • process design

    That is why I increasingly believe one of the emerging professional skills is agentic workflow cost design.

    By that I mean the ability to decide:

    • what should be searched broadly
    • what should be browsed directly
    • what should be staged
    • what should remain manual
    • where the expensive steps actually are
    • how to reduce waste without destroying value

    That is a real capability.

    And it will matter more and more as organizations move from “trying AI” to operating with AI.

    The important operational distinction

    One of the most practical lessons I’ve locked in is this:

    If an agent can work from known sites directly, that is often far more efficient than asking it to discover everything through broad search.

    In other words:

    Better for cost control

    • go directly to known job boards
    • go directly to known employer pages
    • go directly to known institutional sites
    • read and extract from those pages

    More expensive

    • open-ended “search the web for me” behavior
    • broad discovery across many unknown sources
    • forcing the system to do search, browse, compare, and synthesize all in one pass

    That does not mean search is bad.

    It means:

    search should be used deliberately, not lazily

    The strategic workflow pattern I now recommend

    For serious work, I recommend separating AI tasks into stages.

    Stage A — cheap discovery

    Use bounded prompts to gather:

    • top candidate sources
    • top current opportunities
    • top institutions to inspect
    • top documents worth deeper review

    Keep the output compact.

    Stage B — selective deeper work

    Only after you have a shortlist should you ask the system to:

    • compare
    • interpret
    • research context
    • identify implications
    • support messaging or recommendations

    This dramatically improves the economics of the workflow.

    It also improves quality.

    Why?

    Because most of the world is noise.

    The expensive part of AI should be reserved for the small slice that is actually worth serious attention.
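
    If it helps to see the shape of that pattern, here is a deliberately generic Python sketch. The cheap_scan and deep_analysis functions are hypothetical stand-ins for whatever bounded agent calls you actually use, with stub bodies so the structure is self-contained:

    def cheap_scan(source: str) -> dict:
        # Stage A stub: one bounded, inexpensive agent call per known source.
        return {"source": source, "relevance": 0.0, "summary": "..."}

    def deep_analysis(candidate: dict) -> dict:
        # Stage B stub: premium synthesis, run only on shortlisted winners.
        return {**candidate, "analysis": "..."}

    def staged_research(sources: list[str], shortlist_size: int = 5) -> list[dict]:
        # Stage A: cheap discovery over a fixed, bounded source list.
        candidates = [cheap_scan(src) for src in sources]
        # Keep only what is worth expensive attention.
        candidates.sort(key=lambda c: c["relevance"], reverse=True)
        shortlist = candidates[:shortlist_size]
        # Stage B: deepen only on the winners.
        return [deep_analysis(c) for c in shortlist]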

    If you are involved in education, training, or curriculum design, this matters immediately.

    We are entering a period where the real differentiator is not simply:

    • using AI
    • mentioning AI
    • adding AI to a slide deck

    It is being able to design economically coherent AI-supported learning systems.

    That includes:

    • knowing which research tasks should be agent-assisted
    • knowing which content tasks should be staged
    • knowing when NotebookLM, retrieval systems, browser automation, or structured search should be used
    • knowing how to build learning workflows that are not just impressive, but sustainable

    That is part of the reason I have been doing this kind of competitive analysis work in the first place.

    The institutions that understand this early will not just save time.

    They will design better systems.

    The deeper lesson

    The deeper lesson here is that AI is beginning to look less like “software you use” and more like:

    an operational resource that must be budgeted, structured, and governed

    That changes the conversation.

    Now the real questions become:

    • What is the workflow?
    • What is the value of the workflow?
    • What is the cost structure of the workflow?
    • What part should be automated?
    • What part should remain human?
    • What part should be staged?

    That is a much more mature conversation than simple AI enthusiasm.

    The future will not be won by the people who shout “AI” the loudest.

    It will be won by the people who learn how to make AI workflows:

    • useful
    • bounded
    • economically sane
    • operationally durable

    That is where real competency begins.

    And that is also where the next generation of consulting, curriculum design, and institutional strategy work will increasingly live.

    Because the question is no longer:

    “Can AI do this?”

    The better question is:

    “Can we design the workflow so the value justifies the spend?”


  • What Perplexity Agent Credits Taught Me About the Real Economics of AI Work

    Take a concrete example: using an agent, on a metered credit system like Perplexity’s, to run a competitive analysis of college web and mobile development curricula. At first glance, this sounds like a straightforward research task.

    It is not.

    It is actually a perfect example of a much bigger issue that organizations are only beginning to understand:

    AI value is not just about what a model can do. It is about what the workflow costs.

    That is where the real economics of agentic AI begin.

    The hidden lesson most people miss

    A lot of people still think of AI as a chatbot with a monthly subscription.

    That is already too simple.

    Once you begin using agents to research, browse, compare, summarize, and synthesize across many sources, you move into a different world. You are no longer just asking questions. You are designing workflows, and workflows have architecture costs.

    That means the real skill is no longer just prompt writing.

    It is:

    workflow cost design

    Why a college competitive analysis is a perfect example

    Suppose you want to study how schools across Canada and the United States are approaching:

    • web development curricula
    • mobile development curricula
    • AI-assisted coding
    • software engineering education
    • implications of AI inference hardware
    • changing employer expectations for student skills

    A human being can imagine this as one broad question:

    “Go find out how colleges are doing all of this.”

    But an AI agent experiences that as many different tasks:

    • discovering institutions
    • finding current program pages
    • comparing course outlines
    • looking for curriculum shifts
    • extracting themes
    • comparing countries
    • avoiding duplicates
    • identifying which findings actually matter

    That is not one task.

    That is a stack of expensive subtasks.

    The lesson is immediate:

    Broad, fuzzy, open-ended work is expensive work.

    The real operational insight

    If you ask an agent to do all of the following in one run:

    • scan widely
    • crawl deeply
    • deduplicate
    • compare
    • rank
    • summarize
    • produce executive insights

    you are effectively asking for a premium research pipeline, not a simple search.

    That is the first competency organizations need to build:
    they must learn to distinguish between:

    • a cheap discovery scan
    • and an expensive synthesis workflow

    Those are not the same thing.

    The economics of AI are architectural

    This is the point I think many business leaders, educators, and even technically sophisticated professionals are still underestimating.

    When using agent-based AI systems, cost is driven not just by “how much AI you use,” but by:

    • how many sources are searched
    • how broad the problem is framed
    • how much deep browsing is required
    • how many outputs are requested
    • whether the system is asked to compare, rank, or research recursively
    • whether the task is staged or all-in-one

    In other words:

    cost is a property of how the workflow is architected, not just of how much AI you use

    That is a very different mindset from casual chatbot usage.

    What the college-research example reveals

    Take the curriculum-analysis scenario again.

    There are at least three different ways to architect that research.

    Version 1: expensive and naive

    “Search colleges across Canada and the United States and tell me how they are teaching web development and mobile development in the age of AI inference chips.”

    This sounds intelligent.

    It is actually a cost bomb.

    Why? Because it invites:

    • open-ended crawling
    • vague inclusion criteria
    • lots of irrelevant institutions
    • lots of duplication
    • deep synthesis before the search has even been bounded

    Version 2: bounded and disciplined

    “Search only a selected list of colleges. Compare only current web development and mobile development programs. Return only whether AI-related tooling, deployment, or hardware-awareness appears in the curriculum.”

    This is much better.

    It introduces:

    • source limits
    • scope control
    • output discipline

    Version 3: staged and economically sane

    This is the best model.

    Stage A

    Find the top 10 institutions worth examining.

    Stage B

    Review only those top 10 for:

    • curriculum structure
    • evidence of AI-related adaptation
    • relevant toolchain shifts

    Stage C

    Do deep interpretation only on the top 3–5 most significant examples.

    That is the pattern I increasingly believe organizations need:

    Bound the search → shortlist → deepen only on winners

    That is not just good research design.

    That is good AI economics.

    In business settings, people often assume that once an agent exists, the smart move is to let it do everything.

    That is usually the wrong move.

    The better question is:

    Which parts of the work deserve premium agent execution, and which parts should remain bounded, staged, or even manual?

    In my own work, I have found this distinction extremely useful.

    For example:

    • broad market scanning should be narrow and cheap
    • deep analysis should happen only on shortlisted targets
    • manual intake channels should remain separate if the agent adds little value there

    This is a much more mature operating model than simply saying:
    “Use AI to research everything.”

    Why this matters for colleges, businesses, and consultants

    If you are:

    • a college administrator
    • a curriculum designer
    • a business leader
    • an L&D strategist
    • an AI consultant

    then this matters because agentic AI is beginning to behave like a real operating expense.

    And as soon as something becomes an operating expense, it requires:

    • budgeting
    • governance
    • architecture
    • policy
    • ROI discipline

    That is where the conversation changes.

    The question is no longer:

    “Can AI help us do this?”

    The better question is:

    “Can we structure the workflow so the value exceeds the cost?”

    That is a much more serious question.

    The new professional skill: agentic workflow cost design

    I think this is becoming a real capability area.

    Not just AI usage.
    Not just prompting.
    Not just tool familiarity.

    A more advanced and valuable skill is:

    Agentic workflow cost design

    That means knowing how to:

    • narrow scope
    • break a problem into stages
    • separate discovery from synthesis
    • avoid unnecessary browsing
    • reduce duplication
    • reserve deeper agent work for the highest-value subset
    • control spend without destroying value

    That is a real consulting and advisory competency.

    The practical framework I now recommend

    When approaching research-heavy AI work, I would suggest five design questions:

    1. What is the true decision goal?

    Are you trying to:

    • discover options
    • compare competitors
    • rank alternatives
    • produce executive recommendations

    Those are different tasks and should not be blended thoughtlessly.

    2. What can be narrowed?

    Can you reduce:

    • sources
    • time window
    • geography
    • institution list
    • title clusters
    • output count

    Every reduction helps.

    3. What can be staged?

    What belongs in:

    • Stage A: cheap scan
    • Stage B: shortlist review
    • Stage C: deep analysis

    This is usually the biggest cost lever.

    4. What should remain manual?

    Not everything needs agent work.

    Sometimes:

    • a known email feed
    • a saved search
    • a manually curated source list
      is cheaper and better.

    5. What is the maximum acceptable spend?

    This is the governance question.

    If you cannot answer this, you are not really managing AI operations.
    You are improvising.
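
    One way to make that question concrete is a simple spend guard around agent runs. A sketch with hypothetical cost figures; a real system would pull estimates from provider pricing or observability tooling:

    # A minimal spend guard: refuse new agent work once the budget is exhausted.
    class AgentBudget:
        def __init__(self, max_spend_usd: float):
            self.max_spend_usd = max_spend_usd
            self.spent_usd = 0.0

        def can_run(self, estimated_cost_usd: float) -> bool:
            return self.spent_usd + estimated_cost_usd <= self.max_spend_usd

        def record(self, actual_cost_usd: float) -> None:
            self.spent_usd += actual_cost_usd

    budget = AgentBudget(max_spend_usd=25.0)  # the governance answer, in code
    if budget.can_run(estimated_cost_usd=3.0):
        # ... run the bounded Stage A scan, then:
        budget.record(actual_cost_usd=2.4)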

    What I think organizations need to learn now

    The organizations that will use AI best are not the ones that shout the loudest about AI.

    They are the ones that learn to ask:

    • What is the workflow?
    • What is the value of this workflow?
    • What is the cost structure?
    • How do we redesign it so the economics make sense?

    That is the beginning of maturity.

    Final thought

    The most important realization from doing serious AI-assisted research is this:

    AI is not just intelligence. It is economics.

    If you are evaluating how colleges teach web and mobile development in the age of AI inference chips, or trying to redesign any other serious information workflow, that lesson matters immediately.

    Because once you understand the economics of agentic work, you stop being dazzled by what AI can do.

    And you start focusing on the thing that actually matters:

    whether the workflow is designed well enough to justify the spend

    That is where competency begins.

  • Claude Code’s Model Context Protocol (MCP) is the “USB‑C port” that lets Claude reach beyond your local files into the rest of your tech stack, turning it from a smart editor into a true automation hub.


    Claude Code with Model Context Protocol (MCP) turns an AI from a chat window into a process‑analysis partner that can “see” your documentation, connect the dots between assets, and produce clear, defensible insights about how your work actually runs. This guide is written so business analysts, managers, and process engineers can follow it without needing to be developers.


    1. What Claude Code and Model Context Protocol Are

    Claude Code in one sentence

    Claude Code is an “agent workspace” where you invite Claude into a project folder so it can read and write files, run safe commands you approve, and help you manage real work—not just answer questions.

    Model Context Protocol (MCP) in one sentence

    Think of the Model Context Protocol as a USB‑C‑style standard connector between AI models and the systems around them.

    MCP is a standard way to plug Claude into external systems (docs, wikis, ticketing, databases, knowledge bases) through small adapters called MCP servers.

    Think of it this way:

    • Claude is the analyst.
    • Claude Code is the workspace where the analyst sits.
    • MCP servers are adapters that open windows into specific systems (one server per system or data source).
    • Your tools (Confluence, SharePoint, file shares, etc.) are the rooms behind those windows.

    You decide which windows to open. Claude can then look through them, connect information, and write new files that summarize what it finds.


    2. Why MCP Matters for Business Analysts and Managers

    Most organizations already have everything they need to understand their processes—spread across different tools:

    • Process docs and SOPs in wikis, drives, or SharePoint.
    • Tickets and incidents in service or project systems.
    • Metrics and KPIs in BI dashboards and spreadsheets.
    • Policies, controls, and risks in GRC or governance tools.

    The problem is that the connections between all these assets live mostly in people’s heads. (See the book “If Only We Knew What We Know” for a discussion of this problem.)

    The solution? Model Context Protocol.

    Claude Code with MCP helps you:

    1. Discover unseen connections
      • Example: a recurring delay in onboarding that correlates with a single poorly documented step in an old SOP.
    2. Produce transparent, defensible insights
      • Every conclusion can be traced back to the specific documents Claude read and the files it created for you.
    3. Iterate quickly on process change
      • You can ask “what if” questions and get updated explanations without manually copying data between tools.

    You don’t need to write code. You need three things:

    • A curated folder of process docs.
    • A documentation MCP server pointing at that folder.
    • Plain‑English prompts for Claude.

    Warp AI and Notebook LM will help you bridge everything else.


    3. The Enterprise as a Graph of Process Assets

    It helps to imagine your organization as a graph:

    • Nodes (things):
      • SOPs, runbooks, tickets, risks, controls, KPIs, training modules, systems, roles.
    • Edges (relationships):
      • “implements”, “documents”, “depends on”, “is measured by”, “owned by”, “is evidence for”.

    Right now, most of those edges are implicit and undocumented.
    MCP + Claude Code lets the AI:

    • Walk across nodes and edges (from SOP → ticket → metric → control).
    • Notice inconsistencies and missing connections.
    • Explain what it sees in plain language, with references to specific assets.

    That’s exactly what a good business process analyst does—only now you have a tireless assistant.
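
    If you read a little Python, here is a toy illustration of that graph (purely optional, and all asset names are invented): the edges can be as simple as a list of typed triples.

    # A toy process-asset graph: nodes are assets, edges are typed relationships.
    edges = [
        ("SOP: Incident Response", "documents", "Process: Incident Management"),
        ("Control: C-17", "implements", "Policy: Major Incident Policy"),
        ("Process: Incident Management", "is measured by", "KPI: Mean Time To Resolve"),
        ("Ticket: INC-4821", "is evidence for", "Control: C-17"),
    ]

    def neighbors(node: str) -> list[tuple[str, str]]:
        # Walk one hop: everything connected to a given asset, in either direction.
        out = [(rel, dst) for src, rel, dst in edges if src == node]
        out += [(rel, src) for src, rel, dst in edges if dst == node]
        return out

    print(neighbors("Control: C-17"))
    # [('implements', 'Policy: Major Incident Policy'), ('is evidence for', 'Ticket: INC-4821')]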


    4. Three Concrete Use Cases (No Coding Required)

    4.1 Understand how a process really behaves

    Question you might ask:

    “Based on our Incident Management documentation, show me the official flow; then, using recent tickets, show me how it actually runs and where it diverges.”

    Claude can:

    • Use the docs MCP server to read your Incident SOPs.
    • Use other MCP servers (later) to read tickets and metrics.
    • Generate:
      • A step‑by‑step “happy path” flow.
      • A list of common deviations and their causes.
      • A text‑based diagram you can paste into a diagramming tool.

    4.2 Map risks, controls, and evidence

    Question:

    “For risk R‑123, list all related policies, procedures, controls, and evidence we have, and highlight any gaps.”

    Claude can:

    • Search your policy and control docs.
    • Identify where each control is documented and what assets back it up.
    • Produce:
      • A risk → control → evidence table.
      • A short narrative you can show to auditors.

    4.3 Plan the impact of a change

    Question:

    “We plan to retire System X. Based on our documentation, which processes, teams, and controls depend on it?”

    Claude can:

    • Read system descriptions and process docs.
    • Trace references to System X across documents.
    • Produce:
      • An impact analysis grouped by process and team.
      • Draft communication and training plans.

    In all cases, you stay in business language; Claude does the connecting and drafting.


    5. Warp AI: Your Translator Between English and the Terminal

    Before we set up MCP, we’ll use Warp, an AI‑powered terminal, as a translator between natural language and shell commands.

    Why Warp?

    • It lets you describe what you want in English.
    • It proposes the correct shell commands.
    • You can read them and click “Run” only if you’re comfortable.
    • You never need to memorize mkdir, git, or installation commands.

    Think of Warp AI as your “friendly tech partner” that runs commands for you, while you stay in control.

    What you do: describe goals in plain English.


    What Warp does: drafts the terminal commands, explains them, and executes only after you approve.

    We’ll use Warp to:

    • Create folders.
    • Install a simple documentation MCP server.
    • Register it with Claude.

    6. Quickstart: 10‑Minute MCP Docs Setup (Using Warp AI)

    Goal
    Give Claude (within Claude Code or another MCP‑aware client) read‑only access to a folder of process documents, using Warp AI so you can work mostly in plain English.


    Step 1 – Install Warp Once

    1. Open your existing terminal (macOS Terminal, for example).
    2. Install Warp:
      • On macOS with Homebrew: brew install --cask warp
      • Or visit the Warp website, download the app, and drag it into Applications.
    3. Launch Warp from Applications.
    4. Sign in (Google, GitHub, etc.) and enable AI when prompted.

    Success visual:
    You see the Warp window with a command line at the bottom and an AI/Agent button or panel you can open.


    Step 2 – Create a process-docs Folder With Warp AI

    In Warp, open the AI/Agent input and type this instruction (exact wording is fine):

    “I want a new folder in my home directory called process-docs/incident-management.

    Please:
    – create process-docs in my home directory,
    – create the subfolder incident-management inside it,
    – then list that folder so I can see it exists.

    Explain the commands before you run them.”

    Warp AI will propose commands like:

    mkdir -p ~/process-docs/incident-management
    ls ~/process-docs

    plus an explanation. Review the commands; if they look reasonable, click Run.

    Success visual:
    Warp prints incident-management when it lists ~/process-docs, and you can see the folder in Finder/Explorer.


    Step 3 – Put Your Process Documents Into That Folder

    1. Open your file manager (Finder/Explorer).
    2. Navigate to the documents you already use for your process (Word, PDFs, Markdown, etc.).
    3. Move or copy a small, curated set into:
      • macOS/Linux: ~/process-docs/incident-management
      • Windows (if applicable via WSL): the equivalent directory Warp showed.
    4. Rename them clearly, for example:
      • INCIDENT-Overview.docx
      • INCIDENT-Roles.pdf
      • INCIDENT-Workflow.md
      • INCIDENT-Metrics.xlsx

    If you’re unsure how to move files, ask Warp AI:

    “Explain step by step how to move my incident management documents from my Downloads folder into ~/process-docs/incident-management on macOS using Finder, not commands.”

    Success visual:
    When you open process-docs/incident-management in Finder/Explorer, you see your clearly named process files.


    Step 4 – Use Warp AI to Install a Simple Docs MCP Server

    We’ll use a filesystem‑style MCP server that exposes a folder of files to Claude.

    In Warp AI, type:

    “I’m using Claude Code and I want to expose the folder ~/process-docs/incident-management as a read‑only documentation MCP server that follows the Model Context Protocol.

    Please:
    – recommend a simple filesystem‑style MCP server,
    – give me the exact installation command,
    – explain what it does, in plain English, before I run it.”

    Warp AI will typically respond with:

      • A recommendation (e.g., the official example filesystem MCP server).
      • An installation command (for example, an npx or pip command).
      • A description like “this installs a small tool that lets Claude list and read files in the specified directory.”

    Read the explanation. When you’re comfortable, click Run to install.

    Success visual:
    Warp prints normal installation logs, ending without errors (no big red stack traces). The AI confirms that the server is installed.


    Step 5 – Register the MCP Server With Claude (Still Using Warp AI)

    Now we tell Claude about this new server. In Warp AI, type:

    “I have installed a local docs MCP server that reads ~/process-docs/incident-management. I’m using Claude Code (or Claude Desktop). Show me, step by step, how to add this server to my Claude MCP configuration so it appears under the name process-docs. Give me the exact config snippet and where to paste it.”

    Warp AI will:

      • Explain where your Claude MCP configuration lives.
      • Draft a small configuration block, for example:

    {
      "mcpServers": {
        "process-docs": {
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/Users/yourname/process-docs/incident-management"
          ]
        }
      }
    }

      • Tell you to replace /Users/yourname with your actual username and to save the file.

    Follow its instructions exactly, then restart Claude Code / Claude Desktop so it reloads MCP servers.

    Success visual:
    No error messages when restarting Claude; MCP settings save cleanly.


    Step 6 – Confirm That Claude Sees process-docs

    1. Open Claude (Code or Desktop).
    2. Start a new session or project.
    3. Ask Claude: “List the MCP servers you can access and briefly describe what process-docs can do.”

    Claude should say something like: “The process-docs server lets me list and read files from your incident‑management process documentation folder…”

    Success visual:
    process-docs appears in Claude’s answer with a clear description of reading your docs.


    Step 7 – Let Claude Build an Asset Inventory

    Now the fun part: ask Claude to build a structured inventory of your docs.

    In Claude Code, create a project folder (for example incident-analysis) and add a CLAUDE.md with a short description:

    # Project: Incident Management Process Analysis

    You are an AI process analyst.

    Goal:
    - Understand our Incident Management process.
    - Identify gaps and contradictions.
    - Produce artifacts I can reuse (summary, flow, RACI, training outline).

    Process docs are available via the `process-docs` MCP server. Read only.
    Always tell me which documents you used.

    Open this project in Claude Code, then prompt:

    “Using the process-docs MCP server,
    – find all incident‑related documents in that folder,
    – create a file incident_asset_inventory.md in this project,
    – and fill it with a table: filename, document type, short summary, and why it matters for the process.

    Show me the file when you’re done.”

    Claude will call process-docs, read each file, and generate the inventory.

    Success visual:
    incident_asset_inventory.md appears in your project, with a readable table summarizing each document.

    From here on, you use business questions:

      • “Reconstruct the process from these documents.”
      • “Find contradictions and unclear ownership.”
      • “Create a cheat sheet and training outline for front‑line staff.”

    No more setup required unless you want to add more MCP servers later.

    7. Turning Documentation Into Transparent, Defensible Insight

    With the MCP docs setup complete, you can now treat Claude as a process analyst that always shows its work.

    Here are some powerful prompts you can try immediately:

    7.1 Reconstruct the documented process

    “Using the documents in incident_asset_inventory.md via the process-docs server, reconstruct our Incident Management process as it is supposed to work.

    Provide:
    – a numbered list of steps from detection to closure,
    – for each step: purpose, primary role, key inputs and outputs,
    – and a short section listing any contradictions or unclear areas between documents.

    Cite document names for each claim.”

    7.2 Create a RACI matrix

    “From the same documentation set, build a RACI matrix for the Incident Management process.

    Use roles exactly as written in the docs.

    Output as a markdown table in a new file incident_raci.md.”

    7.3 Generate training material

    “Create a one‑page cheat sheet for front‑line staff describing this process in plain language.

    Then design a 3‑module training plan with learning objectives and quiz questions.”

    Each result is grounded in your own documents.

    If something looks wrong, you can ask: “Which files did you use to make this statement?” and Claude will tell you.

    8. Growing the Graph Over Time

    Once your first process works, you can:

    • Add other subfolders: change-management, onboarding, etc.
    • Introduce more MCP servers for:
      • Tickets and incidents.
      • BI/analytics data.
      • Risk and control libraries.
    • Ask increasingly sophisticated questions, like:
      • “Compare the documented flow with the last 90 days of tickets.”
      • “Show where risk controls are documented but have no clear owner.”
      • “List all processes that depend on System X.”

    You still work in natural language. Warp AI and Claude handle the technical details.

    9. Use Google Notebook LM as Your Personal Tutor

    This article gives you the overview and the steps, but learning is easier with a tutor by your side.

    That’s where Google Notebook LM and Gemini come in.

    Here’s what to do next:

    1. Copy this entire article.
    2. Open Google Notebook LM and create a new notebook.
    3. Paste the full article into a note.
    4. Ask Notebook LM:
      • “Act as my personal tutor. Walk me through this article one step at a time, starting from installing Warp, with clear checkpoints after each step.”
      • “Don’t move on until I confirm I’ve completed a step. If I say I’m stuck, explain it again in simpler language.”
      • “Turn the ‘Quickstart: 10‑Minute MCP Docs Setup’ into a checklist with boxes I can tick off.”
    5. As you work through the steps on your own machine, ask Notebook LM follow‑up questions or bring in Gemini:
      • “Explain what an MCP server is again, but in under 100 words.”
      • “Help me troubleshoot why Claude isn’t seeing the process-docs server.”
      • “Suggest three example prompts I can try once my setup is working.”
    Whenever something still doesn’t make sense—or you hit an edge case—drop your questions into the comments on the blog.

    I’ll be happy to help you troubleshoot and refine your workflow.

    Between Claude Code + MCP, Warp AI as your translator, and Notebook LM as your tutor, you now have a full stack of support to go from scattered process documents to clear, transparent, and defensible insight into how your enterprise really operates.
