Build Your Own Local AI Agent: A Step By Step Guide

Local AI agents run on your machine. No cloud. No external APIs. Just you, your hardware, and the model. This post walks through the essentials: choosing a model, wiring it up with an agent framework, and running it locally. If you want privacy, speed, or control, this is how you get it.

What Can Local Agents Do?

Local agents can handle a wide range of tasks: summarizing documents, answering questions, automating workflows, scraping websites, or even acting as coding assistants.

In this post, we’ll focus on a simple task: scraping news headlines from a website and summarizing them. It’s fast, useful, and shows the core pieces in action.

Tools We’ll Use

  • Ollama – run language models locally with one command. Gemma or Mistral work fine on a laptop.
  • LangChain – structure reasoning, tools, and memory
  • Python – glue everything together

Basic Structure of a Local Agent

  1. Model – the LLM doing the “thinking”
  2. Tools – code the agent can use (like a scraper or file reader)
  3. Prompt – instructions for what the agent should do
  4. Loop – let the agent think and act step-by-step

That’s it. The rest is just wiring.

Getting Started

  1. Install Ollama
    https://ollama.com
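    On macOS: brew install ollama. Otherwise grab the installer for your OS.
  2. Pull a model: ollama pull mistral (ollama run mistral also downloads the model on first use, then opens an interactive prompt)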
  3. Set up a LangChain agent
    Load the model via LangChain, define a tool, and pass it to the agent. You’ll see how in the example below.

The Code

pip install langchain beautifulsoup4 requests

ollama run mistral

Now create a Python script, such as run.py:

from langchain.llms import Ollama

llm = Ollama(model="mistral")

The scraper:

import requests
from bs4 import BeautifulSoup

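def get_headlines(url="https://www.bbc.com"):
    # Fetch the page; fail fast on timeouts and HTTP errors
    res = requests.get(url, timeout=10)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, "html.parser")
    # Assumes headlines live in <h3> tags; adjust for the site's markup
    headlines = [h.get_text(strip=True) for h in soup.find_all("h3")]
    return "\n".join(headlines[:10])  # Just take top 10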

Wrap it as a LangChain tool:

from langchain.agents import tool

@tool
def scrape_headlines() -> str:
    """Scrapes top headlines from BBC."""
    return get_headlines()

Build the agent:

from langchain.agents import initialize_agent, AgentType

tools = [scrape_headlines]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

Run the agent:

agent.run("Get the top news headlines and summarize them in a few bullet points.")

That’s it. You now have a local agent: scraping, thinking, and summarizing. All on your machine.

std::vector

std::vector is a dynamic array container. Unlike static arrays, it can grow and shrink at runtime, and you can navigate its elements with either index offsets or iterators.

Some key points about std::vector

  • Dynamic resizing: When you add elements to a std::vector, it can automatically resize itself to accommodate the new elements.
  • Random access: You can access any element of a std::vector using its index, just like a regular array.
  • Efficient insertion and deletion at the end: push_back and pop_back run in amortized constant time, while inserting or erasing elsewhere has to shift the elements that follow.
  • Memory management: std::vector handles memory management for you, so you don’t have to worry about allocating and deallocating memory manually.
  • Iterators: std::vector provides iterators that allow you to iterate over its elements in a convenient way.

Example of std::vector in C++

#include <iostream>
#include <vector>

int main() {
    std::vector<int> numbers;

    // Add elements to the vector
    numbers.push_back(1);
    numbers.push_back(2);
    numbers.push_back(3);

    // Access elements using their indices
    std::cout << numbers[0] << std::endl;
    std::cout << numbers[1] << std::endl;
    std::cout << numbers[2] << std::endl;

    // Iterate over the elements using a for loop
    for (int number : numbers) {
        std::cout << number << std::endl;
    }

    return 0;
}

It prints 1, 2, 3 twice, one number per line: first through the indices, then through the loop.

Differences between std::vector and std::list

| Feature | std::vector | std::list |
|---|---|---|
| Underlying data structure | Dynamic array | Doubly linked list |
| Random access | Yes | No |
| Insertion/deletion at the beginning/middle | Inefficient (O(n)) | Efficient (O(1)) |
| Insertion/deletion at the end | Efficient (O(1)) | Efficient (O(1)) |
| Iterators | Random access iterators | Bidirectional iterators |
| Memory usage | Generally more compact | Generally less compact (due to pointers) |

std::vector is often preferred for its random access and memory efficiency, while std::list is the better choice if you often need to insert at the beginning or in the middle.

C++ map quick example

Here’s a quick example of using a C++ map:

#include <iostream>
#include <map>
#include <string>

int main() {
    std::map<std::string, int> grades;
    grades["John"] = 85;
    grades["Jane"] = 92;
    grades["Jim"] = 78;

    std::cout << "John's grade: " << grades["John"] << std::endl;
    std::cout << "Jane's grade: " << grades["Jane"] << std::endl;
    std::cout << "Jim's grade: " << grades["Jim"] << std::endl;

    return 0;
}

In this example, we use a std::map to store the grades of three students (John, Jane, and Jim). Each student’s name is the key used to look up the corresponding grade. The code uses the [] operator both to insert new grades and to look up existing ones. Note that [] also inserts a default value (0 here) when the key is missing.

When we run this program, it outputs the following:

John's grade: 85
Jane's grade: 92
Jim's grade: 78

This demonstrates how a std::map can be used to store key-value pairs and look up values based on keys. It’s a powerful and flexible container that can be used in many applications.

How to read long compiler outputs

Reading long compiler outputs can be overwhelming and time-consuming, but there are several steps you can take to make it easier:

  1. Scan for error messages: Look for the word “error” in the output, as this indicates a problem that needs to be fixed. Start by fixing the first error, as it may resolve subsequent errors.
  2. Look for error messages that are repeated: If the same error message is repeated multiple times, it may be easier to resolve all instances of the error at once.
  3. Locate the file and line number of the error: The compiler will usually provide the name of the file and line number where the error occurred. This information can be used to quickly locate the problem in your code.
  4. Read the error message carefully: The error message will usually give you a clue as to what the problem is and how to fix it. Pay close attention to the error message and take the time to understand what it is telling you.
  5. Use a text editor with error navigation: Some text editors have plugins that can automatically parse the compiler output and allow you to quickly navigate to the location of the error in your code.
  6. Consult online resources: If you are not sure how to resolve an error, you can consult online resources such as Stack Overflow, the GCC documentation, or other forums.
  7. Try to understand the root cause of the error: Compiler errors often have multiple causes, so try to understand the root cause of the error so you can fix it for good.

By following these steps, you can make reading long compiler outputs easier and more manageable.

std::atomic default value

std::atomic is a template that allows you to declare its contained value to be atomic: access to this value from different threads is well defined and does not lead to data races.

How to set values

std::atomic is not copyable and not movable.

Let’s say we need an atomic value as a class member and want it to be set to a default value:

#include <atomic>

class Foo {
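    std::atomic<bool> _online = true; // intuitive, but doesn't compile before C++17
};

Before C++17, that doesn’t compile. Clang will tell us: "copying member subobject of type 'std::atomic_bool' (aka 'atomic<bool>') invokes deleted constructor." Since C++17, guaranteed copy elision makes this form valid, but the braced form below works in every standard from C++11 on.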

Use the initialization constructor

Default values can be given to atomic members with a braced initializer {} (C++11 brace-init):

std::atomic<bool> _online{true};

For the complete API, refer to https://www.cplusplus.com/reference/atomic/atomic/

Aliases

And for those who want to improve readability a bit more, there are aliases we can use:

std::atomic_bool _online{true};

std::atomic_short
std::atomic_ushort
std::atomic_int
std::atomic_uint
std::atomic_long
std::atomic_ulong
std::atomic_llong
std::atomic_ullong
std::atomic_char8_t
std::atomic_char16_t
std::atomic_char32_t
std::atomic_wchar_t
std::atomic_int8_t
std::atomic_uint8_t
std::atomic_int16_t
std::atomic_uint16_t
std::atomic_int32_t
std::atomic_uint32_t
std::atomic_int64_t
std::atomic_uint64_t
std::atomic_int_least8_t
std::atomic_uint_least8_t
std::atomic_int_least16_t
std::atomic_uint_least16_t
std::atomic_int_least32_t
std::atomic_uint_least32_t
std::atomic_int_least64_t
std::atomic_uint_least64_t
std::atomic_int_fast8_t
std::atomic_uint_fast8_t
std::atomic_int_fast16_t
std::atomic_uint_fast16_t
std::atomic_int_fast32_t
std::atomic_uint_fast32_t
std::atomic_int_fast64_t
std::atomic_uint_fast64_t
std::atomic_intptr_t
std::atomic_uintptr_t
std::atomic_size_t
std::atomic_ptrdiff_t
std::atomic_intmax_t
std::atomic_uintmax_t

Tools

To detect data races you can use ThreadSanitizer, for example when you run your automated tests. It reports the conflicting accesses along with their stack traces, so you know where synchronization is missing.

clang++ -fsanitize=thread

And in cmake:

cmake -DCMAKE_CXX_FLAGS="-fsanitize=thread"

More about tsan here: https://clang.llvm.org/docs/ThreadSanitizer.html

pre-increment and post-increment in C++

Maybe you have wondered why there are two types of increment operators in C++ and why the difference matters in some situations.

Let’s take a variable i of type int. Pre-increment is ++i, and post-increment is i++, as we learned it in school.

Both increment the variable i by 1. The difference is the value the expression yields, which shows when passing it to a function:

#include <iostream>

void doSomething(int i) {
  std::cout << i << std::endl;
}

int main() {
  int i = 12;
  doSomething(i++); // prints 12
  // i is now 13

  int j = 12;
  doSomething(++j); // prints 13
  // j is now 13
  return 0;
}

The explanation

Pre-increment increments the value of i and returns a reference to the now increased i.

Post-increment makes a copy of i, then increments i, but returns the copy.