Building a Local RAG AI Agent for Airline Reviews with Ollama and LangChain

Author: Nino, Senior Tech Editor

The landscape of Artificial Intelligence is shifting rapidly from massive, centralized cloud models to efficient, localized deployments. For developers and privacy-conscious enterprises, the ability to run sophisticated AI agents entirely offline is no longer a luxury—it is a strategic advantage. In this tutorial, we will explore the architecture and implementation of a local Retrieval-Augmented Generation (RAG) system designed to analyze airline reviews. By applying the n1n.ai principles of efficiency and local-first development, we can build a high-performance assistant without incurring API costs or compromising data privacy.

Why Local RAG Matters in 2025

Traditional RAG systems often rely on proprietary APIs like OpenAI or Anthropic. While these are powerful, they come with latency, cost, and data residency concerns. By using Ollama, we can host models like Llama 3.2 locally, ensuring that sensitive customer data—such as airline reviews—never leaves the local environment. This is particularly useful for small curiosity-driven projects or internal enterprise tools where the scale doesn't yet justify a massive cloud spend. However, for those looking to scale beyond local hardware, n1n.ai provides the perfect bridge to high-speed, stable LLM APIs when your local resources reach their limit.

The Technical Stack

To build this agent, we utilize a modern Python-based stack:

  1. Ollama: The runtime that allows us to run LLMs locally on macOS, Linux, and Windows.
  2. Llama 3.2: Our primary reasoning engine, optimized for speed and instruction following.
  3. mxbai-embed-large: A state-of-the-art embedding model designed for high-density semantic search.
  4. ChromaDB: An open-source vector database that stores our review embeddings.
  5. LangChain: The orchestration framework that ties the data retrieval and LLM generation together.
  6. Pandas: For initial data cleaning and CSV manipulation.

Data Preparation: The Foundation of Quality

The dataset for this project consists of airline reviews sourced from Kaggle. Raw data is often noisy; therefore, we must refine it to ensure the RAG system performs optimally. Instead of feeding the entire CSV into the vector store, we focus on three columns: content (the review text itself), airline_name, and overall_rating.

Reducing the dataset size not only speeds up the embedding process but also minimizes the noise during retrieval. In a production environment, you might consider using advanced cleaning techniques, but for this proof of concept, a simple Pandas filter is sufficient.
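A minimal sketch of that Pandas filter, using a tiny inline DataFrame to stand in for the Kaggle CSV (the real file would come from pd.read_csv('airline_reviews.csv'); the 10-character length threshold is an illustrative choice, not part of the dataset):

```python
import pandas as pd

# Illustrative sample standing in for the Kaggle CSV.
df = pd.DataFrame({
    "airline_name": ["Emirates", "Emirates", "Lufthansa"],
    "content": ["Great cabin crew and food.", "ok",
                "Delayed twice, poor communication."],
    "overall_rating": [9.0, None, 3.0],
})

# Keep only the columns the RAG system needs and drop incomplete rows.
df = df[["airline_name", "content", "overall_rating"]].dropna()

# Drop very short reviews: they add embedding cost but carry little signal.
df = df[df["content"].str.len() > 10].reset_index(drop=True)

print(len(df))  # 2 rows survive the filter
```

The same two lines of filtering scale to the full dataset unchanged; only the source of the DataFrame differs.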

Implementation Guide

1. Setting Up the Environment

First, ensure you have Ollama installed and the necessary models pulled:

ollama pull llama3.2
ollama pull mxbai-embed-large

Next, install the Python dependencies:

pip install langchain langchain-ollama langchain-chroma pandas

2. Vector Ingestion and Embedding

The core of RAG is converting text into numerical vectors. We use the mxbai-embed-large model because it offers a great balance between dimensionality and semantic accuracy.

from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import DataFrameLoader
import pandas as pd

# Load and clean data
df = pd.read_csv('airline_reviews.csv')
df = df[['airline_name', 'content', 'overall_rating']].dropna()

# Initialize loader
loader = DataFrameLoader(df, page_content_column="content")
documents = loader.load()

# Create embeddings and store in Chroma
embeddings = OllamaEmbeddings(model="mxbai-embed-large")
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=embeddings,
    persist_directory="./chroma_db"
)

3. The Retrieval Chain

Once the data is indexed, we need to define how the agent retrieves information and generates a response. We use a custom prompt template to ensure the model stays grounded in the provided context. If the model cannot find an answer in the retrieved documents, it is instructed to admit it rather than hallucinate.

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

llm = ChatOllama(model="llama3.2", temperature=0)

template = """
You are an expert in airline reviews.
Use the provided reviews to answer the question.

Context: {context}

Question: {input}

Answer ONLY based on the context. If unknown, say 'No reviews found.'
"""

prompt = ChatPromptTemplate.from_template(template)
combine_docs_chain = create_stuff_documents_chain(llm, prompt)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)

Testing the Agent

A robust RAG system must handle both valid and out-of-distribution queries. When we ask, "How is the service on Emirates?", the system retrieves top-rated reviews mentioning Emirates and summarizes the sentiment. Conversely, if we ask about a non-related topic like "Is the Honda CR-V a good car?", the retriever returns no relevant documents, and the LLM correctly responds with the fallback message.

Scaling Local RAG to Production

While running locally is excellent for development, production workloads often require higher availability and lower latency than a single local machine can provide. This is where n1n.ai becomes essential. As the premier LLM API aggregator, n1n.ai allows you to switch from local Ollama instances to high-performance cloud endpoints (like Claude 3.5 Sonnet or GPT-4o) with minimal code changes. This hybrid approach—developing locally and deploying with a stable API aggregator—is the gold standard for modern AI engineering.

Conclusion

Building a local RAG agent with Ollama is a rewarding way to understand the mechanics of semantic search and LLM orchestration. It proves that you don't need a massive GPU cluster to build something functional and intelligent. Whether you are analyzing airline reviews or building a personal knowledge base, the combination of local tools and professional aggregators like n1n.ai ensures your project is both flexible and future-proof.

Get a free API key at n1n.ai