Build Your Local AI: From Zero to a Custom ChatGPT Interface with Ollama & Open WebUI
Imagine having ChatGPT or DeepSeek-like capabilities right on your computer — no subscription fees, no privacy concerns, no waiting for responses, and complete customization options. Sounds too good to be true? It’s not!
Large Language Models (LLMs) have become indispensable tools for many of us. Whether you’re using them to process text, learn new concepts, generate code solutions, automate workflows, or enjoy chatting with your computer buddy — they’re changing how we work and create.
But popular services like OpenAI, Anthropic, or Perplexity often come with limitations:
- Cost: Monthly subscriptions or per-token pricing can add up quickly
- Limited APIs: Restricted customization options
- Missing features: Many don’t offer RAG (Retrieval-Augmented Generation) capabilities
- Privacy concerns: Your data might be used for training or stored on third-party servers
- Response delays: Peak usage times can mean long waits for responses
The solution? A locally running LLM instance that you can fully customize to your specific needs — complete with a user-friendly interface that rivals commercial offerings.
In this guide (about a 1-hour setup), I’ll walk you through setting up a robust local AI environment using free, open-source tools that provide:
- Total privacy (your data stays on your machine)
- No subscription costs
- Minimal response waiting time
- Results that approach those of high-end commercial LLMs
- Complete customizability for your specific use cases
Personal note: I've been using this setup for two months, and it has cut my monthly AI subscription costs to zero while increasing my productivity. The initial setup time is worth every minute for the long-term benefits.
Toolstack Overview
Our local AI setup will use two primary tools:
Ollama
Ollama is an open-source framework designed specifically for running LLMs locally. It provides:
- Access to a wide variety of open-source models (DeepSeek, Llama, Phi, Mistral, Gemma, and many more)
- Text generation capabilities
- Multimodal support (for models that can process images)
- Efficient model management
Open WebUI
Open WebUI is currently the most prominent open-source project offering a user interface for your Ollama instance. Think of it as your local version of the ChatGPT or Claude interface, but with even more features:
- User-friendly chat interface
- Model customization
- RAG capabilities
- Web search integration
- Code interpreter
- Complex workflow design
- And many more features are being actively developed
The best part? Open WebUI is constantly improving as passionate engineers contribute to this open-source project, bringing features from proprietary platforms to this free alternative.
Requirements
Before we start, make sure your system meets these minimum requirements:
- A GPU-powered laptop or desktop (minimum 4GB GPU memory, 8GB+ recommended)
- At least 20GB of free disk space (models can be large)
- A modern web browser
Pro tip: While CPU-only setups technically work, they’ll be significantly slower. Even a modest GPU will greatly improve your experience.
Environment Setup
Setting Up Python 3.11
Open WebUI works best with Python 3.11, so we’ll start by making sure the correct version is installed. (Ollama itself ships as a standalone binary and doesn’t depend on Python, but the Open WebUI package does.)
First, check your current Python version:
python --version
If you don’t have Python 3.11, we’ll install it using pyenv, an excellent tool for managing multiple Python versions:
Note: Visit the pyenv GitHub repository for detailed installation instructions specific to your OS.
For macOS users:
1. Install pyenv
brew update
brew install pyenv
2. Configure your shell
Add these lines to your shell configuration file (.bashrc, .zshrc, or equivalent):
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv init -)"
Then reload your shell:
exec "$SHELL"
3. Install Python 3.11
pyenv install 3.11
4. Set it as your global Python version
This step is optional but recommended: it lets you start Ollama and Open WebUI from your terminal in future sessions without reconfiguring anything.
pyenv global 3.11
5. Verify the installation
python --version
You should see output confirming that a Python 3.11.x version is now active.
Installing and Configuring Ollama
Now that our Python environment is ready, let’s install Ollama:
1. Download and Install Ollama
Note: Visit the Ollama GitHub page and follow the installation instructions for your OS.
For macOS users: download the .zip package from the official Ollama repository, unzip it, and install it by launching the ‘Ollama’ application inside.
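If you prefer Homebrew, there is also a formula for Ollama (a convenient alternative, though it may trail the official release slightly):
brew install ollama
ollama serve
The ollama serve command starts the local server that both the CLI and Open WebUI talk to; the macOS app starts this server for you automatically.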
2. Verify Ollama Installation
Open a terminal/command prompt and type:
ollama
You should see a help message listing Ollama’s available commands, confirming the CLI is installed correctly.
3. Exploring Available Models
Ollama gives you access to many open-source models. You can browse available models at ollama.com/search.
When selecting a model, consider:
- Model size (smaller models run faster but may be less capable)
- Specialization (some models excel at coding, others at creative writing)
- Memory requirements (larger models need more GPU memory)
Check the HuggingFace Open LLM Leaderboard for benchmarks and performance metrics. Use the advanced filters and metrics to find suitable models for your task.
Pro tip: Start with smaller models (7B parameters or less) and move to larger ones only if needed. Many tasks can be handled effectively by smaller models, which run much faster.
4. Running Your First Model
Let’s start with phi4-mini, a small but capable model:
ollama run phi4-mini
This will download the model (if it’s not already downloaded) and start an interactive chat session. Try asking it a question to verify everything is working.
To exit the chat, type /bye or press Ctrl+D.
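You can also run a single prompt without entering the interactive session, which is handy for quick checks or scripting:
ollama run phi4-mini "Explain what a large language model is in one sentence."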
Speed check: For a smooth experience, a good rule of thumb is that your model should output at least 10–20 words per second. If it’s much slower, you might try a smaller model or check if your GPU is properly utilized.
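To measure your actual speed, run the model with the --verbose flag; after each response, Ollama prints timing statistics, including the eval rate in tokens per second (a token is roughly three-quarters of a word):
ollama run phi4-mini --verbose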
5. Check Your Installed Models
List all your installed models with:
ollama list
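Two related commands are worth knowing: ollama pull downloads a model without starting a chat, and ollama rm deletes one to free disk space:
ollama pull mistral
ollama rm mistral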
Installing and Configuring Open WebUI
Now that Ollama is running, let’s install Open WebUI to create a user-friendly interface.
1. Install Open WebUI
The easiest way to install Open WebUI is using pip:
pip install open-webui
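To keep its dependencies isolated from other Python projects, consider installing it inside a virtual environment first. A minimal sketch using Python’s built-in venv module (the environment name is arbitrary):
python -m venv open-webui-env
source open-webui-env/bin/activate
pip install open-webui
Just remember to re-activate the environment in new terminal sessions before running open-webui.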
2. Start Open WebUI
Make sure Ollama is running, then start Open WebUI with:
open-webui serve
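If the UI starts but lists no models, first confirm that Ollama’s local API is reachable; it listens on port 11434 by default, and its /api/tags endpoint returns your installed models:
curl http://localhost:11434/api/tags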
Note: Make sure that no other processes are already using port 8080 or 5173.
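If port 8080 is taken, recent versions of the open-webui CLI accept a --port flag, so you can serve the UI elsewhere (pick any free port):
open-webui serve --port 3000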
3. Access the UI
Open your web browser and navigate to: http://localhost:8080/
You should see the Open WebUI interface welcoming you!
4. Interface Overview
Take a moment to familiarize yourself with the interface:
- Chat Interface: The main area where you’ll interact with your AI
- Models Menu: Select which model(s) to use
- Chat Controls: Configure system prompts and model parameters
- Settings: Access administrative features and customizations
Pro tip: You can even use multiple models in a chat simultaneously to compare responses and aggregate their knowledge.
Basic Open WebUI Functionality
Let’s explore some of Open WebUI’s powerful features.
Enabling Web Search
Web search allows your AI to access current information beyond its training data. Open WebUI supports multiple search engines, giving you flexibility based on your needs. Here’s how to set it up:
- Go to Settings → Admin Panel → Web Search
- Enable web search and select your preferred search engine
I’ll focus on two popular options: Google PSE and Brave Search. Each has its advantages and disadvantages:
Google PSE API Setup
Advantages:
- King of search. Industry-leading search capabilities and relevance
- Generous free tier (10,000 requests per day)
- Comprehensive search results across the entire web
Disadvantages:
- Privacy concerns (Google processes your queries 😢)
- More complex setup process
Setup process:
For detailed instructions on how to set up Google PSE, please refer to the Open WebUI documentation.
Brave Search API
Advantages:
- Privacy-focused (doesn’t track your search queries)
- Independent search index (not relying on Google)
- Simple setup process
- Free tier available
Disadvantages:
- Limited to 2,000 free queries per month
- Search results may sometimes be less comprehensive than Google
Setup Process:
- Go to Brave Search API
- Sign up and verify your email
- Navigate to the “Subscribe” tab and choose the free subscription (requires card information)
- Go to the “API Keys” tab and create a new API key
- Copy the token and paste it into Open WebUI’s web search configuration
Test Search Functionality
We can now test the search functionality by asking our AI agent for the weather forecast for this weekend.
Note: Only enable web search when you need recent information or are researching topics outside the model’s knowledge, as it significantly slows down response time. For most general queries, the built-in knowledge of your model will be faster and sufficient.
Code Interpreter
Open WebUI’s code interpreter transforms your AI assistant into a dynamic programming tool, enabling it to write and execute Python code directly within the chat interface.
Why This Matters: Your AI can now solve problems with code, demonstrate processes, and allow you to modify solutions interactively.
Key Features:
- Interactive Code Blocks: View and edit AI-generated code directly in your chat.
- Multiple Executions: Run code multiple times with different inputs.
- Real-time Feedback: Receive immediate results without switching platforms.
- No Execution Limits: Run code freely, without the usage caps commercial platforms impose.
When you request a task that benefits from computational assistance, the AI recognizes the need for code execution and generates appropriate Python code that helps it answer your query. Just make sure to enable the ‘Code Interpreter’ option in your chat.
Enabling the Code Interpreter:
- Navigate to Settings → Admin Panel → Code Execution.
- Toggle the “Enable Code Interpreter” switch to On.
Example Use Case:
Imagine you need the prime factorization of a large number. Doing this quickly and accurately is hard for humans, and LLMs answering from memory often get it wrong too, which makes it an ideal task to hand off to code.
Prompt:
What is the prime factor decomposition of 272894?
Without Code Interpreter: (screenshot of the model’s unaided answer)
With Code Interpreter: (screenshot of the code-assisted answer)
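For reference, the code the model generates for this prompt typically looks something like the following (a hypothetical sketch; the actual generated code varies from run to run):
def prime_factors(n):
    # Trial division: strip out each factor until n is fully decomposed
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(272894))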
The code interpreter fundamentally changes how you interact with your AI, transforming it from a conversational assistant into a computational powerhouse that can directly solve problems and demonstrate solutions. This feature alone can justify the entire local setup process for many users, especially those working in data analysis, programming, or mathematics.
Creating a Basic Custom Model
One of the most powerful features of Open WebUI is the ability to customize how your AI behaves. To unlock this feature:
- Go to Workspace → Models → Create New
- Give your model a name and select a base model (e.g., phi4-mini)
- Customize its behaviour (e.g., by adding a system prompt that guides its responses)
For example, you could create a biology professor persona with this system prompt:
You are a university professor specialising in Biology with a passion for frogs, and you have a charming lisp in your speech. When interacting with users:
- Answer biology-related queries with clear, factual, and detailed explanations, mainly focusing on frog topics.
- Explain complex concepts using analogies drawn from everyday scenarios, making them easier to grasp.
- If a user's question is ambiguous or unclear, ask clarifying questions before providing a complete answer.
- Regularly quiz the user on key points to confirm understanding.
- Propose various follow-up questions or alternative learning directions to encourage further discussion.
- Maintain a friendly, engaging, and scholarly tone, ensuring your unique lisp is reflected in your speech.
Save the model and select it in a new chat to see your wonderfully distinctive AI assistant in action.
Pro tip: Create different model configurations for different tasks — one for brainstorming, another for coding, and yet another for detailed explanations.
Conclusion & Next Steps
Congratulations! 🎉 You now have a fully functional local AI environment that gives you:
- Privacy (your data stays on your machine)
- Cost savings (no subscription fees)
- Fast responses
- Customizable AI assistants
In just about an hour, you’ve set up an infrastructure that rivals commercial AI platforms, all while maintaining complete control over your data and experience.
What We’ve Accomplished
- ✅ Set up the required Python environment
- ✅ Installed and configured Ollama
- ✅ Installed and set up Open WebUI
- ✅ Enabled web search capabilities
- ✅ Activated the code interpreter
- ✅ Created a basic custom model
Next Steps
If you want to advance your project or learn more about Retrieval-Augmented Generation (RAG) and custom knowledge bases, check out the next article: "Open WebUI Tutorial — Supercharge Your Local AI with RAG and Custom Knowledge Bases". This guide walks you through the out-of-the-box RAG features in Open WebUI that require no coding. By the end of the tutorial, you’ll be able to build your own local documentation assistant.