Run your own AI LLM in two commands

Apr 23, 2024 · Derek Armstrong · 2 min read

Run Your Own AI Chatbot Locally with Meta’s Llama Model

Ever wanted to have your own AI chatbot running locally? With Meta’s Llama model and Docker, you can set it up in just a few steps. Here’s how:

Prerequisites: Ensure Docker is installed on your machine. If you need to install Docker, follow the straightforward guide available at the Docker Docs.

Install Docker Engine | Docker Docs

Step 1: Set Up the Docker Container

Open your terminal and run the following command to create and start the Ollama container:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

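This command pulls the ollama/ollama image if it is not already present and starts it as a detached container (-d) named ollama. The -v ollama:/root/.ollama flag mounts a named volume so downloaded models survive container restarts, and -p 11434:11434 publishes the Ollama API port on the host.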
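Before moving on, it can help to confirm the container is actually up. The following is a minimal sketch, not part of the official setup: the helper names are hypothetical, and it assumes the container name (ollama) and port (11434) from the command above, plus curl on the host.

```shell
# Base URL of the Ollama server; override by exporting OLLAMA_URL first.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: print "name: status" for running containers whose
# name matches "ollama" (prints nothing if none are running).
ollama_container_status() {
  docker ps --filter "name=ollama" --format '{{.Names}}: {{.Status}}'
}

# Hypothetical helper: succeed only if the Ollama HTTP server answers
# on the published port within 5 seconds.
ollama_api_reachable() {
  curl --silent --fail --max-time 5 "$OLLAMA_URL/" >/dev/null
}
```

Running `ollama_container_status` should list the ollama container, and `ollama_api_reachable && echo "API is up"` should print the message once the server is listening.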

Step 2: Access the Chatbot Interface

Once the container is running, use this command to run Ollama inside it. On first use it downloads the Llama model you name, then opens an interactive chat prompt:

docker exec -it ollama ollama run llama2

You can use either llama2 or llama3; replace the model name in the command with the one you want to run.

Congratulations! You now have a locally running AI chatbot.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1713915519360/84ba269a-511c-4401-b7d5-7f07e520a219.png)

Further Exploration: Dive into the Ollama documentation to discover how to use the API and experiment with other LLM models for your projects.
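As a starting point for the API, here is a minimal sketch that sends a single prompt to Ollama's /api/generate endpoint with curl. The helper names are hypothetical, and it assumes the container from Step 1 is running and the model has already been downloaded. The JSON builder is deliberately naive and does not escape quotes or backslashes.

```shell
# Base URL of the Ollama server; override by exporting OLLAMA_URL first.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: build the JSON body for POST /api/generate.
# $1 = model name, $2 = prompt. "stream": false asks for one JSON reply
# instead of a stream of partial responses.
# Caveat: arguments are inserted as-is, so avoid quotes or backslashes.
build_generate_body() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Hypothetical helper: send one prompt and print the raw JSON response.
ollama_generate() {
  curl --silent --fail "$OLLAMA_URL/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(build_generate_body "$1" "$2")"
}
```

For example, `ollama_generate llama2 "Why is the sky blue?"` should print a JSON object whose response field holds the generated text.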

Reference Documentation: For more detailed information, refer to the Ollama Docker Image on Docker Hub.

ollama/ollama - Docker Image | Docker Hub

ollama/ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language models. (github.com)

Key Takeaways

Running LLMs locally is now simple: one Docker command to spin up Ollama, a second to run it — no compute cluster needed
Ollama defaults to a 4096-token context window for most models; running a model by name alone uses that default, so no explicit setting is needed
Swap models by running docker exec -it ollama ollama run {modelname} — the same container serves any model Ollama supports
Access the Ollama API at http://localhost:11434 for programmatic model interaction, or browse the full model library at Ollama
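If the default context window is too small for a project, the API accepts a per-request override through the num_ctx field under options. The sketch below is an illustration under assumptions, not a recommended configuration: the helper names are hypothetical, and 8192 in the usage example is an arbitrary value that your model and hardware may or may not handle.

```shell
# Base URL of the Ollama server; override by exporting OLLAMA_URL first.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: build a /api/generate body that sets the context
# window via options.num_ctx.
# $1 = model name, $2 = prompt, $3 = context length in tokens.
# Caveat: arguments are inserted as-is, so avoid quotes or backslashes.
build_body_with_ctx() {
  printf '{"model":"%s","prompt":"%s","stream":false,"options":{"num_ctx":%d}}' "$1" "$2" "$3"
}

# Hypothetical helper: send the prompt with the overridden context window
# and print the raw JSON response.
ollama_generate_with_ctx() {
  curl --silent --fail "$OLLAMA_URL/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(build_body_with_ctx "$1" "$2" "$3")"
}
```

A call such as `ollama_generate_with_ctx llama3 "Hello" 8192` applies the larger window to that request only; other requests keep the default.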


Last updated on Apr 23, 2024
