Run your own AI LLM in two commands

Apr 23, 2024 · Derek Armstrong - Staff Software Engineer and Solutions Architect

Run Your Own AI Chatbot Locally with Meta’s Llama Model

Ever wanted to have your own AI chatbot running locally? With Meta’s Llama model and Docker, you can set it up in just a few steps. Here’s how:

Prerequisites: Ensure Docker is installed on your machine. If you need to install Docker, follow the straightforward guide available at the Docker Docs.

Install Docker Engine | Docker Docs

Step 1: Set Up the Docker Container

Open your terminal and run the following command to create and start the Ollama container:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

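This command pulls the ollama/ollama image if it is not already present and starts it as a detached container named ollama. The -p 11434:11434 flag publishes the Ollama API port to the host, and -v ollama:/root/.ollama mounts a named volume so that downloaded models persist when the container is recreated.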
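Before moving on, it can help to confirm that the server inside the container is answering on the published port. The following is a minimal sketch, assuming curl is installed on the host and that the Ollama server responds on its root URL once it is up. The names wait_for_ollama and OLLAMA_URL are hypothetical helpers for this post, not part of Ollama or Docker.

```shell
# Base URL of the Ollama API; matches the -p 11434:11434 mapping above.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Poll the server once per second.
# Returns 0 as soon as it answers, 1 if it never answers within the given attempts.
wait_for_ollama() {
  attempts="${1:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; output is discarded, only the exit code matters.
    if curl -fsS "$OLLAMA_URL/" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_ollama 30 && echo "Ollama is up"` waits up to roughly 30 seconds. If it fails, `docker ps` and `docker logs ollama` are the usual places to look.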

Step 2: Access the Chatbot Interface

Once the container is running, use this command to run the Ollama CLI inside it. The command downloads the model if it is not already present, loads it, and opens an interactive chat prompt:

docker exec -it ollama ollama run llama2

You can use either llama2 or llama3 here, depending on which model you want to run.
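If you switch models often, the command above can be wrapped in a few small shell functions. This is only a sketch: list_models, pull_model, chat_with, and the CONTAINER variable are hypothetical names chosen for this post, and the container is assumed to be named ollama as in Step 1. The ollama subcommands used (list, pull, run) are part of the Ollama CLI.

```shell
# Name of the container created in Step 1.
CONTAINER="${CONTAINER:-ollama}"

# Show the models already downloaded into the container's volume.
list_models() {
  docker exec "$CONTAINER" ollama list
}

# Download a model without starting a chat session.
pull_model() {
  docker exec "$CONTAINER" ollama pull "$1"
}

# Open an interactive chat with the given model (llama3 if none is given).
chat_with() {
  docker exec -it "$CONTAINER" ollama run "${1:-llama3}"
}
```

With these defined, `chat_with llama2` behaves like the Step 2 command, and `pull_model llama3` fetches a model ahead of time so the first chat starts without a download.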

Congratulations! You now have a locally running AI chatbot.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1713915519360/84ba269a-511c-4401-b7d5-7f07e520a219.png align="center")

Further Exploration: Dive into the Ollama documentation to discover how to use the API and experiment with other LLMs for your projects.
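As a starting point for the API, here is a hedged sketch of a single non-streaming request to the /api/generate endpoint on the port published in Step 1. It assumes curl and sed are available on the host and that the model has already been downloaded. The function names and OLLAMA_URL are hypothetical, and the escaping helper only handles backslashes and double quotes, so multi-line prompts would need a proper JSON tool.

```shell
# Base URL of the Ollama API published by the container.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Escape backslashes and double quotes so the text can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body: model name, prompt, and streaming disabled.
build_generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request; the reply is a JSON object printed to stdout.
generate() {
  curl -sS "$OLLAMA_URL/api/generate" -d "$(build_generate_payload "$1" "$2")"
}
```

Calling `generate llama2 "Why is the sky blue?"` should print one JSON object whose response field holds the model's answer. Setting stream to false keeps the output to a single object instead of a sequence of partial ones.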

Reference Documentation: For more detailed information, refer to the Ollama Docker Image on Docker Hub.

ollama/ollama - Docker Image | Docker Hub

ollama/ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language models. (github.com)

Key Takeaways

  • Running LLMs locally is now simple: one Docker command to start the Ollama container, a second to download and run a model, with no compute cluster needed
  • Ollama applies a default context window when you run a model by name alone (4096 tokens for most models, though the default has varied between Ollama releases), so you do not need to set it explicitly to get started
  • Swap models by running docker exec -it ollama ollama run {modelname}; the same container serves any model Ollama supports
  • Access the Ollama API at http://localhost:11434 for programmatic model interaction, or browse the full model library on the Ollama website
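When the default context window is not what you want, the API accepts a per-request override through the num_ctx option. The sketch below assumes curl is available, the model is already downloaded, and the prompt is plain text with no double quotes or backslashes (it is not escaped here). The function names and OLLAMA_URL are hypothetical, and 4096 is used as the fallback only to mirror the figure quoted above.

```shell
# Base URL of the Ollama API published by the container.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build a request body that sets the context window for this call only.
# Arguments: model, prompt, num_ctx (optional, falls back to 4096).
build_payload_with_ctx() {
  model="$1"
  prompt="$2"
  num_ctx="${3:-4096}"
  # Reject anything that is not a plain run of digits.
  case "$num_ctx" in
    '' | *[!0-9]*)
      echo "num_ctx must be a whole number" >&2
      return 1
      ;;
  esac
  printf '{"model":"%s","prompt":"%s","stream":false,"options":{"num_ctx":%s}}' \
    "$model" "$prompt" "$num_ctx"
}

# Send the request with the chosen context window.
generate_with_ctx() {
  body="$(build_payload_with_ctx "$1" "$2" "$3")" || return 1
  curl -sS "$OLLAMA_URL/api/generate" -d "$body"
}
```

For example, `generate_with_ctx llama3 "Summarize this text" 8192` asks for an 8192-token window. Larger values use more memory, so whether a given size works depends on the model and the machine.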

