Ollama REST API
Number of APIs: 13
Welcome to the Ollama Collection! This collection was created to get you started with running Ollama APIs locally and quickly. It provides a comprehensive set of examples to help you utilize Ollama APIs based on the official Ollama API docs.
Note: This collection is part of a workspace curated by the Qodex team to help you explore and work with useful APIs. Learn how to contribute to this collaborative space and its collections [here]
Run Ollama Locally
Ollama lets you run powerful large language models (LLMs) locally on your machine and exposes a REST API on localhost for interacting with them. Before starting, you must download Ollama and the models you want to use. We'll go through this step by step below.
Step 1: Fork the collection
Fork the collection manually or by clicking the Run in Qodex button below
Step 2: Download and install Ollama
Download and install Ollama for macOS, Linux, or Windows.
Step 3: Download the models
Download a model from the Ollama library to your local machine. To start, we'll download Llama 3.1 using the ollama pull command. In your favorite terminal, run $ ollama pull llama3.1:latest (this will take a little while; the smallest Llama 3.1 model is >4 GB). Repeat this step for the following models:
ollama pull codellama:code
ollama pull mistral
ollama pull llama3.2
See all models available via the Ollama library.
(Optional) If you want to run and interact with llama3.1:latest in the terminal, run $ ollama run llama3.1:latest and ask it a question.
Step 4: Start the Ollama server
To run the API and use it in Qodex, run ollama serve in your terminal to start a new server. If you get an error stating that port 11434 is already in use, step 3 may have already started the server for you. By default, Ollama listens on port 11434.
Run ollama list in your terminal to see if it is running. If it is, a list of your downloaded models will be returned. NOTE: If you update Ollama to listen on a different port, be sure to update the baseUrl collection-level variable accordingly.
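Another quick way to check that the server is up (assuming the default port) is to hit the base URL directly with curl; a running server should reply with a short "Ollama is running" message:

```
# A running Ollama server answers plain HTTP requests on its base URL.
curl http://localhost:11434
```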
Step 5: Make your first API request
Since you are running the LLMs locally, depending on your hardware, API requests may take some time to complete.
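For a first request, a minimal sketch with curl against the Completion endpoint looks like this (assuming the default baseUrl of http://localhost:11434 and the llama3.2 model pulled in step 3):

```
# Single-turn completion; "stream": false returns one JSON object
# instead of a stream of partial chunks.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

The generated text comes back in the response field, alongside timing and token-count metadata.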
Streaming
Some example requests are labeled streaming or no streaming. For example, try running the Completion Generate (streaming) request: it will stream the response. A post-request script collects the stream and outputs it with console.log. The only difference between streaming and non-streaming is including the param "stream": false in the request body.
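As a sketch of the difference (same assumptions as the example above), dropping "stream": false gives you the streaming behavior: the answer arrives as a series of JSON objects, one per line, with the final object carrying "done": true.

```
# Streaming is the default: each line is a JSON chunk with a partial "response".
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'
```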
There are three folders in this Collection:
- Completion - Generate text completions from a local model using the /generate endpoint (used for single-turn text generation). It generates output based on the input prompt without maintaining context or conversation history.
- Model - You can think of the APIs in this folder as admin APIs used for managing the local models on your machine.
- Chat - Generate text completions from a local model using the /chat endpoint (used for multi-turn conversations). It maintains conversation history, allowing for more interactive, dynamic exchanges (see the sketch after this list).
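To illustrate the Chat folder, here is a minimal multi-turn sketch against /api/chat (default baseUrl assumed; the conversation content is just an example). The earlier exchange is passed back in the messages array, which is what gives the model its context:

```
# Multi-turn chat: prior user/assistant messages supply the conversation history.
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" },
    { "role": "assistant", "content": "Mostly because of Rayleigh scattering." },
    { "role": "user", "content": "How does that change at sunset?" }
  ],
  "stream": false
}'
```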
Getting Support
We want you to get the best support you can when working with this workspace. If you're stuck and need help with Ollama-specific issues, we recommend that you explore the following channels.
For Qodex-specific questions or feedback about this workspace:
- [Qodex's Community Forum] - Provide feedback and ask questions about this workspace, ask general Qodex questions, understand how to use a feature, how to build a workflow, etc.
For Qodex-specific issues:
- Qodex GitHub Issues - Submit feature requests, bug reports, etc. here
Troubleshooting
Error: command not found: ollama
- Solution: If you've downloaded the app, you likely need to move it out of the Downloads folder into your Applications (or similar) folder.

Error: listen tcp 127.0.0.1:11434: bind: address already in use after running ollama serve
- Solution: Run $ export OLLAMA_HOST=127.0.0.1:3000, then run ollama serve again.
- NOTE: If you update Ollama to listen on a different port, be sure to update the baseUrl collection-level variable accordingly.
- Solution: ollama serve --help is your best friend.
- Completion - Request with Image: POST {{baseUrl}}/api/generate
- Model - use created model: POST {{baseUrl}}/api/generate
- Model - create model: POST {{baseUrl}}/api/create
- Model - list local models: GET {{baseUrl}}/api/tags
- Model - show model: POST {{baseUrl}}/api/show
- Model - copy a model: POST {{baseUrl}}/api/copy
- Model - delete a model: DELETE {{baseUrl}}/api/delete
- Model - pull a model: orca-mini: POST {{baseUrl}}/api/pull
- Model - push a model: POST {{baseUrl}}/api/push
- Model - generate embedding: POST {{baseUrl}}/api/embed
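For reference, the Model requests above map directly onto curl calls as well; two quick examples (default baseUrl assumed, /api/embed available on recent Ollama versions):

```
# List the models installed locally (the same data "ollama list" prints).
curl http://localhost:11434/api/tags

# Generate an embedding vector for a piece of text.
curl http://localhost:11434/api/embed -d '{
  "model": "llama3.2",
  "input": "The sky is blue because of Rayleigh scattering."
}'
```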