Commit graph

187 commits

Author SHA1 Message Date
Patrick Devine 1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
Patrick Devine 47cfe58af5
Default Keep Alive environment variable (#3094)
---------

Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com>
2024-03-13 13:29:40 -07:00
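For illustration, a minimal sketch of starting the server with a default keep-alive supplied through the environment. The variable name OLLAMA_KEEP_ALIVE is an assumption inferred from the commit title (the commit does not name it here), and "10m" is an arbitrary example value.

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	// Assumed variable name: OLLAMA_KEEP_ALIVE; the commit title only
	// says "Default Keep Alive environment variable".
	cmd := exec.Command("ollama", "serve")
	cmd.Env = append(os.Environ(), "OLLAMA_KEEP_ALIVE=10m")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```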
Jeffrey Morgan 3b4bab3dc5
Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
Ikko Eltociear Ashimine e95b896790
Update types.go (#2744)
specfied -> specified
2024-02-25 13:41:25 -05:00
Michael Yang 897b213468
use http.DefaultClient (#2530)
default client already handles proxy
2024-02-20 18:34:47 -05:00
bnorick caf2b13c10
Fix infinite keep_alive (#2480) 2024-02-13 15:40:32 -08:00
Patrick Devine b5cf31b460
add keep_alive to generate/chat/embedding api endpoints (#2146) 2024-01-26 14:28:02 -08:00
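A hedged sketch of the per-request parameter this commit adds, shown against the generate endpoint. The endpoint path and field name come from the commit; the model name and duration are illustrative. Per the public API docs, a duration string keeps the model loaded that long, 0 unloads it immediately, and a negative value keeps it loaded indefinitely (cf. the infinite keep_alive fix above).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// keep_alive controls how long the model stays in memory after the
	// request completes.
	body := []byte(`{"model": "llama2", "prompt": "hi", "stream": false, "keep_alive": "5m"}`)
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```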
Patrick Devine 7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Michael Yang 745b5934fa add model to ModelResponse 2024-01-18 14:32:55 -08:00
Michael Yang a38d88d828 api: add model for all requests
prefer using req.Model and fall back to req.Name
2024-01-18 14:31:37 -08:00
Michael Yang 5ffbbea1d7 remove client.py 2024-01-11 15:53:10 -08:00
Patrick Devine 22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
Brian Murray 0d6e3565ae
Add embeddings to API (#1773) 2024-01-04 15:00:52 -05:00
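A sketch of calling the embeddings route this commit exposes. The field names ("model", "prompt", "embedding") follow the public API docs; the model name is illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type embeddingRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type embeddingResponse struct {
	Embedding []float64 `json:"embedding"`
}

func main() {
	payload, err := json.Marshal(embeddingRequest{Model: "llama2", Prompt: "The sky is blue"})
	if err != nil {
		panic(err)
	}
	resp, err := http.Post("http://localhost:11434/api/embeddings",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var out embeddingResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("embedding length:", len(out.Embedding))
}
```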
Jeffrey Morgan 55978c1dc9 clean up cache api option 2023-12-27 14:27:45 -05:00
Jeffrey Morgan d4ebdadbe7 enable cache_prompt by default 2023-12-27 14:23:42 -05:00
K0IN 10da41d677
Add Cache flag to api (#1642) 2023-12-22 17:16:20 -05:00
Bruce MacDonald d99fa6ce0a
send empty messages on last chat response (#1530) 2023-12-18 14:23:38 -05:00
Patrick Devine d9e60f634b
add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
Patrick Devine 910e9401d0
Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan 9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
Michael Yang c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Michael Yang 5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
Bruce MacDonald 195e3d9dbd
chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
Michael Yang 0db4706ec2 api: add version api handler 2023-12-05 09:36:01 -08:00
Jeffrey Morgan 00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Bruce MacDonald 7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
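A sketch of the messages endpoint this commit introduces (and the commit above restores). The role/content shape follows the public API docs; the model name is illustrative. By default the endpoint streams one JSON object per line.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// messages replaces the deprecated context/template generate
	// parameters mentioned in the commit body.
	body := []byte(`{
		"model": "llama2",
		"messages": [{"role": "user", "content": "why is the sky blue?"}]
	}`)
	resp, err := http.Post("http://localhost:11434/api/chat",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		fmt.Println(scanner.Text()) // one partial response per line
	}
}
```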
Patrick Devine cde31cb220
Allow setting parameters in the REPL (#1294) 2023-11-29 09:56:42 -08:00
Bruce MacDonald 928950fcc6
update python client create example (#1227)
* add remote create to python example client
2023-11-27 15:36:19 -05:00
Michael Yang bc22d5a38b no blob response 2023-11-15 15:16:23 -08:00
Michael Yang 1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang 1552cee59f client create modelfile 2023-11-15 15:16:23 -08:00
Michael Yang 3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
Jeffrey Morgan cdddd3df65 add format to example python client 2023-11-10 10:22:21 -08:00
Jeffrey Morgan 5cba29b9d6
JSON mode: add `"format": "json"` as an api parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
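A sketch of JSON mode in use. Per the commit, "json" is the supported format value; pairing it with a prompt that explicitly asks for JSON is the usual pattern. Model name and prompt are illustrative.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// "format": "json" constrains the model output to valid JSON.
	body := []byte(`{"model": "llama2", "prompt": "List three colors as a JSON array.", "format": "json", "stream": false}`)
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```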
Bruce MacDonald a49d6acc1e
add a complete /generate options example (#1035) 2023-11-08 16:44:36 -08:00
Bruce MacDonald ec2a31e9b3
support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
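A sketch of a raw request. With raw set, the server skips Modelfile prompt formatting and omits the response context, so the caller supplies the fully formatted prompt; the [INST] tags here illustrate a llama-style template and are not required by the API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// raw: true bypasses prompt formatting and response context, per the
	// commit description.
	body := []byte(`{"model": "llama2", "prompt": "[INST] why is the sky blue? [/INST]", "raw": true, "stream": false}`)
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```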
Jeffrey Morgan 17678b7225 Restore system prompt on requests and default num_keep to 0 2023-11-03 13:25:25 -07:00
Jeffrey Morgan 06589a3b30
Set NumKeep to 4 by default (#982) 2023-11-02 17:26:11 -07:00
Michael Yang 1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Michael Yang 6db3691b8f update default NumKeep 2023-11-02 15:47:35 -07:00
Michael Yang 60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
Bruce MacDonald 5c3491f425
allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
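A standalone sketch of the lookup pattern this commit describes: prefer OLLAMA_MODELS when set, otherwise fall back to a default under the home directory. The fallback path shown is an assumption for illustration, not taken from the commit.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// modelsDir mirrors the described behavior: OLLAMA_MODELS, if set in the
// environment ollama runs in, decides where model files are stored.
func modelsDir() (string, error) {
	if dir := os.Getenv("OLLAMA_MODELS"); dir != "" {
		return dir, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".ollama", "models"), nil // assumed default
}

func main() {
	dir, err := modelsDir()
	if err != nil {
		panic(err)
	}
	fmt.Println("model files stored under:", dir)
}
```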
Michael Yang 28c3f288e2 client: fix trailing slash 2023-10-26 11:09:38 -07:00
Michael Yang 459f4a7889 fix: ollama host for hostname 2023-10-20 11:32:41 -07:00
Bruce MacDonald fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Michael Yang 92189a5855 fix memory check 2023-10-13 14:47:29 -07:00
Bruce MacDonald 6fe178134d
improve api error handling (#781)
- remove new lines from llama.cpp error messages relayed to client
- check api option types and return error on wrong type
- change num layers from 95% VRAM to 92% VRAM
2023-10-13 16:57:10 -04:00
Bruce MacDonald 7804b8fab9
validate api options fields from map (#711) 2023-10-12 11:18:11 -04:00
Michael Yang b599946b74 add format bytes 2023-10-11 14:08:23 -07:00
Bruce MacDonald 274d5a5fdf
optional parameter to not stream response (#639)
* update streaming request accept header
* add optional stream param to request bodies
2023-10-11 12:54:27 -04:00
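A sketch of the two modes this commit adds a switch for: the endpoints stream newline-delimited JSON by default, and "stream": false asks for a single response object instead. The "response" and "done" field names follow the public API docs; model name and prompt are illustrative.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Partial response shape for the streaming case.
type chunk struct {
	Response string `json:"response"`
	Done     bool   `json:"done"`
}

func main() {
	// Streaming is the default; send "stream": false to receive one
	// complete object instead of newline-delimited partials.
	body := []byte(`{"model": "llama2", "prompt": "why is the sky blue?"}`)
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var c chunk
		if err := json.Unmarshal(scanner.Bytes(), &c); err != nil {
			panic(err)
		}
		fmt.Print(c.Response)
		if c.Done {
			fmt.Println()
		}
	}
}
```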
Michael Yang 2cfffea02e handle client proxy 2023-10-09 12:33:47 -07:00
Bruce MacDonald 2130c0708b
output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
Bruce MacDonald 9e2de1bd2c
increase streaming buffer size (#692) 2023-10-04 14:09:00 -04:00
Bruce MacDonald 1fbf3585d6
Relay default values to llama runner (#672)
* include seed in params for llama.cpp server and remove empty filter for temp

* relay default predict options to llama.cpp

- reorganize options to match predict request for readability

* omit empty stop

---------

Co-authored-by: hallh <hallh@users.noreply.github.com>
2023-10-02 14:53:16 -04:00
Bruce MacDonald a1b2d95f96
remove unused push/pull params (#650) 2023-09-29 17:27:19 -04:00
Michael Yang f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Patrick Devine 8efbc5df55
DRAFT: add a simple python client to access ollama (#522) 2023-09-14 16:37:38 -07:00
Bruce MacDonald f221637053
first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine 790d24eb7b
add show command (#474) 2023-09-06 11:04:17 -07:00
Michael Yang 0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00
Bruce MacDonald 42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Michael Yang 982c535428
Merge pull request #428 from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
Patrick Devine 8bbff2df98
add model IDs (#439) 2023-08-28 20:50:24 -07:00
Michael Yang 246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
Jeffrey Morgan 22ab7f5f88 default host to 127.0.0.1, fixes #424 2023-08-26 11:59:28 -07:00
Michael Yang 2c7f956b38 add version 2023-08-22 09:40:58 -07:00
Michael Yang f723bf0879 ignore nil map values 2023-08-17 15:50:46 -07:00
Jeffrey Morgan 54bb49a502 parse protocol for OLLAMA_HOST 2023-08-17 18:20:44 -04:00
Jeffrey Morgan 5ee6116420 set default OLLAMA_HOST to http://localhost:11434 2023-08-16 12:22:59 -04:00
Blake Mizerany 67e593e355
cmd: support OLLAMA_CLIENT_HOST environment variable (#262)
* cmd: support OLLAMA_HOST environment variable

This commit adds support for the OLLAMA_HOST environment
variable. This variable can be used to specify the host to which
the client should connect. This is useful when the client is
running somewhere other than the host where the server is running.

The new api.FromEnv function is used to configure clients from the
environment. Clients that want to stay consistent with the Ollama CLI's
environment variable handling can use this new function.

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-16 11:03:48 -04:00
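A standalone sketch of what the described api.FromEnv behavior amounts to. The real function lives in the api package; this version only mirrors the stated behavior: honor OLLAMA_HOST when set, otherwise fall back to the default established in the commit just above (http://localhost:11434).

```go
package main

import (
	"fmt"
	"os"
)

// hostFromEnv mirrors what the commit describes api.FromEnv doing: read
// OLLAMA_HOST so a client can reach a server running somewhere else.
func hostFromEnv() string {
	if host := os.Getenv("OLLAMA_HOST"); host != "" {
		return host
	}
	return "http://localhost:11434" // default from the commit above
}

func main() {
	fmt.Println("connecting to:", hostFromEnv())
}
```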
Michael Yang f27bc261cf s/parmeter/parameter/ 2023-08-10 16:26:06 -07:00
Michael Yang 81d8d7b73f fix could not convert int 2023-08-10 16:24:17 -07:00
Patrick Devine be989d89d1
Token auth (#314) 2023-08-10 11:34:25 -07:00
Bruce MacDonald 4b3507f036 embeddings endpoint
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
Bruce MacDonald 7a5f3616fd
embed text document in modelfile 2023-08-09 10:26:19 -04:00
Bruce MacDonald 21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Michael Yang f2074ed4c0
Merge pull request #306 from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Jeffrey Morgan 8713ac23a8 allow overriding template and system in /api/generate
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
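A sketch of the per-request override this commit allows: "system" (and likewise "template") replaces the value baked into the model's Modelfile for that one request. Model name, system text, and prompt are illustrative.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// The "system" field overrides the Modelfile SYSTEM instruction for
	// this request only; "template" can be overridden the same way.
	body := []byte(`{
		"model": "llama2",
		"system": "You are a terse assistant. Answer in one sentence.",
		"prompt": "why is the sky blue?"
	}`)
	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body) // streamed lines, read to completion
	fmt.Println(string(out))
}
```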
Michael Yang 4dc5b117dd automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. If left at its default value of -1, the server will calculate
num_keep to be the length of the system instructions.
2023-08-07 16:19:12 -07:00
Michael Yang b9f4d67554 configurable rope frequency parameters 2023-08-03 22:11:58 -07:00
Bruce MacDonald 8b1e791820
allow specifying zero values in modelfile 2023-08-02 17:07:53 -04:00
Bruce MacDonald 8f8b6288ac
check server is running before running command 2023-08-02 10:51:23 -04:00
Bruce MacDonald 765994362c use head to check heartbeat 2023-08-01 14:50:38 -04:00
Bruce MacDonald 1c5a8770ee read runner parameter options from map
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Jeffrey Morgan 528bafa585 cache loaded model 2023-08-01 11:24:18 -04:00
Bruce MacDonald e72fe7945f check server is running before running command 2023-07-31 16:25:57 -04:00
Bruce MacDonald 184ad8f057 allow specifying stop conditions in modelfile 2023-07-28 11:02:04 -04:00
Jeffrey Morgan 822a0e36eb lower batch size to 512 2023-07-28 10:56:21 -04:00
Michael Yang fadf75f99d add stop conditions 2023-07-27 17:00:47 -07:00
Michael Yang ad3a7d0e2c add NumGQA 2023-07-27 14:05:11 -07:00
Jeffrey Morgan 688661ab9b increase default batch size to 1024 2023-07-27 16:51:01 -04:00
Michael Yang cca61181cb sample metrics 2023-07-27 09:31:44 -07:00
Michael Yang c490416189 lock on llm.lock(); decrease batch size 2023-07-27 09:31:44 -07:00
Michael Yang f62a882760 add session expiration 2023-07-27 09:31:44 -07:00
Michael Yang 3003fc03fc update predict code 2023-07-27 09:31:44 -07:00
Michael Yang 32aec66e6a add load duration 2023-07-27 09:31:44 -07:00
Michael Yang 35af37a2cb session id 2023-07-27 09:31:44 -07:00
Bruce MacDonald 4c1caa3733 download models when creating from modelfile 2023-07-25 14:25:13 -04:00
Bruce MacDonald 536028c35a better error message when model not found on pull 2023-07-24 17:48:17 -04:00
Patrick Devine 4cb42ca55e
add copy command (#191) 2023-07-24 11:27:28 -04:00