478 commits

Author SHA1 Message Date
Jeffrey Morgan 453f572f83
Initial OpenAI /v1/chat/completions API compatibility (#2376) 2024-02-07 17:24:29 -05:00
Michael Yang e805ac1d59 fix response on token error 2024-02-07 11:05:49 -08:00
Michael Yang bfbf2f7cf7
Merge pull request #2296 from ollama/mxyng/img-tags
append image tags to user content
2024-02-01 13:16:59 -08:00
Michael Yang 3d6f48507a structured debug prompt 2024-02-01 11:56:28 -08:00
Michael Yang f3761405c8 use image id 2024-02-01 11:52:42 -08:00
Michael Yang e49dc9f3d8 fix tests 2024-02-01 11:48:11 -08:00
Michael Yang d125510b4b remove image tags 2024-02-01 11:32:51 -08:00
Michael Yang fb56988014 account for image projection in token count 2024-02-01 09:50:48 -08:00
Michael Yang d046bee790 use llm.ImageData for chat 2024-01-31 19:18:25 -08:00
Jeffrey Morgan f11bf0740b use llm.ImageData 2024-01-31 19:13:48 -08:00
Michael Yang 8450bf66e6 trim images 2024-01-31 19:13:47 -08:00
Michael Yang b4e11be8ef append image tags to user content 2024-01-31 19:13:10 -08:00
Bruce MacDonald a896079705
preserve last system message from modelfile (#2289) 2024-01-31 21:45:01 -05:00
Michael Yang 8ac08a0eec update slog handler options
- consistent format by using text handler for debug and non-debug
- truncate source file to just the file name
2024-01-31 15:15:00 -08:00
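
The two bullet points in the slog commit above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: `newLogger` and `render` are hypothetical names, and the sketch only assumes the standard `log/slog` package (Go 1.21 or later). It uses the text handler at both levels and rewrites the source attribute so only the file's base name is printed.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"path/filepath"
)

// newLogger is a hypothetical helper. It always uses the text handler and
// only varies the level, so debug and non-debug output share one format.
func newLogger(w io.Writer, debug bool) *slog.Logger {
	level := slog.LevelInfo
	if debug {
		level = slog.LevelDebug
	}
	h := slog.NewTextHandler(w, &slog.HandlerOptions{
		Level:     level,
		AddSource: true,
		// Truncate the source file to just the file name.
		ReplaceAttr: func(_ []string, a slog.Attr) slog.Attr {
			if a.Key == slog.SourceKey {
				if src, ok := a.Value.Any().(*slog.Source); ok {
					src.File = filepath.Base(src.File)
				}
			}
			return a
		},
	})
	return slog.New(h)
}

// render logs a single info record and returns what the handler wrote.
func render(debug bool) string {
	var buf bytes.Buffer
	newLogger(&buf, debug).Info("model loaded")
	return buf.String()
}

func main() {
	fmt.Print(render(false))
	fmt.Print(render(true))
}
```

Both calls print a line of the same shape, with a `source=` field that holds a file name and line number instead of a full path.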
Michael Yang c8b1f2369e remove unnecessary parse raw 2024-01-30 17:00:53 -08:00
Bruce MacDonald 0632dff3f8
trim chat prompt based on llm context size (#1963) 2024-01-30 15:59:29 -05:00
Jeffrey Morgan f2245c7c77
print prompt with OLLAMA_DEBUG=1 (#2245) 2024-01-28 15:22:35 -08:00
Jeffrey Morgan e4b9b72f2a
Do not repeat system prompt for chat templating (#2241) 2024-01-28 14:15:56 -08:00
Patrick Devine b5cf31b460
add keep_alive to generate/chat/embedding api endpoints (#2146) 2024-01-26 14:28:02 -08:00
Michael Yang 9d3dcfd0ec fix logging 2024-01-26 11:04:27 -08:00
Michael Yang 6e0ea5ecc8
Merge pull request #1916 from ollama/mxyng/inactivity-monitor
download: add inactivity monitor
2024-01-26 10:56:00 -08:00
Patrick Devine 7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Michael Yang c08dfaa23d fix: remove overwritten model layers
if create overrides a manifest, first add the older manifest's layers to
the delete map so they can be cleaned up
2024-01-19 14:58:37 -08:00
Michael Yang aac9ab4db7 fix show handler 2024-01-18 15:36:50 -08:00
Michael Yang 745b5934fa add model to ModelResponse 2024-01-18 14:32:55 -08:00
Michael Yang a38d88d828 api: add model for all requests
prefer using req.Model and fallback to req.Name
2024-01-18 14:31:37 -08:00
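
The fallback described in the commit above (prefer `req.Model`, fall back to `req.Name`) amounts to something like the sketch below. The `request` type and `modelName` function are hypothetical stand-ins for illustration; the real request structs are not shown in this log.

```go
package main

import "fmt"

// request is a hypothetical stand-in for an API request that carries both
// the newer Model field and the older Name field.
type request struct {
	Model string
	Name  string
}

// modelName prefers Model and falls back to Name when Model is empty.
func modelName(r request) string {
	if r.Model != "" {
		return r.Model
	}
	return r.Name
}

func main() {
	fmt.Println(modelName(request{Model: "llama2", Name: "ignored"}))
	fmt.Println(modelName(request{Name: "mistral"}))
}
```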
Daniel Hiltgen fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Michael Yang 96cfb62641 fix: normalize name path before splitting 2024-01-16 16:48:29 -08:00
Patrick Devine eef50accb4
Fix show parameters (#2017) 2024-01-16 10:34:44 -08:00
Michael Yang 27331ae3a8 download: add inactivity monitor
if a download part is inactive for some time, restart it
2024-01-12 15:23:15 -08:00
Michael Yang cf29bd2d72 fix: request retry with error
this fixes a subtle bug with makeRequestWithRetry where an HTTP status
error on a retried request will potentially not return the right err
2024-01-12 13:32:27 -08:00
Michael Yang 2b9892a808 fix(windows): modelpath and list 2024-01-09 09:36:58 -08:00
Michael Yang 2bb2bdd5d4 fix lint 2024-01-09 09:36:58 -08:00
Michael Yang acfc376efd add .golangci.yaml 2024-01-09 09:36:58 -08:00
Bruce MacDonald 7e8f7c8358
remove ggml automatic re-pull (#1856) 2024-01-08 14:41:01 -05:00
Michael Yang 0101e76dbe
Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05
fix: allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 17:20:09 -08:00
Patrick Devine 22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
Nicholas Dudfield 8baaaa39c0 Allow extension origins (still needs explicit listing), fixes #1686 2024-01-05 09:06:47 +07:00
Bruce MacDonald 4ad6c9b11f
fix: pull either original model or from model on create (#1774) 2024-01-04 01:34:38 -05:00
Bruce MacDonald 0b3118e0af
fix: relay request opts to loaded llm prediction (#1761) 2024-01-03 12:01:42 -05:00
Daniel Hiltgen 697bea6939 Guard integration tests with a tag
This should help CI avoid running the integration test logic in a
container where it's not currently possible.
2023-12-22 16:33:27 -08:00
Bruce MacDonald db356c8519
post-response templating (#1427) 2023-12-22 17:07:05 -05:00
Daniel Hiltgen 96fb441abd
Merge pull request #1146 from dhiltgen/ext_server_cgo
Add cgo implementation for llama.cpp
2023-12-22 08:16:31 -08:00
Michael Yang 63aac0edc5 fix(test): use real version string for comparison 2023-12-19 15:03:02 -08:00
Daniel Hiltgen 51082535e1 Add automated test for multimodal
A simple test case that verifies llava:7b can read text in an image
2023-12-19 09:05:46 -08:00
Daniel Hiltgen 35934b2e05 Adapted rocm support to cgo based llama.cpp 2023-12-19 09:05:46 -08:00
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald 5e7fd6906f Update images.go 2023-12-19 09:05:46 -08:00
Bruce MacDonald 811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Bruce MacDonald d99fa6ce0a
send empty messages on last chat response (#1530) 2023-12-18 14:23:38 -05:00
Patrick Devine 3948c6ea06
add magic header for unit tests (#1558) 2023-12-18 10:41:02 -08:00
Patrick Devine 86b0dd4b16
add API create/copy handlers (#1541) 2023-12-15 11:59:18 -08:00
Patrick Devine 0174665d0e
add API tests for list handler (#1535) 2023-12-14 18:18:25 -08:00
Patrick Devine 630518f0d9
Add unit test of API routes (#1528) 2023-12-14 16:47:40 -08:00
Bruce MacDonald 6ee8c80199
restore model load duration on generate response (#1524)
* restore model load duration on generate response

- set model load duration on generate and chat done response
- calculate createAt time when response created

* remove checkpoints predict opts

* Update routes.go
2023-12-14 12:15:50 -05:00
Jeffrey Morgan 4a1abfe4fa fix tests 2023-12-13 14:42:30 -05:00
Patrick Devine d9e60f634b
add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
Jeffrey Morgan 0a9d348023
Fix issues with /set template and /set system (#1486) 2023-12-12 14:43:19 -05:00
Patrick Devine 910e9401d0
Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan 7db5bcf73b fix go-staticcheck warning 2023-12-10 11:44:27 -05:00
Jeffrey Morgan fa2f095bd9 fix model name returned by /api/generate being different than the model name provided 2023-12-10 11:42:15 -05:00
Jeffrey Morgan 045b855db9 fix error on accumulating final chat response 2023-12-10 11:24:39 -05:00
Jeffrey Morgan 32064a0646 fix empty response when receiving runner error 2023-12-10 10:53:38 -05:00
Jeffrey Morgan 9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
Bruce MacDonald 7e9405fd07
fix: encode full previous prompt in context (#1424) 2023-12-08 16:53:51 -05:00
Bruce MacDonald 3b0b8930d4
fix: only flush template in chat when current role encountered (#1426) 2023-12-08 16:44:24 -05:00
Bruce MacDonald e3f925fc1b
fix: restore modelfile system in prompt template (#1425) 2023-12-08 14:20:19 -05:00
Michael Yang 1f05d77110
Merge pull request #1244 from jmorganca/brucemacd/no-fail-template
do not fail on unsupported template variables
2023-12-06 13:23:04 -08:00
Michael Yang c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Bruce MacDonald 47d4e22673 use missingkey in set empty interface when missing 2023-12-05 15:49:05 -08:00
Michael Yang 5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
Michael Yang b9495ea162 load projectors 2023-12-05 14:36:12 -08:00
Michael Yang 409bb9674e
Merge pull request #1308 from jmorganca/mxyng/split-from
split from into one or more models
2023-12-05 14:33:03 -08:00
Michael Yang d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
refactor layer creation
2023-12-05 14:32:52 -08:00
Bruce MacDonald 195e3d9dbd
chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
Michael Yang 1ebdbd9694 server: add version handler 2023-12-05 09:36:01 -08:00
Jeffrey Morgan 00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Michael Yang a3737cbd33 use NewLayer for CreateBlobHandler 2023-12-04 16:59:23 -08:00
Michael Yang 998f1785b6 add modelfamilies 2023-12-04 16:59:23 -08:00
Michael Yang 70a93057cd refactor layer creation
previous layer creation was not ideal because:

1. it required reading the input file multiple times, once to calculate
   the sha256 checksum, another to write it to disk, and potentially one
   more to decode the underlying gguf
2. used io.ReadSeeker which is prone to user error. if the file isn't
   reset correctly or in the right place, it could end up reading an
   empty file

there is also some brittleness when reading existing layers; otherwise,
writing the inherited layers will error reading an already closed file

this commit aims to fix these issues by restructuring layer creation.

1. it will now write the layer to a temporary file and to the hash
   function at the same time, then move it to the final location on Commit
2. layers are read once when copied to the destination. the exception
   is raw model files, which still require a second read to decode the
   model metadata
2023-12-04 16:59:23 -08:00
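
The approach described in the layer-creation commit above (read the input once, write it to a temporary file and a hash together, move it into place on Commit) could look roughly like this. It is a sketch under assumptions, not the repository's implementation: the `layer` type, `newLayer`, `Commit`, `stage`, and the digest-based file naming are all hypothetical, and only standard library calls are used.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// layer is a hypothetical type: a blob staged in a temporary file whose
// digest was computed while it was being written.
type layer struct {
	Digest string
	Size   int64
	tmp    string
}

// newLayer reads r exactly once, streaming it into a temp file and a
// sha256 hash at the same time via io.MultiWriter.
func newLayer(dir string, r io.Reader) (*layer, error) {
	f, err := os.CreateTemp(dir, "layer-*.partial")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	h := sha256.New()
	n, err := io.Copy(io.MultiWriter(f, h), r)
	if err != nil {
		os.Remove(f.Name())
		return nil, err
	}
	return &layer{Digest: fmt.Sprintf("sha256:%x", h.Sum(nil)), Size: n, tmp: f.Name()}, nil
}

// Commit moves the staged file to its final, digest-named location.
func (l *layer) Commit(dir string) (string, error) {
	dst := filepath.Join(dir, strings.ReplaceAll(l.Digest, ":", "-"))
	if err := os.Rename(l.tmp, dst); err != nil {
		return "", err
	}
	return dst, nil
}

// stage is a demo helper: it stages and commits data in a fresh temporary
// directory, then returns the layer and the committed file's contents.
func stage(data string) (*layer, string, error) {
	dir, err := os.MkdirTemp("", "blobs")
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(dir)

	l, err := newLayer(dir, strings.NewReader(data))
	if err != nil {
		return nil, "", err
	}
	dst, err := l.Commit(dir)
	if err != nil {
		return nil, "", err
	}
	b, err := os.ReadFile(dst)
	return l, string(b), err
}

func main() {
	l, body, err := stage("hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(l.Digest, l.Size, body)
}
```

Because the input is only an `io.Reader`, there is no seek position to reset, which sidesteps the `io.ReadSeeker` pitfall the commit message mentions.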
Michael Yang 2cb0fa7d40 split from into one or more models 2023-12-04 16:59:23 -08:00
Bruce MacDonald 7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Joshua Pham bb80a597db Fix adapter loading from SHA hash 2023-12-01 13:50:55 -05:00
Michael Yang 13efd5f218 upload: fix PUT retry 2023-11-29 16:38:35 -08:00
Michael Yang c4bdfffd96 upload: separate progress tracking 2023-11-29 16:38:33 -08:00
Michael Yang 26c63418e0 new hasher 2023-11-29 14:52:41 -08:00
Michael Yang 2799784ac8 revert checksum calculation to calculate-as-you-go 2023-11-29 13:47:58 -08:00
Bruce MacDonald 96122b7271
validate model tags on copy (#1323) 2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek c2e3b89176
fix: disable ':' in tag names (#1280)
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Patrick Devine cde31cb220
Allow setting parameters in the REPL (#1294) 2023-11-29 09:56:42 -08:00
Bruce MacDonald 37d95157df
fix relative path on create (#1222) 2023-11-21 15:43:17 -05:00
Jeffrey Morgan 35c4b5ec16 calculate hash separately from http request 2023-11-20 15:45:11 -05:00
Jeffrey Morgan 9d73d3a6b5 add back part.Reset() 2023-11-19 14:32:19 -05:00
Jeffrey Morgan 72cd336410 dont retry on upload complete context cancel 2023-11-19 14:32:19 -05:00
Jeffrey Morgan 1bd594b2fa revert to using one open file for blob uploads 2023-11-19 14:32:19 -05:00
Jeffrey Morgan 9a8c21ac3d use exponential everywhere 2023-11-19 14:32:19 -05:00
Jeffrey Morgan f6b317e8c9 fix sending too little data in chunk upload body 2023-11-19 14:32:19 -05:00
Jeffrey Morgan ac5076ce1e exponential backoff up to 30s 2023-11-19 14:32:19 -05:00
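
As an illustration of "exponential backoff up to 30s" in the commit above, here is a minimal sketch. The 1-second base delay, the doubling factor, and the names `backoff` and `maxBackoff` are assumptions made for the example; only the 30-second cap comes from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// maxBackoff is the cap named in the commit message.
const maxBackoff = 30 * time.Second

// backoff returns the delay before retry number attempt (0-based).
// The delay starts at an assumed 1s and doubles until capped at maxBackoff.
func backoff(attempt int) time.Duration {
	d := time.Second
	for i := 0; i < attempt && d < maxBackoff; i++ {
		d *= 2
	}
	if d > maxBackoff {
		d = maxBackoff
	}
	return d
}

func main() {
	for attempt := 0; attempt < 7; attempt++ {
		fmt.Println(attempt, backoff(attempt))
	}
}
```

The loop stops doubling once the cap is reached, so large attempt counts cannot overflow the duration.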
Michael Yang 42c2e3a624 upload: retry complete upload 2023-11-19 14:32:19 -05:00
Michael Yang cb42589792 adjust download/upload parts 2023-11-19 14:32:19 -05:00