257 commits

Author SHA1 Message Date
Patrick Devine 2c017ca441
Convert Safetensors to an Ollama model (#2824) 2024-03-06 21:01:51 -08:00
Jeffrey Morgan 63861f58cc
Support for bert and nomic-bert embedding models 2024-02-20 21:37:29 -05:00
Michael Yang 897b213468
use http.DefaultClient (#2530)
default client already handles proxy
2024-02-20 18:34:47 -05:00
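The note above ("default client already handles proxy") can be checked with a small sketch. Go's http.DefaultClient falls back to http.DefaultTransport, whose Proxy field is http.ProxyFromEnvironment, so HTTP_PROXY, HTTPS_PROXY and NO_PROXY are honored without extra code. The helper name proxyFor and the example URLs are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// proxyFor reports which proxy (if any) the default transport would use for
// rawURL. proxyFor is a hypothetical helper written for this illustration.
func proxyFor(rawURL string) (*url.URL, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	// http.DefaultClient has a nil Transport, so it uses http.DefaultTransport.
	transport, ok := http.DefaultTransport.(*http.Transport)
	if !ok || transport.Proxy == nil {
		return nil, nil // no proxy function configured
	}
	return transport.Proxy(req)
}

func main() {
	targets := []string{"https://example.com/v2/", "http://localhost:11434/"}
	for _, target := range targets {
		proxy, err := proxyFor(target)
		switch {
		case err != nil:
			fmt.Println(target, "error:", err)
		case proxy == nil:
			fmt.Println(target, "-> direct connection")
		default:
			fmt.Println(target, "-> via proxy host", proxy.Host)
		}
	}
}
```

Requests to localhost and loopback addresses are always sent directly, regardless of the proxy environment variables.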
Michael Yang e43648afe5 rerefactor 2024-02-15 05:56:45 +00:00
Daniel Hiltgen f397e0e988 Move hub auth out to new package 2024-02-15 05:56:45 +00:00
Jeffrey Morgan 48a273f80b
Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00
Jeffrey Morgan a0a199b108
Fix hanging issue when sending empty content (#2399) 2024-02-07 19:30:33 -05:00
Michael Yang f3761405c8 use image id 2024-02-01 11:52:42 -08:00
Michael Yang d125510b4b remove image tags 2024-02-01 11:32:51 -08:00
Michael Yang d046bee790 use llm.ImageData for chat 2024-01-31 19:18:25 -08:00
Michael Yang 8450bf66e6 trim images 2024-01-31 19:13:47 -08:00
Michael Yang b4e11be8ef append image tags to user content 2024-01-31 19:13:10 -08:00
Bruce MacDonald a896079705
preserve last system message from modelfile (#2289) 2024-01-31 21:45:01 -05:00
Bruce MacDonald 0632dff3f8
trim chat prompt based on llm context size (#1963) 2024-01-30 15:59:29 -05:00
Jeffrey Morgan e4b9b72f2a
Do not repeat system prompt for chat templating (#2241) 2024-01-28 14:15:56 -08:00
Patrick Devine 7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Michael Yang c08dfaa23d fix: remove overwritten model layers
if create overrides a manifest, first add the older manifest's layers to
the delete map so they can be cleaned up
2024-01-19 14:58:37 -08:00
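The delete-map idea in the message above might look roughly like the following sketch. It is not the code from this commit: the function name layersToDelete and the plain string digests are hypothetical, and keeping layers that the new manifest still references is an assumption about the intended behavior.

```go
package main

import "fmt"

// layersToDelete returns the digests that belonged to the overwritten manifest
// and are no longer referenced by the new one. Hypothetical helper.
func layersToDelete(oldLayers, newLayers []string) map[string]struct{} {
	deleteMap := make(map[string]struct{}, len(oldLayers))
	// First, add every layer of the older manifest to the delete map.
	for _, digest := range oldLayers {
		deleteMap[digest] = struct{}{}
	}
	// Then keep anything the new manifest still uses (assumed behavior).
	for _, digest := range newLayers {
		delete(deleteMap, digest)
	}
	return deleteMap
}

func main() {
	oldLayers := []string{"sha256:aaa", "sha256:bbb"}
	newLayers := []string{"sha256:bbb", "sha256:ccc"}
	for digest := range layersToDelete(oldLayers, newLayers) {
		fmt.Println("would delete", digest)
	}
}
```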
Daniel Hiltgen fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Michael Yang cf29bd2d72 fix: request retry with error
this fixes a subtle bug with makeRequestWithRetry where an HTTP status
error on a retried request will potentially not return the right err
2024-01-12 13:32:27 -08:00
Michael Yang 2bb2bdd5d4 fix lint 2024-01-09 09:36:58 -08:00
Bruce MacDonald 7e8f7c8358
remove ggml automatic re-pull (#1856) 2024-01-08 14:41:01 -05:00
Bruce MacDonald 4ad6c9b11f
fix: pull either original model or from model on create (#1774) 2024-01-04 01:34:38 -05:00
Bruce MacDonald db356c8519
post-response templating (#1427) 2023-12-22 17:07:05 -05:00
Bruce MacDonald 5e7fd6906f Update images.go 2023-12-19 09:05:46 -08:00
Bruce MacDonald 811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Patrick Devine d9e60f634b
add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
Jeffrey Morgan 0a9d348023
Fix issues with /set template and /set system (#1486) 2023-12-12 14:43:19 -05:00
Patrick Devine 910e9401d0
Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan 9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
Bruce MacDonald 3b0b8930d4
fix: only flush template in chat when current role encountered (#1426) 2023-12-08 16:44:24 -05:00
Bruce MacDonald e3f925fc1b
fix: restore modelfile system in prompt template (#1425) 2023-12-08 14:20:19 -05:00
Bruce MacDonald 47d4e22673 use missingkey in set empty interface when missing 2023-12-05 15:49:05 -08:00
Michael Yang 5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
Michael Yang b9495ea162 load projectors 2023-12-05 14:36:12 -08:00
Michael Yang 409bb9674e
Merge pull request #1308 from jmorganca/mxyng/split-from
split from into one or more models
2023-12-05 14:33:03 -08:00
Michael Yang d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
refactor layer creation
2023-12-05 14:32:52 -08:00
Bruce MacDonald 195e3d9dbd
chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
Jeffrey Morgan 00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Michael Yang 998f1785b6 add modelfamilies 2023-12-04 16:59:23 -08:00
Michael Yang 70a93057cd refactor layer creation
previous layer creation was not ideal because:

1. it required reading the input file multiple times, once to calculate
   the sha256 checksum, another to write it to disk, and potentially one
   more to decode the underlying gguf
2. used io.ReadSeeker which is prone to user error. if the file isn't
   reset correctly or in the right place, it could end up reading an
   empty file

there is also some brittleness when reading existing layers: writing the
inherited layers can fail with an error from reading an already closed file

this commit aims to fix these issues by restructuring layer creation.

1. it will now write the layer to a temporary file and to the hash
   function at the same time, then move the file to its final location
   on Commit
2. layers are read once, when copied to the destination. the exception
   is raw model files, which still require a second read to decode the
   model metadata
2023-12-04 16:59:23 -08:00
Michael Yang 2cb0fa7d40 split from into one or more models 2023-12-04 16:59:23 -08:00
Bruce MacDonald 7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Joshua Pham bb80a597db Fix adapter loading from SHA hash 2023-12-01 13:50:55 -05:00
Patrick Devine cde31cb220
Allow setting parameters in the REPL (#1294) 2023-11-29 09:56:42 -08:00
Bruce MacDonald 37d95157df
fix relative path on create (#1222) 2023-11-21 15:43:17 -05:00
Jeffrey Morgan 02524a56ff check retry for authorization error 2023-11-19 00:19:53 -05:00
Jeffrey Morgan 12e046f12a remove unused function 2023-11-18 22:16:51 -05:00
Bruce MacDonald 0b19e24d81
only retry once on auth failure (#1175) 2023-11-17 14:22:35 -05:00
Bruce MacDonald 4b3f4bc7d9
return failure details when unauthorized to push (#1131)
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-11-16 16:44:18 -05:00
Michael Yang 652d90e1c7 Update server/images.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-15 15:16:23 -08:00
Michael Yang 1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang a07c935d34 ignore non blobs 2023-11-15 15:16:23 -08:00
Michael Yang b0d14ed51c refactor create model 2023-11-15 15:16:23 -08:00
Daniel Reis 7c438f2c53 Replaced method 2023-11-10 20:22:03 +00:00
Daniel Reis 6e46338d44 Reverting previous changes 2023-11-10 20:21:35 +00:00
Daniel Reis d17730356a Removed inline parse model path 2023-11-09 22:44:26 +00:00
Daniel Reis 32d79a6eea Using 'GetShortTagname' method instead 2023-11-09 22:40:37 +00:00
Jeffrey Morgan e21579a0f1 Restore system prompt on requests 2023-11-03 17:26:45 -07:00
Jeffrey Morgan c50b01bc21 check request.Context for initial system prompt 2023-11-02 18:17:00 -07:00
Bruce MacDonald b9dc875401
remove modelfile context deprecated in v0.0.7 (#974) 2023-11-02 20:52:56 -04:00
Michael Yang 1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Jeffrey Morgan 1beb5645a9
only use system prompt if context is not provided (#978) 2023-11-02 15:48:02 -07:00
Michael Yang fe5a872444 fix upload 2023-11-02 13:25:58 -07:00
Michael Yang d39709260f download with retry 2023-11-02 13:16:11 -07:00
Michael Yang 60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
Michael Yang 4e09aab8b9 concurrent uploads 2023-10-27 17:07:33 -07:00
Bruce MacDonald 5c3491f425
allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
Michael Yang 846f593dbf
Merge pull request #828 from jmorganca/mxyng/template-parameters
image: show parameters
2023-10-19 09:31:31 -07:00
Michael Yang a19d47642e models: rm workDir from CreateModel
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Michael Yang 4dcceeffb7 let the template do the work 2023-10-18 13:12:00 -07:00
Michael Yang 019e4a4558 image: show parameters 2023-10-18 13:12:00 -07:00
Michael Yang 8299bf76ed model: native gotemplate adapter template 2023-10-17 15:28:38 -07:00
Michael Yang ee4979e510 show: no template system if empty 2023-10-17 15:25:43 -07:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang f6e98334e4 handle upstream proxies 2023-10-09 11:42:36 -07:00
Bruce MacDonald d6786f2945
add feedback for reading model metadata (#722) 2023-10-06 16:05:32 -04:00
Michael Yang 8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Bruce MacDonald 2130c0708b
output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
Michael Yang 9333b0cc82
Merge pull request #612 from jmorganca/mxyng/prune-empty-directories
prune empty directories
2023-09-29 11:23:39 -07:00
Michael Yang f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Michael Yang 8608eb4760 prune empty directories 2023-09-27 10:58:09 -07:00
Bruce MacDonald 4cba75efc5
remove tmp directories created by previous servers (#559)
* remove tmp directories created by previous servers

* clean up on server stop

* Update routes.go

* Update server/routes.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* create top-level temp ollama dir

* check file exists before creating

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-21 20:38:49 +01:00
Michael Yang 499e9007a5 pick chunksize based on location 2023-09-20 11:10:24 -07:00
Michael Yang a5520bfb42 fix build 2023-09-19 10:42:24 -07:00
Michael Yang b58d5d16b0 fix mkdir on windows 2023-09-19 09:41:13 -07:00
Patrick Devine 24580df958
only add a layer if there is actual data (#535) 2023-09-18 13:47:45 -07:00
Michael Yang daa4f096f9 set request.ContentLength
This informs the HTTP client the content length is known and disables
chunked Transfer-Encoding
2023-09-14 13:32:44 -07:00
Michael Yang e6881cabd0 remove unused 2023-09-13 14:48:33 -07:00
Michael Yang 0c5a454361 fix model type for 70b 2023-09-12 15:12:59 -07:00
Michael Yang 7dee25a07f fix falcon decode
get model and file type from bin file
2023-09-12 12:34:53 -07:00
Patrick Devine e7e91cd71c
add autoprune to remove unused layers (#491) 2023-09-11 11:46:35 -07:00
Jeffrey Morgan 3920e15386
add model format to config layer (#497) 2023-09-09 17:53:44 -04:00
Michael Yang de227b620f fix nil pointer dereference 2023-09-07 17:24:31 -07:00
Michael Yang 738fe9c4aa
Merge pull request #486 from jmorganca/mxyng/fix-push
fix: retry push on expired token
2023-09-07 13:58:34 -07:00
Michael Yang f0f4943577 fix get auth token 2023-09-07 12:01:56 -07:00
Bruce MacDonald 09dd2aeff9
GGUF support (#441) 2023-09-07 13:55:37 -04:00
Patrick Devine 790d24eb7b
add show command (#474) 2023-09-06 11:04:17 -07:00
Michael Yang 06ef90c051 fix parameter inheritance
parameters are not inherited because they are processed differently from
other layers. fix this by explicitly merging the inherited params into
the new params. parameter values defined in the new modelfile will
override those defined in the inherited modelfile. array lists are
replaced instead of appended
2023-09-05 11:40:20 -07:00
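The merge rule described above can be sketched as below. The function name mergeParams and the map[string][]string representation are hypothetical and chosen for illustration; the repository may store parameters differently. Values from the new modelfile override inherited ones, and list values are replaced as a whole rather than appended.

```go
package main

import "fmt"

// mergeParams overlays the new modelfile's parameters on the inherited ones.
// Hypothetical helper: a key present in overrides replaces the inherited
// value entirely, including multi-value lists such as stop sequences.
func mergeParams(inherited, overrides map[string][]string) map[string][]string {
	merged := make(map[string][]string, len(inherited)+len(overrides))
	for key, values := range inherited {
		merged[key] = append([]string(nil), values...) // copy, do not alias
	}
	for key, values := range overrides {
		merged[key] = append([]string(nil), values...) // replace, not append
	}
	return merged
}

func main() {
	inherited := map[string][]string{
		"temperature": {"0.8"},
		"stop":        {"<s>", "</s>"},
	}
	overrides := map[string][]string{
		"stop": {"User:"},
	}
	fmt.Println(mergeParams(inherited, overrides))
}
```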
Michael Yang e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00