Commit graph

167 commits

Author SHA1 Message Date
Michael Yang 70a93057cd refactor layer creation
previous layer creation was not ideal because:

1. it required reading the input file multiple times, once to calculate
   the sha256 checksum, another to write it to disk, and potentially one
   more to decode the underlying gguf
2. used io.ReadSeeker which is prone to user error. if the file isn't
   reset correctly or in the right place, it could end up reading an
   empty file

there are also some brittleness when reading existing layers else
writing the inherited layers will error reading an already closed file

this commit aims to fix these issues by restructuring layer creation.

1. it will now write the layer to a temporary file as well as the hash
   function and move it to the final location on Commit
2. layers are read once once when copied to the destination. exception
   is raw model files which still requires a second read to decode the
   model metadata
2023-12-04 16:59:23 -08:00
Bruce MacDonald 7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Joshua Pham bb80a597db Fix adapter loading from SHA hash 2023-12-01 13:50:55 -05:00
Patrick Devine cde31cb220
Allow setting parameters in the REPL (#1294) 2023-11-29 09:56:42 -08:00
Bruce MacDonald 37d95157df
fix relative path on create (#1222) 2023-11-21 15:43:17 -05:00
Jeffrey Morgan 02524a56ff check retry for authorization error 2023-11-19 00:19:53 -05:00
Jeffrey Morgan 12e046f12a remove unused function 2023-11-18 22:16:51 -05:00
Bruce MacDonald 0b19e24d81
only retry once on auth failure (#1175) 2023-11-17 14:22:35 -05:00
Bruce MacDonald 4b3f4bc7d9
return failure details when unauthorized to push (#1131)
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-11-16 16:44:18 -05:00
Michael Yang 652d90e1c7 Update server/images.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-15 15:16:23 -08:00
Michael Yang 1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang a07c935d34 ignore non blobs 2023-11-15 15:16:23 -08:00
Michael Yang b0d14ed51c refactor create model 2023-11-15 15:16:23 -08:00
Daniel Reis 7c438f2c53 Replaced method 2023-11-10 20:22:03 +00:00
Daniel Reis 6e46338d44 Reverting previous changes 2023-11-10 20:21:35 +00:00
Daniel Reis d17730356a Removed inline parse model path 2023-11-09 22:44:26 +00:00
Daniel Reis 32d79a6eea Using 'GetShortTagname' method instead 2023-11-09 22:40:37 +00:00
Jeffrey Morgan e21579a0f1 Restore system prompt on requests 2023-11-03 17:26:45 -07:00
Jeffrey Morgan c50b01bc21 check request.Context for initial system prompt 2023-11-02 18:17:00 -07:00
Bruce MacDonald b9dc875401
remove modelfile context deprecated in v0.0.7 (#974) 2023-11-02 20:52:56 -04:00
Michael Yang 1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Jeffrey Morgan 1beb5645a9
only use system prompt if context is not provided (#978) 2023-11-02 15:48:02 -07:00
Michael Yang fe5a872444 fix upload 2023-11-02 13:25:58 -07:00
Michael Yang d39709260f download with retry 2023-11-02 13:16:11 -07:00
Michael Yang 60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
Michael Yang 4e09aab8b9 concurrent uploads 2023-10-27 17:07:33 -07:00
Bruce MacDonald 5c3491f425
allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
Michael Yang 846f593dbf
Merge pull request #828 from jmorganca/mxyng/template-parameters
image: show parameters
2023-10-19 09:31:31 -07:00
Michael Yang a19d47642e models: rm workDir from CreateModel
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Michael Yang 4dcceeffb7 let the template do the work 2023-10-18 13:12:00 -07:00
Michael Yang 019e4a4558 image: show parameters 2023-10-18 13:12:00 -07:00
Michael Yang 8299bf76ed model: native gotemplate adapter template 2023-10-17 15:28:38 -07:00
Michael Yang ee4979e510 show: no template system if empty 2023-10-17 15:25:43 -07:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang f6e98334e4 handle upstream proxies 2023-10-09 11:42:36 -07:00
Bruce MacDonald d6786f2945
add feedback for reading model metadata (#722) 2023-10-06 16:05:32 -04:00
Michael Yang 8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Bruce MacDonald 2130c0708b
output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
Michael Yang 9333b0cc82
Merge pull request #612 from jmorganca/mxyng/prune-empty-directories
prune empty directories
2023-09-29 11:23:39 -07:00
Michael Yang f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Michael Yang 8608eb4760 prune empty directories 2023-09-27 10:58:09 -07:00
Bruce MacDonald 4cba75efc5
remove tmp directories created by previous servers (#559)
* remove tmp directories created by previous servers

* clean up on server stop

* Update routes.go

* Update server/routes.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* create top-level temp ollama dir

* check file exists before creating

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-21 20:38:49 +01:00
Michael Yang 499e9007a5 pick chunksize based on location 2023-09-20 11:10:24 -07:00
Michael Yang a5520bfb42 fix build 2023-09-19 10:42:24 -07:00
Michael Yang b58d5d16b0 fix mkdir on windows 2023-09-19 09:41:13 -07:00
Patrick Devine 24580df958
only add a layer if there is actual data (#535) 2023-09-18 13:47:45 -07:00
Michael Yang daa4f096f9 set request.ContentLength
This informs the HTTP client the content length is known and disables
chunked Transfer-Encoding
2023-09-14 13:32:44 -07:00
Michael Yang e6881cabd0 remove unused 2023-09-13 14:48:33 -07:00
Michael Yang 0c5a454361 fix model type for 70b 2023-09-12 15:12:59 -07:00