217 commits

Author SHA1 Message Date
Michael Yang 7dee25a07f fix falcon decode
get model and file type from bin file
2023-09-12 12:34:53 -07:00
Patrick Devine e7e91cd71c
add autoprune to remove unused layers (#491) 2023-09-11 11:46:35 -07:00
Jeffrey Morgan 3920e15386
add model format to config layer (#497) 2023-09-09 17:53:44 -04:00
Michael Yang de227b620f fix nil pointer dereference 2023-09-07 17:24:31 -07:00
Michael Yang 738fe9c4aa
Merge pull request #486 from jmorganca/mxyng/fix-push
fix: retry push on expired token
2023-09-07 13:58:34 -07:00
Michael Yang f0f4943577 fix get auth token 2023-09-07 12:01:56 -07:00
Bruce MacDonald 09dd2aeff9
GGUF support (#441) 2023-09-07 13:55:37 -04:00
Patrick Devine 790d24eb7b
add show command (#474) 2023-09-06 11:04:17 -07:00
Michael Yang 06ef90c051 fix parameter inheritance
parameters are not inherited because they are processed differently from
other layers. fix this by explicitly merging the inherited params into
the new params. parameter values defined in the new modelfile will
override those defined in the inherited modelfile. array lists are
replaced instead of appended
2023-09-05 11:40:20 -07:00
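The merge rule described in the parameter inheritance fix above can be illustrated with a minimal sketch. This is not the code from the commit: `mergeParams` and the parameter names are hypothetical, and the sketch assumes parameters are held in a plain map. It shows only the stated behavior: inherited values are kept unless the new modelfile defines the same key, and list values are replaced rather than appended.

```go
package main

import "fmt"

// mergeParams is a hypothetical helper (not taken from the commit). It returns
// a new map containing the inherited parameters overlaid with the parameters
// from the new modelfile.
func mergeParams(inherited, updated map[string]any) map[string]any {
	merged := make(map[string]any, len(inherited)+len(updated))
	for k, v := range inherited {
		merged[k] = v
	}
	for k, v := range updated {
		// Overwrite wholesale: a list in the new modelfile replaces the
		// inherited list instead of being appended to it.
		merged[k] = v
	}
	return merged
}

func main() {
	inherited := map[string]any{"temperature": 0.8, "stop": []string{"User:"}}
	updated := map[string]any{"stop": []string{"###"}}

	merged := mergeParams(inherited, updated)
	fmt.Println(merged["temperature"], merged["stop"]) // prints: 0.8 [###]
}
```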
Michael Yang e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00
Quinn Slack 62d29b2157 do not HTML-escape prompt
The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.

The included test case passes, but before the code change, it failed:

```
--- FAIL: TestModelPrompt
    images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
```
2023-09-01 17:16:38 -05:00
Michael Yang 1c8fd627ad windows: fix create modelfile 2023-08-31 09:47:10 -04:00
Michael Yang ae950b00f1 windows: fix delete 2023-08-31 09:47:10 -04:00
Bruce MacDonald 42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Quinn Slack f4432e1dba
treat stop as stop sequences, not exact tokens (#442)
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list.

Fixes https://github.com/jmorganca/ollama/issues/295.
2023-08-30 11:53:42 -04:00
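A minimal sketch of the stop-sequence semantics described in the commit above, under the assumption that generated text arrives as a slice of token pieces. `generateUntilStop` is a hypothetical function, not the project's implementation. It matches stop sequences against the accumulated text rather than against individual tokens, so a stop sequence is found even when it sits inside a larger token or spans two tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// generateUntilStop is a hypothetical helper. It concatenates token pieces
// and returns the text up to (and excluding) the earliest occurrence of any
// stop sequence.
func generateUntilStop(pieces []string, stop []string) string {
	var out strings.Builder
	for _, piece := range pieces {
		out.WriteString(piece)
		text := out.String()

		// Find the earliest stop sequence in the text produced so far.
		cut := -1
		for _, s := range stop {
			if s == "" {
				continue // an empty sequence would match everywhere
			}
			if i := strings.Index(text, s); i >= 0 && (cut < 0 || i < cut) {
				cut = i
			}
		}
		if cut >= 0 {
			return text[:cut] // trim the stop sequence and anything after it
		}
	}
	return out.String()
}

func main() {
	pieces := []string{"Hello", " world.\nNext", " line"}
	fmt.Printf("%q\n", generateUntilStop(pieces, []string{"\n"})) // prints: "Hello world."
}
```

A streaming server would additionally need to hold back text that might be the start of a stop sequence before sending it to the client; the sketch ignores that concern.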
Michael Yang 982c535428
Merge pull request #428 from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
Patrick Devine 8bbff2df98
add model IDs (#439) 2023-08-28 20:50:24 -07:00
Michael Yang 16b06699fd remove unused parameter 2023-08-28 18:35:18 -04:00
Michael Yang 246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
Michael Yang 59734ca24d set default template 2023-08-26 12:20:48 -07:00
Michael Yang 32d1a00017 remove unused requestContextKey 2023-08-22 10:49:54 -07:00
Michael Yang 04e2128273 move upload funcs to upload.go 2023-08-22 10:49:53 -07:00
Michael Yang 2cc634689b use url.URL 2023-08-22 10:49:07 -07:00
Michael Yang 9ec7e37534
Merge pull request #392 from jmorganca/mxyng/version
add version
2023-08-22 09:50:25 -07:00
Michael Yang 2c7f956b38 add version 2023-08-22 09:40:58 -07:00
Jeffrey Morgan a9f6c56652 fix FROM instruction erroring when referring to a file 2023-08-22 09:39:42 -07:00
Ryan Baker 0a892419ad
Strip protocol from model path (#377) 2023-08-21 21:56:56 -07:00
Michael Yang 3b49315f97 retry on unauthorized chunk push
The token printed for authorized requests has a lifetime of 1h. If an
upload exceeds 1h, a chunk push will fail since the token is created on
a "start upload" request.

This replaces the Pipe with SectionReader which is simpler and
implements Seek, a requirement for makeRequestWithRetry. This is
slightly worse than using a Pipe since the progress update is directly
tied to the chunk size instead of controlled separately.
2023-08-18 11:23:47 -07:00
Michael Yang 7eda70f23b copy metadata from source 2023-08-17 21:55:25 -07:00
Michael Yang 086449b6c7 fmt 2023-08-17 15:32:31 -07:00
Michael Yang 3cbc6a5c01 fix push manifest 2023-08-17 15:28:12 -07:00
Michael Yang a894cc792d model and file type as strings 2023-08-17 12:08:04 -07:00
Michael Yang b963a83559
Merge pull request #364 from jmorganca/chunked-uploads
reimplement chunked uploads
2023-08-17 09:58:51 -07:00
Michael Yang bf6688abe6
Merge pull request #360 from jmorganca/fix-request-copies
Fix request copies
2023-08-17 09:58:42 -07:00
Bruce MacDonald 6005b157c2
retry download on network errors 2023-08-17 10:31:45 -04:00
Michael Yang 5dfe91be8b reimplement chunked uploads 2023-08-16 14:50:24 -07:00
Michael Yang 9f944c00f1 push: retry on unauthorized 2023-08-16 11:35:33 -07:00
Michael Yang 56e87cecb1 images: remove body copies 2023-08-16 10:30:41 -07:00
Michael Yang 5d9a4cd251
Merge pull request #348 from jmorganca/cross-repo-mount
cross repo blob mount
2023-08-16 09:20:36 -07:00
Bruce MacDonald 1deb35ca64
use loaded llm for generating model file embeddings 2023-08-15 16:12:02 -03:00
Bruce MacDonald e2de886831
do not regenerate embeddings 2023-08-15 16:10:22 -03:00
Bruce MacDonald f0d7c2f5ea retry download on network errors 2023-08-15 15:07:19 -03:00
Bruce MacDonald 326de48930 use loaded llm for embeddings 2023-08-15 10:50:54 -03:00
Bruce MacDonald 18f2cb0472 don't log fatal 2023-08-15 10:39:59 -03:00
Michael Yang e26085b921 close open files 2023-08-14 16:08:06 -07:00
Michael Yang f594c8eb91 cross repo mount 2023-08-14 15:07:35 -07:00
Bruce MacDonald 2c8b680b03 use file info for embeddings cache 2023-08-14 12:11:04 -03:00
Bruce MacDonald 99b6b60085 use model bin digest for embed digest 2023-08-14 11:57:12 -03:00
Bruce MacDonald e9a9580bdd do not regenerate embeddings
- re-use previously evaluated embeddings when possible
- change embeddings digest identifier to be based on model name and embedded file path
2023-08-14 10:34:17 -03:00
Patrick Devine d9cf18e28d
add maximum retries when pushing (#334) 2023-08-11 15:41:55 -07:00
Michael Yang 6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang 6a6828bddf
Merge pull request #167 from jmorganca/decode-ggml
partial decode ggml bin for more info
2023-08-10 17:22:40 -07:00
Patrick Devine be989d89d1
Token auth (#314) 2023-08-10 11:34:25 -07:00
Michael Yang 6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Michael Yang fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
Bruce MacDonald 984c9c628c fix embeddings invalid values 2023-08-09 16:50:53 -04:00
Bruce MacDonald ac971c56d1 Update images.go 2023-08-09 11:31:54 -04:00
Bruce MacDonald 868e3b31c7 allow for concurrent pulls of the same files 2023-08-09 11:31:54 -04:00
Bruce MacDonald 1bee2347be pr feedback
- defer closing llm on embedding
- do not override licenses
- remove debugging print line
- reformat model file docs
2023-08-08 17:01:37 -04:00
Bruce MacDonald 884d78ceb3 allow embedding from model binary 2023-08-08 14:38:57 -04:00
Bruce MacDonald 21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Bruce MacDonald a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Jeffrey Morgan 8713ac23a8 allow overriding template and system in /api/generate
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
Michael Yang a71ff3f6a2 use a pipe to push to registry with progress
switch to a monolithic upload instead of a chunk upload through a pipe
to report progress
2023-08-03 10:37:13 -07:00
Bruce MacDonald 1c5a8770ee read runner parameter options from map
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
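The idea in the commit above, reading options from a map so that an explicitly specified zero can be told apart from an unset field, might look like the following sketch. The `Options` struct, its fields, and `applyOptions` are hypothetical and simplified; they are not the project's real types. The sketch relies on the fact that `encoding/json` decodes JSON numbers into `float64` when the target is `map[string]any`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options is a hypothetical, simplified set of runner options.
type Options struct {
	Temperature float64
	NumCtx      int
}

// applyOptions is a hypothetical helper. It starts from the defaults and
// overwrites only the keys present in the map, so an explicit zero is
// honored while absent keys keep their default values.
func applyOptions(defaults Options, specified map[string]any) Options {
	opts := defaults
	if v, ok := specified["temperature"].(float64); ok {
		opts.Temperature = v
	}
	if v, ok := specified["num_ctx"].(float64); ok {
		opts.NumCtx = int(v)
	}
	return opts
}

func main() {
	defaults := Options{Temperature: 0.8, NumCtx: 2048}

	var specified map[string]any
	if err := json.Unmarshal([]byte(`{"temperature": 0}`), &specified); err != nil {
		panic(err)
	}

	fmt.Printf("%+v\n", applyOptions(defaults, specified)) // prints: {Temperature:0 NumCtx:2048}
}
```

Decoding straight into the struct would not work here, because a missing `temperature` and a `temperature` of 0 would both leave the field at its zero value.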
Bruce MacDonald daa0d1de7a allow specifying zero values in modelfile 2023-08-01 13:37:50 -04:00
Jeffrey Morgan 528bafa585 cache loaded model 2023-08-01 11:24:18 -04:00
Michael Yang 872011630a fix license 2023-07-31 21:46:48 -07:00
Michael Yang 203fdbc4b8 check err 2023-07-31 21:46:48 -07:00
Michael Yang 70e0ab6b3d remove unnecessary fmt.Sprintf 2023-07-31 21:46:47 -07:00
Jeffrey Morgan 9968153729 fix Go warnings 2023-07-31 21:37:40 -04:00
Michael Yang eadee46840
Merge pull request #236 from jmorganca/check-os-walk
check os.Walk err
2023-07-28 14:14:21 -07:00
Michael Yang bd58528fbd check os.Walk err 2023-07-28 12:15:31 -07:00
Michael Yang c5e447a359 remove io/ioutil import
ioutil is deprecated
2023-07-28 12:06:03 -07:00
Bruce MacDonald f5cbcb08e6 specify stop params separately 2023-07-28 11:29:00 -04:00
Bruce MacDonald 184ad8f057 allow specifying stop conditions in modelfile 2023-07-28 11:02:04 -04:00
Bruce MacDonald 1ac38ec89c improve modelfile docs 2023-07-27 15:13:04 -04:00
Bruce MacDonald 4c1caa3733 download models when creating from modelfile 2023-07-25 14:25:13 -04:00
Bruce MacDonald 07ed69bc37 remove redundant err var 2023-07-25 10:30:14 -04:00
Bruce MacDonald 536028c35a better error message when model not found on pull 2023-07-24 17:48:17 -04:00
Bruce MacDonald abf614804b
remove file on digest mismatch 2023-07-24 21:59:12 +02:00
Bruce MacDonald a0dbbb23c4
truncate file size on resume 2023-07-24 21:58:32 +02:00
Bruce MacDonald 0fd6278446 do not panic server if file cannot be opened 2023-07-24 15:24:34 -04:00
Bruce MacDonald abfc73d31e make response errors unique for error trace 2023-07-24 15:04:21 -04:00
Bruce MacDonald 5a5ca8e7ff remove file on digest mismatch 2023-07-24 14:53:01 -04:00
Bruce MacDonald fdbef6c95e truncate file size on resume 2023-07-24 14:36:19 -04:00
Patrick Devine 4cb42ca55e
add copy command (#191) 2023-07-24 11:27:28 -04:00
Patrick Devine 88c55199f8
change push to chunked uploads from monolithic (#179) 2023-07-22 17:31:26 -07:00
Patrick Devine 6d6b0d3321
change error handler behavior and fix error when a model isn't found (#173) 2023-07-21 23:02:12 -07:00
Michael Yang 20a5d99f77 fix vars.First 2023-07-21 20:45:32 -07:00
Patrick Devine b8421dce3d
get the proper path for blobs to delete (#168) 2023-07-21 17:30:40 -07:00
Patrick Devine 9f6e97865c
allow pushing/pulling to insecure registries (#157) 2023-07-21 15:42:19 -07:00
Patrick Devine e7a393de54
add rm command for models (#151) 2023-07-20 16:09:23 -07:00
Michael Yang 6cea2061ec windows: fix model pulling 2023-07-20 12:35:04 -07:00
Michael Yang 2832801c2a
Merge pull request #91 from jmorganca/fix-stream-errors
fix stream errors
2023-07-20 12:21:59 -07:00
Michael Yang 992892866b
Merge pull request #145 from jmorganca/verify-digest
verify blob digest
2023-07-20 12:14:21 -07:00
Michael Yang 1f27d7f1b8 fix stream errors 2023-07-20 12:12:08 -07:00
Michael Yang bf198c3918 verify blob digest 2023-07-20 11:53:57 -07:00
Bruce MacDonald 3ec4ebc562 remove unused code 2023-07-20 20:18:00 +02:00
Jeffrey Morgan d59b164fa2 add prompt back to parser 2023-07-20 01:13:30 -07:00
Michael Yang 6f046dbf18
Update images.go (#134) 2023-07-19 23:46:01 -07:00
Michael Yang 60b4db6389 add .First 2023-07-19 23:24:32 -07:00
Michael Yang ca210ba480 handle vnd.ollama.image.prompt for compat 2023-07-19 23:24:32 -07:00
Michael Yang df146c41e2 separate prompt into template and system 2023-07-19 23:24:31 -07:00
Jeffrey Morgan 2d305fa99a allow relative paths in FROM instruction 2023-07-19 21:55:15 -07:00
Jeffrey Morgan 4ca7c4be1f don't consume reader when calculating digest 2023-07-19 00:47:55 -07:00
Patrick Devine 572fc9099f
add license layers to the parser (#116) 2023-07-18 22:49:38 -07:00
Michael Yang 68df36ae50 fix pull 0 bytes on completed layer 2023-07-18 19:38:11 -07:00
Michael Yang 553fa39fe8 fix memory leak in create 2023-07-18 17:14:17 -07:00
Patrick Devine 5bea29f610
add new list command (#97) 2023-07-18 09:09:45 -07:00
Patrick Devine 4a28a2f093
add modelpaths (#96) 2023-07-17 22:44:21 -07:00
Michael Yang c7dd52271c remove debugging messages 2023-07-17 14:17:34 -07:00
Michael Yang 53d0052c6c avoid unnecessary type conversion 2023-07-17 12:35:03 -07:00
Michael Yang 28a136e9a3 modelfile params 2023-07-17 12:35:03 -07:00
Michael Yang 3862a51a6a create directories if they do not exist 2023-07-17 11:18:48 -07:00
Michael Yang bcb612a30a fix file paths for windows 2023-07-17 10:47:47 -07:00
Patrick Devine 2fb52261ad
basic distribution w/ push/pull (#78)
* basic distribution w/ push/pull

* add the parser

* add create, pull, and push

* changes to the parser, FROM line, and fix commands

* mkdirp new manifest directories

* make `blobs` directory if it does not exist

* fix go warnings

* add progressbar for model pulls

* move model struct

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-07-16 17:02:22 -07:00