Commit graph

478 commits

Author SHA1 Message Date
Jeffrey Morgan e1d7056496 update progress statuses 2023-11-19 09:21:13 -05:00
Jeffrey Morgan 02524a56ff check retry for authorization error 2023-11-19 00:19:53 -05:00
Jeffrey Morgan 12e046f12a remove unused function 2023-11-18 22:16:51 -05:00
Bruce MacDonald 43a726149d fix potentially inaccurate error message 2023-11-18 21:25:07 -05:00
Jeffrey Morgan bab9494176 add - separator to temp file created on ollama create 2023-11-18 09:39:52 -05:00
Michael Yang c6e6c8ee7e fix cross device rename 2023-11-17 15:22:17 -08:00
Michael Yang c1bbf5ddee
Merge pull request #1134 from jmorganca/mxyng/progress
progress bar
2023-11-17 14:03:35 -08:00
Bruce MacDonald 0b19e24d81
only retry once on auth failure (#1175) 2023-11-17 14:22:35 -05:00
Michael Yang d6ecaa2cbf update progress responses 2023-11-17 10:06:19 -08:00
Bruce MacDonald 4b3f4bc7d9
return failure details when unauthorized to push (#1131)
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-11-16 16:44:18 -05:00
Michael Yang a5ccf742c1 fix cross repo mounts 2023-11-16 16:33:30 -05:00
Michael Yang e33ef391cd fix push scope error for inherited model 2023-11-16 16:33:30 -05:00
Michael Yang 54f92f01cb update docs 2023-11-15 15:28:15 -08:00
Michael Yang 652d90e1c7 Update server/images.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-15 15:16:23 -08:00
Michael Yang bc22d5a38b no blob response 2023-11-15 15:16:23 -08:00
Michael Yang 1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang a07c935d34 ignore non blobs 2023-11-15 15:16:23 -08:00
Michael Yang 1552cee59f client create modelfile 2023-11-15 15:16:23 -08:00
Michael Yang 3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
Michael Yang b0d14ed51c refactor create model 2023-11-15 15:16:23 -08:00
Michael Yang d91c103e74
Merge pull request #1055 from dansreis/946-fix-incorrect-base-model-name
Fixed incorrect base model name
2023-11-13 08:42:55 -08:00
Daniel Reis 7c438f2c53 Replaced method 2023-11-10 20:22:03 +00:00
Daniel Reis 6e46338d44 Reverting previous changes 2023-11-10 20:21:35 +00:00
Daniel Hiltgen cc54a416c6 Resume chunk download on UnexpectedEOF errors
If the chunk download is interrupted, resume from where we left off
2023-11-10 08:29:42 -08:00
Jeffrey Morgan 5cba29b9d6
JSON mode: add `"format" as an api parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
Daniel Reis d17730356a Removed inline parse model path 2023-11-09 22:44:26 +00:00
Daniel Reis 32d79a6eea Using 'GetShortTagname' method instead 2023-11-09 22:40:37 +00:00
Bruce MacDonald ec2a31e9b3
support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
-add raw request to docs
2023-11-08 14:05:02 -08:00
Michael Yang 146072113d
Merge pull request #993 from jmorganca/mxyng/cleanup
cleanup upload and download errors
2023-11-06 11:32:12 -08:00
Jeffrey Morgan e21579a0f1 Restore system prompt on requests 2023-11-03 17:26:45 -07:00
Michael Yang 434a6f9d46 return last error 2023-11-03 16:49:51 -07:00
Michael Yang 84725ec7e3 refactor part reset 2023-11-03 09:20:32 -07:00
Noah Gitsham 8ae8c9fa8c
Remove duplicate "install" in GPU support warning (#984) 2023-11-03 00:45:14 -07:00
Noah Gitsham f39daff461
Add missing "be" to GPU support warning message (#983) 2023-11-02 18:37:12 -07:00
Jeffrey Morgan c50b01bc21 check request.Context for initial system prompt 2023-11-02 18:17:00 -07:00
Bruce MacDonald b9dc875401
remove modelfile context deprecated in v0.0.7 (#974) 2023-11-02 20:52:56 -04:00
Michael Yang 1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Jeffrey Morgan 1beb5645a9
only use system prompt if context is not provided (#978) 2023-11-02 15:48:02 -07:00
Michael Yang fe5a872444 fix upload 2023-11-02 13:25:58 -07:00
Michael Yang d39709260f download with retry 2023-11-02 13:16:11 -07:00
Michael Yang 60bb3c03a1 use http.Method 2023-11-02 13:12:45 -07:00
Michael Yang c4cc738cbf fix log 2023-11-01 17:18:11 -07:00
Michael Yang 2c6189f4fe
Merge pull request #750 from jmorganca/mxyng/concurrent-uploads
concurrent uploads
2023-11-01 15:00:01 -07:00
Bruce MacDonald f9a4281124
clean up: remove server functions from client (#937) 2023-10-30 11:10:18 -04:00
Michael Yang 115fc56eb7 calculate and verify md5 checksum 2023-10-27 17:07:33 -07:00
Michael Yang 186f685224 retry PUT 2023-10-27 17:07:33 -07:00
Michael Yang 12efcbb057 comments 2023-10-27 17:07:33 -07:00
Michael Yang 4e09aab8b9 concurrent uploads 2023-10-27 17:07:33 -07:00
Bruce MacDonald 5c3491f425
allow for a configurable ollama model storage directory (#897)
* allow for a configurable ollama models directory

- set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored
- update docs

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com>
Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com>
Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>
2023-10-27 10:19:59 -04:00
Michael Yang 910816a532 fix(download): no retry when out of space 2023-10-26 11:34:07 -07:00
Michael Yang 386169205c
update runtime options (#864) 2023-10-20 21:17:14 -04:00
Michael Yang 75bee074b6 fix: nil pointer dereference 2023-10-20 16:55:24 -07:00
Jeffrey Morgan 7ed5a39bc7 simpler check for model loading compatibility errors 2023-10-19 14:50:49 -04:00
Michael Yang 846f593dbf
Merge pull request #828 from jmorganca/mxyng/template-parameters
image: show parameters
2023-10-19 09:31:31 -07:00
Michael Yang e1c5be24e7 check json eof 2023-10-19 09:21:51 -07:00
Michael Yang 2ad8a074ac generate: set created_at
move the empty response so it's more visible
2023-10-19 09:21:51 -07:00
Michael Yang 7e547c6833 s/message/error/ 2023-10-19 09:21:04 -07:00
Michael Yang 689842b9ff request: bad request when model missing fields 2023-10-19 09:21:04 -07:00
Michael Yang a19d47642e models: rm workDir from CreateModel
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Michael Yang 4dcceeffb7 let the template do the work 2023-10-18 13:12:00 -07:00
Michael Yang 019e4a4558 image: show parameters 2023-10-18 13:12:00 -07:00
Michael Yang 627d04d927
Merge pull request #827 from jmorganca/mxyng/template-adapters
model: native gotemplate adapter template
2023-10-18 13:11:25 -07:00
Michael Yang 940e8ebec3
Merge pull request #826 from jmorganca/mxyng/template-system
show: no template system if empty
2023-10-18 13:11:09 -07:00
Yiorgis Gozadinos 8c6c2cbc8c When the .ollama folder is broken or there are no models return an empty list on /api/tags 2023-10-18 08:23:20 +02:00
Michael Yang 8299bf76ed model: native gotemplate adapter template 2023-10-17 15:28:38 -07:00
Michael Yang ee4979e510 show: no template system if empty 2023-10-17 15:25:43 -07:00
Michael Yang 1af493c5a0 server: print version on start 2023-10-16 09:59:14 -07:00
Bruce MacDonald a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang 7a537cdca9
Merge pull request #770 from jmorganca/mxyng/fix-download
fix download
2023-10-12 12:56:43 -07:00
Michael Yang 257ffeb997 fix download 2023-10-12 12:52:43 -07:00
Bruce MacDonald 7804b8fab9
validate api options fields from map (#711) 2023-10-12 11:18:11 -04:00
Michael Yang c413a55093 download: handle inner errors 2023-10-11 14:15:30 -07:00
Michael Yang 630bb75d2a dynamically size download parts based on file size 2023-10-11 14:10:25 -07:00
Michael Yang a2055a1e93 update download 2023-10-11 14:10:25 -07:00
Bruce MacDonald 274d5a5fdf
optional parameter to not stream response (#639)
* update streaming request accept header
* add optional stream param to request bodies
2023-10-11 12:54:27 -04:00
Jeffrey Morgan 65dcd0ce35
always cleanup blob download (#747) 2023-10-10 13:12:29 -04:00
Michael Yang f6e98334e4 handle upstream proxies 2023-10-09 11:42:36 -07:00
Bruce MacDonald af4cf55884
not found error before pulling model (#718) 2023-10-06 16:06:20 -04:00
Bruce MacDonald d6786f2945
add feedback for reading model metadata (#722) 2023-10-06 16:05:32 -04:00
Michael Yang 0560b28a8d names 2023-10-06 12:56:56 -07:00
Michael Yang 10199c5987 replace done channel with file check 2023-10-06 12:56:56 -07:00
Michael Yang 288814d3e4 fix ref counts 2023-10-06 12:56:43 -07:00
Michael Yang 04733438da check head request response 2023-10-06 12:56:43 -07:00
Michael Yang 711e891f0f fix resumable downloads
glob returns files in lexical order which is not appropriate when
rebuilding the parts list
2023-10-06 12:56:43 -07:00
Michael Yang 090d08422b handle unexpected eofs 2023-10-06 12:56:43 -07:00
Michael Yang 5b84404c64 handle concurrent requests for the same blobs 2023-10-06 12:56:43 -07:00
Michael Yang 8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Bruce MacDonald 2130c0708b
output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
Jay Nakrani 1d0ebe67e8
Document response stream chunk delimiter. (#632)
Document response stream chunk delimiter.
2023-09-29 21:45:52 -07:00
Bruce MacDonald a1b2d95f96
remove unused push/pull params (#650) 2023-09-29 17:27:19 -04:00
Michael Yang 9333b0cc82
Merge pull request #612 from jmorganca/mxyng/prune-empty-directories
prune empty directories
2023-09-29 11:23:39 -07:00
Michael Yang f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Michael Yang 8608eb4760 prune empty directories 2023-09-27 10:58:09 -07:00
Jeffrey Morgan 9b12a511ca check other request fields before load short circuit in /api/generate 2023-09-22 23:50:55 -04:00
Bruce MacDonald 5d71bda478
close llm on interrupt (#577) 2023-09-22 19:41:52 +01:00
Michael Yang 82f5b66c01 register HEAD /api/tags 2023-09-21 16:38:03 -07:00
Michael Yang c986694367 fix HEAD / request
HEAD request should respond like their GET counterparts except without a
response body.
2023-09-21 16:35:58 -07:00
Bruce MacDonald 4cba75efc5
remove tmp directories created by previous servers (#559)
* remove tmp directories created by previous servers

* clean up on server stop

* Update routes.go

* Update server/routes.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* create top-level temp ollama dir

* check file exists before creating

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-21 20:38:49 +01:00
Michael Yang 1fabba474b refactor default allow origins
this should be less error prone
2023-09-21 09:42:25 -07:00
Michael Yang ee4fd16f2c
Merge pull request #556 from jmorganca/pack-cuda
pack in cuda libs
2023-09-20 15:02:36 -07:00
Bruce MacDonald 1255bc9b45 only package 11.8 runner 2023-09-20 20:00:41 +01:00
Michael Yang 499e9007a5 pick chunksize based on location 2023-09-20 11:10:24 -07:00
Michael Yang aa45d7c1df draft: explicitly follow upload redirects 2023-09-19 13:36:58 -07:00
Michael Yang a5520bfb42 fix build 2023-09-19 10:42:24 -07:00
Michael Yang b58d5d16b0 fix mkdir on windows 2023-09-19 09:41:13 -07:00
Patrick Devine 24580df958
only add a layer if there is actual data (#535) 2023-09-18 13:47:45 -07:00
Patrick Devine 80dd44e80a
Cmd changes (#541) 2023-09-18 12:26:56 -07:00
Michael Yang 08d7c2a944 fix error on upload chunk 2023-09-15 15:59:30 -07:00
Michael Yang e53bc57d4d split uploadBlobChunked 2023-09-14 17:22:05 -07:00
Michael Yang f0b398d17f implement ProgressWriter 2023-09-14 17:22:04 -07:00
Michael Yang daa4f096f9 set request.ContentLength
This informs the HTTP client the content length is known and disables
chunked Transfer-Encoding
2023-09-14 13:32:44 -07:00
Michael Yang e6881cabd0 remove unused 2023-09-13 14:48:33 -07:00
Michael Yang 0c5a454361 fix model type for 70b 2023-09-12 15:12:59 -07:00
Michael Yang 7dee25a07f fix falcon decode
get model and file type from bin file
2023-09-12 12:34:53 -07:00
Bruce MacDonald f221637053
first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine 45ac07cd02
create the blobs directory correctly (#508) 2023-09-11 14:54:52 -07:00
Patrick Devine e7e91cd71c
add autoprune to remove unused layers (#491) 2023-09-11 11:46:35 -07:00
Jeffrey Morgan 3920e15386
add model format to config layer (#497) 2023-09-09 17:53:44 -04:00
Michael Yang de227b620f fix nil pointer dereference 2023-09-07 17:24:31 -07:00
Michael Yang 738fe9c4aa
Merge pull request #486 from jmorganca/mxyng/fix-push
fix: retry push on expired token
2023-09-07 13:58:34 -07:00
Michael Yang bf146fb072 fix retry on unauthorized chunk 2023-09-07 12:02:04 -07:00
Michael Yang f0f4943577 fix get auth token 2023-09-07 12:01:56 -07:00
Bruce MacDonald 09dd2aeff9
GGUF support (#441) 2023-09-07 13:55:37 -04:00
Michael Yang 83c6be1666
fix model manifests (#477) 2023-09-06 17:30:08 -04:00
Patrick Devine 790d24eb7b
add show command (#474) 2023-09-06 11:04:17 -07:00
Michael Yang a1ecdd36d5 create manifests directory 2023-09-05 17:10:40 -07:00
Michael Yang d1c2558f7e
Merge pull request #461 from jmorganca/mxyng/fix-inherit-params
fix inherit params
2023-09-05 12:30:23 -07:00
Michael Yang 06ef90c051 fix parameter inheritence
parameters are not inherited because they are processed differently from
other layer. fix this by explicitly merging the inherited params into
the new params. parameter values defined in the new modelfile will
override those defined in the inherited modelfile. array lists are
replaced instead of appended
2023-09-05 11:40:20 -07:00
Michael Yang e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00
Michael Yang 681f3c4c42 fix num_keep 2023-09-03 17:47:49 -04:00
Quinn Slack 62d29b2157 do not HTML-escape prompt
The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.

The included test case passes, but before the code change, it failed:

```
--- FAIL: TestModelPrompt
    images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
```
2023-09-01 17:16:38 -05:00
Michael Yang 1c8fd627ad windows: fix create modelfile 2023-08-31 09:47:10 -04:00
Michael Yang ae950b00f1 windows: fix delete 2023-08-31 09:47:10 -04:00
Michael Yang eeb40a672c fix list models for windows 2023-08-31 09:47:10 -04:00
Michael Yang 0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00
Bruce MacDonald 42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Quinn Slack f4432e1dba
treat stop as stop sequences, not exact tokens (#442)
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list.

Fixes https://github.com/jmorganca/ollama/issues/295.
2023-08-30 11:53:42 -04:00
Michael Yang 982c535428
Merge pull request #428 from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
Patrick Devine 8bbff2df98
add model IDs (#439) 2023-08-28 20:50:24 -07:00
Michael Yang 16b06699fd remove unused parameter 2023-08-28 18:35:18 -04:00
Michael Yang 246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
Michael Yang 865fceb73c chunked pipe 2023-08-28 18:34:53 -04:00
Michael Yang 72266c7684 bump chunk size to 95MB 2023-08-28 18:34:53 -04:00
Michael Yang 59734ca24d set default template 2023-08-26 12:20:48 -07:00
Michael Yang 32d1a00017 remove unused requestContextKey 2023-08-22 10:49:54 -07:00
Michael Yang 04e2128273 move upload funcs to upload.go 2023-08-22 10:49:53 -07:00
Michael Yang 2cc634689b use url.URL 2023-08-22 10:49:07 -07:00
Michael Yang 95187d7e1e build release mode 2023-08-22 09:52:43 -07:00
Michael Yang 9ec7e37534
Merge pull request #392 from jmorganca/mxyng/version
add version
2023-08-22 09:50:25 -07:00
Michael Yang 2c7f956b38 add version 2023-08-22 09:40:58 -07:00
Jeffrey Morgan a9f6c56652 fix FROM instruction erroring when referring to a file 2023-08-22 09:39:42 -07:00
Ryan Baker 0a892419ad
Strip protocol from model path (#377) 2023-08-21 21:56:56 -07:00
Michael Yang 3b49315f97 retry on unauthorized chunk push
The token printed for authorized requests has a lifetime of 1h. If an
upload exceeds 1h, a chunk push will fail since the token is created on
a "start upload" request.

This replaces the Pipe with SectionReader which is simpler and
implements Seek, a requirement for makeRequestWithRetry. This is
slightly worse than using a Pipe since the progress update is directly
tied to the chunk size instead of controlled separately.
2023-08-18 11:23:47 -07:00
Michael Yang 7eda70f23b copy metadata from source 2023-08-17 21:55:25 -07:00
Michael Yang 086449b6c7 fmt 2023-08-17 15:32:31 -07:00
Michael Yang 3cbc6a5c01 fix push manifest 2023-08-17 15:28:12 -07:00
Michael Yang a894cc792d model and file type as strings 2023-08-17 12:08:04 -07:00
Michael Yang b963a83559
Merge pull request #364 from jmorganca/chunked-uploads
reimplement chunked uploads
2023-08-17 09:58:51 -07:00
Michael Yang bf6688abe6
Merge pull request #360 from jmorganca/fix-request-copies
Fix request copies
2023-08-17 09:58:42 -07:00
Bruce MacDonald 6005b157c2
retry download on network errors 2023-08-17 10:31:45 -04:00
Patrick Devine 14220d9833
set the scopes correctly (#368) 2023-08-16 21:42:02 -07:00
Michael Yang 5dfe91be8b reimplement chunked uploads 2023-08-16 14:50:24 -07:00
Michael Yang 9f944c00f1 push: retry on unauthorized 2023-08-16 11:35:33 -07:00
Michael Yang 56e87cecb1 images: remove body copies 2023-08-16 10:30:41 -07:00
Michael Yang 5d9a4cd251
Merge pull request #348 from jmorganca/cross-repo-mount
cross repo blob mount
2023-08-16 09:20:36 -07:00
Bruce MacDonald 1deb35ca64
use loaded llm for generating model file embeddings 2023-08-15 16:12:02 -03:00
Bruce MacDonald e2de886831
do not regenerate embeddings 2023-08-15 16:10:22 -03:00
Bruce MacDonald f0d7c2f5ea retry download on network errors 2023-08-15 15:07:19 -03:00
Bruce MacDonald 12052a7624
always remove from in progress map on download 2023-08-15 13:20:32 -03:00
Bruce MacDonald 326de48930 use loaded llm for embeddings 2023-08-15 10:50:54 -03:00
Bruce MacDonald 18f2cb0472 dont log fatal 2023-08-15 10:39:59 -03:00
Michael Yang e26085b921 close open files 2023-08-14 16:08:06 -07:00
Michael Yang f594c8eb91 cross repo mount 2023-08-14 15:07:35 -07:00
Bruce MacDonald f020e1d519 always remove from in progress map on download 2023-08-14 13:09:20 -03:00
Bruce MacDonald 2c8b680b03 use file info for embeddings cache 2023-08-14 12:11:04 -03:00
Bruce MacDonald 99b6b60085 use model bin digest for embed digest 2023-08-14 11:57:12 -03:00
Bruce MacDonald e9a9580bdd do not regenerate embeddings
- re-use previously evaluated embeddings when possible
- change embeddings digest identifier to be based on model name and embedded file path
2023-08-14 10:34:17 -03:00
Patrick Devine d9cf18e28d
add maximum retries when pushing (#334) 2023-08-11 15:41:55 -07:00
Jeffrey Morgan 1556162c90 create .ollama directory if it doesnt exist 2023-08-11 15:35:55 -07:00
Jeffrey Morgan 148f0225c0 create .ollama directory if it doesnt exist 2023-08-11 15:33:11 -07:00
Michael Yang 6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang 6a6828bddf
Merge pull request #167 from jmorganca/decode-ggml
partial decode ggml bin for more info
2023-08-10 17:22:40 -07:00
Patrick Devine be989d89d1
Token auth (#314) 2023-08-10 11:34:25 -07:00
Jeffrey Morgan 040a5b9750 clean up cli flags 2023-08-10 09:27:03 -07:00
Michael Yang 6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Michael Yang fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
Bruce MacDonald 4b3507f036 embeddings endpoint
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
Bruce MacDonald 984c9c628c fix embeddings invalid values 2023-08-09 16:50:53 -04:00
Bruce MacDonald ac971c56d1 Update images.go 2023-08-09 11:31:54 -04:00
Bruce MacDonald 8228d166ce pr comments 2023-08-09 11:31:54 -04:00
Bruce MacDonald 907e6c56b3 unlock downloadu in case or requestDownload err 2023-08-09 11:31:54 -04:00
Bruce MacDonald 868e3b31c7 allow for concurrent pulls of the same files 2023-08-09 11:31:54 -04:00
Bruce MacDonald 09d8bf6730 fix build errors 2023-08-09 10:45:57 -04:00
Bruce MacDonald 7a5f3616fd
embed text document in modelfile 2023-08-09 10:26:19 -04:00
Jeffrey Morgan cff002b824 use content type application/x-ndjson for streaming responses 2023-08-08 21:38:10 -07:00
Bruce MacDonald 1bee2347be pr feedback
- defer closing llm on embedding
- do not override licenses
- remove debugging print line
- reformat model file docs
2023-08-08 17:01:37 -04:00
Jeffrey Morgan a027a7dd65 add 0.0.0.0 as an allowed origin by default
Fixes #282
2023-08-08 13:39:50 -07:00
Bruce MacDonald 884d78ceb3 allow embedding from model binary 2023-08-08 14:38:57 -04:00
Bruce MacDonald 21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00