Commit graph

160 commits

Author SHA1 Message Date
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald 811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Bruce MacDonald d99fa6ce0a
send empty messages on last chat response (#1530) 2023-12-18 14:23:38 -05:00
Patrick Devine 630518f0d9
Add unit test of API routes (#1528) 2023-12-14 16:47:40 -08:00
Bruce MacDonald 6ee8c80199
restore model load duration on generate response (#1524)
* restore model load duration on generate response

- set model load duration on generate and chat done response
- calculate createAt time when response created

* remove checkpoints predict opts

* Update routes.go
2023-12-14 12:15:50 -05:00
Patrick Devine d9e60f634b
add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
Patrick Devine 910e9401d0
Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan 7db5bcf73b fix go-staticcheck warning 2023-12-10 11:44:27 -05:00
Jeffrey Morgan fa2f095bd9 fix model name returned by /api/generate being different than the model name provided 2023-12-10 11:42:15 -05:00
Jeffrey Morgan 045b855db9 fix error on accumulating final chat response 2023-12-10 11:24:39 -05:00
Jeffrey Morgan 32064a0646 fix empty response when receiving runner error 2023-12-10 10:53:38 -05:00
Jeffrey Morgan 9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
Bruce MacDonald 7e9405fd07
fix: encode full previous prompt in context (#1424) 2023-12-08 16:53:51 -05:00
Michael Yang c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Michael Yang 5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
Michael Yang b9495ea162 load projectors 2023-12-05 14:36:12 -08:00
Michael Yang d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
refactor layer creation
2023-12-05 14:32:52 -08:00
Bruce MacDonald 195e3d9dbd
chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
Michael Yang 1ebdbd9694 server: add version handler 2023-12-05 09:36:01 -08:00
Jeffrey Morgan 00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Michael Yang a3737cbd33 use NewLayer for CreateBlobHandler 2023-12-04 16:59:23 -08:00
Bruce MacDonald 7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Bruce MacDonald 96122b7271
validate model tags on copy (#1323) 2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek c2e3b89176
fix: disable ':' in tag names (#1280)
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Bruce MacDonald 37d95157df
fix relative path on create (#1222) 2023-11-21 15:43:17 -05:00
Bruce MacDonald 43a726149d fix potentially inaccurate error message 2023-11-18 21:25:07 -05:00
Jeffrey Morgan bab9494176 add - separator to temp file created on ollama create 2023-11-18 09:39:52 -05:00
Michael Yang c6e6c8ee7e fix cross device rename 2023-11-17 15:22:17 -08:00
Michael Yang 54f92f01cb update docs 2023-11-15 15:28:15 -08:00
Michael Yang bc22d5a38b no blob response 2023-11-15 15:16:23 -08:00
Michael Yang 1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang 1552cee59f client create modelfile 2023-11-15 15:16:23 -08:00
Michael Yang 3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
Michael Yang b0d14ed51c refactor create model 2023-11-15 15:16:23 -08:00
Jeffrey Morgan 5cba29b9d6
JSON mode: add `"format" as an api parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
Bruce MacDonald ec2a31e9b3
support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
-add raw request to docs
2023-11-08 14:05:02 -08:00
Noah Gitsham 8ae8c9fa8c
Remove duplicate "install" in GPU support warning (#984) 2023-11-03 00:45:14 -07:00
Noah Gitsham f39daff461
Add missing "be" to GPU support warning message (#983) 2023-11-02 18:37:12 -07:00
Michael Yang 2c6189f4fe
Merge pull request #750 from jmorganca/mxyng/concurrent-uploads
concurrent uploads
2023-11-01 15:00:01 -07:00
Bruce MacDonald f9a4281124
clean up: remove server functions from client (#937) 2023-10-30 11:10:18 -04:00
Michael Yang 4e09aab8b9 concurrent uploads 2023-10-27 17:07:33 -07:00
Michael Yang 386169205c
update runtime options (#864) 2023-10-20 21:17:14 -04:00
Jeffrey Morgan 7ed5a39bc7 simpler check for model loading compatibility errors 2023-10-19 14:50:49 -04:00
Michael Yang e1c5be24e7 check json eof 2023-10-19 09:21:51 -07:00
Michael Yang 2ad8a074ac generate: set created_at
move the empty response so it's more visible
2023-10-19 09:21:51 -07:00
Michael Yang 7e547c6833 s/message/error/ 2023-10-19 09:21:04 -07:00
Michael Yang 689842b9ff request: bad request when model missing fields 2023-10-19 09:21:04 -07:00
Michael Yang a19d47642e models: rm workDir from CreateModel
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Yiorgis Gozadinos 8c6c2cbc8c When the .ollama folder is broken or there are no models return an empty list on /api/tags 2023-10-18 08:23:20 +02:00