Commit graph

2925 commits

Author SHA1 Message Date
Michael Yang 620d5c569e fix parsing big endian gguf 2024-06-08 12:35:26 -07:00
Michael Yang b9ce7bf75e update import.md 2024-06-07 16:45:15 -07:00
Daniel Hiltgen cddc63381c
Merge pull request #4909 from dhiltgen/oneapi_disable
Add ability to skip oneapi generate
2024-06-07 14:07:15 -07:00
Michael Yang 385a32ecb5
Merge pull request #4910 from ollama/mxyng/detect-chat-template
fix create model when template detection errors
2024-06-07 11:07:39 -07:00
Michael Yang 030e765e76 fix create model when template detection errors 2024-06-07 10:51:35 -07:00
Daniel Hiltgen ab8c929e20 Add ability to skip oneapi generate
This follows the same pattern for cuda and rocm to allow
disabling the build even when we detect the dependent libraries
2024-06-07 08:32:49 -07:00
Jeffrey Morgan ce0dc33cb8
llm: patch to fix qwen 2 temporarily on nvidia (#4897) 2024-06-06 23:14:33 -07:00
Michael Yang 78f81fc0e5
Merge pull request #4800 from ollama/mxyng/detect-chat-template
detect chat template from KV
2024-06-06 16:17:18 -07:00
Michael Yang 9b6c2e6eb6 detect chat template from KV 2024-06-06 16:03:47 -07:00
royjhan 1a29e9a879
API app/browser access (#4879)
* API app/browser access

* Add tauri (resolves #2291, #4791, #3799, #4388)
2024-06-06 15:19:03 -07:00
royjhan 4bf1da4944
Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842)
* Remove false time fields

* Struct Separation for List and Process

* Remove Marshaler
2024-06-06 10:11:45 -07:00
Blake Mizerany de5beb06b3 server: skip blob verification for already verified blobs 2024-06-05 16:39:11 -07:00
Sam 98e65929dc
docs(tools): add gollama (#4829) 2024-06-05 14:13:39 -07:00
Michael Yang 66ab48772f proper utf16 support 2024-06-05 13:11:50 -07:00
Michael Yang 22fcf8f7de
Merge pull request #3737 from ollama/mxyng/modelname-4
update create handler to use model.Name
2024-06-05 12:05:05 -07:00
royjhan 28c7813ac4
API PS Documentation (#4822)
* API PS Documentation
2024-06-05 11:06:53 -07:00
Kartikeya Mishra 1d8616d30f
docs: update to add LLocal.in to web & desktop integrations (#4719) 2024-06-04 14:43:59 -07:00
Michael Yang d61ef8b954 update create handler to use model.Name 2024-06-04 13:28:25 -07:00
Michael Yang 89d9900152
Merge pull request #4570 from ollama/mxyng/slices
lint some of the things
2024-06-04 13:27:05 -07:00
Michael 4a048715b6
local wording was confusing people
local wording was confusing people -- Ollama runs on cloud providers
2024-06-04 13:25:25 -07:00
Michael Yang 6297f85606 gofmt, goimports 2024-06-04 13:20:24 -07:00
Michael Yang ed56428dd7 warn on intrange, usestdlibvars 2024-06-04 11:52:48 -07:00
Michael Yang ad40b92b6a disable intrange 2024-06-04 11:35:30 -07:00
Michael Yang 8ce4032e72 more lint 2024-06-04 11:13:30 -07:00
Michael Yang 42660466f8 no usestdlibvars 2024-06-04 11:13:30 -07:00
Michael Yang e919f6811f lint windows 2024-06-04 11:13:30 -07:00
Michael Yang bf7edb0d5d lint linux 2024-06-04 11:13:30 -07:00
Michael Yang f38353d6b9 stdin.fd 2024-06-04 11:13:30 -07:00
Michael Yang 201d853fdf nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang e40145a39d lint 2024-06-04 11:13:30 -07:00
Michael Yang c895a7d13f some gocritic 2024-06-04 11:13:30 -07:00
Michael Yang dad7a987ae nosprintfhostport 2024-06-04 11:13:30 -07:00
Michael Yang 8ffb51749f nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang 55f6eba049 gofmt 2024-06-04 11:13:30 -07:00
Michael Yang 04f3c12bb7 replace x/exp/slices with slices 2024-06-04 11:13:30 -07:00
Shubham 60323e0805
add embed model command and fix question invoke (#4766)
* add embed model command and fix question invoke

* Update docs/tutorials/langchainpy.md

Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>

* Update docs/tutorials/langchainpy.md

---------

Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-06-03 22:20:48 -07:00
Jeffrey Morgan d4a86102fd
update welcome prompt in windows to llama3 (#4779) 2024-06-01 21:05:51 -07:00
Jeffrey Morgan 476fb8e892
Limit GPU lib search for now (#4777)
* fix oneapi errors on windows 10
2024-06-01 19:24:33 -07:00
Michael Yang 829ff87bd1
revert tokenize ffi (#4761)
* Revert "use `int32_t` for call to tokenize (#4738)"

This reverts commit 763bb65dbb.

* Revert "vocab only"

This reverts commit bf54c845e9.

* Revert "use ffi for tokenizing/detokenizing"

This reverts commit 26a00a0410.
2024-05-31 18:54:21 -07:00
Josh f6b622c4b3
Merge pull request #4733 from ollama/jyan/isvalidname
added IsValidNamespace function
2024-05-31 14:08:45 -07:00
Josh Yan 2e4da8eec2 added tests for IsValidNamespace 2024-05-31 11:48:07 -07:00
Jeffrey Morgan 763bb65dbb
use int32_t for call to tokenize (#4738)
* use `int32_t` for call to tokenize

* variable naming

* cleanup

* fix crash
2024-05-30 21:43:30 -07:00
Jeffrey Morgan 7ca9605f54
speed up tests by only building static lib (#4740) 2024-05-30 21:43:15 -07:00
Michael Yang eb2c443a79
Merge pull request #4736 from ollama/mxyng/vocab-only
vocab only for tokenize
2024-05-30 17:21:00 -07:00
Michael Yang 278e25ea44
Merge pull request #4737 from ollama/mxyng/less-generate
only generate on relevant changes
2024-05-30 17:17:50 -07:00
Jeffrey Morgan a50a87a7b8
partial offloading: allow flash attention and disable mmap (#4734)
* partial offloading: allow flash attention and disable mmap

* allow mmap with num_gpu=0
2024-05-30 16:58:01 -07:00
Michael Yang 98085015d5 only generate on relevant changes 2024-05-30 16:54:11 -07:00
Michael Yang bf54c845e9 vocab only 2024-05-30 16:49:28 -07:00
Josh Yan c365f195a8 directly use isvalidpart 2024-05-30 16:40:04 -07:00
Josh e91d0ef737
Merge pull request #4728 from ollama/jyan/japanese
fixed japanese characters deleted at end of line
2024-05-30 16:25:12 -07:00