Commit graph

3250 commits

Author SHA1 Message Date
Blake Mizerany dc77bbcfa4
server: fix json marshalling of downloadBlobPart (#6108) 2024-07-31 16:01:24 -07:00
Michael Yang c4c84b7a0d
Merge pull request #5196 from ollama/mxyng/messages-2
include modelfile messages
2024-07-31 10:18:17 -07:00
Michael Yang 5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
fix: environ lookup
2024-07-31 10:18:05 -07:00
Daniel Nguyen 71399aa682
Added BoltAI as a desktop UI for Ollama (#6096) 2024-07-31 08:44:58 -07:00
Jeffrey Morgan 463a8aa273
Create SECURITY.md 2024-07-30 21:01:12 -07:00
Michael 3579b4966a
Update README to include Firebase Genkit (#6083)
Firebase Genkit
2024-07-30 18:40:09 -07:00
Jeffrey Morgan 5d66578356
Update README.md
Better example for multi-modal input
2024-07-30 18:08:34 -07:00
jmorganca afa8d6e9d5 patch gemma support 2024-07-30 18:07:29 -07:00
royjhan 1b44d873e7
Add Metrics to api/embed response (#5709)
* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics
2024-07-30 13:12:21 -07:00
Daniel Hiltgen cef2c6054d
Merge pull request #5859 from dhiltgen/homogeneous_gpus
Prevent partial loading on mixed GPU brands
2024-07-30 11:06:42 -07:00
Daniel Hiltgen 345420998e Prevent partial loading on mixed GPU brands
In multi-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands.  This makes sure we find the set of GPU(s) that
best fits the partial load.
2024-07-30 11:00:55 -07:00
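The idea in the commit above — never let a partial load span GPU brands, and instead pick the single-brand set that fits best — can be sketched roughly as below. The `gpuInfo` type, field names, and the "most free memory wins" heuristic are all hypothetical stand-ins, not ollama's actual scheduler code:

```go
package main

import "fmt"

// gpuInfo is a hypothetical stand-in for the scheduler's GPU descriptor.
type gpuInfo struct {
	Brand   string // e.g. "nvidia", "amd"
	FreeMem uint64 // bytes of free VRAM
}

// bestHomogeneousSet groups GPUs by brand and returns the single-brand
// set with the most total free memory, so that a partial load is never
// spread across a mix of brands.
func bestHomogeneousSet(gpus []gpuInfo) []gpuInfo {
	byBrand := map[string][]gpuInfo{}
	for _, g := range gpus {
		byBrand[g.Brand] = append(byBrand[g.Brand], g)
	}
	var best []gpuInfo
	var bestFree uint64
	for _, set := range byBrand {
		var free uint64
		for _, g := range set {
			free += g.FreeMem
		}
		if free > bestFree {
			bestFree, best = free, set
		}
	}
	return best
}

func main() {
	gpus := []gpuInfo{
		{"nvidia", 8 << 30},
		{"amd", 16 << 30},
		{"amd", 16 << 30},
	}
	set := bestHomogeneousSet(gpus)
	fmt.Println(set[0].Brand, len(set)) // the two amd GPUs win here
}
```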
Kim Hallberg 0be8baad2b
Update and Fix example models (#6065)
* Update example models

* Remove unused README.md
2024-07-29 23:56:37 -07:00
Daniel Hiltgen 1a83581a8e
Merge pull request #5895 from dhiltgen/sched_faq
Better explain multi-gpu behavior
2024-07-29 14:25:41 -07:00
Daniel Hiltgen 37926eb991
Merge pull request #5927 from dhiltgen/high_cpu_count
Ensure amd gpu nodes are numerically sorted
2024-07-29 14:24:57 -07:00
Daniel Hiltgen 3d4634fdff
Merge pull request #5934 from dhiltgen/missing_cuda_repo
Report better error on cuda unsupported os/arch
2024-07-29 14:24:20 -07:00
royjhan 365431d406
return tool calls finish reason for openai (#5995)
* hot fix

* backend stream support

* clean up

* finish reason

* move to openai
2024-07-29 13:56:57 -07:00
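The OpenAI-compatibility rule behind this change — report `finish_reason: "tool_calls"` when the model emitted tool calls, and `"stop"` or `"length"` otherwise — can be sketched as follows. The function and its parameters are illustrative assumptions, not the actual signature in ollama's openai layer:

```go
package main

import "fmt"

// finishReason maps an internal completion state to an OpenAI-style
// finish_reason: "tool_calls" takes priority when tool calls are
// present, otherwise "length" or "stop" (a sketch of the rule).
func finishReason(done bool, reason string, toolCalls int) string {
	if !done {
		return "" // still streaming
	}
	if toolCalls > 0 {
		return "tool_calls"
	}
	if reason == "length" {
		return "length"
	}
	return "stop"
}

func main() {
	fmt.Println(finishReason(true, "stop", 2))  // tool_calls
	fmt.Println(finishReason(true, "stop", 0))  // stop
	fmt.Println(finishReason(false, "", 0))     // "" (not done yet)
}
```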
Daniel Hiltgen 161e12cecf
Merge pull request #5932 from dhiltgen/win_font
Explain font problems on windows 10
2024-07-29 13:40:24 -07:00
Jeffrey Morgan 46e6327e0f
api: add stringifier for Tool (#5891) 2024-07-29 13:35:16 -07:00
Jeffrey Morgan 68ee42f995
update llama.cpp submodule to 6eeaeba1 (#6039) 2024-07-29 13:20:26 -07:00
Ikko Eltociear Ashimine f26aef9a8b
docs: update README.md (#6059)
HuggingFace -> Hugging Face
2024-07-29 10:53:30 -07:00
Michael Yang 38d9036b59
Merge pull request #5992 from ollama/mxyng/save
fix: model save
2024-07-29 09:53:19 -07:00
Veit Heller 6f26e9322f
Fix typo in image docs (#6041) 2024-07-29 08:50:53 -07:00
Jeffrey Morgan 0e4d653687
update to llama3.1 elsewhere in repo (#6032) 2024-07-28 19:56:02 -07:00
Michael 2c01610616
update readme to llama3.1 (#5933) 2024-07-28 14:21:38 -07:00
Tibor Schmidt f3d7a481b7
feat: add support for min_p (resolve #1142) (#1825) 2024-07-27 14:37:40 -07:00
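min_p sampling, as it is commonly described, keeps only tokens whose probability is at least `min_p` times the top token's probability. A minimal sketch of that filter (not ollama's actual sampler code, whose names and layout differ):

```go
package main

import "fmt"

// minPFilter returns the indices of tokens whose probability is at
// least minP times the highest token probability. With minP = 0.1 and
// a top probability of 0.5, the cutoff is 0.05.
func minPFilter(probs []float64, minP float64) []int {
	var max float64
	for _, p := range probs {
		if p > max {
			max = p
		}
	}
	var keep []int
	for i, p := range probs {
		if p >= minP*max {
			keep = append(keep, i)
		}
	}
	return keep
}

func main() {
	probs := []float64{0.5, 0.2, 0.04, 0.01}
	fmt.Println(minPFilter(probs, 0.1)) // [0 1]
}
```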
Jeffrey Morgan f2a96c7d77
llm: keep patch for llama 3 rope factors (#5987) 2024-07-26 15:20:52 -07:00
Daniel Hiltgen e8a66680d1
Merge pull request #5705 from dhiltgen/win_errormode
Enable windows error dialog for subprocess
2024-07-26 14:49:34 -07:00
Michael Yang 079b2c3b03
Merge pull request #5999 from ollama/mxyng/fix-push
fix nil deref in auth.go
2024-07-26 14:28:34 -07:00
Blake Mizerany 750c1c55f7
server: fix race conditions during download (#5994)
This fixes various data races scattered throughout the download/pull
client, where download state was accessed concurrently.

This commit is mostly a hot-fix and will be replaced by a new client one
day soon.

Also, remove the unnecessary opts argument from downloadChunk.
2024-07-26 14:24:24 -07:00
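The usual shape of such a fix is to route every access to the shared download state through a mutex. A minimal sketch, with a hypothetical `blobDownload` type standing in for the real client's state (ollama's actual code is more involved):

```go
package main

import (
	"fmt"
	"sync"
)

// blobDownload sketches download state shared by several part-fetching
// goroutines; every read and write goes through mu, so concurrent
// updates no longer race.
type blobDownload struct {
	mu        sync.Mutex
	completed int64
}

func (b *blobDownload) add(n int64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.completed += n
}

func (b *blobDownload) progress() int64 {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.completed
}

func main() {
	var d blobDownload
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				d.add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(d.progress()) // 8000 with locking; racy without it
}
```

Running such code under `go test -race` (or `go run -race`) is how these races are typically confirmed fixed.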
Michael Yang a622c47bd3 fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
Michael Yang ec4c35fe99
Merge pull request #5512 from ollama/mxyng/detect-stop
autodetect stop parameters from template
2024-07-26 13:48:23 -07:00
Michael Yang a250c2cb13 display messages 2024-07-26 13:39:57 -07:00
Michael Yang 3d9de805b7 fix: model save
stop parameter is saved as a slice, which is incompatible with
Modelfile parsing
2024-07-26 13:23:06 -07:00
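The incompatibility above comes from the Modelfile format having no slice syntax: a multi-valued parameter like `stop` must be written as one `PARAMETER` line per value. A sketch of that rendering, assuming a simple map-based representation (not ollama's actual save path):

```go
package main

import (
	"fmt"
	"strings"
)

// modelfileParams renders parameters for a Modelfile. Slice-valued
// parameters such as "stop" become one PARAMETER line per element,
// since the parser accepts only repeated scalar lines.
func modelfileParams(params map[string]any) string {
	var b strings.Builder
	for k, v := range params {
		switch vv := v.(type) {
		case []string:
			for _, s := range vv {
				fmt.Fprintf(&b, "PARAMETER %s %s\n", k, s)
			}
		default:
			fmt.Fprintf(&b, "PARAMETER %s %v\n", k, vv)
		}
	}
	return b.String()
}

func main() {
	fmt.Print(modelfileParams(map[string]any{
		"stop": []string{"<|eot|>", "</s>"},
	}))
}
```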
Michael Yang 15af558423 include modelfile messages 2024-07-26 11:40:11 -07:00
Jeffrey Morgan f5e3939220
Update api.md (#5968) 2024-07-25 23:10:18 -04:00
Jeffrey Morgan ae27d9dcfd
Update openai.md 2024-07-25 20:27:33 -04:00
Michael Yang 37096790a7
Merge pull request #5552 from ollama/mxyng/messages-docs
docs
2024-07-25 16:26:19 -07:00
Michael Yang 997c903884
Update docs/template.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-25 16:23:40 -07:00
Blake Mizerany c8af3c2d96
server: reuse original download URL for images (#5962)
This changes the registry client to reuse the original download URL
it gets on the first redirect response for all subsequent requests,
preventing thundering herd issues when hot new LLMs are released.
2024-07-25 15:58:30 -07:00
Jeffrey Morgan 455e61170d
Update openai.md 2024-07-25 18:34:47 -04:00
royjhan 4de1370a9d
openai tools doc (#5617) 2024-07-25 18:34:06 -04:00
Jeffrey Morgan bbf8f102ee
Revert "llm(llama): pass rope factors (#5924)" (#5963)
This reverts commit bb46bbcf5e.
2024-07-25 18:24:55 -04:00
Daniel Hiltgen ce3c93b08f Report better error on cuda unsupported os/arch
If we detect an NVIDIA GPU, but nvidia doesn't support the os/arch,
this will report a better error for the user and point them to docs
to self-install the drivers if possible.
2024-07-24 17:09:20 -07:00
Daniel Hiltgen 6c2129d5d0 Explain font problems on windows 10 2024-07-24 15:22:00 -07:00
Daniel Hiltgen 7c2a157ca4 Ensure amd gpu nodes are numerically sorted
For systems that enumerate over 10 CPUs the default lexicographical
sort order interleaves CPUs and GPUs.
2024-07-24 13:43:26 -07:00
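The interleaving problem is the classic lexicographic-vs-numeric sort: as strings, "card10" sorts before "card2". Sorting by the numeric suffix restores device order. A sketch with hypothetical node names (the real code parses sysfs entries):

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// sortNodesNumerically orders node names like "card1", "card2",
// "card10" by their numeric suffix rather than lexicographically.
func sortNodesNumerically(nodes []string, prefix string) {
	sort.Slice(nodes, func(i, j int) bool {
		a, _ := strconv.Atoi(nodes[i][len(prefix):])
		b, _ := strconv.Atoi(nodes[j][len(prefix):])
		return a < b
	})
}

func main() {
	nodes := []string{"card10", "card2", "card1", "card11"}
	sort.Strings(nodes)
	fmt.Println(nodes) // lexicographic: [card1 card10 card11 card2]
	sortNodesNumerically(nodes, "card")
	fmt.Println(nodes) // numeric: [card1 card2 card10 card11]
}
```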
Michael Yang bb46bbcf5e
llm(llama): pass rope factors (#5924) 2024-07-24 16:05:59 -04:00
royjhan ac33aa7d37
Fix Embed Test Flakes (#5893)
* float cmp

* increase tolerance
2024-07-24 11:15:46 -07:00
Daniel Hiltgen 830fdd2715 Better explain multi-gpu behavior 2024-07-23 15:16:38 -07:00
Ajay Chintala a6cd8f6169
Update README.md to add LLMStack integration (#5799) 2024-07-23 14:40:23 -04:00
Daniel Hiltgen c78089263a
Merge pull request #5864 from dhiltgen/bump_go
Bump Go patch version
2024-07-22 16:34:18 -07:00