Commit graph

693 commits

Author SHA1 Message Date
Jesse Gross 1829fb61bd manifest: Fix crash on startup when trying to clean up unused files (#5840)
Currently if the config field is missing in the manifest file (or
corrupted), Ollama will crash when it tries to read it. This can
happen at startup or when pulling new models.

This data is mostly just used for showing model information so we
can be tolerant of it not being present - it is not required to
run the models. Besides avoiding crashing, this also gives us the
ability to restructure the config in the future by pulling it
into the main manifest file.
2024-08-07 10:30:44 -07:00
Jesse Gross 685a53534b manifest: Don't prune layers if we can't open a manifest file
If there is an error when opening a manifest file (corrupted, permission denied, etc.)
then the referenced layers will not be included in the list of active
layers. This causes them to be deleted when pruning happens at startup
or a model is pulled.

In such a situation, we should prefer to preserve data in the hopes that
it can be recovered rather than being agressive about deletion.
2024-08-06 23:11:19 -07:00
Daniel Hiltgen fc85f50a2b Ensure sparse files on windows during download
The file.Truncate call on windows will write the whole file
unless you set the sparse flag, leading to heavy I/O at the
beginning of download.  This should improve our
I/O behavior on windows and put less stress on the users disk.
2024-08-06 10:58:08 -07:00
Michael Yang a091fadfda use testing tempdirs 2024-08-02 16:04:06 -07:00
Michael Yang b732beba6a lint 2024-08-01 17:06:06 -07:00
Michael Yang ff7c9060ec
Merge pull request #6115 from slouffka/fix-context
Fix context in /api/generate grows too much (#5980).
2024-08-01 15:13:59 -07:00
Michael Yang 0ff42e84b0
Merge pull request #4756 from ollama/mxyng/convert2
refactor convert
2024-08-01 14:16:30 -07:00
Vyacheslav Moskalev 8a9f946ca7 Refactor and format code. 2024-08-02 03:50:05 +07:00
Vyacheslav Moskalev 3b5210548e Refactor code. Remove extra variable. 2024-08-01 19:56:15 +07:00
Vyacheslav Moskalev b0c216584c Better types and naming closer to style. 2024-08-01 19:43:44 +07:00
Vyacheslav Moskalev 49a5483139 Change the order of context and prompt. 2024-08-01 19:25:56 +07:00
Vyacheslav Moskalev 6bc5c13758 Fix extra context concatenation in generate handler (#5980). 2024-08-01 15:45:58 +07:00
Michael Yang d87b4a488e fix modelfile message quotes 2024-07-31 16:52:09 -07:00
Blake Mizerany dc77bbcfa4
server: fix json marshalling of downloadBlobPart (#6108) 2024-07-31 16:01:24 -07:00
Michael Yang eafc607abb convert: only extract large files 2024-07-31 15:58:55 -07:00
Michael Yang df993fa37b comments 2024-07-31 15:58:55 -07:00
Michael Yang 5e9db9fb0b refactor convert 2024-07-31 15:58:33 -07:00
Michael Yang c4c84b7a0d
Merge pull request #5196 from ollama/mxyng/messages-2
include modelfile messages
2024-07-31 10:18:17 -07:00
Michael Yang 5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
fix: environ lookup
2024-07-31 10:18:05 -07:00
royjhan 1b44d873e7
Add Metrics to api\embed response (#5709)
* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics
2024-07-30 13:12:21 -07:00
Daniel Hiltgen 345420998e Prevent partial loading on mixed GPU brands
In mult-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands.  This makes sure we find the set of GPU(s) that
best fit for the partial load.
2024-07-30 11:00:55 -07:00
Michael Yang 079b2c3b03
Merge pull request #5999 from ollama/mxyng/fix-push
fix nil deref in auth.go
2024-07-26 14:28:34 -07:00
Blake Mizerany 750c1c55f7
server: fix race conditions during download (#5994)
This fixes various data races scattered throughout the download/pull
client where the client was accessing the download state concurrently.

This commit is mostly a hot-fix and will be replaced by a new client one
day soon.

Also, remove the unnecessary opts argument from downloadChunk.
2024-07-26 14:24:24 -07:00
Michael Yang a622c47bd3 fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
Michael Yang ec4c35fe99
Merge pull request #5512 from ollama/mxyng/detect-stop
autodetect stop parameters from template
2024-07-26 13:48:23 -07:00
Michael Yang 15af558423 include modelfile messages 2024-07-26 11:40:11 -07:00
Blake Mizerany c8af3c2d96
server: reuse original download URL for images (#5962)
This changes the registry client to reuse the original download URL
it gets on the first redirect response for all subsequent requests,
preventing thundering herd issues when hot new LLMs are released.
2024-07-25 15:58:30 -07:00
Josh db0968f30c
fix dupe err message (#5857) 2024-07-22 15:48:15 -07:00
Michael Yang 85d9d73a72 comments 2024-07-22 11:49:03 -07:00
Michael Yang 1954ec5917 uint64 2024-07-22 11:49:02 -07:00
Michael Yang 0f1910129f int 2024-07-22 11:30:07 -07:00
Michael Yang 8570c1c0ef keepalive 2024-07-22 11:27:22 -07:00
Michael Yang 55cd3ddcca bool 2024-07-22 11:27:21 -07:00
Michael Yang 66fe77f084 models 2024-07-22 11:26:12 -07:00
Michael Yang d1a5227cad origins 2024-07-22 11:25:30 -07:00
Michael Yang 35b89b2eab rfc: dynamic environ lookup 2024-07-22 11:25:30 -07:00
Jeffrey Morgan b3e5491e41
server: collect nested tool call objects when parsing (#5824) 2024-07-22 12:38:03 -04:00
Jeffrey Morgan 80ee9b5e47
Remove out of space test temporarily (#5825) 2024-07-21 00:22:11 -04:00
Daniel Hiltgen 06e5d74e34
Merge pull request #5506 from dhiltgen/sched_tests
Refine scheduler unit tests for reliability
2024-07-20 15:48:39 -07:00
Jeffrey Morgan 69a2d4ccff
Fix generate test flakyness (#5804) 2024-07-19 19:11:25 -07:00
Josh e8b954c646
server: validate template (#5734)
add template validation to modelfile
2024-07-19 15:24:29 -07:00
Michael Yang 43606d6d6a fix parsing tool calls 2024-07-18 12:08:11 -07:00
Jeffrey Morgan 70b1010fa5
server: check for empty tools array too (#5779) 2024-07-18 11:44:57 -07:00
Jeffrey Morgan 319fb1ce03
server: only parse tool calls if tools are provided (#5771)
* server: only parse tool calls if tools are provided

* still set `resp.Message.Content`
2024-07-18 08:50:23 -07:00
Michael Yang b255445557
marshal json automatically for some template values (#5758) 2024-07-17 15:35:11 -07:00
Michael Yang 5fd6988126 parse tool call as individual objects 2024-07-17 11:19:04 -07:00
Michael Yang c279f96371 remove ToolCall from GenerateResponse 2024-07-16 15:22:49 -07:00
Michael Yang 499e87c9ba
Merge pull request #5730 from ollama/mxyng/cleanup
remove unneeded tool calls
2024-07-16 14:42:13 -07:00
Michael Yang d290e87513 add suffix support to generate endpoint
this change is triggered by the presence of "suffix", particularly
useful for code completion tasks
2024-07-16 14:31:35 -07:00
Michael Yang 5a83f79afd remove unneeded tool calls 2024-07-16 13:48:45 -07:00