Commit graph

3296 commits

Author SHA1 Message Date
Daniel Hiltgen 8072e205ff
Merge pull request #5447 from dhiltgen/fix_keepalive
Only set default keep_alive on initial model load
2024-07-03 15:34:38 -07:00
Daniel Hiltgen 955f2a4e03 Only set default keep_alive on initial model load
This change fixes the handling of keep_alive so that if client
request omits the setting, we only set this on initial load.  Once
the model is loaded, if new requests leave this unset, we'll keep
whatever keep_alive was there.
2024-07-03 15:29:56 -07:00
Daniel Hiltgen 3c75113e37 Prevent loading models larger than total memory
Users may not realize the siny new model they're trying to load
fits on their disk, but can't load into system+GPU memory.  Today
we crash, but with this fix, we'll give them a better error message
before even trying to load it.
2024-07-03 14:47:42 -07:00
Daniel Hiltgen ccd7785859
Merge pull request #5243 from dhiltgen/modelfile_use_mmap
Fix use_mmap for modefiles
2024-07-03 13:59:42 -07:00
royjhan 3b5a4a77f3
Return Correct Prompt Eval Count Regardless of Cache Prompt (#5371)
* openai compatibility

* Revert "openai compatibility"

This reverts commit d3f98a811e00fc497d889c8c45b0cfec5b64690c.

* remove erroneous subtraction of prompt cache
2024-07-03 13:46:23 -07:00
Daniel Hiltgen daed0634a9
Merge pull request #5467 from dhiltgen/bogus_cpu_mac_error
Fix corner cases on tmp cleaner on mac
2024-07-03 13:39:36 -07:00
Daniel Hiltgen 0d4dd707bc
Merge pull request #5465 from dhiltgen/better_cuda_logging
Better nvidia GPU discovery logging
2024-07-03 13:12:22 -07:00
Daniel Hiltgen 0e982bc1f4 Fix corner cases on tmp cleaner on mac
When ollama is running a long time, tmp cleaners can remove the
runners.  This tightens up a few corner cases on arm macs where
we failed with "server cpu not listed in available servers map[]"
2024-07-03 13:10:14 -07:00
Daniel Hiltgen 6298f49816 Fix clip model loading with unicode paths
On windows, if the model dir contained unicode characters
clip models would fail to load.  This fixes the file name
handling in clip.cpp to support utf16 on windows.
2024-07-03 12:46:36 -07:00
Daniel Hiltgen ef757da2c9 Better nvidia GPU discovery logging
Refine the way we log GPU discovery to improve the non-debug
output, and report more actionable log messages when possible
to help users troubleshoot on their own.
2024-07-03 10:50:40 -07:00
Michael Yang e5352297d9
Merge pull request #5448 from ollama/mxyng/fix-generate
use model template by default
2024-07-02 16:48:06 -07:00
Michael Yang 65a5040e09 fix generate template 2024-07-02 16:42:17 -07:00
royjhan d626b99b54
OpenAI: v1/completions compatibility (#5209)
* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author

Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Completions Endpoint

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* Rename function

* float types

* type cleanup

* cleaning

* more cleaning

* Extra test cases

* merge conflicts

* merge conflicts

* merge conflicts

* merge conflicts

* cleaning

* cleaning

---------

Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-02 16:01:45 -07:00
Michael Yang dddb58a38b
Merge pull request #5051 from ollama/mxyng/capabilities
add model capabilities
2024-07-02 14:26:07 -07:00
Michael Yang 400056e154
Merge pull request #5420 from ollama/mxyng/insecure-path
err on insecure path
2024-07-02 14:03:23 -07:00
Daniel Hiltgen d2f19024d0
Merge pull request #5442 from dhiltgen/concurrency_docs
Add windows radeon concurrency note
2024-07-02 12:47:47 -07:00
Daniel Hiltgen 69c04eecc4 Add windows radeon concurreny note 2024-07-02 12:46:14 -07:00
royjhan 996bb1b85e
OpenAI: /v1/models and /v1/models/{model} compatibility (#5007)
* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author

Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* OpenAI: /v1/models/{model} compatibility (#5028)

* Retrieve Model

* OpenAI Delete Model

* Retrieve Middleware

* Remove Delete from Branch

* Update Test

* Middleware Test File

* Function name

* Cleanup

* Test Update

* Test Update

---------

Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-02 11:50:56 -07:00
Daniel Hiltgen 422dcc3856
Merge pull request #5439 from dhiltgen/fix_centos_7_build
Switch ARM64 container image base to rocky 8
2024-07-02 11:01:15 -07:00
Daniel Hiltgen 020bd60ab2 Switch amd container image base to rocky 8
The centos 7 arm mirrors have disappeared due to the EOL 2 days
ago, and the vault sed workaround which works for x86 doesn't work for arm.
2024-07-02 10:34:47 -07:00
Daniel Hiltgen 8e277b72bb
Merge pull request #5438 from dhiltgen/fix_centos_7_build
Centos 7 EOL broke mirrors
2024-07-02 09:28:00 -07:00
Daniel Hiltgen 4f67b39d26 Centos 7 EOL broke mirrors
As of July 1st 2024: Could not resolve host: mirrorlist.centos.org
This is expected due to EOL dates.
2024-07-02 09:22:17 -07:00
Josh 2425281317
Merge pull request #5336 from ollama/jyan/from-errors
fix: trim spaces for FROM argument, don't trim inside of quotes
2024-07-01 16:32:46 -07:00
Josh 0403e9860e
Merge pull request #5421 from ollama/jyan/ver
fix: add unsupported architecture message for linux/windows
2024-07-01 16:32:14 -07:00
Josh Yan 33a65e3ba3 error 2024-07-01 16:04:13 -07:00
Michael Yang 88bcd79bb9 err on insecure path 2024-07-01 15:55:59 -07:00
Josh Yan 7e571f95f0 trimspace test case 2024-07-01 11:07:48 -07:00
Michael Yang da8e2a0447 use kvs to detect embedding models 2024-07-01 10:47:43 -07:00
Michael Yang a30915bde1 add capabilities 2024-07-01 10:47:43 -07:00
Michael Yang 58e3fff311 rename templates to template 2024-07-01 10:40:54 -07:00
Michael Yang 3f0b309ad4 remove ManifestV2 2024-07-01 10:40:54 -07:00
Daniel Hiltgen e70610ef06
Merge pull request #5410 from dhiltgen/ctx_cleanup
Fix case for NumCtx
2024-07-01 09:54:20 -07:00
Daniel Hiltgen dfded7e075
Merge pull request #5364 from dhiltgen/concurrency_docs
Document concurrent behavior and settings
2024-07-01 09:49:48 -07:00
Daniel Hiltgen 173b550438 Remove default auto from help message
This may confuse users thinking "auto" is an acceptable string - it must be numeric
2024-07-01 09:48:05 -07:00
Daniel Hiltgen cff3f44f4a Fix case for NumCtx 2024-07-01 09:43:59 -07:00
Josh Yan 26e4e66faf updated parsefile test 2024-07-01 09:43:49 -07:00
Daniel Hiltgen 97c9e11768 Switch use_mmap to a pointer type
This uses nil as undefined for a cleaner implementation.
2024-07-01 08:44:59 -07:00
Daniel Hiltgen 3518aaef33
Merge pull request #4218 from dhiltgen/auto_parallel
Enable concurrency by default
2024-07-01 08:32:29 -07:00
RAPID ARCHITECT 1963c00201
Update README.md (#5214)
* Update README.md

Added Mesop example to web & desktop

* Update README.md

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-06-30 22:00:57 -04:00
Eduard 27402cb7a2
Update gpu.md (#5382)
Runs fine on a NVIDIA GeForce GTX 1050 Ti
2024-06-30 21:48:51 -04:00
Jeffrey Morgan c1218199cf
Update api.md 2024-06-29 16:22:49 -07:00
Jeffrey Morgan 717f7229eb
Do not shift context for sliding window models (#5368)
* Do not shift context for sliding window models

* truncate prompt > 2/3 tokens

* only target gemma2
2024-06-28 19:39:31 -07:00
Daniel Hiltgen aae56abb7c Document concurrent behavior and settings 2024-06-28 13:15:57 -07:00
royjhan 5f034f5b63
Include Show Info in Interactive (#5342) 2024-06-28 13:15:52 -07:00
royjhan b910fa9010
Ollama Show: Check for Projector Type (#5307)
* Check exists projtype

* Maintain Ordering
2024-06-28 11:30:16 -07:00
royjhan 6d4219083c
Update docs (#5312) 2024-06-28 09:58:14 -07:00
Michael Yang 1ed4f521c4
Merge pull request #5340 from ollama/mxyng/mem
gemma2 graph
2024-06-27 14:26:49 -07:00
Michael Yang de2163dafd gemma2 graph 2024-06-27 13:34:52 -07:00
Josh Yan 9bd00041fa trim all params 2024-06-27 11:18:38 -07:00
Josh Yan 4e986a823c unquote, trimp space 2024-06-27 10:59:15 -07:00