Commit graph

2340 commits

Author SHA1 Message Date
Eli Bendersky ad90b9ab3d
api: start adding documentation to package api (#2878)
* api: start adding documentation to package api

Updates #2840

* Fix lint typo report
2024-04-10 13:31:55 -04:00
Eli Bendersky 4340f8eba4
examples: start adding Go examples using api/ (#2879)
We can have the same examples as e.g. https://github.com/ollama/ollama-python/tree/main/examples
here. Using consistent naming and renaming the existing example to have -http-
since it uses direct HTTP requests rather than api/

Updates #2840
2024-04-10 13:26:45 -04:00
Daniel Hiltgen 4c7db6b7e9
Merge pull request #3566 from dhiltgen/more_time
Handle very slow model loads
2024-04-09 16:53:49 -07:00
Michael Yang c03f0e3c3d
Merge pull request #3565 from ollama/mxyng/rope
fix: rope
2024-04-09 16:36:55 -07:00
Daniel Hiltgen c5ff443b9f Handle very slow model loads
During testing, we're seeing some models take over 3 minutes.
2024-04-09 16:35:10 -07:00
Michael Yang 01114b4526 fix: rope 2024-04-09 16:15:24 -07:00
Blake Mizerany 1524f323a3
Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564) 2024-04-09 15:57:45 -07:00
Blake Mizerany fccf3eecaa
build.go: introduce a friendlier way to build Ollama (#3548)
This commit introduces a more friendly way to build Ollama dependencies
and the binary without abusing `go generate` and removing the
unnecessary extra steps it brings with it.

This script also provides nicer feedback to the user about what is
happening during the build process.

At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
Michael Yang c77d45d836
Merge pull request #3506 from ollama/mxyng/quantize-redux
cgo quantize
2024-04-09 12:32:53 -07:00
Jeffrey Morgan 5ec12cec6c
update llama.cpp submodule to 1b67731 (#3561) 2024-04-09 15:10:17 -04:00
Michael Yang d9578d2bad
Merge pull request #3559 from ollama/mxyng/ci
ci: use go-version-file
2024-04-09 11:03:18 -07:00
Michael Yang cb8352d6b4 ci: use go-version-file 2024-04-09 09:50:12 -07:00
Alex Mavrogiannis fc6558f47f
Correct directory reference in macapp/README (#3555) 2024-04-09 09:48:46 -04:00
Michael Yang 9502e5661f cgo quantize 2024-04-08 15:31:08 -07:00
Michael Yang e1c9a2a00f no blob create if already exists 2024-04-08 15:09:48 -07:00
writinwaters 1341ee1b56
Update README.md (#3539)
RAGFlow now supports integration with Ollama.
2024-04-08 10:58:14 -04:00
Jeffrey Morgan 63efa075a0
update generate scripts with new LLAMA_CUDA variable, set HIP_PLATFORM to avoid compiler errors (#3528) 2024-04-07 19:29:51 -04:00
Thomas Vitale cb03fc9571
Docs: Remove wrong parameter for Chat Completion (#3515)
Fixes gh-3514

Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
2024-04-06 09:08:35 -07:00
Michael Yang a5ec9cfc0f
Merge pull request #3508 from ollama/mxyng/rope 2024-04-05 18:46:06 -07:00
Michael Yang be517e491c no rope parameters 2024-04-05 18:05:27 -07:00
Michael Yang fc8e108642
Merge pull request #3496 from ollama/mxyng/cmd-r-graph
add command-r graph estimate
2024-04-05 12:26:21 -07:00
Daniel Hiltgen c5d5c4a96c
Merge pull request #3491 from dhiltgen/context_bust_test
Add test case for context exhaustion
2024-04-04 16:20:20 -07:00
Daniel Hiltgen dfe330fa1c
Merge pull request #3488 from mofanke/fix-windows-dll-compress
fix dll compress in windows building
2024-04-04 16:12:13 -07:00
Michael Yang 01f77ae25d add command-r graph estimate 2024-04-04 14:07:24 -07:00
Daniel Hiltgen 483b81a863
Merge pull request #3494 from dhiltgen/ci_release
Fail fast if mingw missing on windows
2024-04-04 10:15:40 -07:00
Daniel Hiltgen 36bd967722 Fail fast if mingw missing on windows 2024-04-04 09:51:26 -07:00
Jeffrey Morgan b0e7d35db8
use an older version of the mac os sdk in release (#3484) 2024-04-04 09:48:54 -07:00
Daniel Hiltgen aeb1fb5192 Add test case for context exhaustion
Confirmed this fails on 0.1.30 with known regression
but passes on main
2024-04-04 07:42:17 -07:00
Daniel Hiltgen a2e60ebcaf
Merge pull request #3490 from dhiltgen/ci_fixes
CI missing archive
2024-04-04 07:24:24 -07:00
Daniel Hiltgen 883ec4d1ef CI missing archive 2024-04-04 07:23:27 -07:00
mofanke 4de0126719 fix dll compress in windows building 2024-04-04 21:27:33 +08:00
Daniel Hiltgen 9768e2dc75
Merge pull request #3481 from dhiltgen/ci_fixes
CI subprocess path fix
2024-04-03 19:29:09 -07:00
Daniel Hiltgen 08600d5bec CI subprocess path fix 2024-04-03 19:12:53 -07:00
Daniel Hiltgen a624e672d2
Merge pull request #3479 from dhiltgen/ci_fixes
Fix CI release glitches
2024-04-03 18:42:27 -07:00
Daniel Hiltgen e4a7e5b2ca Fix CI release glitches
The subprocess change moved the build directory
arm64 builds weren't setting cross-compilation flags when building on x86
2024-04-03 16:41:40 -07:00
Michael Yang a0a15cfd5b
Merge pull request #3463 from ollama/mxyng/graph-estimate
update graph size estimate
2024-04-03 14:27:30 -07:00
Michael Yang 12e923e158 update graph size estimate 2024-04-03 13:34:12 -07:00
Jeffrey Morgan cd135317d2
Fix macOS builds on older SDKs (#3467) 2024-04-03 10:45:54 -07:00
Michael Yang 4f895d633f
Merge pull request #3466 from ollama/mxyng/head-kv
default head_kv to 1
2024-04-03 10:41:00 -07:00
Blake Mizerany 7d05a6ee8f
cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)
This also moves the checkServerHeartbeat call out of the "RunE" Cobra
stuff (that's the only word I have for that) to on-site where it's after
the check for OLLAMA_MODELS, which allows the helpful error message to
be printed before the server heartbeat check. This also arguably makes
the code more readable without the magic/superfluous "pre" function
caller.
2024-04-02 22:11:13 -07:00
Daniel Hiltgen 464d817824
Merge pull request #3464 from dhiltgen/subprocess
Fix numgpu opt miscomparison
2024-04-02 20:10:17 -07:00
Pier Francesco Contino 531324a9be
feat: add OLLAMA_DEBUG in ollama server help message (#3461)
Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>
2024-04-02 18:20:03 -07:00
Daniel Hiltgen 6589eb8a8c Revert options as a ref in the server 2024-04-02 16:44:10 -07:00
Michael Yang 90f071c658 default head_kv to 1 2024-04-02 16:37:59 -07:00
Michael Yang a039e383cd
Merge pull request #3465 from ollama/mxyng/fix-metal
fix metal gpu
2024-04-02 16:29:58 -07:00
Michael Yang 80163ebcb5 fix metal gpu 2024-04-02 16:06:45 -07:00
Daniel Hiltgen a57818d93e
Merge pull request #3343 from dhiltgen/bump_more2
Bump llama.cpp to b2581
2024-04-02 15:08:26 -07:00
Daniel Hiltgen 841adda157 Fix windows lint CI flakiness 2024-04-02 12:22:16 -07:00
Daniel Hiltgen 0035e31af8 Bump to b2581 2024-04-02 11:53:07 -07:00
Daniel Hiltgen c863c6a96d
Merge pull request #3218 from dhiltgen/subprocess
Switch back to subprocessing for llama.cpp
2024-04-02 10:49:44 -07:00