Commit graph

2408 commits

Author SHA1 Message Date
Daniel Hiltgen 0d651478e4
Merge pull request #3056 from dhiltgen/rocm_link_clash
Avoid rocm runner and dependency clash
2024-03-11 09:48:48 -07:00
Michael Yang 9ea492f1ce convert: fix shape 2024-03-11 09:41:01 -07:00
Daniel Hiltgen bc13da2bfe Avoid rocm runner and dependency clash
Putting the rocm symlink next to the runners is risky.  This moves
the payloads into a subdir to avoid potential clashes.
2024-03-11 09:33:22 -07:00
Jeffrey Morgan 41b00b9856 fix 03-locale.diff 2024-03-10 16:21:05 -07:00
Daniel Hiltgen c2a8ed48e7
Merge pull request #3048 from dhiltgen/harden_rocm_deps
Harden for deps file being empty (or short)
2024-03-10 15:17:22 -07:00
Daniel Hiltgen 3dc1bb6a35 Harden for deps file being empty (or short) 2024-03-10 14:45:38 -07:00
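The hardening above guards against an empty or truncated deps file instead of assuming it has content. A minimal Go sketch of the idea, with a hypothetical `readDeps` helper (not ollama's actual code):

```go
package main

import (
	"bufio"
	"bytes"
	"strings"
)

// readDeps parses a dependency manifest, tolerating an empty or
// truncated file instead of indexing past the end of a short read.
// (Hypothetical helper; illustrative only.)
func readDeps(data []byte) []string {
	var deps []string
	sc := bufio.NewScanner(bytes.NewReader(data))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // blank or missing lines simply yield no deps
		}
		deps = append(deps, line)
	}
	return deps
}

func main() {
	println(len(readDeps(nil)))                       // 0, no crash
	println(len(readDeps([]byte("librocblas.so\n")))) // 1
}
```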
Daniel Hiltgen 7865a6996a
Merge pull request #3046 from dhiltgen/rocm_search_paths
Add ollama executable peer dir for rocm
2024-03-10 12:30:56 -07:00
Daniel Hiltgen 00ec269321 Add ollama executable peer dir for rocm
This allows people who package up ollama on their own to place
the rocm dependencies in a peer directory to the ollama executable,
much like our windows install flow.
2024-03-10 12:16:30 -07:00
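The peer-directory lookup described above can be sketched as follows; the `rocmPeerDir` name is hypothetical and shown only to illustrate resolving a directory next to the running executable:

```go
package main

import (
	"os"
	"path/filepath"
)

// rocmPeerDir returns a "rocm" directory sitting next to the given
// executable path, mirroring the peer-dir layout described above.
// (Illustrative name; not ollama's actual function.)
func rocmPeerDir(exePath string) string {
	return filepath.Join(filepath.Dir(exePath), "rocm")
}

func main() {
	exe, err := os.Executable()
	if err == nil {
		// Packagers can drop the ROCm libraries here.
		println(rocmPeerDir(exe))
	}
}
```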
Jeffrey Morgan 908005d90b
patch: use default locale in wpm tokenizer (#3034) 2024-03-09 21:12:12 -08:00
Jeffrey Morgan cdf65e793f only copy deps for amd64 in build_linux.sh 2024-03-09 17:55:22 -08:00
Daniel Hiltgen 82ca694d68
Rename ROCm deps file to avoid confusion (#3025) 2024-03-09 17:48:38 -08:00
Jeffrey Morgan 5017a15bcb add macapp to .dockerignore 2024-03-09 16:07:06 -08:00
Jeffrey Morgan e11668aa07 add bundle_metal and cleanup_metal functions to gen_darwin.sh 2024-03-09 16:04:57 -08:00
Jeffrey Morgan 0bd0f4a29c tidy cleanup logs 2024-03-09 15:56:48 -08:00
Jeffrey Morgan 1ffb1e2874
update llama.cpp submodule to 77d1ac7 (#3030) 2024-03-09 15:55:34 -08:00
Daniel Hiltgen 0a7844413c
Merge pull request #3026 from dhiltgen/win_rocm_docs
Doc how to set up ROCm builds on windows
2024-03-09 14:17:19 -08:00
Jeffrey Morgan f9cd55c70b disable gpu for certain model architectures and fix divide-by-zero on memory estimation 2024-03-09 12:51:38 -08:00
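The divide-by-zero fix mentioned above amounts to guarding the division in the memory estimator. A hedged sketch with illustrative names and numbers (not ollama's real estimator):

```go
package main

// layersThatFit estimates how many model layers fit in free VRAM,
// guarding against a zero per-layer size so the division can never
// fault. (Hypothetical helper; the real estimation is more involved.)
func layersThatFit(freeBytes, bytesPerLayer uint64) uint64 {
	if bytesPerLayer == 0 {
		return 0 // unknown layer size: offload nothing rather than crash
	}
	return freeBytes / bytesPerLayer
}

func main() {
	println(layersThatFit(8<<30, 0))     // 0, no divide-by-zero
	println(layersThatFit(8<<30, 1<<28)) // 32
}
```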
Daniel Hiltgen 0fdebb34a9 Doc how to set up ROCm builds on windows 2024-03-09 11:29:45 -08:00
Daniel Hiltgen ac64cd4ef9
Merge pull request #3008 from dhiltgen/no_more_idempotent
Finish unwinding idempotent payload logic
2024-03-09 09:13:24 -08:00
Daniel Hiltgen 4a5c9b8035 Finish unwinding idempotent payload logic
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent.  This finishes switching to always extracting
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
Jeffrey Morgan efe5617b64
update llama.cpp submodule to c2101a2 (#3020) 2024-03-09 00:44:50 -08:00
Jeffrey Morgan 5b3fad9636 separate out isLocalIP 2024-03-09 00:22:08 -08:00
Jeffrey Morgan bfec2c6e10 simplify host checks 2024-03-08 23:29:53 -08:00
Jeffrey Morgan 5c143af726 add additional allowed hosts 2024-03-08 23:23:59 -08:00
Jeffrey Morgan 6c0af2599e
Update docs README.md and table of contents 2024-03-08 22:45:11 -08:00
Jeffrey Morgan fc8c044584
add allowed host middleware and remove workDir middleware (#3018) 2024-03-08 22:23:47 -08:00
Michael Yang ecc133d843
Merge pull request #3014 from ollama/mxyng/decode-ggla 2024-03-08 16:14:53 -08:00
Michael Yang 76bdebbadf decode ggla 2024-03-08 15:46:25 -08:00
Michael Yang 18979ad4a1 convert: fix default shape 2024-03-08 15:42:48 -08:00
Michael Yang 8e0ef931d8
Merge pull request #2990 from ollama/mxyng/default-term-size
fix: default terminal width, height
2024-03-08 15:20:54 -08:00
Daniel Hiltgen 280da44522
Merge pull request #2988 from dhiltgen/rocm_docs
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
Bruce MacDonald 0cebc79cba
fix: allow importing a model from name reference (#3005) 2024-03-08 12:27:47 -05:00
Jeffrey Morgan 0e4669b04f
update llama.cpp submodule to 6cdabe6 (#2999) 2024-03-08 00:26:20 -08:00
Jeffrey Morgan b886bec3f9
Update api.md 2024-03-07 23:27:51 -08:00
Jeffrey Morgan fc06205971
Revert "adjust download and upload concurrency based on available bandwidth" (#2995) 2024-03-07 18:10:16 -08:00
Blake Mizerany 2ada81e068
cmd: tighten up env var usage sections (#2962)
Also, document OLLAMA_HOST client semantics per command that honors it.
This looks nicer than having a general purpose environment variable
section in the root usage, which was showing up after the "additional help
topics" section output by Cobra's default template.

It was decided this was easier to work with than using a custom template
for Cobra right now.
2024-03-07 13:57:07 -08:00
Michael Yang b1e74d4fda default terminal width, height 2024-03-07 11:35:42 -08:00
Michael Yang f678f5c5c3
Merge pull request #2991 from ollama/mxyng/fix-ci
fix ci
2024-03-07 11:35:06 -08:00
Michael Yang 2cb74e23fb fix ci 2024-03-07 11:33:49 -08:00
Daniel Hiltgen 69f0227813 Refined ROCm troubleshooting docs 2024-03-07 11:22:37 -08:00
Daniel Hiltgen 3c8df3808b
Merge pull request #2885 from dhiltgen/rocm_v6_only
Revamp ROCm support
2024-03-07 10:51:00 -08:00
Michael Yang 7d564835c2
Merge pull request #2985 from ollama/rm-empty-examples
remove empty examples
2024-03-07 10:49:40 -08:00
Michael Yang 72431031d9 no ci test on docs, examples 2024-03-07 10:44:48 -08:00
Michael Yang 6041abb5b2 remove empty examples 2024-03-07 10:40:32 -08:00
Daniel Hiltgen 6c5ccb11f9 Revamp ROCm support
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, which defaults to `~/.ollama`.  The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCm's tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on linux.  The linux install script is now smart: it
detects the presence of AMD GPUs, checks whether rocm v6 is already
present, and if not, downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
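The Linux discovery step described above checks each discovered GPU's gfx target against what the bundled ROCm build supports before committing to GPU inference. A simplified Go sketch; the support list here is illustrative, not ollama's actual support matrix:

```go
package main

import "strings"

// supportedGFX is an illustrative stand-in for the set of gfx targets
// the bundled ROCm build can handle.
var supportedGFX = map[string]bool{
	"gfx900": true, "gfx1030": true, "gfx1100": true,
}

// rocmUsable reports whether a sysfs-style target string such as
// "gfx1030" (possibly carrying feature suffixes like "gfx900:xnack-")
// is in the supported set; unsupported GPUs degrade to CPU instead of
// letting llama.cpp+rocm assert.
func rocmUsable(target string) bool {
	base, _, _ := strings.Cut(target, ":")
	return supportedGFX[base]
}

func main() {
	println(rocmUsable("gfx1030")) // true
	println(rocmUsable("gfx803"))  // false -> fall back to CPU
}
```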
Michael Yang 2e20110e50
Merge pull request #2221 from ollama/mxyng/up-down-ccy
adjust download and upload concurrency based on available bandwidth
2024-03-07 09:27:33 -08:00
Daniel Hiltgen 82ddc3e441
Merge pull request #2964 from dhiltgen/mem_limit_var
Allow setting max vram for workarounds
2024-03-07 09:25:44 -08:00
Jeffrey Morgan d481fb3cc8
update go to 1.22 in other places (#2975) 2024-03-07 07:39:49 -08:00
DJ Johnson 23ee633252
docs: Add LLM-X to Web Integration section (#2759) 2024-03-07 10:11:53 -05:00
John 23ebe8fe11
fix some typos (#2973)
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00