Commit graph

2121 commits

Author SHA1 Message Date
Michael Yang b1e74d4fda default terminal width, height 2024-03-07 11:35:42 -08:00
Michael Yang f678f5c5c3
Merge pull request #2991 from ollama/mxyng/fix-ci
fix ci
2024-03-07 11:35:06 -08:00
Michael Yang 2cb74e23fb fix ci 2024-03-07 11:33:49 -08:00
Daniel Hiltgen 3c8df3808b
Merge pull request #2885 from dhiltgen/rocm_v6_only
Revamp ROCm support
2024-03-07 10:51:00 -08:00
Michael Yang 7d564835c2
Merge pull request #2985 from ollama/rm-empty-examples
remove empty examples
2024-03-07 10:49:40 -08:00
Michael Yang 72431031d9 no ci test on docs, examples 2024-03-07 10:44:48 -08:00
Michael Yang 6041abb5b2 remove empty examples 2024-03-07 10:40:32 -08:00
Daniel Hiltgen 6c5ccb11f9 Revamp ROCm support
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already
idempotenent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCms tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on windows.  The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Michael Yang 2e20110e50
Merge pull request #2221 from ollama/mxyng/up-down-ccy
adjust download and upload concurrency based on available bandwidth
2024-03-07 09:27:33 -08:00
Daniel Hiltgen 82ddc3e441
Merge pull request #2964 from dhiltgen/mem_limit_var
Allow setting max vram for workarounds
2024-03-07 09:25:44 -08:00
Jeffrey Morgan d481fb3cc8
update go to 1.22 in other places (#2975) 2024-03-07 07:39:49 -08:00
DJ Johnson 23ee633252
docs: Add LLM-X to Web Integration section (#2759) 2024-03-07 10:11:53 -05:00
John 23ebe8fe11
fix some typos (#2973)
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
Patrick Devine 2c017ca441
Convert Safetensors to an Ollama model (#2824) 2024-03-06 21:01:51 -08:00
Daniel Hiltgen be330174dd Allow setting max vram for workarounds
Until we get all the memory calculations correct, this can provide
and escape valve for users to workaround out of memory crashes.
2024-03-06 17:15:06 -08:00
Blake Mizerany 0ded7fdc4b
cmd: document environment variables for serve command
Updates #2944
2024-03-06 13:48:46 -08:00
Leo 2103a5073c
Add Odin Runes, a Feature-Rich Java UI for Ollama, to README (#2440)
* Add Odin Runes to README

Add Odin Runes to README

This commit adds Odin Runes to the "Community Integrations" section of the README. Odin Runes is a Java-based GPT client designed to provide seamless interaction with GPT models, enhancing productivity in prompt engineering and text generation tasks. This addition highlights the integration between Odin Runes and Ollama, offering users the flexibility to leverage large language models locally within their development workflow.

* Update README.md

this commit applies the comments of the reviewer.
2024-03-06 11:57:49 -08:00
Jeffrey Morgan ce9f7c4674
Update api.md 2024-03-05 13:13:23 -08:00
Anders Rex e5596c1944
Add NotesOllama to Community Integrations (#2909) 2024-03-04 01:18:10 -08:00
Timothy Graupmann 9bc3fee694
Added community link for Ollama Copilot (#2582)
* Added community link for Ollama Copilot

* Update README.md

---------

Co-authored-by: Michael <mchiang0610@users.noreply.github.com>
2024-03-04 00:40:36 -08:00
Jeffrey Morgan 21347e1ed6
update llama.cpp submodule to c29af7e (#2868) 2024-03-01 15:26:04 -08:00
Jeffrey Morgan 3b4bab3dc5
Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
Daniel Hiltgen cbd6e3b38e
Merge pull request #2838 from dhiltgen/opensuse
Add ollama user to video group
2024-02-29 15:47:56 -08:00
Daniel Hiltgen b830afa716
Merge pull request #2837 from dhiltgen/podman_image_support
Add env var so podman will map cuda GPUs
2024-02-29 15:47:37 -08:00
Daniel Hiltgen bd1d8b0d14
Merge pull request #2836 from bmwiedemann/gzip
Omit build date from gzip headers
2024-02-29 15:46:46 -08:00
fred-bf 25c2912120
Add Community Integration: NextChat (#2780) 2024-02-29 12:12:13 -08:00
Michael Yang 0e19476b56
prepend image tags (#2789)
instead of appending image tags, prepend them - this generally produces better results
2024-02-29 11:30:14 -08:00
tylinux fa2f2b3563
fix: print usedMemory size right (#2827) 2024-02-29 11:11:04 -08:00
Jeffrey Morgan cbf4970e0f
bump submodule to 87c91c07663b707e831c59ec373b5e665ff9d64a (#2828) 2024-02-29 09:42:08 -08:00
Daniel Hiltgen 74468513bd Add ollama user to video group
On OpenSUSE, ollama needs to be a member of the video group
to access the GPU
2024-02-29 08:50:10 -08:00
Daniel Hiltgen 794a916a72 Add env var so podman will map cuda GPUs
Without this env var, podman's GPU logic doesn't map the GPU through
2024-02-29 08:43:08 -08:00
Bernhard M. Wiedemann 76e5d9ec88 Omit build date from gzip headers
See https://reproducible-builds.org/ for why this is good.

This patch was done while working on reproducible builds for openSUSE.
2024-02-29 16:48:19 +01:00
Daniel Hiltgen 076237b8ea
Merge pull request #2771 from dhiltgen/toggle_models
Bump llama.cpp to b2276
2024-02-27 11:29:53 -08:00
Daniel Hiltgen 53d694c67f
Merge pull request #2772 from dhiltgen/container_image
Refine container image build script
2024-02-27 11:29:08 -08:00
Daniel Hiltgen 5aa6bfea94
Merge pull request #2785 from dhiltgen/win_download
Log unexpected server errors checking for update
2024-02-27 10:43:14 -08:00
Daniel Hiltgen 1cde63dd64 Log unexpected server errors checking for update
This should unmask some failure modes that likely
show up in app logs as unmarshal errors
2024-02-27 09:17:04 -08:00
Daniel Hiltgen 98e0b7e94f Refine container image build script
Allow overriding the platform, image name, and tag latest for
standard and rocm images.
2024-02-26 17:26:49 -08:00
Daniel Hiltgen 061e8f6abc Bump llama.cpp to b2276 2024-02-26 16:49:24 -08:00
peanut256 a189810df6
Determine max VRAM on macOS using recommendedMaxWorkingSetSize (#2354)
* read iogpu.wired_limit_mb on macOS

Fix for https://github.com/ollama/ollama/issues/1826

* improved determination of available vram on macOS

read the recommended maximal vram on macOS via Metal API

* Removed macOS-specific logging

* Remove logging from gpu_darwin.go

* release Core Foundation object

fixes a possible memory leak
2024-02-25 18:16:45 -05:00
Ikko Eltociear Ashimine e95b896790
Update types.go (#2744)
specfied -> specified
2024-02-25 13:41:25 -05:00
elthommy 1f087c4d26
Update langchain python tutorial (#2737)
Remove unused GPT4all
Use nomic-embed-text as embedded model
Fix a deprecation warning (__call__)
2024-02-25 00:31:36 -05:00
Jeffrey Morgan 5d7ea6616f
no extra disk space for windows installation (#2739) 2024-02-25 00:20:35 -05:00
Michael Yang 2a4b128ae3
Merge pull request #2719 from ollama/mxyng/format-private-key
remove format private key
2024-02-23 17:15:14 -08:00
Michael Yang fc483274ad clean up go.mod 2024-02-23 16:53:36 -08:00
Michael Yang fd10a2ad4b remove format/openssh.go
this is unnecessary now that x/crypto/ssh.MarshalPrivateKey has been
added
2024-02-23 16:52:23 -08:00
Benn Huang b291f63188
Add Community Integration: Chatbox
Co-authored-by: bennhuang <bennhuang@tencent.com>
2024-02-23 07:17:28 -05:00
Jeffrey Morgan f58856bf6f better directory cleanup in ollama.iss 2024-02-23 07:14:59 -05:00
Jeffrey Morgan 275ea01587 restore windows build flags and compression 2024-02-22 18:07:18 -05:00
Jeffrey Morgan 8782dd5628 fix build_windows.ps1 script to run go build with the correct flags 2024-02-22 17:41:43 -05:00
Jeffrey Morgan 11bfff8ee1 update llama.cpp submodule to 96633eeca1265ed03e57230de54032041c58f9cd 2024-02-22 16:44:26 -05:00