Commit graph

1712 commits

Author SHA1 Message Date
Patrick Devine 238ac5e765
Add unit tests for Parser (#1815) 2024-01-05 14:04:31 -08:00
Bruce MacDonald 4f4980b66b
simplify ggml update logic (#1814)
- additional information is now available in show response, use this to pull gguf before running
- make gguf updates cancellable
2024-01-05 15:22:32 -05:00
Patrick Devine 22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
Patrick Devine 2909dce894 split up interactive generation 2024-01-05 12:20:05 -08:00
Jeffrey Morgan df32537312
gpu: read memory info from all cuda devices (#1802)
* gpu: read memory info from all cuda devices

* add `LOOKUP_SIZE` constant

* better constant name

* address comments
2024-01-05 11:25:58 -05:00
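
As an illustration of what "read memory info from all cuda devices" could mean in practice, the sketch below sums total and free memory over a list of per-device readings. The commit message does not say how the values are combined; the `deviceMem` type, the `sumDeviceMem` function, and the idea of summing are all assumptions made for this example, not the project's actual code.

```go
package main

import "fmt"

// deviceMem is a hypothetical per-device memory reading, in bytes.
// It is not a type taken from the repository.
type deviceMem struct {
	Total uint64
	Free  uint64
}

// sumDeviceMem adds up total and free memory across every device,
// rather than reporting only the first device.
func sumDeviceMem(devices []deviceMem) deviceMem {
	var sum deviceMem
	for _, d := range devices {
		sum.Total += d.Total
		sum.Free += d.Free
	}
	return sum
}

func main() {
	// Two made-up devices: 8 GiB and 24 GiB cards.
	devices := []deviceMem{
		{Total: 8 << 30, Free: 6 << 30},
		{Total: 24 << 30, Free: 20 << 30},
	}
	sum := sumDeviceMem(devices)
	fmt.Printf("total=%d GiB free=%d GiB\n", sum.Total>>30, sum.Free>>30)
}
```
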
Bruce MacDonald 3367b5f3df
remove unused generate patches (#1810) 2024-01-05 11:25:45 -05:00
Matt Williams 46edbbc518
Merge pull request #1801 from jmorganca/mattw/correctdockerlink 2024-01-04 19:20:45 -08:00
Michael Yang d2ff18cd6b
Merge pull request #1791 from jmorganca/mxyng/update-build
update Dockerfile.build
2024-01-04 19:13:44 -08:00
Matt Williams df086d3c8c fix docker doc to point to hub
Signed-off-by: Matt Williams <m@technovangelist.com>
2024-01-04 18:42:23 -08:00
Michael Yang f9961c70ae update build 2024-01-04 17:34:38 -08:00
Daniel Hiltgen cd8fad3398
Merge pull request #1790 from dhiltgen/llm_code_shuffle
Cleaup stale submodule
2024-01-04 13:47:25 -08:00
Daniel Hiltgen 9983fa5f4e Cleaup stale submodule
If the tree has a stale submodule, make sure we clean it up first
2024-01-04 13:40:16 -08:00
Daniel Hiltgen dfda91c2ee
Merge pull request #1788 from dhiltgen/llm_code_shuffle
Revamp code layout for the llm directory and llama.cpp submodule
2024-01-04 13:14:28 -08:00
Daniel Hiltgen fac9060da5 Init submodule with new path 2024-01-04 13:00:13 -08:00
Daniel Hiltgen a554616f8e remove old llama.cpp submodule path 2024-01-04 12:12:21 -08:00
Daniel Hiltgen 77d96da94b Code shuffle to clean up the llm dir 2024-01-04 12:12:05 -08:00
Brian Murray 0d6e3565ae
Add embeddings to API (#1773) 2024-01-04 15:00:52 -05:00
Daniel Hiltgen b5939008a1
Merge pull request #1785 from dhiltgen/win_native_cli
Load dynamic cpu lib on windows
2024-01-04 08:55:01 -08:00
Daniel Hiltgen e9ce91e9a6 Load dynamic cpu lib on windows
On linux, we link the CPU library in to the Go app and fall back to it
when no GPU match is found. On windows we do not link in the CPU library
so that we can better control our dependencies for the CLI.  This fixes
the logic so we correctly fallback to the dynamic CPU library
on windows.
2024-01-04 08:41:41 -08:00
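
The fallback described in this commit message can be sketched roughly as follows: try the GPU library first, and load the CPU library only if that fails. The `loadFunc` type, the two stand-in loaders, and `loadWithFallback` are hypothetical names invented for this example; the real loading code in the repository is not shown here and may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// loadFunc is a hypothetical loader: it returns the name of the library it
// loaded, or an error if the library is unavailable.
type loadFunc func() (string, error)

// failingLoader stands in for a GPU loader that finds no matching GPU.
func failingLoader() (string, error) {
	return "", errors.New("no GPU match")
}

// cpuLoader stands in for loading the dynamic CPU library.
func cpuLoader() (string, error) {
	return "cpu", nil
}

// loadWithFallback tries the preferred (GPU) loader first and falls back to
// the CPU loader when the preferred one fails.
func loadWithFallback(gpu, cpu loadFunc) (string, error) {
	if name, err := gpu(); err == nil {
		return name, nil
	}
	name, err := cpu()
	if err != nil {
		return "", fmt.Errorf("no usable library: %w", err)
	}
	return name, nil
}

func main() {
	name, err := loadWithFallback(failingLoader, cpuLoader)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("loaded:", name)
}
```
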
Bruce MacDonald 4ad6c9b11f
fix: pull either original model or from model on create (#1774) 2024-01-04 01:34:38 -05:00
Jeffrey Morgan c0285158a9 tweak memory requirements error text 2024-01-03 19:47:18 -05:00
Jeffrey Morgan 77a66df72c add macOS memory check for 47B models 2024-01-03 19:46:16 -05:00
Jeffrey Morgan 5b4837f881 remove unused filetype check 2024-01-03 19:45:39 -05:00
Jeffrey Morgan 29340c2e62
update cmake flags for amd64 macOS (#1780)
* update cmake flags for intel macOS

* remove `LLAMA_K_QUANTS`

* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
2024-01-03 19:22:15 -05:00
Daniel Hiltgen d5ec730354
Merge pull request #1779 from dhiltgen/refined_amd_gpu_list
Improve maintainability of Radeon card list
2024-01-03 16:18:57 -08:00
Daniel Hiltgen 8bed487aba
Merge pull request #1778 from dhiltgen/wsl1
Fail fast on WSL1 while allowing on WSL2
2024-01-03 16:18:41 -08:00
Daniel Hiltgen c1a10a6e9b
Merge pull request #1781 from dhiltgen/cpu_only_build
Fix CPU only builds
2024-01-03 16:18:25 -08:00
Daniel Hiltgen ddbfa6fe31 Fix CPU only builds
Go embed doesn't like when there's no matching files, so put
a dummy placeholder in to allow building without any GPU support
If no "server" library is found, it's safely ignored at runtime.
2024-01-03 16:08:34 -08:00
Daniel Hiltgen 2fcd41ef81 Fail fast on WSL1 while allowing on WSL2
This prevents users from accidentally installing on WSL1 with instructions
guiding how to upgrade their WSL instance to version 2.  Once running WSL2
if you have an NVIDIA card, you can follow their instructions to set up
GPU passthrough and run models on the GPU.  This is not possible on WSL1.
2024-01-03 16:02:32 -08:00
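
One common way to tell WSL1 from WSL2 is to inspect the kernel release string (the output of `uname -r`). The sketch below is an assumption about how such a check might look, not the check this commit adds: it treats a release containing "microsoft" and ending in "WSL2" as WSL2, any other "microsoft" release as WSL1, and everything else as a native kernel. The function name `classifyKernel` and the sample strings are illustrative, and the heuristic can misclassify custom WSL2 kernels that drop the suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyKernel is a hypothetical helper that guesses the environment from
// a kernel release string. It is a heuristic, not an official detection API.
func classifyKernel(release string) string {
	r := strings.ToLower(release)
	switch {
	case !strings.Contains(r, "microsoft"):
		return "native"
	case strings.HasSuffix(r, "wsl2"):
		return "wsl2"
	default:
		return "wsl1"
	}
}

func main() {
	// Sample release strings chosen for illustration only.
	for _, release := range []string{
		"4.4.0-19041-Microsoft",
		"5.15.90.1-microsoft-standard-WSL2",
		"6.1.0-generic",
	} {
		kind := classifyKernel(release)
		if kind == "wsl1" {
			// Fail fast: GPU passthrough is not possible on WSL1.
			fmt.Printf("%s: WSL1 detected, upgrade to WSL2\n", release)
			continue
		}
		fmt.Printf("%s: %s\n", release, kind)
	}
}
```
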
Daniel Hiltgen 16f4603b67 Improve maintainability of Radeon card list
This moves the list of AMD GPUs to an easier to maintain list which
should make it easier to update over time.
2024-01-03 15:16:56 -08:00
Daniel Hiltgen 1184686649
Merge pull request #1776 from dhiltgen/render_group
Add ollama user to render group for Radeon support
2024-01-03 13:07:54 -08:00
Daniel Hiltgen 2588cb2daa Add ollama user to render group for Radeon support
For the ROCm libraries to access the driver, we need to add the ollama user
to the render group.
2024-01-03 12:56:31 -08:00
Jeffrey Morgan c7ea8f237e
set num_gpu to 1 only by default on darwin arm64 (#1771) 2024-01-03 14:10:29 -05:00
Bruce MacDonald 0b3118e0af
fix: relay request opts to loaded llm prediction (#1761) 2024-01-03 12:01:42 -05:00
Daniel Hiltgen 05face44ef
Merge pull request #1683 from dhiltgen/fix_windows_test
Fix windows system memory lookup
2024-01-03 09:00:39 -08:00
Daniel Hiltgen a2ad952440 Fix windows system memory lookup
This refines the gpu package error handling and fixes a bug with the
system memory lookup on windows.
2024-01-03 08:50:01 -08:00
Daniel Hiltgen 5fea4410be
Merge pull request #1680 from dhiltgen/better_patching
Refactor how we augment llama.cpp and refine windows native build
2024-01-03 08:10:17 -08:00
Bruce MacDonald b846eb64d0
Fix template api doc description (#1661) 2024-01-03 11:00:59 -05:00
Cole Gillespie 3c5dd9ed1d
Update README.md (#1766) 2024-01-03 10:44:22 -05:00
Jeffrey Morgan b17ccd0542
Update import.md 2024-01-02 22:28:18 -05:00
Patrick Devine d0409f772f
keyboard shortcut help (#1764) 2024-01-02 18:04:12 -08:00
Jeffrey Morgan ec261422af use docker build in build scripts 2024-01-02 19:32:54 -05:00
Daniel Hiltgen 0498f7ce56 Get rid of one-line llama.log
This one log line was triggering a single line llama.log to be generated
in the pwd of the server
2024-01-02 15:36:16 -08:00
Daniel Hiltgen 738a8d12eb Rename the ollama cmakefile 2024-01-02 15:36:16 -08:00
Daniel Hiltgen d966b730ac Switch windows build to fully dynamic
Refactor where we store build outputs, and support a fully dynamic loading
model on windows so the base executable has no special dependencies thus
doesn't require a special PATH.
2024-01-02 15:36:16 -08:00
Daniel Hiltgen 9a70aecccb Refactor how we augment llama.cpp
This changes the model for llama.cpp inclusion so we're not applying a patch,
but instead have the C++ code directly in the ollama tree, which should make it
easier to refine and update over time.
2024-01-02 15:35:55 -08:00
Karim ElGhandour 22cd5eaab6
Added Ollama-SwiftUI to integrations (#1747) 2024-01-02 09:47:50 -05:00
Dane Madsen 304a8799ca
Update README.md (#1757) 2024-01-02 09:47:08 -05:00
Jeffrey Morgan 2a2fa3c329 api.md cleanup & formatting 2023-12-27 14:32:35 -05:00
Jeffrey Morgan 55978c1dc9 clean up cache api option 2023-12-27 14:27:45 -05:00