Commit graph

92 commits

Jeffrey Morgan 8f8e736b13
update llama.cpp submodule to d7fd29f (#5475) 2024-07-05 13:25:58 -04:00
Daniel Hiltgen 96624aa412
Merge pull request #5072 from dhiltgen/windows_path
Move libraries out of users' path
2024-06-19 09:13:39 -07:00
Daniel Hiltgen b0930626c5 Add back lower level parallel flags
nvcc supports parallelism (threads) and cmake + make can use -j,
while msbuild requires /p:CL_MPcount=8
2024-06-17 13:44:46 -07:00
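To make the knobs named above concrete, here is a minimal sketch of the per-tool parallelism flags; the thread counts, source file, and solution name are illustrative assumptions:

```sh
# Parallel compilation knobs, per build tool (counts are illustrative):
make -j8                                   # make: parallel jobs
cmake --build build -- -j8                 # cmake driving make
nvcc --threads 4 -c kernel.cu -o kernel.o  # nvcc: parallel compilation threads
msbuild project.sln /p:CL_MPcount=8        # msbuild: parallel cl.exe instances
```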
Daniel Hiltgen e890be4814 Revert "More parallelism on windows generate"
This reverts commit 0577af98f4.
2024-06-17 13:32:46 -07:00
Daniel Hiltgen b2799f111b Move libraries out of users' path
We update the PATH on windows to get the CLI mapped, but this has
an unintended side effect: other apps that use our bundled DLLs
may be terminated when we upgrade.
2024-06-17 13:12:18 -07:00
Jeffrey Morgan 152fc202f5
llm: update llama.cpp commit to 7c26775 (#4896)
* llm: update llama.cpp submodule to `7c26775`

* disable `LLAMA_BLAS` for now

* `-DLLAMA_OPENMP=off`
2024-06-17 15:56:16 -04:00
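A hedged sketch of the configure step these flags imply; the flags are taken from the commit message, while the build directory and standalone invocation (normally driven by the generate scripts) are assumptions:

```sh
# Disable BLAS and OpenMP when configuring llama.cpp (build dir illustrative):
cmake -B build -DLLAMA_BLAS=off -DLLAMA_OPENMP=off
cmake --build build
```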
Daniel Hiltgen 0577af98f4 More parallelism on windows generate
Make the build faster
2024-06-15 07:44:55 -07:00
Daniel Hiltgen ab8c929e20 Add ability to skip oneapi generate
This follows the same pattern as cuda and rocm, allowing the build
to be disabled even when we detect the dependent libraries.
2024-06-07 08:32:49 -07:00
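A minimal usage sketch, assuming the skip variables follow the same `OLLAMA_SKIP_*_GENERATE` naming as the existing cuda and rocm ones (the exact names are an assumption, not confirmed by the commit text):

```sh
# Skip GPU-specific generate steps even when their libraries are detected
# (variable names assumed to follow the cuda/rocm pattern):
OLLAMA_SKIP_CUDA_GENERATE=1 \
OLLAMA_SKIP_ROCM_GENERATE=1 \
OLLAMA_SKIP_ONEAPI_GENERATE=1 \
go generate ./...
```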
Jeffrey Morgan 7ca9605f54
speed up tests by only building static lib (#4740) 2024-05-30 21:43:15 -07:00
Daniel Hiltgen 646371f56d
Merge pull request #3278 from zhewang1-intc/rebase_ollama_main
Enabling ollama to run on Intel GPUs with SYCL backend
2024-05-28 16:30:50 -07:00
Wang,Zhe fd5971be0b support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
Daniel Hiltgen c48c1d7c46 Port cuda/rocm skip build vars to linux
Windows already implements these; carry them over to Linux.
2024-05-15 15:56:43 -07:00
Hernan Martinez 8a65717f55 Do not build AVX runners on ARM64 2024-04-26 23:55:32 -06:00
Hernan Martinez b438d485f1 Use architecture specific folders in the generate script 2024-04-26 23:34:12 -06:00
Daniel Hiltgen e4859c4563 Fine grain control over windows generate steps
This will speed up CI, which already tries to build only the static lib for unit tests.
2024-04-26 15:49:46 -07:00
Daniel Hiltgen ed5fb088c4 Fix target in gen_windows.ps1 2024-04-26 15:10:42 -07:00
Daniel Hiltgen 421c878a2d Put back non-avx CPU build for windows 2024-04-26 12:44:07 -07:00
Daniel Hiltgen 8671fdeda6 Refactor windows generate for more modular usage 2024-04-26 08:35:50 -07:00
Daniel Hiltgen 8feb97dc0d Move cuda/rocm dependency gathering into generate script
This will make it simpler for CI to accumulate artifacts from prior steps
2024-04-25 22:38:44 -07:00
Roy Yang 5f73c08729
Remove trailing spaces (#3889) 2024-04-25 14:32:26 -04:00
Daniel Hiltgen 058f6cd2cc Move nested payloads to installer and zip file on windows
Now that the llm runner is an executable and not just a dll, more users are facing
problems with security policy configurations on windows that prevent users from
writing to a directory and then executing binaries from that same location.
This change removes payloads from the main executable on windows and shifts them
over to be packaged in the installer and discovered based on the executable's location.
This also adds a new zip file for people who want to "roll their own" installation model.
2024-04-23 16:14:47 -07:00
Daniel Hiltgen cc5a71e0e3
Merge pull request #3709 from remy415/custom-gpu-defs
Adds support for customizing GPU build flags in llama.cpp
2024-04-23 09:28:34 -07:00
Jeremy 9c0db4cc83
Update gen_windows.ps1
Fixed improper env references
2024-04-21 16:13:41 -04:00
Jeremy 6f18297b3a
Update gen_windows.ps1
Forgot a `"` on the write-host
2024-04-18 19:47:44 -04:00
Jeremy 15016413de
Update gen_windows.ps1
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows
2024-04-18 19:27:16 -04:00
Jeremy 440b7190ed
Update gen_linux.sh
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS
2024-04-18 19:18:10 -04:00
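A usage sketch for the variables introduced in the two commits above; the variable names come from the commit messages, but the CMake define values shown are illustrative examples, not defaults:

```sh
# Pass extra CMake defines into the CUDA and ROCm builds (values illustrative):
export OLLAMA_CUSTOM_CUDA_DEFS="-DCMAKE_CUDA_ARCHITECTURES=72"
export OLLAMA_CUSTOM_ROCM_DEFS="-DAMDGPU_TARGETS=gfx1030"
go generate ./...   # gen_linux.sh / gen_windows.ps1 pick these up for the GPU builds
```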
Jeremy 52f5370c48 add support for custom gpu build flags for llama.cpp 2024-04-17 16:00:48 -04:00
Jeremy 7c000ec3ed adds support for OLLAMA_CUSTOM_GPU_DEFS to customize GPU build flags 2024-04-17 15:21:05 -04:00
Jeremy 8aec92fa6d rearranged conditional logic for static build, dockerfile updated 2024-04-17 14:43:28 -04:00
Jeremy 70261b9bb6 move static build to its own flag 2024-04-17 13:04:28 -04:00
Blake Mizerany 1524f323a3
Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564) 2024-04-09 15:57:45 -07:00
Blake Mizerany fccf3eecaa
build.go: introduce a friendlier way to build Ollama (#3548)
This commit introduces a friendlier way to build the Ollama dependencies
and binary without abusing `go generate`, removing the unnecessary extra
steps it brings with it.

This script also provides nicer feedback to the user about what is
happening during the build process.

At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
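A sketch of the workflow this commit describes (note it was reverted in #3564 above); the exact invocation is an assumption inferred from the commit title:

```sh
# Build the dependencies and the ollama binary in one step, with progress feedback:
go run build.go
```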
Jeffrey Morgan 63efa075a0
update generate scripts with new LLAMA_CUDA variable, set HIP_PLATFORM to avoid compiler errors (#3528) 2024-04-07 19:29:51 -04:00
Daniel Hiltgen dfe330fa1c
Merge pull request #3488 from mofanke/fix-windows-dll-compress
fix dll compression in the windows build
2024-04-04 16:12:13 -07:00
Daniel Hiltgen 36bd967722 Fail fast if mingw missing on windows 2024-04-04 09:51:26 -07:00
mofanke 4de0126719 fix dll compression in the windows build 2024-04-04 21:27:33 +08:00
Daniel Hiltgen e4a7e5b2ca Fix CI release glitches
The subprocess change moved the build directory
arm64 builds weren't setting cross-compilation flags when building on x86
2024-04-03 16:41:40 -07:00
Jeffrey Morgan cd135317d2
Fix macOS builds on older SDKs (#3467) 2024-04-03 10:45:54 -07:00
Daniel Hiltgen 58d95cc9bd Switch back to subprocessing for llama.cpp
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems.  This also serves as a first step toward
running multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Jeffrey Morgan 856b8ec131
remove need for $VSINSTALLDIR since build will fail if ninja cannot be found (#3350) 2024-03-26 16:23:16 -04:00
Jeremy dfc6721b20 add support for libcudart.so for CUDA devices (adds Jetson support) 2024-03-25 11:07:44 -04:00
Daniel Hiltgen ab3456207b
Merge pull request #3028 from ollama/ci_release
CI release process
2024-03-15 16:40:54 -07:00
Daniel Hiltgen 6ad414f31e
Merge pull request #3086 from dhiltgen/import_server
Import server.cpp to retain llava support
2024-03-15 16:10:35 -07:00
Daniel Hiltgen d4c10df2b0 Add Radeon gfx940-942 GPU support 2024-03-15 15:34:58 -07:00
Daniel Hiltgen 540f4af45f Wire up more complete CI for releases
Flesh out our GitHub Actions CI so we can build official releases.
2024-03-15 12:37:36 -07:00
Daniel Hiltgen 85129d3a32 Adapt our build for imported server.cpp 2024-03-12 14:57:15 -07:00
Jeffrey Morgan 369eda65f5
update llama.cpp submodule to ceca1ae (#3064) 2024-03-11 12:57:48 -07:00
Daniel Hiltgen bc13da2bfe Avoid rocm runner and dependency clash
Putting the rocm symlink next to the runners is risky.  This moves
the payloads into a subdir to avoid potential clashes.
2024-03-11 09:33:22 -07:00
Daniel Hiltgen 3dc1bb6a35 Harden for deps file being empty (or short) 2024-03-10 14:45:38 -07:00
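A minimal sketch of the kind of guard this hardening describes; the deps file path and the minimum line count are hypothetical:

```sh
# Fail early if the deps file is missing, empty, or suspiciously short
# (path and threshold are assumptions):
deps_file="dist/deps.txt"
if [ ! -s "$deps_file" ] || [ "$(wc -l < "$deps_file")" -lt 2 ]; then
    echo "deps file missing or too short: $deps_file" >&2
    exit 1
fi
```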
Jeffrey Morgan e11668aa07 add bundle_metal and cleanup_metal functions to gen_darwin.sh 2024-03-09 16:04:57 -08:00