Commit graph

28 commits

Author SHA1 Message Date
Daniel Hiltgen d88c527be3 Build multiple CPU variants and pick the best
This reduces the built-in Linux version to not use any vector extensions,
which enables the resulting builds to run under Rosetta on macOS in
Docker. Then at runtime it checks for the actual CPU vector
extensions and loads the best CPU library available.
2024-01-11 08:42:47 -08:00
Daniel Hiltgen e201efa14b Add windows native build instructions 2023-12-25 08:31:34 -08:00
Daniel Hiltgen e5202eb687 Quiet down llama.cpp logging by default
By default builds will now produce non-debug and non-verbose binaries.
To enable verbose logs in llama.cpp and debug symbols in the
native code, set `CGO_CFLAGS=-g`
2023-12-22 08:47:18 -08:00
Daniel Hiltgen 1b991d0ba9 Refine build to support CPU only
If someone checks out the ollama repo and doesn't install the CUDA
library, this will ensure they can build a CPU only version
2023-12-19 09:05:46 -08:00
Jiayu Liu 4fc10acce9 add some missing code directives in docs (#664) 2023-10-01 11:51:01 -07:00
Michael Yang 6c6a31a1e8 embed libraries using cmake 2023-09-20 14:41:57 -07:00
Bruce MacDonald fc6ec356fc remove libcuda.so 2023-09-20 20:36:14 +01:00
Bruce MacDonald 1255bc9b45 only package 11.8 runner 2023-09-20 20:00:41 +01:00
Bruce MacDonald 4e8be787c7 pack in cuda libs 2023-09-20 17:40:42 +01:00
Bruce MacDonald 2540c9181c support for packaging in multiple cuda runners (#509)
* enable packaging multiple cuda versions
* use nvcc cuda version if available

---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-14 15:08:13 -04:00
Bruce MacDonald f221637053 first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Bruce MacDonald 42998d797d subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Michael Yang 041f9ad1a1 update README.md 2023-08-25 11:44:25 -07:00
Jeffrey Morgan 1f78e409b4 docs: format with prettier 2023-08-08 15:41:48 -07:00
Michael Yang 24e43e3212 update development.md 2023-07-24 09:43:57 -07:00
Bruce MacDonald 52f04e39f2 Note that CGO must be enabled in dev docs 2023-07-21 22:36:36 +02:00
Matt Williams 3d9498dc95 Some simple modelfile examples
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-07-17 17:16:59 -07:00
Jeffrey Morgan 1358e27b77 add publish script 2023-07-07 12:59:45 -04:00
Michael Yang 9811956938 update development.md 2023-06-28 12:41:30 -07:00
Jeffrey Morgan 9ba58c8a9e move desktop docs to desktop/ 2023-06-28 11:29:29 -04:00
Jeffrey Morgan 9f868d8258 move desktop docs to desktop/ 2023-06-28 11:27:18 -04:00
Bruce MacDonald 4018b3c533 poetry development 2023-06-28 11:17:08 -04:00
Bruce MacDonald ecfb4abafb simplify loading 2023-06-27 14:50:30 -04:00
Michael Chiang 2906cbab11 Update development.md 2023-06-27 14:07:31 -04:00
Michael Chiang 9d14e75185 Update development.md 2023-06-27 14:06:59 -04:00
Michael Chiang a2745f8174 Update development.md 2023-06-27 14:06:49 -04:00
Jeffrey Morgan 20cdd9fee6 update README.md 2023-06-27 13:51:20 -04:00
Bruce MacDonald 11614b6d84 add development doc 2023-06-27 13:46:46 -04:00