Commit graph

11 commits

Author SHA1 Message Date
Daniel Hiltgen 325d74985b Fix CPU performance on hyperthreaded systems
The default thread count logic was broken: it created 2x the number of threads
it should on a hyperthreaded CPU, causing thrashing and poor performance.
2023-12-21 16:23:36 -08:00
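
The fix above amounts to sizing the default worker pool by physical cores rather than logical (hyperthreaded) ones. A minimal Go sketch of that idea, assuming 2-way SMT and a hypothetical defaultThreadCount helper; the actual implementation may detect core topology differently rather than halving unconditionally:

```go
package main

// Sketch: default the llama.cpp thread count to physical cores so a
// hyperthreaded CPU is not oversubscribed with 2x as many threads.
import (
	"fmt"
	"runtime"
)

// defaultThreadCount is a hypothetical helper; it assumes 2-way SMT and
// halves the logical CPU count. A real implementation would detect the
// actual core topology instead of halving unconditionally.
func defaultThreadCount() int {
	logical := runtime.NumCPU() // counts each hyperthread as a CPU
	physical := logical / 2
	if physical < 1 {
		physical = 1
	}
	return physical
}

func main() {
	fmt.Println("n_threads =", defaultThreadCount())
}
```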
Daniel Hiltgen 9adca7f711 Bump llama.cpp to b1662 and set n_parallel=1 2023-12-19 09:05:46 -08:00
Daniel Hiltgen 35934b2e05 Adapted rocm support to cgo based llama.cpp 2023-12-19 09:05:46 -08:00
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp
Run server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
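
The cgo approach described above compiles the C/C++ server into the Go binary and calls it in-process. A minimal, self-contained sketch of that pattern; the server_start entry point here is hypothetical and stands in for the real server.cpp bindings:

```go
package main

/*
// Hypothetical C entry point standing in for server.cpp; in the real
// project the llama.cpp sources are compiled and linked into the Go binary.
#include <stdio.h>

static int server_start(int n_threads) {
    printf("starting llama server with %d threads\n", n_threads);
    return 0;
}
*/
import "C"

import "fmt"

// startServer keeps a Go-level abstraction in front of the cgo call so the
// rest of the codebase never touches C types directly.
func startServer(nThreads int) error {
	if rc := C.server_start(C.int(nThreads)); rc != 0 {
		return fmt.Errorf("server_start failed with code %d", rc)
	}
	return nil
}

func main() {
	if err := startServer(4); err != nil {
		panic(err)
	}
}
```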
Bruce MacDonald 811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf if the automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
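
A sketch of the ggml-vs-gguf detection mentioned above, assuming only that GGUF files begin with the 4-byte magic "GGUF"; the isGGUF helper and the fallback message are illustrative, not the project's actual code:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// ggufMagic: GGUF model files begin with these four bytes.
var ggufMagic = []byte("GGUF")

// isGGUF reports whether the file at path looks like a GGUF model.
func isGGUF(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	header := make([]byte, 4)
	if _, err := io.ReadFull(f, header); err != nil {
		return false, err
	}
	return bytes.Equal(header, ggufMagic), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: detect <model-file>")
		os.Exit(1)
	}
	ok, err := isGGUF(os.Args[1])
	if err != nil {
		panic(err)
	}
	if !ok {
		// Legacy ggml model: the commit above pulls the gguf variant
		// automatically and only asks the user to update if that fails.
		fmt.Println("model is not gguf; update to a gguf model")
	}
}
```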
Michael Yang a00fac4ec8 update llama.cpp 2023-11-21 09:50:02 -08:00
Jeffrey Morgan b0c9cd0f3b fix metal assertion errors 2023-10-24 00:32:36 -07:00
Michael Yang c9167494cb update default log target 2023-10-23 10:44:50 -07:00
Bruce MacDonald f3648fd206 Update llama.cpp gguf to latest (#710) 2023-10-17 16:55:16 -04:00
Michael Yang 058d0cd04b silence warm up log 2023-09-21 14:53:33 -07:00
Michael Yang 6c6a31a1e8 embed libraries using cmake 2023-09-20 14:41:57 -07:00