ollama/gpu
Daniel Hiltgen 380378cc80 Use our libraries first
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
2024-05-06 14:23:29 -07:00
amd_common.go Request and model concurrency 2024-04-22 19:29:12 -07:00
amd_hip_windows.go Request and model concurrency 2024-04-22 19:29:12 -07:00
amd_linux.go AMD gfx patch rev is hex 2024-04-24 09:43:52 -07:00
amd_windows.go AMD gfx patch rev is hex 2024-04-24 09:43:52 -07:00
assets.go Centralize server config handling 2024-05-05 16:49:50 -07:00
cpu_common.go Mechanical switch from log to slog 2024-01-18 14:12:57 -08:00
cuda_common.go Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu.go Use our libraries first 2024-05-06 14:23:29 -07:00
gpu_darwin.go gpu: add 512MiB to darwin minimum, metal doesn't have partial offloading overhead (#4068) 2024-05-01 11:46:03 -04:00
gpu_info.h Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_info_cpu.c Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu_info_cudart.c Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu_info_cudart.h Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_info_darwin.h darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_darwin.m darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_nvcuda.c Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_info_nvcuda.h Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_test.go Request and model concurrency 2024-04-22 19:29:12 -07:00
types.go Request and model concurrency 2024-04-22 19:29:12 -07:00
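
The latest commit, "Use our libraries first", describes preferring the libraries the binary was compiled against over whatever happens to be installed on the host. The sketch below illustrates that search-order idea only; it is not taken from gpu.go. The helper name candidateLibPaths, the directory paths, and the glob pattern are all hypothetical, and Unix-style paths are assumed.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// candidateLibPaths is a hypothetical helper, not part of the ollama source.
// It builds glob patterns for libName, putting the bundled directory ahead of
// the system directories and skipping empty or duplicate entries.
func candidateLibPaths(bundledDir string, systemDirs []string, libName string) []string {
	seen := make(map[string]bool)
	var out []string
	dirs := append([]string{bundledDir}, systemDirs...)
	for _, dir := range dirs {
		if dir == "" {
			continue // nothing to search
		}
		p := filepath.Join(dir, libName) // Join also cleans the path
		if seen[p] {
			continue // already listed with higher priority
		}
		seen[p] = true
		out = append(out, p)
	}
	return out
}

func main() {
	// Assumed example locations; a real loader would discover these at runtime.
	system := []string{"/usr/lib", "/usr/local/cuda/lib64"}
	for _, p := range candidateLibPaths("/opt/app/lib", system, "libcudart.so*") {
		fmt.Println(p)
	}
}
```

A loader that tries these patterns in order picks up the bundled copy whenever it exists and falls back to host libraries only when it does not, which avoids version mismatches between the compiled code and the library that gets loaded.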