Commit graph

  • 27402cb7a2 Update gpu.md (#5382) Eduard 2024-07-01 03:48:51 +0200
  • c1218199cf Update api.md Jeffrey Morgan 2024-06-29 16:22:49 -0700
  • 717f7229eb Do not shift context for sliding window models (#5368) Jeffrey Morgan 2024-06-28 19:39:31 -0700
  • aae56abb7c Document concurrent behavior and settings Daniel Hiltgen 2024-06-28 13:15:57 -0700
  • 5f034f5b63 Include Show Info in Interactive (#5342) royjhan 2024-06-28 13:15:52 -0700
  • b910fa9010 Ollama Show: Check for Projector Type (#5307) royjhan 2024-06-28 11:30:16 -0700
  • 6d4219083c Update docs (#5312) royjhan 2024-06-28 09:58:14 -0700
  • 1ed4f521c4 Merge pull request #5340 from ollama/mxyng/mem Michael Yang 2024-06-27 14:26:49 -0700
  • de2163dafd gemma2 graph Michael Yang 2024-06-27 10:52:25 -0700
  • 9bd00041fa trim all params Josh Yan 2024-06-27 11:18:38 -0700
  • 4e986a823c unquote, trimp space Josh Yan 2024-06-27 10:59:15 -0700
  • 2cc7d05012 update readme for gemma 2 (#5333) Michael 2024-06-27 12:45:16 -0400
  • 123a722a6f zip: prevent extracting files into parent dirs (#5314) Michael Yang 2024-06-26 21:38:21 -0700
  • 4d311eb731 llm: architecture patch (#5316) Jeffrey Morgan 2024-06-26 21:38:12 -0700
  • cb42e607c5 llm: speed up gguf decoding by a lot (#5246) Blake Mizerany 2024-06-24 21:47:52 -0700
  • 2aa91a937b cmd: defer stating model info until necessary (#5248) Blake Mizerany 2024-06-24 20:14:03 -0700
  • ccef9431c8 Merge pull request #5205 from dhiltgen/modelfile_use_mmap Daniel Hiltgen 2024-06-21 16:30:36 -0700
  • 642cee1342 Sort the ps output Daniel Hiltgen 2024-06-21 15:59:41 -0700
  • 9a9e7d83c4 Docs (#5149) royjhan 2024-06-21 15:52:09 -0700
  • 9929751cc8 Disable concurrency for AMD + Windows Daniel Hiltgen 2024-06-19 13:35:38 -0700
  • 17b7186cd7 Enable concurrency by default Daniel Hiltgen 2024-05-06 17:47:52 -0700
  • 189a43caa2 Merge pull request #5206 from ollama/mxyng/quantize Michael Yang 2024-06-21 13:44:34 -0700
  • e835ef1836 fix: quantization with template Michael Yang 2024-06-21 13:30:43 -0700
  • 7e7749224c Fix use_mmap parsing for modelfiles Daniel Hiltgen 2024-06-21 12:27:19 -0700
  • c7c2f3bc22 Merge pull request #5194 from dhiltgen/linux_mmap_auto Daniel Hiltgen 2024-06-20 11:44:08 -0700
  • 54a79d6a8a Merge pull request #5125 from dhiltgen/fedora39 Daniel Hiltgen 2024-06-20 11:27:24 -0700
  • 5bf5aeec01 Refine mmap default logic on linux Daniel Hiltgen 2024-06-20 11:07:04 -0700
  • e01e535cbb Merge pull request #5192 from ollama/mxyng/kv Michael Yang 2024-06-20 10:46:24 -0700
  • 0195d6a2f8 Merge pull request #5188 from ollama/jyan/tmpdir2 Josh 2024-06-20 10:40:59 -0700
  • 8e0641a9bf handle asymmetric embedding KVs Michael Yang 2024-06-20 09:40:17 -0700
  • 662568d453 err!=nil check Josh Yan 2024-06-20 09:30:59 -0700
  • 4ebb66c662 reformat error check Josh Yan 2024-06-20 09:23:43 -0700
  • 23e899f32d skip os.removeAll() if PID does not exist Josh Yan 2024-06-20 08:51:35 -0700
  • fedf71635e Extend api/show and ollama show to return more model info (#4881) royjhan 2024-06-19 14:19:02 -0700
  • 97c59be653 Merge pull request #5074 from dhiltgen/app_log_rotation Daniel Hiltgen 2024-06-19 13:02:24 -0700
  • 9d8a4988e8 Implement log rotation for tray app Daniel Hiltgen 2024-06-15 16:30:37 -0700
  • 1ae0750a21 Merge pull request #5147 from ollama/mxyng/cleanup Michael Yang 2024-06-19 12:50:31 -0700
  • 9d91e5e587 remove confusing log message Michael Yang 2024-06-19 11:14:11 -0700
  • 96624aa412 Merge pull request #5072 from dhiltgen/windows_path Daniel Hiltgen 2024-06-19 09:13:39 -0700
  • 10f33b8537 Merge pull request #5146 from dhiltgen/backout Daniel Hiltgen 2024-06-19 09:12:45 -0700
  • 4a633cc295 Merge pull request #5145 from dhiltgen/bad_loads Daniel Hiltgen 2024-06-19 09:12:33 -0700
  • d34d88e417 Revert "Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)"" Daniel Hiltgen 2024-06-19 08:57:41 -0700
  • 52ce350b7a Fix bad symbol load detection Daniel Hiltgen 2024-06-19 08:39:07 -0700
  • 2abebb2cbe Merge pull request #5128 from zhewang1-intc/fix_levelzero_empty_symbol_detect Daniel Hiltgen 2024-06-19 08:33:16 -0700
  • 380e06e5be types/model: remove Digest Blake Mizerany 2024-06-18 13:29:38 -0700
  • badf975e45 get real func ptr. Wang,Zhe 2024-06-19 09:00:51 +0800
  • 755b4e4fc2 Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)" Wang,Zhe 2024-06-19 08:59:58 +0800
  • 1a1c99e334 Bump latest fedora cuda repo to 39 Daniel Hiltgen 2024-06-18 17:13:54 -0700
  • 21adf8b6d2 Merge pull request #5121 from ollama/mxyng/deepseekv2 Michael Yang 2024-06-18 16:30:58 -0700
  • 784bf88b0d Wire up windows AMD driver reporting Daniel Hiltgen 2024-06-18 16:22:47 -0700
  • e873841cbb deepseek v2 graph Michael Yang 2024-06-18 12:42:37 -0700
  • 26d0bf9236 Merge pull request #5117 from dhiltgen/fix_prediction Daniel Hiltgen 2024-06-18 11:36:51 -0700
  • 359b15a597 Handle models with divergent layer sizes Daniel Hiltgen 2024-06-18 11:05:34 -0700
  • b55958a587 Merge pull request #5106 from dhiltgen/clean_logs Daniel Hiltgen 2024-06-18 09:24:38 -0700
  • 7784ca33ce Tighten up memory prediction logging Daniel Hiltgen 2024-06-17 18:39:48 -0700
  • c9c8c98bf6 Merge pull request #5105 from dhiltgen/cuda_mmap Daniel Hiltgen 2024-06-17 17:07:30 -0700
  • 171796791f Adjust mmap logic for cuda windows for faster model load Daniel Hiltgen 2024-06-17 12:14:42 -0700
  • 176d0f7075 Update import.md Jeffrey Morgan 2024-06-17 19:44:14 -0400
  • 8ed51cac37 Merge pull request #5103 from dhiltgen/faster_win_build Daniel Hiltgen 2024-06-17 14:23:18 -0700
  • c9e6f0542d Merge pull request #5069 from dhiltgen/ci_release Daniel Hiltgen 2024-06-17 13:59:37 -0700
  • b0930626c5 Add back lower level parallel flags Daniel Hiltgen 2024-06-17 13:44:46 -0700
  • e890be4814 Revert "More parallelism on windows generate" Daniel Hiltgen 2024-06-17 13:32:46 -0700
  • b2799f111b Move libraries out of users path Daniel Hiltgen 2024-06-15 13:17:20 -0700
  • 152fc202f5 llm: update llama.cpp commit to 7c26775 (#4896) Jeffrey Morgan 2024-06-17 15:56:16 -0400
  • 4ad0d4d6d3 Fix a build warning (#5096) Lei Jitang 2024-06-18 02:47:48 +0800
  • 163cd3e77c gpu: add env var for detecting Intel oneapi gpus (#5076) Jeffrey Morgan 2024-06-16 20:09:05 -0400
  • 4c2c8f93dd Merge pull request #5080 from dhiltgen/debug_intel_crash Daniel Hiltgen 2024-06-16 14:42:41 -0700
  • fd1e6e0590 Add some more debugging logs for intel discovery Daniel Hiltgen 2024-06-16 07:42:52 -0700
  • 89c79bec8c Add ModifiedAt Field to /api/show (#5033) royjhan 2024-06-15 20:53:56 -0700
  • c7b77004e3 docs: add missing powershell package to windows development instructions (#5075) Jeffrey Morgan 2024-06-15 23:08:09 -0400
  • 07d143f412 Merge pull request #5058 from coolljt0725/fix_build_warning Daniel Hiltgen 2024-06-15 11:52:36 -0700
  • a12283e2ff Implement custom github release action Daniel Hiltgen 2024-06-15 08:26:54 -0700
  • 4b0050cf0e Merge pull request #5037 from dhiltgen/faster_win_build Daniel Hiltgen 2024-06-15 08:03:05 -0700
  • 0577af98f4 More parallelism on windows generate Daniel Hiltgen 2024-06-13 17:13:01 -0700
  • 17ce203a26 Merge pull request #4875 from dhiltgen/rocm_gfx900_workaround Daniel Hiltgen 2024-06-15 07:38:58 -0700
  • d76555ffb5 Merge pull request #4874 from dhiltgen/rocm_v6_bump Daniel Hiltgen 2024-06-15 07:38:32 -0700
  • 2786dff5d3 Merge pull request #4264 from dhiltgen/show_gpu_visible_settings Daniel Hiltgen 2024-06-15 07:33:52 -0700
  • 225f0d1219 gpu: Fix build warning Lei Jitang 2024-06-15 14:26:23 +0800
  • 532db58311 Merge pull request #4972 from jayson-cloude/main Daniel Hiltgen 2024-06-14 17:04:40 -0700
  • 6be309e1bd Centralize GPU configuration vars Daniel Hiltgen 2024-05-08 11:11:50 -0700
  • da3bf23354 Workaround gfx900 SDMA bugs Daniel Hiltgen 2024-05-31 16:15:21 -0700
  • 26ab67732b Bump ROCm linux to 6.1.1 Daniel Hiltgen 2024-06-06 10:43:55 -0700
  • 45cacbaf05 Merge pull request #4517 from dhiltgen/gpu_incremental Daniel Hiltgen 2024-06-14 15:35:00 -0700
  • 17df6520c8 Remove mmap related output calc logic Daniel Hiltgen 2024-06-13 09:59:36 -0700
  • 6f351bf586 review comments and coverage Daniel Hiltgen 2024-06-05 12:07:20 -0700
  • ff4f0cbd1d Prevent multiple concurrent loads on the same gpus Daniel Hiltgen 2024-06-04 14:08:36 -0700
  • fc37c192ae Refine CPU load behavior with system memory visibility Daniel Hiltgen 2024-06-03 19:09:23 -0700
  • 434dfe30c5 Reintroduce nvidia nvml library for windows Daniel Hiltgen 2024-06-03 15:07:50 -0700
  • 4e2b7e181d Refactor intel gpu discovery Daniel Hiltgen 2024-05-29 16:37:34 -0700
  • 48702dd149 Harden unload for empty runners Daniel Hiltgen 2024-05-30 16:43:40 -0700
  • 68dfc6236a refined test timing Daniel Hiltgen 2024-05-31 14:28:02 -0700
  • 5e8ff556cb Support forced spreading for multi GPU Daniel Hiltgen 2024-05-08 14:32:42 -0700
  • 6fd04ca922 Improve multi-gpu handling at the limit Daniel Hiltgen 2024-05-18 12:34:31 -0700
  • 206797bda4 Fix concurrency integration test to work locally Daniel Hiltgen 2024-05-23 13:12:14 -0700
  • 43ed358f9a Refine GPU discovery to bootstrap once Daniel Hiltgen 2024-05-15 15:13:16 -0700
  • b32ebb4f29 Use DRM driver for VRAM info for amd Daniel Hiltgen 2024-05-14 16:18:42 -0700
  • fb9cdfa723 Fix server.cpp for the new cuda build macros Daniel Hiltgen 2024-05-18 16:02:13 -0700
  • efac488675 Revert "Limit GPU lib search for now (#4777)" Daniel Hiltgen 2024-06-03 08:31:48 -0700
  • 6b800aa7b7 openai: do not set temperature to 0 when setting seed (#5045) Jeffrey Morgan 2024-06-14 13:43:56 -0700
  • dd7c9ebeaf server: longer timeout in TestRequests (#5046) Jeffrey Morgan 2024-06-14 09:48:25 -0700