Commit graph

3498 commits

Author SHA1 Message Date
mnc 22a28b7f0a Merge remote-tracking branch 'origin/main' into rx580_gpu 2024-09-20 20:13:05 +01:00
mnc 43d22dc9f1 Merge branch 'main' into rx580_gpu
# Conflicts:
#	Dockerfile
#	scripts/build_docker.sh
#	scripts/build_linux.sh
2024-09-20 20:00:14 +01:00
Patrick Devine 5804cf1723
documentation for stopping a model (#6766) 2024-09-18 16:26:42 -07:00
Ryan Marten bf7ee0f4d4
examples: add python examples for bespoke-minicheck (#6841) 2024-09-18 09:35:25 -07:00
Michael Yang 504a410f02
llm: add solar pro (preview) (#6846) 2024-09-17 18:11:26 -07:00
Jeffrey Morgan d05da29912
server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
Michael Yang 72962c6e08
Merge pull request #6833 from ollama/mxyng/git-am
make patches git am-able
2024-09-17 16:33:23 -07:00
Michael Yang 7bd7b02712 make patches git am-able
raw diffs can be applied using `git apply` but not with `git am`. git
patches, e.g. through `git format-patch` are both apply-able and am-able
2024-09-17 15:26:40 -07:00
Daniel Hiltgen 8f9ab5e14d
CI: dist directories no longer present (#6834)
The new buildx based build no longer leaves the dist/linux-* directories
around, so we don't have to clean them up before uploading.
2024-09-16 17:31:37 -07:00
Daniel Hiltgen 7717bb6a84
CI: clean up naming, fix tagging latest (#6832)
The rocm CI step for RCs was incorrectly tagging them as the latest rocm build.
The multiarch manifest was incorrectly tagged twice (with and without the
prefix "v").  Static windows artifacts weren't being carried between build
jobs.  This also fixes the latest tagging script.
2024-09-16 16:18:41 -07:00
Daniel Hiltgen 0ec2915ea7
CI: set platform build build_linux script to keep buildx happy (#6829)
The runners don't have emulation set up so the default multi-platform build
won't work.
2024-09-16 14:07:29 -07:00
Michael Yang c9a7541b9c
readme: add Agents-Flex to community integrations (#6788) 2024-09-16 13:42:52 -07:00
Patrick Devine d81cfd7d6f
fix typo in import docs (#6828) 2024-09-16 11:48:14 -07:00
Pepo b330c830d3
readme: add vim-intelligence-bridge to Terminal section (#6818) 2024-09-15 21:20:36 -04:00
mnccouk c4e4ea6019
Update README.md 2024-09-15 16:26:23 +01:00
mnccouk 8fbc5f571a
Update README.md 2024-09-15 16:07:28 +01:00
Matt 7965511b9e Added to README 2024-09-15 16:03:58 +01:00
Matt 3449201ce4 Changed to build for rx580 GPU; this uses the ROCm 5.7.1 libraries 2024-09-15 14:59:52 +01:00
Edward Cui d889c6fd07
readme: add Obsidian Quiz Generator plugin to community integrations (#6789) 2024-09-14 23:52:37 -04:00
Daniel Hiltgen 56b9af336a
Fix incremental builds on linux (#6780)
scripts: fix incremental builds on linux or similar
2024-09-13 08:24:08 -07:00
Daniel Hiltgen fda0d3be52
Use GOARCH for build dirs (#6779)
Corrects x86_64 vs amd64 discrepancy
2024-09-12 16:38:05 -07:00
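The x86_64 vs amd64 discrepancy mentioned above comes from two naming schemes for the same CPU: `uname -m` reports `x86_64`, while Go's `GOARCH` uses `amd64`. A minimal sketch of normalizing to the GOARCH spelling follows. The helpers `goArchFor` and `buildDir` and the `dist/<os>-<arch>` layout are assumptions for illustration, not code from the repository.

```go
package main

import (
	"fmt"
	"runtime"
)

// goArchFor maps a `uname -m` style machine name to Go's GOARCH spelling.
// Hypothetical helper for illustration; names not listed pass through unchanged.
func goArchFor(machine string) string {
	switch machine {
	case "x86_64":
		return "amd64"
	case "aarch64":
		return "arm64"
	default:
		return machine
	}
}

// buildDir returns a build directory name keyed by GOOS and GOARCH.
// The dist/<os>-<arch> layout is an assumption of this sketch.
func buildDir(goos, goarch string) string {
	return fmt.Sprintf("dist/%s-%s", goos, goarch)
}

func main() {
	// Normalize a machine name before using it in a path.
	fmt.Println(buildDir("linux", goArchFor("x86_64")))
	// Inside a Go program the runtime already reports the GOARCH spelling.
	fmt.Println(buildDir(runtime.GOOS, runtime.GOARCH))
}
```

Keying directories on one spelling means scripts and Go code agree on where build output lives.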
Daniel Hiltgen cd5c8f6471
Optimize container images for startup (#6547)
* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release
2024-09-12 12:10:30 -07:00
dcasota fef257c5c5
examples: updated requirements.txt for privategpt example 2024-09-11 18:56:56 -07:00
Adrian Cole d066d9b8e0
examples: polish loganalyzer example (#6744) 2024-09-11 18:37:37 -07:00
RAPID ARCHITECT 5a00dc9fc9
readme: add ollama_moe to community integrations (#6752) 2024-09-11 18:36:26 -07:00
Jesse Gross c354e87809
Merge pull request #6767 from ollama/jessegross/bug_6707
runner: Flush pending responses before returning
2024-09-11 17:20:22 -07:00
Jesse Gross 93ac3760cb runner: Flush pending responses before returning
If there are any pending responses (such as from potential stop
tokens) then we should send them back before ending the sequence.
Otherwise, we can be missing tokens at the end of a response.

Fixes #6707
2024-09-11 16:39:32 -07:00
Patrick Devine abed273de3
add "stop" command (#6739) 2024-09-11 16:36:21 -07:00
Michael Yang 034392624c
Merge pull request #6762 from ollama/mxyng/show-output
refactor show output
2024-09-11 14:58:40 -07:00
Michael Yang ecab6f1cc5 refactor show output
fixes line wrapping on long texts
2024-09-11 14:23:09 -07:00
Petr Mironychev 7d6900827d
readme: add QodeAssist to community integrations (#6754) 2024-09-11 13:19:49 -07:00
Daniel Hiltgen 9246e6dd15
Verify permissions for AMD GPU (#6736)
This adds back a check which was lost many releases back to verify /dev/kfd permissions
which when lacking, can lead to confusing failure modes of:
  "rocBLAS error: Could not initialize Tensile host: No devices found"

This implementation does not hard fail the serve command but instead will fall back to CPU
with an error log.  In the future we can include this in the GPU discovery UX to show
detected but unsupported devices we discovered.
2024-09-11 11:38:25 -07:00
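A permission check of the kind described above might look like the following sketch. It assumes that opening the device node read-write is a sufficient probe; the functions `checkDeviceAccess` and `pickBackend` and the backend names are hypothetical and do not reflect the project's actual discovery code. As in the commit description, failure is logged and the caller falls back to CPU instead of aborting.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
)

// checkDeviceAccess tries to open the device node read-write.
// Hypothetical helper: a permission error gets an explicit message.
func checkDeviceAccess(path string) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		if errors.Is(err, fs.ErrPermission) {
			return fmt.Errorf("permission denied on %s: %w", path, err)
		}
		return err
	}
	return f.Close()
}

// pickBackend returns "rocm" when the device is usable and "cpu" otherwise.
// It logs the reason but never fails hard.
func pickBackend(path string) string {
	if err := checkDeviceAccess(path); err != nil {
		log.Printf("AMD GPU unavailable, falling back to CPU: %v", err)
		return "cpu"
	}
	return "rocm"
}

func main() {
	fmt.Println(pickBackend("/dev/kfd"))
}
```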
Michael Yang 735a0ca2e4
Merge pull request #6732 from ollama/mxyng/debug-proxy
add *_proxy to env map for debugging
2024-09-10 16:13:25 -07:00
Michael Yang dddb72e084 add *_proxy for debugging 2024-09-10 09:43:35 -07:00
Jeffrey Morgan 83a9b5271a
docs: update examples to use llama3.1 (#6718) 2024-09-09 22:47:16 -07:00
Daniel Hiltgen 4a8069f9c4
Quiet down Docker's new lint warnings (#6716)
* Quiet down Docker's new lint warnings

Docker has recently added lint warnings to build.  This cleans up those warnings.

* Fix go lint regression
2024-09-09 17:22:20 -07:00
Patrick Devine 84b84ce2db
catch when model vocab size is set correctly (#6714) 2024-09-09 17:18:54 -07:00
Jeffrey Morgan bb6a086d63
readme: add crewAI to community integrations (#6699) 2024-09-08 00:36:24 -07:00
RAPID ARCHITECT 30c8f201cc
readme: add crewAI with mesop to community integrations 2024-09-08 00:35:59 -07:00
frob 06d4fba851
openai: align chat temperature and frequency_penalty options with completion (#6688) 2024-09-07 09:08:08 -07:00
Jeffrey Morgan 108fb6c1d1
docs: improve linux install documentation (#6683)
Includes small improvements to document layout and code blocks
2024-09-06 22:05:37 -07:00
Yaroslav da915345d1
openai: don't scale temperature or frequency_penalty (#6514) 2024-09-06 17:45:45 -07:00
nickthecook 8a027bc401
readme: add Archyve to community integrations (#6680) 2024-09-06 14:06:01 -07:00
imoize 5446903fbd
readme: add Plasmoid Ollama Control to community integrations (#6681) 2024-09-06 14:04:12 -07:00
Daniel Hiltgen 56318fb365
Improve logging on GPU too small (#6666)
When we determine a GPU is too small for any layers, it's not always clear why.
This will help troubleshoot those scenarios.
2024-09-06 08:29:36 -07:00
frob fe91d7fff1
openai: fix "presence_penalty" typo and add test (#6665) 2024-09-06 01:16:28 -07:00
Patrick Devine 608e87bf87
Fix gemma2 2b conversion (#6645) 2024-09-05 17:02:28 -07:00
Daniel Hiltgen 48685c6ed0
Document uninstall on windows (#6663) 2024-09-05 15:57:38 -07:00
Daniel Hiltgen 9565fa64a8
Revert "Detect running in a container (#6495)" (#6662)
This reverts commit a60d9b89ce.
2024-09-05 14:26:00 -07:00
Daniel Hiltgen 6719097649
llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT
With the new very large parameter models, some users are willing to wait for
a very long time for models to load.
2024-09-05 14:00:08 -07:00