334 commits

Author SHA1 Message Date
Patrick Devine f1548ef62d
update the FAQ to be more clear about windows env variables (#4415) 2024-05-13 18:01:13 -07:00
睡觉型学渣 9c76b30d72
Correct typos. (#4387)
* Correct typos.

* Correct typos.
2024-05-12 18:21:11 -07:00
Daniel Hiltgen 8cc0ee2efe Doc container usage and workaround for nvidia errors 2024-05-09 09:26:45 -07:00
Jeffrey Morgan d5eec16d23
use model defaults for num_gqa, rope_frequency_base and rope_frequency_scale (#1983) 2024-05-09 09:06:13 -07:00
Carlos Gamez daa1a032f7
Update langchainjs.md (#2027)
Updated sample code as per warning notification from the package maintainers
2024-05-08 20:21:03 -07:00
boessu 5d3f7fff26
Update langchainpy.md (#4236)
fixing pip code.
2024-05-07 16:36:34 -07:00
CrispStrobe 7c5330413b
note on naming restrictions (#2625)
* note on naming restrictions

Otherwise the push fails with a cryptic error:
retrieving manifest
Error: file does not exist
==> maybe change that in the code too

* Update docs/import.md

---------

Co-authored-by: C-4-5-3 <154636388+C-4-5-3@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 16:03:21 -07:00
Jeffrey Chen d091fe3c21
Windows automatically recognizes username (#3214) 2024-05-06 15:03:14 -07:00
Mohamed A. Fouad ee02f548c8
Update linux.md (#3847)
Add -e to the log-viewing command in order to show the end of the ollama logs
2024-05-06 15:02:25 -07:00
Darinka 3ecae420ac
Update api.md (#3945)
* Update api.md

Changed the calculation of tps (token/s) in the documentation

* Update docs/api.md

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 14:39:58 -07:00
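For context on the tps calculation this commit touches, here is a minimal sketch. It assumes the response carries `eval_count` (number of generated tokens) and `eval_duration` (generation time in nanoseconds), as described in the API documentation. The helper name and the sample values are hypothetical.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Return generation speed in token/s.

    eval_duration_ns is assumed to be in nanoseconds, so the ratio is
    scaled by 10^9 to convert it to a per-second rate.
    """
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / eval_duration_ns * 1e9


# Hypothetical response values, for illustration only.
response = {"eval_count": 290, "eval_duration": 4_709_213_000}
rate = tokens_per_second(response["eval_count"], response["eval_duration"])
print(f"{rate:.2f} token/s")
```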
Adrien Brault aa93423fbf
docs: pbcopy on mac (#3129) 2024-05-06 13:47:00 -07:00
Hyden Liu fb8ddc564e
chore: delete HEAD (#4194) 2024-05-06 10:32:30 -07:00
Daniel Hiltgen 20f6c06569 Make maximum pending requests configurable
This also bumps up the default to be 50 queued requests
instead of 10.
2024-05-04 21:00:52 -07:00
Daniel Hiltgen e006480e49 Explain the 2 different windows download options 2024-05-04 12:50:05 -07:00
Dr Nic Williams e8aaea030e
Update 'llama2' -> 'llama3' in most places (#4116)
* Update 'llama2' -> 'llama3' in most places

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-03 15:25:04 -04:00
Michael Yang 94c369095f fix line ending
replace CRLF with LF
2024-05-02 14:53:13 -07:00
alwqx 68755f1f5e
chore: fix typo in docs/development.md (#4073) 2024-05-01 15:39:11 -04:00
Christian Frantzen 5950c176ca
Update langchainpy.md (#4037)
Updated the code a bit
2024-04-29 23:19:06 -04:00
Quinten van Buul 2a80f55e2a
Update windows.md (#3855)
Fixed a typo
2024-04-26 16:04:15 -04:00
Patrick Devine 74d2a9ef9a
add OLLAMA_KEEP_ALIVE env variable to FAQ (#3865) 2024-04-23 21:06:51 -07:00
Sri Siddhaarth e6f9bfc0e8
Update api.md (#3705) 2024-04-20 15:17:03 -04:00
Jeremy 85bdf14b56 update jetson tutorial 2024-04-17 16:17:42 -04:00
Carlos Gamez a27e419b47
Update langchainjs.md (#2030)
Replaced ollama.call() with ollama.invoke(), following the deprecation notice in the langchain documentation
2024-04-15 18:37:30 -04:00
Jeffrey Morgan e54a3c7fcd
Update modelfile.md
Remove Modelfile parameters that are decided at runtime
2024-04-15 15:35:44 -04:00
Blake Mizerany 1524f323a3
Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564) 2024-04-09 15:57:45 -07:00
Blake Mizerany fccf3eecaa
build.go: introduce a friendlier way to build Ollama (#3548)
This commit introduces a friendlier way to build Ollama dependencies
and the binary without abusing `go generate`, and removes the
unnecessary extra steps it brings with it.

This script also provides nicer feedback to the user about what is
happening during the build process.

At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
Thomas Vitale cb03fc9571
Docs: Remove wrong parameter for Chat Completion (#3515)
Fixes gh-3514

Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
2024-04-06 09:08:35 -07:00
Daniel Hiltgen 0a74cb31d5 Safeguard for noexec
We may have users that run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
Jeffrey Morgan 856b8ec131
remove need for $VSINSTALLDIR since build will fail if ninja cannot be found (#3350) 2024-03-26 16:23:16 -04:00
Patrick Devine 1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
Jeffrey Morgan f38b705dc7
Fix ROCm link in development.md 2024-03-25 16:32:44 -04:00
Blake Mizerany 22921a3969
doc: specify ADAPTER is optional (#3333) 2024-03-25 09:43:19 -07:00
Daniel Hiltgen d8fdbfd8da Add docs for GPU selection and nvidia uvm workaround 2024-03-21 11:52:54 +01:00
Bruce MacDonald a5ba0fcf78
doc: faq gpu compatibility (#3142) 2024-03-21 05:21:34 -04:00
Jeffrey Morgan 3a30bf56dc
Update faq.md 2024-03-20 17:48:39 +01:00
Jeffrey Morgan 7ed3e94105
Update faq.md 2024-03-18 10:24:39 +01:00
jmorganca 2297ad39da update faq.md 2024-03-18 10:17:59 +01:00
Daniel Hiltgen 6459377ae0
Add ROCm support to linux install script (#2966) 2024-03-14 18:00:16 -07:00
Jeffrey Morgan 5ce997a7b9
Update README.md 2024-03-13 21:12:17 -07:00
Patrick Devine ba7cf7fb66
add more docs for the modelfile message command (#3087) 2024-03-12 16:41:41 -07:00
Daniel Hiltgen b53229a2ed Add docs explaining GPU selection env vars 2024-03-12 11:33:06 -07:00
Jeffrey Morgan 6d3adfbea2
Update troubleshooting.md 2024-03-11 13:22:28 -07:00
Daniel Hiltgen 0fdebb34a9 Doc how to set up ROCm builds on windows 2024-03-09 11:29:45 -08:00
Daniel Hiltgen 4a5c9b8035 Finish unwinding idempotent payload logic
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent.  This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
Jeffrey Morgan 6c0af2599e
Update docs README.md and table of contents 2024-03-08 22:45:11 -08:00
Daniel Hiltgen 280da44522
Merge pull request #2988 from dhiltgen/rocm_docs
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
Jeffrey Morgan b886bec3f9
Update api.md 2024-03-07 23:27:51 -08:00
Daniel Hiltgen 69f0227813 Refined ROCm troubleshooting docs 2024-03-07 11:22:37 -08:00
Daniel Hiltgen 6c5ccb11f9 Revamp ROCm support
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCm's tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on linux.  The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use Go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Jeffrey Morgan d481fb3cc8
update go to 1.22 in other places (#2975) 2024-03-07 07:39:49 -08:00
John 23ebe8fe11
fix some typos (#2973)
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
Jeffrey Morgan ce9f7c4674
Update api.md 2024-03-05 13:13:23 -08:00
Jeffrey Morgan 3b4bab3dc5
Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
elthommy 1f087c4d26
Update langchain python tutorial (#2737)
Remove unused GPT4all
Use nomic-embed-text as embedded model
Fix a deprecation warning (__call__)
2024-02-25 00:31:36 -05:00
Jeffrey Morgan bdc0ea1ba5
Update import.md 2024-02-22 02:08:03 -05:00
Jeffrey Morgan 7fab7918cc
Update import.md 2024-02-22 02:06:24 -05:00
Jeffrey Morgan f0425d3de9
Update faq.md 2024-02-20 20:44:45 -05:00
Jeffrey Morgan 8125ce4cb6
Update import.md
Add instructions to get public key on windows
2024-02-19 22:48:24 -05:00
Jeffrey Morgan df56f1ee5e
Update faq.md 2024-02-19 22:16:42 -05:00
Jeffrey Morgan 41aca5c2d0
Update faq.md 2024-02-19 21:11:01 -05:00
Jeffrey Morgan 753724d867
Update api.md to include examples for reproducible outputs 2024-02-19 20:36:16 -05:00
Patrick Devine 9a7a4b9533
add faqs for memory pre-loading and the keep_alive setting (#2601) 2024-02-19 14:45:25 -08:00
Daniel Hiltgen b338c0635f Document setting server vars for windows 2024-02-19 13:30:46 -08:00
Tristan Rhodes 9774663013
Update faq.md with the location of models on Windows (#2545) 2024-02-16 11:04:19 -08:00
Daniel Hiltgen 1ba734de67 typo 2024-02-15 14:56:55 -08:00
Daniel Hiltgen 29e90cc13b Implement new Go based Desktop app
This focuses on Windows first, but could be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
Jeffrey Morgan 48a273f80b
Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00
Jeffrey Morgan 1c8435ffa9
Update domain name references in docs and install script (#2435) 2024-02-09 15:19:30 -08:00
Jeffrey Morgan 42b797ed9c
Update openai.md 2024-02-08 15:03:23 -05:00
Jeffrey Morgan 336aa43f3c
Update openai.md 2024-02-08 12:48:28 -05:00
Jeffrey Morgan ab0d37fde4
Update openai.md 2024-02-07 17:25:33 -05:00
Jeffrey Morgan 14e71350c8
Update openai.md 2024-02-07 17:25:24 -05:00
Jeffrey Morgan 453f572f83
Initial OpenAI /v1/chat/completions API compatibility (#2376) 2024-02-07 17:24:29 -05:00
Bruce MacDonald 128fce5495
docs: keep_alive (#2258) 2024-02-06 11:00:05 -05:00
Jeffrey Morgan b9f91a0b36
Update import instructions to use convert and quantize tooling from llama.cpp submodule (#2247) 2024-02-05 00:50:44 -05:00
Jeffrey Morgan f0e9496c85
Update api.md 2024-02-02 12:17:24 -08:00
Daniel Hiltgen e7dbb00331 Add container hints for troubleshooting
Some users are new to containers and unsure where the server logs go
2024-01-29 08:53:41 -08:00
Daniel Hiltgen e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Jeffrey Morgan 5be9bdd444
Update modelfile.md 2024-01-25 16:29:48 -08:00
Jeffrey Morgan b706794905
Update modelfile.md to include MESSAGE 2024-01-25 16:29:32 -08:00
Michael Yang 93a756266c faq: update to use launchctl setenv 2024-01-22 13:10:13 -08:00
Daniel Hiltgen df54c723ae Make CPU builds parallel and AMD GPU targets customizable
The linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen a447a083f2 Add compute capability 5.0, 7.5, and 8.0 2024-01-20 14:24:05 -08:00
Daniel Hiltgen abec7f06e5
Merge pull request #2056 from dhiltgen/slog
Mechanical switch from log to slog
2024-01-18 14:27:24 -08:00
Daniel Hiltgen ecbfc0182f Go bump to v1.21 to pick up slog 2024-01-18 14:12:57 -08:00
Daniel Hiltgen fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Daniel Hiltgen 9cd20b0ec8 Refine the linux cuda/rocm developer docs 2024-01-18 09:44:44 -08:00
Tristram Oaten 40a0a90a88
Add group delete to uninstall instructions (#1924)
After executing the `userdel ollama` command, I saw this message:

```sh
$ sudo userdel ollama
userdel: group ollama not removed because it has other members.
```

Which reminded me that I had to remove the dangling group too. For completeness, the uninstall instructions should do this too.

Thanks!
2024-01-12 00:07:00 -05:00
Daniel Hiltgen d88c527be3 Build multiple CPU variants and pick the best
This reduces the built-in linux version to not use any vector extensions,
which enables the resulting builds to run under Rosetta on macOS in
Docker.  Then at runtime it checks for the actual CPU vector
extensions and loads the best CPU library available.
2024-01-11 08:42:47 -08:00
Robin Glauser e868c8a5c7
Update api.md (#1878)
Fixed assistant in the example response.
2024-01-09 16:21:17 -05:00
Bruce MacDonald 3f3eb19a3b
document response in modelfile template variables (#1428) 2024-01-08 14:38:51 -05:00
Daniel Hiltgen 2d9dd14f27
Merge pull request #1697 from dhiltgen/win_docs
Add windows native build instructions
2024-01-05 19:34:20 -08:00
Matt Williams df086d3c8c fix docker doc to point to hub
Signed-off-by: Matt Williams <m@technovangelist.com>
2024-01-04 18:42:23 -08:00
Bruce MacDonald b846eb64d0
Fix template api doc description (#1661) 2024-01-03 11:00:59 -05:00
Cole Gillespie 3c5dd9ed1d
Update README.md (#1766) 2024-01-03 10:44:22 -05:00
Jeffrey Morgan b17ccd0542
Update import.md 2024-01-02 22:28:18 -05:00
Jeffrey Morgan 2a2fa3c329 api.md cleanup & formatting 2023-12-27 14:32:35 -05:00
Daniel Hiltgen e201efa14b Add windows native build instructions 2023-12-25 08:31:34 -08:00
K0IN 10da41d677
Add Cache flag to api (#1642) 2023-12-22 17:16:20 -05:00
Matt Williams 511069a2a5 update "where are models stored" question
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-22 09:48:44 -08:00