ollama

Author	SHA1	Message	Date
Michael Yang	3e21799377	rm unused system prompt	2024-05-29 11:26:47 -07:00
Michael Yang	26a00a0410	use ffi for tokenizing/detokenizing	2024-05-29 11:26:47 -07:00
Daniel Hiltgen	646371f56d	Merge pull request #3278 from zhewang1-intc/rebase_ollama_main Enabling ollama to run on Intel GPUs with SYCL backend	2024-05-28 16:30:50 -07:00
Jeffrey Morgan	1f5008544b	Update install.sh	2024-05-28 15:01:22 -07:00
Jeffrey Morgan	45cbfc5aee	fix wsl2 status check for nvidia cards (#4689)	2024-05-28 14:49:46 -07:00
Jeffrey Morgan	6d423b383b	Improve install experience on WSL2 and Linux (#4653)	2024-05-28 14:41:50 -07:00
Josh	ad897080a2	working on integration of multi-byte and multi-width runes (#4549) * integrated runewidth for display management - fixed cursor movement for multi-width char * updated input and deletion of multi-byte chars * fixed line history with some exceptions * improved insert and add * fixed issues with moving across lines * end of line extra space tracking * saved changes * fixed end of line issues with empty spaces * worked some more * worked on end of line * fixed failed test * fixed minor inserting bug * fixed movement hotkeys * adjusted hotkeys * removed comments * Update readline/buffer.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com> * deleted comments and duplicate code * removed duplicate code * added comments, refactored add function to use addChar * added helper to retrieve lineSpacing, renamed lineFlags for clarity * fixed remove() --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2024-05-28 12:04:03 -07:00
Jeffrey Morgan	b7d316d98d	fix nvidia detection in install script (#4683)	2024-05-28 09:59:36 -07:00
Daniel Hiltgen	d7339fad52	Merge pull request #4682 from dhiltgen/more_time Give the final model loading more time	2024-05-28 09:36:02 -07:00
Daniel Hiltgen	92c81e8117	Give the final model loading more time On some systems, 1 minute isn't sufficient to finish the load after it hits 100%. This creates 2 distinct timers, although they're both set to the same value for now so we can refine the timeouts further.	2024-05-28 09:08:10 -07:00
Tai	9db0996ed4	Add OllamaSpring Project to Readme (#4672) * Add OllamaSpring Project to Readme * Update README.md --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-05-27 19:58:26 -07:00
Orfeo Ciano	6f43898b17	Adds olpaka flutter client (#4647) * Adds olpaka flutter client * Update README.md --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-05-27 17:22:01 -07:00
Lei Jitang	7487229c34	llm/server.go: Fix 2 minor typos (#4661) Signed-off-by: Lei Jitang <leijitang@outlook.com>	2024-05-27 17:21:10 -07:00
Rayan Mostovoi	8a8e7afa96	small fix on examples/python-simplechat/client.py to actually get a streamed response and get tokens printed as we receive it (#4671)	2024-05-27 17:19:20 -07:00
Jeffrey Morgan	c79f8c9c39	Ensure `nvidia` and `nvidia_uvm` kernel modules are loaded in `install.sh` script and at startup (#4652) * ensure kernel modules are loaded in `install.sh` script and at startup * indentation * use `SUDO` variable * restart if nouveau is detected * consistent success message for AMD	2024-05-26 14:57:17 -07:00
Jeffrey Morgan	485016bfbb	Update install.sh	2024-05-26 11:46:00 -07:00
Daniel Hiltgen	0165ba1651	Merge pull request #4638 from dhiltgen/better_error Report better warning on client closed abort of load	2024-05-25 14:32:28 -07:00
Daniel Hiltgen	c4209d6d21	Report better warning on client closed abort of load If the client closes the connection before we finish loading the model we abort, so let's make the log message clearer to help users understand this failure mode	2024-05-25 09:23:28 -07:00
Michael Yang	6adca97f37	Merge pull request #4619 from noxer/patch-1 Fix download retry issue	2024-05-24 17:21:57 -07:00
Michael Yang	9a3c8003c8	Merge pull request #4624 from ollama/mxyng/fix-5 fix q5_0, q5_1	2024-05-24 16:11:21 -07:00
Michael Yang	d51f15257c	Update llm/ggml.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2024-05-24 16:10:43 -07:00
Michael Yang	8f440d579a	fix q5_0, q5_1	2024-05-24 16:01:46 -07:00
Patrick Devine	4cc3be3035	Move envconfig and consolidate env vars (#4608)	2024-05-24 14:57:15 -07:00
Tim Scheuermann	db2ffa79f1	Fix download retry issue	2024-05-24 20:30:42 +02:00
Jeffrey Morgan	afd2b058b4	set codesign timeout to longer (#4605)	2024-05-23 22:46:23 -07:00
Wang,Zhe	fd5971be0b	support ollama run on Intel GPUs	2024-05-24 11:18:27 +08:00
Daniel Hiltgen	89bf98bcf2	Merge pull request #4598 from dhiltgen/docs Tidy up developer guide a little	2024-05-23 15:14:29 -07:00
Daniel Hiltgen	1b2d156094	Tidy up developer guide a little	2024-05-23 15:14:05 -07:00
Michael Yang	714adb8bd1	bump (#4597)	2024-05-23 14:16:26 -07:00
Daniel Hiltgen	95b1133d0c	Merge pull request #4547 from dhiltgen/load_progress Wire up load progress	2024-05-23 14:06:02 -07:00
Daniel Hiltgen	b37b496a12	Wire up load progress This doesn't expose a UX yet, but wires the initial server portion of progress reporting during load	2024-05-23 13:36:48 -07:00
Bruce MacDonald	d6f692ad1a	Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL (#4322) Co-authored-by: ManniX-ITA <20623405+mann1x@users.noreply.github.com>	2024-05-23 13:21:49 -07:00
Daniel Hiltgen	f77713bf1f	Add isolated gpu test to troubleshooting	2024-05-23 09:33:25 -07:00
Jeffrey Morgan	38255d2af1	Use flash attention flag for now (#4580) * put flash attention behind flag for now * add test * remove print * up timeout for scheduler tests	2024-05-22 21:52:09 -07:00
Michael	73630a7e85	add phi 3 medium (#4578)	2024-05-22 12:53:45 -04:00
Ikko Eltociear Ashimine	955c317cab	chore: update tokenizer.go (#4571) PreTokenziers -> PreTokenizers	2024-05-22 00:25:23 -07:00
Josh	9f18b88a06	Merge pull request #4566 from ollama/jyan/shortcuts add Ctrl + W shortcut	2024-05-21 22:49:36 -07:00
Josh Yan	353f83a9c7	add Ctrl + W shortcut	2024-05-21 16:55:09 -07:00
Patrick Devine	3bade04e10	doc updates for the faq/troubleshooting (#4565)	2024-05-21 15:30:09 -07:00
Michael Yang	a6d0f443eb	Merge pull request #4543 from ollama/mxyng/simple-safetensors simplify safetensors reading	2024-05-21 14:43:55 -07:00
Michael Yang	96236b7968	Merge pull request #4268 from ollama/pdevine/llama3 Convert directly from llama3	2024-05-21 14:43:37 -07:00
Sang Park	4434d7f447	Correct typo in error message (#4535) The spelling of the term "request" has been corrected, which was previously mistakenly written as "requeset" in the error log message.	2024-05-21 13:39:01 -07:00
Michael Yang	171eb040fc	simplify safetensors reading	2024-05-21 11:28:22 -07:00
Michael Yang	3591bbe56f	add test	2024-05-21 11:28:22 -07:00
Michael Yang	34d5ef29b3	fix conversion for f16 or f32 inputs	2024-05-21 11:28:22 -07:00
Michael Yang	bbbd9f20f3	cleanup	2024-05-20 16:13:57 -07:00
Michael Yang	547132e820	bpe pretokenizer	2024-05-20 16:13:57 -07:00
Patrick Devine	2d315ba9a9	add missing file	2024-05-20 16:13:57 -07:00
Patrick Devine	d355d2020f	add fixes for llama	2024-05-20 16:13:57 -07:00
Patrick Devine	c8cf0d94ed	llama3 conversion	2024-05-20 16:13:57 -07:00

2915 commits