Commit graph

693 commits

Author SHA1 Message Date
Michael Yang 6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Michael Yang d791df75dd check memory requirements before loading 2023-08-10 09:23:11 -07:00
Michael Yang 020a3b3530 disable gpu for q5_0, q5_1, q8_0 quants 2023-08-10 09:23:11 -07:00
Michael Yang fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
Bruce MacDonald 5b5cc9c9f1 embeddings endpoint 2023-08-10 11:49:55 -04:00
Bruce MacDonald 4b3507f036 embeddings endpoint
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
Jun Tian 5ebce03c77 Add an example on multiline input (#311) 2023-08-10 08:22:28 -07:00
Bruce MacDonald 5e25f801ed fix a typo in the tweetwriter example Modelfile 2023-08-10 10:19:53 -04:00
Bruce MacDonald 8e1234b758 fix embeddings invalid values 2023-08-10 10:17:00 -04:00
Soroush Javadi 10885986b8 fix a typo in the tweetwriter example Modelfile 2023-08-10 15:12:48 +03:30
Bruce MacDonald 984c9c628c fix embeddings invalid values 2023-08-09 16:50:53 -04:00
Bruce MacDonald c4861360ec remove embed docs 2023-08-09 16:14:19 -04:00
Bruce MacDonald 9738ef85db allow for concurrent pulls of the same files 2023-08-09 11:35:24 -04:00
Bruce MacDonald ac971c56d1 Update images.go 2023-08-09 11:31:54 -04:00
Bruce MacDonald 8228d166ce pr comments 2023-08-09 11:31:54 -04:00
Bruce MacDonald 907e6c56b3 unlock downloadu in case or requestDownload err 2023-08-09 11:31:54 -04:00
Bruce MacDonald 868e3b31c7 allow for concurrent pulls of the same files 2023-08-09 11:31:54 -04:00
Bruce MacDonald 09d8bf6730 fix build errors 2023-08-09 10:45:57 -04:00
Bruce MacDonald 7a5f3616fd embed text document in modelfile 2023-08-09 10:26:19 -04:00
Jeffrey Morgan cff002b824 use content type application/x-ndjson for streaming responses 2023-08-08 21:38:10 -07:00
Jeffrey Morgan 55cf5021f0 update langchain example to include python 2023-08-08 21:03:10 -07:00
Jeffrey Morgan f58caa5ab5 update README.md 2023-08-08 15:50:23 -07:00
Jeffrey Morgan 82df473ec9 use note syntax in README.md 2023-08-08 15:49:50 -07:00
Jeffrey Morgan e184c1d035 Link to api.md in README.md 2023-08-08 15:48:47 -07:00
Jeffrey Morgan 371d4e5df3 docs: fix invalid json in api.md 2023-08-08 15:46:05 -07:00
Jeffrey Morgan 1f78e409b4 docs: format with prettier 2023-08-08 15:41:48 -07:00
Jeffrey Morgan 34a88cd776 docs: update api.md formatting 2023-08-08 15:41:19 -07:00
Bruce MacDonald 1bee2347be pr feedback
- defer closing llm on embedding
- do not override licenses
- remove debugging print line
- reformat model file docs
2023-08-08 17:01:37 -04:00
Jeffrey Morgan a027a7dd65 add 0.0.0.0 as an allowed origin by default
Fixes #282
2023-08-08 13:39:50 -07:00
Jeffrey Morgan 22986ccb38 add llama2:70b to the model library list 2023-08-08 13:08:05 -07:00
Bruce MacDonald 884d78ceb3 allow embedding from model binary 2023-08-08 14:38:57 -04:00
Bruce MacDonald 3ceac05108 Add embedding docs 2023-08-08 14:04:11 -04:00
Bruce MacDonald 21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Michael Yang f2074ed4c0 Merge pull request #306 from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Bruce MacDonald a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Bruce MacDonald 34a13a9d05 pass flags to serve to allow setting allowed-origins + host and port 2023-08-08 10:41:42 -04:00
Jeffrey Morgan 8713ac23a8 allow overriding template and system in /api/generate
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
Jeffrey Morgan 5eb712f962 trim whitespace before checking stop conditions
Fixes #295
2023-08-08 00:29:19 -04:00
Michael Yang 4dc5b117dd automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the left of the system instructions
2023-08-07 16:19:12 -07:00
Matt Williams 931a5f3cb9 Merge pull request #304 from jmorganca/matt/docs
missed a backtick
2023-08-07 15:14:06 -07:00
Jeffrey Morgan 639288bf2b make ollama binary executable on build 2023-08-07 18:10:37 -04:00
Jeffrey Morgan d112c15d58 remove old library and web directories 2023-08-07 18:09:24 -04:00
Matt Williams 1267895e44 missed a backtick
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-08-07 13:53:49 -07:00
Matt Williams 089d03bc8d Merge pull request #289 from jmorganca/docs
First draft of API Docs
2023-08-07 13:46:22 -07:00
Michael Yang ab3ced9d32 Merge pull request #276 from jmorganca/rope-freq
configurable rope frequency parameters
2023-08-07 13:39:38 -07:00
Matt Williams 0c52b4509b get rid of namespace and site
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-08-07 13:27:58 -07:00
Matt Williams 13aace3d34 clarify some more
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-08-07 13:21:54 -07:00
Matt Williams 2b3bb41598 model name format added
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-08-07 13:17:16 -07:00
cmiller01 93492f1e18 correct precedence of serve params (args over env over default) 2023-08-07 19:55:20 +00:00
Michael Chiang 54ba3e2ceb langchain JS integration (#302)
langchain JS integration
2023-08-07 12:21:36 -04:00