Commit graph

16 commits

Author SHA1 Message Date
Daniel Hiltgen 6f351bf586 review comments and coverage 2024-06-14 14:55:50 -07:00
Daniel Hiltgen 68dfc6236a refined test timing
adjust timing on some tests so they don't timeout on small/slow GPUs
2024-06-14 14:51:40 -07:00
Daniel Hiltgen 6fd04ca922 Improve multi-gpu handling at the limit
Still not complete, needs some refinement to our prediction to understand the
discrete GPUs available space so we can see how many layers fit in each one
since we can't split one layer across multiple GPUs we can't treat free space
as one logical block
2024-06-14 14:51:40 -07:00
Daniel Hiltgen 206797bda4 Fix concurrency integration test to work locally
This worked remotely but wound up trying to spawn multiple servers
locally which doesn't work
2024-06-14 14:51:40 -07:00
Daniel Hiltgen 7f2fbad736 Skip max queue test on remote
This test needs to be able to adjust the queue size down from
our default setting for a reliable test, so it needs to skip on
remote test execution mode.
2024-05-16 16:24:18 -07:00
Daniel Hiltgen 074dc3b9d8 Integration fixes 2024-05-10 14:20:10 -07:00
Michael Yang a7248f6ea8 update tests 2024-05-06 15:24:01 -07:00
Daniel Hiltgen 45d61aaaa3 Add integration test to push max queue limits 2024-05-05 10:46:25 -07:00
Daniel Hiltgen f2ea8470e5 Local unicode test case 2024-04-22 19:29:12 -07:00
Daniel Hiltgen 34b9db5afc Request and model concurrency
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
Daniel Hiltgen aeb1fb5192 Add test case for context exhaustion
Confirmed this fails on 0.1.30 with known regression
but passes on main
2024-04-04 07:42:17 -07:00
Jeffrey Morgan cd135317d2
Fix macOS builds on older SDKs (#3467) 2024-04-03 10:45:54 -07:00
Daniel Hiltgen 4fec5816d6 Integration test improvements
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Patrick Devine 1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
Daniel Hiltgen 7b6cbc10ec Integration tests conditionally pull
If images aren't present, pull them.
Also fixes the expected responses
2024-03-25 08:57:45 -07:00
Daniel Hiltgen 949b6c01e0 Revamp go based integration tests
This uplevels the integration tests to run the server which can allow
testing an existing server, or a remote server.
2024-03-23 14:24:18 +01:00