Commit graph

12 commits

Author SHA1 Message Date
Michael Yang eae3af6807 clean up convert tokenizer 2024-08-27 11:11:43 -07:00
Michael Yang 3eb08377f8 detect chat template from configs that contain lists 2024-08-27 10:49:33 -07:00
Michael Yang 5a28b9cf5f bert 2024-08-20 17:27:34 -07:00
Michael Yang d8e2664c33 convert: fix parse functions 2024-07-31 15:58:55 -07:00
Michael Yang eafc607abb convert: only extract large files 2024-07-31 15:58:55 -07:00
Michael Yang df993fa37b comments 2024-07-31 15:58:55 -07:00
Michael Yang 5e9db9fb0b refactor convert 2024-07-31 15:58:33 -07:00
Michael Yang c895a7d13f some gocritic 2024-06-04 11:13:30 -07:00
Ikko Eltociear Ashimine 955c317cab chore: update tokenizer.go (#4571) PreTokenziers -> PreTokenizers 2024-05-22 00:25:23 -07:00
Michael Yang bbbd9f20f3 cleanup 2024-05-20 16:13:57 -07:00
Michael Yang 547132e820 bpe pretokenizer 2024-05-20 16:13:57 -07:00
Patrick Devine 2d315ba9a9 add missing file 2024-05-20 16:13:57 -07:00