mirror of https://github.com/karpathy/minGPT synced 2024-04-24 10:55:04 +02:00
minGPT/tests
Andrej 9642f40b83 add a refactored BPE encoder from OpenAI. Basically I don't super trust the huggingface tokenizer: the implementation sprawls across multiple files and inheritance, and has special magic handling around AddedTokens that I don't fully follow. Prefer to roll our own explicit implementation here that exactly mirrors the code of OpenAI and nothing else 2022-07-12 02:01:41 +00:00
test_huggingface_import.py 2022-07-12 02:01:41 +00:00