mirror of https://github.com/karpathy/minGPT synced 2024-05-18 21:46:03 +02:00
Commit Graph

94 Commits

Author SHA1 Message Date
Andrej Karpathy a330148c22 add ability to override config params from command line args. re-inventing the wheel a bit here, should i just use yacs or something? i just really really really do not like dependencies 2022-05-27 12:16:07 -07:00
Andrej Karpathy 8425759c24 early work, refactoring the adder first 2022-05-27 10:04:52 -07:00
Andrej Karpathy 3ed14b2cec i know it doesn't look like much, but this kwarg was not used lol :D 2022-03-27 17:48:05 +01:00
Andrej Karpathy 107b6d7e31 add comment to clarify #39 . Ty @JonathanSum for inspiration PR 2022-03-26 13:52:51 +00:00
Andrej Karpathy dffb6a14e2 Merge branch 'waynemystir-master' 2022-03-26 13:48:03 +00:00
Andrej Karpathy 031ad36f29 don't use default kwargs, in my experience lead to bugs always 2022-03-26 13:47:52 +00:00
Thomas Viehmann 176be2d9bf initialize position embeddings 2022-03-26 13:36:20 +00:00
Mishig Davaadorj fd2977c4e0 Fix broken hugging face link & add link to huggingface / transformers 2022-03-26 13:36:20 +00:00
Andrej 94d880648e Merge pull request #62 from t-vi/init: initialize position embeddings 2022-03-26 12:23:31 +00:00
Andrej ea8706d964 Merge pull request #63 from mishig25/patch-1: Fix broken hugging face link & add link to huggingface / transformers 2022-03-26 12:21:39 +00:00
Mishig Davaadorj bac74347ff Fix broken hugging face link & add link to huggingface / transformers 2021-12-08 11:37:43 +01:00
Thomas Viehmann 744d41003a initialize position embeddings 2021-10-25 14:43:04 +02:00
waynemystir 8fcaafb367 move instantiation of DataLoader 2020-11-20 13:44:49 -05:00
Andrej 4050db6040 Merge pull request #32 from brchristian/patch-1: Fix typo in comment in play_char.ipynb 2020-08-25 22:41:18 -07:00
Andrej c43600576e Merge pull request #31 from fpgaminer/master: fix CharDataset::__len__ off by one error 2020-08-25 22:40:47 -07:00
brchristian 4b5d96b99c Fix typo in comment in play_char.ipynb 2020-08-25 20:40:17 -07:00
fpgaminer a7b13e02ff fix CharDataset::__len__ off by one error 2020-08-25 18:16:49 -07:00
Andrej Karpathy 339f4e7ad3 fix dataloader issue pointed out by @fpgaminer in #28 and introduce shuffle=True and pin_memory=True as defaults. That said I'm still not very happy with this demo because we're likely overfitting a massive model to tiny text and nothing is really tuned at all. This needs a real train/test dataset and a tiny bit of hyperparameter search, todo. 2020-08-24 23:23:53 -07:00
Andrej 94187b944c Merge pull request #25 from michaellavelle/shakespeare: Correcting the Bard's name 2020-08-23 22:38:42 -07:00
Michael Lavelle effa35fd93 Correcting the Bard's name 2020-08-24 06:35:45 +01:00
Andrej Karpathy 63902c8d09 remove passive aggressive comment. control yourself andrej. 2020-08-23 19:36:23 -07:00
Andrej Karpathy 38d7327dfd instead of -1e10 use float -inf, which I think will play nicer with fp16 down the line 2020-08-23 17:47:05 -07:00
“Andrej f683085892 resolve merge conflict, this is not going well at all 2020-08-23 17:30:23 -07:00
“Andrej a8835cfebc bleh resolve merge conflicts 2020-08-23 17:26:19 -07:00
“Andrej 5a67ab913d add early stopping to cifar10 image demo 2020-08-23 17:19:45 -07:00
“Andrej 421caf8b20 mit license file 2020-08-23 17:09:21 -07:00
“Andrej bbbdac74fa properly separate params that should be weight decayed, and make a small incremental step towards Lightning compatibility by creating the optimizer object inside the model's configure_optimizers 2020-08-23 15:48:20 -07:00
“Andrej 23982656df add early stopping logic 2020-08-23 15:09:09 -07:00
Andrej Karpathy d100e2251a Merge branch 'master' of github.com:karpathy/minGPT 2020-08-22 17:32:10 -07:00
Andrej Karpathy eca27f6316 Merge branch 'master' of github.com:karpathy/minGPT 2020-08-22 17:32:10 -07:00
Andrej Karpathy 4e152c7aee add demo of image gpt trained on CIFAR-10 2020-08-22 17:29:53 -07:00
Andrej Karpathy d15a85719e add demo of image gpt trained on CIFAR-10 2020-08-22 17:29:53 -07:00
Andrej 382ac70290 Merge pull request #21 from jkravanja/master: Sort characters to always return same mapping 2020-08-21 16:39:49 -07:00
Andrej ebcc03ec7e Merge pull request #21 from jkravanja/master: Sort characters to always return same mapping 2020-08-21 16:39:49 -07:00
Jaka Kravanja 88bf19a869 Sort characters to always return same mapping 2020-08-22 02:09:25 +03:00
Jaka Kravanja 004b807eb2 Sort characters to always return same mapping 2020-08-22 02:09:25 +03:00
Andrej 3f1d1036d7 Merge pull request #6 from shivamtawari/patch-1: Update README.md 2020-08-19 00:17:36 -07:00
Andrej c97efac9a9 Merge pull request #6 from shivamtawari/patch-1: Update README.md 2020-08-19 00:17:36 -07:00
Andrej Karpathy 8909e1b646 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention 2020-08-18 17:05:59 -07:00
Andrej Karpathy d708b1e5e2 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention 2020-08-18 17:05:59 -07:00
Shivam Tawari 31cd989610 Update README.md 2020-08-18 15:14:28 +05:30
Shivam Tawari 25c2ad25dd Update README.md 2020-08-18 15:14:28 +05:30
Andrej Karpathy 5433de6158 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. 2020-08-17 00:39:02 -07:00
Andrej Karpathy 0d9d098cd2 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. 2020-08-17 00:39:02 -07:00