minGPT

Mirror of https://github.com/karpathy/minGPT, synced 2024-05-18 21:46:03 +02:00

Author	SHA1	Message	Date
Andrej Karpathy	a330148c22	add ability to override config params from command line args. re-inventing the wheel a bit here, should i just use yacs or something? i just really really really do not like dependencies	2022-05-27 12:16:07 -07:00
Andrej Karpathy	8425759c24	early work, refactoring the adder first	2022-05-27 10:04:52 -07:00
Andrej Karpathy	3ed14b2cec	i know it doesn't look like much, but this kwarg was not used lol :D	2022-03-27 17:48:05 +01:00
Andrej Karpathy	107b6d7e31	add comment to clarify #39 . Ty @JonathanSum for inspiration PR	2022-03-26 13:52:51 +00:00
Andrej Karpathy	dffb6a14e2	Merge branch 'waynemystir-master'	2022-03-26 13:48:03 +00:00
Andrej Karpathy	031ad36f29	don't use default kwargs, in my experience lead to bugs always	2022-03-26 13:47:52 +00:00
Thomas Viehmann	176be2d9bf	initialize position embeddings	2022-03-26 13:36:20 +00:00
Mishig Davaadorj	fd2977c4e0	Fix broken hugging face link & add link to huggingface / transformers	2022-03-26 13:36:20 +00:00
Andrej	94d880648e	Merge pull request #62 from t-vi/init initialize position embeddings	2022-03-26 12:23:31 +00:00
Andrej	ea8706d964	Merge pull request #63 from mishig25/patch-1 Fix broken hugging face link & add link to huggingface / transformers	2022-03-26 12:21:39 +00:00
Mishig Davaadorj	bac74347ff	Fix broken hugging face link & add link to huggingface / transformers	2021-12-08 11:37:43 +01:00
Thomas Viehmann	744d41003a	initialize position embeddings	2021-10-25 14:43:04 +02:00
waynemystir	8fcaafb367	move instantiation of DataLoader	2020-11-20 13:44:49 -05:00
Andrej	4050db6040	Merge pull request #32 from brchristian/patch-1 Fix typo in comment in play_char.ipynb	2020-08-25 22:41:18 -07:00
Andrej	c43600576e	Merge pull request #31 from fpgaminer/master fix CharDataset::__len__ off by one error	2020-08-25 22:40:47 -07:00
brchristian	4b5d96b99c	Fix typo in comment in play_char.ipynb	2020-08-25 20:40:17 -07:00
fpgaminer	a7b13e02ff	fix CharDataset::__len__ off by one error	2020-08-25 18:16:49 -07:00
Andrej Karpathy	339f4e7ad3	fix dataloader issue pointed out by @fpgaminer in #28 and introduce shuffle=True and pin_memory=True as defaults. That said I'm still not very happy with this demo because we're likely overfitting a massive model to tiny text and nothing is really tuned at all. This needs a real train/test dataset and a tiny bit of hyperparameter search, todo.	2020-08-24 23:23:53 -07:00
Andrej	94187b944c	Merge pull request #25 from michaellavelle/shakespeare Correcting the Bard's name	2020-08-23 22:38:42 -07:00
Michael Lavelle	effa35fd93	Correcting the Bard's name	2020-08-24 06:35:45 +01:00
Andrej Karpathy	63902c8d09	remove passive aggressive comment. control yourself andrej.	2020-08-23 19:36:23 -07:00
Andrej Karpathy	38d7327dfd	instead of -1e10 use float -inf, which I think will play nicer with fp16 down the line	2020-08-23 17:47:05 -07:00
“Andrej	f683085892	resolve merge conflict, this is not going well at all	2020-08-23 17:30:23 -07:00
“Andrej	a8835cfebc	bleh resolve merge conflicts	2020-08-23 17:26:19 -07:00
“Andrej	5a67ab913d	add early stopping to cifar10 image demo	2020-08-23 17:19:45 -07:00
“Andrej	421caf8b20	mit license file	2020-08-23 17:09:21 -07:00
“Andrej	bbbdac74fa	properly separate params that should be weight decayed, and make a small incremental step towards Lightning compatibility by creating the optimizer object inside the model's configure_optimizers	2020-08-23 15:48:20 -07:00
“Andrej	23982656df	add early stopping logic	2020-08-23 15:09:09 -07:00
Andrej Karpathy	d100e2251a	Merge branch 'master' of github.com:karpathy/minGPT	2020-08-22 17:32:10 -07:00
Andrej Karpathy	eca27f6316	Merge branch 'master' of github.com:karpathy/minGPT	2020-08-22 17:32:10 -07:00
Andrej Karpathy	4e152c7aee	add demo of image gpt trained on CIFAR-10	2020-08-22 17:29:53 -07:00
Andrej Karpathy	d15a85719e	add demo of image gpt trained on CIFAR-10	2020-08-22 17:29:53 -07:00
Andrej	382ac70290	Merge pull request #21 from jkravanja/master Sort characters to always return same mapping	2020-08-21 16:39:49 -07:00
Andrej	ebcc03ec7e	Merge pull request #21 from jkravanja/master Sort characters to always return same mapping	2020-08-21 16:39:49 -07:00
Jaka Kravanja	88bf19a869	Sort characters to always return same mapping	2020-08-22 02:09:25 +03:00
Jaka Kravanja	004b807eb2	Sort characters to always return same mapping	2020-08-22 02:09:25 +03:00
Andrej	3f1d1036d7	Merge pull request #6 from shivamtawari/patch-1 Update README.md	2020-08-19 00:17:36 -07:00
Andrej	c97efac9a9	Merge pull request #6 from shivamtawari/patch-1 Update README.md	2020-08-19 00:17:36 -07:00
Andrej Karpathy	8909e1b646	fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention	2020-08-18 17:05:59 -07:00
Andrej Karpathy	d708b1e5e2	fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention	2020-08-18 17:05:59 -07:00
Shivam Tawari	31cd989610	Update README.md	2020-08-18 15:14:28 +05:30
Shivam Tawari	25c2ad25dd	Update README.md	2020-08-18 15:14:28 +05:30
Andrej Karpathy	5433de6158	first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning.	2020-08-17 00:39:02 -07:00
Andrej Karpathy	0d9d098cd2	first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning.	2020-08-17 00:39:02 -07:00

94 Commits, All Branches