minGPT

mirror/minGPT

mirror of https://github.com/karpathy/minGPT synced 2024-11-15 19:10:39 +01:00

Commit Graph

Branches: feature/lightning, master, refactor_wip

eca27f6316 Merge branch 'master' of github.com:karpathy/minGPT Andrej Karpathy 2020-08-22 17:32:10 -0700
d100e2251a Merge branch 'master' of github.com:karpathy/minGPT Andrej Karpathy 2020-08-22 17:32:10 -0700
d15a85719e add demo of image gpt trained on CIFAR-10 Andrej Karpathy 2020-08-22 17:29:53 -0700
4e152c7aee add demo of image gpt trained on CIFAR-10 Andrej Karpathy 2020-08-22 17:29:53 -0700
ebcc03ec7e Merge pull request #21 from jkravanja/master Andrej 2020-08-21 16:39:49 -0700
382ac70290 Merge pull request #21 from jkravanja/master Andrej 2020-08-21 16:39:49 -0700
004b807eb2 Sort characters to always return same mapping Jaka Kravanja 2020-08-22 02:09:25 +0300
88bf19a869 Sort characters to always return same mapping Jaka Kravanja 2020-08-22 02:09:25 +0300
dc0f7d8346 use labml instead of tqdm for training loop Varuna Jayasiri 2020-08-21 17:46:49 +0530
d8e08a2471 codestyle and Catalyst example - WIP Sergey Kolesnikov 2020-08-21 00:05:09 +0300
ad77167036 Update requirements.txt William Falcon 2020-08-20 15:28:36 -0400
b50e5e3e0b Update .gitignore William Falcon 2020-08-20 15:28:17 -0400
c2d12ec38d Introducing parameter_names_for_module_types function and making layer norm weight decay parameter exclusions generic Michael Lavelle 2020-08-20 12:29:13 +0100
bdc5573f02 Also, excluding ln_f.weight from weight decay Michael Lavelle 2020-08-20 10:00:39 +0100
2c9feb6d0e Fixing HuggingFace LayerNorm.weight reference - replacing with PyTorch equivalents Michael Lavelle 2020-08-20 09:44:24 +0100
daa20eba55 Update README.md William Falcon 2020-08-19 18:02:54 -0400
d12d6b997f Update README.md William Falcon 2020-08-19 12:32:01 -0400
180fa594df updated ipynb williamfalcon 2020-08-19 16:23:27 +0000
762788d34f updated ipynb williamfalcon 2020-08-19 16:22:19 +0000
b1ab817b13 updated ipynb williamfalcon 2020-08-19 16:19:04 +0000
88a95fe04b updated ipynb williamfalcon 2020-08-19 16:18:22 +0000
c779766350 Merge branch 'master' of https://github.com/williamFalcon/minGPT William Falcon 2020-08-19 12:05:55 -0400
d9a82c0f77 math fix William Falcon 2020-08-19 12:05:50 -0400
a073569d63 Delete char_datamodule.py William Falcon 2020-08-19 11:08:26 -0400
cec807de49 math fix William Falcon 2020-08-19 11:07:13 -0400
ceea9ac1cd notebook fixes williamfalcon 2020-08-19 13:45:56 +0000
f31984d8ae notebook fixes williamfalcon 2020-08-19 13:45:16 +0000
2d64b29d5b updated to use pytorch lightning trainer William Falcon 2020-08-19 09:44:22 -0400
9550bbb312 updated to use pytorch lightning trainer William Falcon 2020-08-19 09:41:47 -0400
65115ff200 updated to use pytorch lightning trainer William Falcon 2020-08-19 09:07:49 -0400
0ee8b8dc0d updated to use pytorch lightning trainer William Falcon 2020-08-19 09:07:29 -0400
c97efac9a9 Merge pull request #6 from shivamtawari/patch-1 Andrej 2020-08-19 00:17:36 -0700
3f1d1036d7 Merge pull request #6 from shivamtawari/patch-1 Andrej 2020-08-19 00:17:36 -0700
f8ce278490 also autocast during sampling Benjamin Wild 2020-08-19 08:08:05 +0200
d708b1e5e2 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention Andrej Karpathy 2020-08-18 17:05:59 -0700
8909e1b646 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention Andrej Karpathy 2020-08-18 17:05:59 -0700
05a7ae2578 automatic mixed precision training Benjamin Wild 2020-08-18 11:50:24 +0200
25c2ad25dd Update README.md Shivam Tawari 2020-08-18 15:14:28 +0530
31cd989610 Update README.md Shivam Tawari 2020-08-18 15:14:28 +0530
19957d9793 added better generation printout Omry Yadan 2020-08-17 21:47:26 -0700
f6d1e9c665 Porting minGPT to Hydra Omry Yadan 2020-08-17 14:31:25 -0700
0d9d098cd2 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. Andrej Karpathy 2020-08-17 00:39:02 -0700
5433de6158 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. Andrej Karpathy 2020-08-17 00:39:02 -0700