mirror of https://github.com/karpathy/minGPT synced 2024-11-15 19:10:39 +01:00

Commit Graph

  • eca27f6316 Merge branch 'master' of github.com:karpathy/minGPT Andrej Karpathy 2020-08-22 17:32:10 -0700
  • d100e2251a Merge branch 'master' of github.com:karpathy/minGPT Andrej Karpathy 2020-08-22 17:32:10 -0700
  • d15a85719e add demo of image gpt trained on CIFAR-10 Andrej Karpathy 2020-08-22 17:29:53 -0700
  • 4e152c7aee add demo of image gpt trained on CIFAR-10 Andrej Karpathy 2020-08-22 17:29:53 -0700
  • ebcc03ec7e Merge pull request #21 from jkravanja/master Andrej 2020-08-21 16:39:49 -0700
  • 382ac70290 Merge pull request #21 from jkravanja/master Andrej 2020-08-21 16:39:49 -0700
  • 004b807eb2 Sort characters to always return same mapping Jaka Kravanja 2020-08-22 02:09:25 +0300
  • 88bf19a869 Sort characters to always return same mapping Jaka Kravanja 2020-08-22 02:09:25 +0300
  • dc0f7d8346 use labml instead of tqdm for training loop Varuna Jayasiri 2020-08-21 17:46:49 +0530
  • d8e08a2471 codestyle and Catalyst example - WIP Sergey Kolesnikov 2020-08-21 00:05:09 +0300
  • ad77167036 Update requirements.txt William Falcon 2020-08-20 15:28:36 -0400
  • b50e5e3e0b Update .gitignore William Falcon 2020-08-20 15:28:17 -0400
  • c2d12ec38d Introducing parameter_names_for_module_types function and making layer norm weight decay parameter exclusions generic Michael Lavelle 2020-08-20 12:29:13 +0100
  • bdc5573f02 Also, excluding ln_f.weight from weight decay Michael Lavelle 2020-08-20 10:00:39 +0100
  • 2c9feb6d0e Fixing HuggingFace LayerNorm.weight reference - replacing with PyTorch equivalents Michael Lavelle 2020-08-20 09:44:24 +0100
  • daa20eba55 Update README.md William Falcon 2020-08-19 18:02:54 -0400
  • d12d6b997f Update README.md William Falcon 2020-08-19 12:32:01 -0400
  • 180fa594df updated ipynb williamfalcon 2020-08-19 16:23:27 +0000
  • 762788d34f updated ipynb williamfalcon 2020-08-19 16:22:19 +0000
  • b1ab817b13 updated ipynb williamfalcon 2020-08-19 16:19:04 +0000
  • 88a95fe04b updated ipynb williamfalcon 2020-08-19 16:18:22 +0000
  • c779766350 Merge branch 'master' of https://github.com/williamFalcon/minGPT William Falcon 2020-08-19 12:05:55 -0400
  • d9a82c0f77 math fix William Falcon 2020-08-19 12:05:50 -0400
  • a073569d63 Delete char_datamodule.py William Falcon 2020-08-19 11:08:26 -0400
  • cec807de49 math fix William Falcon 2020-08-19 11:07:13 -0400
  • ceea9ac1cd notebook fixes williamfalcon 2020-08-19 13:45:56 +0000
  • f31984d8ae notebook fixes williamfalcon 2020-08-19 13:45:16 +0000
  • 2d64b29d5b updated to use pytorch lightning trainer William Falcon 2020-08-19 09:44:22 -0400
  • 9550bbb312 updated to use pytorch lightning trainer William Falcon 2020-08-19 09:41:47 -0400
  • 65115ff200 updated to use pytorch lightning trainer William Falcon 2020-08-19 09:07:49 -0400
  • 0ee8b8dc0d updated to use pytorch lightning trainer William Falcon 2020-08-19 09:07:29 -0400
  • c97efac9a9 Merge pull request #6 from shivamtawari/patch-1 Andrej 2020-08-19 00:17:36 -0700
  • 3f1d1036d7 Merge pull request #6 from shivamtawari/patch-1 Andrej 2020-08-19 00:17:36 -0700
  • f8ce278490 also autocast during sampling Benjamin Wild 2020-08-19 08:08:05 +0200
  • d708b1e5e2 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention Andrej Karpathy 2020-08-18 17:05:59 -0700
  • 8909e1b646 fix a dumb bug, intended to use -1e10 instead of 1e-10. thank you @fpgaminer for spotting and bringing to my attention Andrej Karpathy 2020-08-18 17:05:59 -0700
  • 05a7ae2578 automatic mixed precision training Benjamin Wild 2020-08-18 11:50:24 +0200
  • 25c2ad25dd Update README.md Shivam Tawari 2020-08-18 15:14:28 +0530
  • 31cd989610 Update README.md Shivam Tawari 2020-08-18 15:14:28 +0530
  • 19957d9793 added better generation printout Omry Yadan 2020-08-17 21:47:26 -0700
  • f6d1e9c665 Porting minGPT to Hydra Omry Yadan 2020-08-17 14:31:25 -0700
  • 0d9d098cd2 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. Andrej Karpathy 2020-08-17 00:39:02 -0700
  • 5433de6158 first commit, able to multigpu train fp32 GPTs on math and character-level data, but have done barely any tuning. Andrej Karpathy 2020-08-17 00:39:02 -0700