mirror of https://github.com/karpathy/minGPT synced 2024-11-15 19:10:39 +01:00

Commit Graph

  • 5b22aacc53
    typo marxav 2020-11-10 22:58:28 +0100
  • 17efd27da0
    replace language by orthography marxav 2020-11-10 22:56:50 +0100
  • a500aea83a
    Create marxav 2020-10-27 15:46:44 +0100
  • 7c2a08d8f3
    Reverting BlingDL changes due to errors Alex 2020-10-02 12:38:04 -0700
  • 754480332e
    Removed another wrong code line Alex 2020-10-02 12:31:29 -0700
  • b59e120227
    Trying BlinkDL improvements separately from PytorchLXLA Alex 2020-10-02 12:26:33 -0700
  • c9ee99858f
    Reverted back to old code due to serious issues with the new one Alex 2020-10-02 12:18:21 -0700
  • 17bfe78350
    Added all improvements by BlinkDL Alex 2020-10-02 04:30:46 -0700
  • 705483ec3a
    Added auto-TQDM Alex 2020-10-02 04:24:48 -0700
  • 20e5d4e654
    An attempt to tune the minGPT code. Please stand-by... Alex 2020-10-02 04:23:31 -0700
  • b139d2fc72
    Updated tqdm Alex 2020-09-22 05:52:28 -0700
  • b0906c2117
    #39 Happy Sugar Life 2020-09-20 21:48:57 -1200
  • 3ea94c6e7b
    typo 'terations' Happy Sugar Life 2020-09-14 04:50:21 -1200
  • 359137b457
    Add files via upload Alex 2020-09-09 20:46:55 -0700
  • 686b41c0f8
    [Racially Neutral Code] Happy Sugar Life 2020-09-05 02:07:22 -1200
  • 0fa482b35d
    Merge a796899f656345ac541aba49eccb368f49b7d730 into 4050db60409b5bbaaa3302cee1e49847fc145c65 Andrej 2020-08-30 11:41:57 -0700
  • a796899f65 reorg the bench code to support multigpu training, have to indent properly under __main__ (branch: feature/lightning) Andrej Karpathy 2020-08-30 11:40:31 -0700
  • 492b79fb31 get rid of spurious function for the model Andrej Karpathy 2020-08-30 11:39:55 -0700
  • d91bb1c0be make labels non-blocking transfer to overlap them, but i don't really expect this to do too much to latency Andrej Karpathy 2020-08-30 11:11:46 -0700
  • 4817231b23 testing now works with both lightning and minLightning Andrej Karpathy 2020-08-30 11:11:17 -0700
  • 9b1e5a461f delete Result structs in favor of dicts Andrej Karpathy 2020-08-30 10:46:32 -0700
  • 452a5ab9a0 massive refactor yet again. this was all probably a pretty bad idea Andrej Karpathy 2020-08-29 23:58:45 -0700
  • 1aa67ca527 switch to a faster version of zero_grad() Andrej Karpathy 2020-08-29 20:50:48 -0700
  • ebd40f112c support fp16/32 precision in bench Andrej Karpathy 2020-08-29 17:47:06 -0700
  • 0ed3376b3f move instantiation of text dataset into the constructor so we don't have to create it twice Andrej Karpathy 2020-08-29 17:33:31 -0700
  • fa10298a8d use a standard benchmark (text8) and implement train/val/test splits Andrej Karpathy 2020-08-29 17:30:41 -0700
  • fb37e03cd1 refactor into a datamodule, attempt number 1 Andrej Karpathy 2020-08-29 16:38:58 -0700
  • 81650ae4d7 one more refactor, this is better because the equivalence to lightning is now much cleaner and all of lightning functionality is in one file Andrej Karpathy 2020-08-29 15:40:21 -0700
  • 990c0c7d9a final integration pieces, now runs with both, but it ain't super pretty yet... Andrej Karpathy 2020-08-29 15:19:55 -0700
  • a5a6d1a638 add training_step to the model and remove DataParallel functionality from the base Trainer, will go to Lightning Andrej Karpathy 2020-08-29 14:01:51 -0700
  • 923b6fcf17 and finally get rid of Config object for the Trainer Andrej Karpathy 2020-08-29 13:39:00 -0700
  • c0823ec247 model is also passed into fit() instead of __init__ ,sure. Andrej Karpathy 2020-08-29 12:48:24 -0700
  • e88f0767cb data loaders are passed directly to fit() instead of the dataset, version 2 haha Andrej Karpathy 2020-08-29 12:41:12 -0700
  • 3fa57cd175 data loaders are passed directly to fit() instead of the datasets Andrej Karpathy 2020-08-29 12:39:29 -0700
  • 08f5b9ac03 refactor out the learning rate decay class as a callback Andrej Karpathy 2020-08-29 12:28:49 -0700
  • 61102983f5 step 1: free the GPT module of config and flatten out the args. WIP, breaks notebooks Andrej Karpathy 2020-08-29 11:36:15 -0700
  • 4050db6040
    Merge pull request #32 from brchristian/patch-1 Andrej 2020-08-25 22:41:18 -0700
  • c43600576e
    Merge pull request #31 from fpgaminer/master Andrej 2020-08-25 22:40:47 -0700
  • 4b5d96b99c
    Fix typo in comment in play_char.ipynb brchristian 2020-08-25 20:40:17 -0700
  • a7b13e02ff fix CharDataset::__len__ off by one error fpgaminer 2020-08-25 18:16:49 -0700
  • a17028e8bc
    Merge pull request #3 from abiller/karpathy-master Ariel Biller 2020-08-25 22:17:29 +0300
  • f7560c1b0e i hate merging notebooks. there ought to be a law! ariel 2020-08-25 22:14:25 +0300
  • b8703820b7
    Create CODE_OF_CONDUCT.md Gaushik M.R 2020-08-25 07:29:02 -0400
  • 339f4e7ad3 fix dataloader issue pointed out by @fpgaminer in #28 and introduce shuffle=True and pin_memory=True as defaults. That said I'm still not very happy with this demo because we're likely overfitting a massive model to tiny text and nothing is really tuned at all. This needs a real train/test dataset and a tiny bit of hyperparameter search, todo. Andrej Karpathy 2020-08-24 23:23:53 -0700
  • f00dbe408c bugfix, pycharm com. does not refactor jupyter notebooks (!!) ariel 2020-08-24 23:28:59 +0300
  • 448c847781 Convert config to attrs - even less lines and now immutability is enforced. ariel 2020-08-24 23:16:22 +0300
  • ed8b745ea4 Renaming for enhanced readability, addint some type annotations ariel 2020-08-24 23:15:29 +0300
  • 18ce6dff5a
    Merge pull request #1 from karpathy/master Ariel Biller 2020-08-24 22:22:31 +0300
  • 8b6e3a0d83 remove comment j-planet 2020-08-24 00:10:45 -0700
  • 1f16e14924 replace the first 2 digits of the answer with 0s j-planet 2020-08-24 00:07:19 -0700
  • 94187b944c
    Merge pull request #25 from michaellavelle/shakespeare Andrej 2020-08-23 22:38:42 -0700
  • effa35fd93 Correcting the Bard's name Michael Lavelle 2020-08-24 06:35:45 +0100
  • 63902c8d09 remove passive aggressive comment. control yourself andrej. Andrej Karpathy 2020-08-23 19:36:23 -0700
  • 38d7327dfd instead of -1e10 use float -inf, which I think will play nicer with fp16 down the line Andrej Karpathy 2020-08-23 17:47:05 -0700
  • f683085892 resolve merge conflict, this is not going well at all “Andrej 2020-08-23 17:30:23 -0700
  • a8835cfebc bleh resolve merge conflicts “Andrej 2020-08-23 17:26:19 -0700
  • 5a67ab913d add early stopping to cifar10 image demo “Andrej 2020-08-23 17:19:45 -0700
  • 421caf8b20 mit license file “Andrej 2020-08-23 17:09:21 -0700
  • bbbdac74fa properly separate params that should be weight decayed, and make a small incremental step towards Lightning compatibility by creating the optimizer object inside the model's configure_optimizers “Andrej 2020-08-23 15:48:20 -0700
  • 23982656df add early stopping logic “Andrej 2020-08-23 15:09:09 -0700
  • 2a2e8d5388 Excluding LayerNorm, Embedding, freestanding Parameters and parameters with bias in the name from weight decay. Allowing an override list to be configured on TrainerConfig Michael Lavelle 2020-08-23 21:30:44 +0100
  • 08c08d8db0
    Merge f8ce2784903c93d7cec8d91fcb93c7d38a9e4baf into d100e2251a258ea6c72e59eeba83539567e8fc8c Benjamin Wild 2020-08-22 21:21:54 -0400
  • 14d2a75f20
    Merge ad77167036e87b72d6db117678020741005da6c6 into d100e2251a258ea6c72e59eeba83539567e8fc8c William Falcon 2020-08-23 07:55:33 +0700
  • 3065d61408
    Merge d8e08a2471468142838bd4e97d132fe52d89896c into d100e2251a258ea6c72e59eeba83539567e8fc8c Sergey Kolesnikov 2020-08-23 06:22:08 +0530