mirror of https://github.com/karpathy/minGPT synced 2024-11-15 19:10:39 +01:00

Commit Graph

  • 26d5e00274 fix: add missing dependency in setup.py Benjamin Schulz 2023-01-07 13:29:34 -0600
  • 325000e631 Fix typo in bpe.py Ikko Eltociear Ashimine 2023-01-06 23:01:47 +0900
  • 06b7c3200e Update README.md Chinhai Hour 2023-01-06 12:05:19 +0700
  • b45b695029 Update generate.ipynb Daniel Gross 2022-11-05 11:08:55 -0400
  • 69ff7ac7d7 add dependency reminder, remove semicolon Daniel Gross 2022-11-05 11:08:28 -0400
  • a422563cea Remove extraneous semicolon Daniel Gross 2022-11-05 11:00:31 -0400
  • 7c22554ef6 named_parameters does not have to be recursive Equim 2022-11-03 01:31:47 +0800
  • b59643b884 Update readme.md Dmitry Nikolayev 2022-10-20 14:58:34 +0200
  • 59047a6cab Update README.md Dmitry Nikolayev 2022-10-20 13:14:10 +0200
  • a362aa626b clean up commented code Gav Gray 2022-09-10 09:13:55 -0400
  • d29554614d make data_parallel optional Gav Gray 2022-09-10 09:11:27 -0400
  • a8dbbd13fc changes required to use dataparallel Gav Gray 2022-09-10 09:07:10 -0400
  • 92facc2d82 Merge branch 'master' of github.com:ericjang/minGPT Eric 2022-08-05 10:58:34 -0700
  • 7deb6d50c8 refactor encoder/decoder into separate initialization methods that can be overwritten in subclasses Eric 2022-08-05 10:58:19 -0700
  • adf1e57252 Update mingpt/model.py Younes Belkada 2022-08-05 18:46:15 +0200
  • 92b54e7d1d added new assert younesbelkada 2022-08-05 09:28:51 +0200
  • f345c397cc remove dummy class younesbelkada 2022-08-05 09:24:48 +0200
  • c4bce59533 add dtype support younesbelkada 2022-08-05 09:24:15 +0200
  • deadd49230 add dtype support on config + added tests younesbelkada 2022-08-05 09:23:25 +0200
  • 36b3e763cc Merge branch 'karpathy:master' into master Eric Jang 2022-08-04 16:37:28 -0700
  • 43dcee2b7d better name Eric 2022-08-04 16:37:04 -0700
  • cb32149497 move input + pos embedding computation into a separate method. Eric 2022-08-04 16:35:35 -0700
  • 7218bcfa52 Merge pull request #84 from ericjang/master Andrej 2022-08-03 21:35:21 -0700
  • 48c815bb16 Add setup.py to allow mingpt to be used as a third-party library Eric 2022-08-03 16:55:11 -0700
  • cafce4544b Merge pull request #83 from mishig25/patch-1 Andrej 2022-07-28 17:04:56 -0700
  • 90420ee978 Use XOR operator ^ for checking assertion type_given XOR params_given Mishig Davaadorj 2022-07-28 22:33:51 +0200
  • ace6f596a1 Merge e87c30e538c3dddd2f7d592766519428d4e667c0 into ca74e9a13c6903c643f2879172118cc0d4a226bc Matt Stancliff 2022-07-27 21:51:18 +0200
  • ca74e9a13c Merge pull request #82 from neverix/patch-1 Andrej 2022-07-27 11:46:53 -0700
  • e461bf6f00 Fix README.md typo neverix 2022-07-27 21:10:29 +0300
  • 31559f7dc5 Merge pull request #81 from luigidisotto/callbacks-optimizer Andrej 2022-07-26 09:44:31 -0700
  • c4c650e3d5 Add optimizer to Trainer's self for callbacks. Luigi Di Sotto 2022-07-26 10:17:44 +0200
  • e87c30e538 Refactor BPE and add to poetry runner Matt Stancliff 2022-07-24 13:19:34 -0700
  • bf44c172b0 Cleanup chargpt and add it to poetry launcher Matt Stancliff 2022-07-24 13:02:11 -0700
  • eed8054132 Add poetry environment for package Matt Stancliff 2022-07-24 12:51:32 -0700
  • 2f89bbb840 Use proper README.md file name conventions Matt Stancliff 2022-07-24 12:44:23 -0700
  • 4d4ad74956 Improve readability in more places Matt Stancliff 2022-07-24 12:31:49 -0700
  • 8f05460427 Use modern python string formatting in adder Matt Stancliff 2022-07-24 12:30:48 -0700
  • 8e9d737cc8 Use explicit digit counts in adder Matt Stancliff 2022-07-24 12:29:29 -0700
  • 8e111e74da Allow callbacks to store state in trainer Matt Stancliff 2022-07-24 12:27:13 -0700
  • e6e12ec628 Introduce pathlib.Path instead of os manipulation Matt Stancliff 2022-07-24 12:22:18 -0700
  • c67fb29c08 Use python built-in iterator cycling Matt Stancliff 2022-07-24 12:16:07 -0700
  • 85a9ee9326 Add default result directory to .gitignore Matt Stancliff 2022-07-24 12:14:21 -0700
  • d60e6825f1 Reformat everything properly Matt Stancliff 2022-07-24 12:02:58 -0700
  • e2065c59c6 use a bit more extended example that has my last name too because nice to show how it breaks up into more tokens Andrej 2022-07-12 04:31:31 +0000
  • d8dd157f9c add a full example into the script as well Andrej 2022-07-12 04:25:17 +0000
  • 59fea1ba1f Merge pull request #78 from nat/patch-1 Andrej 2022-07-11 21:05:32 -0700
  • e9f6e3d448 Typos Nat Friedman 2022-07-11 20:55:38 -0700
  • 0fc12d703d adjust the readme docs to reflect bpe changes Andrej 2022-07-12 02:14:39 +0000
  • 9642f40b83 add a refactored BPE encoder from openai. Basically I dont super trust the huggingface tokenizer, the implementation sprawls multiple files and inheritance and has special magic handling around AddedTokens that I don't fully follow. Prefer to roll our own explicit implementation here that exactly mirrors the code of OpenAI and nothing else Andrej 2022-07-12 02:01:41 +0000
  • 40635a91f4 few added todo notes to readme Andrej 2022-07-11 18:59:29 +0000
  • acaadacd59 refactor sequence generation into the model and match the huggingface/transformers api. touches everything but this makes a lot more sense to me aesthetically Andrej 2022-07-11 18:50:53 +0000
  • 5af9e5c5d7 small typo fixes in readme Andrej 2022-07-09 00:41:50 +0000
  • 610baf2314 quick docs on some planned todos Andrej 2022-07-09 00:27:51 +0000
  • 12f346a63d make generation script into a notebook, makes much more sense that way i think, and much easier to use refactor_wip Andrej 2022-07-08 23:53:14 +0000
  • b14c99191a rename the script to make sense Andrej 2022-07-08 22:58:49 +0000
  • 803f38800d refactor pretrained weight loading into from_pretrained and add unit tests Andrej 2022-07-08 22:56:15 +0000
  • 4a56b20f80 fix parameter counting Andrej 2022-07-08 21:10:54 +0000
  • 28bcbd0ad9 ocd is killing me Andrej 2022-07-08 20:35:08 +0000
  • 449a980d39 updates to readme Andrej 2022-07-08 20:34:23 +0000
  • b7c9acc46c remove legacy notebooks Andrej 2022-07-08 20:20:35 +0000
  • e7fe54898d refactor readme to match the repo Andrej 2022-07-08 20:19:17 +0000
  • 7569ab9d7f simple notebook demo showing how to use minGPT Andrej Karpathy 2022-07-07 18:15:26 -0700
  • 2e979dde5f ummm eyeroll Andrej 2022-06-27 20:53:51 +0000
  • 2f3400f42a split out register_callback to set/add Andrej Karpathy 2022-07-01 08:32:19 -0700