Commit Graph

9 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| autoresearch | 59e9dd9aab | RoPE base frequency 10K to 200K | 2026-03-08 04:13:30 +00:00 |
| autoresearch | 7da0b673a1 | short window 1/8 context (256 tokens instead of 1024) | 2026-03-08 04:07:44 +00:00 |
| autoresearch | 8363d52e8d | SSSSL window pattern (5:1 short:long ratio) | 2026-03-08 04:01:58 +00:00 |
| autoresearch | 4e6697f68d | warmdown 0.5 to 0.7 (more cooldown) | 2026-03-08 03:56:11 +00:00 |
| autoresearch | 7f2a65c9a5 | depth 9 aspect_ratio 57 (extra layer, dim ~512) | 2026-03-08 03:44:34 +00:00 |
| autoresearch | bea057bc08 | halve batch size 524K to 262K for more steps in 5 min | 2026-03-08 03:38:47 +00:00 |
| Andrej Karpathy | 8a5c4869bd | bunch of small changes to docs and files, and a teaser figure with a blooper :) | 2026-03-07 19:00:04 +00:00 |
| Marcin Bogdanski | 17b480aa65 | add fallback FA3 kernel for non-Hopper GPUs | 2026-03-07 01:31:48 +00:00 |
| Andrej Karpathy | b11d6f283f | initial commit | 2026-03-06 21:58:52 +00:00 |
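
Several of the commit messages above quote figures that can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names, the 2048-token context length (inferred from "1/8 context (256 tokens)"), the power-of-two batch sizes, the rule that model width equals depth times aspect ratio, and the simple tiling of the window pattern across layers are all assumptions, not code from the repository.

```python
# Illustrative sketch; every name here is hypothetical, not from the repository.

def tile_window_pattern(pattern: str, n_layers: int) -> str:
    """Repeat a per-layer pattern (S = short window, L = long window) over n_layers."""
    return "".join(pattern[i % len(pattern)] for i in range(n_layers))

def window_sizes(layout: str, context_len: int, short_fraction: float) -> list[int]:
    """Map each layer letter to an attention window size in tokens."""
    short = int(context_len * short_fraction)
    return [short if letter == "S" else context_len for letter in layout]

context_len = 2048   # assumption: 256 tokens is stated to be 1/8 of the context
depth = 9            # from "depth 9 aspect_ratio 57"
aspect_ratio = 57

# Assumption: the pattern is simply tiled; the real code may treat the last layer differently.
layout = tile_window_pattern("SSSSL", depth)
sizes = window_sizes(layout, context_len, 1 / 8)

# Assumption: width = depth * aspect_ratio, which matches the "dim ~512" note.
model_dim = depth * aspect_ratio

# Assumption: "524K" and "262K" mean 2**19 and 2**18 tokens per batch.
# Halving the batch doubles the optimizer steps for the same number of tokens.
steps_ratio = 2**19 / 2**18

print(layout)       # SSSSLSSSS
print(sizes[0])     # 256
print(model_dim)    # 513
print(steps_ratio)  # 2.0
```

Under these assumptions a short-window layer attends to 256 tokens and a long-window layer to the full 2048, and the raw width of 513 sits just above the ~512 noted in the commit message.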