Scaling Laws for Representation Learning
Scaling Laws, Data-Efficient Learning, Foundation Models, Tokenization
Conducted an empirical analysis of the limits of self-supervised learning in low-resource regimes. Analyzed the trade-off between tokenization density and dataset scale, finding that domain-aligned tokenization serves as a stronger supervision signal than sheer data volume for specialized distributions.
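As a rough sketch of this kind of analysis (not the paper's actual method, data, or numbers), one can fit a saturating power law loss(D) = E + B·D^(-β) to validation loss at several dataset sizes, once per tokenizer, and compare the fitted irreducible loss E and exponent β; every value below is hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

# Saturating power law: loss = E + B * D**(-beta),
# where D is dataset size in tokens and E is the irreducible loss.
def scaling_law(D, E, B, beta):
    return E + B * np.power(D, -beta)

# Hypothetical validation losses at increasing dataset sizes,
# one series per tokenizer (all numbers are illustrative only).
D = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
loss_generic = np.array([4.10, 3.72, 3.41, 3.18, 3.02])
loss_domain = np.array([3.65, 3.30, 3.05, 2.88, 2.78])

for name, loss in [("generic", loss_generic), ("domain-aligned", loss_domain)]:
    (E, B, beta), _ = curve_fit(scaling_law, D, loss, p0=(2.5, 50.0, 0.3), maxfev=10000)
    print(f"{name} tokenizer: E={E:.2f}, B={B:.1f}, beta={beta:.3f}")
```

Under this framing, a lower fitted E for the domain-aligned tokenizer would indicate a loss floor that additional data alone cannot buy back, which is the shape of the trade-off described above.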
Paper
Slides