AI

Last Updated: 10/17/2024

This is a running list of AI resources; I will keep adding to it.

Language Modeling Datasets:
Link – Salesforce – The WikiText Long Term Dependency Language Modeling Dataset
Link – Paperswithcode – Language Modeling
Link – Paperswithcode – Penn Treebank Dataset
Link – Kili – Open-Sourced Training Datasets for Large Language Models (LLMs) [list of datasets]