AI | Hugging Face Blog | 2024-03-20 | Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models
AI | Hugging Face Blog | 2021-07-15 | Deep Learning over the Internet: Training Language Models Collaboratively