Curator: Scalable data pre-processing and curation toolkit for LLMs (github.com)
1 point by tanelpoder 11 hours ago