Feature Story
Someone Made a Dataset of One Million Bluesky Posts for 'Machine Learning Research'
Nov 27, 2024 · 404media.co
The dataset is intended for machine learning research and experimentation with social media data. Each post in the dataset contains text content, metadata, and information about media attachments and reply relationships.
Key takeaways
- A machine learning librarian at Hugging Face has released a dataset of one million Bluesky posts for machine learning research.
- The dataset records when each post was made and who posted it.
- Daniel van Strien posted about the dataset on Bluesky, providing more details about its content and purpose.