The dataset includes 274,668 posts scraped from Stormfront and 509,982 comments collected through the Reddit API. The following files are included:
- stormfront_posts.txt: one post per line, no post metadata
- reddit_posts.txt: one comment per line, no comment metadata
- stormfront_post_data_processed.json.gz: preprocessed posts from Stormfront, includes post metadata
- reddit_sample.csv.gz: preprocessed comments from Reddit, includes comment metadata
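
The files listed above could be read with something like the following minimal Python sketch, which uses only the standard library. It rests on several assumptions that this description does not confirm: that the files are UTF-8 encoded, that the `.json.gz` file holds a single JSON document (rather than one JSON object per line), and that the `.csv.gz` file starts with a header row. The helper names `read_lines`, `read_json_gz`, and `read_csv_gz` are illustrative and are not part of the dataset.

```python
import csv
import gzip
import json


def read_lines(path):
    """Yield one post or comment per line from a plain-text file."""
    # Assumption: UTF-8 text; undecodable bytes are replaced, not fatal.
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            yield line.rstrip("\n")


def read_json_gz(path):
    """Load a gzip-compressed JSON file into Python objects."""
    # Assumption: the archive contains one JSON document, not JSON Lines.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def read_csv_gz(path):
    """Yield each row of a gzip-compressed CSV as a dict keyed by column name."""
    # Assumption: the first row of the CSV is a header.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        yield from csv.DictReader(f)
```

As a quick sanity check, `sum(1 for _ in read_lines("stormfront_posts.txt"))` can be compared against the post count stated above. If `read_json_gz` raises a `json.JSONDecodeError`, the single-document assumption is probably wrong and the file may need to be parsed line by line instead.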
The Twitter data used in the report is not available for public reuse because of Twitter's terms of service and our data use agreement with VOX-Pol.