Create a level 2 section in your notebook called Setup.
Create a level 2 section in your notebook called Dataset.
Feather files in orig folder.
The dataset (__details posted in Slack__) consists of five files. Import each file and save it in feather format to the subfolder orig.
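The import-and-save step could be sketched as follows. This is a minimal sketch, not the required solution: it assumes the raw files are CSV (the source format is not stated here), and the names `read_raw`, `save_feather`, and `ORIG_DIR` are hypothetical helpers introduced only for illustration. Writing feather files with pandas requires the `pyarrow` package.

```python
from pathlib import Path

import pandas as pd

ORIG_DIR = Path("orig")  # subfolder named in the instructions


def read_raw(csv_path, date_cols=None):
    """Read one raw file. CSV input is an assumption, not stated in the source."""
    # parse_dates converts the listed columns to datetime64 while reading;
    # False (no date parsing) is used when no columns are given.
    parse_dates = list(date_cols) if date_cols else False
    return pd.read_csv(csv_path, parse_dates=parse_dates)


def save_feather(df, name, out_dir=ORIG_DIR):
    """Write df to <out_dir>/<name>.feather and return the path (needs pyarrow)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{name}.feather"
    # to_feather only accepts a default index, so reset it defensively.
    df.reset_index(drop=True).to_feather(out_path)
    return out_path
```

For example, with a hypothetical raw file named `submissions.csv`, `save_feather(read_raw("submissions.csv", ["date_created"]), "submissions")` would write `orig/submissions.feather`, which can later be loaded with `pd.read_feather`.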
df_authors from file authors.feather
List of 13,182 Reddit users who commented/submitted to QAnon identified subreddits. Columns:
- author: unique identifier (hashed for user privacy - [see paper, Section 3]).
- is_QA_interested: 1 for QAnon-interested, 0 otherwise ([see paper, Section 1]).
- is_QA_enthusiastic: 1 for QAnon-enthusiastic, 0 otherwise ([see paper, Section 1]).
- status: stores [Active, DNE, Is_suspended]. (DNE=Does Not Exist)

df_comments from file comments.feather
List of 10,831,922 comments with full text. Columns:
- id: unique identifier.
- link_id: submission id that this comment is in response to (NB: with prefix "t3_").
- parent_id: this points to link_id of the previous comment that this comment is replying to.
- author: hash of author identifier. Link to df_authors.author.
- subreddit: name of subreddit.
- body: the text of the comment.
- date_created.

df_submissions from file submissions.feather
List of 2,099,875 posts with full text. Columns:
- subreddit:
- id: unique identifier.
- score:
- num_replies:
- author: hash of author identifier. Link to df_authors.author
- title:
- text:
- is_self:
- domain:
- url:
- permalink:
- upvote_ratio:
- date_created: should be read using parse_dates=["date_created"].

df_subreddits from file subreddits.feather
List of 12,987 subreddits where at least two QAnon-enthusiastic users have made a submission. Too many columns to list here.

df_paper from file paper.feather
List of 19 subreddits, identified in the paper [see paper, Appendix A], where QAnon users were more active.
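The key relationships described above (comments point to submissions through link_id, which carries the "t3_" prefix) can be illustrated with a small join. This is a sketch under assumptions: it supposes that the id column of df_submissions holds the bare identifier without the prefix, and the helper names `strip_t3_prefix` and `attach_submission_titles` are hypothetical, not part of the assignment.

```python
import pandas as pd


def strip_t3_prefix(link_id):
    """Remove the leading "t3_" from a link_id, if present."""
    return link_id[3:] if link_id.startswith("t3_") else link_id


def attach_submission_titles(df_comments, df_submissions):
    """Left-join each comment to the title of the submission it belongs to."""
    # Assumption: df_submissions["id"] stores ids without the "t3_" prefix.
    comments = df_comments.assign(
        submission_id=df_comments["link_id"].map(strip_t3_prefix)
    )
    titles = df_submissions[["id", "title"]].rename(
        columns={"id": "submission_id", "title": "submission_title"}
    )
    # how="left" keeps every comment, even when its submission is missing.
    return comments.merge(titles, on="submission_id", how="left")
```

A left join is used so that comments whose submission is absent from df_submissions are kept, with a missing submission_title. The same pattern applies to linking either table to df_authors through the author column.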