D3.1 – Knowledge extraction models
In this document, we present the metrics, methods, and results of the analysis of data from existing prosumer communities performed in T3.1, which lay the foundations for the development of the Prosumer Intelligence Toolkit.
Research questions were defined after conversations with project partners and stakeholders, and after interviews and focus groups with publishers organized in the framework of WP2. To address these research questions, we collected and analyzed digital traces of interactions among thousands of users from three popular prosumer platforms in which users co-create content: a fanfiction community where users create works based on existing fictional universes and review one another’s work (AO3); a fandom wiki, where users collaboratively edit wiki pages to document any element related to a fictional universe (Fandom); and a social reading platform, where users create original stories and read and comment on one another’s work (Wattpad).
The central part of the document includes:
- An analysis of popularity dynamics in AO3, and a model to predict which works will become very popular in the near future, based on their previous history. We have further adapted the model to apply it to tags and predict topics that will become popular. For both work and tag popularity prediction, we obtained satisfactory accuracy with a simple and interpretable model. This responds to an emerging need of publishers and stakeholders to identify trending content and topics, and may help to understand trends in the interests of readers and writers and to identify valuable content to be considered for publishing.
- An analysis of social interactions between AO3 users, modelled as graphs, studying the structural properties of the social networks resulting from different kinds of interactions and the centrality of users in the community. We characterized each user by the combination of their centrality as a producer (feedback received as an author) and as an active consumer (feedback given as a reader) in the network. We found consistent clusters across the major communities, which suggest the existence of different emerging profiles. Among these are the ones we dubbed superproducers, superconsumers, and superprosumers: the users who have the highest levels of centrality in one of the two dimensions, or in both. We believe this “map” of prosumer roles may be helpful to understand the composition of a community and identify relevant users for specific aims.
- An analysis of collective dynamics in Fandom wikis, studying how activity on different tasks and spaces evolves over time and across different phases of community growth, for different communities, with an investigation of peaks of activity and their nature in the life of a community. The most edited pages turn out to be related to the main characters of each fictional universe. The amount of activity devoted to parallel spaces beyond editing the main content of the wiki (e.g. coordination, communication, technical aspects) varies substantially across communities; as a general tendency, the proportion of effort spent on personal communication and interactions increases during periods of higher activity in a community.
- An analysis of the dynamics, language, and emotion of users’ feedback on a sample of very popular books from two diverse Wattpad categories: teen literature and classics. We found a tendency towards more activity on the first and last chapters of a book, and we have shown how the language of the feedback around each book, and across books, can be characterized and compared through different tools for language and emotion analysis.
Finally, the document presents lines for future research and some preliminary ideas for the development of interactive dashboards in the Prosumer Intelligence Toolkit, to be developed in task T3.2.
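As an illustration of the producer/consumer characterization of AO3 users summarized above, the following minimal Python sketch labels users from a list of feedback events. It rests on assumptions that are not taken from the deliverable: centrality is approximated by simple feedback counts (received as an author, given as a reader), and the "highest levels" are taken to be a hypothetical top fraction of users (`top_fraction`). The function name `classify_prosumers` and the sample user names are likewise illustrative; the actual analysis may rely on other centrality measures and on clustering rather than a fixed cutoff.

```python
from collections import defaultdict


def classify_prosumers(feedback_edges, top_fraction=0.1):
    """Label users by their position in a directed feedback graph.

    feedback_edges: iterable of (reader, author) pairs, one per feedback event.
    top_fraction: hypothetical cutoff; share of users treated as "top" in each
    dimension (an assumption for this sketch, not a value from the report).
    """
    received = defaultdict(int)  # producer side: feedback received as an author
    given = defaultdict(int)     # consumer side: feedback given as a reader
    for reader, author in feedback_edges:
        if reader == author:
            continue  # self-feedback is ignored in this sketch
        given[reader] += 1
        received[author] += 1

    users = set(given) | set(received)
    k = max(1, round(len(users) * top_fraction))

    def top_users(scores):
        # Rank by score (descending), breaking ties alphabetically,
        # and keep only users with a non-zero score.
        ranked = sorted(users, key=lambda u: (-scores.get(u, 0), u))
        return {u for u in ranked[:k] if scores.get(u, 0) > 0}

    top_producers = top_users(received)
    top_consumers = top_users(given)

    roles = {}
    for user in users:
        if user in top_producers and user in top_consumers:
            roles[user] = "superprosumer"
        elif user in top_producers:
            roles[user] = "superproducer"
        elif user in top_consumers:
            roles[user] = "superconsumer"
        else:
            roles[user] = "other"
    return roles


# Toy example with made-up users: each pair is (reader, author).
edges = [
    ("bob", "alice"), ("bob", "alice"),
    ("carol", "alice"), ("carol", "alice"),
    ("dave", "alice"),
    ("bob", "carol"), ("carol", "bob"), ("alice", "bob"),
]
roles = classify_prosumers(edges, top_fraction=0.5)
```

In this toy example, alice ends up as a superproducer, bob as a superprosumer, carol as a superconsumer, and dave as "other". A fixed top-fraction cutoff is only one simple way to separate profiles; it is used here because it keeps the mapping from the two centrality dimensions to the three roles easy to inspect.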