Uncovering social semantics from textual traces: A theory-driven approach and evidence from public statements of U.S. Members of Congress
Author(s) - Lin YuRu, Margolin Drew, Lazer David
Publication year - 2016
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23540
Subject(s) - computer science, aggregate (composite), semantics (computer science), social media, test (biology), information retrieval, microblogging, world wide web, data science, programming language, paleontology, materials science, composite material, biology
The increasing abundance of digital textual archives provides an opportunity for understanding human social systems. Yet the literature has not adequately considered the disparate social processes by which texts are produced. Drawing on communication theory, we identify three common processes by which documents might be detectably similar in their textual features: authors sharing subject matter, sharing goals, and sharing sources. We hypothesize that these processes produce distinct, detectable relationships between authors in different kinds of textual overlap. We develop a novel n-gram extraction technique to capture such signatures based on n-grams of different lengths. We test the hypothesis on a corpus where the author attributes are observable: the public statements of the members of the U.S. Congress. This article presents the first empirical finding that shows different social relationships are detectable through the structure of overlapping textual features. Our study has important implications for designing text modeling techniques to make sense of social phenomena from aggregate digital traces.
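
To make the idea of textual overlap at different n-gram lengths concrete, the following is a minimal illustrative sketch. It is not the extraction technique developed in the article, which the abstract does not specify. The whitespace tokenization, the Jaccard measure, and the function names `ngrams` and `overlap_by_length` are all assumptions made for illustration.

```python
def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-tokenized).

    Assumption: simple whitespace tokenization; a real pipeline would
    likely handle punctuation and stop words more carefully.
    """
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_by_length(text_a, text_b, lengths=(1, 2, 3, 4)):
    """Jaccard overlap of two texts, computed separately per n-gram length.

    Assumption: Jaccard similarity stands in for whatever overlap
    measure the article actually uses.
    """
    result = {}
    for n in lengths:
        a, b = ngrams(text_a, n), ngrams(text_b, n)
        union = a | b
        # Two texts with no n-grams of this length have zero overlap.
        result[n] = len(a & b) / len(union) if union else 0.0
    return result


if __name__ == "__main__":
    # Hypothetical statements, invented for this example.
    statement_a = "we must protect social security for seniors"
    statement_b = "we must protect medicare for seniors"
    print(overlap_by_length(statement_a, statement_b))
```

For the two hypothetical statements above, overlap falls as n grows (0.625 for unigrams, 0.375 for bigrams, 0.125 for trigrams, 0.0 for 4-grams). Under this sketch, short n-grams mostly capture shared vocabulary while long n-grams capture shared verbatim phrasing, which is one plausible reading of why different lengths might carry different social signals.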
