RAND: Reconstructing Arguments from Newsworthy Debates

Large portions of ongoing political debates are now available in machine-readable form, ranging from the formal public sphere of parliamentary proceedings to the semi-public sphere of social media. This offers new opportunities for gaining a comprehensive overview of the arguments exchanged by analysing text sources with automated techniques. The goal of the RANT/RAND project series within the priority programme RATIO (Robust Argumentation Machines) is to contribute to the automated extraction of arguments and argument structures from machine-readable texts. Our approach combines logical and corpus-linguistic methods and favours precision over recall, on the assumption that the sheer volume of available data will allow us to pinpoint prevalent arguments even under moderate recall. Specifically, we identify logical patterns corresponding to individual argument schemes taken from standard classifications, such as argument from expert opinion; essentially, these logical patterns are formulae in dedicated modal logics that contain placeholders. With each logical pattern we associate several linguistic patterns corresponding to different realisations of the formula in natural language; these patterns are developed and refined through corpus-linguistic studies and formalised as corpus queries. Our approach thus integrates the development of automated argument extraction methods with work towards a better understanding of the linguistic aspects of everyday political argumentation.
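
To make the pairing of logical and linguistic patterns concrete, the following is a minimal sketch, not the project's implementation. It assumes that a logical pattern can be approximated by a formula template with named placeholders and that each linguistic pattern can be approximated by a regular expression, which here stands in for a proper corpus query. The names `LogicalPattern`, `EXPERT_OPINION` and `extract`, the formula `says(expert, claim)` and the two regular expressions are all hypothetical illustrations.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalPattern:
    """A formula template plus the surface patterns that realise it (hypothetical)."""
    name: str
    formula: str                 # template with named placeholders
    linguistic_patterns: tuple   # regexes standing in for corpus queries


# Illustrative pattern only; real patterns would be refined on corpus data.
EXPERT_OPINION = LogicalPattern(
    name="argument from expert opinion",
    formula="says({expert}, {claim})",
    linguistic_patterns=(
        r"(?P<expert>experts|scientists|economists)\s+(?:say|warn|agree)"
        r"\s+that\s+(?P<claim>[^.!?]+)",
        r"according\s+to\s+(?P<expert>experts|scientists|economists),"
        r"\s*(?P<claim>[^.!?]+)",
    ),
)


def extract(pattern, text):
    """Return one instantiated formula per match of any linguistic pattern."""
    results = []
    for regex in pattern.linguistic_patterns:
        for match in re.finditer(regex, text, flags=re.IGNORECASE):
            # The named groups are the fillers of the formula's placeholders.
            fillers = {k: v.strip() for k, v in match.groupdict().items()}
            results.append(pattern.formula.format(**fillers))
    return results


print(extract(EXPERT_OPINION, "Experts warn that the deal will hurt trade."))
```

The narrow, hand-written expressions reflect the preference for precision over recall: a text that does not match any pattern closely simply yields no formula.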

Research in the ongoing first project phase is focused on designing and evaluating patterns and queries for individual arguments, with a large corpus of English Twitter messages used as a running case study. In the second project phase, we plan to test the robustness of our approach by branching out into additional text types, in particular longer coherent texts such as newspaper articles and parliamentary debates, as well as by moving to German texts, which present additional challenges for the design of linguistic patterns (due, among other things, to long-distance dependencies and the limited availability of high-quality NLP tools).

Crucially, we will also introduce similarity-based methods to enable complex reasoning on extracted arguments, representing the fillers in extracted formulae by specially tailored neural phrase embeddings. Moreover, we will extend the overall approach to allow for the high-precision extraction of argument structure, including explicit and implicit references to other arguments. We will combine these efforts with more specific investigations into the logical structure of arguments about how to achieve certain goals and into the interconnection between argumentation and interpersonal relationships, e.g. in ad hominem arguments.
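
The idea of comparing fillers by similarity can be illustrated with a small sketch. It rests on a strong simplifying assumption: the hypothetical `embed` function below uses plain bag-of-words counts as a stand-in for the neural phrase embeddings mentioned above, which would capture far more than word overlap. Only the comparison step, cosine similarity between two filler vectors, is meant to carry over; the function names and example phrases are illustrative.

```python
import math
from collections import Counter


def embed(phrase):
    """Toy stand-in for a phrase embedding: a sparse bag-of-words vector."""
    return Counter(phrase.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty phrase is similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(query, candidates):
    """Return the candidate filler closest to the query filler."""
    query_vec = embed(query)
    return max(candidates, key=lambda c: cosine(query_vec, embed(c)))


fillers = ["tariffs increase prices", "the deal will hurt trade"]
print(most_similar("tariffs raise prices", fillers))
```

With real embeddings, two fillers could be matched even when they share no words, which is what would allow claims extracted from different formulae to be related to one another.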