RECONSTRUCTING ARGUMENTATION PATTERNS IN GERMAN NEWSPAPER ARTICLES ON MULTIDRUG-RESISTANT PATHOGENS: A MULTI-MEASURE KEYWORD APPROACH

This study explores the reconstruction of argumentative patterns through keywords in a newspaper corpus on multi-resistant organisms. Starting from manually identified frequent argumentation patterns based on a previous study by Peters (2017), keywords are assigned to their assumed argumentative function. We calculate keywords using three different measures (log likelihood, log ratio, adjusted log ratio), which cover different frequency ranges. This approach allows us to explore argumentation on varying levels of semantic granularity. While an unambiguous category assignment is hardly achievable because frequent keywords appear in a wide range of contexts, keywords assigned to argumentation patterns do mostly occur in argumentative contexts. Most of our pre-determined argumentation patterns were successfully reconstructed using keywords. Moreover, we identify two patterns absent from our original annotation scheme. We also demonstrate that the different measures uncover words of noticeably different frequencies and thus argumentative specificity. Therefore, we deem keywords useful for exploring argumentative discourse.


Introduction
Multidrug-resistant organisms (MDRO) have been extensively covered in the media over recent decades. MDRO are bacteria resistant to numerous antibiotics, leaving limited treatment options. Pathogens like Methicillin-resistant Staphylococcus aureus (MRSA) pose threats especially in hospitals (nosocomial infections), where most infections are contracted (Cassini et al. 2019). Moreover, hospitalised patients are more vulnerable to further infections (Swaminathan et al. 2013). Medical studies have found that media coverage of MDRO influences family caregivers' attitudes, and that laypeople strongly base their beliefs on news reports (Heckel et al. 2017; Peters et al. 2019). Gill et al. (2006) report that press texts are the primary information source on medical topics for the public. According to Germany's central medical institution, the Robert Koch Institute, approximately 10,000-15,000 annual casualties result from hospital-related infections, often from MDRO.¹ Thus, news coverage on this subject is likely relevant to the overall public perception of hospitals.
We propose that the study of argumentation in these articles can improve the understanding of which attitudes they reinforce.
Methodologically, we address challenges in the quantitative exploration of everyday argumentation by corpus linguistics: defeasible argumentation, implicitness, and the gap between linguistic and logical content, since the same propositions may be realised in various linguistic forms.
We hypothesise that keywords are useful to explore argumentation, as "pointers which suggest to the prospector areas which are worth mining" (Scott 2010, p. 51). Furthermore, this study is a qualitative, use-case-dependent evaluation of keyness measures: so far, they have mostly been evaluated regarding their statistical validity, but less so in terms of their usefulness in discourse analysis.
We address the following questions:
- Which arguments can be reconstructed from German media articles on MDRO?
- Are keywords appropriate to complement smaller-scale qualitative argumentation analysis?
- How does the choice of keyword measure influence our results?
This paper is structured as follows: Section 2 presents related work in argumentation theory and quantitative argumentation analysis. Section 3 introduces our corpus and the keyword measures used in the study. Results are presented in section 4, where we provide an overview of recovered argumentation patterns alongside a methodological evaluation of keyword quality. The results are followed by the discussion (5) and a conclusion with suggestions for future work (6).

Argumentation analysis
In a traditional view, an argument contains one or several premises and a conclusion. Traditional argumentation schemes represent a strict relation between the premise(s) and the conclusion: the truth of the conclusion depends only on the truth of the premises. Consider the classic modus ponens: If X implies Y and X holds true, Y also holds true. For instance, if all puppies are dogs and Polly is a puppy (proposition X), Polly is also a dog (Y).
In everyday language, however, logical implications of this kind are rare. Instead, argumentation follows defeasible logic, where premises and conclusions can remain implicit (cf. Walton et al. 2008). Moreover, the premise does not strictly imply the conclusion: the argument from correlation to cause states that, given that A and B are correlated, A is the reason for B (Walton et al. 2008, pp. 328-329). While correlation is often framed in this way (we hope that readers will excuse our own defeasible reasoning), the phrase Correlation does not imply causation is sufficiently conventionalised to be the title of a Wikipedia page.²
This application-oriented rather than normative type of analysis relates to the pragma-dialectic approach (van Eemeren and Grootendorst 2004, 2016). Besides logical relations, it considers rhetorical patterns and extra-linguistic macro-structural factors increasing persuasiveness. Arguments can be indicated by particular linguistic constructions: some verbs implicitly reference causal relations (X destroys/increases Y, van Eemeren et al. 2007, pp. 170-171), whereas on top of X, Y indicates coordination between arguments (van Eemeren et al. 2007, p. 214). In this sense, the pragma-dialectic view focuses on persuasive intention rather than normative correctness and shows that subtle linguistic patterns reflect argumentative functions.
Some argumentation frameworks distinguish between context-abstract and context-bound schemes. Context-abstract schemes are topic-agnostic and applicable to all discourses (Kienpointner 1992; Wengeler 2003, 2015), e.g. references to high-prestige discourse actors (argument from expert opinion: Walton et al. 2008, p. 310). Context-bound schemes derive from context-abstract schemes, where general patterns are enriched with features of the discourse topic (Wengeler 2003, 2005, 2015). In healthcare discourse, context-bound versions of the argument from expert opinion include scientists, doctors or politicians filling the expert role.

Quantitative argumentation study
As a sub-field of Natural Language Processing (NLP), Argumentation Mining processes arguments in large corpora. Such projects often tackle well-structured texts, like Debatepedia, a database of professional debates (Cabrio and Villata 2013). Arguments in these data tend to correspond to normative schemes involving strict logical relations. A popular task is to identify structural components, including premises, conclusions and inter-relations like support or attack (Stab and Gurevych 2014).
Despite NLP primarily focusing on formal argumentation models, an increasing number of papers have adopted argumentation schemes for addressing everyday discourse (Cabrio et al. 2013; Hansen and Walton 2013; Feng and Hirst 2011; Janier and Saint-Dizier 2018; Walton and Macagno 2015).
By incorporating argumentation schemes, NLP tools are provided with a resource to handle defeasible logic omnipresent in everyday reasoning.

Corpus linguistics and argumentation
In a corpus linguistic contribution, Degano (2007) explores indicators for presuppositions and dissociations, starting from a list of markers from theoretical work (Levinson 1983, pp. 181-184). For instance, stating that someone did not manage to do something indicates the presupposition that they have attempted it and failed (Degano 2007, p. 366).
Although argument schemes have played a marginal role in corpus linguistics, compatible views on argumentation have been explored, often based on keywords. While statistical prominence does not infallibly indicate qualitative importance, keywords are a promising starting point, provided "that looking at […] words in argumentative texts researchers […] can find, if not an 'objective discovery procedure,' certainly a significant 'test bed'" (Rigotti and Rocci 2005, p. 127).
O'Halloran (2011) combines manual coding with keyness in WMatrix (Rayson 2008). After annotating claims and challenges in a corpus, he compares linguistic patterns of argument components with the rest of the corpus using keywords, key POS tags and key semantic domains.
Al-Hejin (2015) works closer to context-bound argumentation schemes, studying news coverage of Muslim women. He focuses on macro-propositions: "global" motives of the general discourse topic (van Dijk 2008, p. 16). For example, when articles mention women's choices to wear a hijab, such statements are frequently followed by specific arguments, e.g. suggesting that they refuse to integrate into Western culture (Al-Hejin 2015, p. 40). Macro-propositions relate to context-bound schemes, structuring the logical content of individual arguments in discourse. Premises and conclusions can remain implicit or have various linguistic realisations. Macro-propositions were identified by keywords, which were grouped manually and verified with key semantic domains.
Baker (2004) compares pro- and anti-reform speeches in a political debate on equalising the age of sexual consent for homosexual men in Britain. Pro- and anti-reform representatives differ not only on the obvious lexical level, but also regarding the underlying logic: opposing arguments tend to form chains of individual arguments building upon each other, while proponents' arguments were less intertwined and "more straightforward" (Baker 2004, p. 104).
Partington (2003) studies argumentation in White House briefings. Besides concordances of "pure" keyword categories, he demonstrates using n-grams that keywords are integrated into lexico-grammatical patterns, which further highlight their argumentative function.
Our keyword categories, which will be presented in the results section, capture frequent argumentative patterns, similar to context-bound schemes. We expect the keywords to be content-specific rather than general-language indicators. This distinguishes our approach from pragma-dialectic frameworks but enables a focus on the particularities of argumentation within the discourse topic. We understand argumentation patterns as units of meaning which
a) link a controversial topic to speakers' stance in argumentative contexts
b) support argumentative speech acts through repeatedly presenting content-related features which need not be strictly argumentative in themselves
Our analysis aims to reconstruct such patterns through keywords based on a manually created gold standard of context-bound schemes.

Data collection
Our corpus contains 1.3 million tokens from 1,200 German media articles on MDRO. It is part of a corpus of 10,000 texts and 14 million tokens collected with BootCat (Baroni and Bernardini 2004), covering MDRO and related issues (clinical hygiene, medical errors, drug resistance).
Additional seeds included general terms on bacteria (gramnegativ 'gram negative'), antibiotic resistance (antibiotikaresistent) and hospital-related problems (krankenhausbedingt 'due to a hospital stay'). The dataset required manual clean-up because some seeds, like Resistenzentwicklung ('resistance development'), were too abstract. This was assisted by a script returning all seeds present in each text.
Each text was manually annotated for the actor group that the author and the intended readership belonged to (laypeople, alternative medicine, jurisdiction, hospitals, medical staff, agriculture, media, politics, economy, science). This allowed us to form sub-corpora. For instance, mass media texts were assigned to the author category media and the reader category laypeople.
The corpus was uploaded to CQPweb (Hardie 2012), enabling searches on various linguistic levels and statistical analyses.
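The clean-up script itself is not described further. A minimal sketch of how such a seed check might look is given below; the function name, the case-insensitive substring matching and the example sentence are illustrative assumptions rather than the original implementation.

```python
def seeds_in_text(text, seeds):
    """Return all seed terms found in `text`.

    Assumption: matching is a case-insensitive substring search, so that
    inflected forms such as 'gramnegativen' still match 'gramnegativ'.
    """
    lowered = text.lower()
    return [seed for seed in seeds if seed.lower() in lowered]


# Seed terms taken from the examples mentioned in the text.
seeds = ["gramnegativ", "antibiotikaresistent",
         "krankenhausbedingt", "Resistenzentwicklung"]

# Hypothetical example sentence, not taken from the corpus.
doc = "Die Resistenzentwicklung bei gramnegativen Erregern nimmt zu."
found = seeds_in_text(doc, seeds)
```

A text that only matches an abstract seed such as Resistenzentwicklung could then be flagged for manual inspection.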

Keyword extraction
We used three keyword measures:
(1) Log Likelihood (G²), the most popular significance measure in corpus-based analysis (Dunning 1993)
(2) Log Ratio (LR), an effect size measure implemented in CQPweb (Hardie 2012), based on the binary log of the ratio of relative frequencies
(3) LRC, an adjusted version of Log Ratio, using the lower bound of a 99% confidence interval to adapt keyness values according to their significance (cf. Evert et al. 2018).
Significance indicates whether there is enough evidence to confirm that a perceived difference between two datasets is not random. Given enough data, any two corpora will differ significantly, no matter how qualitatively meaningful this difference is (cf. Kilgarriff 2001). Significance measures in general (Kilgarriff 2009) and G² in particular (Lijffijt et al. 2016) have been criticised regarding the interpretability of significance and the failure to reflect dispersion, respectively. Therefore, G² has been deemed inappropriate on its own for discourse studies (Gabrielatos 2018).
Effect size, on the other hand, measures the difference between association strengths regardless of noise due to data size. However, measures like LR are prone to low-frequency bias: association strength tends to be higher for uncommon words because small co-occurrence counts become weightier.
Thus, G² and LR "measure different aspects of a frequency difference" rather than being alternatives from a mathematical point of view (Gabrielatos 2018, p. 230).
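The contrast between the two measures can be illustrated with a minimal sketch. It assumes the two-term log-likelihood formulation commonly used for keyword comparison and a Log Ratio in which zero frequencies are replaced by 0.5. Both choices, the function names and the toy counts are illustrative assumptions and need not match the CQPweb implementation.

```python
import math


def log_likelihood(f_target, n_target, f_ref, n_ref):
    """G² for one word, comparing observed and expected frequencies.

    Assumption: two-term formulation (target and reference frequency only).
    """
    total = n_target + n_ref
    combined = f_target + f_ref
    expected_target = n_target * combined / total
    expected_ref = n_ref * combined / total
    g2 = 0.0
    for observed, expected in ((f_target, expected_target),
                               (f_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def log_ratio(f_target, n_target, f_ref, n_ref, zero_adjust=0.5):
    """Binary log of the ratio of relative frequencies.

    Assumption: zero frequencies are replaced by `zero_adjust`.
    """
    f_t = f_target if f_target > 0 else zero_adjust
    f_r = f_ref if f_ref > 0 else zero_adjust
    return math.log2((f_t / n_target) / (f_r / n_ref))


# Toy counts (hypothetical): a rare and a frequent word in two corpora
# of one million tokens each.
rare = (8, 1_000_000, 1, 1_000_000)
frequent = (4000, 1_000_000, 2000, 1_000_000)

for name, counts in (("rare", rare), ("frequent", frequent)):
    print(name, round(log_likelihood(*counts), 2), round(log_ratio(*counts), 2))
```

With these toy counts, LR ranks the rare word higher, whereas G² ranks the frequent word higher, which mirrors the low-frequency bias of effect size measures described above.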
LRC takes significance into account by reducing the keyness of any non-significant word to 0. However, significance is not purely a filter: it also influences the ranking of significant keywords by assigning greater weight to words when their (effect-size) keyness is supported by higher corpus frequencies (= more evidence). In this way, LRC acts as a middle ground between effect size and significance measures.
The calculations were originally carried out for an evaluation of the measures themselves relating to the general use-case of corpus-based discourse analysis, including argumentation, but also stance and metaphors (Evert et al. 2018; Peters and Dykes 2018). Keywords were assigned to predefined categories based on Peters' (2017) manual analysis of 343 MDRO newspaper articles, which, compared to the present corpus, was subject to more restrictions (high-circulation newspapers only; exclusion of agricultural topics and of articles about countries other than Germany). Keywords were annotated independently by both authors, followed by a discussion to reach consensus on doubtful cases. New categories could also be proposed.
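The behaviour described above can be approximated in a few lines. The sketch below is not the procedure of Evert et al. (2018): as a stand-in, it uses the standard normal approximation for the confidence interval of a log-transformed ratio of proportions, without any correction for multiple testing. The function name, the 0.5 adjustment for zero counts and the toy counts are likewise assumptions.

```python
import math
from statistics import NormalDist


def lrc_sketch(f_target, n_target, f_ref, n_ref, confidence=0.99):
    """Conservative Log Ratio: the confidence bound closest to zero.

    Assumption: normal approximation on the log scale, not the exact
    interval used in the original LRC implementation.
    """
    f_t = f_target if f_target > 0 else 0.5  # assumed zero adjustment
    f_r = f_ref if f_ref > 0 else 0.5
    lr = math.log2((f_t / n_target) / (f_r / n_ref))
    # Standard error of the natural-log ratio of two proportions.
    se_ln = math.sqrt(1 / f_t - 1 / n_target + 1 / f_r - 1 / n_ref)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * se_ln / math.log(2)  # convert to binary-log scale
    lower, upper = lr - half_width, lr + half_width
    if lower > 0:
        return lower   # significantly over-represented in the target
    if upper < 0:
        return upper   # significantly under-represented in the target
    return 0.0         # interval includes 0: not significant


# Toy counts (hypothetical), one million tokens per corpus.
rare_score = lrc_sketch(8, 1_000_000, 1, 1_000_000)
frequent_score = lrc_sketch(4000, 1_000_000, 2000, 1_000_000)
```

In this sketch, the rare word with a high raw Log Ratio is set to 0 because its interval includes 0, while the frequent word keeps most of its keyness because the interval around its Log Ratio is narrow.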
An overview of the top 20 keywords from each measure and their respective categories is shown in Table 1 (see Appendix of Tables below).
Independently of their statistical viability, we expect each measure to access a different frequency range: G² will likely generate words that are common in both the target and the reference corpus. We expect LR to yield words uncommon in the articles and in general language. Incorporating both significance and effect size, LRC should act as a middle ground. We aim to exploit these biases to extract both highly topic-specific and more generally widespread keywords. We annotated the top 200 keywords from measures (1)-(3) to explore various semantic granularities.

Results
To evaluate argument reconstruction, all keywords were grouped by argumentation patterns. As we categorised arguments based on conceptual rather than lexical content, we expect logically equivalent arguments to occur on various lexical frequency levels and thus to be represented by keywords from all measures. We therefore present the results aggregated across association measures.

Argumentation patterns
Argumentation patterns were divided into three broad categories (general reference frames, causes for MDRO, solutions to MDRO). Each category was annotated on the level of smaller sub-parts. Figure 1 illustrates our annotation scheme for argumentation patterns. Figure 2 shows the coverage of pre-determined categories by measure where at least 5 keywords were found.
Below, we present the individual argumentation patterns, providing information on keyword precision based on a sample of 150 concordances per sub-category. While this reflects a small proportion of some of the patterns (more frequent ones like treatment errors had approx. 3000 hits), this number was chosen for manageability and the assumption that the total resulting sample of 1,734 concordances should provide a sufficient overview of argumentation strategies.³ We provide counts on how often the concordances referenced the anticipated scheme, how frequently it occurred in a different argument or was not part of an argument at all. The latter two cases were distinguished because a reference to a different argumentation pattern can still prove fruitful if concordances and expanded contexts are considered.
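The three-way distinction can be summarised per sub-category as simple proportions. The following sketch shows one possible way to tally manually assigned labels; the label names, the function name and the toy sample are hypothetical and do not reproduce our actual annotation data.

```python
from collections import Counter

# Hypothetical label names for the three outcomes described above.
LABELS = ("expected_scheme", "other_argument", "no_argument")


def precision_summary(annotations):
    """Return the share of each label in a list of annotated concordances."""
    counts = Counter(annotations)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Toy sample of four annotated concordance lines (not real data).
sample = ["expected_scheme", "expected_scheme",
          "expected_scheme", "other_argument"]
summary = precision_summary(sample)
```

Under this reading, the share of expected_scheme corresponds to keyword precision in the strict sense, while the sum of the first two labels indicates how often a keyword occurred in an argumentative context at all.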

General reference frames
These schemes offer background information, setting a historical, biological, geographical or quantitative frame for the article content (see Table 1 in Appendix for an overview).

Evolution
Antibiotics strengthen the selective advantage of resistances developed through biological evolution. This must be avoided because antibiotics are common property.
This scheme is prevalent in the corpus without necessarily being used argumentatively per se. Its argumentative function becomes dominant when foregrounding the rapidity of MDRO spread, suggesting a lack of control by medical professionals. Additional argumentative strategies include highlighting the arbitrariness in mutations, framing nature as overpowering: unter dem ständigen Einfluss der Antibiotika überleben und vermehren sich genau die Bakterienstämme, die zunächst zufällig durch Mutation eine Resistenz entwickelt hatten. Sie sind zudem in der Lage, die Resistenz durch Gentransfer an andere Bakterienstämme zu übertragen. ('under the continuous influence of antibiotics, precisely those bacterial strains survive and multiply which had initially developed a resistance by random mutation. They are also able to pass the resistance on to other bacterial strains through gene transfer'; Zeit Online 1/05/2014)
Such statements are often accompanied by military and machine metaphors (abwehren 'fend off', steuern 'operate'): 21% of evolution keywords have at least one war metaphor marker within 20 words of context.⁴ War metaphors, represented directly by only one keyword, cluster around evolution keywords together with machine metaphors. While they are also generally frequent discourse features, their intertwinement is particularly notable within the evolution scheme: antibiotics may be described as building blocks and as fighting pathogens in the same sentence. Such associative rather than semantically coherent features suggest that the evolution scheme primarily appeals to an emotional level, despite its seemingly descriptive character.
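A proportion of this kind can be obtained from a simple window search. The sketch below assumes a tokenised, lower-cased text and a window of 20 tokens on either side of each keyword hit; whether the original count used a symmetric window is an assumption, as are the function name and the toy sentence.

```python
def share_with_marker_nearby(tokens, keywords, markers, window=20):
    """Share of keyword hits with at least one marker within `window` tokens.

    Assumption: the window extends `window` tokens to the left and to the
    right of each hit, excluding the hit itself.
    """
    hits = 0
    hits_with_marker = 0
    for i, token in enumerate(tokens):
        if token not in keywords:
            continue
        hits += 1
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if any(t in markers for t in context):
            hits_with_marker += 1
    return hits_with_marker / hits if hits else 0.0


# Toy sentence (hypothetical, not from the corpus).
tokens = "bakterien entwickeln resistenz und wir müssen sie abwehren".split()
wide = share_with_marker_nearby(tokens, {"resistenz"}, {"abwehren"}, window=20)
narrow = share_with_marker_nearby(tokens, {"resistenz"}, {"abwehren"}, window=2)
```

The window size directly affects the result: in the toy sentence, the marker falls inside the 20-token window but outside a 2-token window.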

Spread
The worldwide MDRO spread and increasing infection rates threaten global and national healthcare systems, especially in developing and emerging countries.
This pattern can be realised by mentioning infection rates or MDRO spread over geographical boundaries. It emphasises their dangerousness by highlighting quantitative data: authoritative status is claimed by accentuating measurability.
However, the numbers do not always refer to MDRO infections: some articles instead provide the total number of clinical infections. This may be somewhat misleading, as this much higher value can be expected to be quite suggestive.
Related to the example of the argument from expert opinion in section 2, numeric infection rates are frequently accompanied by quotes from national and international experts, additionally reinforcing the weight of the argument (approximately 10% of hits within 20 words of context). While we did not annotate for expert opinion schemes, this shows that arguments of different schemes form inter-relations.

Medical history
Uncritical prescription of antibiotics leads to the loss of prior medical achievements, which mark milestones of societal development. Thus, the use of antibiotics must be reduced.
This scheme is the only one yielding exclusively true positives from the keywords (cf. Table 2 in the Appendix of Tables). However, it is represented by a small number of hits. The perfect precision is unsurprising: the keywords are the proper names Semmelweis and Lister and Kindbettfieber ('childbed fever'), a disease central to Semmelweis' pioneering work in clinical hygiene in the 19th century. Mentions of these terms will likely always be instances of the argument in question.
This argument portrays medical history in an extremely positive way, sharply contrasting this with criticism towards the present.

Country comparisons
National healthcare systems handle MDRO in different ways. Comparing their strategies reveals more and less favourable approaches to the issue.
In the sample, country comparisons fulfil three functions:
1) The situation in a given country is evaluated as better than in Germany. The primary example is the Netherlands, but other European countries may also be described as role models, mainly regarding hygiene policies. This argument can be intertwined with the spread scheme, contrasting infection rates to highlight the importance of rigid measures.
2) The situation is portrayed as equal to Germany. This version is comparatively rare. It occurs primarily in collectivising descriptions, when Europe and the USA are foregrounded as "Western" countries. While they are conceptually close to the reader, they also cover a large geographical area. The combination of perceived proximity and wideness of spread fosters the impression of threat.
3) The situation in a given country is evaluated as worse than in Germany. This variant includes Asian countries, particularly India and Pakistan, which are depicted very negatively because they are the origin countries of the enzyme NDM-1 found in some MDRO. The dichotomy of Eastern and Western emotionalises through highlighting the "foreignness" of MDRO.

Causal schemes
Causal patterns name reasons for MDRO transmission or for resistance development. They usually highlight a single reason, in or outside the hospital.
Table 3 in the Appendix of Tables presents the causal schemes that were reconstructed through keywords and their precision, as evaluated by a concordance sample. Causal schemes covered by keywords are presented below.

Agricultural causes
Animals are kept under poor conditions due to farms' profit orientation, which encourages factory farming. The use of antibiotics in farms leads to antibiotic resistances.
This pattern has diverse realisations and is often intertwined with other arguments. For instance, it occurs with comparisons of international resistance rates or antibiotics policies.
Keywords relating to agricultural causes range from general pointers like Fleisch ('meat') to more specific items like Hähnchenfleisch ('chicken meat') or Rohwurst ('raw sausage').
For this scheme, precision is very high, with 87% of the sample yielding true positives: agriculture is hardly mentioned in other contexts of the corpus.

Hygiene problems are fostered by understaffing, poorly trained staff or poor personnel structure. Inconsistent hygiene standards threaten patients, cause economic losses and contribute to resistance development.
This scheme is closely related to economic causes. It underlines dangers triggered by hospitals prioritising economic interests over patient safety. For example, understaffing is portrayed as increasing medical mistakes. Another version criticises the personnel structure: hospitals are faulted for not employing clinical hygienists and leaving the evaluation of hygiene measures to general practitioners for financial reasons.

MDRO development is accelerated by staff prescribing too many antibiotics and by patients inadequately requesting medicine.
Treatment errors are attributed to hasty antibiotic prescription in minor or virally transmitted diseases, reinforcing the selective advantage of MDRO.
The most frequent keywords here are verschreiben/verschrieben ('prescribe') and Viren ('viruses'). Passages relating to viruses mostly place the responsibility with the patients, who influence the treatment outcome by inadequately requesting antibiotics.
A different version centres on broad-spectrum antibiotics, which attack many bacteria, thus reinforcing MDRO's selective advantage. This argument does not advocate the reduction of antibiotics as such, but rather a more thorough assessment of patients' needs, similarly to the negligence scheme.

Staff's carelessness and negligence of hygiene measures can foster the development of MDRO.
The scheme of inadequate treatment is often directed towards general practitioners, who are mostly consulted with less severe conditions. The negligence argument usually refers to staff in hospitals, accused of omitting disinfection and cleaning measures or of poor personal hygiene. Such assumed violations of official duty can contain strongly evaluative language (Hygieneskandal 'hygiene scandal'; faul 'lazy'; verdreckt 'filthy').

Pharmaceutical companies follow the rules of the market. Their profit-orientation impairs the development of new antibiotics.
Pharmaceutical actors are portrayed largely negatively. They are assigned substantial responsibility, and they are criticised for being oriented towards economic viability. The articles suggest this attitude is inadequate in healthcare contexts, as in the scheme of hospitals' economic efficiency. This is reflected in compound nouns like Pharmalobby or Pharmariese ('pharmaceutical giant') or adjectives like gierig ('greedy').

Solution schemes
Solution schemes suggest strategies to reduce pathogen spread and resistance development. We differentiate between their levels of implementation: some solutions can be implemented within an individual hospital. Others apply to more general structural levels like federal or national health policies.
Table 4 (see Appendix of Tables below) presents the solution schemes with keywords and their precision. Below, we elaborate more closely on solution patterns.

Structural changes to the healthcare system reduce infection risks through limiting resistance development and spread by improving hospitals' economic situation or establishing more rigid hygiene measures through new legal frameworks.
Current structural approaches are described as insufficient. Political actors are thus attributed a responsibility for the general public, accompanied by the criticism that they do not execute it adequately. This scheme is combined with expert citations, where authorities demand higher legal control and stricter regulations.

Clinical approaches
Hygiene and quarantine measures in hospitals can improve patients' and staff's hygienic situation, improving treatment and reducing the spread of new infections.
Establishing and following rigid hygiene measures is a central desideratum according to this scheme. It can be embedded into long narrative structures, with detailed descriptions of hygiene specialists' professional routines. Hygiene experts are attributed rather uncommon metaphors: as implementers of hygiene guidelines, staff members are Jäger ('hunters') auf der Pirsch ('deerstalking'). While the hunting domain also occurs in other discourse contexts, outside of this particular argument pattern the roles of hunter and the hunted are usually reversed: hunting metaphors are mostly used to portray MDRO as intentional, strategic actors wilfully harming (hunting) patients.

Schemes not covered by keywords
As shown above, keywords can help to recover argumentation patterns. However, not all patterns from our gold standard were equally reflected in the keywords. In this section, we discuss deviations from the manual analysis.

Underrepresented schemes
Three causal patterns found in the manual analysis were assigned three or fewer keywords:

1) genetic engineering as a cause
MDRO develop through genetic engineering in agriculture. When crops and livestock become resistant against bacteria and micro-organisms, these resistances may be transferred to bacteria.
This pattern was extremely infrequent in the manual analysis. Thus, it is not surprising that it was not represented by separate keywords. Its similarity to the pattern of agricultural causes makes it plausible that when they do occur, references to genetic causes are covered by the same keywords.

2) economic causes
Hospitals emphasise short-term economic viability in their decisions. This encourages austerity measures like staff cuts, which indirectly leads to the further spread of MDRO.
This pattern is closely related to the working conditions scheme. The concordances show how economic and personnel-related causes occur within the same sentence:
(1) Es gab enorme Sparprogramme; man hat den Aufwand für Hygiene in vielen Spitälern ‹outgesourct›. (There were massive austerity measures; in many hospitals, the effort for hygiene was 'outsourced')
(2) "Das alles kostet aber viel Geld, und überall wird gespart", sagte der Mediziner. In den Kliniken gebe es immer weniger Pflegepersonal, im stressigen Klinikalltag bliebe deshalb oft kaum Zeit für die grundsätzlichen Dinge. ("But all of that is very expensive and everything is economised", the medic said. Hospitals had fewer and fewer nursing staff, so there was often hardly any time for fundamental tasks in the stressful clinical routine).

3) unethical actions by staff
Staff members act unethically to gain personal advantages. They knowingly accept harmful consequences for patients, including MDRO infections.
As with 2), this causal pattern is very infrequent in the manual analysis. It is only represented by the keyword Risikopatient ('high-risk patient'). The low frequency in the gold standard suggests that its lack of indicative keywords is connected to its general infrequency.

New schemes
During the annotation, two schemes were identified outside of the original categorisation scheme.

1) Description of symptoms (general reference frame)
MDRO infections lead to numerous unpleasant and dangerous symptoms. Therefore, they pose a threat not only on a macro-social level, but also to each individual.
The description of symptoms is strongly emotionalising due to its detailed elaboration on painful, disconcerting issues. It can be accompanied by very personal depictions of individual patients, describing their conditions, patient history and the severe damage to their personal and professional lives.

2) Solutions by alternative medicine
Traditional methods of medicine do not sufficiently control MDRO. Therefore, alternative methods and "natural" substances should be used in addition to or instead of antibiotics, as they are better suited to heal diseases.
This scheme seems to be infrequent in the articles themselves, but it often occurs in reader comments. It is mostly realised in conjunction with the description of causes for MDRO spread and development: according to the argument, no satisfactory solution can be achieved with antibiotics. For instance, it points to a lack of reflection in doctors prescribing too many antibiotics. Thus, patients are encouraged to intervene on their own with selenium or colloidal silver, which are assumed to have bactericidal effects.

Argumentation explored by keywords
As Tables 1-3 show (see Appendix of Tables below), the false positive rate is below 20% for most concordance samples in the various patterns.The general reference scheme relating to the reflection of mass media as an actor has more false positives than the other schemes; mainly due to the frequent occurrence of the word Artikel 'article'a word appearing in boilerplates encouraging readers to comment or share the text.While we manually removed many of these passages before the analysis, they were still sufficiently frequent to generate a small number of misleading keywords.
Nevertheless, our results suggest that the vast majority of keywords that were assigned to argumentative schemes by the annotators after only brief consideration of a few concordances were indeed embedded in argumentative structures.However, they frequently occur in other patterns than was predicted during the initial category assignment.
We examined category mismatches for systematic overlaps similar to issues that a purely manual analysis would face, for instance, due to cause-effect relationships in the opposition between causal schemes and their proposed solutions.
For keywords assigned to causal schemes of negligence or economic efficiency, the most frequent alternative scheme was the solution scheme on a hospital level. Similarly, the scheme of country comparisons overlapped with the scheme of geographical spread. This seems to confirm that some category overlaps are systematic and would be expected issues within any analysis.
In other cases, the mismatches were less systematic and sometimes even unpredictable, indicating that relying solely on keywords without considering larger amounts of context is insufficient. This is unsurprising because one of the central principles of argument schemes is that they are largely dissociated from lexical specifications and build instead on the underlying (defeasible) logic.
One scheme stands out with a markedly low true positive rate: the scheme of biological evolution. This is to be expected, because evolution is integral to the discourse topic, resulting in high lexical frequencies across the corpus.
Every concordance sample yielded a sufficiently large number of instances of the expected argumentation pattern, suggesting that keyword analysis is a meaningful entry point to argumentation.
Parts of the gold standard were not represented by their own keywords. However, they were found in the concordances of keywords assigned to other schemes. Thus, we assume that they could still be reconstructed. The precise categorisation of arguments will depend on interpretation: it has been shown here and in prior work that authentic arguments overlap and lack clear boundaries (Anthony and Kim 2015).

Keyword measures
As mentioned in section 3.2, the study included three association measures: G², the standard measure in corpus-based work, alongside two versions of Log Ratio (LR and LRC), the latter of which takes significance into account via a confidence interval.
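The difference between the three measures can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the G² function uses the simplified two-term form common in keyword analysis, the Log Ratio function uses a smoothing constant of 0.5 for zero frequencies, and the conservative variant shrinks the Log Ratio towards zero using a normal-approximation confidence interval for the relative risk. The published LRC may compute its interval differently, and all function names and parameters are hypothetical.

```python
import math

def log_likelihood(a, b, n1, n2):
    """Two-term G2: a = target frequency, b = reference frequency,
    n1/n2 = sizes of target and reference corpus."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected frequency in target
    e2 = n2 * (a + b) / (n1 + n2)  # expected frequency in reference
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # 0 * log(0) is treated as 0
            total += observed * math.log(observed / expected)
    return 2 * total

def _smooth(freq):
    # Assumption: replace a zero frequency by 0.5 to avoid division by zero.
    return freq if freq > 0 else 0.5

def log_ratio(a, b, n1, n2):
    """Binary log of the ratio of relative frequencies (effect size only)."""
    return math.log2((_smooth(a) / n1) / (_smooth(b) / n2))

def conservative_log_ratio(a, b, n1, n2, z=1.96):
    """Log Ratio shrunk to the confidence bound nearest zero.
    Assumption: normal approximation on the log relative risk."""
    a_s, b_s = _smooth(a), _smooth(b)
    log_rr = math.log((a_s / n1) / (b_s / n2))
    se = math.sqrt(1 / a_s - 1 / n1 + 1 / b_s - 1 / n2)
    lower, upper = log_rr - z * se, log_rr + z * se
    if lower > 0:  # reliably over-represented in the target corpus
        return lower / math.log(2)
    if upper < 0:  # reliably under-represented in the target corpus
        return upper / math.log(2)
    return 0.0     # interval includes zero: no reliable difference
```

With equal corpus sizes, a word occurring 3 times against 1 time receives a higher plain Log Ratio than a word occurring 200 times against 100, yet only the latter obtains a sizeable G² and a positive conservative score. This mirrors the frequency biases that the measures are chosen for here.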
Of 455 total keywords from the top 200 candidates for each calculation, 167 keywords were annotated as indicating an argumentation pattern. Only 14% of these argument keywords are found in the top candidates for all measures. Figure 3 and Table 5 (see Appendix of Tables) show the items yielded by the various (combinations of) measures.
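Grouping keywords by the combination of measures that yields them amounts to a simple set operation. The following Python sketch shows one way to do this; the measure labels and the placeholder words w1 to w5 are invented for illustration and do not reproduce the study's keyword lists.

```python
from collections import defaultdict

def group_by_measures(top_lists):
    """Map each combination of measures to the words found by exactly
    that combination. `top_lists` maps a measure name to its set of
    top-ranked keywords."""
    groups = defaultdict(set)
    all_words = set().union(*top_lists.values())
    for word in all_words:
        found_by = frozenset(
            measure for measure, words in top_lists.items() if word in words
        )
        groups[found_by].add(word)
    return dict(groups)

# Hypothetical toy input, not the study's data.
top_lists = {
    "G2": {"w1", "w2", "w3"},
    "LR": {"w1", "w4"},
    "LRC": {"w1", "w2", "w5"},
}
groups = group_by_measures(top_lists)

# Share of keywords that appear in the top candidates of all measures.
shared = groups.get(frozenset(top_lists), set())
share_in_all = len(shared) / len(set().union(*top_lists.values()))
```

Using frozensets as dictionary keys keeps every combination of measures distinct, so exclusive keywords and shared keywords can be read off directly.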

Figure 3: Proportion of keywords found by specific subsets of measures
As expected from the measures' biases towards higher or lower frequencies in the reference corpora, the keywords cover spectra of varying specificity, which is also reflected in their discursive functions.
Words exclusive to LR indicate specific aspects of more abstract arguments. For an analyst familiar with the discourse, Rohwurst ('raw sausage') clearly relates to factory farming and food hygiene: raw sausage is highlighted as a particular risk in the MDRO context.
The LR keyword Guanabara-Bucht ('Guanabara Bay') references a specific outbreak which gained widespread media attention. During the Olympic Games in Brazil in 2016, Guanabara Bay was found to be contaminated with MDRO, which was seen as a risk of worldwide spread. While close familiarity with the topic is necessary to interpret such keywords, they provide precise cues for discourse strategies.
Keywords yielded by measures incorporating significance are generally less specific, as a higher frequency in the reference corpora is required. The semantics in the target corpus are usually more specific than in the reference data: jährlich ('yearly') often refers to the spread scheme, highlighting continually increasing infection rates.
Importantly, the different keyness measures are not meaningfully comparable in terms of their statistics. Our aim, however, is to exploit their biases to access different levels of general-language prevalence and to evaluate their qualitative usefulness in the case of argumentation in a topic-bound discourse.
Thus, we combined different measures to discover argumentation keywords of various specificity. Figure 4 visualises our suggestion as to how these measures relate to one another in the present study. Keywords generated by both G² and at least one LR version are represented in green. They might be argued to cover a prototypical middle ground in terms of balance between specificity and generality, being reasonably frequent outside the specialised corpus. This includes items like Antibiotikaeinsatz ('use of antibiotics'). Simultaneously, they are more specific than G²-exclusive keywords shown in orange. G² generates keywords frequent in both the target and the reference corpus (high-frequency bias; Paquot and Bestgen 2009; Lijffijt et al. 2016). LR favours words with low frequencies in both corpora, highlighting the differences in relative frequency regardless of significance. LRC tends to moderate this low-frequency preference: its keywords are frequent enough to be significant, but still rather specific.

Conclusion
From our understanding of argument patterns, defining failsafe criteria for their realisations is challenging, because these schemes are not solely defined by the repetition of lexical patterns. This issue has been encountered in previous applied studies based on context-independent argumentation schemes (Mochales and Ieven 2009; Song et al. 2014). However, while the process of identifying argument patterns is difficult to operationalise, it can be rendered plausible and transparent to a certain degree.
Using keywords as a starting point for context-bound argumentation schemes, major argumentative strategies were successfully recovered without relying on explicit argument markers like modals or particular pre-chosen verbs. Our qualitative gold standard from previous manual analysis helped us to test this approach by knowing what types of arguments to expect. The fact that most of the expected arguments were covered by several keywords is encouraging, making it plausible that major argumentative patterns can be uncovered in previously unexplored thematically bound corpora. At the same time, our results emphasise that not all concordances fit the expected pattern. Some expected patterns were not directly indicated by designated keywords, even though they were present in the corpus. Their realisations were found by examining the contexts of keywords assigned to other categories.
The rather low precision for some of our annotated samples suggests that it would be fruitful to incorporate more of the ideas from within the pragma-dialectic approach to identify lexico-grammatical patterns as increasingly concise pointers to particular arguments. One way to accomplish this is by using corpus queries, which can be done with the CQP query language (Evert and Hardie 2011). Specifically, the Corpus Workbench allows a user to combine words and all levels of annotation within a single search. Custom macros and wordlists can be stored to be reused in several queries. For instance, the following query is expected to find a subset of arguments relating to treatment errors with high precision: Elements marked with $ are wordlists, in this case keywords annotated as medical actors and as the argument of inadequate treatment, respectively. Matches of the query follow the type Ärzte verschreiben zu viele Antibiotika ('doctors prescribe too many antibiotics'). Specifying the occurrence of keywords within particular grammatical environments designated by phrases or POS seems to us a promising step for future research.
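The idea of restricting keyword matches to a grammatical environment can also be illustrated outside CQP. The following Python sketch is not a CQP query and does not reproduce the query used in this study. It assumes POS-tagged input with STTS-style tags, and the two wordlists, the window size and the function name are all hypothetical.

```python
# Hypothetical wordlists standing in for annotated keyword categories.
ACTORS = {"Ärzte", "Mediziner"}   # keywords annotated as medical actors
TREATMENT = {"Antibiotika"}       # keywords for inadequate treatment

def find_matches(tagged, window=4):
    """Find an actor noun followed, within `window` tokens, by a full
    verb and then a treatment keyword.
    `tagged` is a list of (word, pos) pairs."""
    hits = []
    for i, (word, pos) in enumerate(tagged):
        if word not in ACTORS or pos != "NN":
            continue
        span = tagged[i + 1 : i + 1 + window]
        # Position of the first full verb after the actor, if any.
        verb_idx = next(
            (j for j, (_, p) in enumerate(span) if p.startswith("VV")), None
        )
        if verb_idx is None:
            continue
        for k in range(verb_idx + 1, len(span)):
            if span[k][0] in TREATMENT:
                # Keep the matched stretch from actor to treatment keyword.
                hits.append(tuple(w for w, _ in tagged[i : i + 2 + k]))
                break
    return hits

# Assumed tagging of the example sentence type given in the text.
sentence = [
    ("Ärzte", "NN"), ("verschreiben", "VVFIN"), ("zu", "PTKA"),
    ("viele", "PIAT"), ("Antibiotika", "NN"),
]
matches = find_matches(sentence)
```

A dedicated corpus query engine remains preferable for real corpora, since it handles indexing, annotation layers and stored wordlists directly.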
While quantitatively prominent lexical items provide a broad argumentative structure, the particularities of realisation and less frequent patterns can be uncovered by concordancing. Thus, keywords function as a valuable entry point to studying argumentation in thematically specified corpora.
Combining different keyword measures enables access to different levels of lexical frequency. Due to the biases in each measure, the keywords markedly differ in specificity. High-frequency keywords found by G² provide an anchor to everyday discourse, uncovering stances associated with overall common words. LR keywords point more directly to concepts unique to the discourse. Thus, they enable the analyst to explore smaller sub-discourses and realisation variants of larger argument patterns.
By highlighting different frequencies, each measure contributes to the analysis of individual arguments, which share logical content but are realised through different linguistic means. The concept of context-bound schemes, on the other hand, has allowed us to combine diverse linguistic patterns according to their use in everyday argumentation.

Figure 1: Annotation scheme for argumentation patterns in MDRO media discourse

Table 1: Top 20 keywords by measure and their categories.

Table 2: General reference frames.
[1] Ignaz Semmelweis (1818-1865) is one of the central historical figures in the discourse on clinical hygiene. He was the first researcher to link childbed fever to issues in clinical hygiene and is therefore framed as a pioneer in the MDRO-related press coverage. His role might be somewhat comparable to Alexander Fleming in a more international context.

Table 5: Unique keywords per subset of measures.