Family Practice Vol. 19, No. 5, 466-468
© Oxford University Press 2002
Clinical findings in patients presenting with sore throat. A study on inter-observer reliability
Department of Primary Care, Rehabilitation and Preventive Medicine, University of Marburg,
a The Surgery, Oettingen and
b Department of Ear, Nose and Throat, University of Marburg, Germany.
Dr Norbert Donner-Banzhoff, MHSc, Senior Lecturer, Department of Primary Care, Rehabilitation and Preventive Medicine, University of Marburg, Blitzweg 16, D-35033 Marburg, Germany; E-mail: norbert{at}mailer.uni-marburg.de
Donner-Banzhoff N, Beck C, Meyer F, Werner JA and Baum E. Clinical findings in patients presenting with sore throat. A study on inter-observer reliability. Family Practice 2002; 19: 466–468.
Received 17 October 2001; Revised 2 April 2002; Accepted 13 May 2002.
| Abstract |
|---|
Background. Several clinical prediction scores have been developed to help practitioners assess the probability of streptococcal throat infection. Prior to this study, it was not known how reliably doctors assess the signs that contribute to these decision aids.
Objective. The aim of this study was to measure the inter-observer reliability of clinical findings related to sore throat.
Methods. Consecutive patients presenting with sore throat in five primary care practices in Germany took part (n = 126). Each patient was assessed independently by two doctors with regard to lymph nodes, pharynx, soft palate and tonsils.
Results. Agreement among practitioners was not satisfactory.
Conclusions. Results suggest that the performance of clinical scoring systems can be improved by training on how to elicit relevant clinical signs. Our findings cast some doubt on the effectiveness of under- and post-graduate training in this area.
Keywords. Observer variation, pharyngitis, physical examination, streptococcal infections, tonsillitis.
| Introduction |
|---|
Sore throat is one of the most frequent symptoms presented in primary care practice. To help practitioners make accurate predictions with regard to group A β-haemolysing streptococci, several clinical scores have been developed.1–6 Previous studies in this area explore the validity of historical information, clinical and laboratory findings. No single criterion has been sufficiently accurate, and therefore the combination of several items is usually suggested.
For a clinical score to be of use, its individual items must be measured and reported by different practitioners (observers) in an identical way.7 We therefore asked to what extent family practitioners would agree with regard to their eliciting and recording of clinical signs that are part of strep throat scores of proven validity.
| Methods |
|---|
The study was conducted in five primary care practices, in which at least two qualified doctors were routinely present. One city practice, two small town and two rural practices took part. Consecutive patients presenting with sore throat were included. Treating practitioners recorded their findings on anterior cervical lymph nodes (palpable/tender), the posterior pharyngeal wall (red/granulations), soft palate (red/blisters) and tonsils (absent/enlarged/exudate).
After one practitioner had recorded his/her findings, the other doctor performed an independent examination of the patient and recorded his/her findings on a separate form. The forms, which were anonymous with regard to patients and practitioners, were sealed in separate envelopes and stored in a designated box. Participating doctors were asked not to discuss cases before forms were filled in and envelopes sealed.
To estimate agreement between observers, we present κ-coefficients. However, we did not attempt to ascribe recordings to any particular doctor in order to preserve anonymity and thus encourage realistic recording. Furthermore, calculations are based on observation pairs from five practices combined. κ-coefficients should therefore be interpreted with caution. We also report the proportion of agreements for each criterion with 95% confidence intervals. To give an indication of the prevalence of positive findings, we also report the average proportion of patients positive for each particular sign.
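The three quantities named above (proportion of agreements with a 95% confidence interval, κ, and average prevalence of a positive finding) can be sketched for a single binary sign as follows. This is a minimal illustration, not the authors' analysis code: it assumes Cohen's κ for two raters and a Wilson score interval, neither of which the text specifies, and the function name `agreement_stats` and the example counts are hypothetical.

```python
import math


def agreement_stats(both_pos, first_only, second_only, both_neg, z=1.96):
    """Summarise one binary sign recorded by two observers on the same patients.

    The four arguments are the cells of the 2x2 table of paired recordings.
    """
    n = both_pos + first_only + second_only + both_neg
    if n == 0:
        raise ValueError("no observation pairs")

    # Observed proportion of pairs in which both doctors recorded the same finding.
    p_obs = (both_pos + both_neg) / n

    # Proportion of positive recordings for each observer (table margins).
    p_first = (both_pos + first_only) / n
    p_second = (both_pos + second_only) / n

    # Agreement expected by chance if the two recordings were independent.
    p_exp = p_first * p_second + (1 - p_first) * (1 - p_second)

    # Cohen's kappa (assumed variant); undefined when chance agreement is complete.
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else float("nan")

    # Wilson score interval for the proportion of agreements (assumed method).
    denom = 1 + z ** 2 / n
    centre = (p_obs + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_obs * (1 - p_obs) / n + z ** 2 / (4 * n ** 2)) / denom

    return {
        "n": n,
        "agreement": p_obs,
        "ci95": (centre - half, centre + half),
        "kappa": kappa,
        # Average proportion of patients recorded as positive by the two observers.
        "prevalence": (p_first + p_second) / 2,
    }


# Hypothetical counts for illustration only; these are not data from the study.
stats = agreement_stats(both_pos=30, first_only=20, second_only=16, both_neg=60)
print(stats)
```

Because recordings were pooled across practices and not attributed to individual doctors, "first" and "second" here only denote the order of examination within each pair, which is one reason the resulting κ should be read with caution.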
| Results |
|---|
Five practices provided data on 126 patients (14, 16, 26, 20 and 50, respectively). Six of 117 (5%) patients with information regarding age were <6 years old.
There was high agreement (>90%) among doctors with regard to blisters on the soft palate and tonsillar exudates. However, we found low concordance (<70%) for palpable anterior cervical lymph nodes, and reddening of the posterior pharyngeal wall and the soft palate. Tenderness of lymph nodes, lymphoid granulations and enlargement of tonsils ranked intermediate (70–80%) (Table 1). κ-coefficients were highest for tenderness of anterior lymph nodes and enlargement of tonsils. Overall agreement did not differ between practices (data not shown).
| Discussion |
|---|
Agreement between primary care practitioners on clinical signs in patients presenting with sore throat was low.
The inter-observer agreement shown for eight items has to be seen against the background of the prevalence of the respective positive findings. At first glance, concordance seems to be high (>90%) for blistering of the soft palate and for tonsillar exudates. However, these results apparently are inflated by a low prevalence of positive findings, which is reflected by comparatively low κ-coefficients. Redness of the posterior pharyngeal wall and the soft palate probably posed difficulties by the very nature of these signs. Although a clear and comprehensive definition was given on each report form, the distinction between normal mucosa, which is red, and abnormal redness was made differently by participating practitioners. Lymphoid granulations are a more recent concept, not universally taught in under- and post-graduate curricula. Accordingly, participating doctors learnt to use this sign only through their participation in the study.
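The effect of low prevalence on raw agreement, noted above for soft-palate blisters and tonsillar exudates, can be made concrete with a small worked example. The counts are invented for illustration and are not taken from Table 1; the formula is the standard two-rater Cohen's κ, which is assumed here because the text does not name the variant used. The symbols p_o (observed agreement) and p_e (chance-expected agreement) are likewise introduced only for this sketch.

```latex
% Hypothetical 2x2 table for a rare sign, n = 126 observation pairs:
% both positive = 1, first doctor only = 4, second doctor only = 4, both negative = 117
\begin{align*}
p_o &= \frac{1 + 117}{126} = \frac{118}{126} \approx 0.937 \\
p_e &= \left(\frac{5}{126}\right)^2 + \left(\frac{121}{126}\right)^2 = \frac{14666}{15876} \approx 0.924 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} = \frac{14868 - 14666}{15876 - 14666} = \frac{202}{1210} \approx 0.17
\end{align*}
```

In this made-up case the two doctors agree on about 94% of patients, yet almost all of that agreement is on the absence of a rare sign, so κ stays below 0.2.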
To guide decision making in patients with sore throat, i.e. to distinguish between those who have streptococcal infection and those who have not, palpable lymph nodes and tonsillar exudates presumably are the most useful signs.1 The results shown in Table 1 do not support the interpretation that there was higher agreement for these signs compared with those which must be regarded as less valid.
The low proportion of young children reflects the structure of primary care in Germany where children and their parents have unrestricted access to community paediatricians. The fact that in young children with pharyngitis and tonsillitis, a sore throat is seldom the presenting symptom may also play a role. Also the fact that encounters were part of a study setting probably resulted in an overestimation of agreement (the Hawthorne effect).
We wonder to what degree these findings can be generalized to other countries. Medical education in Germany is still based on lectures and occasional short bedside small group sessions. Students are rarely provided with feedback on their clinical findings. Doctors teaching at universities hold history taking and physical examination in low regard compared with laboratory testing and medical imaging. A similar study conducted in medical cultures with different teaching traditions might have produced higher agreement on physical findings. Apparently, vocational training in general practice does not sufficiently cover clear definitions of pathological findings or their reliable assessment.
The study also provides an explanation for the decrease in performance of clinical scoring systems in test settings. When a predictive instrument is developed, much effort usually is put into the training of assessors, resulting in high reliability. With subsequent application, be this in research or in routine care, this happens less and less. As a consequence of the results of our study, a training package on the eliciting of symptoms and signs should be part of any dissemination strategy for clinical scoring systems. Clinical signs on which observers frequently disagree, and whose reliability is therefore low, should not be part of routinely applied decision rules.
| References |
|---|
1 Centor RM. The diagnosis of strep throat in adults in the emergency room. Med Decis Making 1981; 1: 239–246.
2 Breese BB. A simple scorecard for the tentative diagnosis of streptococcal pharyngitis. Am J Dis Child 1977; 131: 514–517.
3 Dobbs F. A scoring system for predicting group A streptococcal throat infection. Br J Gen Pract 1996; 46: 461–464.
4 McIsaac WJ, Goel V, To T, Low DE. The validity of a sore throat score in family practice. Can Med Assoc J 2000; 163: 811–815.
5 Poses RM, Cebul RD, Collins M, Fager SS. The accuracy of experienced physicians' probability estimates for patients with sore throats. Implications for decision making. J Am Med Assoc 1985; 254: 925–929.
6 Seppala H, Lahtonen R, Ziegler T et al. Clinical scoring system in the evaluation of adult pharyngitis. Arch Otolaryngol Head Neck Surg 1993; 119: 288–291.
7 Wyatt JC, Altman DG. Commentary: prognostic models: clinically useful or quickly forgotten? Br Med J 1995; 311: 1539–1541.