OARSI recommendations for the management of hip and knee osteoarthritis, part I: critical appraisal of existing treatment guidelines and systematic review of current research evidence.

Authors: Zhang W, Moskowitz R, Nuki G, Abramson S, Altman R, Arden N, Bierma-Zeinstra S, Brandt K, Croft P, Doherty M, Dougados M, Hochberg M, Hunter DJ, Kwoh K, Lohmander LS, Tugwell P

Source: Osteoarthritis Cartilage. 2007 Sep;15(9):981-1000

DOI: 10.1016/j.joca.2007.06.014
Publication date: 2007 Sep
E-Publication date: Aug. 27, 2007
Availability: full text
Copyright: © 2007 Osteoarthritis Research Society International. Published by Elsevier Inc.

Language: English
Countries: Not specified
Location: Not specified
Correspondence address: Prof. George Nuki, Emeritus Professor of Rheumatology, University of Edinburgh, Osteoarticular Research Group, The Queen's Medical Research Institute, 47 Little France Crescent, Edinburgh EH16 4TJ, United Kingdom.
Email: g.nuki@ed.ac.uk

Article abstract

PURPOSE:

As a prelude to developing updated, evidence-based, international consensus recommendations for the management of hip and knee osteoarthritis (OA), the Osteoarthritis Research Society International (OARSI) Treatment Guidelines Committee undertook a critical appraisal of published guidelines and a systematic review (SR) of more recent evidence for relevant therapies.

METHODS:

Sixteen experts from four medical disciplines (primary care two, rheumatology 11, orthopaedics one and evidence-based medicine two), two continents and six countries (USA, UK, France, Netherlands, Sweden and Canada) formed the guidelines development team. Three additional experts were invited to take part in the critical appraisal of existing guidelines in languages other than English. MEDLINE, EMBASE, Science Citation Index, CINAHL, AMED, Cochrane Library, seven Guidelines Websites and Google were searched systematically to identify guidelines for the management of hip and/or knee OA. Guidelines which met the inclusion/exclusion criteria were assigned to four groups of four appraisers. The quality of the guidelines was assessed using the AGREE (Appraisal of Guidelines for Research and Evaluation) instrument and standardised percent scores (0-100%) for scope, stakeholder involvement, rigour, clarity, applicability and editorial independence, as well as overall quality, were calculated. Treatment modalities addressed and recommended by the guidelines were summarised. Agreement (%) was estimated and the best level of evidence to support each recommendation was extracted. Evidence for each treatment modality was updated from the date of the last SR in January 2002 to January 2006. The quality of evidence was evaluated using the Oxman and Guyatt, and Jadad scales for SRs and randomised controlled trials (RCTs), respectively. Where possible, effect size (ES), number needed to treat, relative risk (RR) or odds ratio and cost per quality-adjusted life year gained (QALY) were estimated.

RESULTS:

Twenty-three of 1462 guidelines or consensus statements retrieved from the literature search met the inclusion/exclusion criteria. Six were predominantly based on expert opinion, five were primarily evidence based and 12 were based on both. Overall quality scores were 28%, 41% and 51% for opinion-based, evidence-based and hybrid guidelines, respectively (P=0.001). Scores for aspects of quality varied from 18% for applicability to 67% for scope. Thirteen guidelines had been developed for specific care settings including five for primary care (e.g., Prodigy Guidance), three for rheumatology (e.g., European League against Rheumatism recommendations), three for physiotherapy (e.g., Dutch clinical practice guidelines for physical therapy) and two for orthopaedics (e.g., National Institutes of Health consensus guidelines), whereas 10 did not specify the target users (e.g., Ontario guidelines for optimal therapy). Whilst 14 guidelines did not separate hip and knee, eight were specific for knee but only one for hip. Fifty-one different treatment modalities were addressed by these guidelines, but only 20 were universally recommended. Evidence to support these modalities ranged from Ia (meta-analysis/SR of RCTs) to IV (expert opinion). The efficacy of some modalities of therapy was confirmed by the results of RCTs published between January 2002 and 2006. These included exercise (strengthening ES 0.32, 95% confidence interval (CI) 0.23, 0.42, aerobic ES 0.52, 95% CI 0.34, 0.70 and water-based ES 0.25, 95% CI 0.02, 0.47) and nonsteroidal anti-inflammatory drugs (NSAIDs) (ES 0.32, 95% CI 0.24, 0.39). Examples of other treatment modalities where recent trials failed to confirm efficacy included ultrasound (ES 0.06, 95% CI -0.39, 0.52), massage (ES 0.10, 95% CI -0.23, 0.43) and heat/ice therapy (ES 0.69, 95% CI -0.07, 1.45). The updated evidence on adverse effects also varied from treatment to treatment. 
For example, while the evidence for gastrointestinal (GI) toxicity of non-selective NSAIDs (RR=5.36, 95% CI 1.79, 16.10) and for increased risk of myocardial infarction associated with rofecoxib (RR=2.24, 95% CI 1.24, 4.02) were reinforced, evidence for other potential drug related adverse events such as GI toxicity with acetaminophen or myocardial infarction with celecoxib remained inconclusive.

CONCLUSION:

Twenty-three guidelines have been developed for the treatment of hip and/or knee OA, based on opinion alone, research evidence or both. Twenty of 51 modalities of therapy are universally recommended by these guidelines. Although this suggests that a core set of recommendations for treatment exists, critical appraisal shows that the overall quality of existing guidelines is sub-optimal, and consensus recommendations are not always supported by the best available evidence. Guidelines of optimal quality are most likely to be achieved by combining research evidence with expert consensus and by paying due attention to issues such as editorial independence, stakeholder involvement and applicability. This review of existing guidelines provides support for the development of new guidelines cognisant of the limitations in existing guidelines. Recommendations should be revised regularly following SR of new research evidence as this becomes available.

Article content

Introduction

Osteoarthritis (OA) is the most common form of arthritis and a major contributor to functional impairment and reduced independence in older adults¹. The hip and knee are the principal large joints affected by OA. Although estimates of the prevalence of hip and knee OA vary considerably depending on whether the disease is defined by both symptoms and radiographic changes, or by radiographic criteria alone, knee OA is more prevalent², ³, ⁴, ⁵, ⁶ than hip OA⁷, ⁸, ⁹, ¹⁰, ¹¹. Overall, as many as 40% of those aged over 65 in the community may have symptomatic OA of the knee or hip¹², ¹³. Current treatment strategies with both non-pharmacologic and pharmacologic therapies aim to reduce pain, physical disability and handicap, and some of them attempt to limit structural deterioration in affected joints. Surgical therapies are available for patients who fail to respond to more conservative measures¹⁴, ¹⁵. In recent years, both the American College of Rheumatology (ACR) and the European League against Rheumatism (EULAR) have developed recommendations to optimise the treatment of hip and/or knee OA based on a variable combination of expert consensus and systematic review (SR) of research evidence¹⁶, ¹⁷, ¹⁸. Although these guidelines are used by physicians, funding authorities and government agencies to try to improve the quality of care of patients with knee and hip OA, they have been criticised for lack of methodological rigour, stakeholder involvement and applicability¹⁹, ²⁰, ²¹, and the recommendations for certain modalities of treatment that they contain may require modification following publication of more recent randomised controlled trials (RCTs) and meta-analyses (MAs).
The Osteoarthritis Research Society International (OARSI) therefore appointed an international, multidisciplinary committee of experts in September 2005 with the remit of producing up-to-date, evidence-based, globally relevant consensus recommendations for the management of hip and/or knee OA in 2007. The committee undertook a critical appraisal of existing evidence-based and consensus guidelines and an SR of the current research evidence as a prelude to developing consensus recommendations following a Delphi exercise. This paper reports the results of the critical appraisal of existing treatment guidelines and the SR of the more recent research evidence. The purpose of this study was to identify the evidence available, assess its quality and use this knowledge to develop a new guideline. Part II of this document, “The OARSI evidence-based consensus recommendations for the treatment of OA of the hip and knee”, will be published separately in Osteoarthritis and Cartilage.

Methods

Participants

The guideline development committee was composed of 16 experts from four medical disciplines (primary care two, rheumatology 11, orthopaedics one, and evidence-based medicine two) and six countries in Europe and North America (France, Netherlands, Sweden, UK, Canada and the USA). All members of this guideline development team participated in: (1) a critical appraisal of existing treatment guidelines; (2) a Delphi exercise to generate consensus recommendations; and (3) an exercise to grade the strength of recommendation for all modalities of therapy recommended. Three additional experts were invited to undertake critical appraisals of existing guidelines in languages other than English.

Critical appraisal of existing guidelines

Systematic literature search

A systematic literature search for existing guidelines for the management of hip and/or knee OA published in any language between 1945 and October 2005 was undertaken using MEDLINE (1966–), EMBASE (1980–), CINAHL (1980–), AMED (1985–) and the Science Citation Index (1945–). The search strategy consisted of two basic components: any term for guidelines (e.g., guidelines, recommendations, standards, algorithm or expert consensus) and any possible term for hip or knee OA in the databases (Appendix 1). In addition, Google (the first 100 hits) and seven Guideline Websites were searched, including the National Guideline Clearinghouse http://www.guidelines.gov/, Primary Care Clinical Practice Guidelines http://medicine.ucsf.edu/resources/guidelines/, the Scottish Intercollegiate Guidelines Network http://www.sign.ac.uk/, the Canadian Medical Association Infobase for Clinical Practice Guidelines http://mdm.ca/cpgsnew/cpgs/index.asp, the Guidelines International Network http://www.g-i-n.net/, Evidence Based Medicine Guidelines http://www.ebm-guidelines.com/, and the National Institute for Clinical Excellence http://www.nice.org.uk/.

Inclusion/exclusion criteria

Guidelines developed for the management of hip and/or knee OA using consensus or evidence-based methods were included. The latest version was included if the guidelines had been updated. Guidelines developed for OA in other joints or for aspects of OA other than treatment were excluded, as were narrative reviews, commentaries and appraisals of implementation.

Quality and content assessment

English language guidelines were randomly assigned to three groups of four committee members for appraisal of quality and content. Three guidelines published in German and Dutch were appraised by three additional experts who were fluent in these languages. The quality of the guidelines was assessed using the AGREE instrument²², in which 23 criteria in seven domains are evaluated. These include the scope and purpose of the guidelines, stakeholder participation, methodological rigour, clarity, applicability, editorial independence and overall quality. The content was extracted using a comprehensive reference list of treatment modalities. Each appraiser scored the guidelines independently and results were collected and analysed by the lead investigator (WZ) and the co-chairs (GN and RM), who did not take part in the assessment.
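As an illustration of how standardised AGREE domain percentages of the kind used in this appraisal are commonly derived, the following Python sketch rescales the summed item ratings of all appraisers between the minimum and maximum totals possible for a domain. The 1 to 4 rating scale, the function name and the example ratings are assumptions made for illustration; they are not taken from this paper.

```python
def standardised_domain_score(ratings, scale_min=1, scale_max=4):
    """Return a domain score on a 0-100% scale.

    ratings: one list of item ratings per appraiser, all for the same domain.
    scale_min/scale_max: lowest and highest rating an item can receive
    (assumed here to be 1 and 4).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must rate the same number of items")

    obtained = sum(sum(r) for r in ratings)
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical ratings: four appraisers, three items in one domain.
example_ratings = [[4, 3, 3], [3, 3, 2], [4, 4, 3], [2, 3, 3]]
print(f"{standardised_domain_score(example_ratings):.1f}%")
```

Rescaling in this way makes domains with different numbers of items, or guidelines rated by different numbers of appraisers, comparable on the same percentage scale.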

Data analyses

The appraisers' scores from each group were expressed as standardised domain scores on a percentage scale (0–100%)²². Guidelines were categorised according to the methods (expert opinion based, research evidence based or both), the target users to whom they were directed (primary care, rheumatology, physiotherapy or orthopaedics), the scope of the recommendations (general and specific treatments) and the joints for which the guidelines were applicable (hip, knee, or hip and knee). Quality scores were compared between groups using an analysis of variance (ANOVA). Agreement (%) between guidelines was calculated by

$$\text{Agreement (\%)} = \frac{N_r}{N_a} \times 100$$

where N_r is the number of guidelines recommending the modality and N_a is the number of guidelines addressing the modality. Levels of evidence were examined and, for each modality, the best available evidence was selected according to the evidence hierarchy (Table I)²³.
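The agreement percentage defined here (N_r as a proportion of N_a) can be sketched in a few lines of Python. The function name is hypothetical; the example counts are the recommending/addressing pairs listed for three modalities in Table IV.

```python
def agreement_percent(n_recommending, n_addressing):
    """Percentage of guidelines addressing a modality that also recommend it."""
    if n_addressing <= 0:
        raise ValueError("at least one guideline must address the modality")
    if not 0 <= n_recommending <= n_addressing:
        raise ValueError("n_recommending must lie between 0 and n_addressing")
    return 100.0 * n_recommending / n_addressing


# (N_r, N_a) pairs as listed in Table IV.
counts = {"Weight loss": (13, 14), "Ultrasound": (1, 5), "Telephone": (2, 2)}
for modality, (n_r, n_a) in counts.items():
    print(f"{modality}: {agreement_percent(n_r, n_a):.1f}%")
# Weight loss: 92.9%
# Ultrasound: 20.0%
# Telephone: 100.0%
```

The telephone example shows why the percentage should be read alongside N_a: 100% agreement can rest on as few as two guidelines.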

Table I. Evidence hierarchy²³
Ia	MA of RCTs
Ib	RCT
IIa	Controlled study without randomisation
IIb	Quasi-experimental study
III	Non-experimental descriptive studies, such as comparative, correlation, and case–control studies
IV	Expert committee reports or opinion or clinical experience of respected authorities, or both

SR of recent evidence

Systematic literature search

A systematic search of the literature published between 31 January 2002 and 31 January 2006 was undertaken using MEDLINE, EMBASE, CINAHL, AMED, the Science Citation Index and the Cochrane Library databases. Research evidence prior to January 2002 was not sought systematically as this was available from the systematic literature review conducted by EULAR¹⁷. Separate searches for research evidence for each treatment modality were undertaken. Each search was conducted sequentially according to the evidence hierarchy (SRs/MAs, followed by RCTs/controlled trials (CTs), quasi-experimental and uncontrolled studies) (Table I)²³. An example of how this search strategy was employed to obtain the best available research evidence for the efficacy of acetaminophen (paracetamol) is shown in Appendix 2. The same strategy was used for searching MEDLINE, EMBASE, CINAHL and AMED. For the Science Citation Index, however, a key word search was used and all possible terms and combinations of terms were tried in order to obtain relevant citations. Medical subject heading (MeSH) searches were used for all databases and key word searches were used if a MeSH search was not available. All MeSH search terms were exploded. The reference lists of SRs were examined and any additional studies meeting the inclusion/exclusion criteria were included.

Table II. 23 existing guidelines for the management of hip and/or knee OA
	N	Guidelines
Type of guidelines
Opinion based	6	Royal College of Physicians, etc.
Evidence based	5	Prodigy Guidance, etc.
Both	12	EULAR, etc.

Topic
General	13	ACR, EULAR, etc.
Specific	10	MOVE, Canadian NSAIDs, etc.

Target joint(s)
Hip	1	EULAR
Knee	8	German, etc.
Both	14	ACR, etc.

Target users
Primary care	5	Prodigy Guidance, etc.
Rheumatology	3	EULAR, etc.
Physiotherapy	3	Dutch physiotherapy, etc.
Orthopaedics	2	NIH consensus, etc.
Not specified	10	Ontario, ICSI, etc.

Language
English	21	ACR, EULAR, etc.
Others	2	German, Malay, etc.

The search in the Cochrane Library included MeSH searches of Cochrane reviews, abstracts of Quality Assessed Systematic Reviews, the Cochrane Controlled Trial Register, the National Health Service (NHS) Economic Evaluation Databases, the Health Technology Assessment Database and the NHS Economic Evaluation Bibliography Details Only. In addition, a comprehensive search for all articles including the term OA regardless of treatment was undertaken.

Inclusion/exclusion criteria

Only studies with clinical outcomes for hip and/or knee OA were included. The main focus was on SRs/MAs, RCTs/CTs, uncontrolled trials, cohort studies, case–control studies, cross-sectional studies and economic evaluations. Studies of OA at other sites such as the hand or spine, and other chronic joint diseases were excluded, apart from studies in which adverse effects of relevant pharmacologic treatments were being investigated as a primary outcome. Case reports, animal studies, non-clinical outcome studies, narrative review articles, commentaries and guidelines were excluded.

The efficacy of any modality of treatment was determined by using the best available evidence. For example, when the efficacy of an intervention could be confirmed by category Ia evidence (MA/SR of RCTs), then studies lower in the evidence hierarchy such as individual RCTs (category Ib) were not reviewed (Table I). If there was more than one study in the same evidence level (e.g., four SRs for NSAIDs), the study with the best quality score was used. Information concerning side effects was obtained from both RCTs and observational studies. While the efficacy of each therapeutic intervention was assessed separately for hip and knee OA, side effects were evaluated for each intervention regardless of the OA therapy and the target joint. For determination of cost effectiveness, only cost-utility analyses were included.

Quality assessment

The quality of SR/MAs was assessed using the Oxman and Guyatt checklist²⁴ and the quality of RCTs was evaluated using the Jadad method²⁵. All quality scores were converted into percentages of the maximum score attainable. Quality assessments were not undertaken for other types of study designs, such as cohort or case–control studies. For cost-utility analysis, study perspective, comparator, time horizon, discounting, modelling and uncertainty were evaluated.

Outcome measures

Efficacy

Effect sizes (ESs) and 95% confidence intervals (CIs) compared with placebo or active control were calculated for continuous outcomes such as reduction of pain from baseline or improvement in function²⁶. ES is the standardised mean difference, i.e., the mean difference between a treatment and a control group divided by the standard deviation of the difference. It is expressed as a number without units and can be used for comparisons across all interventions. From the clinical standpoint, ESs of 0.2 are considered small and 0.5 moderate, while an ES>0.8 indicates a large clinical effect²⁷. Statistical pooling was undertaken, as appropriate, when SRs were not available²⁸. For dichotomous data, such as the percentage of patients with moderate to excellent (or more than 50%) pain relief or symptomatic improvement, the number needed to treat (NNT) was estimated²⁹. The NNT is the estimated number of patients who need to be treated to achieve the target effect. Thus, the smaller the NNT, the better the treatment effect. The 95% CI for the NNT was calculated using Altman's method³⁰.
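The two efficacy measures described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in this paper: the ES is computed with a pooled standard deviation and a common large-sample approximation for its standard error, and the NNT interval is obtained by inverting the limits of a normal-approximation CI for the difference in response rates, which is the general idea behind Altman's approach. Function names and all input numbers are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def standardised_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (ES, (lower, upper)) using a pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    es = (mean_t - mean_c) / pooled_sd
    # Large-sample approximation to the standard error of the ES.
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + es ** 2 / (2 * (n_t + n_c)))
    return es, (es - Z_95 * se, es + Z_95 * se)


def number_needed_to_treat(responders_t, n_t, responders_c, n_c):
    """Return (NNT, (lower, upper)); the CI is None if it cannot be inverted."""
    p_t = responders_t / n_t
    p_c = responders_c / n_c
    diff = p_t - p_c  # absolute difference in response rates
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    lower, upper = diff - Z_95 * se, diff + Z_95 * se
    if diff <= 0 or lower <= 0:
        # The CI for the difference includes zero, so the NNT interval is
        # not a single finite range; this sketch does not handle that case.
        return (1 / diff if diff > 0 else None), None
    return 1 / diff, (1 / upper, 1 / lower)


# Hypothetical trial: mean pain reduction 30 vs 20 (SD 20, n = 100 per arm).
es, es_ci = standardised_mean_difference(30, 20, 100, 20, 20, 100)
print(f"ES = {es:.2f} (95% CI {es_ci[0]:.2f}, {es_ci[1]:.2f})")

# Hypothetical trial: 50/100 responders on treatment vs 25/100 on control.
nnt, nnt_ci = number_needed_to_treat(50, 100, 25, 100)
print(f"NNT = {nnt:.0f} (95% CI {nnt_ci[0]:.1f}, {nnt_ci[1]:.1f})")
```

Note that the limits swap when inverted: the upper limit of the difference in response rates gives the lower limit of the NNT.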

Side effects

The relative risk (RR) of side effects was calculated from RCTs or cohort studies for the incident risk, and from cross-sectional studies for prevalent risk. Odds ratios (ORs) were calculated from case–control studies³¹. Both RR and OR provide information on how many times more likely (or less likely) it is that a subject who is exposed to a treatment modality will have an adverse event, when compared with a subject who is not exposed. An RR/OR=1 indicates no increased risk, whereas an RR/OR>1 or <1 indicates increased or decreased risk, respectively.
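For readers who want to see how such ratios are obtained from a two-by-two table, the sketch below computes an RR and an OR with 95% CIs using the standard log-scale normal approximation. This is a generic textbook calculation, not necessarily the procedure used for this review; the function names and the counts are hypothetical, and the sketch assumes no cell of the table is zero.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def _log_scale_ci(estimate, se_log):
    """95% CI for a ratio, built on the log scale and back-transformed."""
    log_est = math.log(estimate)
    return math.exp(log_est - Z_95 * se_log), math.exp(log_est + Z_95 * se_log)


def relative_risk(a, b, c, d):
    """a/b: exposed with/without the event; c/d: unexposed with/without."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, _log_scale_ci(rr, se_log)


def odds_ratio(a, b, c, d):
    """Same cell layout as relative_risk; suited to case-control data."""
    odds = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds, _log_scale_ci(odds, se_log)


# Hypothetical counts: 20/100 exposed and 10/100 unexposed have the event.
rr, rr_ci = relative_risk(20, 80, 10, 90)
odds, or_ci = odds_ratio(20, 80, 10, 90)
print(f"RR = {rr:.2f} (95% CI {rr_ci[0]:.2f}, {rr_ci[1]:.2f})")
print(f"OR = {odds:.2f} (95% CI {or_ci[0]:.2f}, {or_ci[1]:.2f})")
```

In this hypothetical example both intervals include 1, so the apparent doubling of risk would not be statistically significant at the 5% level.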

Cost effectiveness

Only cost-utility analyses were reviewed, in which cost per quality-adjusted life year (QALY) gained was used. Costs were converted into US dollars and values were discounted by 5% per year from the year in which the study was published until 2006.

Data were extracted by two investigators (WZ and a research assistant, Jane Robertson). A customised form was used for data extraction and quality assessment. Any discrepancies were discussed and agreed between the extractors prior to analysis. The data from the non-English language studies were extracted by assessors with good understanding of the languages concerned.

Results

Quality and contents of existing guidelines

The systematic literature search yielded 1462 citations (MEDLINE 276, EMBASE 413, CINAHL 81, AMED 27 and SCI 553, Google and Guidelines Websites 112). Of these, 23 met the inclusion and exclusion criteria specified¹⁶, ¹⁷, ¹⁸, ³², ³³, ³⁴, ³⁵, ³⁶, ³⁷, ³⁸, ³⁹, ⁴⁰, ⁴¹, ⁴², ⁴³, ⁴⁴, ⁴⁵, ⁴⁶, ⁴⁷, ⁴⁸, ⁴⁹, ⁵⁰, ⁵¹. Six guidelines were predominantly based on opinion, five primarily based on evidence and 12 based on both (Table II). Whilst the majority of the guidelines (14) did not separate hip and knee, eight were specific for knee but only one for hip OA. Thirteen guidelines had been developed for specific care settings (five for primary care, three for rheumatology, three for physiotherapy and two for orthopaedics), but 10 did not specify target users.

Scores for overall quality of guidelines were 28%, 41% and 51% for opinion-based, evidence-based and hybrid guidelines, respectively (P<0.001) (Fig. 1). Scores for different quality criteria varied but apart from applicability, opinion-based guidelines tended to have lower scores (Table III).

Fig. 1. Overall quality score of guidelines (mean±s.e.m.).

Table III. Quality scores (%), mean±s.e.m.
Domain	Opinion based	Evidence based	Hybrid	P
n	6	5	12
Scope	45.90±7.30	79.26±8.00	74.23±5.37	0.007
Stakeholder	17.36±6.12	30.56±8.62	37.27±4.42	0.058
Rigour	14.68±5.29	28.57±11.39	57.80±5.57	<0.001
Clarity	42.66±4.60	68.19±10.35	63.14±10.35	0.026
Applicability	15.48±6.47	12.78±4.78	21.53±2.14	0.313
Editorial	19.25±6.30	24.72±7.84	50.58±7.33	0.013
Overall	26.09±4.48	40.68±4.24	50.76±2.70	<0.001

s.e.m.: standard error of mean.

Fifty-one treatment modalities were addressed in the 23 guidelines. Twenty of these modalities were recommended by all (100%) of the guidelines in which they were addressed (Table IV), but the strength of agreement for any modality appeared to be related to the number of guidelines that addressed that modality. For example, while regular telephone contact and knee fusion were recommended in 100% of the guidelines in which these modalities of therapy were considered, this was actually in only two guidelines for each modality. By contrast, although weight loss was not universally recommended, it was in fact recommended in 13/14 of the guidelines where this modality was considered.

Table IV. Agreement and level of evidence for modalities of therapy recommended by existing guidelines∗
Agreement: number of guidelines recommending the modality/total number of guidelines addressing the modality
Level of evidence†	<25%	25%–	50%–	75%–	100%
Ia	Ultrasound (1/5)	Chondroitin sulphate (2/7)	Heat/ice (7/10); Glucosamine sulphate (6/10); NSAID+H2-blockers (5/8)	NSAIDs (15/16); Insole (12/13)‡; Braces (8/9)‡; Topical capsaicin (8/9)‡; IA HA (8/9)‡; IA steroid (11/13)‡; TENS (8/10); Topical NSAIDs (7/9)‡	Aerobic exercise (21/21); Strengthening exercise (21/21); Acetaminophen (16/16); Education (15/15); COX-2 inhibitors (11/11); Opioid (9/9); Self-management (8/8); Water-based exercise (8/8); NSAID+PPI (8/8); NSAID+misoprostol (8/8); Telephone (2/2)
Ib	Laser (1/6); Electrotherapy/EMG (1/8)	Nutrients (1/3)	Acupuncture (5/8); Massage (1/2); Diacerhein (1/2)	Weight loss (13/14); Patellar tape (12/13)‡; Avocado soybean unsaponifiables (3/4)	Combination therapy (12/12); Joint lavage (3/3)‡; Herbs (2/2)
III					TJR (14/14); Osteotomy (10/10)
IV	Oral steroid (0/2)			Arthroscopic debridement (5/6)‡	Cane/stick (11/11)‡; Referral (5/5); Knee fusion (2/2)‡; Knee aspiration (2/2)‡

TENS=Transcutaneous Electrical Nerve Stimulation; EMG=Electromyography; TJR=Total Joint Replacement.

∗Modalities were grouped according to strength of agreement and level of evidence. Modalities addressed by only one guideline were not included, such as radiotherapy, sauna/spa, gait aid, topical rubefacients, oestrogen, patellar resurfacing, and anti-depressants. Modalities not directly related to the treatment such as consideration of risk factors, clinical features, etc. were excluded.

†Level of evidence: Ia=SR of RCTs; Ib=RCT, IIa=CT; IIb=quasi-experiment; III=cohort/case–control study; and IV=expert opinion. Only the highest level of evidence has been selected for each modality.

‡Specific for knee OA.

Evidence to support recommendations ranged from Ia (SR of RCTs) to IV (expert opinion), and did not necessarily reflect the extent of agreement (Table IV). For example, while canes/sticks, total joint replacement and osteotomy were not supported by RCTs, they were still universally recommended in the guidelines which addressed them. In contrast, despite evidence from SRs of RCTs for the efficacy of chondroitin sulphate and ultrasound, they were recommended by <50% of the guidelines in which these modalities were considered (Table IV).

Recent evidence

The results of the SR of research papers published between January 2002 and January 2006 are shown in Table V.

Table V. Recent evidence (2002–) for efficacy of treatment of hip and knee OA
Modality	Joint	QoS (%)†	LoE∗	ES_pain (95% CI)	ES_function (95% CI)	ES_stiffness (95% CI)	NNT (95% CI)
General
Risk factors
Clinical phase
Combination therapy

Non-pharmacological
Self-management	Both	100	Ia	0.06 (0.02, 0.10)⁸⁹	0.06 (0.02, 0.10)⁸⁹
Telephone	Both	100	Ia	0.12 (0.00, 0.24)⁹⁰	0.07 (0.00, 0.15)⁹⁰
Education	Both	100	Ia	0.06 (0.02, 0.10)⁸⁹	0.06 (0.02, 0.10)⁸⁹
Strengthening	Knee	100	Ia	0.32 (0.23, 0.42)⁹¹	0.32 (0.23, 0.41)⁹¹
Aerobic	Knee	100	Ia	0.52 (0.34, 0.70)⁹¹	0.46 (0.25, 0.67)⁹¹
Water-based exercise	Both	60	Ib	0.25 (0.02, 0.47)⁶⁴^, ⁹²	0.23 (0.00, 0.45)⁶⁴	0.17 (−0.05, 0.39)⁶⁴
Balneotherapy	Knee	75	Ia				NS⁹³
Spa/sauna	Both	75	Ib	0.46 (0.17, 0.75)⁹⁴			NS
Weight reduction	Knee	40	Ib	0.13 (−0.12, 0.38)⁵²^, ⁹⁵	0.69 (0.24, 1.14)⁵²	0.36 (−0.08, 0.80)⁵²	3 (2, 9)⁵²
Nutrients (e.g., SAM-e)	Knee	100	Ia	0.22 (−0.25, 0.69)⁹⁶	0.31 (0.10, 0.52)⁹⁶
TENS	Both	75	Ia				2 (1, 5)⁹⁷
Laser	Both	100	Ia				4 (2, 17)⁹⁸
Ultrasound	Both	50	Ia	0.06 (−0.39, 0.52)⁹⁹
Radiotherapy	Both	50	IIb	Similar effects between OA and RA from an MA of uncontrolled trial¹⁰⁰
Heat/ice	Knee	75	Ia	0.69 (−0.07, 1.45)¹⁰¹	1.03 (0.44, 1.62)¹⁰¹ for quads strength; 1.13 (0.54, 1.73)¹⁰¹ for flexion	0.83(−0.03, 1.69)¹⁰¹ for swelling
Massage	Knee	40	Ib	0.10 (−0.23, 0.43)¹⁰²
Acupuncture	Knee	40	Ib	0.51 (0.23, 0.79)⁶³	0.51 (0.23, 0.79)⁶³	0.41 (0.13, 0.69)⁶³	4 (3, 9)⁶³
Insoles	Knee	100	Ia	No difference between types of insoles; no placebo/usual care comparisons¹⁰³
Cane/stick
Joint protection (braces)	Knee	100	Ia	More benefits with a knee brace than a neoprene sleeve¹⁰³
Electrotherapy/EMG	Knee	75		0.77 (0.36, 1.17)¹⁰⁴
Referral

Pharmacological
Acetaminophen	Both	100	Ia	0.21 (0.02, 0.41)¹⁰⁵			2 (1, 2)¹⁰⁶
NSAIDs	Both	100	Ia	0.32 (0.24, 0.39)¹⁰⁷
NSAIDs+PPIs	OA/RA	100	Ia
NSAIDs+H2 blockers	OA/RA	100	Ia
NSAIDs+misoprostol	OA/RA	100	Ia
COX-2 inhibitors	Both	100	Ia	0.44 (0.33, 0.55)¹⁰⁸ (exc Deek's for OA/RA)
Topical NSAIDs	Knee	100	Ia	0.41 (0.22, 0.59)⁵³	0.36 (0.24, 0.48)⁵³	0.49 (0.17, 0.80)⁵³	3 (2, 4)⁵³
Topical capsaicin	Knee	75	Ia				4 (3, 5)¹⁰⁹
Opioids	Both	50	Ia
Other narcotics
Oral steroid
IA Corticosteroid	Knee	100	Ia	0.72 (0.42, 1.02)¹¹⁰	0.06 (−0.17, 0.30)¹¹⁰		4 (2, 11)¹¹⁰
IA Hyaluronic acid	Knee	100	Ia	0.32 (0.17, 0.47)¹¹¹	0.00 (−0.23, 0.23)¹¹²
Glucosamine sulphate	Both	100	Ia	0.61 (0.28, 0.95)¹¹³	0.07 (−0.08, 0.21)¹¹³	0.06 (−0.11, 0.23)¹¹³	5 (4, 7)¹¹⁴
Chondroitin sulphate	Knee	100	Ia	0.52 (0.37, 0.67)¹¹⁴			5 (4, 7)¹¹⁴
Diacerhein	Both	–	Ib	0.22 (0.01, 0.42)⁸¹^, ⁸²^, ⁸³^, ⁸⁴^, ⁸⁵
ASU	Both	75	Ia	More beneficial for hip OA¹¹⁵
Herbal remedy	Both	75	Ia				7 (4, 27)¹¹⁶
Oestrogen
Bisphosphonates
Antidepressants

Surgical
Arthroscopic lavage	Knee	100	Ib	0.09 (−0.27, 0.44)⁵⁵	−0.10 (−0.45, 0.26)⁵⁵
Arthroscopic debridement	Knee	100		−0.01 (−0.37, 0.35)⁵⁵	−0.09 (−0.27, 0.45)⁵⁵
Patellar resurfacing	Knee	100	Ib				9 (5, 25)¹¹⁷
Osteotomy	Knee	50	IIb	60% Pain relief from an SR of uncontrolled trial⁵⁷
Joint distraction
TJR	Both	100	III	TJR is effective to improve QoL, more beneficial for hip OA from an SR of cohort studies⁵⁶
Knee aspiration
Knee fusion

ES=0.2 is considered small, ES=0.5 is moderate, and ES>0.8 is large; NNT for symptom relief, e.g., ≥50% pain relief, unless otherwise specified; SAM-e: S-adenosylmethionine; ASU: avocado soybean unsaponifiables.

∗LoE (level of evidence): Ia: MA of RCTs; Ib: RCT; IIa controlled study without randomisation; IIb: quasi-experimental study (e.g., uncontrolled trial, one arm dose–response trial, etc.); III: observational studies (e.g., case–control, cohort, cross-sectional studies); IV: expert opinion.

†QoS (quality of study) was assessed using validated scales, e.g., the Oxman and Guyatt scale for SRs and the Jadad scale for clinical trials. The percentage score was calculated for each study. The best available evidence was presented, i.e., SR with the highest quality, RCT with the highest quality followed by uncontrolled or quasi-experimental, cohort and case–control studies.

Efficacy

With the exception of combination therapy, the use of a cane/stick and referral, all the non-pharmacologic and pharmacologic therapies recommended universally by existing guidelines were supported by recent SRs of RCTs (Ia) or RCTs (Ib) published after 2002. By contrast, there were no placebo controlled trials of surgical modalities of treatment such as total joint replacement and osteotomy, and supporting evidence came from uncontrolled or non-experimental observational studies (Table V). Overall quality scores for evidence ranged between 40% and 100% but 24/40 studies (60%) scored 100% (Table V).

The ES for pain relief scores varied from small (e.g., education ES=0.06, 95% CI 0.02, 0.10) to moderate (e.g., aerobic exercise ES=0.52, 95% CI 0.34, 0.70). No modality of therapy had an ES as high as 0.80, the accepted criterion for a large clinical effect²⁷ (Fig. 2). ESs for pain relief score with oral analgesics such as acetaminophen (ES=0.21, 95% CI 0.02, 0.41) and NSAIDs (ES=0.32, 95% CI 0.24, 0.39) were small (Fig. 3 and Table V).

Fig. 2. ES for pain relief with non-pharmacological therapies.

Fig. 3. ES for pain relief with pharmacological therapies.

ESs for improvement in function were also generally small, and very similar to those for pain relief, for a number of modalities of non-pharmacological therapies (Table V). However, the ES for improvement in function for >10% weight reduction was 0.69 (95% CI 0.24, 1.14) compared with the ES for pain relief (0.13, 95% CI −0.12, 0.38). ESs for reduction in stiffness were also available for a few modalities of treatment (Table V).

Some studies provided data that allowed calculation of NNTs. For example, weight reduction (>10%) was associated with an NNT of three (95% CI 2, 9), i.e., one in three patients with knee OA who achieved this loss of weight would have more than 50% reduction in the total Western Ontario and McMaster Universities (WOMAC) Osteoarthritis index⁵². The NNT for topical NSAIDs was also three (95% CI 2, 4), indicating that one in three patients with pain associated with knee OA treated with a topical NSAID would be expected to experience moderate to excellent pain relief⁵³.

In general, non-pharmacologic therapies had numerically smaller ES (ES=0.25, 95% CI 0.16, 0.34) than pharmacological therapies (ES=0.39, 95% CI 0.31, 0.47) (Fig. 2, Fig. 3). Among surgical treatments, ES could only be calculated for arthroscopic lavage and debridement. An SR of four RCTs showed that arthroscopic joint lavage and debridement were no more effective than placebo⁵⁴. One placebo controlled RCT (with a quality score of 100%) included in this review demonstrated that the ES for arthroscopic lavage and debridement vs placebo were 0.09 (95% CI −0.27, 0.44) and −0.01 (95% CI −0.37, 0.35), respectively⁵⁵. Similar results were obtained for improvement in function (Table V). Although there are no placebo controlled RCTs of total joint (knee or hip) replacement or osteotomy, two recent SRs of uncontrolled trials and cohort studies confirmed that they were highly effective in relieving pain and improving quality of life⁵⁶^, ⁵⁷.

Side effects

Evidence for side effects of treatments has been mainly investigated in pharmacologic therapies. Oral NSAIDs were associated with 3–5 times the risk of gastrointestinal (GI) side effects when compared with placebo or non-exposure⁵⁸, whereas treatment with topical NSAIDs resulted in no more adverse GI events than placebo (RR=0.81, 95% CI 0.43, 1.56)⁵³ or non-exposure (OR=1.45, 95% CI 0.84, 2.50)⁵⁹ (Table VI). Whether or not long-term treatment with acetaminophen 4 g daily is associated with GI and renal side effects remains inconclusive (Table VI). Treatment with cyclooxygenase-2 (COX-2) selective drugs or conventional non-selective NSAIDs together with proton pump inhibitors (PPIs) or misoprostol has been shown to be associated with a reduction in the risk of NSAID-induced upper GI side effects. However, treatment with rofecoxib has been shown to be associated with an increased risk of cardiovascular (CV) events (RR=2.24, 95% CI 1.24, 4.02)⁶⁰ and treatment with misoprostol with an increased risk of diarrhoea (RR=1.81, 95% CI 1.52, 2.61)⁶¹. Following the withdrawal of rofecoxib, a number of RCTs and SRs of the CV safety of other coxibs and conventional non-selective NSAIDs have been undertaken. While the increased risk of CV side effects with rofecoxib was confirmed, the evidence for similar CV toxicity with celecoxib, valdecoxib and conventional non-selective NSAIDs was inconsistent (Table VI). However, the overall CV risk associated with COX-2 selective inhibitors was not significantly greater than that associated with conventional non-selective NSAIDs (RR=1.19, 95% CI 0.80, 1.75)⁶² (Table VI).

Table VI. Safety profiles – RR or OR∗ and 95% CI
Intervention†	Adverse events	RR/OR (95% CI)	Evidence (references)
Acupuncture	Any	0.76 (0.13, 4.42)	RCT⁶³
Acetaminophen	GI discomfort	0.80 (0.27, 2.37)	RCTs¹⁰⁵
	GI perforation/bleed	3.60 (2.60, 5.10)	CC¹¹⁸
	GI bleeding	1.2 (0.8, 1.7)	CCs¹¹⁹
	Renal failure	0.83 (0.50, 1.39)	CS¹²⁰
	Renal failure	2.5 (1.7, 3.6)	CC¹²¹
NSAIDs	GI perforation/ulcer/bleed	5.36 (1.79, 16.10)	RCTs⁵⁸
	GI perforation/ulcer/bleed	2.70 (2.10, 3.50)	CSs⁵⁸
	GI perforation/ulcer/bleed	3.00 (2.70, 3.70)	CCs⁵⁸
	Myocardial infarction	1.09 (1.02, 1.15)	CSs¹²²
Topical NSAIDs	GI events	0.81 (0.43, 1.56)	RCTs⁵³
Topical NSAIDs	GI bleed/perforation	1.45 (0.84, 2.50)	CC⁵⁹
H2 blocker+NSAID vs NSAID	Serious GI complications	0.33 (0.01, 8.14)	RCTs⁶²
	Symptomatic ulcers	1.46 (0.06, 35.53)	RCTs⁶²
	Serious CV or renal events	0.53 (0.08, 3.46)	RCTs⁶²
PPI+NSAID vs NSAID	Serious GI complications	0.46 (0.07, 2.92)	RCTs⁶²
	Symptomatic ulcers	0.09 (0.02, 0.47)	RCTs⁶²
	Serious CV or renal events	0.78 (0.10, 6.26)	RCTs⁶²
Misoprostol+NSAID vs NSAID	Serious GI complications	0.57 (0.36, 0.91)	RCTs⁶²
	Symptomatic ulcers	0.36 (0.20, 0.67)	RCTs⁶²
	Serious CV or renal events	1.78 (0.26, 12.07)	RCTs⁶²
	Diarrhoea	1.81 (1.52, 2.61)	RCTs⁶¹
COX-2 inhibitors
Coxibs vs NSAID	Serious GI complications	0.55 (0.38, 0.80)	RCTs⁶²
	Symptomatic ulcers	0.49 (0.38, 0.62)	RCTs⁶²
	Serious CV or renal events	1.19 (0.80, 1.75)	RCTs⁶²
Celecoxib	Myocardial infarction	2.26 (1.0, 5.1)	RCTs¹²³
Celecoxib	Myocardial infarction	0.97 (0.86, 1.08)	CSs/CCs¹²²
Rofecoxib	Myocardial infarction	2.24 (1.24, 4.02)	RCTs⁶⁰
Rofecoxib	Myocardial infarction	1.27 (1.12, 1.44)	CSs/CCs¹²²
Valdecoxib	CV events	2.3 (1.1, 4.7)	RCTs¹²⁴
Opioids	Any	1.4 (1.3, 1.6)	RCTs¹²⁵
Opioids	Constipation	3.6 (2.7, 4.7)	RCTs¹²⁵
Glucosamine sulphate	Any	0.97 (0.88, 1.08)	RCTs¹¹³
Diacerhein	Diarrhoea	3.98 (2.90, 5.47)	RCTs⁸¹,⁸⁵

H2-blockers: histamine type 2 receptor antagonists.

∗RR: Relative Risk; OR: Odds Ratio; CC: case–control study; CS: cohort study. Pooled RR/OR was provided if more than one study were included.

†Compared with placebo/non-exposure unless otherwise stated.
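
As a reading aid for Table VI, the sketch below shows one common way a relative risk and its 95% CI can be computed from event counts, and how an interval is checked against the no-effect value of 1. It assumes the log-scale (Katz) standard error; the source does not state which method the cited studies used. The function names and the event counts are hypothetical, while the two intervals at the end are copied from the table.

```python
import math

def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a 95% CI based on the log-scale standard error."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt - 1 / n_trt
                       + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

def excludes_no_effect(lower, upper):
    """True when the CI for a ratio measure does not contain 1."""
    return lower > 1 or upper < 1

# Hypothetical counts: 20/100 events on treatment vs 10/100 on control.
rr, lo, hi = relative_risk(20, 100, 10, 100)
print(round(rr, 2), excludes_no_effect(lo, hi))

# Intervals copied from Table VI.
topical_nsaid_gi = (0.43, 1.56)   # RR 0.81, RCTs
rofecoxib_mi = (1.24, 4.02)       # RR 2.24, RCTs
print(excludes_no_effect(*topical_nsaid_gi))
print(excludes_no_effect(*rofecoxib_mi))
```

An interval that spans 1, as for topical NSAIDs, is compatible with no difference from the comparator, whereas an interval entirely above 1, as for rofecoxib, indicates an increased risk.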

Cost effectiveness

Four cost-utility analyses have been undertaken since 2002: one in Germany, in which acupuncture was compared with sham acupuncture⁶³; two in the UK, which studied treatment with water-based exercises and GI protective strategies⁶⁴,⁶⁵; and one in Canada, which looked at treatment with intra-articular injections of hyaluronic acid⁶⁶. Two previous studies which had compared total hip and knee replacements with conventional pharmacologic and non-pharmacologic therapy were retrieved for comparison purposes⁶⁷,⁶⁸. Cost/QALY varied with modalities, countries, comparators, perspectives, time horizons and discounting rates and remained variable, even after adjustment for discounting and conversion of the original cost per QALY to the current value of the US dollar (Table VII).

Table VII. Cost per QALY
Intervention	Comparator	Perspective∗	Time horizon	Discounting	Year published	Country	Cost/QALY (original)	Cost/QALY (converted, $)†
Water-based exercise	Usual care	Societal	1 Year	No	2005	UK	£5738	10483⁶⁴
Acupuncture	Sham acupuncture	Societal	3 Months	No	2005	Germany	17845 €	22297⁶³
NSAID+PPI	NSAIDs	NHS	6 Months	No	2005	UK	£33889	61915⁶⁵
NSAID+misoprostol	NSAIDs	NHS	6 Months	No	2005	UK	£8889	16240⁶⁵
COX-2 specifics	NSAIDs	NHS	6 Months	No	2005	UK	£36923	74298⁶⁵
COX-2 selectives	NSAIDs	NHS	6 Months	No	2005	UK	£30000	60367⁶⁵
Intra-articular hyaluronic acid	Standard care	Societal	1 Year	No	2002	Canada	$10000	10453⁶⁶
Total hip replacement	Conventional therapy	Societal	Life	5%	1996	US	$4754	8131⁶⁷
Total knee replacement	Pre-operation	Institutional	2 Years	No	1997	US	$5856	10325⁶⁸

∗Perspective=perspective for economic evaluation (Societal=costs and benefits to whole society; NHS=costs and benefits to UK National Health Service; Institutional=costs and benefits to other payers, e.g., insurance company).

†The original Cost/QALY was converted into US$ with a discount rate of 5% pa from the date of the publication to the current value on 10 March 2006.
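
The conversion described in the footnote can be sketched as a currency conversion followed by compounding at 5% per year. This is only an illustration under stated assumptions: the footnote does not give the exchange rates or the exact compounding period, so `usd_per_unit` and `years` are hypothetical parameters. For the total hip replacement row, which is already in US dollars, 11 annual periods give a figure that rounds to the tabulated value; whether this is how the authors counted the period is not stated.

```python
def convert_cost_per_qaly(original_cost, usd_per_unit, annual_rate, years):
    """Convert a cost to US$ and compound it forward at a fixed annual rate.

    usd_per_unit: assumed exchange rate (1.0 for costs already in US$).
    years: assumed number of annual compounding periods.
    """
    return original_cost * usd_per_unit * (1 + annual_rate) ** years

# Total hip replacement row of Table VII: $4754 published in 1996.
thr_converted = convert_cost_per_qaly(4754, usd_per_unit=1.0,
                                      annual_rate=0.05, years=11)
print(round(thr_converted))
```

Rows reported in pounds, euros or Canadian dollars cannot be checked in the same way without the exchange rates used on the conversion date.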

Discussion

Clinical guidelines are frequently defined as ‘systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances’⁶⁹. OA is the most prevalent form of arthritis throughout the world¹,²,³,⁴,⁵,⁶,⁷ and OA-related knee pain is the leading cause of physical disability in older adults¹. The prevalence of both symptomatic and radiographically defined hip OA⁷,⁸,⁹,¹⁰,¹¹ is less than that of knee OA²,³,⁴,⁵,⁶ and varies from one country to another⁷,⁸,⁷⁰. The treatment of symptomatic OA of the knee and hip is a global problem that presents challenges to the clinical skills and judgement of health professionals everywhere. As there is no single treatment modality which will relieve pain, improve mobility and prevent structural progression of disease, effective management relies on the appropriate use of a number of available therapies, each of which has only limited efficacy. While a number of national and regional guidelines have been developed to assist physicians and other health professionals in their management of hip and/or knee OA¹⁶,¹⁷,¹⁸,³²,³³,³⁴,³⁵,³⁶,³⁷,³⁸,³⁹,⁴⁰,⁴¹,⁴²,⁴³,⁴⁴,⁴⁵,⁴⁶,⁴⁷,⁴⁸,⁴⁹,⁵⁰,⁵¹, there are currently no universally agreed recommendations, even for a core group of safe and effective therapies, that can be recommended for the treatment of OA of the knee and hip throughout the world. As a prelude to developing updated, evidence-based, international, expert consensus recommendations for the management of hip and knee OA, the OARSI Treatment Guidelines Committee undertook a critical appraisal of existing published guidelines and an SR of more recent evidence for relevant therapies.
The purpose of these preliminary appraisals was (1) to establish the extent to which different modalities of therapy are recommended in existing guidelines, and to explore the possibility that there may be a core set of recommendations common to all the guidelines; (2) to investigate the extent to which these guidelines are based on available research evidence; (3) to assess the quality of the guidelines using the widely accepted AGREE criteria; and (4) to examine the extent to which more recent research evidence confirms, or fails to confirm, recommendations in existing guidelines.

Treatment modalities recommended in existing guidelines, core recommendations and their evidence base

The critical appraisal of the 23 existing guidelines showed that of 51 treatment modalities addressed, 20 were universally recommended in those guidelines in which they were considered (100% agreement in Table IV). These included recommendations for non-pharmacological modalities of therapy such as education, exercise, patient contact by telephone and provision of walking aids, and pharmacological treatments such as acetaminophen, non-selective NSAIDs with co-prescription of gastroprotective agents or selective COX-2 inhibitors, opioids and some herbal remedies. Surgical treatments recommended in all the guidelines in which they were considered included knee aspiration and joint lavage as well as osteotomy, knee fusion and total joint replacements. Self-management and the combination of non-pharmacologic and pharmacologic treatments were also uniformly recommended. It is apparent that this core set of recommended therapies must reflect the availability of treatments. The less than universal recommendation for some modalities of therapy may have been a consequence of them not being universally available, e.g., topical NSAIDs and avocado soybean unsaponifiables are available in Europe but not in the USA. It is also important to consider the number of guidelines that considered any particular modality of therapy when interpreting the reliability of the strength of agreement for that treatment. Clearly, the confidence one can have in the universal recommendation for exercise, where this modality of treatment was considered and endorsed in 21/21 guidelines, is likely to be greater than the confidence one has in the recommendation for knee fusion, which was only considered and endorsed in 2/2 guidelines.

It was also apparent that some of the core set of universally recommended therapies were not supported by evidence from RCTs. For example, while exercise of various types was supported by SR of RCTs (level Ia), total joint replacement was only supported by uncontrolled or cohort studies (level III) and the recommendations for knee aspiration and knee fusion were based on expert opinion (level IV). The extent to which RCTs should be the gold standard for the recommendation of all treatments has been the subject of previous discussion and controversy⁷¹,⁷². Nevertheless, the level of research evidence and clinical effectiveness have been important considerations in the development of recent guidelines for the treatment of knee and hip OA¹⁷,¹⁸ and in the development of the OARSI recommendations. Clearly, guidelines based on recommendations for treatments for which there is proven evidence of benefit should at least have the potential for improving clinical outcomes and the quality of health care for patients, although success is certainly not guaranteed and evidence-based guidelines are only one option for improving the quality of health care.

A pilot survey of the perceived usefulness of the treatment modalities addressed by the existing guidelines was conducted among physicians and other health care professionals attending a New York University – OARSI Rheumatology Symposium in 2006. The purpose of the survey was to collect the users' opinions on the usefulness of current treatment guidelines. The usefulness of each recommended treatment modality was assessed by the participants using a 5-point categorical scale (not useful, slightly useful, moderately useful, very useful and absolutely essential). Votes (%) on “very useful or absolutely essential” were calculated. Of 19 participants who completed the questionnaire (four general physicians, eight rheumatologists, one physiotherapist, one orthopaedic surgeon, one pharmacist and four other health professionals), 94% perceived total joint replacement to be very useful or essential therapy for both knee OA and hip OA. Combination therapy was judged to be very useful or essential by 79% for knee OA and 72% for hip OA. Weight reduction was perceived to be more useful for knee than hip OA by 68%, whereas NSAIDs, NSAID plus PPIs, COX-2 inhibitors, self-management, education and exercise were considered useful for both hip and knee OA. Although this survey was far from being truly representative of all potential guideline users and only involved a very small number of participants, most of whom were from the United States, the views expressed about the usefulness of various modalities of treatment were at least consistent with the appraisal of existing guidelines that has led to the definition of a tentative core set of recommended treatment modalities. It also points to a possible way of assessing the potential applicability of any future recommendations for other modalities of therapy being considered as additions to this core set.

Quality of existing guidelines

The methodology involved in the development of treatment guidelines for OA has evolved considerably in the last decade. Between the publication of the first guidelines for the treatment of OA by the Royal College of Physicians in 1993⁴⁹ and the publication of the EULAR recommendations in 2005¹⁸, the paradigm has shifted from purely opinion-based guidelines⁴⁹ to entirely evidence-based guidelines such as the Prodigy Guidance³⁴ and subsequently to hybrid guidelines based on both research evidence and clinical expertise such as the EULAR recommendations¹⁷,¹⁸. However, no attempt had been made to assess the quality of these guidelines. We have therefore used the AGREE instrument to evaluate the quality of all existing guidelines for scope and purpose, stakeholder participation, methodological rigour, clarity, applicability, editorial independence and overall quality²². Overall quality was better in evidence-based than opinion-based guidelines, and significantly better still in the hybrid guidelines that combined research evidence with expert opinion (Fig. 1). This is mainly attributable to the improved scores for scope and purpose (P=0.007), rigour of development (P<0.001) and editorial independence (P=0.013) in the hybrid guidelines (Table III). There is a tendency for evidence-based guidelines to have lower applicability, although the differences are not statistically significant (Table III). This may, in part, reflect the gap that exists between RCTs which demonstrate that an intervention works (“efficacy”) and how often and well the intervention works in clinical practice (“clinical effectiveness”). Hybrid guidelines can be expected to demonstrate improved applicability as clinical expertise can temper the rigidity of research data and close the gap between research and clinical practice.

In the development of hybrid guidelines by the EULAR OA Task Force, expert consensus on the most important propositions was followed by a systematic search for published supporting research evidence, prior to assigning a strength and confidence of recommendation for each treatment proposition. These were based on combined consideration of the research evidence and clinical expertise after also considering risks and benefits, including potential adverse effects and the cost of each treatment modality¹⁸. This method is clinically driven and evidence supported. The sequence of steps has been modified slightly for the development of the OARSI Treatment Guidelines. An initial SR of research evidence was followed by the development of expert consensus based on a combined consideration of the research evidence and the clinical expertise of the members of the committee. This was then followed by assignment of strength and confidence of recommendation for each proposition as before. This current method is evidence-driven and clinically supported. Another important difference in the methodology used in the development of the OARSI recommendations has been that the committee has not arbitrarily restricted the number of treatment options that it would consider, as was the case in the development of the EULAR guidelines¹⁷,¹⁸.

Limitations

There are a number of limitations to this study.

Firstly, it was inevitably necessary to set fixed timelines for the literature search, i.e., from January 2002 to January 2006. Evidence before this time was obtained from the EULAR SR. For technical reasons it has not been possible, to date, to pool the data, so that the SRs of the relevant scientific literature before January 2002 and from January 2002 to January 2006 remain as two separate data sets. Evidence that has been published after January 2006 has yet to be systematically reviewed. A number of new studies have been published after 31 January 2006; examples are those for glucosamine, chondroitin, diacerhein and self-management⁷³,⁷⁴,⁷⁵,⁷⁶,⁷⁷. It has not been possible to update the SR following the Delphi exercise, which is described in detail in the second part of this report. The methods used to develop the guideline involved undertaking an SR of the research evidence to inform and assist in the development of the expert consensus. Any new evidence or proposals for changes in the consensus recommendations after completion of the Delphi exercise should properly be considered in the context of the full evidence and propositions. This would have required another systematic literature search for all evidence and a further Delphi exercise, which would not have been feasible within the timeframe. Sensitivity analysis⁷⁸ was therefore undertaken to examine whether these recently published studies would alter any of the evidence-based conclusions (Table VIII). For example, the results of two further RCTs, one of glucosamine hydrochloride (the National Institutes of Health Glucosamine/Chondroitin Arthritis Intervention Trial, GAIT) and one of glucosamine sulphate (the GUIDE trial), have recently been published⁷⁴,⁷⁵. The addition of the data from these two studies to the main body of trial outcomes did not alter ESs for glucosamine sulphate or hydrochloride significantly.
Treatment with glucosamine sulphate remained superior to placebo, whereas treatment with glucosamine hydrochloride remained no better than placebo. However, following the addition of the new data on chondroitin sulphate from the GAIT study to the results of the earlier RCTs, treatment with chondroitin sulphate was no longer superior to placebo⁷⁴,⁷⁶ (Table VIII). In addition, a number of studies reported in 2007 have not been included; two examples are trials of chondroitin sulphate and of weight reduction, which were published after the analyses and discussion for this manuscript were completed⁷⁹,⁸⁰. Treatment with diacerhein was the subject of a recent Cochrane SR⁷⁷. The calculations of ES and RR were similar to those found in this study (Table VIII). No attempt has been made to pool the data as the majority of trials included in the Cochrane review are already included in our main analysis⁸¹,⁸²,⁸³,⁸⁴,⁸⁵. A new RCT of self-management (class training package plus educational booklets) vs educational booklets alone did not show any difference for the WOMAC pain scores between groups⁷³. Unfortunately, numerical data were not available and a sensitivity test could not be conducted.

Table VIII. Sensitivity analyses
Modality	Outcome measure(s)	Point estimate (95% CI): data 2002–2006	Point estimate (95% CI): data 2006–	Point estimate (95% CI): pooled
Glucosamine sulphate	ES_pain	0.68 (0.32, 1.04)	0.26 (−0.01, 0.54)⁷⁵	0.45 (0.04, 0.86)
Glucosamine hydrochloride	ES_pain	0.13 (−0.27, 0.53)	−0.03 (−0.18, 0.13)⁷⁴	−0.01 (−0.15, 0.14)
Chondroitin sulphate	ES_pain	0.52 (0.37, 0.67)	−0.02 (−0.18, 0.14)⁷⁴	0.30 (−0.10, 0.70)
Chondroitin sulphate	ES_pain	0.52 (0.37, 0.67)	0.42 (0.04, 0.79)⁷⁶	0.30 (−0.10, 0.70)
Diacerhein	ES_pain	0.22 (0.01, 0.42)	0.22 (0.01, 0.42)⁷⁷	NA
Diacerhein	RR_diarrhoea	3.98 (2.90, 5.47)	3.81 (2.54, 5.71)⁷⁷	NA
Self-management	ES_pain	0.06 (0.02, 0.10)	No difference for WOMAC pain⁷³

NA: not applicable as the new study is an updated SR.
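
The kind of pooling summarised in Table VIII can be illustrated with a short random-effects sketch. The pooling method is not described in this section, so the DerSimonian-Laird estimator used here is an assumption, as is the simplification of treating the "data 2002–2006" column as a single estimate whose standard error is recovered from its 95% CI. All function names are hypothetical. With the two glucosamine rows as inputs, the sketch gives values that round to the pooled column, but that agreement should not be read as confirmation of the authors' procedure.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Recover a standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * z)

def pool_random_effects(estimates, z=1.96):
    """DerSimonian-Laird pooling of (point, lower, upper) estimates."""
    points = [p for p, _, _ in estimates]
    variances = [se_from_ci(lo, hi, z) ** 2 for _, lo, hi in estimates]
    weights = [1 / v for v in variances]
    total = sum(weights)
    # Fixed-effect estimate, needed for the heterogeneity statistic Q.
    fixed = sum(w * p for w, p in zip(weights, points)) / total
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, points))
    c = total - sum(w ** 2 for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(points) - 1)) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, points)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, pooled - z * se, pooled + z * se

# Inputs copied from Table VIII (ES for pain, 2002-2006 and 2006- columns).
sulphate = pool_random_effects([(0.68, 0.32, 1.04), (0.26, -0.01, 0.54)])
hydrochloride = pool_random_effects([(0.13, -0.27, 0.53), (-0.03, -0.18, 0.13)])
print([round(x, 2) for x in sulphate])
print([round(x, 2) for x in hydrochloride])
```

When the estimates disagree more than chance would predict, as for glucosamine sulphate, the between-study variance widens the pooled interval; when they agree, as for glucosamine hydrochloride, the result reduces to the fixed-effect estimate.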

As it is of course almost certain that additional studies, which may be relevant to the analyses and conclusions contained in this report, will be published in due course, we plan to review accumulating evidence annually, and to formally update the guidelines as required within 3–5 years.

Secondly, research evidence can be prone to publication bias. Although we searched the Cochrane Library, unpublished/unregistered trials cannot be comprehensively assessed. We would therefore encourage investigators to register any trials that are being undertaken or planned.

Thirdly, caution must be taken when looking for cross-treatment comparisons unless the evidence has been obtained from a direct comparison. Most of the evidence summarised in this report is from placebo-controlled studies. Placebo effects may vary across trials and indirect comparison can be misleading⁸⁶. In addition, there are numerous differences between trials, such as differences in study period, severity of disease, age, gender and co-morbidities. For example, it is not appropriate to make a direct comparison of ESs between electrotherapy (ES=0.77, 95% CI 0.36, 1.17) and NSAIDs (ES=0.32, 95% CI 0.24, 0.39) and to draw the conclusion that electrotherapy is more effective than NSAIDs.

Finally, evidence was selected sequentially according to the evidence hierarchy (Table I) and the quality of the studies, and only the best available evidence was considered. Whether this is an adequate approach is open to discussion. An MA is not necessarily superior to a large scale well-conducted RCT⁸⁷, and RCTs are not necessarily better than observational studies⁸⁸. Differences in the underlying populations being examined may also impact the results of a study.

In summary, a critical appraisal of existing treatment guidelines across countries and regions has identified a core set of treatments for the management of hip and knee OA. The quality and applicability of these guidelines increased when research evidence and expert opinions were combined. The study suggests that there is room for improvement in the quality and applicability of guidelines for the management of hip and knee OA in the future. Regular SR of research evidence and update of recommendations are important to ensure that guidelines remain current.
