Introduction

Binary diagnostic tests are among the most commonly used tests in medicine and are used to rule in or out a certain condition. Commonly, this condition is disease status, but tests may also detect, for example, the presence of a bacterium or a virus, independent of any clinical manifestations.

Test metrics such as diagnostic accuracies, predictive values and likelihood ratios are useful tools for evaluating the efficacy of such tests against a gold standard. However, these statistics only describe the quality of a single test. Performing statistical inference to evaluate whether one test is better than another, while simultaneously referencing the gold standard, is more complicated. Several authors have invested significant effort in developing statistical methods to perform such inference.

Understanding and implementing methods described in the statistical literature is often far outside the comfort zone of clinicians, particularly those who are not routinely involved in academic research.

Here we demonstrate the implementation of the testCompareR package.

Generating data

The package comes with a data set derived from the Cystic Fibrosis Patient Registry. The data was originally presented in the paper ‘Comparing the predictive values of diagnostic tests: sample size and analysis for paired study designs’ by Moskowitz and Pepe. Two binary prognostic factors are evaluated as predictors of severe infection in cystic fibrosis patients.

library(testCompareR)
dat <- cfpr

Using the compareR function

The testCompareR package is elegant in its simplicity. You can pass your data to the compareR() function as the only argument and the function outputs a list object containing the results of descriptive and inferential statistical tests.

results <- compareR(dat)
results
#> $cont
#> $cont$`True Status: POS`
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      185     3445
#>   Negative        0     1424
#> 
#> $cont$`True Status: NEG`
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      123     1219
#>   Negative        0     5564
#> 
#> 
#> $prev
#>            Estimate  SE Lower CI Upper CI
#> Prevalence     42.3 0.5     41.4     43.1
#> 
#> $acc
#> $acc$accuracies
#> $acc$accuracies$`Test 1`
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     71.8 0.6     70.6     73.0
#> Specificity     80.6 0.5     79.6     81.5
#> 
#> $acc$accuracies$`Test 2`
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity      3.7 0.3      3.2      4.2
#> Specificity     98.2 0.2     97.9     98.5
#> 
#> 
#> $acc$glob.test.stat
#> [1] 12301.32
#> 
#> $acc$glob.p.value
#> [1] 0
#> 
#> $acc$glob.p.adj
#> [1] 0
#> 
#> $acc$sens.test.stat
#> [1] 10821.03
#> 
#> $acc$sens.p.value
#> [1] 0
#> 
#> $acc$sens.p.adj
#> [1] 0
#> 
#> $acc$spec.test.stat
#> [1] 1480.291
#> 
#> $acc$spec.p.value
#> [1] 0
#> 
#> $acc$spec.p.adj
#> [1] 0
#> 
#> 
#> $pv
#> $pv$predictive.values
#> $pv$predictive.values$`Test 1`
#>     Estimate  SE Lower CI Upper CI
#> PPV     73.0 0.6     71.8     74.2
#> NPV     79.6 0.5     78.7     80.6
#> 
#> $pv$predictive.values$`Test 2`
#>     Estimate  SE Lower CI Upper CI
#> PPV     60.1 2.8     54.5     65.4
#> NPV     58.2 0.5     57.3     59.1
#> 
#> 
#> $pv$glob.test.stat
#> [1] 2830.026
#> 
#> $pv$glob.p.value
#> [1] 0
#> 
#> $pv$glob.p.adj
#> [1] 0
#> 
#> $pv$ppv.test.stat
#> [1] 28.45746
#> 
#> $pv$ppv.p.value
#> [1] 9.578013e-08
#> 
#> $pv$ppv.p.adj
#> [1] 1.915603e-07
#> 
#> $pv$npv.test.stat
#> [1] 2261.186
#> 
#> $pv$npv.p.value
#> [1] 0
#> 
#> $pv$npv.p.adj
#> [1] 0
#> 
#> 
#> $lr
#> $lr$likelihood.ratios
#> $lr$likelihood.ratios$`Test 1`
#>     Estimate  SE Lower CI Upper CI
#> PLR      3.7 0.1      3.5      3.9
#> NLR      0.3 0.0      0.3      0.4
#> 
#> $lr$likelihood.ratios$`Test 2`
#>     Estimate  SE Lower CI Upper CI
#> PLR      2.1 0.2      1.6      2.6
#> NLR      1.0 0.0      1.0      1.0
#> 
#> 
#> $lr$glob.test.stat
#> [1] 2010.219
#> 
#> $lr$glob.p.value
#> [1] 0
#> 
#> $lr$glob.p.adj
#> [1] 0
#> 
#> $lr$plr.test.stat
#> [1] 5.246279
#> 
#> $lr$plr.p.value
#> [1] 1.552014e-07
#> 
#> $lr$plr.p.adj
#> [1] 1.915603e-07
#> 
#> $lr$nlr.test.stat
#> [1] 44.83283
#> 
#> $lr$nlr.p.value
#> [1] 0
#> 
#> $lr$nlr.p.adj
#> [1] 0
#> 
#> 
#> $other
#> $other$alpha
#> [1] 0.05
#> 
#> $other$equal
#> [1] FALSE
#> 
#> $other$zeros
#> [1] 2
#> 
#> $other$Youden1
#> [1] 0.5239192
#> 
#> $other$Youden2
#> [1] 0.01879407
#> 
#> $other$test.names
#> [1] "Test 1" "Test 2"
#> 
#> 
#> attr(,"class")
#> [1] "compareR"

Individual results can be accessed via standard indexing.

results$acc$accuracies # returns summary tables for diagnostic accuracies
#> $`Test 1`
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     71.8 0.6     70.6     73.0
#> Specificity     80.6 0.5     79.6     81.5
#> 
#> $`Test 2`
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity      3.7 0.3      3.2      4.2
#> Specificity     98.2 0.2     97.9     98.5
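
To see where these point estimates come from, they can be reproduced by hand from the two contingency tables printed above. The following is a minimal sketch in base R, not the package's own code: the helper accuracy_from_counts() is a hypothetical name introduced here for illustration, the counts are copied from the output above, and only point estimates are computed (no standard errors, confidence intervals or hypothesis tests).

```r
# Counts copied from the contingency tables above.
# Rows are Test 1 results, columns are Test 2 results.
pos <- matrix(c(185, 0, 3445, 1424), nrow = 2,
              dimnames = list(test1 = c("Positive", "Negative"),
                              test2 = c("Positive", "Negative")))
neg <- matrix(c(123, 0, 1219, 5564), nrow = 2, dimnames = dimnames(pos))

# Hypothetical helper (not part of testCompareR): point estimates for a
# single test from its true/false positive and negative counts.
accuracy_from_counts <- function(tp, fn, fp, tn) {
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  c(
    sensitivity = sens,
    specificity = spec,
    ppv = tp / (tp + fp),
    npv = tn / (tn + fn),
    plr = sens / (1 - spec),
    nlr = (1 - sens) / spec,
    youden = sens + spec - 1
  )
}

# Test 1 is summarised by the row totals, Test 2 by the column totals.
est1 <- accuracy_from_counts(
  tp = sum(pos["Positive", ]), fn = sum(pos["Negative", ]),
  fp = sum(neg["Positive", ]), tn = sum(neg["Negative", ])
)
est2 <- accuracy_from_counts(
  tp = sum(pos[, "Positive"]), fn = sum(pos[, "Negative"]),
  fp = sum(neg[, "Positive"]), tn = sum(neg[, "Negative"])
)

# Prevalence is the proportion of subjects with a positive true status.
prevalence <- sum(pos) / (sum(pos) + sum(neg))

round(100 * est1[c("sensitivity", "specificity", "ppv", "npv")], 1)
round(100 * est2[c("sensitivity", "specificity")], 1)
round(100 * prevalence, 1)
```

The rounded percentages should agree with the Estimate columns above, and the youden entries correspond to the Youden1 and Youden2 values stored in results$other.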

The list output of the compareR() function is useful should you need to manipulate any of the individual outputs in subsequent calculations. However, should you wish to see an interpretation of your results, you can pass the compareR output to the interpretR() function. This provides the same results in a more human-readable format.

interpretR(results)
#> 
#> WARNING:
#> Zeros exist in contingency table. Tests may return NA/NaN.
#> 
#> --------------------------------------------------------------------------------
#> CONTINGENCY TABLES
#> --------------------------------------------------------------------------------
#> 
#> True Status - POSITIVE
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      185     3445
#>   Negative        0     1424
#> 
#> True Status - NEGATIVE
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      123     1219
#>   Negative        0     5564
#> 
#> --------------------------------------------------------------------------------
#> PREVALENCE (%)
#> --------------------------------------------------------------------------------
#> 
#>            Estimate  SE Lower CI Upper CI
#> Prevalence     42.3 0.5     41.4     43.1
#> 
#> --------------------------------------------------------------------------------
#> DIAGNOSTIC ACCURACIES
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     71.8 0.6     70.6     73.0
#> Specificity     80.6 0.5     79.6     81.5
#> 
#>  Test 2 (%)
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity      3.7 0.3      3.2      4.2
#> Specificity     98.2 0.2     97.9     98.5
#> 
#> Global Null Hypothesis: Se1 = Se2 & Sp1 = Sp2
#> Test statistic:  12301.32  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: Se1 = Se2
#> Test statistic:  10821.03  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: Sp1 = Sp2
#> Test statistic:  1480.291  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> --------------------------------------------------------------------------------
#> PREDICTIVE VALUES
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>     Estimate  SE Lower CI Upper CI
#> PPV     73.0 0.6     71.8     74.2
#> NPV     79.6 0.5     78.7     80.6
#> 
#>  Test 2 (%)
#>     Estimate  SE Lower CI Upper CI
#> PPV     60.1 2.8     54.5     65.4
#> NPV     58.2 0.5     57.3     59.1
#> 
#> Global Null Hypothesis: PPV1 = PPV2 & NPV1 = NPV2
#> Test statistic:  2830.026  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: PPV1 = PPV2
#> Test statistic:  28.45746  Adjusted p value:  1.915603e-07 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: NPV1 = NPV2
#> Test statistic:  2261.186  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> --------------------------------------------------------------------------------
#> LIKELIHOOD RATIOS
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>     Estimate  SE Lower CI Upper CI
#> PLR      3.7 0.1      3.5      3.9
#> NLR      0.3 0.0      0.3      0.4
#> 
#>  Test 2 (%)
#>     Estimate  SE Lower CI Upper CI
#> PLR      2.1 0.2      1.6      2.6
#> NLR      1.0 0.0      1.0      1.0
#> 
#> Global Null Hypothesis: PLR1 = PLR2 & NLR1 = NLR2
#> Test statistic:  2010.219  Adjusted p value:  0 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: PLR1 = PLR2
#> Test statistic:  5.246279  Adjusted p value:  1.915603e-07 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: NLR1 = NLR2
#> Test statistic:  44.83283  Adjusted p value:  0 ***SIGNIFICANT***

And really, that’s it! That is all you need to know to get your answers with testCompareR.

There is some additional functionality that might be useful to know about, though.

Flexible input

The compareR() function will accept data as a data frame or matrix, and there is a range of coding options for positive and negative results, detailed below. If you have been working across multiple sites and find that researchers have used different coding systems, no problem! As long as positive results are coded using something from the positive list and negative results using something from the negative list, compareR() will handle that for you. No more manually re-coding your data!

“What about those pesky trailing spaces?” I hear you ask. Of course, compareR() can handle that, too. “Case-sensitivity?” Taken care of.

POSITIVE: positive, pos, p, yes, y, true, t, +, 1

NEGATIVE: negative, neg, no, n, false, f, -, 0, 2

# create data frame with varied coding
df <- data.frame(
  test1 = c(" positive ", "POS ", " n ", "N ", " 1 ", "+"),
  test2 = c(" NEG ", " yes ", " negative", " Y ", "-", " 0 "),
  gold = c(0, 1, 0, 1, 2, 1)
)

# recode the dataframe
recoded <- testCompareR:::recoder(df)
recoded
#>   test1 test2 gold
#> 1     1     0    0
#> 2     1     1    1
#> 3     0     0    0
#> 4     0     1    1
#> 5     1     0    0
#> 6     1     0    1
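
If you would like to see the idea behind this recoding without calling an internal function, it can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: recode_binary() is a hypothetical helper, and the two code lists are simply copied from the text above.

```r
# Coding lists copied from the text above.
pos_codes <- c("positive", "pos", "p", "yes", "y", "true", "t", "+", "1")
neg_codes <- c("negative", "neg", "no", "n", "false", "f", "-", "0", "2")

# Hypothetical helper, not the package's internal recoder():
# trims whitespace, ignores case, then maps each value to 1 or 0.
recode_binary <- function(x) {
  key <- tolower(trimws(as.character(x)))
  out <- rep(NA_integer_, length(key))
  out[key %in% pos_codes] <- 1L
  out[key %in% neg_codes] <- 0L
  if (anyNA(out)) {
    stop("Unrecognised coding: ",
         paste(unique(key[is.na(out)]), collapse = ", "))
  }
  out
}

mixed <- data.frame(
  test1 = c(" positive ", "POS ", " n ", "N ", " 1 ", "+"),
  test2 = c(" NEG ", " yes ", " negative", " Y ", "-", " 0 "),
  gold = c(0, 1, 0, 1, 2, 1)
)

# Apply the helper column by column, keeping the data frame structure.
mixed[] <- lapply(mixed, recode_binary)
mixed
```

Stopping on an unrecognised value, rather than silently returning NA, is a design choice made for this sketch; it keeps coding mistakes visible before any statistics are calculated.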

There are two things that compareR() cannot handle.

Firstly, it is imperative that the data structure provided has three columns and that those columns follow the pattern Test 1, Test 2, gold standard. If you place the gold standard at any index other than your_data[,3] then compareR() may return a result that looks sensible but does not answer the question you wanted to ask.

Secondly, compareR() cannot handle missing data. Removing missing data is an option, but consider why the data is missing and don’t omit this from any write-up of the results. Alternatively, if the data are missing at random, you could consider the use of imputation methods to replace missing data. If the data are not missing at random then imputation becomes vastly more complex and you should probably seek expert advice.
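
Both points can be checked before calling compareR(). The sketch below uses base R only; the data frame raw is hypothetical and exists purely to illustrate the checks. It takes the complete-case route, which is only one of the options discussed above.

```r
# Hypothetical data with missing results (NA), for illustration only.
raw <- data.frame(
  test1 = c(1, 0, NA, 1, 0),
  test2 = c(1, 1, 0, NA, 0),
  gold  = c(1, 0, 0, 1, 0)
)

# Exactly three columns are expected: Test 1, Test 2, gold standard.
stopifnot(ncol(raw) == 3)

# Count missing values per column and the number of rows affected.
colSums(is.na(raw))
n_incomplete <- sum(!complete.cases(raw))

# Complete-case data set; report n_incomplete in any write-up.
complete <- raw[complete.cases(raw), ]
```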

Variable alpha

If you’re using testCompareR because you’re not statistically savvy and you want a nice function to do it all for you then you should probably just leave alpha alone. If you have a good reason, or you’re just messing around, feel free to change it to whatever you’d like, though.

# simulate data
test1 <- c(rep(1, 300), rep(0, 100), rep(1, 65), rep(0, 135))
test2 <- c(rep(1, 280), rep(0, 120), rep(1, 55), rep(0, 145))
gold <- c(rep(1, 400), rep(0, 200))

df <- data.frame(test1, test2, gold)

# test with alpha = 0.5
result <- compareR(df, alpha = 0.5)

# all results are significant
interpretR(result)
#> 
#> WARNING:
#> Zeros exist in contingency table. Tests may return NA/NaN.
#> 
#> --------------------------------------------------------------------------------
#> CONTINGENCY TABLES
#> --------------------------------------------------------------------------------
#> 
#> True Status - POSITIVE
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      280       20
#>   Negative        0      100
#> 
#> True Status - NEGATIVE
#>           Test 2
#> Test 1     Positive Negative
#>   Positive       55       10
#>   Negative        0      135
#> 
#> --------------------------------------------------------------------------------
#> PREVALENCE (%)
#> --------------------------------------------------------------------------------
#> 
#>            Estimate  SE Lower CI Upper CI
#> Prevalence     66.7 1.9     62.8     70.3
#> 
#> --------------------------------------------------------------------------------
#> DIAGNOSTIC ACCURACIES
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     75.0 2.2     73.5     76.4
#> Specificity     67.5 3.3     65.2     69.7
#> 
#>  Test 2 (%)
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     70.0 2.3     68.4     71.5
#> Specificity     72.5 3.2     70.3     74.6
#> 
#> Global Null Hypothesis: Se1 = Se2 & Sp1 = Sp2
#> Test statistic:  31.57895  Adjusted p value:  4.167158e-07 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: Se1 = Se2
#> Test statistic:  18.05  Adjusted p value:  0.0001506251 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: Sp1 = Sp2
#> Test statistic:  8.1  Adjusted p value:  0.02213263 ***SIGNIFICANT***
#> 
#> --------------------------------------------------------------------------------
#> PREDICTIVE VALUES
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>     Estimate  SE Lower CI Upper CI
#> PPV     82.2 2.0     80.8     83.5
#> NPV     57.4 3.2     55.3     59.6
#> 
#>  Test 2 (%)
#>     Estimate  SE Lower CI Upper CI
#> PPV     83.6 2.0     82.2     84.9
#> NPV     54.7 3.1     52.6     56.8
#> 
#> Global Null Hypothesis: PPV1 = PPV2 & NPV1 = NPV2
#> Test statistic:  26.92232  Adjusted p value:  2.850504e-06 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: PPV1 = PPV2
#> Test statistic:  3.171214  Adjusted p value:  0.1498935 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: NPV1 = NPV2
#> Test statistic:  5.653882  Adjusted p value:  0.06966709 ***SIGNIFICANT***
#> 
#> --------------------------------------------------------------------------------
#> LIKELIHOOD RATIOS
#> --------------------------------------------------------------------------------
#> 
#>  Test 1 (%)
#>     Estimate  SE Lower CI Upper CI
#> PLR      2.3 0.2      2.1      2.5
#> NLR      0.4 0.0      0.3      0.4
#> 
#>  Test 2 (%)
#>     Estimate  SE Lower CI Upper CI
#> PLR      2.5 0.3      2.3      2.7
#> NLR      0.4 0.0      0.4      0.4
#> 
#> Global Null Hypothesis: PLR1 = PLR2 & NLR1 = NLR2
#> Test statistic:  23.37068  Adjusted p value:  8.416292e-06 ***SIGNIFICANT***
#> 
#> Investigating cause(s) of significance
#> 
#> Null Hypothesis 1: PLR1 = PLR2
#> Test statistic:  1.779904  Adjusted p value:  0.1498935 ***SIGNIFICANT***
#> 
#> Null Hypothesis 2: NLR1 = NLR2
#> Test statistic:  2.375766  Adjusted p value:  0.06966709 ***SIGNIFICANT***
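
The effect of alpha on the significance flags can be checked directly. This small sketch assumes that a result is flagged when its adjusted p value is below alpha, which is an assumption about interpretR() suggested by the output above rather than a statement about its internals. The adjusted p values are copied from the predictive values section.

```r
# Adjusted p values copied from the predictive values output above.
p_adj <- c(PPV = 0.1498935, NPV = 0.06966709)

p_adj < 0.5   # the threshold used in the example above
p_adj < 0.05  # the default alpha
```

Under that assumption, neither comparison of predictive values would be flagged at the default alpha of 0.05.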

Margins

Contingency tables are included in the readout from compareR() or interpretR(). Some people like to see the summed totals for columns and rows. If you’re one of those people, then set margins = TRUE.

# simulate data
test1 <- c(rep(1, 300), rep(0, 100), rep(1, 65), rep(0, 135))
test2 <- c(rep(1, 280), rep(0, 120), rep(1, 55), rep(0, 145))
gold <- c(rep(1, 400), rep(0, 200))

df <- data.frame(test1, test2, gold)

# request contingency tables with margins
result <- compareR(df, margins = TRUE)

# contingency tables have margins
result$cont
#> $`True Status: POS`
#>           Test 2
#> Test 1     Positive Negative Sum
#>   Positive      280       20 300
#>   Negative        0      100 100
#>   Sum           280      120 400
#> 
#> $`True Status: NEG`
#>           Test 2
#> Test 1     Positive Negative Sum
#>   Positive       55       10  65
#>   Negative        0      135 135
#>   Sum            55      145 200
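
For comparison, the same kind of table can be built in base R with table() and addmargins(). This is a sketch of the idea, not necessarily how compareR() builds its tables; as_result() is a hypothetical helper used only to label and order the levels.

```r
# Simulated data from the example above.
test1 <- c(rep(1, 300), rep(0, 100), rep(1, 65), rep(0, 135))
test2 <- c(rep(1, 280), rep(0, 120), rep(1, 55), rep(0, 145))
gold <- c(rep(1, 400), rep(0, 200))

# Hypothetical helper: label results so that "Positive" is listed first.
as_result <- function(x) {
  factor(x, levels = c(1, 0), labels = c("Positive", "Negative"))
}

# Cross-tabulate the two tests among subjects with a positive true status.
pos_tab <- table(`Test 1` = as_result(test1[gold == 1]),
                 `Test 2` = as_result(test2[gold == 1]))

# Add row and column totals.
pos_margins <- addmargins(pos_tab)
pos_margins
```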

Multiple testing

By default compareR() runs a minimum of three hypothesis tests and it can perform up to nine. This is accounted for using adjusted p-values according to the Holm method. If you’d prefer to use a different method, that’s no problem. Just set the multi_corr parameter to any of the methods which are handled by the base R function p.adjust().

# display p.adjust.methods
p.adjust.methods
#> [1] "holm"       "hochberg"   "hommel"     "bonferroni" "BH"        
#> [6] "BY"         "fdr"        "none"

# simulate data
test1 <- c(rep(1, 300), rep(0, 100), rep(1, 65), rep(0, 135))
test2 <- c(rep(1, 280), rep(0, 120), rep(1, 55), rep(0, 145))
gold <- c(rep(1, 400), rep(0, 200))

df <- data.frame(test1, test2, gold)

# test with different multiple comparison methods
result1 <- compareR(df, multi_corr = "holm")
result2 <- compareR(df, multi_corr = "bonferroni")

# the more conservative Bonferroni method returns higher adjusted p values
result1$pv$glob.p.adj < result2$pv$glob.p.adj
#> [1] TRUE
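
To get a feel for how the two corrections differ, p.adjust() can be applied to a vector of p values directly. The raw p values below are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Hypothetical raw p values, for illustration only.
p_raw <- c(0.01, 0.02, 0.04)

holm <- p.adjust(p_raw, method = "holm")
bonf <- p.adjust(p_raw, method = "bonferroni")

rbind(raw = p_raw, holm = holm, bonferroni = bonf)

# Holm-adjusted p values never exceed the Bonferroni-adjusted ones.
all(holm <= bonf)
```

Bonferroni multiplies every p value by the number of tests, whereas Holm multiplies the smallest by the number of tests, the next smallest by one fewer, and so on, while keeping the adjusted values in order.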

Continuity correction

In certain circumstances compareR() uses McNemar’s test for testing differences in diagnostic accuracies. This test is routinely performed with continuity correction. If you wish to perform it without continuity correction then set cc = FALSE. If you aren’t sure whether to run the test with or without continuity correction, then stick to the default parameters.

# simulate data
test1 <- c(rep(1, 6), rep(0, 2), rep(1, 14), rep(0, 76))
test2 <- c(rep(1, 1), rep(0, 7), rep(1, 2), rep(0, 88))
gold <- c(rep(1, 8), rep(0, 90))

df <- data.frame(test1, test2, gold)

# run compareR without continuity correction
result <- compareR(df, cc = FALSE)
result$acc
#> $accuracies
#> $accuracies$`Test 1`
#>             Estimate   SE Lower CI Upper CI
#> Sensitivity     75.0 15.3     41.5     93.4
#> Specificity     84.4  3.8     75.7     90.6
#> 
#> $accuracies$`Test 2`
#>             Estimate   SE Lower CI Upper CI
#> Sensitivity     12.5 11.7      1.4     46.2
#> Specificity     97.8  1.6     92.4     99.5
#> 
#> 
#> $glob.test.stat
#> [1] "n < 100 and prevalence <= 10% - global test not used"
#> 
#> $glob.p.value
#> [1] NA
#> 
#> $glob.p.adj
#> [1] NA
#> 
#> $sens.test.stat
#> [1] 13.33333
#> 
#> $sens.p.value
#> [1] 0.0002607296
#> 
#> $sens.p.adj
#> [1] 0.0003968048
#> 
#> $spec.test.stat
#> [1] 13.84615
#> 
#> $spec.p.value
#> [1] 0.0001984024
#> 
#> $spec.p.adj
#> [1] 0.0003968048
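
The effect of a continuity correction is easiest to see in isolation. The sketch below uses base R's mcnemar.test() on a hypothetical paired table; it illustrates the correction itself and is not a view into compareR()'s internals, so the statistics that compareR() reports may be calculated differently.

```r
# Hypothetical paired results, for illustration only.
# Rows are Test 1 results, columns are Test 2 results.
paired <- matrix(c(40, 5, 15, 20), nrow = 2,
                 dimnames = list(test1 = c("Positive", "Negative"),
                                 test2 = c("Positive", "Negative")))

with_cc <- mcnemar.test(paired, correct = TRUE)
without_cc <- mcnemar.test(paired, correct = FALSE)

# The correction shrinks the test statistic, giving a larger p value.
c(with_cc = unname(with_cc$statistic),
  without_cc = unname(without_cc$statistic))
c(with_cc = with_cc$p.value, without_cc = without_cc$p.value)
```

Only the discordant pairs (15 and 5 here) enter the statistic: without correction it is (15 - 5)^2 / 20, and with correction it is (|15 - 5| - 1)^2 / 20.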

Decimal places

You can change the number of decimal places displayed in the summary tables which are output by both the compareR() and interpretR() functions with the dp parameter. This parameter does not affect the number of decimal places displayed for p values or test statistics.

# simulate data
test1 <- c(rep(1, 317), rep(0, 83), rep(1, 68), rep(0, 132))
test2 <- c(rep(1, 281), rep(0, 119), rep(1, 51), rep(0, 149))
gold <- c(rep(1, 390), rep(0, 210))

df <- data.frame(test1, test2, gold)

# display summary tables to 3 decimal places
result <- compareR(df, dp = 3)

# the values in the summary tables are displayed to 3 decimal places
result$acc$accuracies
#> $`Test 1`
#>             Estimate    SE Lower CI Upper CI
#> Sensitivity   81.282 1.975   77.135   84.863
#> Specificity   67.619 3.229   61.046   73.605
#> 
#> $`Test 2`
#>             Estimate    SE Lower CI Upper CI
#> Sensitivity   72.051 2.272   67.415   76.289
#> Specificity   75.714 2.959   69.520   81.052

Choosing your tests

Another important aspect of the testCompareR package is the ability to control your study design. For example, if you have made the a priori decision that you are only interested in the predictive values, is it really necessary to control for multiple tests for diagnostic accuracies and likelihood ratios? Of course not!

You can ask compareR() to omit any pair of metrics that is not of interest by setting the corresponding parameter to FALSE. The parameters are as follows: sesp for diagnostic accuracies; ppvnpv for predictive values; plrnlr for likelihood ratios.

# simulate data
test1 <- c(rep(1, 317), rep(0, 83), rep(1, 68), rep(0, 132))
test2 <- c(rep(1, 281), rep(0, 119), rep(1, 51), rep(0, 149))
gold <- c(rep(1, 390), rep(0, 210))

df <- data.frame(test1, test2, gold)

# only display results for predictive values
result <- compareR(df, sesp = FALSE, plrnlr = FALSE)
result
#> $cont
#> $cont$`True Status: POS`
#>           Test 2
#> Test 1     Positive Negative
#>   Positive      281       36
#>   Negative        0       73
#> 
#> $cont$`True Status: NEG`
#>           Test 2
#> Test 1     Positive Negative
#>   Positive       51       17
#>   Negative        0      142
#> 
#> 
#> $prev
#>            Estimate  SE Lower CI Upper CI
#> Prevalence       65 1.9     61.1     68.7
#> 
#> $pv
#> $pv$predictive.values
#> $pv$predictive.values$`Test 1`
#>     Estimate  SE Lower CI Upper CI
#> PPV     82.3 1.9     78.2     85.8
#> NPV     66.0 3.2     59.5     72.1
#> 
#> $pv$predictive.values$`Test 2`
#>     Estimate SE Lower CI Upper CI
#> PPV     84.6  2     80.4     88.1
#> NPV     59.3  3     53.4     65.0
#> 
#> 
#> $pv$glob.test.stat
#> [1] 50.21468
#> 
#> $pv$glob.p.value
#> [1] 1.247447e-11
#> 
#> $pv$glob.p.adj
#> [1] 1.247447e-11
#> 
#> $pv$ppv.test.stat
#> [1] 5.279658
#> 
#> $pv$ppv.p.value
#> [1] 0.02157598
#> 
#> $pv$ppv.p.adj
#> [1] 0.02157598
#> 
#> $pv$npv.test.stat
#> [1] 15.86226
#> 
#> $pv$npv.p.value
#> [1] 6.812387e-05
#> 
#> $pv$npv.p.adj
#> [1] 0.0001362477
#> 
#> 
#> $other
#> $other$alpha
#> [1] 0.05
#> 
#> $other$equal
#> [1] FALSE
#> 
#> $other$zeros
#> [1] 2
#> 
#> $other$Youden1
#> [1] 0.489011
#> 
#> $other$Youden2
#> [1] 0.4776557
#> 
#> $other$test.names
#> [1] "Test 1" "Test 2"
#> 
#> 
#> attr(,"class")
#> [1] "compareR"
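
The adjusted p values in this output are consistent with a Holm correction applied over only the two individual predictive value tests. This can be checked in base R with the raw p values copied from above; it is a check on the printed numbers, not a description of the package's internal code.

```r
# Raw p values for the individual PPV and NPV tests, copied from the output above.
p_raw <- c(ppv = 0.02157598, npv = 6.812387e-05)

# Holm adjustment over two tests: the smaller p value is doubled,
# the larger one is left unchanged.
p_holm <- p.adjust(p_raw, method = "holm")
p_holm
```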

Test names

If you want specific test names to be included in the output for compareR() then you can set the test.names parameter. This parameter accepts a character vector of length 2.

# simulate data
test1 <- c(rep(1, 317), rep(0, 83), rep(1, 68), rep(0, 132))
test2 <- c(rep(1, 281), rep(0, 119), rep(1, 51), rep(0, 149))
gold <- c(rep(1, 390), rep(0, 210))

df <- data.frame(test1, test2, gold)

# label the two tests in the output
result <- compareR(df, test.names = c("POCT", "Lab Blood"))
result$acc$accuracies
#> $POCT
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     81.3 2.0     77.1     84.9
#> Specificity     67.6 3.2     61.0     73.6
#> 
#> $`Lab Blood`
#>             Estimate  SE Lower CI Upper CI
#> Sensitivity     72.1 2.3     67.4     76.3
#> Specificity     75.7 3.0     69.5     81.1

Conclusion

You made it to the end! That pretty well summarises everything you need to know about the package. Hopefully it will save you a lot of time when comparing two binary diagnostic tests.

Please get in touch with any refinements, comments or bugs. The source code is available on GitHub if you think you can improve it yourself!