A function to test random forest classifiers for QC data

MSstatsQC.ML.deployR(Test.set, guide.set, rf_model)

Arguments

Test.set

Comma-separated (.csv) metric file. It should contain a "Precursor" column and the metric columns, and should also include "Annotations" for each run.

guide.set

Comma-separated (.csv) metric file. It should contain a "Precursor" column and the metric columns, and should also include "Annotations" for each run.

rf_model

The model trained previously by the MSstatsQC-ML training process.
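The layout described for Test.set and guide.set can be checked before calling the function. The following is a minimal sketch in base R, not part of MSstatsQC-ML: the metric column names (RetentionTime, TotalArea), the peptide names, and all values are hypothetical and serve only to illustrate a file with a "Precursor" column, an "Annotations" column, and metric columns.

```r
# Hypothetical QC metric table; column names other than "Precursor" and
# "Annotations" are illustrative assumptions, as are all values.
qc <- data.frame(
  Precursor     = rep(c("PEPTIDEA", "PEPTIDEB"), times = 3),
  Annotations   = NA_character_,
  RetentionTime = c(10.1, 15.2, 10.3, 15.1, 10.2, 15.4),
  TotalArea     = c(1.2e6, 3.4e6, 1.1e6, 3.6e6, 1.3e6, 3.3e6),
  stringsAsFactors = FALSE
)

# Round-trip through a temporary .csv file to mimic a metric file on disk.
csv_path <- tempfile(fileext = ".csv")
write.csv(qc, csv_path, row.names = FALSE)
metrics <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check that the columns named in the argument description are present.
required <- c("Precursor", "Annotations")
missing_cols <- setdiff(required, colnames(metrics))
if (length(missing_cols) > 0) {
  stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
}

# Everything that is not a required column is treated as a metric here.
metric_cols <- setdiff(colnames(metrics), required)
```

Treating every remaining column as a metric is a simplification for this sketch; a real metric file may carry additional bookkeeping columns.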

Value

Predicted probabilities of failure, based on the trained model.

Examples

S9Site54.dataML <- DataProcess(MSstatsQC::S9Site54[, ])
#> Your data is ready to go!
# Rename the first two columns to "idfile" and "peptide"
colnames(S9Site54.dataML)[1] <- "idfile"
colnames(S9Site54.dataML)[2] <- "peptide"
S9Site54.dataML$peptide <- as.factor(S9Site54.dataML$peptide)
S9Site54.dataML$idfile <- as.numeric(S9Site54.dataML$idfile)
S9Site54.dataML <- within(S9Site54.dataML, rm(Annotations, missing))
# Rows with idfile <= 20 form the guide set
guide.set <- dplyr::filter(S9Site54.dataML, idfile <= 20)
# \donttest{
rf_model <- MSstatsQC.ML.trainR(guide.set, sim.size = 10)
#> creating full factorial with 32 runs ...
#> creating full factorial with 32 runs ...
#> Error in QcClassifier_data_var(guide.set, nmetric, factor.names, sim.size *     1, peptide.colname, L = a, U = b): object 'guide.set' not found
# }
# Rows with idfile > 20 form the test set
Test.set <- dplyr::filter(S9Site54.dataML, idfile > 20)
# \donttest{
MSstatsQC.ML.deployR(Test.set, guide.set, rf_model = rf_model)
#> Error in h2o.getConnection(): No active connection to an H2O cluster. Did you run `h2o.init()` ?
# }
# Alternatively, use the full data set as the test set
Test.set <- S9Site54.dataML
# \donttest{
MSstatsQC.ML.deployR(Test.set, guide.set, rf_model = rf_model)
#> Error in h2o.getConnection(): No active connection to an H2O cluster. Did you run `h2o.init()` ?
# }
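The deployment errors in the examples report that no H2O cluster connection was active when MSstatsQC.ML.deployR was called. A minimal sketch of a guard follows, assuming the h2o package is installed and that starting a local cluster with default settings is acceptable. The helper name h2o_ready is hypothetical and not part of MSstatsQC-ML, and the sketch does not address the separate 'guide.set' not found error raised during training.

```r
# Hypothetical helper: returns TRUE if a connection to an H2O cluster could
# be established, FALSE otherwise. Not part of MSstatsQC-ML.
h2o_ready <- function() {
  # The h2o package is optional here; report FALSE if it is not installed.
  if (!requireNamespace("h2o", quietly = TRUE)) {
    return(FALSE)
  }
  tryCatch({
    # h2o.init() connects to a running cluster or tries to start a local one.
    h2o::h2o.init()
    TRUE
  }, error = function(e) {
    # For example, a missing Java installation ends up here.
    FALSE
  })
}

ready <- h2o_ready()

# Intended use, assuming Test.set, guide.set and rf_model already exist:
# if (ready) {
#   predictions <- MSstatsQC.ML.deployR(Test.set, guide.set, rf_model = rf_model)
# }
```

Wrapping the initialization in tryCatch keeps the example from aborting on machines without H2O, so the deployment call can simply be skipped when no cluster is available.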