A function to train random forest classifiers for QC data

MSstatsQC.ML.trainR(
  guide.set,
  sim.size,
  guide.set.annotations = NULL,
  nfolds = NULL,
  a = 1.5,
  b = 2
)

Arguments

guide.set

Comma-separated (.csv) metric file. It should contain a "Precursor" column, the metric columns, and "Annotations" for each run.

sim.size

Simulation size. This argument has no default and must be supplied.

guide.set.annotations

Comma-separated (.csv) metric file with annotations such as pass and fail. Defaults to NULL.

nfolds

Number of folds for cross-validation. Defaults to NULL.

a

Lower threshold used to define the shift size. Defaults to 1.5.

b

Upper threshold used to define the shift size. Defaults to 2.
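
As a rough illustration of the layout described for guide.set, the sketch below builds a small table with a "Precursor" column, two metric columns, and an "Annotations" column, writes it to a temporary .csv file, and reads it back. The metric names (RetentionTime, TotalArea), the peptide labels, and all values are made up for illustration and are not taken from the package; the checks at the end only confirm the column layout and are not part of MSstatsQC.ML.trainR.

```r
# Hypothetical guide set. Only "Precursor" and "Annotations" are names taken
# from the argument description; everything else is an assumption.
guide.set.example <- data.frame(
  Precursor     = rep(c("PEPTIDEA", "PEPTIDEB"), each = 3),
  RetentionTime = c(20.1, 20.3, 20.2, 35.4, 35.6, 35.5),
  TotalArea     = c(1.2e6, 1.1e6, 1.3e6, 8.0e5, 8.2e5, 7.9e5),
  Annotations   = "",
  stringsAsFactors = FALSE
)

# Round-trip through a temporary comma-separated file.
csv.path <- tempfile(fileext = ".csv")
write.csv(guide.set.example, csv.path, row.names = FALSE)
guide.set.read <- read.csv(csv.path, stringsAsFactors = FALSE)

# Check that the expected columns are present and the metrics are numeric.
required <- c("Precursor", "Annotations")
stopifnot(all(required %in% colnames(guide.set.read)))
metric.cols <- setdiff(colnames(guide.set.read), required)
stopifnot(all(vapply(guide.set.read[metric.cols], is.numeric, logical(1))))
```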

Value

A trained model and performance indicators from the train/validation/test splits.
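
For readers unfamiliar with the cross-validation that nfolds refers to, the following is a generic base R sketch of assigning rows to folds. It is not the package's internal procedure, which is not documented here; the row count, the fold count, and the names n.rows, nfolds.example, and fold.id are all hypothetical.

```r
set.seed(1)            # make the random fold assignment reproducible

n.rows <- 20           # hypothetical number of training rows
nfolds.example <- 5    # hypothetical number of folds

# Give each row a fold label 1..nfolds.example, balanced and shuffled.
fold.id <- sample(rep(seq_len(nfolds.example), length.out = n.rows))

# In each round, one fold is held out for validation and the rest train.
for (k in seq_len(nfolds.example)) {
  train.idx <- which(fold.id != k)
  valid.idx <- which(fold.id == k)
  stopifnot(length(intersect(train.idx, valid.idx)) == 0)
}

# Number of rows per fold.
table(fold.id)
```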

Examples

S9Site54.dataML <- DataProcess(MSstatsQC::S9Site54[, ])
#> Your data is ready to go!
colnames(S9Site54.dataML)[1] <- c("idfile")
colnames(S9Site54.dataML)[2] <- c("peptide")
S9Site54.dataML$peptide <- as.factor(S9Site54.dataML$peptide)
S9Site54.dataML$idfile <- as.numeric(S9Site54.dataML$idfile)
S9Site54.dataML <- within(S9Site54.dataML, rm(Annotations, missing))
guide.set <- dplyr::filter(S9Site54.dataML, idfile <= 20)
# \donttest{
MSstatsQC.ML.trainR(guide.set, sim.size = 10)
#> creating full factorial with 32 runs ...
#> creating full factorial with 32 runs ...
#> Error in QcClassifier_data_var(guide.set, nmetric, factor.names, sim.size * 1, peptide.colname, L = a, U = b): object 'guide.set' not found
# }