A function to train random forest classifiers for QC data

MSstatsQC.ML.sim.size.detectR(guide.set, sim.start, sim.end)

Arguments

guide.set: comma-separated (.csv) metric file. It should contain a "Precursor" column and the metric columns. It should also include an "Annotations" column with an entry for each run.
sim.start: minimum simulation size.
sim.end: maximum simulation size.
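
As a rough illustration of the guide.set layout described above, the following base R sketch builds a toy guide set and checks it for the documented columns. The object name guide.set.sketch, the peptide labels, the metric names RetentionTime and TotalArea, and all values are hypothetical and chosen only for illustration; the sketch also assumes that every column other than "Precursor" and "Annotations" is a metric. It does not call any MSstatsQC.ML function.

```r
# Hypothetical toy guide set; metric names and values are illustrative only.
guide.set.sketch <- data.frame(
  Precursor = rep(c("PEPTIDE_A", "PEPTIDE_B"), each = 3),
  Annotations = NA_character_,
  RetentionTime = c(10.1, 10.2, 10.0, 25.3, 25.4, 25.2),
  TotalArea = c(1.2e6, 1.1e6, 1.3e6, 8.0e5, 8.2e5, 7.9e5),
  stringsAsFactors = FALSE
)

# Check for the columns named in the guide.set argument description.
required <- c("Precursor", "Annotations")
missing.cols <- setdiff(required, colnames(guide.set.sketch))
stopifnot(length(missing.cols) == 0)

# Assumption: every remaining column is a metric and should be numeric.
metric.cols <- setdiff(colnames(guide.set.sketch), required)
stopifnot(all(vapply(guide.set.sketch[metric.cols], is.numeric, logical(1))))

# Round trip through a temporary .csv file, the format named for guide.set.
csv.path <- tempfile(fileext = ".csv")
write.csv(guide.set.sketch, csv.path, row.names = FALSE)
reloaded <- read.csv(csv.path)
```

The example in the Examples section passes a processed data frame rather than a file path, so a check like this would most plausibly run on the data frame before it is handed to the function.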

Value

A plot of simulation size (sim.size) versus performance.

Examples

# First process the data to make sure it's ready to use
S9Site54.dataML <- DataProcess(MSstatsQC::S9Site54[, ])
#> Your data is ready to go!
colnames(S9Site54.dataML)[1] <- "idfile"
colnames(S9Site54.dataML)[2] <- "peptide"
S9Site54.dataML$peptide <- as.factor(S9Site54.dataML$peptide)
S9Site54.dataML$idfile <- as.numeric(S9Site54.dataML$idfile)
S9Site54.dataML <- within(S9Site54.dataML, rm(Annotations, missing))
guide.set <- dplyr::filter(S9Site54.dataML, idfile <= 20)
# \donttest{
MSstatsQC.ML.sim.size.detectR(guide.set, sim.start = 10, sim.end = 2500)
#> creating full factorial with 32 runs ...
#> creating full factorial with 32 runs ...
#> Error in QcClassifier_data_var(guide.set, nmetric, factor.names, sim.size *     1, peptide.colname, L = a, U = b): object 'guide.set' not found
# }