MSstatsQC.ML.trainR.Rd
A function to train random forest classifiers for QC data
MSstatsQC.ML.trainR(
guide.set,
sim.size,
guide.set.annotations = NULL,
nfolds = NULL,
a = 1.5,
b = 2
)
guide.set: comma-separated (.csv) metric file. It should contain a "Precursor" column and the metric columns. It should also include "Annotations" for each run.
sim.size: simulation size.
guide.set.annotations: comma-separated (.csv) metric file with annotations such as pass and fail.
nfolds: number of folds for cross-validation.
a: lower threshold to define shift size.
b: upper threshold to define shift size.
A trained model and performance indicators from train/validation/test splits
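The nfolds argument and the train/validation splits mentioned above refer to cross-validation. As a rough illustration of the general idea, and not of this package's internal code, the base R sketch below assigns rows to folds and loops over them. The row count and the value of nfolds are assumptions chosen for the example.

```r
# Hypothetical illustration only: this is not MSstatsQC.ML's internal code.
set.seed(1)      # make the shuffle reproducible
n_rows <- 20     # illustrative number of rows (assumption)
nfolds <- 5      # assumed value; the documented default is NULL

# Assign each row to one of `nfolds` folds of near-equal size, in random order.
fold_id <- sample(rep(seq_len(nfolds), length.out = n_rows))

# For each fold, the held-out rows validate a model trained on the rest.
for (k in seq_len(nfolds)) {
  validation_rows <- which(fold_id == k)
  training_rows   <- which(fold_id != k)
  cat("fold", k, ": train on", length(training_rows),
      "rows, validate on", length(validation_rows), "rows\n")
}
```

Every row lands in exactly one validation fold, so each observation is used for validation once and for training in the remaining folds.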
S9Site54.dataML <- DataProcess(MSstatsQC::S9Site54[, ])
#> Your data is ready to go!
# Rename the first two columns to idfile and peptide.
colnames(S9Site54.dataML)[1] <- "idfile"
colnames(S9Site54.dataML)[2] <- "peptide"
S9Site54.dataML$peptide <- as.factor(S9Site54.dataML$peptide)
S9Site54.dataML$idfile <- as.numeric(S9Site54.dataML$idfile)
# Drop the Annotations and missing columns.
S9Site54.dataML <- within(S9Site54.dataML, rm(Annotations, missing))
# Keep rows with idfile <= 20 as the guide set.
guide.set <- dplyr::filter(S9Site54.dataML, idfile <= 20)
# \donttest{
MSstatsQC.ML.trainR(guide.set, sim.size = 10)
#> creating full factorial with 32 runs ...
#> creating full factorial with 32 runs ...
#> Error in QcClassifier_data_var(guide.set, nmetric, factor.names, sim.size * 1, peptide.colname, L = a, U = b): object 'guide.set' not found
# }
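The a and b arguments are described only as lower and upper thresholds that define the shift size. One plausible reading, assumed here and not confirmed by this page, is that simulated out-of-control runs shift a metric by between a and b guide-set standard deviations. The base R sketch below illustrates that reading with made-up values. The names guide_metric, shift and shifted_metric are hypothetical.

```r
# Hypothetical illustration only: treating `a` and `b` as standard-deviation
# multipliers is an assumption, not taken from the package source.
set.seed(42)
a <- 1.5   # lower bound on shift size (the documented default)
b <- 2     # upper bound on shift size (the documented default)

# Stand-in for one in-control metric from a guide set (made-up values).
guide_metric <- rnorm(20, mean = 30, sd = 0.5)
mu    <- mean(guide_metric)
sigma <- sd(guide_metric)

# Draw shift magnitudes between a and b standard deviations, with random sign.
sim_size <- 10
shift <- runif(sim_size, min = a, max = b) *
  sample(c(-1, 1), sim_size, replace = TRUE)

# Simulated out-of-control observations: in-control noise plus the mean shift.
shifted_metric <- rnorm(sim_size, mean = mu + shift * sigma, sd = sigma)
```

Under this reading, a larger gap between a and b would produce a wider range of simulated shifts for the classifier to learn from.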