This document lists all the changes made to the source code of rpart to transform the package into distRforest.
rpart to distRforest
useDynLib(distRforest, .registration = TRUE, .fixes = "C_")
R_init_distRforest(DllInfo * dll)
ns <- asNamespace("distRforest")
library.dynam.unload("distRforest", libpath)
Add SEXP ncand2 and SEXP seed2 as input arguments to SEXP rpart(...)
void sample(int *vector, int x, int size);
Add int ncand; and int seed; to EXTERN struct {...} rp;
Add SEXP ncand2 and SEXP seed2 as input arguments to rpart(...)
Store these values in the rp struct by rp.ncand = asInteger(ncand2); and rp.seed = asInteger(seed2);
srand(rp.seed);
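The seeding steps above could look roughly like the following in plain C. This is a minimal sketch, not the package's code: rp_sketch, rp_s and init_sampling_sketch are hypothetical names, and plain int arguments stand in for the asInteger(ncand2) and asInteger(seed2) conversions so that the example compiles without the R headers.

```c
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the global rp struct. */
struct rp_sketch {
    int nvar;   /* number of available variables */
    int ncand;  /* number of split candidates sampled per node */
    int seed;   /* seed for the C random number generator */
};

static struct rp_sketch rp_s;

/* Copy the sampling settings into the struct and seed rand(). */
void init_sampling_sketch(int nvar, int ncand, int seed)
{
    rp_s.nvar = nvar;
    rp_s.ncand = ncand;
    rp_s.seed = seed;
    srand((unsigned int) rp_s.seed);
}
```

Calling init_sampling_sketch again with the same seed restarts rand() at the same sequence, which is what makes a fit reproducible for a fixed seed.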
Add ncand, seed and redmem = FALSE as input arguments to rpart <- function(...)
if (missing(parms)) parms <- NULL
Change missing(parms) to is.null(parms)
Set ncand equal to the number of available variables when the input argument is missing
Set seed equal to 1 when the input argument is missing
Add as.integer(ncand) and as.integer(seed) as input arguments to rpfit <- .Call(C_rpart,...)
Add if (redmem) for(x in c(...)) ans[[x]] <- NULL to reduce the memory footprint of the fitted rpart object
{"rpart", (DL_FUNC) &rpart, 13}
Add int candidates[rp.ncand]; to allocate a vector to store the split candidates in
Add sample(candidates, rp.nvar, rp.ncand); to sample the split candidates
Change rp.nvar to rp.ncand in for (i = 0; i < rp.ncand; i++) to iterate over the sampled subset
Change i to candidates[i] in the for body to select the correct split candidate from the sample
Add rforest, predict.rforest and importance_rforest to export(...)
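The split-candidate sampling listed above (the candidates vector, the sample(...) call and the loop over rp.ncand) can be sketched as follows. This is an illustration under assumptions, not the package's implementation: sample_sketch and mark_candidates_sketch are hypothetical names, sample(vector, x, size) is assumed to draw size distinct indices from 0 to x - 1, a partial Fisher-Yates shuffle is swapped in for whatever the package does internally, and malloc replaces the variable-length array to keep the sketch portable.

```c
#include <stdlib.h>

/* Hypothetical stand-in for sample(vector, x, size): writes `size` distinct
   indices drawn from 0..x-1 into `vector`.
   Returns 0 on success, -1 on invalid input or allocation failure. */
int sample_sketch(int *vector, int x, int size)
{
    int i, j, tmp;
    int *pool;

    if (vector == NULL || x <= 0 || size < 0 || size > x)
        return -1;
    pool = malloc((size_t) x * sizeof *pool);
    if (pool == NULL)
        return -1;
    for (i = 0; i < x; i++)
        pool[i] = i;
    /* Partial Fisher-Yates shuffle: fix one position per step. */
    for (i = 0; i < size; i++) {
        j = i + rand() % (x - i);   /* modulo bias is ignored in this sketch */
        tmp = pool[i];
        pool[i] = pool[j];
        pool[j] = tmp;
        vector[i] = pool[i];
    }
    free(pool);
    return 0;
}

/* Sets visited[v] = 1 for every variable v that the split search would
   consider, mirroring the loop over candidates[i] instead of i. */
int mark_candidates_sketch(int nvar, int ncand, int *visited)
{
    int i, status;
    int *candidates;

    if (visited == NULL || ncand <= 0)
        return -1;
    candidates = malloc((size_t) ncand * sizeof *candidates);
    if (candidates == NULL)
        return -1;
    status = sample_sketch(candidates, nvar, ncand);
    if (status == 0) {
        for (i = 0; i < ncand; i++)
            visited[candidates[i]] = 1;   /* candidates[i], not i */
    }
    free(candidates);
    return status;
}
```

With ncand equal to nvar every variable is visited, so the sketch falls back to an ordinary exhaustive split search; smaller values of ncand give the random subset that a random forest needs at each node.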
method options
anovapred.c file in anova.c as the anovapred(...) function
extern int lognormalinit(...) and extern int gammainit(...)
extern int lognormaldev(...) and extern int gammadev(...)
extern int lognormal(...) and extern int gammasplit(...)
extern int lognormalpred(...) and extern int gammapred(...)
"lognormal" and "gamma" to vector of possible methodsmethod %in% c('lognormal','gamma') && xval > 0L
rpart vignettes in the distRforest package
distRforest with examples for classification, Poisson and gamma regression
rpart tests in the distRforest package
method = 'gamma' and method = 'lognormal' options added to rpart(...)
rforest() is able to replicate identical trees as obtained with rpart()
rforest() happens as expected
predict.rforest() are being generated as expected
importance_rforest() calculates the variable importance scores as expected