Parallel processing is a powerful technique to speed up computations by utilizing multiple CPU cores simultaneously. In R, several functions and packages enable parallel processing, making it easier to handle large datasets and complex calculations efficiently. This blog post will introduce you to some of these key functions, such as mclapply, parLapply, and parSapply, and demonstrate how to use them in your R scripts.
### Check the number of cores
library(parallel)  # provides detectCores(), mclapply(), parLapply(), parSapply()
detectCores()
[1] 8
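The count reported above can be turned into a worker count in more than one way. The sketch below assumes a common convention of leaving one core free for the rest of the system; the helper name pick_workers is hypothetical and not part of the parallel package.

```r
library(parallel)

# detectCores() reports logical cores by default and may return NA
# when the count cannot be determined on the current platform.
n_detected <- detectCores()

# Hypothetical helper: leave one core free, but never drop below one worker.
pick_workers <- function(n) {
  if (is.na(n)) {
    return(1L)  # unknown core count: fall back to a single worker
  }
  max(1L, as.integer(n) - 1L)
}

n_workers <- pick_workers(n_detected)
n_workers
```

On a machine where detectCores() returns 8, as above, this gives 7 workers.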
mclapply
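### mclapply forks the R process, so it runs in parallel only on Unix-like systems (Linux, macOS); on Windows it falls back to lapply
library(lme4)  # provides lmer()
### Fit the same mixed model on every call; the index i is unused and only serves to repeat the fit
f <- function(i) {
  lmer(Petal.Width ~ . - Species + (1 | Species), data = iris)
}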
system.time(save1 <- lapply(1:100, f))
user system elapsed
0.618 0.005 0.628
system.time(save2 <- mclapply(1:100, f))
user system elapsed
0.033 0.031 0.457
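mclapply takes its worker count from the mc.cores argument, which defaults to getOption("mc.cores", 2L), so a call without it uses two workers unless that option is set. The sketch below sets mc.cores explicitly and stays portable by dropping to one core on Windows, where values above 1 raise an error. The function square is a toy stand-in (an assumption for illustration), not the lmer fit from the post.

```r
library(parallel)

# mc.cores > 1 is an error on Windows, so use a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Toy stand-in for an expensive function.
square <- function(i) i^2

res_seq <- lapply(1:8, square)                        # sequential baseline
res_par <- mclapply(1:8, square, mc.cores = cores)    # forked workers on Unix-like systems

# The parallel result should match the sequential one element for element.
identical(res_seq, res_par)
```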
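parLapply
### Works on Windows too, but is usually slower than mclapply because each worker is a separate R process that has to be started and sent its data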
numCores <- detectCores()
### Starting a cluster
cl <- makeCluster(numCores)
parSapply(cl, Orange, mean, na.rm = TRUE)
Tree age circumference
NA 922.1429 115.8571
### Close the cluster when you are done (best practice)
stopCluster(cl)
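If the parallel call fails partway, a stopCluster() placed after it never runs and the worker processes linger. One way to guard against that, sketched here under the assumption that two workers are enough for the example, is to wrap the work in tryCatch() with a finally clause. The sketch also keeps only the numeric columns of Orange, since Tree is a factor and mean() returns NA for it, as in the output above.

```r
library(parallel)

# Keep only numeric columns; Tree is an ordered factor, so mean() would give NA.
numeric_cols <- Filter(is.numeric, as.list(Orange))

cl <- makeCluster(2)  # two workers keep the example light

col_means <- tryCatch(
  parSapply(cl, numeric_cols, mean, na.rm = TRUE),
  finally = stopCluster(cl)  # runs whether the call succeeds or fails
)

col_means
```

The result is a named numeric vector with one mean per remaining column (age and circumference).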
### lapply
system.time({save1 <- lapply(1:100, f)})
user system elapsed
0.645 0.011 0.691
### mclapply
system.time({save2 <- mclapply(1:100, f)})
user system elapsed
0.031 0.033 0.459
### parLapply
system.time({
  cl <- makeCluster(detectCores())
  clusterEvalQ(cl, library(lme4))  # load lme4 on every worker so f can call lmer()
  save3 <- parLapply(cl, 1:100, f)
  stopCluster(cl)
})
user system elapsed
0.115 0.017 1.215
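The parLapply timing above includes starting the cluster and loading lme4 on every worker, not just the model fits. To see how much of the elapsed time is start-up, the phases can be timed separately. The sketch below does this with a toy function, slow_task, which is a hypothetical stand-in that sleeps briefly instead of fitting a model, so the numbers will differ from those in the post.

```r
library(parallel)

# Toy stand-in for an expensive task: sleep briefly, then return a value.
slow_task <- function(i) {
  Sys.sleep(0.05)
  i * 2
}

# Time cluster start-up on its own.
t_setup <- system.time(cl <- makeCluster(2))

# Time only the parallel work on the already running cluster.
t_work <- system.time(res_par <- parLapply(cl, 1:8, slow_task))
stopCluster(cl)

# Sequential baseline for comparison.
t_seq <- system.time(res_seq <- lapply(1:8, slow_task))

# Elapsed seconds for each phase.
c(setup      = t_setup[["elapsed"]],
  parallel   = t_work[["elapsed"]],
  sequential = t_seq[["elapsed"]])
```

When a cluster is reused for many calls, the start-up cost is paid once, which is why it can be worth creating the cluster outside the timed or repeated section.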