# R versus Python: Loops, required Time

Here is my illustration of the well-known “R versus Python” competition, measured as the time each language needs to run a loop that generates pseudo-random numbers.

To accomplish the task, the following steps were performed in Python and R:

• loop 100k times (i is the loop index)
• at each step, draw one random integer from a set whose size equals the current step number: 1 to i in R, 0 to i in Python (where i starts at 0, so the step number is i+1)
• record the elapsed time at the probe steps: i (i+1 for Python) in [1,10,100,1000,5000,10000,25000,50000,75000,100000]

The results are shown in the plot below.

The following conclusions can be drawn:

• Python is indeed faster than R when the number of iterations is below 1000. At 100 steps or fewer, Python is about 5 to 13 times faster than the R “for” loop (per the timings in the plot code), while from about 1000 steps upward R matches or beats Python when the lapply function is used!
• Try to avoid the “for” loop in R, especially when the number of steps exceeds 1000. Use the functions lapply/sapply/vapply instead.
• The run time of the R “for” loop starts to run away at about 10k steps. A likely cause is that vec <- c(vec, ...) copies the whole vector on every iteration, so the total work grows quadratically with the number of steps.
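The speed ratios behind these conclusions can be recomputed from the timings listed in the plot code further down. The sketch below is only a convenience check: the variable names (steps, py_for, r_for, r_lapply) are mine, and the numbers are copied verbatim from the plot data.

```python
# Timings in seconds, copied from the plot data (xDat / yDat).
steps = [1, 10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000]
py_for = [0.000655, 0.000827, 0.00184, 0.012137, 0.052789,
          0.101116, 0.248447, 0.435075, 0.622154, 0.808063]
r_for = [0.008551, 0.008749, 0.009103, 0.015438, 0.057638,
         0.151765, 0.717195, 2.663847, 6.110421, 10.980278]
r_lapply = [0.002781, 0.005598, 0.005958, 0.00921, 0.05569,
            0.075909, 0.134087, 0.231207, 0.327564, 0.426327]

# How many times slower each R variant is than the Python loop at each probe step.
# Values below 1 mean that R is faster.
ratio_r_for = [r / p for r, p in zip(r_for, py_for)]
ratio_r_lapply = [r / p for r, p in zip(r_lapply, py_for)]

for n, a, b in zip(steps, ratio_r_for, ratio_r_lapply):
    print(f"{n:>6} steps: R for / Python = {a:5.2f}, R lapply / Python = {b:5.2f}")
```

A ratio above 1 means Python is faster at that probe step; a ratio below 1 means the R variant wins.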

The snippets of the code used for the task are below.

Python code:

``````
from numpy import random as rand
import datetime as dt

# number of loop iterations
n_elements = int(1e5)
# probe points (1-based step numbers)
x = [1,10,100,1000,5000,10000,25000,50000,75000,100000]

# for loop
t = dt.datetime.now()
vec = []
elapsed = []

for i in range(n_elements):
    # draw one integer from 0..i (i+1 candidates)
    vec.append(rand.choice(i+1, size=1, replace=True))
    # i is 0-based, so compare i+1 against the probe points
    if i+1 in x:
        elapsed.append((dt.datetime.now() - t).total_seconds())
``````
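For comparison, the same draws can also be made without an explicit Python loop. The sketch below is not part of the original benchmark: it assumes NumPy 1.17 or later, where Generator.integers accepts an array of upper bounds, and the helper name draw_all is hypothetical.

```python
import time
import numpy as np

def draw_all(n_elements, seed=None):
    """Draw one integer from 0..i for every i in 0..n_elements-1, in a single call."""
    rng = np.random.default_rng(seed)
    # Exclusive upper bounds 1, 2, ..., n_elements: element i is drawn from 0..i.
    upper = np.arange(1, n_elements + 1)
    return rng.integers(0, upper)

start = time.perf_counter()
vec = draw_all(int(1e5), seed=0)
elapsed = time.perf_counter() - start
print(f"{len(vec)} draws in {elapsed:.4f} s")
```

This plays the same role for Python that lapply plays for R in this post: the per-step work is handed to compiled code instead of being driven by an interpreted loop.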

R code:

``````
library(magrittr)
# number of loop iterations
n_elements <- 1e5
# probe points
x <- c(1,10,100,1000,5000,10000,25000,50000,75000,100000)

# for loop
t <- Sys.time()
vec <- NULL
elapsed <- NULL

for (i in seq_len(n_elements))
{
  # growing vec with c() copies it on every iteration
  vec <- c(vec, sample(i, size = 1, replace = TRUE))
  if (i %in% x)
    # units must be named: the third positional argument of difftime is tz
    elapsed <- c(elapsed, as.numeric(difftime(Sys.time(), t, units = 'secs')))
}

# lapply function
t <- Sys.time()
vec <- NULL
elapsed_sapply <- lapply(seq_len(n_elements), function(i) {
  # note: this assignment is local to the function; the outer vec stays NULL
  vec <- c(vec, sample(i, size = 1, replace = TRUE))
  if (i %in% x)
    return(as.numeric(difftime(Sys.time(), t, units = 'secs')))
}) %>% Filter(Negate(is.null), .) %>% unlist()
``````
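The runaway of the R “for” loop is consistent with the cost of vec <- c(vec, ...), which builds a new vector and copies the old elements on every iteration. The Python sketch below imitates that pattern with numpy.concatenate and counts the copied elements. It illustrates the growth pattern under that assumption and is not a measurement of R itself; the function name grow_by_copy is hypothetical.

```python
import numpy as np

def grow_by_copy(n_elements):
    """Grow an array one element at a time, the way c(vec, x) does in R.

    Returns the final array and the number of old elements that had to be copied.
    """
    vec = np.empty(0, dtype=np.int64)
    copied = 0
    for i in range(n_elements):
        copied += len(vec)  # concatenate copies every existing element
        vec = np.concatenate((vec, np.array([i], dtype=np.int64)))
    return vec, copied

for n in (10, 100, 1000):
    _, copied = grow_by_copy(n)
    # copied equals n * (n - 1) / 2, so it grows quadratically with n
    print(n, copied)
```

Ten times more steps means roughly a hundred times more copying, which is in line with the R “for” timings rising from about 0.15 s at 10k steps to about 11 s at 100k steps.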

Plot code:

``````
var xDat = [1,10,100,1000,5000,10000,25000,50000,75000,100000];
var yDat = [
  [0.000655,0.000827,0.00184,0.012137,0.052789,0.101116,0.248447,0.435075,0.622154,0.808063],
  [0.008551,0.008749,0.009103,0.015438,0.057638,0.151765,0.717195,2.663847,6.110421,10.980278],
  [0.002781,0.005598,0.005958,0.00921,0.05569,0.075909,0.134087,0.231207,0.327564,0.426327]
];
var labels = ['Python: "for" loop', 'R: "for" loop', 'R: lapply'];
var data = [];

for (var i = 0; i < 3; i++) {
  var trace = {
    x: xDat,
    y: yDat[i],
    mode: 'lines+markers',
    line: {
      width: 2
    },
    name: labels[i]
  };

  data.push(trace);
}

var layout = {
  showlegend: true,
  height: 600,
  width: 800,
  title: 'R v. Python: Loops, required Time Comparison',
  xaxis: {
    title: 'Number of steps in the loop',
    type: 'log',
    autorange: true
  },
  yaxis: {
    title: 'Elapsed time [s]',
    autorange: false,
    range: [0, 10]
  }
};

Plotly.newPlot('r_vs_py_loops', data, layout);
``````