Loops

Over and Over again…

We could explicitly call the function as many times need be to clean the tables we’re interested in.

table9 <- clean_table(file_01)

table10 <- clean_table(file_02)

table11 <- clean_table(file_03)

But that would be redundant! Imagine if we were reading in 15 of those csvs! There would be extra lines of code and as many extra variable names in your global environment to keep track of. Let’s see how loops paired with lists can help us!

Anatomy

A for loop has three parts:

  1. The Output: stuff can be stored or perform an action
  2. The Sequence: code within () that shows what to loop over
  3. The Body: code within {} that does the work

Sequence

The sequence lies within the (). It will follow this structure: ([variable name of your choice] in [list or vector])

The sequence tells the machine what to loop over. The variable name of your choice will represent a single element within the list or vector in a loop.

In the example below, with every iteration, df will be a counter and represent a different data frame in l

l <- list(file_01, file_02, file_03)

# for every data frame (df) in list (l)...
for(df in l) {
  
}

Try printing the head() of each data frame in our list

l <- list(file_01, file_02, file_03) 

for(df in l) {
  print(head(df))
}

Try printing a version of each data frame with clean_table()

for(df in l) {
  print(clean_table(df))
}

Output

Now instead of printing stuff, let’s store stuff in a list!

A way to use both lists and loops is to read data. Let’s create a loop to read-in csvs 1 through 11 and store them into a list.

dfs <- list()

csv <- paste0('Table', 1:11, '.csv')

for(c in csv) {
  t <- read_csv(here('data', '050', c))
  dfs[[c]] <- t
}

Rename the elements in the list

names(dfs) <- paste0('Table', 1:11)

Put it all together

Edit the loop so that we clean the tables as we’re reading in the csvs.

for(c in csv) {
  t <- read_csv(here('data', '050', c))
  ct <- clean_table(t)
  dfs[[c]] <- ct
}

With loops and a list to store the output, we’ve removed code redundancy. Instead of calling clean_table() for every table in our list, it just required editing a couple lines within the loop to make that adjustment.

Alter the Flow

Not everything has to be uniformly applied within a loop. Exceptions are allowed and part of control flow includes skipping to the next element or breaking out of the loop altogether.

Break

To exit the loop entirely, use break with an if statement.

dfs <- list()

csv <- paste0('Table', 1:11, '.csv')

for(c in csv) {
  if(c == 'Table6.csv') break
  
  t <- read_csv(here('data', '050', c))
  dfs[[c]] <- t
}

Next

To exit the current iteration, use next with an if statement. The code below only reads in odd numbered tables.

dfs <- list()

csv <- paste0('Table', 1:11, '.csv')
even_num <- seq(from = 2, to = length(csv), by = 2)

for(c in csv) {
  if(c %in% paste0('Table', even_num, '.csv')) next
  
  t <- read_csv(here('data', '050', c))
  dfs[[c]] <- t
}