How To Get A Peek at Dataframe in R

Getting a quick look at a dataframe to understand its variables and their data types is an important part of data analysis. If you are used to working with Excel, your first impulse may be to open the data in Excel. However, looking at the data programmatically in R has many advantages, including the safety of not changing the data file by mistake.

In this post, we will see three ways to get a peek at the data in a dataframe in R. We will first use tidyverse’s glimpse() function to get a glimpse of a dataframe, then see how to look at the top or bottom few rows of the data frame with head() and tail(), and finally see how to browse the data with the view() function in R.

Let us first load the tidyverse suite of R packages.

library("tidyverse")

We will use the fantastic Penguins dataset to illustrate the three ways to see data in a dataframe. Let us load the data from cmdlinetips.com’s GitHub page.

path2data <- "https://raw.githubusercontent.com/cmdlinetips/data/master/palmer_penguins.csv"
penguins <- readr::read_csv(path2data)
## Parsed with column specification:
## cols(
##   species = col_character(),
##   island = col_character(),
##   bill_length_mm = col_double(),
##   bill_depth_mm = col_double(),
##   flipper_length_mm = col_double(),
##   body_mass_g = col_double(),
##   sex = col_character()
## )

We will look at three different ways to get a quick look at a data frame in R.

1. glimpse(): Get a glimpse of the data and datatype

The glimpse() function in tidyverse is from the tibble package and is great for viewing the columns/variables in a dataframe. It also shows the data type of each column and its first few values, with one column per row of output.

glimpse(penguins)

Here is the output of the glimpse() function. It starts off with the number of rows and columns, followed by each column on a separate row.

## Rows: 344
## Columns: 7
## $ species           <chr> "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "…
## $ island            <chr> "Torgersen", "Torgersen", "Torgersen", "Torgersen",…
## $ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1,…
## $ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1,…
## $ flipper_length_mm <dbl> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 18…
## $ body_mass_g       <dbl> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475,…
## $ sex               <chr> "male", "female", "female", NA, "female", "male", "…
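The facts glimpse() reports at the top (row count, column count, and per-column type) can also be cross-checked with base R. Here is a minimal sketch, assuming a small hypothetical data frame called `toy` that stands in for the penguins data. Note that base R reports the class of a double column as "numeric", where glimpse() prints `<dbl>`.

```r
# Hypothetical toy data frame standing in for the penguins data
toy <- data.frame(
  species = c("Adelie", "Adelie", "Chinstrap"),
  bill_length_mm = c(39.1, NA, 45.7),
  stringsAsFactors = FALSE
)

n_rows <- nrow(toy)              # number of rows
n_cols <- ncol(toy)              # number of columns
col_types <- sapply(toy, class)  # named character vector: column name -> class

print(n_rows)
print(n_cols)
print(col_types)
```

This is only a cross-check, not a replacement for glimpse(), since it does not show any of the actual values.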

2. head(): See the first n rows of a data frame

The head() function lets you look at the top n rows of a dataframe. By default, it shows the first 6 rows of a dataframe.

head(penguins)

## # A tibble: 6 x 7
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <chr>   <chr>           <dbl>         <dbl>            <dbl>       <dbl> <chr>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male 
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA> 
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## 6 Adelie  Torge…           39.3          20.6              190        3650 male

We can specify the number of rows we want to see in a dataframe with the argument “n”. In the example below, we use n=3 to look at the first three rows of a data frame.


head(penguins, n=3)
## # A tibble: 3 x 7
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <chr>   <chr>           <dbl>         <dbl>            <dbl>       <dbl> <chr>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male 
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
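The “n” argument also accepts a negative value, in which case head() returns everything except the last |n| rows. Here is a minimal sketch on a small hypothetical data frame (`toy` and its columns are made up for illustration):

```r
# Hypothetical toy data frame with five rows
toy <- data.frame(
  id = 1:5,
  mass_g = c(3750, 3800, 3250, 3450, 3650)
)

first_two <- head(toy, n = 2)          # rows 1 and 2
all_but_last_two <- head(toy, n = -2)  # drops the last two rows, keeps rows 1 to 3

print(first_two)
print(all_but_last_two)
```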

2. tail(): See the last n rows of a data frame

The function tail() is the counterpart to head(). tail() lets you take a look at the bottom n rows of a dataframe. We can adjust the number of rows with the argument “n”, as with the head() function.

tail(penguins)
## # A tibble: 6 x 7
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <chr>   <chr>           <dbl>         <dbl>            <dbl>       <dbl> <chr>
## 1 Chinst… Dream            45.7          17                195        3650 fema…
## 2 Chinst… Dream            55.8          19.8              207        4000 male 
## 3 Chinst… Dream            43.5          18.1              202        3400 fema…
## 4 Chinst… Dream            49.6          18.2              193        3775 male 
## 5 Chinst… Dream            50.8          19                210        4100 male 
## 6 Chinst… Dream            50.2          18.7              198        3775 fema…
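As with head(), the “n” argument of tail() can be set explicitly, and a negative value drops rows from the top instead. Here is a minimal sketch on a small hypothetical data frame (`toy` is made up for illustration):

```r
# Hypothetical toy data frame with five rows
toy <- data.frame(
  id = 1:5,
  mass_g = c(3750, 3800, 3250, 3450, 3650)
)

last_two <- tail(toy, n = 2)            # rows 4 and 5
all_but_first_two <- tail(toy, n = -2)  # drops the first two rows, keeps rows 3 to 5

print(last_two)
print(all_but_first_two)
```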

3. view(): View the data as table in RStudio

The third way to get a look at the data in a dataframe is to use the view() function. In RStudio, the view() function opens the dataframe in a separate tab in the source panel.

view(penguins)

It displays the data in a nice tabular form with the ability to sort columns. It is a bit like looking at the data in an Excel file, but in read-only mode.
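A viewer only makes sense in an interactive session, so in a script that may also be run from the command line it can help to guard the call. Here is a minimal sketch, assuming base R's View() (capital V, from the utils package) instead of tidyverse's view(), and a hypothetical data frame `toy`. Falling back to head() is just one possible choice.

```r
# Hypothetical toy data frame
toy <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  mass_g = c(3750, 3650, 5000)
)

if (interactive()) {
  # Opens the data viewer (a tab in the source panel when run inside RStudio)
  View(toy)
} else {
  # Non-interactive run (e.g. Rscript): print the first rows instead
  print(head(toy))
}
```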

View(): View Data in RStudio

This post is part of a series of posts covering tidyverse tips, tricks, and tutorials for learning data analysis and data munging skills in R with the tidyverse suite of R packages. Check here for more tidyverse 101 posts.