18  Lists

Author

Юрій Клебан

18.1 What are lists in R?

Lists are R objects that can contain elements of different types: numbers, strings, vectors, and even other lists. A list can also contain a matrix or a function as an element. A list is created with the list() function.
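As a minimal sketch of the description above, the following builds a small list that mixes several types. The object name mixed and its element names are made up for illustration only.

```r
# A list holding elements of different types (names are illustrative)
mixed <- list(
  number = 42,
  text   = "hello",
  vec    = c(1, 2, 3),
  mat    = matrix(1:4, nrow = 2),
  fun    = function(x) x^2,
  nested = list(a = 1, b = "two")
)

# Each element keeps its own type
sapply(mixed, typeof)

# A function stored in a list can be called like any other function
mixed$fun(4)
```

Unlike an atomic vector, the list does not coerce its elements to a common type.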

Before we start, let's look at one more package for working with dates: lubridate. It provides many functions for parsing and manipulating dates. Install it and browse its help with:

#install.packages("lubridate")
#??lubridate

For our example we need the function ymd(), which parses a character date in a format like “2012-10-25”.

library(lubridate)
date1 <- ymd("2021-05-25")
date2 <- ymd("2021-05-27")

date1
date2

You can also use ymd_hms() to parse a date together with a time.

datetime <- ymd_hms("2021-05-25 11:05:12", tz = "UTC") # we will need this for the client transactions
datetime
[1] "2021-05-25 11:05:12 UTC"
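Parsed dates support arithmetic and comparison. The sketch below uses base R's as.Date(), which accepts the same ISO format, so that it runs without extra packages. The variable names mirror the ones above, and days_between is introduced only for illustration.

```r
# Base R alternative: as.Date() parses "YYYY-MM-DD" strings by default
date1 <- as.Date("2021-05-25")
date2 <- as.Date("2021-05-27")

# Subtracting dates gives a time difference in days
days_between <- as.numeric(date2 - date1)
days_between

# Dates can be compared and shifted by whole days
date1 < date2
date1 + 7
```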

18.2 Creating a List

The following example creates a list containing strings, a logical value, a vector, and a data frame. Our list will describe a model of a bank's client:

# initial values
set.seed(1) # fix the random seed for reproducibility
library(lubridate)

client_name <- "John Doe"
services <- c("credit", "deposite", "online-app")
is_active <- TRUE
transactions <- data.frame(
  # two random contract ids, recycled over the six rows
  contract_id = sample(10000:99999, size = 2, replace = TRUE),
  datetime = ymd_hms(c("2021-05-25 11:05:12",
                       "2021-05-25 11:07:14",
                       "2021-05-25 11:08:02",
                       "2021-05-25 11:12:45",
                       "2021-05-25 11:47:00",
                       "2021-05-25 11:48:08")),
  oper_type = sample(0:1, size = 6, replace = TRUE), # 1 for debit, 0 for credit
  amount = round(sample(1:1000, size = 6) + runif(6), 2))

# make the amount negative for debit operations (oper_type == 1)
transactions$amount <- ifelse(transactions$oper_type == 1, -transactions$amount, transactions$amount)
transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
# creating a list from single objects, a vector, and a data frame
list_data <- list(client_name, is_active, services, transactions)
list_data
  1. 'John Doe'
  2. TRUE
  3.
    1. 'credit'
    2. 'deposite'
    3. 'online-app'
  4. A data.frame: 6 × 4
    contract_id  datetime             oper_type  amount
    <int>        <dttm>               <int>      <dbl>
    34387        2021-05-25 11:05:12  1          -187.72
    69520        2021-05-25 11:07:14  0           307.99
    34387        2021-05-25 11:08:02  0           993.38
    69520        2021-05-25 11:12:45  0           597.78
    34387        2021-05-25 11:47:00  1          -277.93
    69520        2021-05-25 11:48:08  1          -874.21
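Printing a whole list can be hard to read once it holds a data frame. A compact overview is available from base R's length() and str(). The small list client used here is a simplified stand-in with made-up values, not the same object as list_data.

```r
# Simplified stand-in for list_data (values are illustrative)
client <- list(
  "John Doe",
  TRUE,
  c("credit", "deposite", "online-app"),
  data.frame(contract_id = c(1L, 2L), amount = c(10.5, -3.2))
)

# Number of top-level elements
length(client)

# Compact structure: one entry per element with its type
str(client)

# Class of every element
sapply(client, class)
```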

18.3 Naming List Elements

It is better to name the elements of a list:

names(list_data) <- c("ClientName", "IsActive", "Services", "Transactions")
list_data
$ClientName
'John Doe'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
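Names do not have to be assigned afterwards. As a small sketch, they can also be given directly inside list(), and a single name can be changed by position. The list client is a shortened, illustrative version of the client model.

```r
# Names can be given directly in list()
client <- list(
  ClientName = "John Doe",
  IsActive   = TRUE,
  Services   = c("credit", "deposite", "online-app")
)

names(client)

# Rename a single element by position
names(client)[2] <- "Active"
names(client)
```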

You can extend a list on the fly with $:

list_data$ClientName
list_data$ClientId <- 11125489656
list_data
'John Doe'
$ClientName
'John Doe'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
$ClientId
11125489656
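The reverse operation works the same way: assigning NULL to an element removes it. The sketch below uses a shortened list client, and the field Segment is a hypothetical name added only to show the mechanics.

```r
client <- list(ClientName = "John Doe", IsActive = TRUE)

# Add a new element by assigning to a name that does not exist yet
client$ClientId <- 11125489656

# The same works with [[ ]] and a name given as a string
client[["Segment"]] <- "retail"   # hypothetical field
names(client)

# Assigning NULL removes an element
client$Segment <- NULL
names(client)
```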

18.4 Accessing List Elements

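Every element can be accessed by its index with [] or [[]]. Single brackets return a list that contains the element, while double brackets return the element itself:

# [ ] returns a one-element list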
list_data[1]
typeof(list_data[1])
$ClientName = 'John Doe'
'list'
# [[ ]] returns the object itself
list_data[[1]]
typeof(list_data[[1]])
'John Doe'
'character'
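Both bracket forms also accept names, and [] can select several elements at once. A minimal sketch on a shortened, illustrative list client:

```r
client <- list(
  ClientName = "John Doe",
  IsActive   = TRUE,
  Services   = c("credit", "deposite", "online-app")
)

# [ ] can select several elements at once and always returns a list
subset_list <- client[c("ClientName", "Services")]
class(subset_list)
length(subset_list)

# [[ ]] extracts exactly one element, by position or by name
client[["Services"]]

# Indexing can be chained to reach inside the extracted element
client[["Services"]][2]
```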

Access by name with $ is also possible:

list_data$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
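Because the extracted element is an ordinary data frame, $ can be chained to reach its columns. The sketch below rebuilds a three-row version of the transactions table by hand so that it runs on its own. The variable total is introduced for illustration.

```r
client <- list(
  ClientName   = "John Doe",
  Transactions = data.frame(
    oper_type = c(1, 0, 0),
    amount    = c(-187.72, 307.99, 993.38)
  )
)

# Column of the data frame stored in the list
client$Transactions$amount

# Total of all operations
total <- sum(client$Transactions$amount)
total

# Only the debit rows (oper_type == 1)
client$Transactions[client$Transactions$oper_type == 1, ]
```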

18.5 Manipulating List Elements

Let's continue using our list_data list.

list_data
$ClientName
'John Doe'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
$ClientId
11125489656

We can change data by index with [] or by name with $.

# changing the client name by index
list_data[1] <- "New Name"
list_data
$ClientName
'New Name'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
$ClientId
11125489656
# changing data by name with $
list_data$ClientName <- "John Doe"
list_data
$ClientName
'John Doe'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
$ClientId
11125489656
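Two further ways to modify a list are worth knowing: [[]] replaces a single element whatever its length, and modifyList() from the utils package (attached by default) updates several elements in one call. A minimal sketch on a shortened, illustrative list client:

```r
client <- list(ClientName = "John Doe", IsActive = TRUE)

# [[ ]] replaces the element itself, whatever its type or length
client[["ClientName"]] <- "Jane Doe"

# With [ ] the replacement is treated as a list of new values,
# so a longer vector should be wrapped in list()
client["Services"] <- list(c("credit", "online-app"))

# modifyList() updates existing elements and adds new ones in one call
client <- modifyList(client, list(IsActive = FALSE, ClientId = 1))
str(client)
```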

You can merge lists with the c() function. Let’s create a new list and attach it to list_data:

list_2 <- list(Consultant = list(Name = "David Cameron", PhoneNum = "+9562311855"))
list_2
$Consultant
    $Name
    'David Cameron'
    $PhoneNum
    '+9562311855'
list_data <- c(list_data, list_2)
list_data
$ClientName
'John Doe'
$IsActive
TRUE
$Services
  1. 'credit'
  2. 'deposite'
  3. 'online-app'
$Transactions
A data.frame: 6 × 4
contract_id  datetime             oper_type  amount
<int>        <dttm>               <int>      <dbl>
34387        2021-05-25 11:05:12  1          -187.72
69520        2021-05-25 11:07:14  0           307.99
34387        2021-05-25 11:08:02  0           993.38
69520        2021-05-25 11:12:45  0           597.78
34387        2021-05-25 11:47:00  1          -277.93
69520        2021-05-25 11:48:08  1          -874.21
$ClientId
11125489656
$Consultant
    $Name
    'David Cameron'
    $PhoneNum
    '+9562311855'
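It may help to see how c() differs from list() when combining lists: c() joins the elements into one flat list, while list() nests its arguments. The lists a and b below are illustrative, and append() is shown as a base R option for inserting at a chosen position.

```r
a <- list(x = 1, y = "two")
b <- list(z = TRUE)

# c() joins the elements of both lists into one flat list
flat <- c(a, b)
length(flat)    # 3 elements: x, y, z

# list() keeps the inputs as two nested lists
nested <- list(a, b)
length(nested)  # 2 elements, each of them a list

# append() can insert at a chosen position
inserted <- append(a, b, after = 1)
names(inserted)
```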

With unlist() you can convert a list to a vector.

list_demo <- list(1:10)
list_demo
class(list_demo)
typeof(list_demo)
    1. 1
    2. 2
    3. 3
    4. 4
    5. 5
    6. 6
    7. 7
    8. 8
    9. 9
    10. 10
'list'
'list'

list_demo * 5 # error: arithmetic operators do not work on a list

Error in list_demo * 5: non-numeric argument to binary operator

# apply the operation to each list element instead
lapply(list_demo, function(x) x * 5)
    1. 5
    2. 10
    3. 15
    4. 20
    5. 25
    6. 30
    7. 35
    8. 40
    9. 45
    10. 50
vector_demo <- unlist(list_demo)
vector_demo
class(vector_demo)
typeof(vector_demo)
  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
'integer'
'integer'
vector_demo * 5 # now it works
  1. 5
  2. 10
  3. 15
  4. 20
  5. 25
  6. 30
  7. 35
  8. 40
  9. 45
  10. 50
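Two details of unlist() are easy to miss: it flattens nested lists while building names from the nesting, and it coerces mixed types to a common type. The following sketch uses made-up objects (scores, flat, mixed) to show both, along with sapply() as a shortcut that returns a vector directly.

```r
# unlist() flattens nested lists and builds names from the nesting
scores <- list(a = 1:2, b = list(c = 3, d = 4))
flat <- unlist(scores)
flat
names(flat)

# Mixed types are coerced to a common type (here character)
mixed <- unlist(list(1, "two", TRUE))
mixed
class(mixed)

# sapply() returns a vector directly, avoiding lapply() + unlist()
sapply(list(1, 2, 3), function(x) x * 5)
```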

18.6 Tasks

18.6.1 Task 1

Write a function that calculates the sum, mean, median, minimum, and maximum of a given vector. Generate a sample vector of 10 elements in \([1;100]\).

Solution

x <- sample(10:100, size = 10)
print(x)
 [1] 94 46 98 99 43 53 88 42 44 79
vector_info <- function(values) {
  # collect the summary statistics in a named list
  list(
    Sum = sum(values),
    Mean = mean(values),
    Median = median(values),
    Min = min(values),
    Max = max(values)
  )
}

vector_info(x)
names(vector_info(x))
$Sum
686
$Mean
68.6
$Median
66
$Min
42
$Max
99
  1. 'Sum'
  2. 'Mean'
  3. 'Median'
  4. 'Min'
  5. 'Max'
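An alternative sketch of the same task stores the summary functions themselves in a named list, which ties back to the point that a list can hold functions. This is one possible variant, not the reference solution. The seed value and the names stats and result are arbitrary choices.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
x <- sample(1:100, size = 10)

# Store the summary functions themselves in a named list
stats <- list(Sum = sum, Mean = mean, Median = median, Min = min, Max = max)

# Apply every function to the same vector; the result is a named list
result <- lapply(stats, function(f) f(x))
str(result)

# Cross-check one entry against a direct call
result$Mean == mean(x)
```

Adding another statistic then only requires one more entry in stats.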
