21  Statistical functions

Author

Юрій Клебан

21.1 Loading the data

library(tidyverse)
?msleep
msleep {ggplot2}    R Documentation

An updated and expanded version of the mammals sleep dataset

Description

This is an updated and expanded version of the mammals sleep dataset. Updated sleep times and weights were taken from V. M. Savage and G. B. West. A quantitative, theoretical framework for understanding mammalian sleep. Proceedings of the National Academy of Sciences, 104 (3):1051-1056, 2007.

Usage

msleep

Format

A data frame with 83 rows and 11 variables:

name          common name
genus
vore          carnivore, omnivore or herbivore?
order
conservation  the conservation status of the animal
sleep_total   total amount of sleep, in hours
sleep_rem     rem sleep, in hours
sleep_cycle   length of sleep cycle, in hours
awake         amount of time spent awake, in hours
brainwt       brain weight in kilograms
bodywt        body weight in kilograms

Details

Additional variables order, conservation status and vore were added from wikipedia.


[Package ggplot2 version 3.3.6 ]
head(msleep)
A tibble: 6 × 11
name                       genus      vore  order        conservation sleep_total sleep_rem sleep_cycle awake brainwt bodywt
<chr>                      <chr>      <chr> <chr>        <chr>        <dbl>       <dbl>     <dbl>       <dbl> <dbl>   <dbl>
Cheetah                    Acinonyx   carni Carnivora    lc           12.1        NA        NA          11.9  NA      50.000
Owl monkey                 Aotus      omni  Primates     NA           17.0        1.8       NA          7.0   0.01550 0.480
Mountain beaver            Aplodontia herbi Rodentia     nt           14.4        2.4       NA          9.6   NA      1.350
Greater short-tailed shrew Blarina    omni  Soricomorpha lc           14.9        2.3       0.1333333   9.1   0.00029 0.019
Cow                        Bos        herbi Artiodactyla domesticated 4.0         0.7       0.6666667   20.0  0.42300 600.000
Three-toed sloth           Bradypus   herbi Pilosa       NA           14.4        2.2       0.7666667   9.6   NA      3.850

21.2 List of functions

Function          Description
range()           Range (minimum and maximum) of vector
min(), max()      Minimum or maximum of vector
mean(), median()  Mean or median of vector
sd()              Standard deviation of vector
table()           Number of observations per level for a factor vector
cor()             Determine correlation(s) between two or more vectors
summary()         Summary statistics, depends on class
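
Unlike most of the functions listed, range() returns two numbers at once. A minimal sketch, assuming a short illustrative vector x (six sleep_total values copied from the head(msleep) output; the names x, r and spread are not part of the original analysis):

```r
# Six sleep_total values copied from the head(msleep) output
x <- c(12.1, 17.0, 14.4, 14.9, 4.0, 14.4)

r <- range(x)        # numeric vector of length 2: c(min, max)
spread <- diff(r)    # max - min, the width of the range

r
spread
```

Because the result is an ordinary vector, r[1] and r[2] give the same values as min(x) and max(x).
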
summary(msleep)
     name              genus               vore              order          
 Length:83          Length:83          Length:83          Length:83         
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
                                                                            
 conservation        sleep_total      sleep_rem      sleep_cycle    
 Length:83          Min.   : 1.90   Min.   :0.100   Min.   :0.1167  
 Class :character   1st Qu.: 7.85   1st Qu.:0.900   1st Qu.:0.1833  
 Mode  :character   Median :10.10   Median :1.500   Median :0.3333  
                    Mean   :10.43   Mean   :1.875   Mean   :0.4396  
                    3rd Qu.:13.75   3rd Qu.:2.400   3rd Qu.:0.5792  
                    Max.   :19.90   Max.   :6.600   Max.   :1.5000  
                                    NA's   :22      NA's   :51      
     awake          brainwt            bodywt        
 Min.   : 4.10   Min.   :0.00014   Min.   :   0.005  
 1st Qu.:10.25   1st Qu.:0.00290   1st Qu.:   0.174  
 Median :13.90   Median :0.01240   Median :   1.670  
 Mean   :13.57   Mean   :0.28158   Mean   : 166.136  
 3rd Qu.:16.15   3rd Qu.:0.12550   3rd Qu.:  41.750  
 Max.   :22.10   Max.   :5.71200   Max.   :6654.000  
                 NA's   :27                          
msleep |> arrange(desc(bodywt)) |> tail()
A tibble: 6 × 11
name                       genus      vore    order        conservation sleep_total sleep_rem sleep_cycle awake brainwt bodywt
<chr>                      <chr>      <chr>   <chr>        <chr>        <dbl>       <dbl>     <dbl>       <dbl> <dbl>   <dbl>
Big brown bat              Eptesicus  insecti Chiroptera   lc           19.7        3.9       0.1166667   4.3   0.00030 0.023
House mouse                Mus        herbi   Rodentia     nt           12.5        1.4       0.1833333   11.5  0.00040 0.022
Deer mouse                 Peromyscus NA      Rodentia     NA           11.5        NA        NA          12.5  NA      0.021
Greater short-tailed shrew Blarina    omni    Soricomorpha lc           14.9        2.3       0.1333333   9.1   0.00029 0.019
Little brown bat           Myotis     insecti Chiroptera   NA           19.9        2.0       0.2000000   4.1   0.00025 0.010
Lesser short-tailed shrew  Cryptotis  omni    Soricomorpha lc           9.1         1.4       0.1500000   14.9  0.00014 0.005
mean(msleep$sleep_total)      # Mean
median(msleep$sleep_total)    # Median
max(msleep$sleep_total)       # Max
min(msleep$sleep_total)       # Min
sd(msleep$sleep_total)        # Standard deviation
var(msleep$sleep_total)       # Variance
quantile(msleep$sleep_total)  # Various quantiles
10.433734939759
10.1
19.9
1.9
4.45035699057058
19.8056773435204
   0%   25%   50%   75%  100% 
 1.90  7.85 10.10 13.75 19.90 
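
Two of these results are tied to each other: the variance is the squared standard deviation (4.45035699^2 is about 19.8057), and the interquartile range is the distance between the 25% and 75% quantiles (13.75 - 7.85 = 5.9 for sleep_total). A minimal sketch that checks both identities on a short illustrative vector (six sleep_total values from head(msleep); the names x, q and iqr_manual are hypothetical):

```r
# Six sleep_total values copied from the head(msleep) output
x <- c(12.1, 17.0, 14.4, 14.9, 4.0, 14.4)

q <- quantile(x, probs = c(0.25, 0.75))  # lower and upper quartiles
iqr_manual <- unname(q[2] - q[1])        # Q3 - Q1

all.equal(sd(x)^2, var(x))      # variance equals the squared standard deviation
all.equal(IQR(x), iqr_manual)   # IQR() equals the distance between the quartiles
```

The probs argument accepts any probabilities between 0 and 1, so the same call can return deciles or other cut points.
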
msleep[msleep$sleep_total > 8,]
A tibble: 61 × 11
name                           genus         vore    order           conservation sleep_total sleep_rem sleep_cycle awake brainwt bodywt
<chr>                          <chr>         <chr>   <chr>           <chr>        <dbl>       <dbl>     <dbl>       <dbl> <dbl>   <dbl>
Cheetah                        Acinonyx      carni   Carnivora       lc           12.1        NA        NA          11.9  NA      50.000
Owl monkey                     Aotus         omni    Primates        NA           17.0        1.8       NA          7.0   0.01550 0.480
Mountain beaver                Aplodontia    herbi   Rodentia        nt           14.4        2.4       NA          9.6   NA      1.350
Greater short-tailed shrew     Blarina       omni    Soricomorpha    lc           14.9        2.3       0.1333333   9.1   0.00029 0.019
Three-toed sloth               Bradypus      herbi   Pilosa          NA           14.4        2.2       0.7666667   9.6   NA      3.850
Northern fur seal              Callorhinus   carni   Carnivora       vu           8.7         1.4       0.3833333   15.3  NA      20.490
Dog                            Canis         carni   Carnivora       domesticated 10.1        2.9       0.3333333   13.9  0.07000 14.000
Guinea pig                     Cavis         herbi   Rodentia        domesticated 9.4         0.8       0.2166667   14.6  0.00550 0.728
Grivet                         Cercopithecus omni    Primates        lc           10.0        0.7       NA          14.0  NA      4.750
Chinchilla                     Chinchilla    herbi   Rodentia        domesticated 12.5        1.5       0.1166667   11.5  0.00640 0.420
Star-nosed mole                Condylura     omni    Soricomorpha    lc           10.3        2.2       NA          13.7  0.00100 0.060
African giant pouched rat      Cricetomys    omni    Rodentia        NA           8.3         2.0       NA          15.7  0.00660 1.000
Lesser short-tailed shrew      Cryptotis     omni    Soricomorpha    lc           9.1         1.4       0.1500000   14.9  0.00014 0.005
Long-nosed armadillo           Dasypus       carni   Cingulata       lc           17.4        3.1       0.3833333   6.6   0.01080 3.500
North American Opossum         Didelphis     omni    Didelphimorphia lc           18.0        4.9       0.3333333   6.0   0.00630 1.700
Big brown bat                  Eptesicus     insecti Chiroptera      lc           19.7        3.9       0.1166667   4.3   0.00030 0.023
European hedgehog              Erinaceus     omni    Erinaceomorpha  lc           10.1        3.5       0.2833333   13.9  0.00350 0.770
Patas monkey                   Erythrocebus  omni    Primates        lc           10.9        1.1       NA          13.1  0.11500 10.000
Western american chipmunk      Eutamias      herbi   Rodentia        NA           14.9        NA        NA          9.1   NA      0.071
Domestic cat                   Felis         carni   Carnivora       domesticated 12.5        3.2       0.4166667   11.5  0.02560 3.300
Galago                         Galago        omni    Primates        NA           9.8         1.1       0.5500000   14.2  0.00500 0.200
Mongoose lemur                 Lemur         herbi   Primates        vu           9.5         0.9       NA          14.5  NA      1.670
Thick-tailed opposum           Lutreolina    carni   Didelphimorphia lc           19.4        6.6       NA          4.6   NA      0.370
Macaque                        Macaca        omni    Primates        NA           10.1        1.2       0.7500000   13.9  0.17900 6.800
Mongolian gerbil               Meriones      herbi   Rodentia        lc           14.2        1.9       NA          9.8   NA      0.053
Golden hamster                 Mesocricetus  herbi   Rodentia        en           14.3        3.1       0.2000000   9.7   0.00100 0.120
Vole                           Microtus      herbi   Rodentia        NA           12.8        NA        NA          11.2  NA      0.035
House mouse                    Mus           herbi   Rodentia        nt           12.5        1.4       0.1833333   11.5  0.00040 0.022
Little brown bat               Myotis        insecti Chiroptera      NA           19.9        2.0       0.2000000   4.1   0.00025 0.010
Round-tailed muskrat           Neofiber      herbi   Rodentia        nt           14.6        NA        NA          9.4   NA      0.266
Northern grasshopper mouse     Onychomys     carni   Rodentia        lc           14.5        NA        NA          9.5   NA      0.028
Rabbit                         Oryctolagus   herbi   Lagomorpha      domesticated 8.4         0.9       0.4166667   15.6  0.01210 2.500
Chimpanzee                     Pan           omni    Primates        NA           9.7         1.4       1.4166667   14.3  0.44000 52.200
Tiger                          Panthera      carni   Carnivora       en           15.8        NA        NA          8.2   NA      162.564
Jaguar                         Panthera      carni   Carnivora       nt           10.4        NA        NA          13.6  0.15700 100.000
Lion                           Panthera      carni   Carnivora       vu           13.5        NA        NA          10.5  NA      161.499
Baboon                         Papio         omni    Primates        NA           9.4         1.0       0.6666667   14.6  0.18000 25.235
Desert hedgehog                Paraechinus   NA      Erinaceomorpha  lc           10.3        2.7       NA          13.7  0.00240 0.550
Potto                          Perodicticus  omni    Primates        lc           11.0        NA        NA          13.0  NA      1.100
Deer mouse                     Peromyscus    NA      Rodentia        NA           11.5        NA        NA          12.5  NA      0.021
Phalanger                      Phalanger     NA      Diprotodontia   NA           13.7        1.8       NA          10.3  0.01140 1.620
Potoroo                        Potorous      herbi   Diprotodontia   NA           11.1        1.5       NA          12.9  NA      1.100
Giant armadillo                Priodontes    insecti Cingulata       en           18.1        6.1       NA          5.9   0.08100 60.000
Laboratory rat                 Rattus        herbi   Rodentia        lc           13.0        2.4       0.1833333   11.0  0.00190 0.320
African striped mouse          Rhabdomys     omni    Rodentia        NA           8.7         NA        NA          15.3  NA      0.044
Squirrel monkey                Saimiri       omni    Primates        NA           9.6         1.4       NA          14.4  0.02000 0.743
Eastern american mole          Scalopus      insecti Soricomorpha    lc           8.4         2.1       0.1666667   15.6  0.00120 0.075
Cotton rat                     Sigmodon      herbi   Rodentia        NA           11.3        1.1       0.1500000   12.7  0.00118 0.148
Mole rat                       Spalax        NA      Rodentia        NA           10.6        2.4       NA          13.4  0.00300 0.122
Arctic ground squirrel         Spermophilus  herbi   Rodentia        lc           16.6        NA        NA          7.4   0.00570 0.920
Thirteen-lined ground squirrel Spermophilus  herbi   Rodentia        lc           13.8        3.4       0.2166667   10.2  0.00400 0.101
Golden-mantled ground squirrel Spermophilus  herbi   Rodentia        lc           15.9        3.0       NA          8.1   NA      0.205
Musk shrew                     Suncus        NA      Soricomorpha    NA           12.8        2.0       0.1833333   11.2  0.00033 0.048
Pig                            Sus           omni    Artiodactyla    domesticated 9.1         2.4       0.5000000   14.9  0.18000 86.250
Short-nosed echidna            Tachyglossus  insecti Monotremata     NA           8.6         NA        NA          15.4  0.02500 4.500
Eastern american chipmunk      Tamias        herbi   Rodentia        NA           15.8        NA        NA          8.2   NA      0.112
Tenrec                         Tenrec        omni    Afrosoricida    NA           15.6        2.3       NA          8.4   0.00260 0.900
Tree shrew                     Tupaia        omni    Scandentia      NA           8.9         2.6       0.2333333   15.1  0.00250 0.104
Arctic fox                     Vulpes        carni   Carnivora       NA           12.5        NA        NA          11.5  0.04450 3.380
Red fox                        Vulpes        carni   Carnivora       NA           9.8         2.4       0.3500000   14.2  0.05040 4.230
sum(msleep$sleep_total > 8)   # Frequency (count)
mean(msleep$sleep_total > 8)  # Relative frequency (proportion)
61
0.734939759036145
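
The count and the proportion come from the same logical vector: a comparison such as sleep_total > 8 yields TRUE/FALSE values, which R treats as 1/0 in arithmetic, so sum() counts the TRUEs and mean() gives their share. A minimal sketch on six sleep_total values copied from head(msleep) (the names over_8, n_over and p_over are illustrative only):

```r
# Six sleep_total values copied from the head(msleep) output
sleep_total <- c(12.1, 17.0, 14.4, 14.9, 4.0, 14.4)

over_8 <- sleep_total > 8   # logical vector, TRUE where the condition holds
as.integer(over_8)          # the same vector coerced to 1/0

n_over <- sum(over_8)       # number of TRUE values
p_over <- mean(over_8)      # share of TRUE values: n_over / length(over_8)

n_over
p_over
```
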
mean(msleep$sleep_rem)
<NA>
mean(msleep$sleep_rem, na.rm = TRUE)
1.87540983606557
cor(msleep$sleep_total, msleep$sleep_rem)
<NA>
cor(msleep$sleep_total, msleep$sleep_rem, use = "complete.obs")
0.751754999228714
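
Both NA results have the same cause: sleep_rem contains missing values, and mean() and cor() return NA unless told how to handle them. A hedged sketch on the first six rows of msleep (values typed in from the head(msleep) output; all variable names here are illustrative) that counts the missing values first and then repeats both calls:

```r
# First six rows of msleep, copied from the head(msleep) output
sleep_total <- c(12.1, 17.0, 14.4, 14.9, 4.0, 14.4)
sleep_rem   <- c(NA, 1.8, 2.4, 2.3, 0.7, 2.2)

n_missing <- sum(is.na(sleep_rem))             # how many values are missing

mean_all <- mean(sleep_rem)                    # NA: one missing value is enough
mean_obs <- mean(sleep_rem, na.rm = TRUE)      # mean of the observed values only

r_all <- cor(sleep_total, sleep_rem)           # NA for the same reason
r_obs <- cor(sleep_total, sleep_rem,
             use = "complete.obs")             # uses only rows with both values present

# Missing values per column of a data frame
colSums(is.na(data.frame(sleep_total, sleep_rem)))
```

For the full data set, colSums(is.na(msleep)) gives the same per-column overview as the NA's rows in the summary(msleep) output.
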
table(msleep$vore)

  carni   herbi insecti    omni 
     19      32       5      20 
proportions(table(msleep$vore))

     carni      herbi    insecti       omni 
0.25000000 0.42105263 0.06578947 0.26315789 
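
The four counts add up to 76, not 83, because table() drops NA values by default, so the proportions are shares of the animals with a known diet. A small sketch, assuming six vore values copied from the tail() output earlier in this section (the names tab_default and tab_with_na are illustrative), that shows how useNA = "ifany" makes the missing values visible:

```r
# vore values of the six lightest animals, copied from the tail() output
vore <- c("insecti", "herbi", NA, "omni", "insecti", "omni")

tab_default <- table(vore)                   # NA is silently dropped
tab_with_na <- table(vore, useNA = "ifany")  # adds an <NA> column when NAs exist

tab_default
tab_with_na

proportions(tab_with_na)                     # shares now include the missing group
```
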
# Counts:
table(msleep$vore, msleep$conservation)

# Proportions, per row:
proportions(table(msleep$vore, msleep$conservation),
            margin = 1)
         
          cd domesticated en lc nt vu
  carni    1            2  1  5  1  4
  herbi    1            7  2 10  3  3
  insecti  0            0  1  2  0  0
  omni     0            1  0  8  0  0
         
                  cd domesticated         en         lc         nt         vu
  carni   0.07142857   0.14285714 0.07142857 0.35714286 0.07142857 0.28571429
  herbi   0.03846154   0.26923077 0.07692308 0.38461538 0.11538462 0.11538462
  insecti 0.00000000   0.00000000 0.33333333 0.66666667 0.00000000 0.00000000
  omni    0.00000000   0.11111111 0.00000000 0.88888889 0.00000000 0.00000000
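
With margin = 1 every row sums to 1 (for example, 1/14 of the carnivores with a known status are "cd"); margin = 2 would normalise each column instead. A minimal sketch that rebuilds the count table by hand from the output above, rather than from msleep itself, and checks both margins (the object names are illustrative):

```r
# Counts typed in from the table(msleep$vore, msleep$conservation) output
counts <- as.table(matrix(
  c(1, 2, 1,  5, 1, 4,
    1, 7, 2, 10, 3, 3,
    0, 0, 1,  2, 0, 0,
    0, 1, 0,  8, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    vore = c("carni", "herbi", "insecti", "omni"),
    conservation = c("cd", "domesticated", "en", "lc", "nt", "vu")
  )
))

row_prop <- proportions(counts, margin = 1)  # shares within each diet group
col_prop <- proportions(counts, margin = 2)  # shares within each status

rowSums(row_prop)   # every row sums to 1
colSums(col_prop)   # every column sums to 1
addmargins(counts)  # counts with row and column totals added
```
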

TASK

Load ggplot2 using library(ggplot2) if you have not already done so. Then do the following:

View the documentation for the diamonds data and read about the different variables.

Check the data structure: how many observations and variables are there, and what types of variables (numeric, categorical, etc.) does it contain?

Compute summary statistics (means, median, min, max, counts for categorical variables). Are there any missing values?

plot(msleep$sleep_total, msleep$sleep_rem)

library(ggplot2)
ggplot(msleep, aes(x = sleep_total, y = sleep_rem)) + geom_point()
Warning message:
"Removed 22 rows containing missing values (geom_point)."

plot(msleep$sleep_total, msleep$sleep_rem, pch = 16)
grid()

ggplot(msleep, aes(sleep_total, sleep_rem, colour = vore)) +
      geom_point() +
      xlab("Total sleep time (h)")
Warning message:
"Removed 22 rows containing missing values (geom_point)."


21.3 Other methods

library(psych)
describe(msleep)
A psych: 11 × 13
              vars  n     mean        sd          median     trimmed     mad         min       max      range       skew        kurtosis   se
              <int> <dbl> <dbl>       <dbl>       <dbl>      <dbl>       <dbl>       <dbl>     <dbl>    <dbl>       <dbl>       <dbl>      <dbl>
name*         1     83    42.0000000  24.1039416  42.0000000 42.00000000 31.13460000 1.0000000 83.000   82.000000   0.00000000  -1.2434523 2.64575131
genus*        2     83    40.2530120  22.5176589  41.0000000 40.44776119 28.16940000 1.0000000 77.000   76.000000   -0.05639775 -1.2520857 2.47163416
vore*         3     76    2.3421053   1.1260862   2.0000000  2.30645161  1.48260000  1.0000000 4.000    3.000000    0.41786835  -1.2479239 0.12917095
order*        4     83    11.2771084  6.1511713   15.0000000 11.53731343 4.44780000  1.0000000 19.000   18.000000   -0.37717341 -1.5460760 0.67517877
conservation* 5     54    3.7777778   1.3127340   4.0000000  3.77272727  0.74130000  1.0000000 6.000    5.000000    -0.13522194 -0.5048689 0.17864046
sleep_total   6     83    10.4337349  4.4503570   10.1000000 10.38358209 5.04084000  1.9000000 19.900   18.000000   0.05230964  -0.7074466 0.48849014
sleep_rem     7     61    1.8754098   1.2982881   1.5000000  1.70816327  1.18608000  0.1000000 6.600    6.500000    1.46161590  2.7342493  0.16622875
sleep_cycle   8     32    0.4395833   0.3586801   0.3333333  0.37628205  0.23474500  0.1166667 1.500    1.383333    1.49498905  1.5749153  0.06340629
awake         9     83    13.5674699  4.4520852   13.9000000 13.61716418 5.04084000  4.1000000 22.100   18.000000   -0.05133450 -0.7073810 0.48867984
brainwt       10    56    0.2815814   0.9764137   0.0124000  0.06602717  0.01784309  0.0001400 5.712    5.711860    4.62750025  20.9636487 0.13047877
bodywt        11    83    166.1363494 786.8397316 1.6700000  20.48743284 2.43442920  0.0050000 6654.000 6653.995000 7.10016261  53.7180357 86.36688086
describe(
          msleep,
          fast = TRUE,
          quant = c(0.25, 0.50, 0.75),
          ranges = FALSE
)
A psych: 11 × 8
             vars  n     mean        sd          se          Q0.25      Q0.5       Q0.75
             <int> <dbl> <dbl>       <dbl>       <dbl>       <dbl>      <dbl>      <dbl>
name         1     83    NaN         NA          NA          NA         NA         NA
genus        2     83    NaN         NA          NA          NA         NA         NA
vore         3     76    NaN         NA          NA          NA         NA         NA
order        4     83    NaN         NA          NA          NA         NA         NA
conservation 5     54    NaN         NA          NA          NA         NA         NA
sleep_total  6     83    10.4337349  4.4503570   0.48849014  7.8500000  10.1000000 13.7500000
sleep_rem    7     61    1.8754098   1.2982881   0.16622875  0.9000000  1.5000000  2.4000000
sleep_cycle  8     32    0.4395833   0.3586801   0.06340629  0.1833333  0.3333333  0.5791667
awake        9     83    13.5674699  4.4520852   0.48867984  10.2500000 13.9000000 16.1500000
brainwt      10    56    0.2815814   0.9764137   0.13047877  0.0029000  0.0124000  0.1255000
bodywt       11    83    166.1363494 786.8397316 86.36688086 0.1740000  1.6700000  41.7500000
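
The se column is the standard error of the mean, sd / sqrt(n), where n is the number of non-missing values. A small sketch that reproduces two of the reported values from the sd and n columns, with the numbers typed in by hand from the describe() output (the variable names are illustrative):

```r
# sd and n copied from the describe() output
sd_total <- 4.4503570; n_total <- 83   # sleep_total
sd_rem   <- 1.2982881; n_rem   <- 61   # sleep_rem (22 values are missing)

se_total <- sd_total / sqrt(n_total)   # reported se: 0.48849014
se_rem   <- sd_rem / sqrt(n_rem)       # reported se: 0.16622875

round(c(sleep_total = se_total, sleep_rem = se_rem), 4)
```
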

21.3.1 Descriptive statistics by group

describe(
          msleep ~ vore,
          fast = TRUE, 
          quant = c(0.25, 0.50, 0.75), 
          ranges = FALSE, omit = TRUE
)

 Descriptive statistics by group 
vore: carni
             vars  n  mean     sd    se Q0.25  Q0.5 Q0.75
name            1 19   NaN     NA    NA    NA    NA    NA
genus           2 19   NaN     NA    NA    NA    NA    NA
vore            3 19   NaN     NA    NA    NA    NA    NA
order           4 19   NaN     NA    NA    NA    NA    NA
conservation    5 14   NaN     NA    NA    NA    NA    NA
sleep_total     6 19 10.38   4.67  1.07  6.25 10.40 13.00
sleep_rem       7 10  2.29   1.86  0.59  1.33  1.95  3.05
sleep_cycle     8  5  0.37   0.03  0.01  0.35  0.38  0.38
awake           9 19 13.63   4.68  1.07 11.00 13.60 17.75
brainwt        10  9  0.08   0.10  0.03  0.02  0.04  0.07
bodywt         11 19 90.75 182.07 41.77  3.34 20.49 93.00
------------------------------------------------------------ 
vore: herbi
             vars  n   mean      sd     se Q0.25  Q0.5 Q0.75
name            1 32    NaN      NA     NA    NA    NA    NA
genus           2 32    NaN      NA     NA    NA    NA    NA
vore            3 32    NaN      NA     NA    NA    NA    NA
order           4 32    NaN      NA     NA    NA    NA    NA
conservation    5 26    NaN      NA     NA    NA    NA    NA
sleep_total     6 32   9.51    4.88   0.86  4.30 10.30 14.22
sleep_rem       7 24   1.37    0.92   0.19  0.60  0.95  1.97
sleep_cycle     8 12   0.42    0.32   0.09  0.18  0.22  0.69
awake           9 32  14.49    4.88   0.86  9.78 13.70 19.70
brainwt        10 20   0.62    1.57   0.35  0.01  0.01  0.24
bodywt         11 32 366.88 1244.08 219.92  0.19  1.23 39.00
------------------------------------------------------------ 
vore: insecti
             vars n  mean    sd    se Q0.25  Q0.5 Q0.75
name            1 5   NaN    NA    NA    NA    NA    NA
genus           2 5   NaN    NA    NA    NA    NA    NA
vore            3 5   NaN    NA    NA    NA    NA    NA
order           4 5   NaN    NA    NA    NA    NA    NA
conservation    5 3   NaN    NA    NA    NA    NA    NA
sleep_total     6 5 14.94  5.92  2.65  8.60 18.10 19.70
sleep_rem       7 4  3.52  1.93  0.96  2.08  3.00  4.45
sleep_cycle     8 3  0.16  0.04  0.02  0.14  0.17  0.18
awake           9 5  9.06  5.92  2.65  4.30  5.90 15.40
brainwt        10 5  0.02  0.03  0.02  0.00  0.00  0.03
bodywt         11 5 12.92 26.39 11.80  0.02  0.07  4.50
------------------------------------------------------------ 
vore: omni
             vars  n  mean    sd   se Q0.25  Q0.5 Q0.75
name            1 20   NaN    NA   NA    NA    NA    NA
genus           2 20   NaN    NA   NA    NA    NA    NA
vore            3 20   NaN    NA   NA    NA    NA    NA
order           4 20   NaN    NA   NA    NA    NA    NA
conservation    5  9   NaN    NA   NA    NA    NA    NA
sleep_total     6 20 10.93  2.95 0.66  9.10  9.90 10.93
sleep_rem       7 18  1.96  1.01 0.24  1.25  1.85  2.30
sleep_cycle     8 11  0.59  0.47 0.14  0.26  0.50  0.71
awake           9 20 13.07  2.95 0.66 13.07 14.10 14.90
brainwt        10 17  0.15  0.32 0.08  0.00  0.01  0.18
bodywt         11 20 12.72 24.69 5.52  0.18  0.95  7.60
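
Grouped summaries of a single variable can also be computed without psych. The following is a base R sketch, not the method used above, on the first six rows of msleep (values typed in from the head(msleep) output; the data frame animals and the result names are hypothetical):

```r
# First six rows of msleep, copied from the head(msleep) output
animals <- data.frame(
  vore        = c("carni", "omni", "herbi", "omni", "herbi", "herbi"),
  sleep_total = c(12.1, 17.0, 14.4, 14.9, 4.0, 14.4)
)

# One statistic per group: a named vector indexed by vore
group_means <- tapply(animals$sleep_total, animals$vore, mean)

# Several statistics per group via the formula interface
group_stats <- aggregate(
  sleep_total ~ vore, data = animals,
  FUN = function(v) c(n = length(v), mean = mean(v), median = median(v))
)

group_means
group_stats
```

On the full data set the same calls would take msleep in place of animals; note that the formula interface of aggregate() drops rows with missing values by default.
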

** Refs **

https://www.agroninfo.com/computing-summary-statistics-in-r/