1. Getting started with R

An introduction to the R language. The accompanying document can be downloaded here:

Link: https://pan.baidu.com/s/1Hidv00Yp-_iatDf-HXDEJQ

Extraction code: 1dzb

1.1. R graphics

1.2. Calculation

  • Calculations and assignment:

    • Simple arithmetic: type an expression and R prints the result:
    > 2+2
    [1] 4

    > exp(-2)
    [1] 0.1353353

    > rnorm(15)
    [1] -1.475349131 -0.420342363 1.650538466 -0.350305530 -1.514609697 0.894449245
    [7] -0.052745967 -1.353221501 -1.305113978 -1.574893823 -0.476416373 -0.600568454
    [13] 0.001433752 -0.285181931 1.364441405
    • Assignment in R:
    > x<- -2

    > x
    [1] -2

    > x+x
    [1] -4
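    • A name must not start with a digit and cannot contain spaces. Names that start with a period are treated specially and are best avoided.
    • Names are case-sensitive.
    • Avoid single-letter names where possible; several of them (for example c, q, t, T, F) are already used by R.
    • F and T are the standard abbreviations for FALSE and TRUE.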
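The naming rules above can be tried out directly. The following is only a small sketch; the names `value` and `Value` are made up for illustration and do not come from the original text.

```r
# Names are case-sensitive: these are two distinct objects
value <- 1
Value <- 2
identical(value, Value)    # FALSE

# T is an ordinary variable whose default value is TRUE, so it can be masked
T <- 0
isTRUE(T)                  # FALSE while our own T is in effect
rm(T)                      # remove our copy so that T means TRUE again
isTRUE(T)                  # TRUE

# TRUE itself is a reserved word and cannot be reassigned
```

This is one reason why scripts often spell out TRUE and FALSE instead of relying on T and F.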
  • Vectorized arithmetic

[Example: checking whether body mass index (BMI) values meet the standard]

> weight <- c(60, 72, 57, 90, 95, 72)
> weight
[1] 60 72 57 90 95 72
> height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
> bmi <- weight/height^2
> bmi
[1] 19.59184 22.22222 20.93664 24.93075 31.37799 19.73630
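To make the element-wise behaviour explicit, here is a minimal sketch that recomputes the BMI vector with an explicit loop and compares it with the vectorized expression. The names `bmi_loop` and `i` are introduced only for this illustration.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)

# Vectorized form: the operators act on each pair of elements
bmi <- weight / height^2

# Equivalent explicit loop, shown only for comparison
bmi_loop <- numeric(length(weight))
for (i in seq_along(weight)) {
  bmi_loop[i] <- weight[i] / height[i]^2
}

all.equal(bmi, bmi_loop)   # TRUE

# A length-one value is recycled across the whole vector
height * 100               # heights in centimetres
```

The vectorized form is shorter and is the idiomatic way to write such calculations in R.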
  • Computing the mean and the standard deviation: $SD=\sqrt{\sum(X_i-\overline{X})^2/(n-1)}$
> sum(weight)
[1] 446
> sum(weight)/length(weight)
[1] 74.33333
> xbar <- sum(weight)/length(weight)
> weight - xbar
[1] -14.333333 -2.333333 -17.333333 15.666667 20.666667
[6] -2.333333
> (weight - xbar)^2
[1] 205.444444 5.444444 300.444444 245.444444 427.111111
[6] 5.444444
> sum((weight - xbar)^2)
[1] 1189.333
> sqrt(sum((weight - xbar)^2)/(length(weight)-1))
[1] 15.42293
> mean(weight)
[1] 74.33333
> sd(weight)
[1] 15.42293
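The step-by-step calculation above can be wrapped into a small function. This is a sketch only; `sd_manual` is a hypothetical name, and in practice the built-in `sd()` should be preferred.

```r
weight <- c(60, 72, 57, 90, 95, 72)

# Sample standard deviation with the n - 1 denominator, as in the formula above
sd_manual <- function(x) {
  n <- length(x)
  xbar <- sum(x) / n
  sqrt(sum((x - xbar)^2) / (n - 1))
}

sd_manual(weight)                          # same value as sd(weight)
all.equal(sd_manual(weight), sd(weight))   # TRUE
all.equal(sd(weight)^2, var(weight))       # TRUE: the variance is the squared SD
```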

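Check whether the BMI values are consistent with the standard (the normal BMI range is 20-25, with midpoint 22.5), using a one-sample t-test against 22.5: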

> t.test(bmi, mu = 22.5)
One Sample t-test
data: bmi
t = 0.34488, df = 5, p-value = 0.7442
alternative hypothesis: true mean is not equal to 22.5
95 percent confidence interval:
18.41734 27.84791
sample estimates:
mean of x
23.13262
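For readers who want to see where the numbers in this output come from, the sketch below recomputes the one-sample t statistic and the two-sided p-value by hand, using $t = (\overline{X} - \mu_0)/(SD/\sqrt{n})$. The names `mu0`, `t_stat`, `p_value`, and `res` are chosen here for illustration.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
bmi <- weight / height^2

mu0 <- 22.5                  # hypothesized mean
n <- length(bmi)

# t statistic: distance of the sample mean from mu0 in standard-error units
t_stat <- (mean(bmi) - mu0) / (sd(bmi) / sqrt(n))

# two-sided p-value from the t distribution with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

res <- t.test(bmi, mu = mu0)
all.equal(t_stat, unname(res$statistic))   # TRUE
all.equal(p_value, res$p.value)            # TRUE
```

With a p-value of about 0.74, the data give no evidence that the mean BMI differs from 22.5.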

2. R language essentials

2.1. Functions and arguments

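x <- 10086
# natural logarithm
log(x)
# plotting (height and weight are the vectors from section 1.2)
plot(height, weight)
plot(height, weight, pch=2)
plot(x=height, y=weight)
# list all objects currently defined in the workspace
ls()
# show the arguments accepted by a function
args(plot.default)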
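The `plot()` calls above mix positional and named arguments. The same matching rules can be sketched with `log()`, which is easier to check in the console because it returns a number instead of drawing a figure. The choice of `log()` and of the value 100 is ours, not the original author's.

```r
# Positional matching: the first argument is x, the second is base
log(100, 10)

# Named matching: the order no longer matters
log(x = 100, base = 10)
log(base = 10, x = 100)

# Arguments with defaults can be omitted: base defaults to exp(1)
log(100)

# args() shows the formal arguments and their defaults
args(log)
```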

2.2. Vectors

  • Creating simple vectors
# create simple vectors
c("Huey", "Dewey", "Louie")
c(T, T, F, T)
bmi > 25
## create vectors with c()
x <- c(1, 2, 3)
y <- c(10, 20)
## concatenate vectors
c(x, y, 5)
## create a named vector
x <- c(red="Huey", blue="Dewey", green="Louie")
## print the names of the elements of x
names(x)
## mixed types are coerced to a common type
c(FALSE, 3)
c(pi, "abc")
c(FALSE, "abc")

Output:

> c("Huey", "Dewey", "Louie")
[1] "Huey" "Dewey" "Louie"
> c(T, T, F, T)
[1] TRUE TRUE FALSE TRUE
> bmi > 25
[1] FALSE FALSE FALSE FALSE TRUE FALSE
> # create vectors with c()
> x <- c(1, 2, 3)
> y <- c(10, 20)
> c(x, y, 5)
[1] 1 2 3 10 20 5
> # create a named vector
> x <- c(red="Huey", blue="Dewey", green="Louie")
> x
red blue green
"Huey" "Dewey" "Louie"
> names(x)
[1] "red" "blue" "green"
> c(FALSE, 3)
[1] 0 3
> c(pi, "abc")
[1] "3.14159265358979" "abc"
> c(FALSE, "abc")
[1] "FALSE" "abc"
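The last three results follow from R's coercion order for atomic vectors (logical, then numeric, then character). A brief sketch, using `class()` to inspect the resulting type:

```r
class(c(FALSE, 3))        # "numeric": FALSE becomes 0
class(c(pi, "abc"))       # "character": pi becomes a string
class(c(FALSE, "abc"))    # "character": FALSE becomes "FALSE"

# Because TRUE and FALSE coerce to 1 and 0, logical vectors can be summed
sum(c(TRUE, TRUE, FALSE, TRUE))   # 3
```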
  • Creating vectors with functions
> # create vectors with seq() and rep()
> ## sequence of consecutive integers
> seq(4, 9)
[1] 4 5 6 7 8 9
> 4:9
[1] 4 5 6 7 8 9
> ## equally spaced sequence
> seq(4, 10, 2)
[1] 4 6 8 10
> ## repeat a whole vector
> oops <- c(7, 9, 13)
> rep(oops, 3)
[1] 7 9 13 7 9 13 7 9 13
> ## repeat each element a given number of times
> rep(oops, 1:3)
[1] 7 9 9 13 13 13
> ## repeat values in bulk, with one count per value
> rep(1:2, c(10, 15))
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
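`seq()` and `rep()` accept a few more named arguments that are handy in practice. The sketch below uses `by`, `length.out`, `each`, and `times`, which are documented arguments of these base functions; the vector `oops` is reused from the transcript above.

```r
oops <- c(7, 9, 13)

seq(4, 10, by = 2)            # same as seq(4, 10, 2)
seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00
rep(oops, each = 2)           # 7 7 9 9 13 13
rep(oops, times = 2)          # 7 9 13 7 9 13
```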

2.3. Matrices and arrays

  • Creating matrices
> # create matrices
> ## by setting the dim attribute of a vector
> x <- 1:12
> dim(x) <- c(3, 4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> ## with the matrix() function
> matrix(1:12, nrow = 3, byrow = T)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
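By default `matrix()` fills its result column by column, which is why `byrow = TRUE` changes the layout. A small sketch (the names `by_col` and `by_row` are ours):

```r
by_col <- matrix(1:12, nrow = 3)                 # filled down the columns
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)   # filled along the rows

dim(by_col)        # 3 4
by_col[2, 3]       # 8
by_row[2, 3]       # 7

# Filling by row equals transposing a 4 x 3 matrix that was filled by column
all(by_row == t(matrix(1:12, nrow = 4)))         # TRUE
```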
  • Row and column names, and transposition
> # row and column names, and transposition
> ## row names
> y <- matrix(1:12, nrow = 3, byrow = T)
> rownames(y) <- LETTERS[1:3]
> y
  [,1] [,2] [,3] [,4]
A    1    2    3    4
B    5    6    7    8
C    9   10   11   12
> ## transpose with t()
> t(y)
     A B  C
[1,] 1 5  9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
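Once rows or columns are named, the names can be used for indexing. The sketch below also sets column names; the labels `c1` to `c4` are invented for this example.

```r
y <- matrix(1:12, nrow = 3, byrow = TRUE)
rownames(y) <- LETTERS[1:3]
colnames(y) <- c("c1", "c2", "c3", "c4")

y["B", ]          # row B as a named vector: 5 6 7 8
y["B", "c3"]      # 7
dim(t(y))         # 4 3: transposing swaps rows and columns
rownames(t(y))    # "c1" "c2" "c3" "c4"
```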
  • Combining by columns and by rows
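> # combining by columns and by rows
> ## cbind() binds its arguments together as columns
> cbind(A=1:4, B=5:8, C=9:12)
     A B  C
[1,] 1 5  9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> ## rbind() binds its arguments together as rows
> rbind(A=1:4, B=5:8, C=9:12)
  [,1] [,2] [,3] [,4]
A    1    2    3    4
B    5    6    7    8
C    9   10   11   12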
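`cbind()` binds its arguments together as columns and `rbind()` binds them as rows, so the two results above are transposes of each other. A quick check (the names `by_cols` and `by_rows` are ours):

```r
by_cols <- cbind(A = 1:4, B = 5:8, C = 9:12)   # 4 x 3, one column per argument
by_rows <- rbind(A = 1:4, B = 5:8, C = 9:12)   # 3 x 4, one row per argument

dim(by_cols)                      # 4 3
dim(by_rows)                      # 3 4
all(t(by_cols) == by_rows)        # TRUE
```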

2.4. Factors

> pain <- c(0,3,2,2,1)
> fpain <- factor(pain, levels = 0:3)
> levels(fpain) <- c("none", "mild", "medium", "severe")
> fpain
[1] none severe medium medium mild
Levels: none mild medium severe
> as.numeric(fpain)
[1] 1 4 3 3 2
> levels(fpain)
[1] "none" "mild" "medium" "severe"
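The two steps above (create the factor, then rename its levels) can also be done in one call with the `labels` argument of `factor()`. The sketch below shows this and counts the observations per level with `table()`; `fpain2` is a name introduced here.

```r
pain <- c(0, 3, 2, 2, 1)

# levels gives the codes found in the data, labels gives their display names
fpain2 <- factor(pain, levels = 0:3,
                 labels = c("none", "mild", "medium", "severe"))

levels(fpain2)        # "none" "mild" "medium" "severe"
as.numeric(fpain2)    # 1 4 3 3 2
table(fpain2)         # number of observations in each level
```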

2.5. Lists

> # create the data
> intake.pre <- c(5260,5470,5640,6180,6390,6515,6805,7515,7515,8230,8770)
> intake.post <- c(3910,4220,3885,5160,5645,4680,5260,5975,6790,6900,7335)
> # combine the two vectors into a named list
> mylist <- list(before = intake.pre, after = intake.post)
> mylist
$before
[1] 5260 5470 5640 6180 6390 6515 6805 7515 7515 8230 8770
$after
[1] 3910 4220 3885 5160 5645 4680 5260 5975 6790 6900 7335
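Components of a list can be extracted by name or by position. A minimal sketch using the same data; nothing here goes beyond base R:

```r
intake.pre <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
intake.post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5260, 5975, 6790, 6900, 7335)
mylist <- list(before = intake.pre, after = intake.post)

mylist$before          # component by name
mylist[["after"]]      # the same idea, with the name given as a string
mylist[[1]][3]         # third element of the first component: 5640
names(mylist)          # "before" "after"
sapply(mylist, mean)   # mean of each component
```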

2.6. Data frames

> d <- data.frame(intake.pre, intake.post)
> d
   intake.pre intake.post
1        5260        3910
2        5470        4220
3        5640        3885
4        6180        5160
5        6390        5645
6        6515        4680
7        6805        5260
8        7515        5975
9        7515        6790
10       8230        6900
11       8770        7335
> d$intake.pre
[1] 5260 5470 5640 6180 6390 6515 6805 7515 7515 8230 8770
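Columns can be added to a data frame with the same `$` notation. In the sketch below the column name `change` is our own choice; it stores the difference between the two measurements.

```r
intake.pre <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
intake.post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5260, 5975, 6790, 6900, 7335)
d <- data.frame(intake.pre, intake.post)

dim(d)      # 11 rows, 2 columns
names(d)    # "intake.pre" "intake.post"

# new column computed element-wise from the existing ones
d$change <- d$intake.pre - d$intake.post
head(d, 3)  # first three rows, now with three columns
```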

2.7. Indexing

> # indexing
> intake.pre[5]
[1] 6390
> intake.pre[c(3, 5, 7)]
[1] 5640 6390 6805
> v <- c(3,5,7)
> intake.pre[v]
[1] 5640 6390 6805
> intake.pre[1:5]
[1] 5260 5470 5640 6180 6390
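Besides positive positions, R also accepts negative indices, which drop the listed elements instead of selecting them. A short sketch with the same vector:

```r
intake.pre <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)

intake.pre[-1]                   # everything except the first element
intake.pre[-(1:5)]               # drop the first five: 6515 6805 7515 7515 8230 8770
intake.pre[length(intake.pre)]   # last element: 8770
```

Note the parentheses in `-(1:5)`: without them, `-1:5` would mean the sequence from -1 to 5.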

2.8. Conditional selection

> # conditional selection
> intake.post[intake.pre > 7000]
[1] 5975 6790 6900 7335
> intake.post[intake.pre > 7000 & intake.pre <= 8000]
[1] 5975 6790
> intake.pre > 7000 & intake.pre <= 8000
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE
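A logical vector such as the one above can also be summarized directly. The sketch below uses the base functions `which()`, `sum()`, `mean()`, and `any()`; the name `sel` is chosen here for illustration.

```r
intake.pre <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)

sel <- intake.pre > 7000
which(sel)               # positions where the condition holds: 8 9 10 11
sum(sel)                 # number of TRUE values: 4
mean(sel)                # proportion of TRUE values
any(intake.pre > 9000)   # FALSE: no value exceeds 9000
```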

2.9. Indexing data frames

> # indexing a data frame
> d <- data.frame(intake.pre, intake.post)
> d[5, 1]
[1] 6390
> d[5, ]
  intake.pre intake.post
5       6390        5645
> d[d$intake.pre > 7000, ]
   intake.pre intake.post
8        7515        5975
9        7515        6790
10       8230        6900
11       8770        7335
> d[1:2, ]
  intake.pre intake.post
1       5260        3910
2       5470        4220
> sel <- d$intake.pre > 7000
> sel
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
> d[sel, ]
   intake.pre intake.post
8        7515        5975
9        7515        6790
10       8230        6900
11       8770        7335
> head(d)
  intake.pre intake.post
1       5260        3910
2       5470        4220
3       5640        3885
4       6180        5160
5       6390        5645
6       6515        4680
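Row selection by condition can also be written with the base function `subset()`, which lets the condition refer to column names directly. This is offered as an alternative sketch, not as the original author's method; the names `high` and `high_post` are ours.

```r
intake.pre <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
intake.post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5260, 5975, 6790, 6900, 7335)
d <- data.frame(intake.pre, intake.post)

# same rows as d[d$intake.pre > 7000, ]
high <- subset(d, intake.pre > 7000)
high

# keep only one column of the selected rows
high_post <- subset(d, intake.pre > 7000, select = intake.post)
high_post

tail(d, 2)   # last two rows, the counterpart of head()
```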