PH525x Study Notes Ⅰ
Data import
Installing packages
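A minimal sketch of the installation step, assuming dplyr (used later in these notes) is the package needed. The CRAN mirror URL is an assumption; any mirror should work.

```r
# Install dplyr only if it is not already available.
# The mirror URL below is an assumption; any CRAN mirror can be used.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr", repos = "https://cloud.r-project.org")
}

# Attach the package for the current session.
library(dplyr)
```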
Download femaleMiceWeights.csv from the course site and import the data into R.
Method 1
Download the file directly into the working directory, then import the data with read.csv.
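A minimal sketch of Method 1. It assumes the file has two columns named Diet and Bodyweight. So that the snippet runs without the download, a toy stand-in with made-up values is first written to a temporary directory.

```r
# Toy stand-in for femaleMiceWeights.csv
# (made-up values, assumed column names).
csv_path <- file.path(tempdir(), "femaleMiceWeights.csv")
writeLines(c("Diet,Bodyweight",
             "chow,21.5",
             "chow,28.1",
             "hf,25.7",
             "hf,34.0"),
           csv_path)

# With the real file in the working directory this would simply be
#   dat <- read.csv("femaleMiceWeights.csv")
dat <- read.csv(csv_path)
head(dat)
```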
Method 2
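One way to do the download from within R, sketched with base R's download.file. The helper name fetch_csv is hypothetical, and the URL is left as a placeholder to be replaced with the link from the course page.

```r
# Hypothetical helper: download a CSV from `url` to `destfile`,
# then read it into a data frame.
fetch_csv <- function(url, destfile) {
  # download.file() ships with base R (package utils).
  download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  read.csv(destfile)
}

# Example call (placeholder URL, not a real address):
# dat <- fetch_csv("<URL from the course page>", "femaleMiceWeights.csv")
```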
Data wrangling with dplyr
After reading in femaleMiceWeights.csv, select the rows and columns we want.
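A sketch of the two-step selection with dplyr's filter and select. The data frame below is a toy stand-in with made-up values; the column names Diet and Bodyweight and the diet label "chow" are assumptions about the real file.

```r
library(dplyr)

# Toy stand-in for the data read from femaleMiceWeights.csv.
dat <- data.frame(Diet = c("chow", "chow", "hf", "hf"),
                  Bodyweight = c(21.5, 28.1, 25.7, 34.0))

chow <- filter(dat, Diet == "chow")    # keep only the rows on the chow diet
chowVals <- select(chow, Bodyweight)   # keep only the Bodyweight column
chowVals
```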
In dplyr, the pipe operator %>% chains consecutive tasks, so the steps above can be completed in a single step.
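The same selection written as one pipeline, again on a toy data frame with made-up values and assumed column names.

```r
library(dplyr)

# Toy stand-in for the data read from femaleMiceWeights.csv.
dat <- data.frame(Diet = c("chow", "chow", "hf", "hf"),
                  Bodyweight = c(21.5, 28.1, 25.7, 34.0))

# filter() passes its result to select() through the pipe.
chowVals <- dat %>%
  filter(Diet == "chow") %>%
  select(Bodyweight)
chowVals
```

Each %>% passes the result on its left as the first argument of the function on its right, so no intermediate variable is needed.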
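The commands above do not change the type of the data: the result is still a list (a data frame). To convert the list into a vector, use the unlist function.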
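A short sketch of the conversion, using the same toy data (made-up values, assumed column names). The variable name chowVec is only for illustration.

```r
library(dplyr)

# Toy stand-in for the data read from femaleMiceWeights.csv.
dat <- data.frame(Diet = c("chow", "chow", "hf", "hf"),
                  Bodyweight = c(21.5, 28.1, 25.7, 34.0))

chowVals <- dat %>%
  filter(Diet == "chow") %>%
  select(Bodyweight)
class(chowVals)   # "data.frame": select() keeps the data frame structure

# unlist() flattens the one-column data frame into a numeric vector.
chowVec <- unlist(chowVals)
class(chowVec)    # "numeric"
```

Note that unlist keeps element names derived from the column name; unname(chowVec) drops them if a bare vector is wanted.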
The summarize function
This function reduces multiple values to a single value. Usage:
summarise(.data, …)
Useful functions
Center: mean(), median()
Spread: sd(), IQR(), mad()
Range: min(), max(), quantile()
Position: first(), last(), nth()
Count: n(), n_distinct()
Logical: any(), all()
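A sketch of summarize on toy data (made-up values, assumed column names). The grouped variant uses dplyr's group_by, which is not mentioned in these notes and is included only to show the per-group case.

```r
library(dplyr)

# Toy stand-in for the data read from femaleMiceWeights.csv.
dat <- data.frame(Diet = c("chow", "chow", "hf", "hf"),
                  Bodyweight = c(21.5, 28.1, 25.7, 34.0))

# Collapse the whole table into one row of summary statistics.
overall <- summarize(dat,
                     mean_weight = mean(Bodyweight),
                     sd_weight = sd(Bodyweight),
                     n = n())

# One summary row per diet.
by_diet <- dat %>%
  group_by(Diet) %>%
  summarize(mean_weight = mean(Bodyweight), n = n())

overall
by_diet
```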
Mathematical notation
Indexing, summation, Greek letters
Greek letters
μ, the Greek letter for m (m is for mean)
σ, the Greek letter for s (s is for standard deviation)
ε, the Greek letter for e (e is for measurement error)
β, used for effect sizes
∞
In the text we often talk about asymptotic results. Typically, this refers to an approximation that gets better and better as the number of data points we consider gets larger and larger, with perfect approximations occurring when the number of data points is ∞. In practice, there is no such thing as ∞, but it is a convenient concept to understand. One way to think about asymptotic results is as results that become better and better as some number increases and we can pick a number so that a computer can’t tell the difference between the approximation and the real number. Here is a very simple example that approximates 1/3 with decimals:
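A minimal sketch of such an approximation: 1/3 is approximated by the partial sum 3/10 + 3/100 + ... + 3/10^n. The function name onethird is chosen here for illustration.

```r
# Approximate 1/3 by the first n decimal digits: 0.3, 0.33, 0.333, ...
onethird <- function(n) sum(3 / 10^(1:n))

# The error shrinks as n grows.
1/3 - onethird(4)
1/3 - onethird(10)
1/3 - onethird(16)
```

With 16 terms the remaining error is at the limit of double-precision arithmetic, so the computer can no longer distinguish the approximation from 1/3.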
In this example, 16 is effectively ∞.
Integrals
$\int_2^4 f(x)\,dx$
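An integral of this kind can be approximated numerically by adding up the areas of thin bars. The sketch below assumes f is the standard normal density (dnorm), which the notes do not specify, and compares the bar sum with the exact value from pnorm.

```r
# Assumption: f is the standard normal density.
f <- dnorm

# Cut [2, 4] into bars of width 0.01 and add up their areas.
width <- 0.01
x <- seq(2, 4, by = width)
area_bars <- sum(f(x) * width)

# Exact area under the normal density between 2 and 4.
exact <- pnorm(4) - pnorm(2)

area_bars
exact
```

Because f is decreasing on this interval and each bar takes the height at its left edge, the bar sum slightly overestimates the integral; a smaller width tightens the approximation.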
References
http://genomicsclass.github.io/book/pages/dplyr_intro.html
http://genomicsclass.github.io/book/pages/dplyr_intro_exercises.html
http://genomicsclass.github.io/book/pages/math_notation.html