A Collection of Basic R Programming Tips - 27

Date: 2015-05-08 09:36:45

Tags: R language, data analysis, data mining, machine learning

1. Circularly shifting a vector

library("magic")
x <- 1:10
magic::shift(x, 1)
# [1] 10  1  2  3  4  5  6  7  8  9
magic::shift(x, 2)
# [1]  9 10  1  2  3  4  5  6  7  8
magic::shift(x, 3)
# [1]  8  9 10  1  2  3  4  5  6  7
magic::shift(x, 4)
# [1]  7  8  9 10  1  2  3  4  5  6
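If installing the magic package is not an option, the same rotation can be sketched in base R. The helper name shift_right below is hypothetical and not part of any package; it is a minimal sketch that assumes a plain vector and a shift to the right.

```r
# Hypothetical helper: rotate a vector k positions to the right (base R only)
shift_right <- function(x, k) {
  n <- length(x)
  if (n == 0) return(x)
  k <- k %% n                     # wrap shifts larger than the vector length
  if (k == 0) return(x)
  c(tail(x, k), head(x, n - k))   # last k elements move to the front
}

shift_right(1:10, 1)
shift_right(1:10, 3)
```

The modulo step means that shifting by the vector length, or by any multiple of it, returns the vector unchanged.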

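2. Drawing parallel coordinate plots

Using the lattice package (current lattice releases name this function parallelplot; parallel was its earlier name):

library("lattice")
parallelplot(~iris[1:4] | Species, iris)

Using the ggplot2 package:

library("ggplot2")
D <- data.frame(Gain = rnorm(20), Trader = factor(LETTERS[1:4]), Day = factor(rep(1:5, each = 4)))
ggplot(D) + geom_line(aes(x = Trader, y = Gain, group = Day, color = Day))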
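To make clear what a parallel coordinate plot actually draws, here is a rough base-graphics sketch that needs neither lattice nor ggplot2. Rescaling every column to the range 0 to 1 is an assumption made here so that the four axes are comparable; it is not taken from the examples above.

```r
# Sketch: a parallel coordinate plot of iris using base graphics only
vars <- iris[1:4]

# Rescale every variable to [0, 1] so the axes share one vertical scale
scaled <- apply(vars, 2, function(v) (v - min(v)) / (max(v) - min(v)))

# One line per observation: x positions are the variables, y the scaled values
matplot(t(scaled), type = "l", lty = 1,
        col = as.integer(iris$Species),
        xaxt = "n", xlab = "", ylab = "scaled value")
axis(1, at = 1:4, labels = names(vars))
```

Each row of the data becomes one polyline, which is the same idea the lattice and ggplot2 calls implement with more polish.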

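3. Drawing a table

library("gridExtra")

cmSmall <- 5
cmMed <- 7
cmBig <- 15

df <- matrix(ncol = 4, nrow = 2)
df[1, ] <- c("Size", "small", "medium", "large")
df[2, ] <- c("cm", cmSmall, cmMed, cmBig)
df <- as.data.frame(df)

# The first column becomes the row names; the column header is hidden.
# Note: the gpar.* and show.colnames arguments belong to gridExtra releases
# before 2.0; later releases configure the same things through the theme,
# rows and cols arguments instead.
grid.table(`rownames<-`(df[-1], df[[1]]), gpar.rowtext = gpar(fontface = "bold"),
           gpar.rowfill = gpar(fill = "grey95", col = "white"), show.colnames = FALSE)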

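4. Introducing the devtools development toolkit

The devtools package provides a range of tools for package developers and makes the development process easier.

For example:

get_path()
set_path(path)

These functions get and set the value of the PATH environment variable on Windows.

The install_github function installs open-source packages directly from GitHub.

The clean_source function runs an R script in a clean environment, unaffected by the current session, which helps with debugging.

For more tools, see the package's help documentation.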
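For comparison, PATH can also be read and changed with base R alone. The sketch below is not how devtools implements get_path and set_path; it only uses Sys.getenv and Sys.setenv, and the directory extra_dir is a hypothetical example.

```r
# Sketch: inspect and temporarily extend PATH with base R
old_path <- Sys.getenv("PATH")

# Hypothetical directory to append; replace with a real tool directory
extra_dir <- file.path(tempdir(), "bin")

# .Platform$path.sep is ";" on Windows and ":" on Unix-like systems
Sys.setenv(PATH = paste(old_path, extra_dir, sep = .Platform$path.sep))
path_was_extended <- grepl(extra_dir, Sys.getenv("PATH"), fixed = TRUE)

# Restore the original value when done
Sys.setenv(PATH = old_path)
```

Changes made this way only affect the running R session and the processes it starts.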

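5. Loading a package inside a function

Calling require inside a function in the following way does not work:

p.f <- function(x) (require(x))
p.f("base")
# Loading required package: x
# [1] FALSE
# Warning message:
# In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
#   there is no package called ‘x’

This happens because require does not evaluate its first argument by default: it captures the expression as written, so inside the function it sees the symbol x and looks for a package literally named "x". Passing character.only = TRUE tells require to treat the argument as a character string that holds the package name:

p.f <- function(x) (require(x, character.only = TRUE))
p.f("base")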
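A slightly fuller sketch of the same idea follows. The wrapper name load_pkg and the deliberately nonexistent package name are assumptions made for illustration.

```r
# Hypothetical wrapper: returns TRUE if the package could be attached, FALSE otherwise
load_pkg <- function(pkg) {
  suppressWarnings(
    require(pkg, character.only = TRUE, quietly = TRUE)
  )
}

ok <- load_pkg("stats")                    # stats ships with every R installation
bad <- load_pkg("noSuchPackageHopefully")  # assumed not to be installed
```

Because require returns a logical value instead of stopping with an error, the caller can decide how to react when a package is missing.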

6. Drawing icons on a progress bar

To run this, change the path in img <- readPNG("C:/data/Rlogo.png") to a PNG image on your own machine:

library(ggplot2)
library(png)
library(grid)  # rasterGrob() comes from grid

# df3 is assumed to be an existing data frame with a label column `what`
# and a numeric column `units`; the original post does not define it.

# Build one image annotation for every whole unit of every bar
fill_images <- function()
{
  l <- list()
  for (i in 1:nrow(df3))
  {
    for (j in 1:floor(df3$units[i]))
    {
      # seems redundant, but does not work if moved outside of the loop (why?)
      img <- readPNG("C:/data/Rlogo.png")
      g <- rasterGrob(img, interpolate = TRUE)
      l <- c(l, annotation_custom(g, xmin = i-1/2, xmax = i+1/2, ymin = j-1, ymax = j))
    }
  }
  l
}

p <- ggplot(df3, aes(what, units)) +
  geom_bar(fill = "white", colour = "darkgreen", alpha = 0.5, stat = "identity") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 20, 2)) +
  scale_x_discrete() +
  theme_bw() +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +
  fill_images()
p  # display the plot

7. Modifying a data frame with transform

The transform function modifies the contents of a data frame, for example by adding new columns or applying a calculation to an existing column. It returns a new data frame.

For example, to negate the Ozone column (first rows of the output shown):

transform(airquality, Ozone = -Ozone)
#   Ozone Solar.R Wind Temp Month Day
# 1   -41     190  7.4   67     5   1
# 2   -36     118  8.0   72     5   2
# 3   -12     149 12.6   74     5   3
# 4   -18     313 11.5   62     5   4
# 5    NA      NA 14.3   56     5   5

To add a new column named new and, in the same call, overwrite Temp with (Temp - 32) / 1.8, which converts Fahrenheit to Celsius:

transform(airquality, new = -Ozone, Temp = (Temp - 32) / 1.8)
#   Ozone Solar.R Wind     Temp Month Day new
# 1    41     190  7.4 19.44444     5   1 -41
# 2    36     118  8.0 22.22222     5   2 -36
# 3    12     149 12.6 23.33333     5   3 -12
# 4    18     313 11.5 16.66667     5   4 -18
# 5    NA      NA 14.3 13.33333     5   5  NA
# 6    28      NA 14.9 18.88889     5   6 -28
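Since transform returns a new data frame, the result has to be assigned to keep it. The short sketch below illustrates this; the column name TempC is an assumption made for the example.

```r
# transform() leaves its input untouched; keep the result by assigning it
aq <- transform(airquality, TempC = (Temp - 32) / 1.8)

has_col_in_original <- "TempC" %in% names(airquality)  # original is unchanged
has_col_in_result   <- "TempC" %in% names(aq)          # the copy has the new column

head(aq[c("Temp", "TempC")], 3)
```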

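8. Expanding a full file path

File paths can often use shorthand symbols. For example, ~ stands for the user's home directory (on Windows, R usually maps it to the user's Documents folder). Avoiding absolute paths in source code makes a program more portable. But how do you get the actual path behind ~? For example:

# Expand the home directory
path.expand("~")
# [1] "C:/Users/liuning/Documents"
path.expand("~/data")
# [1] "C:/Users/liuning/Documents/data"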
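One way to use this in practice is to build paths from a base directory and expand them in a single place. The helper name data_file and the default base directory are assumptions made for illustration.

```r
# Hypothetical helper: build a path below a base directory and expand a leading ~
data_file <- function(name, base = "~/data") {
  path.expand(file.path(base, name))
}

data_file("input.csv")                 # result depends on the user's home directory
data_file("input.csv", base = "/tmp")  # paths without a leading ~ pass through unchanged
```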

9. Viewing package information with packageDescription

When using an R package, you often need to know something about it, such as what it does, which R version it requires, and which other packages it depends on.

> packageDescription("zoo")
Package: zoo
Version: 1.7-11
Date: 2014-02-27
Title: S3 Infrastructure for Regular and Irregular Time Series
        (Z's ordered observations)
Authors@R: c(person(given = "Achim", family = "Zeileis", role
        = c("aut", "cre"), email =
        "Achim.Zeileis@R-project.org"), person(given = "Gabor",
        family = "Grothendieck", role = "aut", email =
        "ggrothendieck@gmail.com"), person(given = c("Jeffrey",
        "A."), family = "Ryan", role = "aut", email =
        "jeff.a.ryan@gmail.com"), person(given = "Felix",
        family = "Andrews", role = "ctb", email =
        "felix@nfrac.org"))
Description: An S3 class with methods for totally ordered
        indexed observations. It is particularly aimed at
        irregular time series of numeric vectors/matrices and
        factors. zoo's key design goals are independence of a
        particular index/date/time class and consistency with
        ts and base R by providing methods to extend standard
        generics.
Depends: R (>= 2.10.0), stats
Suggests: coda, chron, DAAG, fts, its, ggplot2, mondate,
        scales, strucchange, timeDate, timeSeries, tis,
        tseries, xts
Imports: utils, graphics, grDevices, lattice (>= 0.20-27)
License: GPL-2 | GPL-3
URL: http://zoo.R-Forge.R-project.org/
Packaged: 2014-02-27 08:32:43 UTC; zeileis
Author: Achim Zeileis [aut, cre], Gabor Grothendieck [aut],
        Jeffrey A. Ryan [aut], Felix Andrews [ctb]
Maintainer: Achim Zeileis <Achim.Zeileis@R-project.org>
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-02-27 11:46:02
Built: R 3.1.3; x86_64-w64-mingw32; 2015-03-12 00:01:07 UTC;
        windows

-- File: C:/Program Files/R/R-3.1.3/library/zoo/Meta/package.rds
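The result can also be used programmatically instead of being read from the console. The sketch below uses the stats package because it ships with R; the nonexistent package name is an assumption made to show the failure case.

```r
# Pull single fields out of a package description
desc <- packageDescription("stats")
pkg_name <- desc$Package

# Asking for one field returns it as a character string
ver <- packageDescription("stats", fields = "Version")

# A package that is not installed yields NA (with a warning)
missing_pkg <- suppressWarnings(packageDescription("noSuchPackageHopefully"))
```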

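10. Example: reading and writing binary files

# Write a list of matrices to baseName.bin and their byte offsets to baseName.idx
saveMatrixList <- function(baseName, mtxList) {
  idxName <- paste(baseName, ".idx", sep = "")
  idxCon <- file(idxName, "wb")
  on.exit(close(idxCon))
  dataName <- paste(baseName, ".bin", sep = "")
  con <- file(dataName, "wb")
  on.exit(close(con), add = TRUE)  # add = TRUE keeps the earlier close(idxCon) handler
  writeBin(0L, idxCon)             # the first matrix starts at offset 0
  for (m in mtxList) {
    writeBin(dim(m), con)          # two integers: rows and columns
    writeBin(typeof(m), con)       # storage type as a string, e.g. "integer" or "double"
    writeBin(c(m), con)            # the values, column by column
    flush(con)
    offset <- as.integer(seek(con))
    cat("offset", offset, "\n")
    writeBin(offset, idxCon)       # where the next matrix begins
  }
  flush(idxCon)
}

# Read the index-th matrix back, using the offset stored in the .idx file
loadMatrix <- function(baseName = "data", index) {
  idxName <- paste(baseName, ".idx", sep = "")
  idxCon <- file(idxName, "rb")
  on.exit(close(idxCon))
  dataName <- paste(baseName, ".bin", sep = "")
  con <- file(dataName, "rb")
  on.exit(close(con), add = TRUE)
  seek(idxCon, (index - 1) * 4)    # each index entry is a 4-byte integer
  offset <- readBin(idxCon, "integer")
  seek(con, offset)
  d <- readBin(con, "integer", 2)
  type <- readBin(con, "character", 1)
  structure(readBin(con, type, prod(d)), dim = d)
}

mtx <- list(matrix(1:12, 4), matrix(sin(1:12), 4))
saveMatrixList("c:/foo", mtx)
loadMatrix("c:/foo", 1)
loadMatrix("c:/foo", 2)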
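The core writeBin and readBin round trip can be tried in isolation with a temporary file. This is a minimal sketch; the layout used here (an integer count followed by the values) is an assumption made for the example and differs from the matrix format above.

```r
# Round trip: write a numeric vector to a binary file and read it back
tmp <- tempfile(fileext = ".bin")
x <- c(1.5, 2.5, 3.5)

con <- file(tmp, "wb")
writeBin(length(x), con)  # header: number of values, stored as one integer
writeBin(x, con)          # payload: the doubles themselves
close(con)

con <- file(tmp, "rb")
n <- readBin(con, "integer", 1)  # read the header first
y <- readBin(con, "double", n)   # then exactly n doubles
close(con)
unlink(tmp)
```

Writing a small header before the payload is what lets the reader know how many values to request, which is the same role the dimensions play in saveMatrixList.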

11. Combining data frames with different columns using rbind.fill

The rbind.fill function stacks data frames by matching column names and fills any missing columns with NA.

library("plyr")
df1 <- data.frame(A = c("A1", "A2", "A3"), B = c("B1", "B2", "B3"), C = c("C1", "C2", "C3"))
df2 <- data.frame(A = c("A4", "A5", "A6"), C = c("C4", "C5", "C6"))

rbind.fill(df1, df2)
#    A    B  C
# 1 A1   B1 C1
# 2 A2   B2 C2
# 3 A3   B3 C3
# 4 A4 <NA> C4
# 5 A5 <NA> C5
# 6 A6 <NA> C6

rbind.fill(df2, df1)
#    A  C    B
# 1 A4 C4 <NA>
# 2 A5 C5 <NA>
# 3 A6 C6 <NA>
# 4 A1 C1   B1
# 5 A2 C2   B2
# 6 A3 C3   B3
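To show what this involves, here is a rough base R sketch for two data frames. The function name rbind_fill_base is hypothetical, and this is not how plyr implements rbind.fill; it also assumes both inputs have at least one row.

```r
# Hypothetical base R sketch: stack two data frames, padding missing columns with NA
rbind_fill_base <- function(a, b) {
  all_cols <- union(names(a), names(b))
  for (nm in setdiff(all_cols, names(a))) a[[nm]] <- NA  # columns only b has
  for (nm in setdiff(all_cols, names(b))) b[[nm]] <- NA  # columns only a has
  rbind(a[all_cols], b[all_cols])                        # same column order in both
}

df1 <- data.frame(A = c("A1", "A2", "A3"), B = c("B1", "B2", "B3"),
                  C = c("C1", "C2", "C3"), stringsAsFactors = FALSE)
df2 <- data.frame(A = c("A4", "A5", "A6"), C = c("C4", "C5", "C6"),
                  stringsAsFactors = FALSE)

res <- rbind_fill_base(df1, df2)
```

The column order of the result follows the first data frame, with any extra columns from the second appended at the end.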

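12. Example: matching Chinese characters

library("stringr")
x <- c("abcdefg", "I am Chinese!我是中文字符!")

# Test whether a string contains Chinese characters;
# [\u4e00-\u9fa5] is the code point range used for Chinese characters
grepl("[\u4e00-\u9fa5]", x)
# [1] FALSE  TRUE

# Find every Chinese character
str_extract_all(x, "[\u4e00-\u9fa5]")
# [[1]]
# character(0)
# [[2]]
# [1] "我" "是" "中" "文" "字" "符"

# Find runs of consecutive Chinese characters
str_extract_all(x, "[\u4e00-\u9fa5]+")
# [[1]]
# character(0)
# [[2]]
# [1] "我是中文字符"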
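The range in the pattern is simply a range of Unicode code points, which can be checked directly without a regular expression. The helper name count_cjk is hypothetical; the sketch assumes a single UTF-8 string as input.

```r
# Hypothetical helper: count characters whose code point lies in U+4E00..U+9FA5
count_cjk <- function(s) {
  cp <- utf8ToInt(s)               # one integer code point per character
  sum(cp >= 0x4e00 & cp <= 0x9fa5)
}

# The second string is "I am Chinese!" followed by six Chinese characters,
# written with \u escapes so the example does not depend on the file encoding
n_plain <- count_cjk("abcdefg")
n_mixed <- count_cjk("I am Chinese!\u6211\u662f\u4e2d\u6587\u5b57\u7b26!")
```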

13. Example: splitting time into intervals

# Split the time range into 5-minute steps
time <- seq(as.POSIXct("2014-06-28"), as.POSIXct("2014-07-10"), by = "5 mins")
head(time)
# [1] "2014-06-28 00:00:00 CST" "2014-06-28 00:05:00 CST"
# [3] "2014-06-28 00:10:00 CST" "2014-06-28 00:15:00 CST"
# [5] "2014-06-28 00:20:00 CST" "2014-06-28 00:25:00 CST"

# Split into 5-hour steps
time <- seq(as.POSIXct("2014-06-28"), as.POSIXct("2014-07-10"), by = "5 hours")
head(time)
# [1] "2014-06-28 00:00:00 CST" "2014-06-28 05:00:00 CST"
# [3] "2014-06-28 10:00:00 CST" "2014-06-28 15:00:00 CST"
# [5] "2014-06-28 20:00:00 CST" "2014-06-29 01:00:00 CST"

# Split into 2-day steps
time <- seq(as.POSIXct("2014-06-28"), as.POSIXct("2014-07-10"), by = "2 days")
time
# [1] "2014-06-28 CST" "2014-06-30 CST" "2014-07-02 CST"
# [4] "2014-07-04 CST" "2014-07-06 CST" "2014-07-08 CST"
# [7] "2014-07-10 CST"
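A related task is assigning existing timestamps to such intervals, which cut can do for date-time values. The timestamps and the UTC time zone below are assumptions made for illustration.

```r
# Sketch: assign timestamps to 5-minute bins
t0 <- as.POSIXct("2014-06-28 00:00:00", tz = "UTC")
stamps <- t0 + c(30, 200, 310, 900)   # offsets in seconds from midnight

bins <- cut(stamps, breaks = "5 min") # factor labelled by the start of each bin
table(bins)                           # number of timestamps per bin
```

The first two timestamps fall into the same bin because both lie within the first five minutes.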

14. The difference between ` ` (backticks) and ' ' (single quotes)

At first glance both look like single quotes, but they behave differently:

`mtcars`
#                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
# ...

'mtcars'
# [1] "mtcars"

As the output shows, text inside backticks is treated as the name of an object, which R then looks up and evaluates, whereas text inside single quotes is just an ordinary character string.
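Backticks are most useful for names that are not syntactically valid on their own. The variable and column names in this sketch are made up for illustration.

```r
# Backticks let you use names containing spaces or operator characters
`my var` <- c(2, 4, 6)
total <- sum(`my var`)

# Column names with spaces can be accessed the same way
df <- data.frame(`a b` = 1:2, check.names = FALSE)
col <- df$`a b`

# Operators are ordinary functions and can be called by their quoted name
three <- `+`(1, 2)

# A backticked name refers to the same object as the bare name
same <- identical(`mtcars`, get("mtcars"))
```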

Original article: http://blog.csdn.net/liu7788414/article/details/45567809