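1. First install R and RStudio on Mac OS. This is straightforward, so it is skipped here.
2. Download a prebuilt Spark package (spark-2.0.0-bin-hadoop2.6.tgz). The version you need can be downloaded from the official Spark website.
Extract Spark to the target directory: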
$ tar -zxvf spark-2.0.0-bin-hadoop2.6.tgz -C ~/
After extraction, my Spark directory is /Users/hduser/spark-2.0.0-bin-hadoop2.6.
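Once R is installed, the extracted directory can be sanity-checked before going further. The following is a minimal sketch, not part of SparkR: `check_spark_home` is a hypothetical helper that only looks for the two pieces this walkthrough relies on later (the `bin/spark-submit` launcher and the `R/lib` folder that holds the SparkR package).

```r
# Hypothetical helper (not provided by Spark or SparkR): report whether the
# pieces used later in this walkthrough exist under a Spark home directory.
check_spark_home <- function(spark_home) {
  spark_home <- path.expand(spark_home)  # resolve a leading "~"
  c(
    home         = dir.exists(spark_home),
    spark_submit = file.exists(file.path(spark_home, "bin", "spark-submit")),
    sparkr_lib   = dir.exists(file.path(spark_home, "R", "lib"))
  )
}

# Directory name taken from the tar command above
check_spark_home("~/spark-2.0.0-bin-hadoop2.6")
```

A FALSE entry in the returned vector suggests the archive was extracted somewhere else or only partially.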
3. Open RStudio, install the required package, and load SparkR from the Spark directory:
> install.packages("rJava")
> Sys.setenv(SPARK_HOME="/Users/hduser/spark-2.0.0-bin-hadoop2.6")
> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R","lib"), .libPaths()))
> library(SparkR)
Attaching package: ‘SparkR’
The following objects are masked from ‘package:stats’:
    cov, filter, lag, na.omit, predict, sd, var, window
The following objects are masked from ‘package:base’:
    as.data.frame, colnames, colnames<-, drop, endsWith,
    intersect, rank, rbind, sample, startsWith, subset,
    summary, transform, union
> sc <- sparkR.init(master="local")
Launching java with spark-submit command /Users/hduser/spark-2.0.0-bin-hadoop2.6/bin/spark-submit sparkr-shell /var/folders/gc/vp7dhzpx6573t0fy46ysmpwr0000gp/T//RtmpyADaoX/backend_port4ee21b15c06c
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/11 19:52:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Warning message:
'sparkR.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")
> sqlContext <- sparkRSQL.init(sc)
Warning message:
'sparkRSQL.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")
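The deprecation warnings above name sparkR.session as the replacement for both sparkR.init and sparkRSQL.init in Spark 2.0. A minimal sketch of the same startup using that single call, assuming the Spark directory from this walkthrough; the appName value and the requireNamespace guard are illustrative additions and were not part of the original session:

```r
# Path taken from this walkthrough; adjust to your own Spark directory
spark_home <- "/Users/hduser/spark-2.0.0-bin-hadoop2.6"
Sys.setenv(SPARK_HOME = spark_home)
.libPaths(c(file.path(spark_home, "R", "lib"), .libPaths()))

# Guard added for this sketch: only start Spark if the SparkR package is found
if (requireNamespace("SparkR", quietly = TRUE)) {
  library(SparkR)
  # One call takes the place of sparkR.init() followed by sparkRSQL.init()
  sparkR.session(master = "local", appName = "SparkR-RStudio-test")
}
```

With a session started this way there is no separate sc or sqlContext object to pass to later calls.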
// Use sqlContext to load R's built-in faithful dataset
> df <- createDataFrame(sqlContext, faithful)
Warning message:
'createDataFrame(sqlContext...)' is deprecated.
Use 'createDataFrame(data, schema = NULL, samplingRatio = 1.0)' instead.
See help("Deprecated")
> head(df)
  eruptions waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
> print(df)
SparkDataFrame[eruptions:double, waiting:double]
// Test reading JSON data
> people <- read.df(sqlContext, "/Users/hduser/people.json","json")
Warning message:
'read.df(sqlContext...)' is deprecated.
Use 'read.df(path = NULL, source = NULL, schema = NULL, ...)' instead.
See help("Deprecated")
> head(people)
  age    name
1  NA Michael
2  30    Andy
3  19  Justin
> print(people)
SparkDataFrame[age:bigint, name:string]
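Both data-loading steps above also triggered deprecation warnings, and the suggested signatures simply drop the sqlContext argument. The sketch below repeats the two tests in that form. Some assumptions: the JSON sample is written to a temporary file instead of /Users/hduser/people.json, its three rows are inferred from the head(people) output above, and the requireNamespace guard is an addition so the script does nothing harmful when SparkR is not found.

```r
# Sample data assumed to match the rows printed above (one JSON object per line)
json_path <- file.path(tempdir(), "people.json")
writeLines(c(
  '{"name":"Michael"}',
  '{"name":"Andy", "age":30}',
  '{"name":"Justin", "age":19}'
), json_path)

# Path taken from this walkthrough; adjust to your own Spark directory
spark_home <- "/Users/hduser/spark-2.0.0-bin-hadoop2.6"
Sys.setenv(SPARK_HOME = spark_home)
.libPaths(c(file.path(spark_home, "R", "lib"), .libPaths()))

# Guard added for this sketch: skip the Spark part if SparkR is not found
if (requireNamespace("SparkR", quietly = TRUE)) {
  library(SparkR)
  sparkR.session(master = "local")

  # Built-in faithful dataset, passed directly without a sqlContext
  df <- createDataFrame(faithful)
  print(head(df))

  # JSON file: path first, then the source type
  people <- read.df(json_path, source = "json")
  print(head(people))

  sparkR.session.stop()
}
```

The explicit print() calls are needed because results are not auto-printed inside the if block.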
The next post tests displaying SparkR results in a web interface (Shiny):
Shiny Server: SparkR web display interface (Part 2)
Original article: http://www.cnblogs.com/bonnienote/p/6160936.html