码迷,mamicode.com
首页 > 数据库 > 详细

spark sql 怎样处理日期类型

时间:2015-07-17 12:17:59      阅读:520      评论:0      收藏:0      [点我收藏+]

标签:

spark sql 怎样处理日期类型、时间类型
json 每个对象 不能换行
##问题描述
json File 日期类型 怎样处理?怎样从字符型,转换为Date或DateTime类型?
json文件如下,有字符格式的日期类型
```
{ "name" : "Andy", "age" : 30, "time" :"2015-03-03T08:25:55.769Z"}
{ "name" : "Justin", "age" : 19, "time" : "2015-04-04T08:25:55.769Z" }
{ "name" : "pan", "age" : 49, "time" : "2015-05-05T08:25:55.769Z" }
{ "name" : "penny", "age" : 29, "time" : "2015-05-05T08:25:55.769Z" }
```
默认推测的Schema:
```
root
 |-- _corrupt_record: string (nullable = true)
 |-- age: long (nullable = true)
 |-- name: string (nullable = true)
 |-- time200: string (nullable = true)
```

测试代码
```
    val fileName = "person.json"
    val sc = SparkUtils.getScLocal("json file 测试")
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val jsonFile = sqlContext.read.json(fileName)
    jsonFile.printSchema()
```
##解决方案
### 方案一、json数据 时间为 long 秒或毫秒

### 方案二、自定义schema
```
    val fileName = "person.json" 
    val sc = SparkUtils.getScLocal("json file 测试")
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val schema: StructType = StructType(mutable.ArraySeq(
      StructField("name", StringType, true),
      StructField("age", StringType, true),
      StructField("time", TimestampType, true)));
    val jsonFile = sqlContext.read.schema(schema).json(fileName)
    jsonFile.printSchema()
    jsonFile.registerTempTable("person")
    val now: Timestamp = new Timestamp(System.currentTimeMillis())
 
    val teenagers = sqlContext.sql("SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <=‘" +now+"‘")
    teenagers.foreach(println)
    val dataFrame = sqlContext.sql("SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <=‘2015-03-03 16:25:55.769‘")
    dataFrame.foreach(println)
```
###方案三、sql建表 
创建表sql
```
CREATE TEMPORARY TABLE person IF NOT EXISTS 
[(age: long ,name:string ,time:Timestamp)] 
USING org.apache.spark.sql.json
OPTIONS (  path ‘person.json‘)
语法
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] 
   [(col-name data-type [, …])] 
  USING [OPTIONS ...] 
  [AS ]
```
### 方案四、用textfile convert




spark sql 怎样处理日期类型

标签:

原文地址:http://my.oschina.net/itnms/blog/479641

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!