1. Spark SQL can load Avro files directly into a DataFrame, which you can then query or process further. Example:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf sparkConf = new SparkConf().setAppName("Spark job");
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);

SQLContext sqlContext = new SQLContext(javaSparkContext);

// Data source name of the spark-avro package
String FORMAT_CLASS = "com.databricks.spark.avro";

// Path to the Avro files on HDFS
String path = "/sqoopdb/pcdas/*.avro";
DataFrame tblarticleautoDf = sqlContext.read().format(FORMAT_CLASS)
        .load(path);
// Register the DataFrame as a temporary table so it can be queried with SQL
tblarticleautoDf.registerTempTable("tableName");

String sql = "select * from tableName";
DataFrame queryDf = sqlContext.sql(sql);
System.out.println(queryDf.count());
System.out.println(queryDf.first());
Original article: http://www.cnblogs.com/fillPv/p/5015689.html