
Connecting Oracle and Hadoop (4): Accessing Hive Tables from Oracle with OSCH

2015-12-22


OSCH is short for Oracle SQL Connector for HDFS, one component of Oracle's Big Data Connectors.


This article shows how to use OSCH so that an Oracle database can query Hive tables directly.

  • Prerequisite 1: On the Oracle database server, deploy the HDFS client and the OSCH software, and set the environment variables (a quick sanity check follows the listing):
  #JAVA
  export JAVA_HOME=/home/oracle/jdk1.8.0_65

  #Hadoop
  export HADOOP_USER_NAME=hadoop
  export HADOOP_HOME=/home/oracle/hadoop-2.6.2
  export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
  export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
  export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
  export HADOOP_COMMON_HOME=${HADOOP_HOME}
  export HADOOP_HDFS_HOME=${HADOOP_HOME}
  export HADOOP_MAPRED_HOME=${HADOOP_HOME}
  export HADOOP_YARN_HOME=${HADOOP_HOME}
  export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
  export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop

  #OSCH_HOME
  export OSCH_HOME=/home/oracle/orahdfs-3.3.0

  #CLASSPATH and PATH
  export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar:$OSCH_HOME/jlib/*
  export PATH=$ORACLE_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
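With the variables in place, it is worth confirming that the Oracle server can actually reach the cluster before going any further. A minimal check (assuming the exports above have been sourced into the oracle user's shell; the warehouse path is the one that appears in the OSCH output later in this article):

  hadoop version                     # the local Hadoop client starts
  hdfs dfs -ls /user/hive/warehouse  # HDFS on the cluster is reachable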

 

  • Prerequisite 2: On the Hadoop cluster, deploy the OSCH software and set the environment variables (again with a sanity check after the listing):
  export JAVA_HOME=/home/hadoop/jdk1.8.0_65

  export HADOOP_USER_NAME=hadoop
  export HADOOP_YARN_USER=hadoop
  export HADOOP_HOME=/home/hadoop/hadoop-2.6.2
  export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
  export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
  export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
  export HADOOP_COMMON_HOME=${HADOOP_HOME}
  export HADOOP_HDFS_HOME=${HADOOP_HOME}
  export HADOOP_MAPRED_HOME=${HADOOP_HOME}
  export HADOOP_YARN_HOME=${HADOOP_HOME}
  export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
  export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop

  export HIVE_HOME=/home/hadoop/hive-1.1.1
  export HIVE_CONF_DIR=${HIVE_HOME}/conf

  export OSCH_HOME=/home/hadoop/orahdfs-3.3.0

  export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar:/usr/share/java/mysql-connector-java.jar
  export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
  export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$OSCH_HOME/jlib/*

  export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PATH
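A similar check on the cluster side verifies that the OSCH jar sits where the classpath expects it and that Hive can reach its metastore (a minimal sketch, assuming the exports above are active; the metastore here is backed by MySQL, hence the connector jar on the classpath):

  ls $OSCH_HOME/jlib/orahdfs.jar  # the OSCH driver jar must be present
  hive -e 'show databases;'       # the MySQL-backed metastore is reachable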

 

  • Create the required directories on the Oracle database server (a dictionary query to verify them follows):
  mkdir -p /home/oracle/osch_output

  CONNECT / AS SYSDBA;
  DROP DIRECTORY osch_bin_path;
  CREATE OR REPLACE DIRECTORY osch_bin_path AS '/home/oracle/orahdfs-3.3.0/bin';
  GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO baron;

  DROP DIRECTORY osch_hive_dir;
  CREATE OR REPLACE DIRECTORY osch_hive_dir AS '/home/oracle/osch_output';
  GRANT READ, WRITE ON DIRECTORY osch_hive_dir TO baron;
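To confirm that both directory objects exist and point where intended, query the standard data dictionary (nothing OSCH-specific here):

  SQL> SELECT directory_name, directory_path FROM all_directories
       WHERE directory_name IN ('OSCH_BIN_PATH', 'OSCH_HIVE_DIR');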

 

  • In Hive, create the test table and load the sample data (a row-count check follows):
  hive> CREATE TABLE catalog (
          catalogid INT,
          journal   STRING,
          publisher STRING,
          edition   STRING,
          title     STRING,
          author    STRING
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

  # Create the comma-delimited sample data file:
  echo '1,Oracle Magazine,Oracle Publishing,Nov-Dec 2004,Database Resource Manager,Kimberly Floss
  2,Oracle Magazine,Oracle Publishing,Nov-Dec 2004,From ADF UIX to JSF,Jonas Jacobi
  3,Oracle Magazine,Oracle Publishing,March-April 2005,Starting with Oracle ADF,Steve Muench' > catalog.txt

  hive> load data local inpath '/home/hadoop/catalog.txt' into table catalog;
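A quick count confirms that the three rows actually landed in Hive before Oracle gets involved:

  hive> select count(*) from catalog;   -- should return 3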

 

  • On the Hadoop cluster, run the ExternalTable tool from the OSCH jar to create the external table in Oracle directly:
  hadoop jar $OSCH_HOME/jlib/orahdfs.jar \
    oracle.hadoop.exttab.ExternalTable \
    -D oracle.hadoop.exttab.tableName=catalog_ext \
    -D oracle.hadoop.exttab.sourceType=hive \
    -D oracle.hadoop.exttab.locationFileCount=2 \
    -D oracle.hadoop.exttab.hive.tableName=catalog \
    -D oracle.hadoop.exttab.hive.databaseName=default \
    -D oracle.hadoop.exttab.defaultDirectory=osch_hive_dir \
    -D oracle.hadoop.connection.url=jdbc:oracle:thin:@//server1:1521/orcl \
    -D oracle.hadoop.connection.user=baron \
    -D oracle.hadoop.exttab.printStackTrace=true \
    -createTable

The output is as follows:

  Oracle SQL Connector for HDFS Release 3.3.0 - Production

  Copyright (c) 2011, 2015, Oracle and/or its affiliates. All rights reserved.

  [Enter Database Password:]
  15/12/15 04:45:30 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
  15/12/15 04:45:31 INFO metastore.ObjectStore: ObjectStore, initialize called
  15/12/15 04:45:31 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
  15/12/15 04:45:31 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/home/hadoop/hbase-1.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
  15/12/15 04:45:33 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
  15/12/15 04:45:37 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
  15/12/15 04:45:37 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
  15/12/15 04:45:37 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
  15/12/15 04:45:37 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
  15/12/15 04:45:37 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
  15/12/15 04:45:37 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
  15/12/15 04:45:37 INFO metastore.ObjectStore: Initialized ObjectStore
  15/12/15 04:45:38 INFO metastore.HiveMetaStore: Added admin role in metastore
  15/12/15 04:45:38 INFO metastore.HiveMetaStore: Added public role in metastore
  15/12/15 04:45:38 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
  15/12/15 04:45:39 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=catalog
  15/12/15 04:45:39 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_table : db=default tbl=catalog
  The create table command succeeded.

  User: "BARON" performed the following actions in schema: BARON

  CREATE TABLE "BARON"."CATALOG_EXT"
  (
   "CATALOGID" INTEGER,
   "JOURNAL" VARCHAR2(4000),
   "PUBLISHER" VARCHAR2(4000),
   "EDITION" VARCHAR2(4000),
   "TITLE" VARCHAR2(4000),
   "AUTHOR" VARCHAR2(4000)
  )
  ORGANIZATION EXTERNAL
  (
     TYPE ORACLE_LOADER
     DEFAULT DIRECTORY "OSCH_HIVE_DIR"
     ACCESS PARAMETERS
     (
       RECORDS DELIMITED BY 0X'0A'
       CHARACTERSET AL32UTF8
       PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream'
       FIELDS TERMINATED BY 0X'2C'
       MISSING FIELD VALUES ARE NULL
       (
         "CATALOGID" CHAR NULLIF "CATALOGID"=0X'5C4E',
         "JOURNAL" CHAR(4000) NULLIF "JOURNAL"=0X'5C4E',
         "PUBLISHER" CHAR(4000) NULLIF "PUBLISHER"=0X'5C4E',
         "EDITION" CHAR(4000) NULLIF "EDITION"=0X'5C4E',
         "TITLE" CHAR(4000) NULLIF "TITLE"=0X'5C4E',
         "AUTHOR" CHAR(4000) NULLIF "AUTHOR"=0X'5C4E'
       )
     )
     LOCATION
     (
       'osch-20151215044541-2290-1'
     )
  ) PARALLEL REJECT LIMIT UNLIMITED;

  The following location files were created.

  osch-20151215044541-2290-1 contains 1 URI, 263 bytes

           263 hdfs://server1:8020/user/hive/warehouse/catalog/catalog.txt

When the command completes, the external table already exists in Oracle; no further DDL is needed on the database side.
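The generated definition can also be inspected from the Oracle side. DESCRIBE shows the columns OSCH derived from the Hive schema, and the USER_EXTERNAL_TABLES dictionary view confirms the access driver and default directory:

  SQL> DESC catalog_ext
  SQL> SELECT table_name, type_name, default_directory_name
       FROM user_external_tables;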

 

  • Query the Hive-backed external table from Oracle (a further query example follows):
  SQL> select count(*) from catalog_ext;

    COUNT(*)
  ----------
           3
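Because the external table behaves like any other Oracle table, ordinary SQL (predicates, joins, aggregates) runs directly against the Hive data; for example, against the sample rows loaded earlier:

  SQL> SELECT catalogid, title, author
       FROM catalog_ext
       WHERE edition = 'Nov-Dec 2004'
       ORDER BY catalogid;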

Original article: http://www.cnblogs.com/panwenyu/p/5065414.html
