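HDFS is a distributed file system for storing data, so operating on HDFS means operating on a file system. Besides the HDFS shell commands, we can also use the Java API to work with the file system: creating and deleting files, changing file permissions, and creating, deleting, and renaming directories.
Working with the file system through the Java API mainly involves the following classes:
1. Configuration: an object of this class encapsulates the client-side or server-side configuration.
2. FileSystem: an object of this class represents a file system, and its methods are used to operate on files. FileSystem is an abstract class, so it cannot be instantiated with new; obtain an instance through the static FileSystem.get() method instead: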
- URI uri = new URI(PATH);
- FileSystem fileSystem = FileSystem.get(uri, new Configuration());
3. FSDataInputStream and FSDataOutputStream: the input and output streams of HDFS, obtained through the open() and create() methods of FileSystem respectively.
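To make the address passed to FileSystem.get() more concrete, here is a minimal sketch that only uses java.net.URI to split the article's address into its parts. FileSystem.get() relies on the scheme and the host/port of this URI to decide which file system to talk to. The class name HdfsUriDemo and the describe() helper are hypothetical names for this illustration, and the sketch uses URI.create() rather than new URI() only to avoid the checked exception.

```java
import java.net.URI;

public class HdfsUriDemo {

    /** Describes the parts of a file system address (hypothetical helper). */
    static String describe(String address) {
        URI uri = URI.create(address);
        return "scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort()
                + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Address taken from the article's PATH constant.
        System.out.println(describe("hdfs://liaozhongmin:9000/"));
        // prints: scheme=hdfs, host=liaozhongmin, port=9000, path=/
    }
}
```

The scheme (hdfs) identifies the kind of file system, while the host and port point at the NameNode.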
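Before using the Java API to operate on HDFS, make sure Hadoop has started properly. Run the jps command to check whether all Hadoop processes are running; the original post includes a screenshot of the output here.
Note: if jps lists the five Hadoop processes (NameNode, DataNode, SecondaryNameNode, TaskTracker, and JobTracker), as in the screenshot, Hadoop has started successfully.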
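The check described in the note can also be done programmatically. The following is a rough sketch, not part of the original post: it takes the text printed by jps (one "pid name" pair per line) and reports which of the five daemons are missing. The class name JpsCheck, the helper missingDaemons(), and the sample output with its process IDs are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JpsCheck {

    // Daemon names listed in the article (Hadoop 1.x).
    static final List<String> EXPECTED = Arrays.asList(
            "NameNode", "DataNode", "SecondaryNameNode", "JobTracker", "TaskTracker");

    /** Returns the expected daemons that do not appear in the given jps output. */
    static List<String> missingDaemons(String jpsOutput) {
        Set<String> running = new HashSet<String>();
        for (String line : jpsOutput.split("\\r?\\n")) {
            // Each jps line has the form "<pid> <name>".
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<String>();
        for (String name : EXPECTED) {
            if (!running.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical jps output; the process IDs are invented.
        String sample = "2401 NameNode\n2530 DataNode\n2655 SecondaryNameNode\n"
                + "2740 JobTracker\n3102 Jps\n";
        System.out.println("Missing daemons: " + missingDaemons(sample));
        // prints: Missing daemons: [TaskTracker]
    }
}
```

An empty result means all five processes were found.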
Below is a commonly used utility class for operating on the HDFS file system with the Java API:
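- import java.io.File;
- import java.io.FileOutputStream;
- import java.io.IOException;
- import java.net.URI;
-
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FSDataInputStream;
- import org.apache.hadoop.fs.FSDataOutputStream;
- import org.apache.hadoop.fs.FileStatus;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- import org.apache.hadoop.io.IOUtils;
-
- public class FileSystemUtil {
-
-     // Address of the remote HDFS file system (NameNode host and port).
-     private static final String PATH = "hdfs://liaozhongmin:9000/";
-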
-     // Returns a FileSystem for PATH, or null if it cannot be obtained.
-     public static FileSystem getFileSystem(){
-         try {
-             URI uri = new URI(PATH);
-             FileSystem fileSystem = FileSystem.get(uri, new Configuration());
-             return fileSystem;
-         } catch (Exception e) {
-             e.printStackTrace();
-         }
-         return null;
-     }
-
-     // Creates the directory at path, including any missing parent directories.
-     // Note: this closes fileSystem, so the handle cannot be reused afterwards.
-     public static void mkdir(FileSystem fileSystem,String path){
-         try {
-             Path srcPath = new Path(path);
-             fileSystem.mkdirs(srcPath);
-         } catch (IOException e) {
-             e.printStackTrace();
-         } finally{
-             try {
-                 fileSystem.close();
-             } catch (IOException e) {
-                 e.printStackTrace();
-             }
-         }
-     }
-
-     // Creates the file dst on HDFS and writes contents to it.
-     public static void createFile(String dst,byte[] contents){
-         FileSystem fileSystem = null;
-         FSDataOutputStream outputStream = null;
-         try {
-             fileSystem = FileSystemUtil.getFileSystem();
-             Path path = new Path(dst);
-             outputStream = fileSystem.create(path);
-             outputStream.write(contents);
-             System.out.println("File created successfully!");
-         } catch (IOException e) {
-             e.printStackTrace();
-         } finally{
-             // Close the stream before the file system; either may still be null if an earlier step failed.
-             try {
-                 if (outputStream != null) {
-                     outputStream.close();
-                 }
-                 if (fileSystem != null) {
-                     fileSystem.close();
-                 }
-             } catch (IOException e) {
-                 e.printStackTrace();
-             }
-         }
-     }
-
-     // Uploads the local file src to dst on HDFS, then closes fileSystem.
-     public static void putData(FileSystem fileSystem,String src,String dst){
-         try {
-             Path srcPath = new Path(src);
-             Path dstPath = new Path(dst);
-             // The first argument (false) keeps the local source file after copying.
-             fileSystem.copyFromLocalFile(false, srcPath, dstPath);
-         } catch (IOException e) {
-             e.printStackTrace();
-         } finally{
-             try {
-                 fileSystem.close();
-             } catch (IOException e) {
-                 e.printStackTrace();
-             }
-         }
-     }
-
-     // Renames (or moves) oldName to newName, then closes fileSystem.
-     public static void rename(FileSystem fileSystem,String oldName,String newName){
-         try {
-             Path oldPath = new Path(oldName);
-             Path newPath = new Path(newName);
-             fileSystem.rename(oldPath, newPath);
-         } catch (Exception e) {
-             e.printStackTrace();
-         } finally{
-             try {
-                 fileSystem.close();
-             } catch (IOException e) {
-                 e.printStackTrace();
-             }
-         }
-     }
-
-     // Downloads the HDFS file src to the local file dst.
-     public static void getData(FileSystem fileSystem,String src,String dst){
-         try {
-             Path path = new Path(src);
-             FSDataInputStream in = fileSystem.open(path);
-             File file = new File(dst);
-             if (!file.exists()){
-                 file.createNewFile();
-             }
-             FileOutputStream fileOutputStream = new FileOutputStream(file);
-             // Copy with a 1024-byte buffer; the final "true" closes both streams when done.
-             IOUtils.copyBytes(in, fileOutputStream, 1024, true);
-         } catch (IOException e) {
-             e.printStackTrace();
-         }
-     }
-
-     // Prints type, permission, replication factor, length, and path for each entry under path.
-     public static void listFile(FileSystem fileSystem,String path){
-         try {
-             FileStatus[] listStatus = fileSystem.listStatus(new Path(path));
-             for (FileStatus fileStatus : listStatus){
-                 // isDir() is deprecated in newer Hadoop releases in favor of isDirectory().
-                 String isDir = fileStatus.isDir()?"directory":"file";
-                 String permission = fileStatus.getPermission().toString();
-                 short replication = fileStatus.getReplication();
-                 long len = fileStatus.getLen();
-                 String filePath = fileStatus.getPath().toString();
-                 System.out.println("isDir:" +isDir + "\npermission:" + permission + "\nreplication:" + replication+ "\nlen:" + len + "\nfilepath:" + filePath);
-             }
-         } catch (IOException e) {
-             e.printStackTrace();
-         }
-     }
-
-     // Deletes the file or directory at filePath; "true" makes the delete recursive.
-     public static void remove(FileSystem fileSystem,String filePath){
-         try {
-             Path path = new Path(filePath);
-             fileSystem.delete(path, true);
-         } catch (IOException e) {
-             e.printStackTrace();
-         }
-     }
-
-     public static void main(String[] args) {
-         FileSystem fileSystem = getFileSystem();
-         // Call whichever utility method you want to try; here a file is deleted.
-         FileSystemUtil.remove(fileSystem, "/dir/liaozhongmin.txt");
-     }
- }
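In a method such as createFile, the output stream should be closed before the file system it was created from. On Java 7 or later, try-with-resources gives this ordering automatically, because resources are closed in the reverse order of their declaration. Since FileSystem and FSDataOutputStream both implement java.io.Closeable, the same pattern should carry over to them. The sketch below demonstrates only the ordering, using a hypothetical Tracked class as a stand-in for the Hadoop objects so that it runs without a cluster.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {

    /** Hypothetical stand-in for a closeable resource; records when it is closed. */
    static class Tracked implements Closeable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    /** Opens two resources and returns the order in which they were closed. */
    static List<String> closeOrder() {
        List<String> log = new ArrayList<String>();
        try (Tracked fileSystem = new Tracked("fileSystem", log);
             Tracked outputStream = new Tracked("outputStream", log)) {
            // Writing to outputStream would happen here.
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(closeOrder());
        // prints: [outputStream, fileSystem]
    }
}
```

The resource declared last (the stream) is closed first, so the explicit finally block and its null checks are no longer needed.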
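Note: configure the address of the remote file system yourself in hadoop/conf/core-site.xml. (In Hadoop 2 and later, the fs.default.name key is deprecated in favor of fs.defaultFS.)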
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
-
- <configuration>
-     <property>
-         <name>fs.default.name</name>
-         <value>hdfs://liaozhongmin:9000</value>
-     </property>
-
-     <property>
-         <name>hadoop.tmp.dir</name>
-         <value>/usr/local/hadoop/tmp</value>
-     </property>
- </configuration>
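To illustrate the name/value structure of this file, here is a small sketch that reads the properties with the JDK's built-in XML parser. This is not how Hadoop itself loads the file (its Configuration class does that); the class name CoreSiteReader and the readProperties() helper are hypothetical, and the XML is passed in as a string so the example runs without a Hadoop installation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteReader {

    /** Parses Hadoop-style configuration XML into a name -> value map. */
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<String, String>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Same properties as the core-site.xml shown in the article.
        String xml = "<configuration>"
                + "<property><name>fs.default.name</name><value>hdfs://liaozhongmin:9000</value></property>"
                + "<property><name>hadoop.tmp.dir</name><value>/usr/local/hadoop/tmp</value></property>"
                + "</configuration>";
        System.out.println(readProperties(xml).get("fs.default.name"));
        // prints: hdfs://liaozhongmin:9000
    }
}
```

The value of fs.default.name should match the address used in the PATH constant of the utility class.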
Original article: "Java API操作HDFS" (Operating HDFS with the Java API), http://www.cnblogs.com/thinkpad/p/5173697.html