I am using HDP for Windows (1.3.0.0) single node and Eclipse as the development environment. Below are a few samples showing how to read from and write to HDFS.
- Create a new Java Project in Eclipse.
- In Java Settings, go to Libraries and add External JARs. Browse to the Hadoop installation folder and add the JAR file below.
hadoop-core.jar
- Go into the lib folder and add the JAR files below.
commons-configuration-1.6.jar
commons-lang-2.4.jar
commons-logging-api-1.0.4.jar
The image above shows the required external JARs on the build path and their locations.
- Create a new Package under src and name it HDFSFileOperation.
- Create a new class in the HDFSFileOperation package. Name it Operations.
- Import the packages below.
import org.apache.hadoop.conf.Configuration;
// Needed to get the Hadoop configuration.
import org.apache.hadoop.fs.*;
// Needed for HDFS file system operations.
import java.io.*;
// Needed for local input/output operations.
- Code for accessing the HDFS file system
FileSystem hdfs = FileSystem.get(new Configuration());
// Print the home directory
Path homeDir = hdfs.getHomeDirectory();
System.out.println("Home folder - " + homeDir);
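By default, new Configuration() picks up core-site.xml from the classpath. If your configuration files live elsewhere, you can add them explicitly; a minimal sketch (the path below is an assumption, adjust it to your HDP installation):
Configuration conf = new Configuration();
// Assumed location of core-site.xml; change to match your install
conf.addResource(new Path("c:/hdp/hadoop/conf/core-site.xml"));
FileSystem hdfs = FileSystem.get(conf);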
- Add the code below for creating and deleting a directory
Path workingDir = hdfs.getWorkingDirectory();
Path newFolderPath = new Path("/MyDataFolder");
newFolderPath = Path.mergePaths(workingDir, newFolderPath);
if (hdfs.exists(newFolderPath))
{
hdfs.delete(newFolderPath, true); // Delete the existing directory
}
hdfs.mkdirs(newFolderPath); // Create the new directory
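The second argument to delete() enables recursive deletion. If you want to be defensive before removing a path, one possible check (a sketch, not part of the original flow) is:
if (hdfs.exists(newFolderPath) && hdfs.getFileStatus(newFolderPath).isDir()) {
    hdfs.delete(newFolderPath, true); // true = also delete the directory contents
}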
- Code for copying a file from the local file system to HDFS
Path localFilePath = new Path("c://localdata/datafile1.txt");
Path hdfsFilePath = new Path(newFolderPath + "/dataFile1.txt");
hdfs.copyFromLocalFile(localFilePath, hdfsFilePath);
- Copying a file from HDFS to the local file system
localFilePath = new Path("c://hdfsdata/datafile1.txt");
hdfs.copyToLocalFile(hdfsFilePath, localFilePath);
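FileSystem also offers variants of these copy calls; for example, a sketch using the overload that takes a delete-source flag, and the move counterpart that removes the HDFS source after the transfer:
hdfs.copyToLocalFile(false, hdfsFilePath, localFilePath); // delSrc = false keeps the HDFS source
hdfs.moveToLocalFile(hdfsFilePath, localFilePath); // deletes the HDFS source after copying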
- Creating a file in HDFS
Path newFilePath = new Path(newFolderPath + "/newFile.txt");
hdfs.createNewFile(newFilePath);
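createNewFile() returns a boolean you can check, for example:
boolean created = hdfs.createNewFile(newFilePath);
// Returns false if the file already exists
System.out.println(created ? "File created." : "File already exists.");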
- Writing data to an HDFS file
StringBuilder sb = new StringBuilder();
for (int i = 1; i <= 5; i++)
{
sb.append("Data");
sb.append(i);
sb.append("\n");
}
byte[] byt = sb.toString().getBytes();
FSDataOutputStream fsOutStream = hdfs.create(newFilePath);
fsOutStream.write(byt);
fsOutStream.close();
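create() also has an overload that lets you control the overwrite flag, buffer size, replication factor and block size; a sketch with illustrative values (these are assumptions, not required HDP settings):
// Overwrite if present, 4 KB buffer, replication 1 (single node), 64 MB block size
FSDataOutputStream out = hdfs.create(newFilePath, true, 4096, (short) 1, 64 * 1024 * 1024);
out.write(byt);
out.close();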
- Reading data from an HDFS file
BufferedReader bfr = new BufferedReader(new InputStreamReader(hdfs.open(newFilePath)));
String str = null;
while ((str = bfr.readLine()) != null)
{
System.out.println(str);
}
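To verify what was written, you can also list the directory contents; a minimal sketch using FileSystem.listStatus():
FileStatus[] entries = hdfs.listStatus(newFolderPath);
for (FileStatus entry : entries) {
    System.out.println(entry.getPath() + " (" + entry.getLen() + " bytes)");
}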
You can run the code directly from Eclipse if there are no errors, but in practice you need a JAR file to run it in Hadoop. To create a JAR file and use it in Hadoop, follow the steps below:
- Right click on the project and select Export.
- Select JAR file under Java and click Next.
- Provide the location where you want the JAR file to be exported, and click Finish.
- To execute it in Hadoop, use the Hadoop command below.
hadoop jar <jarFilePath> [mainClass]
e.g. hadoop jar c:\Users\Administrator\Documents\fileoperations.jar HDFSFileOperation.Operations
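If you set the Main-Class attribute in the JAR's manifest when exporting from Eclipse, hadoop jar reads it and the class name argument can be omitted:
hadoop jar c:\Users\Administrator\Documents\fileoperations.jar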
If it executes without any error, you will see output like the status messages printed by the program.
You can check the created file by browsing the HDFS file system in a web browser.
The complete code of the class is given below.
package HDFSFileOperation;

import java.io.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

public class Operations {
    public static void main(String[] args) throws IOException {
        FileSystem hdfs = FileSystem.get(new Configuration());

        // Print the home directory
        System.out.println("Home folder - " + hdfs.getHomeDirectory());

        // Create & delete directories
        Path workingDir = hdfs.getWorkingDirectory();
        Path newFolderPath = new Path("/MyDataFolder");
        newFolderPath = Path.mergePaths(workingDir, newFolderPath);
        if (hdfs.exists(newFolderPath)) {
            // Delete the existing directory
            hdfs.delete(newFolderPath, true);
            System.out.println("Existing Folder Deleted.");
        }
        hdfs.mkdirs(newFolderPath); // Create the new directory
        System.out.println("Folder Created.");

        // Copy a file from local to HDFS
        Path localFilePath = new Path("c://localdata/datafile1.txt");
        Path hdfsFilePath = new Path(newFolderPath + "/dataFile1.txt");
        hdfs.copyFromLocalFile(localFilePath, hdfsFilePath);
        System.out.println("File copied from local to HDFS.");

        // Copy a file from HDFS to local
        localFilePath = new Path("c://hdfsdata/datafile1.txt");
        hdfs.copyToLocalFile(hdfsFilePath, localFilePath);
        System.out.println("Files copied from HDFS to local.");

        // Create a file in HDFS
        Path newFilePath = new Path(newFolderPath + "/newFile.txt");
        hdfs.createNewFile(newFilePath);

        // Write data to the HDFS file
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 5; i++) {
            sb.append("Data");
            sb.append(i);
            sb.append("\n");
        }
        byte[] byt = sb.toString().getBytes();
        FSDataOutputStream fsOutStream = hdfs.create(newFilePath);
        fsOutStream.write(byt);
        fsOutStream.close();
        System.out.println("Written data to HDFS file.");

        // Read data from the HDFS file
        System.out.println("Reading from HDFS file.");
        BufferedReader bfr = new BufferedReader(
                new InputStreamReader(hdfs.open(newFilePath)));
        String str = null;
        while ((str = bfr.readLine()) != null) {
            System.out.println(str);
        }
        bfr.close(); // Release the stream once reading is done
    }
}