Exploring Azure Data Lake Store Preview

The Azure Data Lake Store is a cloud repository where you can easily store data of any size and any type. It acts as a Hadoop Distributed File System (HDFS) for the cloud and is available on demand. Data stored in Data Lake Store is easily accessible to Azure Data Lake Analytics and Azure HDInsight. It will also be possible to integrate it with other Hadoop distributions and projects such as Hortonworks, Cloudera, Spark, Storm and Flume.

Below are the steps to create an Azure Data Lake Store and manage it using the Azure portal and the Azure CLI.

Creating Azure Data Lake Store using the portal

In the Azure portal, go to New -> Data + storage -> Data Lake Store, then:

  • Enter a name for the Data Lake Store.
  • Select an existing resource group or create a new one.
  • Select East US 2 as the Location.
  • Click Create.


Using Data Explorer to manage data files

The Data Explorer in the Azure portal helps you visualise and manage the data files in Azure Data Lake Store. It provides operations such as upload, preview, download, rename and delete, and lets you manage access to data files.

Uploading a File:

To upload a data file, click the Upload button. In the Upload files pane, select the file to be uploaded and click Start upload.

File Preview:

Click a file to preview its data. The data format and the number of rows displayed can be changed in the format settings.

Getting the path of a file:

To get the path and the WebHDFS path of a file, open the file's properties. The path can be used from HDInsight clusters.
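
As a rough illustration of what these two paths look like, the sketch below builds them from an account name and a file path. The `azuredatalakestore.net` host suffix and the `adl://` and `swebhdfs://` schemes are assumptions about how Data Lake Store endpoints are usually written, and the function names are hypothetical; the values shown in the file's properties remain the authoritative ones.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that build the two path styles for a file
# in a Data Lake Store account.
# Assumption: the account endpoint is <account>.azuredatalakestore.net.

adls_host() {
  printf '%s.azuredatalakestore.net' "$1"
}

# adls_path <account> <file path>
adls_path() {
  local account="$1" path="${2#/}"   # drop one leading slash, if any
  printf 'adl://%s/%s\n' "$(adls_host "$account")" "$path"
}

# adls_webhdfs_path <account> <file path>
adls_webhdfs_path() {
  local account="$1" path="${2#/}"
  printf 'swebhdfs://%s/%s\n' "$(adls_host "$account")" "$path"
}

# Usage:
#   adls_path testdatalakestore /mynewfolder/TestData.csv
#   prints adl://testdatalakestore.azuredatalakestore.net/mynewfolder/TestData.csv
```

Comparing the output with what the portal shows is a quick way to confirm that the assumed host suffix matches your account.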

Managing Azure Data Lake Store using Azure CLI

Installing Azure CLI

  1. Install the Microsoft Azure Cross-platform Command Line Tools. Below is the link to download the Web Platform Installer.
    http://go.microsoft.com/?linkid=9828653&clcid=0x409


  2. Install Node.js. Below is the link to the installer.
    https://nodejs.org/dist/v5.1.0/node-v5.1.0-x64.msi


Managing Azure Data Lake Store from the CLI

Open a new command prompt or PowerShell window and run [code]azure login[/code]. It returns a message containing a device login URL and a code. Open a browser, go to https://aka.ms/devicelogin, enter the code from the message and log in.

After a successful login, return to the PowerShell window.

Switch to Azure Resource Manager mode using the command below:

[code]azure config mode arm[/code]

If your account has more than one subscription, you need to select the subscription to use.

To list all your subscriptions:
[code]azure account list[/code]

To set the subscription:
[code]azure account set <subscriptionname>[/code]

You need a resource group to create a new Data Lake Store account. You can create a new one or use an existing one. Below is the command to create a new resource group.

[code]azure group create <resourceGroup Name> <location>[/code]
e.g. [code]azure group create myDataLakeResourceGroup "East US 2"[/code]
Note that the Azure Data Lake Store preview is only available in the East US 2 location.

Use the command below to create a new Data Lake Store account:
[code]azure datalake store account create <dataLakeStoreAccountName> <location> <resourceGroup>[/code]

e.g. [code]azure datalake store account create testdatalakestore "East US 2" myDataLakeResourceGroup[/code]
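
The setup commands above can be chained in a small script. The following is a minimal sketch, assuming a POSIX-style shell (for example Git Bash on Windows) and that the CLI exits with a non-zero status when a command fails. The function name `create_store` and the `AZURE_CMD` override are hypothetical and only exist so that the sequence can be rehearsed without touching Azure.

```shell
#!/usr/bin/env bash
# Sketch: switch to Resource Manager mode, create a resource group,
# then create a Data Lake Store account in it.
# AZURE_CMD is a hypothetical override; set it to "echo azure" to print
# the commands instead of running them.
AZURE_CMD="${AZURE_CMD:-azure}"

# create_store <account name> <location> <resource group>
create_store() {
  local account="$1" location="$2" group="$3"

  # Each step runs only if the previous one succeeded.
  $AZURE_CMD config mode arm &&
  $AZURE_CMD group create "$group" "$location" &&
  $AZURE_CMD datalake store account create "$account" "$location" "$group"
}

# Rehearsal (prints the three commands; quoting is not shown in the output):
#   AZURE_CMD="echo azure" create_store testdatalakestore "East US 2" myDataLakeResourceGroup
```

If the resource group already exists, the second step may report an error and stop the chain; in that case run only the account creation command.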

Below are the commands to manage the Azure Data Lake Store:

Create folder
[code]azure datalake store filesystem create <dataLakeStoreAccountName> <path> --folder[/code]
e.g. [code]azure datalake store filesystem create testdatalakestore /mynewfolder --folder[/code]

Upload data file
[code]azure datalake store filesystem import <dataLakeStoreAccountName> "<source path>" "<destination path>"[/code]
e.g. [code]azure datalake store filesystem import testdatalakestore "C:\Data\TestData.csv" "/mynewfolder/TestData.csv"[/code]

List files
[code]azure datalake store filesystem list <dataLakeStoreAccountName> <path>[/code]
e.g. [code]azure datalake store filesystem list testdatalakestore /mynewfolder[/code]
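
The three commands above (create folder, upload, list) are often run together. Below is a minimal sketch of that flow, assuming a POSIX-style shell and a local path that `basename` can split. The function name `upload_to_folder` and the `AZURE_CMD` override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: create a folder, upload one local file into it, then list the folder.
# AZURE_CMD is a hypothetical override; set it to "echo azure" to print
# the commands instead of running them.
AZURE_CMD="${AZURE_CMD:-azure}"

# upload_to_folder <account> <local file> <remote folder, no trailing slash>
upload_to_folder() {
  local account="$1" src="$2" folder="$3"
  local dest="$folder/$(basename "$src")"   # keep the local file name

  # Fail early instead of sending a bad path to the service.
  if [ ! -f "$src" ]; then
    echo "local file not found: $src" >&2
    return 1
  fi

  # Each step runs only if the previous one succeeded.
  $AZURE_CMD datalake store filesystem create "$account" "$folder" --folder &&
  $AZURE_CMD datalake store filesystem import "$account" "$src" "$dest" &&
  $AZURE_CMD datalake store filesystem list "$account" "$folder"
}

# Usage:
#   upload_to_folder testdatalakestore ./TestData.csv /mynewfolder
```

Listing the folder at the end gives a quick visual check that the file arrived under the expected name.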

Download a file
[code]azure datalake store filesystem export <dataLakeStoreAccountName> <source_path> <destination_path>[/code]
e.g. [code]azure datalake store filesystem export testdatalakestore /mynewfolder/TestData.csv "C:\downloads\TestData.csv"[/code]

Rename a file
[code]azure datalake store filesystem move <dataLakeStoreAccountName> <path/old_file_name> <path/new_file_name>[/code]
e.g. [code]azure datalake store filesystem move testdatalakestore /mynewfolder/TestData.csv /mynewfolder/TestData_copy.csv[/code]

Delete a file
[code]azure datalake store filesystem delete <dataLakeStoreAccountName> <path>[/code]
e.g. [code]azure datalake store filesystem delete testdatalakestore /mynewfolder/TestData_copy.csv[/code]
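
Because a delete cannot be undone from the CLI examples shown here, it can be worth downloading a file first and deleting the remote copy only once the local copy is confirmed. The following is a minimal sketch of that guard, assuming a POSIX-style shell. The function name `fetch_and_remove` and the `AZURE_CMD` override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: download a file, then delete the remote copy only if the
# download left a non-empty local file behind.
# AZURE_CMD is a hypothetical override; set it to "echo azure" to print
# the commands instead of running them.
AZURE_CMD="${AZURE_CMD:-azure}"

# fetch_and_remove <account> <remote file> <local destination>
fetch_and_remove() {
  local account="$1" remote="$2" dest="$3"

  $AZURE_CMD datalake store filesystem export "$account" "$remote" "$dest" || return 1

  # Guard: keep the remote file unless the local copy exists and has content.
  if [ ! -s "$dest" ]; then
    echo "download missing or empty, keeping $remote" >&2
    return 1
  fi

  $AZURE_CMD datalake store filesystem delete "$account" "$remote"
}

# Usage:
#   fetch_and_remove testdatalakestore /mynewfolder/TestData_copy.csv ./TestData_copy.csv
```

If the CLI asks for confirmation before deleting, answer the prompt as usual; the sketch does not try to suppress it.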

View access control list
[code]azure datalake store permissions show <dataLakeStoreName> <path>[/code]
e.g. [code]azure datalake store permissions show testdatalakestore /[/code]

Delete Data Lake Store account
[code]azure datalake store account delete <dataLakeStoreAccountName>[/code]
e.g. [code]azure datalake store account delete testdatalakestore[/code]
