Exploring Azure Data Lake Store Preview


The Azure Data Lake Store is a cloud repository where you can store data of any size and any type. It acts as a Hadoop Distributed File System for the cloud and is available on demand. Data stored in Data Lake Store is easily accessible to Azure Data Lake Analytics and Azure HDInsight. It will also be possible to integrate it with other Hadoop distributions and projects such as Hortonworks, Cloudera, Spark, Storm, and Flume.

Below are the steps to create an Azure Data Lake Store and manage it using the Azure Portal and the Azure CLI.

Creating Azure Data Lake Store using portal

In the Azure portal, go to New -> Data + storage -> Data Lake Store

  • Enter a name for the Data Lake Store.
  • Select an existing resource group or create a new one.
  • Select East US 2 as the Location.
  • Click Create.


Using Data Explorer to manage Data files

The Data Explorer in the Azure Portal helps you browse and manage the data files in Azure Data Lake Store. It provides operations to upload, preview, download, rename, and delete data files, and to manage access to them.

Uploading a File:

To upload a data file, click the Upload button. In the Upload files pane, select the file to upload and click Start upload.

File Preview:

Click a file to preview its data. The data format and the number of rows to display can be set using the format settings.

Getting PATH of a file:

To get the path and the WebHDFS path of a file, open the file's properties. The path can be used from HDInsight clusters.
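
As a rough illustration of how the two paths relate, the sketch below derives both from an account name and a file path. It assumes the account is served from `<account>.azuredatalakestore.net` and that the two forms use the `adl://` and `swebhdfs://` schemes. Neither detail is stated in this article, and the function name `store_paths` is hypothetical, so compare the result with the values shown in the file's properties before relying on it.

```python
# Assumption: Data Lake Store accounts are served from this DNS suffix.
HOST_SUFFIX = "azuredatalakestore.net"


def store_paths(account, file_path):
    """Return the (adl, webhdfs) style paths for a file in the store.

    `account` is the Data Lake Store account name and `file_path` is the
    path inside the store, e.g. "/mynewfolder/TestData.csv".
    """
    # Normalise the path so it has exactly one leading slash.
    clean = "/" + file_path.lstrip("/")
    host = f"{account}.{HOST_SUFFIX}"
    return f"adl://{host}{clean}", f"swebhdfs://{host}{clean}"


adl_path, webhdfs_path = store_paths("testdatalakestore", "/mynewfolder/TestData.csv")
print(adl_path)
print(webhdfs_path)
```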

Managing Azure Data Lake Store Using Azure CLI

Installing Azure CLI

  1. Install the Microsoft Azure Cross-platform Command Line Tools. Below is the link to download the Web Platform Installer.
    http://go.microsoft.com/?linkid=9828653&clcid=0x409


  2. Install Node.js. Below is the link to the installer.
    https://nodejs.org/dist/v5.1.0/node-v5.1.0-x64.msi


Managing Azure Data Lake Store from CLI

Open a new command prompt or PowerShell window and run the command azure login. It returns a message containing a device login URL and a code. Open a browser, go to https://aka.ms/devicelogin, enter the code from the message, and log in.

After a successful login, return to the PowerShell window.

Switch to Azure Resource Manager mode using the command below

azure config mode arm

If your account has more than one subscription, you need to select the one to use.

To list all your subscriptions

azure account list

To set the subscription

azure account set <subscriptionname>
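
The login and setup steps above can be sketched as a small Python script that builds each command as an argument list and, by default, only prints it. This is a minimal sketch, not part of the CLI: the helper names `setup_commands` and `run_all` and the subscription name "My Subscription" are hypothetical, and the script assumes the `azure` executable is on the PATH when `dry_run` is turned off.

```python
import shutil
import subprocess


def setup_commands(subscription=None):
    """Build the argument lists for the session setup steps, in order."""
    commands = [
        ["azure", "login"],
        ["azure", "config", "mode", "arm"],
        ["azure", "account", "list"],
    ]
    if subscription:
        # Passing the name as its own list element avoids quoting problems
        # when a subscription name contains spaces.
        commands.append(["azure", "account", "set", subscription])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # Resolve the executable first, since on Windows it may be a
            # .cmd wrapper rather than a plain "azure" binary.
            exe = shutil.which(argv[0]) or argv[0]
            subprocess.run([exe] + argv[1:], check=True)


run_all(setup_commands("My Subscription"))
```

Keeping `dry_run=True` as the default means nothing is executed until you have reviewed the printed commands.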

You need a resource group to create a new Data Lake Store account. You can create a new one or use an existing one. Below is the command to create a new resource group.

azure group create <resourceGroup Name> <location>

e.g.

azure group create myDataLakeResourceGroup "East US 2"

Note that the Azure Data Lake Preview is only available in the East US 2 location.

Use the command below to create a new Data Lake Store account

azure datalake store account create <dataLakeStoreAccountName> <location> <resourceGroup>

e.g.

azure datalake store account create testdatalakestore "East US 2" myDataLakeResourceGroup
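
The two create commands take their arguments in different orders, which is easy to get wrong when typing them by hand. The sketch below builds both from the same inputs. It only mirrors the syntax shown above; `provisioning_commands` is a hypothetical helper and the default location is simply the one this article names.

```python
PREVIEW_LOCATION = "East US 2"  # the only location named in this article


def provisioning_commands(account, resource_group, location=PREVIEW_LOCATION):
    """Return the commands that create the resource group and the account."""
    return [
        # azure group create <resourceGroup Name> <location>
        ["azure", "group", "create", resource_group, location],
        # azure datalake store account create <name> <location> <resourceGroup>
        ["azure", "datalake", "store", "account", "create",
         account, location, resource_group],
    ]


for argv in provisioning_commands("testdatalakestore", "myDataLakeResourceGroup"):
    # Quote arguments containing spaces so the printed line can be pasted
    # into a command prompt.
    print(" ".join(f'"{a}"' if " " in a else a for a in argv))
```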

Below are the commands to manage the Azure Data Lake Store

Create folder

azure datalake store filesystem create <dataLakeStoreAccountName> <path> --folder

e.g.

azure datalake store filesystem create testdatalakestore /mynewfolder --folder

Upload data file

azure datalake store filesystem import <dataLakeStoreAccountName> "<source path>" "<destination path>"

e.g.

azure datalake store filesystem import testdatalakestore "C:\Data\TestData.csv" "/mynewfolder/TestData.csv"
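
When uploading several files, it can help to derive the destination path from the local file name instead of typing it twice. The sketch below does that for the create-folder and import commands shown above. `upload_commands` is a hypothetical helper, and it assumes the local path is a Windows-style path like the one in the example.

```python
import posixpath
from pathlib import PureWindowsPath


def upload_commands(account, local_file, remote_folder):
    """Build the 'create folder' and 'import' commands for one local file."""
    # PureWindowsPath parses C:\... paths on any operating system.
    file_name = PureWindowsPath(local_file).name
    # Paths inside the store use forward slashes, so join them POSIX-style.
    destination = posixpath.join(remote_folder, file_name)
    return [
        ["azure", "datalake", "store", "filesystem", "create",
         account, remote_folder, "--folder"],
        ["azure", "datalake", "store", "filesystem", "import",
         account, local_file, destination],
    ]


for argv in upload_commands("testdatalakestore", r"C:\Data\TestData.csv", "/mynewfolder"):
    print(argv)
```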

List files

azure datalake store filesystem list <dataLakeStoreAccountName> <path>

e.g.

azure datalake store filesystem list testdatalakestore /mynewfolder

Download a file

azure datalake store filesystem export <dataLakeStoreAccountName> <source_path> <destination_path>

e.g.

azure datalake store filesystem export testdatalakestore /mynewfolder/TestData.csv "C:\downloads\TestData.csv"

Rename a file

azure datalake store filesystem move <dataLakeStoreAccountName> <path/old_file_name> <path/new_file_name>

e.g.

azure datalake store filesystem move testdatalakestore /mynewfolder/TestData.csv /mynewfolder/TestData_copy.csv

Delete a file

azure datalake store filesystem delete <dataLakeStoreAccountName> <path>

e.g.

azure datalake store filesystem delete testdatalakestore /mynewfolder/TestData_copy.csv
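
The list, export, move, and delete commands differ only in the operation name and the number of paths they take. A minimal sketch of a builder that checks the path count before anything is run is shown below. `fs_command` and its `expected` table are hypothetical and cover only the four operations described above.

```python
FILESYSTEM = ["azure", "datalake", "store", "filesystem"]


def fs_command(operation, account, *paths):
    """Build one filesystem command; each path is a separate argument."""
    # Number of path arguments each operation takes in the syntax above.
    expected = {"list": 1, "export": 2, "move": 2, "delete": 1}
    if operation not in expected:
        raise ValueError(f"unsupported operation: {operation}")
    if len(paths) != expected[operation]:
        raise ValueError(
            f"{operation} takes {expected[operation]} path(s), got {len(paths)}"
        )
    return FILESYSTEM + [operation, account, *paths]


print(fs_command("move", "testdatalakestore",
                 "/mynewfolder/TestData.csv", "/mynewfolder/TestData_copy.csv"))
```

Because source and destination are separate arguments, a rename where the two paths have accidentally been run together into one string is rejected with a ValueError instead of being sent to the CLI.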

View access control list

azure datalake store permissions show <dataLakeStoreAccountName> <path>

e.g.

azure datalake store permissions show testdatalakestore /

Delete Data Lake Store account

azure datalake store account delete <dataLakeStoreAccountName>

e.g.

azure datalake store account delete testdatalakestore
