Azure Data Platforms
The following data platforms are currently available in Microsoft Azure. Azure Storage: Azure Storage is the cloud storage solution…
Zeppelin comes with Apache Spark built in. Below are the steps to use Scala IDE to run and debug code…
The Azure Data Lake Store is a cloud repository where you can easily store data of any size or any…
To use R with Spark, we need the SparkR package. Below are the steps needed to build, install, and use SparkR…
To develop Apache Spark applications in IPython and Python Tools for Visual Studio, we need to set the environment variables…
I am using HDP for Windows (1.3.0.0) on a single node, with Eclipse as the development environment. Below are a few samples to read…
Problems we had before YARN: the JobTracker was solely responsible for handling resources and tracking task progress. Scalability Limitation: Maximum cluster size…
Yesterday I tried to install Stinger on the Hortonworks HDP 2.0 Sandbox. Below are the steps I followed. I used the Sandbox…
In my last blog post I discussed using Map/Reduce to find co-authors in PubMed data on an HDP…
I was trying to write a Map/Reduce job for Hadoop using Visual Studio 2012 in an HDP for Windows environment. In search…