Apache Zeppelin is a web-based, multipurpose notebook. It supports interactive data analysis along with data ingestion, data discovery, data visualization, and collaboration. In this post I will explore some basic data analysis using Zeppelin and Spark.
To use Apache Spark and Zeppelin on Windows, you need to download and install Sparklet.
It has been a couple of weeks since we released the beta version of Sparklet, the standalone Apache Spark and Zeppelin installer for Windows. As I was playing around with data visualization, I thought I would write a blog post about it. Here are the steps to create one of the basic charts included in Zeppelin.
To run Spark and Zeppelin, you need to download and install Sparklet on your Windows system. See the documentation for Zeppelin’s display system to learn more about displaying charts.
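As a minimal sketch of how the display system is typically fed: when a paragraph's output starts with %table, Zeppelin treats the rest as a table, with columns separated by tabs and rows separated by newlines, and the first row used as the header. The helper name to_zeppelin_table and the sample rows below are hypothetical and only for illustration; the code is plain Python and assumes it runs in a paragraph bound to a Python interpreter.

```python
def to_zeppelin_table(header, rows):
    """Build a %table payload: tab-separated columns, newline-separated rows."""
    lines = ["\t".join(str(cell) for cell in header)]
    for row in rows:
        lines.append("\t".join(str(cell) for cell in row))
    return "%table " + "\n".join(lines)


# Hypothetical sample data, not taken from the post.
sample_rows = [("spark", 3), ("zeppelin", 5)]
payload = to_zeppelin_table(("word", "count"), sample_rows)

# Printing the payload from a notebook paragraph is what lets Zeppelin
# render it as a table and offer its built-in chart views.
print(payload)
```

Outside Zeppelin the same code simply prints the raw tab-separated text, which makes it easy to check the payload before pasting it into a notebook.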
The following data platforms are currently available in Microsoft Azure.
Azure Storage is the cloud storage solution for modern applications that rely on durability, availability, and scalability to meet the needs of their customers. A standard storage account gives you access to Blob storage, Table storage, Queue storage, and File storage.
Zeppelin comes with Apache Spark built in. Below are the steps to use Scala IDE to run and debug Apache Spark code.
Below are the links to download Spark and Zeppelin for a Windows environment:
Sparklet (Apache Spark & Zeppelin Installer for Windows 64-bit)
Needless to say, Apache Spark is becoming the de facto platform for big data analytics. At the same time, there is a notebook revolution going on. Data scientists and others who use notebooks simply love them. A notebook provides a browser-based interactive environment to write and execute code, view output, make plots, and more. IPython Notebook is no doubt leading this revolution, but it only allows Python code.
Apache Zeppelin is a new entrant to the league. It enables interactive data analytics: you can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more. Zeppelin is based on the concept of an interpreter that can be bound to any language or data processing backend. Basically, Zeppelin is a web-based notebook server. Its backend already supports quite a few interpreters, such as Spark, Scala, Python, Hive, and Markdown, and many more are yet to come. That means you can work with different big data platforms from a single notebook and build your analytics solution there. Zeppelin aims to cater to all your needs: data ingestion, data discovery, data analytics, data visualization, and collaboration. It comes with Spark/Scala as its default interpreter.
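To make the interpreter idea a little more concrete, here is a toy model in Python of how a paragraph can be routed: a leading directive such as %md or %sql names the interpreter, and a paragraph without one falls back to the default. This is only an illustrative sketch, not Zeppelin's actual implementation; the function name split_paragraph is hypothetical, and the "spark" default mirrors the statement above that Spark/Scala is the default interpreter.

```python
def split_paragraph(text, default="spark"):
    """Split a notebook paragraph into (interpreter name, body).

    A paragraph that starts with %name is routed to that interpreter;
    anything else goes to the default one.
    """
    stripped = text.strip()
    if not stripped.startswith("%"):
        return default, stripped
    # Separate the directive from the rest at the first whitespace.
    parts = stripped[1:].split(None, 1)
    if not parts:
        return default, ""
    body = parts[1] if len(parts) > 1 else ""
    return parts[0], body


print(split_paragraph("%md\n# Sales report"))
print(split_paragraph("%sql select * from sales"))
print(split_paragraph("val total = 1 + 2"))
```

The point of the sketch is that the notebook itself stays language-agnostic: it only needs to read the directive and hand the body to whichever backend is bound to that name.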