Microsoft Azure Open Source Big Data & Analytic Service – HDInsight

Hi everyone. In this article, I want to talk about a very useful Microsoft Azure service. I recommend that you check out the previous article before proceeding with this one:

Apache Kafka Producer Example With Java

 

HDInsight

HDInsight provides an environment where you can use applications such as Apache Hadoop, Spark and Kafka. If you have used a pre-built platform such as Hortonworks Sandbox or Cloudera Quickstart VM, you can think of it as something similar.

With HDInsight you can use the applications shown in the picture.

An HDInsight cluster deploys in minutes and needs very little manual configuration. It can also integrate seamlessly with many Azure data stores and services, including Synapse Analytics, Azure Cosmos DB, Data Lake Storage, Blob Storage, Event Hubs, and Data Factory.

The Azure HDInsight ecosystem lets us use tools such as Apache Zeppelin, VS Code and Tableau.

For more details on these tools, visit https://azure.microsoft.com/en-us/blog/azure-hdinsight-interactive-query-ten-tools-to-analyze-big-data-faster/

 

Let’s examine how we can use HDInsight

Open the Azure portal. If you do not have an account yet, create one.

Search for HDInsight (HDI).

Click the button to create the HDI Cluster.

Let’s fill in the required fields correctly. On the right side, select the cluster type.

You can choose the cluster type according to the applications you need. I’m choosing Hadoop in this example.

If you need finer control, you can also configure additional settings for applications such as Hive, Oozie and Ambari.

 

You can review the summary information before creating the HDI cluster. Then we can press the create button.

This process may take a few minutes.

 

After the cluster has been created successfully, you will see a screen like the one in the picture.

You can reach many resources from this page: the Ambari home page, documentation, tools and more.

You will see this page when you try to access the Ambari interface. You can log in with your username (possibly admin) and password.

Finally, you can see the Ambari interface.
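Besides the web interface, Ambari on HDInsight can also be queried over its REST API with the same cluster login. The snippet below is only a minimal sketch of that idea using the Python standard library. The cluster name, username and password are placeholders, and the helper names build_ambari_request and fetch_cluster_info are hypothetical, chosen just for this illustration.

```python
import base64
import json
import urllib.request


def build_ambari_request(cluster_name: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the cluster's Ambari REST endpoint."""
    # Base URI pattern for the Ambari REST API on an HDInsight cluster.
    url = f"https://{cluster_name}.azurehdinsight.net/api/v1/clusters/{cluster_name}"
    # Ambari uses HTTP basic authentication with the cluster login credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_cluster_info(cluster_name: str, username: str, password: str) -> dict:
    """Send the request and return the decoded JSON response as a dictionary."""
    request = build_ambari_request(cluster_name, username, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example call (placeholder values, needs a running cluster):
# info = fetch_cluster_info("mycluster", "admin", "<your-password>")
# print(sorted(info.keys()))
```

Building the request separately from sending it makes it easy to check the URL and header without touching a real cluster.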

 

If you want, you can connect over SSH from the terminal and build a MapReduce application. With Microsoft Azure, everything seems easy 🙂
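To give an idea of what such a MapReduce application does, here is a small word count sketch in plain Python. It only imitates the map, shuffle and reduce phases locally so the logic is easy to follow. It is not the code Hadoop itself runs, and the function names are made up for this example. On the cluster, similar mapper and reducer logic could be handed to Hadoop, for instance through Hadoop Streaming, which is an assumption here and not something set up in this article.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum the values collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


# Example call with two short input lines:
# print(word_count(["big data on Azure", "Big Data tools"]))
```

On a real cluster the framework performs the shuffle step between the mappers and reducers, so an application normally only supplies the map and reduce logic.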

See you in the next article.

 

About Deniz Parlak

Hi, I’m a Security Data Scientist & Data Engineer at My Security Analytics. I have experience with advanced Python, machine learning and big data tools. I have also worked on Oracle database administration, migration and upgrade projects. For your questions [email protected]
