Cloudera releases and sells products that include the official Apache Hadoop release, along with its own and other useful tools.
Other companies or organizations release products that include artifact builds from modified or extended versions of the Apache Hadoop source tree.
Such derivative works are not supported by the Apache Team: all support issues must be directed to the suppliers themselves.
There are two versions of the Cloudera distribution, as follows:
The following steps configure a CDH cluster on EC2 machines.
Open publicIP_of_EC2_machine:7180 in a browser and log in with the credentials below.
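Before opening the browser, it can help to confirm that port 7180 is actually reachable from your machine, since a closed EC2 security group is a common reason the page does not load. The following is a minimal sketch, not part of Cloudera's tooling: the helper names `cm_url` and `port_open` are hypothetical, and the `http://` scheme is an assumption (a TLS-enabled setup would differ).

```python
import socket

# Port taken from the step above; the http:// scheme is an assumption.
CM_PORT = 7180


def cm_url(host, port=CM_PORT):
    """Build the URL to open in the browser (hypothetical helper)."""
    return f"http://{host}:{port}"


def port_open(host, port=CM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `port_open("<public IP of the EC2 machine>")` and getting `False` usually means the service is not running yet or the port is blocked; `cm_url(...)` then gives the address to paste into the browser.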
Click the Add Cluster button and continue.
Select the CDH distribution version and the package you want to install in the cluster.
Copy the hostnames from the /etc/hosts file, ideally for all the machines you want in the cluster. Enter the hostnames or IPs of those machines in the text box and continue.
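Collecting the hostnames by hand is error-prone on larger clusters, so the step above can be sketched as a small script. This is only an illustration: `parse_hosts` is a hypothetical helper, the sample addresses and names are made up, and it assumes you want only IPv4, non-loopback entries, using the first name on each line.

```python
def parse_hosts(text):
    """Return (ip, hostname) pairs from /etc/hosts-style text."""
    entries = []
    for line in text.splitlines():
        # Strip trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no name is not usable here
        ip, names = fields[0], fields[1:]
        # Assumption: skip loopback and IPv6 entries such as localhost.
        if ip.startswith("127.") or ":" in ip:
            continue
        entries.append((ip, names[0]))
    return entries


# Made-up sample content; on a real machine, read /etc/hosts instead.
SAMPLE = """\
127.0.0.1   localhost
::1         ip6-localhost
10.0.0.11   node1.example.internal node1   # hypothetical master
10.0.0.12   node2.example.internal node2
"""

if __name__ == "__main__":
    for ip, host in parse_hosts(SAMPLE):
        print(host)
```

On a cluster machine you could pass `open("/etc/hosts").read()` instead of `SAMPLE` and paste the printed names into the text box.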
Once host setup is finished, click continue to go to the services installation section.
On this screen, select the services you want in the cluster (e.g., Hadoop, Hive, Sqoop, Oozie, or all of them), then click next to continue.
In this section you can select the machines where the master services (NameNode, ResourceManager, HMaster, etc.) should be configured and where the slave services (DataNode, NodeManager, HRegionServer, etc.) should be configured. Select the hosts for the corresponding services and click next.
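It can be useful to decide the role layout on paper before clicking through the assignment screen. The sketch below shows one possible layout under a simple assumption: the first host(s) take the master roles and every remaining host takes the slave roles. `plan_roles` and the host names are hypothetical, and this is not how Cloudera Manager itself chooses assignments.

```python
# Role names are the ones listed in the step above.
MASTER_ROLES = ["NameNode", "ResourceManager", "HMaster"]
WORKER_ROLES = ["DataNode", "NodeManager", "HRegionServer"]


def plan_roles(hosts, master_count=1):
    """Map each host to a list of roles (hypothetical planning helper).

    Assumption: the first `master_count` hosts are masters, the rest are slaves.
    """
    if not 1 <= master_count < len(hosts):
        raise ValueError("need at least one master and one slave host")
    plan = {}
    for index, host in enumerate(hosts):
        roles = MASTER_ROLES if index < master_count else WORKER_ROLES
        plan[host] = list(roles)  # copy so callers can edit one host's roles
    return plan


if __name__ == "__main__":
    for host, roles in plan_roles(["node1", "node2", "node3"]).items():
        print(host, ", ".join(roles))
```

The resulting mapping is only a checklist to follow while selecting hosts in the UI; real clusters often spread master roles across several machines.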
Upon successful installation, Cloudera Manager starts the cluster, and we can begin monitoring it through Cloudera Manager and use Hue for small dev tasks.
By Uma Raj
By Abishek Balakumar
Alex is a Big Data Evangelist and a Certified Big Data Engineer with many years of experience. He has helped clients optimize custom Big Data implementations, migrate legacy systems to the Big Data ecosystem, and build integrated Big Data and Analytics solutions that help business leaders generate custom analytics without the need for IT.