Cloudera kickstart workshop

  1. Cloudera Manager
    a. Logging into Cloudera Manager

    i. Start your AWS server

    ii. Use a web browser (Google Chrome recommended) to log in to Cloudera Manager

    # Below is a list of the default port numbers used by the Cloudera components

    Copy your AWS public DNS address from your AWS Console

    # Append the Cloudera Manager port number (7180) to the address, as shown below, and paste the result into the address bar of your web browser:

    xxxxxxxxx.ap-southeast-1.compute.amazonaws.com:7180

# Log in to Cloudera Manager with ‘cloudera’ as both the username and the password; you will then see the Cloudera Manager dashboard
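
The address above is simply the instance's public DNS name followed by port 7180. A minimal Python sketch of that assembly is shown here. It assumes plain HTTP (the scheme is not stated above) and uses the masked `xxxxxxxxx` host as a placeholder; the helper name `cloudera_manager_url` is hypothetical and only for illustration.

```python
CLOUDERA_MANAGER_PORT = 7180  # port used in the address above


def cloudera_manager_url(public_dns: str, port: int = CLOUDERA_MANAGER_PORT) -> str:
    """Build the browser address for Cloudera Manager from an AWS public DNS name."""
    host = public_dns.strip()
    # Drop a scheme or trailing slash if one was copied along with the host name.
    for prefix in ("http://", "https://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not host:
        raise ValueError("public DNS name is empty")
    # Assumption: Cloudera Manager is served over plain HTTP on this port.
    return f"http://{host}:{port}"


if __name__ == "__main__":
    # Placeholder host; replace it with the value copied from the AWS Console.
    print(cloudera_manager_url("xxxxxxxxx.ap-southeast-1.compute.amazonaws.com"))
```

Paste the printed address into the browser to reach the login page.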

b. Start the Cloudera Services Manually

# Start the services in the following order:

I. ZooKeeper

II. YARN

III. HDFS

IV. Hive

V. Impala

VI. HBase

VII. Oozie

VIII. Hue

Note: Do not start the Sqoop-related services; otherwise, the subsequent tasks of the workshop may encounter errors.
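
The workshop starts the services from the Cloudera Manager web interface. For readers who prefer scripting, the same order can be sketched against the Cloudera Manager REST API, which exposes a start command per service. The sketch below only builds the list of URLs to POST to and sends nothing. The API version `v19`, the lowercase service names, and the helper name `start_plan` are assumptions, so check them against your own cluster.

```python
from urllib.parse import quote

# Start order as listed above. Sqoop is deliberately absent from this list.
# Assumption: the services are registered under these lowercase names.
START_ORDER = ["zookeeper", "yarn", "hdfs", "hive", "impala", "hbase", "oozie", "hue"]


def start_plan(base_url, cluster, available_services, api_version="v19"):
    """Return the URLs to POST to, in workshop order, to start each service."""
    available = {name.lower() for name in available_services}
    root = base_url.rstrip("/")
    cluster_part = quote(cluster, safe="")  # cluster names may contain spaces
    plan = []
    for service in START_ORDER:
        if service not in available:
            continue  # skip services the cluster does not have
        service_part = quote(service, safe="")
        plan.append(
            f"{root}/api/{api_version}/clusters/{cluster_part}"
            f"/services/{service_part}/commands/start"
        )
    return plan


if __name__ == "__main__":
    # Hypothetical cluster name and service list, for illustration only.
    for url in start_plan("http://localhost:7180", "Cloudera QuickStart",
                          ["hue", "sqoop", "zookeeper", "hdfs"]):
        print("POST", url)
```

Because the plan is built from `START_ORDER` and not from the cluster's own service list, any Sqoop service present on the cluster is skipped, in line with the note above.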

c. Monitoring the Cloudera Cluster

# Check Cluster Health Issues

# Check all recent commands

# Check all configuration issues

# Configuration – Disk Space Thresholds

# Configuration – Database Settings

# Configuration – Local Data Directories and Files

# Configuration – Log Directories

# Configuration – Navigator Settings

# Configuration – Ports

# Configuration – Advanced Configuration Snippets

# Cloudera Manager Operation

Cloudera Big Data Architect

  1. Flume
  2. Kafka & KSQL
  3. NiFi
  4. Informatica
  5. Talend
  6. Hive

a. Use WinSCP to open the Hive console

# Launch the terminal from WinSCP

# Start the Hive console by typing ‘hive’
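
Besides the interactive console, the Hive CLI accepts a query on the command line through its `-e` option. This is handy for a quick check that the console works. A small Python sketch that only builds the command is shown here; the helper name `hive_command` and the sample query are illustrative, and nothing is executed.

```python
import shlex


def hive_command(query=None):
    """Return the argument list for the Hive CLI.

    Without a query this is the interactive console; with a query,
    Hive runs it non-interactively via the -e option and exits.
    """
    if query is None:
        return ["hive"]
    return ["hive", "-e", query]


if __name__ == "__main__":
    # Print the commands as they would be typed in the terminal.
    print(shlex.join(hive_command()))
    print(shlex.join(hive_command("SHOW DATABASES;")))
```

The returned list can be passed to `subprocess.run` on a machine where Hive is installed.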

Below are some of the other tools available in Cloudera (I will try to prepare and share some basic training materials for them):

  1. HBase
  2. Kudu
  3. Impala
  4. Hue
  5. Mahout
  6. Spark
  7. Pig
  8. Storm
  9. Flink
  10. Cloudera Data Architect Best Practice
