- Cloudera Manager
a. Logging into Cloudera Manager
i. Start your AWS server
ii. Use a web browser (Google Chrome recommended) to log in to Cloudera Manager
# Below is a list of the default port numbers used by the Cloudera components
Copy your AWS public DNS address from your AWS Console
# Append the Cloudera Manager port number (7180) to the address as shown below, then paste it into your web browser's address bar:
xxxxxxxxx.ap-southeast-1.compute.amazonaws.com:7180
# Log in to Cloudera Manager with ‘cloudera’ as both the username and the password; you will then see the Cloudera Manager Dashboard
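The address-building step above can be sketched in Python. This is only an illustration: the function name `cloudera_manager_url` is hypothetical, and it assumes Cloudera Manager is served over plain HTTP on port 7180, as in the address shown above.

```python
def cloudera_manager_url(public_dns: str, port: int = 7180) -> str:
    """Build the Cloudera Manager address from an AWS public DNS name.

    Hypothetical helper: assumes plain HTTP on the given port.
    """
    host = public_dns.strip()
    # Drop a scheme if one was pasted along with the host name.
    for prefix in ("http://", "https://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    # Drop any trailing slash before appending the port.
    host = host.rstrip("/")
    return f"http://{host}:{port}"


if __name__ == "__main__":
    # Replace the placeholder with the public DNS copied from the AWS Console.
    print(cloudera_manager_url("xxxxxxxxx.ap-southeast-1.compute.amazonaws.com"))
```

Paste the printed address into the browser to reach the login page.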
b. Start the Cloudera Services Manually
# Start the services in the following order:
I. Zookeeper
II. YARN
III. HDFS
IV. Hive
V. Impala
VI. HBase
VII. Oozie
VIII. Hue
Note: Do not start the ‘Sqoop’-related services; otherwise, subsequent workshop tasks may encounter errors.
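For reference, the same start order could also be scripted against the Cloudera Manager REST API rather than clicked through in the dashboard. The sketch below is not part of the workshop steps and rests on several assumptions: the API version (`v19`), the cluster name, and the lowercase service names are placeholders that must match what your own Cloudera Manager reports, and the helper names are hypothetical.

```python
import base64
import urllib.request
from urllib.parse import quote

# Order taken from the list above; Sqoop is deliberately left out.
# The service names are assumptions; check the names used in your cluster.
START_ORDER = ["zookeeper", "yarn", "hdfs", "hive", "impala", "hbase", "oozie", "hue"]


def start_command_urls(base_url, cluster, api_version="v19", services=START_ORDER):
    """Return the 'start' command URL for each service, in start order."""
    root = f"{base_url.rstrip('/')}/api/{api_version}/clusters/{quote(cluster, safe='')}"
    return [f"{root}/services/{quote(name, safe='')}/commands/start" for name in services]


def start_services(base_url, cluster, username, password):
    """POST each start command in turn using HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    for url in start_command_urls(base_url, cluster):
        request = urllib.request.Request(
            url, data=b"", method="POST",
            headers={"Authorization": f"Basic {token}"},
        )
        with urllib.request.urlopen(request) as response:
            print(url, response.status)


if __name__ == "__main__":
    # Only prints the URLs; call start_services(...) against a live server.
    for url in start_command_urls("http://localhost:7180", "cluster"):
        print(url)
```

Start commands run asynchronously on the server, and this sketch does not wait for one service to finish starting before requesting the next, so confirm in the dashboard that each service is up before relying on it.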
c. Monitoring the Cloudera Cluster
# Check Cluster Health Issues
# Check all recent commands
# Check all configuration issues
# Configuration – Disk Space Thresholds
# Configuration – Database Settings
# Configuration – Local Data Directories and Files
# Configuration – Log Directories
# Configuration – Navigator Settings
# Configuration – Ports
# Configuration – Advanced Configuration Snippets
# Cloudera Manager Operation
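The cluster health check listed above can also be done programmatically. The Cloudera Manager REST API returns a list of services, each with a `healthSummary` field, from its `/clusters/{clusterName}/services` endpoint. The sketch below only filters an already parsed response; the function name is hypothetical and the sample data is invented for illustration, not taken from a real cluster.

```python
import json


def unhealthy_services(services_response: dict) -> list:
    """Return (name, healthSummary) pairs for services not reporting GOOD."""
    return [
        (item.get("name"), item.get("healthSummary"))
        for item in services_response.get("items", [])
        if item.get("healthSummary") != "GOOD"
    ]


if __name__ == "__main__":
    # Invented sample in the shape of a services listing (illustration only).
    sample = json.loads("""
    {"items": [
        {"name": "zookeeper", "serviceState": "STARTED", "healthSummary": "GOOD"},
        {"name": "hue", "serviceState": "STARTED", "healthSummary": "CONCERNING"}
    ]}
    """)
    for name, health in unhealthy_services(sample):
        print(f"{name}: {health}")
```

Any service returned here deserves a closer look under the health issues view in the dashboard.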
Cloudera Big Data Architect
- Flume
- Kafka & KSQL
- NiFi
- Informatica
- Talend
- Hive
a. Use WinSCP to open the Hive console
# Launch the terminal from WinSCP
# Start the Hive console by typing ‘hive’
Below are some of the other tools available in Cloudera (I will try to prepare some basic training materials and share them):
- HBase
- Kudu
- Impala
- Hue
- Mahout
- Spark
- Pig
- Storm
- Flink
- Cloudera Data Architect Best Practice