Hadoop Admin Training
Hadoop Admin training equips you with the knowledge and skills to plan, install, configure, manage, secure, monitor, and troubleshoot Hadoop clusters and Hadoop ecosystem components. The course blends interactive lectures, hands-on practice, and a job-oriented curriculum. This Big Data Hadoop training course gives you a comprehensive understanding of how to implement Hadoop successfully in real-life industry projects.
You will learn to recognize and identify daemons, to understand the normal operation of an Apache Hadoop cluster in both data storage and data processing, and to describe the features of current computing systems that motivate a system like Apache Hadoop.
Upon the completion of Hadoop Admin training, you will exhibit the following skills:
- Describe the fundamentals and components of Hadoop
- Explain the features, architecture, and security considerations of the Hadoop Distributed File System (HDFS)
- Provide an overview of the Hadoop ecosystem, covering tools for integration, analysis, and data storage and retrieval
- Understand the features, concepts, and architecture of MapReduce
- Plan, install, and configure Hadoop
- Apply Hadoop's security concepts and configure Kerberos security
- Manage and schedule jobs on a Hadoop cluster
- Utilize best practices for deploying, managing, and monitoring Hadoop clusters
- Install and manage other Hadoop projects, including Pig, Hive, HBase, and Sqoop
Administrators and aspiring administrators who want to develop skills in Hadoop cluster deployment and management are the ideal candidates for this training.
Candidates with working experience in UNIX will get the most out of this training.
- 1. The Case for Apache Hadoop
- 2. Hadoop Distributed File System
- 3. MapReduce
- 4. Overview of the Hadoop Ecosystem
- 5. Planning Your Hadoop Cluster
- 6. Hadoop Installation
- 7. Advanced Configuration
- 8. Hadoop Security
- 9. Managing and Scheduling Jobs
- 10. Cluster Maintenance
- 11. Cluster Monitoring and Troubleshooting
- 12. Populating HDFS From External Sources
- 13. Installing and Managing Other Hadoop Projects
- 14. Hadoop Distributed File System (HDFS)
1. The Case for Apache Hadoop
- Brief History of Hadoop
- Core Hadoop Components
- Fundamental Concepts
2. Hadoop Distributed File System
- HDFS Features
- HDFS Design Assumptions
- Overview of HDFS Architecture
- Writing and Reading Files
- NameNode Considerations
- An Overview of HDFS Security
- Hands-On Exercise
3. MapReduce
- What Is MapReduce?
- Features of MapReduce
- Basic MapReduce Concepts
- Architectural Overview
- MapReduce Version 2
- Failure Recovery
- Hands-On Exercise
4. Overview of the Hadoop Ecosystem
- What Is the Hadoop Ecosystem?
- Integration Tools
- Analysis Tools
- Data Storage and Retrieval Tools
5. Planning Your Hadoop Cluster
- General Planning Considerations
- Choosing the Right Hardware
- Network Considerations
- Configuring Nodes
6. Hadoop Installation
- Deployment Types
- Installing Hadoop
- Using Hadoop Manager for Easy Installation
- Basic Configuration Parameters
- Hands-On Exercise
7. Advanced Configuration
- Advanced Parameters
- Configuring Rack Awareness
- Configuring Federation
- Configuring High Availability
- Using Configuration Management Tools
8. Hadoop Security
- Why Hadoop Security Is Important
- Hadoop’s Security System Concepts
- What Kerberos Is and How It Works
- Configuring Kerberos Security
- Integrating a Secure Cluster with Other Systems
9. Managing and Scheduling Jobs
- Managing Running Jobs
- Hands-On Exercise
- FIFO Scheduler
- FairScheduler
- Configuring the FairScheduler
- Hands-On Exercise
10. Cluster Maintenance
- Checking HDFS Status
- Hands-On Exercise
- Copying Data Between Clusters
- Adding and Removing Cluster Nodes
- Rebalancing the Cluster
- Hands-On Exercise
- NameNode Metadata Backup
- Cluster Upgrading
11. Cluster Monitoring and Troubleshooting
- General System Monitoring
- Managing Hadoop’s Log Files
- Using the NameNode and JobTracker Web UIs
- Hands-On Exercise
- Cluster Monitoring with Ganglia
- Common Troubleshooting Issues
- Benchmarking Your Cluster
12. Populating HDFS From External Sources
- An Overview of Flume
- Hands-On Exercise
- An Overview of Sqoop
- Best Practices for Importing Data
13. Installing and Managing Other Hadoop Projects
- Hive
- Pig
- HBase
14. Hadoop Distributed File System (HDFS)
- HDFS Design
- HDFS Daemons
- HDFS Federation
- HDFS HA
- Securing HDFS (Kerberos)
- File Read and Write Paths