Module 1: Introduction to Big Data and Cassandra
- Introduction to Big Data and the problems it causes
- The 5 Vs: Volume, Variety, Velocity, Veracity and Value
- Traditional Database Management System
- Limitations of RDBMS
- NoSQL databases
- Common characteristics of NoSQL databases
- CAP theorem
- How Cassandra solves the limitations
- History of Cassandra
- Features of Cassandra
Module 2: Cassandra Data Model
- Introduction to Database Model
- Understand the analogy between RDBMS and Cassandra Data Model
- Understand the following Database Elements
- a. Cluster
- b. Keyspace
- c. Column Family/Table
- d. Column
- Column Family Options
- Columns
- Wide Rows, Skinny Rows
- Static and dynamic tables
Module 3: Cassandra Architecture
- Cassandra as a Distributed Database
- Key Cassandra Elements
- a. Memtable
- b. Commit log
- c. SSTables
- Replication Factor
- Data Replication in Cassandra
- Gossip protocol – Detecting failures
- Gossip: Uses
- Snitch: Uses
- Data Distribution
- Staged Event-Driven Architecture (SEDA)
- Managers and Services
- Virtual Nodes
- Write path and Read path
- Consistency level
- Repair
- Incremental repair
Module 4: Deep Dive into Cassandra Database
- Replication Factor
- Replication Strategy
- Defining columns and data types
- Defining a partition key
- Recognizing a partition key
- Specifying a descending clustering order
- Updating data
- Tombstones
- Deleting data
- Using TTL
- Updating a TTL
Module 5: Node Operations in a Cluster
- Cassandra nodes
- Specifying seed nodes
- Bootstrapping a node
- Adding a node (Commissioning) to a Cluster
- Removing (Decommissioning) a node
- Removing a dead node
- Repair
- Read Repair
- What’s new in incremental repair
- Run a Repair Operation
- Cassandra and Spark Implementation
Module 6: Managing and Monitoring the Cluster
- Cassandra monitoring tools
- Logging
- Tailing
- Using Nodetool Utility
- Using JConsole
- Learning about OpsCenter
- Runtime Analysis Tools
Module 7: Backup/Restore and Performance Tuning
- Creating a Snapshot
- Restoring from a Snapshot
- RAM and CPU recommendations
- Hardware choices
- Selecting storage
- Types of Storage to Avoid
- Cluster connectivity, security, and the factors that affect distributed system performance
- End-to-end performance tuning of Cassandra clusters against very large data sets
- Load balancing and streaming
Module 8: Hosting Cassandra Database on Cloud
- Security
- Ongoing Support of Cassandra Operational Data
- Hosting a Cassandra Database on Cloud