Description: A deep dive into the distributed database technology Apache Cassandra. We explore the problems that software applications face when they need to scale to handle billions of records while still maintaining low latency and high availability. Your team will leave the class with a strong understanding of how to model data and architect Cassandra in a way that provides maximum scalability for data-intensive applications.

Outcomes:

  • Learn how to architect Cassandra tables for query-first design
  • Learn the concepts behind DBMS and NoSQL databases
  • Strong understanding of the CAP theorem
  • Strong understanding of the Cassandra architecture and components
  • Strong understanding of the Cassandra data model
  • Strong understanding of Cassandra replication strategies
  • Understanding of Cassandra Consistency Tuning
  • Learn the tradeoffs between Consistency and Latency
  • Leave the class with the tools needed to continue to practice using Cassandra

Course Outline

Introduction to Big Data
  • Database Management Systems
  • Classical RDBMS and Their Limits
  • The era of Big Data and Scalable Web Apps
  • Problems with Traditional Solutions
  • Distributed Architecture
  • Distributed Caching Systems
  • Intro to NoSQL
  • CAP Theorem
Cassandra Architecture
  • History
  • What is Cassandra
  • Why Cassandra
  • Cluster Architecture
  • AP Systems
Cluster and Node Communication
  • Settings
  • Gossip Protocol
  • Handshake
  • Node Availability
  • Masterless
  • Coordinator
  • Failure
  • Patterns
Scalability and Availability
  • High Performance
  • Datacenter Deployment
  • Consistency
Cassandra Data Model
  • Column-Based Data Stores
  • Column Key and Value
  • Partition Keys
  • Column Families
  • Keyspaces
Cassandra Data Partitioning
  • Primary Keys
  • Partition Key
  • Clustering Key
  • Ring architecture
  • Partitioners
Cassandra Data Modeling
  • Logical Data Modeling
  • Query-First Design
  • CQL
  • Query Planning
Node Level Write Operations
  • Logging
  • Memtable
  • SSTable data file
  • Write Request CL
Node Level Read Operations
  • Read Path
  • Direct Requests
  • Digest Requests
  • Read Repair Mechanism
Data Replication
  • Built-in vs. Customizable
  • Replication Strategies
  • Replication Factor
Consistency Levels
  • CL Overview
  • Options
  • Write Requests
  • Read Requests
  • QUORUM
  • Data Loss
Monitoring Cassandra's Performance
  • nodetool
  • JConsole/JMX
  • Metrics
  • 3rd Party Tools
  • Cassandra in Production
