Duration: 2-3 days

Audience: System architects, developers, data engineers, DBAs, and anyone who wants to learn to use the Kafka messaging system for consuming data in their systems.

Prerequisites: Students should know at least one programming language (preferably Python, Java, or Scala) and be able to work from the command line in a Linux VM or container.

Description: We explore the Apache Kafka architecture and learn to configure a distributed messaging broker.

 

Objectives:

  • Learn the Apache Kafka architecture and data model
  • Learn about decoupled services and distributed systems
  • Learn to build robust systems using distributed messaging brokers
  • Learn best practices for configuring Kafka clusters in production
  • Write custom Kafka producers and consumers
  • Build an application that ingests data from a streaming API

Big Data and Distributed Systems Primer
  • Distributed Systems
  • High Availability
  • Latency and Scalability
  • Message Brokers and Queues
  • Decoupling Services
  • Lambda Architecture
  • Data Partitioning
Introduction to Apache Kafka
  • Distributed Systems
  • High Availability
  • Latency and Scalability
  • Message Brokers and Queues
  • Decoupling Services
  • Lambda Architecture
  • Data Partitioning
Apache Kafka Core Concepts
  • Kafka Guarantees/Message Ordering
  • Delivery Semantics
  • Dumb Broker vs. MOM
  • Kafka Semantics
Apache Kafka Cluster
  • Installing a Cluster
  • Brokers
  • Consumers
  • Producers
Apache Zookeeper
  • Cluster Management
  • Roles
  • Basic Operations
Apache Kafka Producers
  • Role of Producer
  • Records
  • Message Durability
  • Batching and Compression
  • Create Console Producer
  • Publishing Data to Topics
Apache Kafka Consumers
  • Role of Consumer
  • Offsets
  • Consumers and Logs
  • Create Console Consumer
  • Performance Tuning
  • Consumer Groups
  • Consumer Parallelism
  • Consumer Rebalancing
Apache Kafka API
  • Kafka Data Model
  • Topics
  • Partitions
  • Distribution
  • Reliability
  • Leaders/Followers
  • Replication Factor
  • Persistence
Kafka in Production
  • Producer API
  • Consumer API
  • Java, Scala, Python APIs
  • Creating/Modifying Topics
  • Partitioning Topics
  • Reading Data from Kafka
  • Writing Data to Kafka
Apache Kafka Streams
  • Big Data Pipelines
  • Microservices
  • Case Study: Netflix
  • Apache Spark
  • Storm and Hadoop
