{"id":13725,"date":"2015-10-27T10:00:01","date_gmt":"2015-10-27T10:00:01","guid":{"rendered":"https:\/\/www.whizlabs.com\/?p=13725"},"modified":"2020-09-01T07:04:26","modified_gmt":"2020-09-01T07:04:26","slug":"apache-kafka-what-is-it","status":"publish","type":"post","link":"https:\/\/www.whizlabs.com\/blog\/apache-kafka-what-is-it\/","title":{"rendered":"Apache Kafka \u2013 What Is It?"},"content":{"rendered":"<p style=\"text-align: justify\">For the uninitiated, Kafka is a distributed publish-subscribe messaging system that was created at LinkedIn, open sourced in 2011 and made a top-level Apache project in 2012. This post provides an overview of Kafka by presenting the ideas behind producers, topics, brokers and consumers.<\/p>\n<h4 style=\"text-align: justify\">Introduction to Kafka:<\/h4>\n<p style=\"text-align: justify\">Kafka, written in Scala and Java, is a scalable, high-throughput, replicated, partitioned log system. It was created at LinkedIn primarily to handle live data feeds, such as the activity streams that social media sites like Twitter, Facebook or LinkedIn itself generate. Later on, it was open sourced so that other organizations could adopt it as well. As in other messaging systems, messages are written to and read from a server, but a Kafka cluster is designed to do this at high throughput.<\/p>\n<p style=\"text-align: justify\">Kafka is considered to be a \u201cpublish-subscribe distributed messaging system\u201d rather than a \u201cqueue system\u201d because a message received from a producer is made available to every consumer group subscribed to its topic rather than to a single consumer.<\/p>\n<h4 style=\"text-align: justify\">Architecture of Kafka:<\/h4>\n<p style=\"text-align: justify\">Having seen the history of Kafka, let us move on to its architecture. 
These are the basic terms associated with the Kafka architecture: producer, broker, consumer and topic.<img decoding=\"async\" class=\"aligncenter size-full wp-image-13726\" src=\"https:\/\/www.whizlabs.com\/wp-content\/uploads\/2015\/10\/Kafka-cluster.png\" alt=\"Kafka cluster\" width=\"357\" height=\"186\" \/><\/p>\n<h5 style=\"text-align: justify\">Producer:<\/h5>\n<p style=\"text-align: justify\">Different producers, such as applications, relational databases and NoSQL stores, write data to the Kafka cluster. The Kafka cluster consists of many \u201cbrokers\u201d. Each \u201cbroker\u201d, in layman\u2019s terms, is a \u201cserver\u201d. A message can be given a key, and with the default partitioner all messages with the same key arrive at the same partition. The producer can keep writing messages to the Kafka cluster without waiting for each acknowledgement; how many acknowledgements it requires is configurable. It is this asynchronous way of producing and adding messages to the cluster that gives Kafka much of its speed, which is a necessity with today\u2019s live social media feeds.<\/p>\n<h5 style=\"text-align: justify\">Topic:<\/h5>\n<p style=\"text-align: justify\">Messages of a similar type are grouped into a \u2018Topic\u2019. A \u2018Topic\u2019 is similar to a \u2018File\u2019 structure. Messages are published to a \u2018Topic\u2019, and each \u2018Topic\u2019 is divided into one or more partitions.<\/p>\n<h5 style=\"text-align: justify\">Brokers:<\/h5>\n<p style=\"text-align: justify\">A \u201cbroker\u201d in Kafka plays a role similar to that of a traditional message \u201cbroker\u201d. It holds the messages written by the producers so that the \u2018consumers\u2019 can read them.<\/p>\n<p style=\"text-align: justify\">There are many \u201cbrokers\u201d or \u201cservers\u201d inside the Kafka cluster. Each \u201cbroker\u201d hosts one or more partitions and, as already stated, each partition belongs to a \u2018Topic\u2019. 
The brokers receive the messages and store them for a retention period of \u2018n\u2019 days, which can be configured. After the \u2018n\u2019 days have expired, the messages are discarded. It is important to note that Kafka does not check whether every consumer or consumer group has read the messages before discarding them.<\/p>\n<h5 style=\"text-align: justify\">Consumer:<img decoding=\"async\" class=\"aligncenter size-full wp-image-13727\" src=\"https:\/\/www.whizlabs.com\/wp-content\/uploads\/2015\/10\/Kafka-Under-The-Hood.png\" alt=\"Kafka Under The Hood\" width=\"606\" height=\"344\" \/><\/h5>\n<p style=\"text-align: justify\">After the \u201cproducers\u201d have produced messages and sent them to the Kafka brokers, the consumers read them. Each \u201cconsumer\u201d or \u201cconsumer group\u201d subscribes to one or more \u201ctopics\u201d and reads from the \u201cpartitions\u201d of the \u201ctopics\u201d it is subscribed to. If one of the brokers goes down, the other brokers take over its work and keep the system running smoothly.<\/p>\n<h5 style=\"text-align: justify\">ZooKeeper:<\/h5>\n<p style=\"text-align: justify\">ZooKeeper\u2019s primary responsibility is to coordinate the different components of the Kafka cluster. The producer hands the message to the \u201cleader\u201d broker for a partition, which writes the message to its own log and replicates it to other brokers. LinkedIn, Yahoo, Twitter, Pinterest, Tumblr, Goldman Sachs and Netflix are just a few examples of organizations that have adopted Kafka into their production systems.<\/p>\n<p style=\"text-align: justify\">This post gave an overview of Kafka followed by its architecture. 
Kafka will no doubt be embraced by more organizations as time goes by.<\/p>\n<p style=\"text-align: justify\">For more information on Kafka visit: kafka.apache.org<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For the uninitiated, Kafka is a distributed publish-subscribe messaging system that was created at LinkedIn, open sourced in 2011 and made a top-level Apache project in 2012. This post provides an overview of Kafka by presenting the ideas behind producers, topics, brokers and consumers. Introduction to Kafka: Kafka, written in Scala and Java, is a scalable, high-throughput, replicated, partitioned log system. It was created at LinkedIn primarily to handle live data feeds, such as the activity streams that social media sites like Twitter, Facebook or LinkedIn itself generate. Later on, it was open sourced so that other organizations could adopt it as [&hellip;]<\/p>\n","protected":false},"author":220,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[13],"tags":[148,1007,1359],"class_list":["post-13725","post","type-post","status-publish","format-standard","hentry","category-java","tag-apache-kafka","tag-kafka-big-data","tag-real-time-analytics"],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false,"profile_24":false,"profile_48":false,"profile_96":false,"profile_150":false,"profile_300":false,"tptn_thumbnail":false,"web-stories-poster-portrait":false,"web-stories-publisher-logo":false,"web-stories-thumbnail":false},"uagb_author_info":{"display_name":"Aditi Malhotra","author_link":"https:\/\/www.whizlabs.com\/blog\/author\/aditi\/"},"uagb_comment_info":0,"uagb_excerpt":"For the uninitiated, Kafka is a distributed publish-subscribe messaging system that was created at LinkedIn, open sourced in 2011 and made a top-level Apache project in 2012. This post provides an overview of Kafka by presenting the ideas behind producers, topics, brokers and consumers. 
Introduction to Kafka: Kafka written in Scala is a scalable, high throughput, replicated,&hellip;","_links":{"self":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/13725","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/users\/220"}],"replies":[{"embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/comments?post=13725"}],"version-history":[{"count":1,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/13725\/revisions"}],"predecessor-version":[{"id":75986,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/posts\/13725\/revisions\/75986"}],"wp:attachment":[{"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/media?parent=13725"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/categories?post=13725"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.whizlabs.com\/blog\/wp-json\/wp\/v2\/tags?post=13725"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}