{"id":1353,"date":"2021-07-01T06:18:28","date_gmt":"2021-07-01T06:18:28","guid":{"rendered":"https:\/\/blog.amt.in\/?p=1353"},"modified":"2021-07-01T06:18:28","modified_gmt":"2021-07-01T06:18:28","slug":"introduction-to-apache-kafka","status":"publish","type":"post","link":"https:\/\/blog.amt.in\/index.php\/2021\/07\/01\/introduction-to-apache-kafka\/","title":{"rendered":"Introduction to Apache Kafka"},"content":{"rendered":"<p>Apache Kafka is an open-source stream processing platform, originally created at LinkedIn and now developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a &#8220;massively scalable pub\/sub message queue architected as a distributed transaction log,&#8221; making it highly valuable for enterprise infrastructures that process streaming data.<\/p>\n<p>Additionally, Kafka connects to external systems (for data import\/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library.<\/p>\n<p>Kafka is generally used for two broad classes of applications:<\/p>\n<ol>\n<li>Building real-time streaming data pipelines that reliably move data between systems or applications<\/li>\n<li>Building real-time streaming applications that transform or react to streams of data<\/li>\n<\/ol>\n<p>Kafka stores messages that come from arbitrarily many processes called &#8220;producers&#8221;. The data can be divided into &#8220;partitions&#8221; within different &#8220;topics&#8221;. Within a partition, messages are indexed and stored together with a timestamp. Other processes called &#8220;consumers&#8221; can read messages from partitions. 
Kafka runs on a cluster of one or more servers, and the partitions can be distributed across cluster nodes.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-239 alignleft\" src=\"http:\/\/blog.amt.in\/wp-content\/uploads\/2017\/11\/kafka-apis-Nove-2017-300x252.png\" alt=\"\" width=\"411\" height=\"344\" \/><\/p>\n<p>Apache Kafka processes real-time and streaming data efficiently when deployed alongside Apache Storm, Apache HBase, and Apache Spark. Deployed as a cluster on multiple servers, Kafka handles its entire publish-and-subscribe messaging system through four APIs: the Producer API, Consumer API, Streams API, and Connector API. Its ability to deliver massive streams of messages in a fault-tolerant fashion has led it to replace some conventional messaging systems such as JMS and AMQP.<\/p>\n<p>The key terms in Kafka&#8217;s architecture are topics, records, and brokers. Topics consist of streams of records holding different information. Brokers, on the other hand, are responsible for replicating the messages. The four major APIs in Kafka are:<\/p>\n<ul>\n<li>Producer API &#8211; Permits applications to publish streams of records.<\/li>\n<li>Consumer API &#8211; Permits applications to subscribe to topics and process the streams of records.<\/li>\n<li>Streams API &#8211; Converts input streams to output streams and produces the result.<\/li>\n<li>Connector API &#8211; Runs reusable producers and consumers that link topics to existing applications.<\/li>\n<\/ul>\n<p>Due to its widespread integration into enterprise-level infrastructures, monitoring Kafka performance at scale has become an increasingly important issue. 
Monitoring end-to-end performance requires tracking metrics from brokers, consumers, and producers, in addition to monitoring ZooKeeper, which Kafka uses for coordination among consumers. Several monitoring platforms currently track Kafka performance, either open-source, like LinkedIn&#8217;s Burrow, or paid, like Datadog. In addition to these platforms, Kafka data can also be collected using tools commonly bundled with Java, including JConsole.<\/p>\n<p>This has been a brief introduction to Apache Kafka. Watch this space for the latest trends in technology.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Kafka is an open-source stream processing platform developed<\/p>\n","protected":false},"author":1,"featured_media":1355,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[72,821,7],"tags":[74,822,18],"class_list":["post-1353","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-apache-kafka","category-open-source-processing-platform","category-techtrends","tag-apache-kafka","tag-open-source-processing-plantform","tag-technology"],"_links":{"self":[{"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/posts\/1353","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/comments?post=1353"}],"version-history":[{"count":1,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/posts\/1353\/revisions"}],"predecessor-version":[{"id":1354,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp
\/v2\/posts\/1353\/revisions\/1354"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/media\/1355"}],"wp:attachment":[{"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/media?parent=1353"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/categories?post=1353"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.amt.in\/index.php\/wp-json\/wp\/v2\/tags?post=1353"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}