
Introduction to Confluent Kafka Python Producer

Apache Kafka is a publish-subscribe messaging system used for real-time streams of data. Apache Kafka lets you send and receive messages between various microservices. In this article, we will see how to send JSON messages using Python and the Confluent-Kafka library. JavaScript Object Notation (JSON) is a standard text-based format for representing structured data. It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
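
Rather than hand-writing JSON strings, we can build a payload with Python's standard json module and encode it to bytes; a minimal sketch (the employee fields here are just illustrative):

import json

# Build the message payload as a dict and serialize it to JSON bytes
employee = {"name": "Gal", "email": "Gadot84@gmail.com", "salary": "8345.55"}
payload = json.dumps(employee).encode("utf-8")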

Prerequisites:

  • Good knowledge of Kafka basic concepts (e.g., Kafka topics, brokers, partitions, offsets, producers, consumers, etc.).
  • Good knowledge of Python basics (pip install <package>, writing Python methods).

Solution:

A Kafka Python producer has different syntax and behavior depending on the Kafka library we use, so the first step is choosing the right Kafka library for our Python program.

Popular Kafka Libraries for Python:

While working on Kafka automation with Python, we have three popular library choices:

  1. PyKafka
  2. Kafka-python
  3. Confluent Kafka 
     

Each of these libraries has its own pros and cons, so we will have to choose based on our project requirements.

Step 1: Choosing the Right Kafka Library

If we are using Amazon MSK clusters, we can build our Kafka framework using PyKafka or Kafka-python (both are open source and popular for Apache Kafka). If we are using Confluent Kafka clusters, we have to use the Confluent Kafka library, as it provides support for Confluent-specific features like ksqlDB, REST Proxy, and Schema Registry.

We will use the Confluent Kafka library for our Python Kafka producer, as it can handle both Apache Kafka clusters and Confluent Kafka clusters.

We need Python 3.x and pip already installed. We can execute the command below to install the library on our system.

pip install confluent-kafka
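
After installation, we can do a quick sanity check that the library is importable and see which versions we have (confluent_kafka exposes version() for the Python client and libversion() for the underlying librdkafka):

import confluent_kafka

# Print the Python client version and the bundled librdkafka version
print(confluent_kafka.version())
print(confluent_kafka.libversion())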

Step 2: Kafka Authentication Setup

Unlike most Kafka Python tutorials available on the internet, we will not work on localhost. Instead, we will connect to a remote Kafka cluster with SSL authentication. To connect to Kafka clusters, we generally get one JKS file and one password for this JKS file from the infra support team. This JKS file works fine with Java/Spring, but not with Python.

So our job is to convert this JKS file into the format expected by the Python Kafka library. For the Confluent Kafka library, we need to convert the JKS file into PKCS12 format in order to connect to remote Kafka clusters.
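
One common way to do this conversion is with the JDK's keytool utility; a minimal sketch (the keystore file names here are examples and will differ in your setup):

keytool -importkeystore -srckeystore kafka.client.keystore.jks -srcstoretype JKS -destkeystore certkey.p12 -deststoretype PKCS12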

To learn more, visit the pages below:

  1. How to convert JKS to PKCS12?
  2. How to receive messages using Confluent Kafka Python Consumer

Step 3: Confluent Kafka Python Producer with SSL Authentication

We will use the same PKCS12 file that was generated during the JKS-to-PKCS12 conversion step mentioned above.

Python3

import time
import json
from uuid import uuid4
from confluent_kafka import Producer
 
jsonString1 = """ {"name":"Gal", "email":"Gadot84@gmail.com", "salary": "8345.55"} """
jsonString2 = """ {"name":"Dwayne", "email":"Johnson52@gmail.com", "salary": "7345.75"} """
jsonString3 = """ {"name":"Momoa", "email":"Jason91@gmail.com", "salary": "3345.25"} """
 
jsonv1 = jsonString1.encode()
jsonv2 = jsonString2.encode()
jsonv3 = jsonString3.encode()
 
def delivery_report(errmsg, msg):
    """
    Reports the success or failure of a message delivery.
    Args:
        errmsg (KafkaError): The error that occurred while producing the message, or None on success.
        msg (Message): The message that was produced.
    Note:
        In the delivery report callback, Message.key() and Message.value()
        will be in the binary format as encoded by any configured serializers,
        not the same objects that were passed to produce().
        If you wish to pass the original object(s) for key and value to the
        delivery report callback, we recommend a bound callback or lambda
        where you pass the objects along.
    """
    if errmsg is not None:
        print("Delivery failed for Message: {} : {}".format(msg.key(), errmsg))
        return
    print('Message: {} successfully produced to Topic: {} Partition: [{}] at offset {}'.format(
        msg.key(), msg.topic(), msg.partition(), msg.offset()))
 
kafka_topic_name = "kf.topic.empdev"
# Change your Kafka topic name here. For this example, let's assume our Kafka topic has 3 partitions ==> 0, 1, 2
# and we are producing messages uniformly to all partitions.
# We are sending the message as a byte array.
# If we want to read the same message from a Java consumer program,
# we can configure KEY_DESERIALIZER_CLASS_CONFIG = ByteArrayDeserializer.class
# and VALUE_DESERIALIZER_CLASS_CONFIG = ByteArrayDeserializer.class
 
mysecret = "yourjksPassword"
# You can call a remote API to get the JKS password instead of hardcoding it like above
 
print("Starting Kafka Producer")   
conf = {
    'bootstrap.servers': 'm1.msk.us-east.aws.com:9094,m2.msk.us-east.aws.com:9094,m3.msk.us-east.aws.com:9094',
    'security.protocol': 'SSL',
    'ssl.keystore.password': mysecret,
    'ssl.keystore.location': './certkey.p12'
}
         
print("connecting to Kafka topic...")
producer1 = Producer(conf)
 
# Trigger any available delivery report callbacks from previous produce() calls
producer1.poll(0)
 
try:
    # Asynchronously produce a message, the delivery report callback
    # will be triggered from poll() above, or flush() below, when the message has
    # been successfully delivered or failed permanently.
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv1, on_delivery=delivery_report)
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv2, on_delivery=delivery_report)
    producer1.produce(topic=kafka_topic_name, key=str(uuid4()), value=jsonv3, on_delivery=delivery_report)
     
    # Wait for any outstanding messages to be delivered and delivery report
    # callbacks to be triggered.
    producer1.flush()
     
except Exception as ex:
    print("Exception happened:", ex)
     
print("\n Stopping Kafka Producer")


Sample Output of the Above Code:

Starting Kafka Producer
connecting to Kafka topic...
Message: b'4acef7b3-dx55-5f89-b69r-18b3188f919z' successfully produced to Topic: kf.topic.empdev Partition: [1] at offset 43211
Message: b'98xff6y4-crl5-gfgx-dq1r-k3z5122h611v' successfully produced to Topic: kf.topic.empdev Partition: [2] at offset 43210
Message: b'rus3v9xx-0bd9-astn-mrtn-yyz1920evl6r' successfully produced to Topic: kf.topic.empdev Partition: [0] at offset 43211

Stopping Kafka Producer

Conclusion:

We now have an idea of how to publish JSON messages to a Kafka topic using Python. We can extend this code as per our project needs and continue modifying and developing our Kafka automation framework. We can also route messages to a specific Kafka partition based on some condition, instead of distributing them evenly across all partitions. To explore more on the Confluent Kafka Python library, we can visit: Confluent Docs
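
As a sketch of that idea, produce() accepts an explicit partition argument; the routing rule below is purely illustrative, reusing producer1, kafka_topic_name, and delivery_report from the code above:

# Route a message to an explicit partition based on a condition.
# Here, partition 0 is reserved for high-salary records (an illustrative rule).
employee = {"name": "Gal", "email": "Gadot84@gmail.com", "salary": "8345.55"}
target_partition = 0 if float(employee["salary"]) > 8000 else 1
producer1.produce(topic=kafka_topic_name,
                  key=str(uuid4()),
                  value=json.dumps(employee).encode(),
                  partition=target_partition,
                  on_delivery=delivery_report)
producer1.flush()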
