
Monday, March 14, 2016

How to setup a 2 Node Apache Kafka 0.9.0.1 cluster in CentOS 7?

Apache Kafka is one of the real-time message brokers used for real-time stream processing in the big data world.


The following needs to be done before beginning the Apache Kafka cluster setup.

1. Create 2 CentOS 7 servers, KFNODE1 and KFNODE2, as discussed in How to install CentOS 7 on Virtual Machine using VMWare vSphere 6 client?

2. Make sure Java 7 is installed and configured as default as discussed in How to install Java 7 and Java 8 in CentOS 7?.

3. Create the bigdatauser and bigdataadmin users and the bigdatagroup group as discussed in How to create a user, group and enable him to do what a super user can in CentOS7?

4. Make sure the firewall is disabled and stopped as discussed in How to turn off firewall on CentOS 7? 

5. Change the /etc/hosts file so that all the IPs and names of the servers are resolved, as discussed in How to setup DNS entries for big data servers in the cloud or not on a domain in /etc/hosts file? (a sample set of entries is sketched after this list)

6. Using the bigdatauser, set up passwordless ssh between the 2 cluster servers, namely KFNODE1 and KFNODE2, as discussed in How to setup password less ssh between CentOS 7 cluster servers?

7. Install the Apache ZooKeeper cluster as discussed in How to setup a 3 Node Apache Zookeeper 3.4.6 cluster in CentOS 7?. Make sure you do the same as in step 5 for those servers too.
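Here is a minimal sketch of the /etc/hosts entries referred to in step 5. The IP addresses below are placeholders, so replace them with the actual IPs of your servers.

//sample /etc/hosts entries on each server (placeholder IPs)
192.168.1.101   KFNODE1
192.168.1.102   KFNODE2
192.168.1.111   ZKNODE1
192.168.1.112   ZKNODE2
192.168.1.113   ZKNODE3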


For each of the servers KFNODE1 and KFNODE2, do the following.

Log in using the bigdataadmin user.

//create a folder for kafka under the /usr/local directory
cd /usr/local
sudo mkdir kafka

//change ownership to bigdatauser
sudo chown -R bigdatauser:bigdatagroup kafka

//create the data/log directory for kafka under /var/lib
cd /var/lib
sudo mkdir kafka

//change ownership to bigdatauser
sudo chown -R bigdatauser:bigdatagroup kafka

Switch to the bigdatauser

//download kafka
wget http://apache.claz.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz

//unpack the file
tar xzf kafka_2.11-0.9.0.1.tgz


//move the kafka installation from the download directory to /usr/local/kafka
mv kafka_2.11-0.9.0.1 /usr/local/kafka/

//switch to the kafka directory
cd /usr/local/kafka/kafka_2.11-0.9.0.1/



//switch to the config directory
cd config

Edit the config file and change the following:

vi server.properties


#The broker Id should be unique for KFNODE1 and KFNODE2
#KFNODE1
broker.id=1

#KFNODE2
broker.id=2

#change the data/log directory to
log.dirs=/var/lib/kafka


#include the zookeeper servers

zookeeper.connect=ZKNODE1:2181,ZKNODE2:2181,ZKNODE3:2181
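If you prefer not to edit the file by hand on each server, the same three changes can be applied with sed. This is just a sketch assuming the stock server.properties shipped with 0.9.0.1; run it on KFNODE1 and use broker.id=2 on KFNODE2.

//optional: apply the edits with sed instead of vi
sed -i 's/^broker.id=.*/broker.id=1/' server.properties
sed -i 's|^log.dirs=.*|log.dirs=/var/lib/kafka|' server.properties
sed -i 's/^zookeeper.connect=.*/zookeeper.connect=ZKNODE1:2181,ZKNODE2:2181,ZKNODE3:2181/' server.properties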

//move to the root of the kafka installation on each server and start the broker on both servers
cd /usr/local/kafka/kafka_2.11-0.9.0.1

//start kafka broker
bin/kafka-server-start.sh config/server.properties >/dev/null &
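Alternatively, kafka-server-start.sh can be started with its -daemon option so that it detaches on its own and writes its output to the logs directory under the installation; a minimal sketch:

//alternative: start the broker in daemon mode
bin/kafka-server-start.sh -daemon config/server.properties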


//if you need to stop the broker, kill its process id
kill <process id>

//check if the process is running
ps aux | grep kafka
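A quick way to get the process id for the kill above is pgrep. This sketch assumes only one Kafka broker JVM is running on the node (the broker's main class is kafka.Kafka).

//find the broker's process id
pgrep -f kafka.Kafka

//or stop it in one step
kill $(pgrep -f kafka.Kafka)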

//check that the kafka data/log folder under /var/lib/kafka is being populated
ls -l /var/lib/kafka


//There is no built-in UI for kafka, nor a command to query the broker list directly.
//We can use the create topic script to see if we have a cluster: here we attempt to create
//a topic with a replication factor of 3, and the error tells us how many brokers we have, i.e. 2.
bin/kafka-topics.sh --create --zookeeper ZKNODE1:2181,ZKNODE2:2181,ZKNODE3:2181 --replication-factor 3 --partitions 4 --topic testkfbrokers

//Error while executing topic command : replication factor: 3 larger than available brokers: 2 
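Once the broker count is confirmed, a topic that actually fits the 2 brokers can be created and verified with --describe; the topic name testkftopic below is just an example.

//create a topic with a replication factor that fits the 2 brokers
bin/kafka-topics.sh --create --zookeeper ZKNODE1:2181,ZKNODE2:2181,ZKNODE3:2181 --replication-factor 2 --partitions 4 --topic testkftopic

//list the partitions, leaders and replicas for the new topic
bin/kafka-topics.sh --describe --zookeeper ZKNODE1:2181,ZKNODE2:2181,ZKNODE3:2181 --topic testkftopic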
