Checking Whether the Client Node Uploads Data Directly to a DataNode, or to the NameNode Which Then Uploads to a DataNode

Sumit Rasal
3 min read · Nov 13, 2020

✴️ Set up a Hadoop cluster with one NameNode (master), four DataNodes (slaves), and one client node.
✴️ Upload a file from the client through the NameNode.
✴️ Check which DataNode the master chooses to store the file on.
✴️ Once the file is uploaded, read it from the client using the cat command. While the master is serving the file from the DataNode where it was stored, delete or crash that DataNode, and observe how the master uses the replicated copies to retrieve the file and present it to the client.
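The steps above can be sketched with the HDFS shell. The file name and path below are examples, not the ones we actually used:

```shell
# On the client node: upload a file. The NameNode only handles metadata
# and returns a list of DataNodes; the file data streams from the client
# to the chosen DataNode(s).
hdfs dfs -put notes.txt /user/test/notes.txt

# Ask the NameNode which DataNodes hold the file's blocks and replicas.
hdfs fsck /user/test/notes.txt -files -blocks -locations

# Read the file back from the client.
hdfs dfs -cat /user/test/notes.txt
```

The `fsck` output answers the second bullet directly: it lists the block IDs and the DataNode addresses that store each replica.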

This task was done as a team of four members:
1. Sumit
2. Raj
3. Shivani
4. Gursimar

We launched the master on an AWS EC2 instance and connected the DataNodes to it. We connected three DataNodes, each member launching one, and Shivani configured her virtual machine as the client.

On the client we only installed the Hadoop software; on the NameNode and the DataNodes we followed the standard configuration process.
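The standard configuration boils down to pointing every node at the master. A minimal sketch of the two files involved, with a placeholder master address (our actual IP and replication factor may have differed):

```xml
<!-- core-site.xml, on every node: where the NameNode lives -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://MASTER_PUBLIC_IP:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml, on the DataNodes: local storage directory and
     replication factor (3 is the Hadoop default) -->
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

The client needs only `core-site.xml`, which is why installing the Hadoop software and setting `fs.defaultFS` is enough on that node.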

The image above shows that we launched the master node and one DataNode on the AWS cloud, and the other DataNodes on each member's local machine.

To check whether the client node uploads data directly to a DataNode, or to the NameNode which then uploads it to a DataNode, we used the tcpdump command and captured all packets on the relevant port numbers.
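A sketch of the capture commands we mean. The interface name and the port numbers are the common defaults and may differ per setup: 50010 is the Hadoop 2.x DataNode data-transfer port (9866 in Hadoop 3.x), and 9000 matches the `fs.defaultFS` port (8020 is another common default):

```shell
# On a DataNode: capture traffic on the data-transfer port.
# If file bytes arrive here straight from the client's IP, the upload
# goes directly to the DataNode.
tcpdump -i eth0 -nn port 50010

# On the NameNode: capture RPC traffic. Only small metadata packets
# should appear here during an upload, not the file contents.
tcpdump -i eth0 -nn port 9000
```

Seeing the client's IP as the source on the DataNode's data port, while the NameNode sees only short RPC exchanges, is what confirms that data flows client-to-DataNode directly.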

This packet capture is from Sumit's DataNode.

These packets were captured on Gursimar's DataNode.

We also checked whether, while reading the file, the client fetches the data from the master node or directly from a DataNode.
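The read-side check, including the DataNode-crash scenario from the task list, can be sketched as follows (assuming a replication factor greater than 1, so a replica survives the crash):

```shell
# On the client: read the file; the NameNode returns block locations,
# then the data streams from a DataNode to the client.
hdfs dfs -cat /user/test/notes.txt

# On the DataNode currently serving the block: stop the daemon
# mid-read to simulate a crash.
hdfs --daemon stop datanode        # Hadoop 3.x
# hadoop-daemon.sh stop datanode   # Hadoop 2.x equivalent

# Re-run the read on the client: it still succeeds, because the
# NameNode redirects the client to a surviving replica.
hdfs dfs -cat /user/test/notes.txt
```

Running tcpdump on the client during the read shows the same pattern as the upload: metadata to and from the NameNode, file bytes to and from a DataNode.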

Thank you for reading!


Sumit Rasal

Technologies I know: MLOps | DevOps | Docker | RHCSA | CEH