Hadoop Tutorial: Intro to HDFS

In this presentation, Sameer Farooqui introduces the Hadoop Distributed File System (HDFS), an Apache open-source distributed file system designed to run on commodity hardware.

He’ll cover:

– Origins of HDFS and the Google File System (GFS)
– How a file is split into blocks before being distributed across a cluster
– NameNode and DataNode basics
– Technical architecture of HDFS
– Sample HDFS commands
– Rack Awareness
– Synchronous write pipeline
– How a client reads a file

** Interested in taking a class with Sameer? Check out https://newcircle.com/category/big-data

Post Author: hatefull