Tuesday, August 20, 2019

HDFS Blocks and Nodes

Blocks

A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size. Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally 512 bytes. This is generally transparent to the filesystem user, who is simply reading or writing a file of whatever length. However, there are tools that perform filesystem maintenance, such as df and fsck, which operate at the filesystem block level.

HDFS, too, has the concept of a block, but it is a much larger unit: 64 MB by default in Hadoop 1, and 128 MB in Hadoop 2 and later (the dfs.blocksize property). As in a filesystem for a single disk, files in HDFS are broken into block-sized chunks, which are stored as independent units. Unlike a filesystem for a single disk, however, a file in HDFS that is smaller than a single block does not occupy a full block's worth of underlying storage: a 1 MB file stored with a 128 MB block size uses 1 MB of disk, not 128 MB. A short sketch of this chunking appears below.

Why Is a Block in HDFS So Large?

HDFS blocks are large compared to disk blocks in order to minimize the cost of seeks. If a block is large enough, the time to transfer the data from the disk is significantly longer than the time to seek to the start of the block, so reading a large file made of multiple blocks proceeds at close to the disk's sustained transfer rate. The back-of-the-envelope calculation below makes this concrete.
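To make the chunking concrete, here is a minimal Python sketch (this is not HDFS client code; the split_into_blocks name is illustrative, and the 128 MB figure is the Hadoop 2+ default) of how a file's size maps onto block-sized chunks, with the final chunk occupying only its actual size. You can inspect the real block layout of a file with HDFS's own checker, e.g. hdfs fsck <path> -files -blocks.

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the Hadoop 2+ default (dfs.blocksize)

def split_into_blocks(file_size_bytes, block_size=BLOCK_SIZE):
    # Return the chunk sizes a file of the given size is stored as.
    full_blocks, remainder = divmod(file_size_bytes, block_size)
    chunks = [block_size] * full_blocks
    if remainder:
        chunks.append(remainder)  # a sub-block tail uses only its actual size
    return chunks

# A 300 MB file becomes two full 128 MB blocks plus one 44 MB chunk:
print([c // (1024 * 1024) for c in split_into_blocks(300 * 1024 * 1024)])
# -> [128, 128, 44]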
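And here is the back-of-the-envelope arithmetic behind the block-size choice, using illustrative figures (a 10 ms seek time and a 100 MB/s sustained transfer rate; the real numbers depend on the hardware):

seek_time_s = 0.010        # time to position the disk head at the block
transfer_rate_mb_s = 100   # sustained sequential read rate
target_overhead = 0.01     # keep seeks to ~1% of total read time

# The block must take ~(1 / overhead) times the seek time to transfer:
block_mb = transfer_rate_mb_s * seek_time_s / target_overhead
print(f"Block size for ~1% seek overhead: {block_mb:.0f} MB")  # -> 100 MB

With these figures, a block of roughly 100 MB keeps seek time to about 1% of transfer time, which is why defaults in the 64-128 MB range are reasonable.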
Reference: IBM and Apache Hadoop
