Disk Efficiency

Every file on your system is stored in clusters on your hard drive. A cluster can hold data from only one file, so space is wasted whenever a file does not fill its last cluster. The current FAT version (FAT16) organises files in 32K clusters on drives over 1.2 GB, while FAT32 will use a minimum cluster size of 4K. This means that a 3K file wastes only 1K of disk space on FAT32, while it wastes 29K on such a FAT16 drive. In bad cases this wastage can add up to over 50% of a 2 GB drive. See the table below.


Average Cluster Efficiency

Note: Disk Size does not apply to FAT32 where the 4K cluster is usually used.

Cluster Size    Efficiency    Disk Size (applies to FAT16 only)
2K              98.4%         0-127 MB
4K              96.6%         128-255 MB
8K              92.9%         256-511 MB
16K             85.8%         512-1023 MB
32K             73.8%         1024-2047 MB
64K             56.6%         over 2047 MB

 

What's a cluster and why does cluster size matter?

The whole problem of wasted space arises from the fact that DOS allocates file space in "clusters". Clusters are sequentially numbered on the disk, starting at 0, and cluster numbers are used both in the FAT (file allocation table) and in the individual directory entry for each file.

Allocation by clusters means some space on the disk will be wasted. Regardless of the actual length of a file as reported by the DIR command, the file will actually occupy a whole number of clusters on the disk. So a 1-byte file will actually use a whole cluster, a file that's 1 cluster plus 1 byte long will use 2 clusters, and so on.
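
The whole-cluster rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how DOS itself is implemented; the function names clusters_used and slack are made up for this example, and an empty file is treated as using no clusters for simplicity.

```python
KB = 1024

def clusters_used(file_size, cluster_size):
    """Number of whole clusters a file of file_size bytes occupies."""
    if file_size <= 0:
        return 0  # simplification: an empty file uses no clusters
    return -(-file_size // cluster_size)  # ceiling division

def slack(file_size, cluster_size):
    """Bytes allocated to the file but not used by its data."""
    return clusters_used(file_size, cluster_size) * cluster_size - file_size

print(clusters_used(1, 32 * KB))            # 1 (a 1-byte file uses a whole cluster)
print(clusters_used(32 * KB + 1, 32 * KB))  # 2 (1 cluster plus 1 byte uses 2 clusters)
print(slack(3 * KB, 32 * KB) // KB)         # 29 (KB wasted by a 3K file in a 32K cluster)
print(slack(3 * KB, 4 * KB) // KB)          # 1 (KB wasted by a 3K file in a 4K cluster)
```

The last two lines reproduce the 3K-file comparison between 32K and 4K clusters.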

Is this serious? It can be, depending on the pattern of file sizes on your disk. For instance, if you have a 2 GB disk with 5,000 files on it, about 100 MB of your disk is being wasted. And the figures can be much worse, depending on the pattern of your usage. One user reported copying 450 MB of files to a 1.6 GB disk and having them take up 600 MB! As your disk approaches being full, you may wish you could squeeze some extra space out of it instead of buying a new disk.

 

How does cluster size depend on hard-disk size?

The table above lists the cluster size for various partition sizes so that you can make intelligent choices about how to partition your hard disk.

From the above table we see that even a 2.1 GB drive is over the 1023 MB limit for 16 KB clusters, and therefore its cluster size (unpartitioned) is at least 32 KB. With a 32 KB cluster, even a 1-byte file will use 32 KB of disk space. A file whose length is 32,769 to 65,536 bytes will likewise use two clusters (64 KB), and so on for higher file sizes.

Even so, you may be inclined to think this is no big deal. But think about it: if you have a 2.1 GB drive with 5,000 files, you're probably wasting about 160 MB.


How are cluster sizes determined?

Clusters are always some power of 2 times 512 bytes, but just which power of 2 depends on the disk size. Why should this be so? I mentioned above that clusters are numbered sequentially. The problem is that the directory structure and the FAT have room for only 16 bits for a cluster number. Since the largest unsigned number that will fit into a 16-bit field is 2^16-1 = 65535, the disk can hold at most 2^16 = 65536 clusters. This gives the formula

                     disk size
     cluster size = -----------, rounded up to a power of 2
                       65536
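
As a rough sketch, the formula above can be written in Python as follows. The function name fat16_cluster_size and its parameters are hypothetical, chosen for this example; it applies the simplified formula only, so a partition sitting exactly on a size boundary may land in a different row than the table shows, and very small disks come out below the table's 2K minimum.

```python
MB = 1024 * 1024

def fat16_cluster_size(disk_bytes, max_clusters=65536, sector=512):
    """Smallest power-of-2 multiple of 512 bytes such that
    max_clusters clusters are enough to cover the whole disk."""
    size = sector
    while size * max_clusters < disk_bytes:
        size *= 2  # double the cluster size until the disk fits
    return size

for disk_mb in (100, 200, 400, 1000, 2000):
    kb = fat16_cluster_size(disk_mb * MB) // 1024
    print(f"{disk_mb:>5} MB disk: {kb} KB clusters")
```

For these sample sizes the results are 2, 4, 8, 16 and 32 KB, in line with the table rows they fall into.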

On average, the wasted space per file will be about half a cluster. We'll explore the implications of this after we look at cluster sizes for various disk sizes.
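
The half-a-cluster rule of thumb gives a quick estimate of total waste. The sketch below assumes the rule holds on average, which is only reasonable when file sizes are spread fairly evenly relative to the cluster size; a disk full of very small files will waste more. The name estimated_waste is made up for this example.

```python
KB = 1024
MB = 1024 * KB

def estimated_waste(num_files, cluster_size):
    """Rule-of-thumb estimate: half a cluster wasted per file, in bytes."""
    return num_files * cluster_size // 2

for cluster_kb in (4, 32, 64):
    waste = estimated_waste(5000, cluster_kb * KB)
    print(f"{cluster_kb:>2} KB clusters, 5,000 files: about {waste / MB:.0f} MB wasted")
```

For 5,000 files this works out to roughly 10 MB with 4 KB clusters, 78 MB with 32 KB clusters and 156 MB with 64 KB clusters.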
