Fragmentation slows your computer down.
Disk fragmentation isn't as complicated as it may sound. Fragmentation is just another way of saying randomly scattered. You're probably not used to fragmented information in the real world, but imagine a library that kept the chapters of a single book on different shelves: you'd have to use the index to find each chapter and put them all in the right order before you could read the book.
Let me present you with a slightly different metaphor that may help explain why file systems get fragmented. Imagine a very long bulletin board. Assume this bulletin board is 11in. high — the height of standard US letter — and hundreds of feet long (you may imagine it to be in a really long hallway). Now imagine that there's a printout of an empty spreadsheet pinned at the very beginning, and a sharp pencil with a nice eraser hanging from a string right next to it.
Because the spreadsheet is blank you can put pieces of paper anywhere on the board. Let's say you wanted to add a three-page story to the bulletin board. You could put it anywhere on the wall, but for convenience you put the pages all in a row right after the spreadsheet. Then all you have to do is add a line in the spreadsheet that says there is a story that spans the first through the third piece of paper on the board (the spreadsheet doesn't count).
Now another person comes along and wants to put up a flier for a band. They can put it anywhere on the board other than the first three page lengths. They decide to put it right after your story and add a line to the spreadsheet saying that it's occupying space 4.
You just had a great idea and decided that you'd like to add another page to your story! Unfortunately there's no room left behind your original post. You've got three options.
- Take off all of your pages and put them up, with the additional page, after the other person's flier, and change the spreadsheet.
- Move the person's flier, change the spreadsheet, add your additional page, and update the spreadsheet once more.
- Put your new page up after the other person's flier and add a new line to the spreadsheet that says your original story is continued at place 5.
Hard drives (or, more precisely, the file systems that manage them) take the third option. Why do they do this? Because it's faster. What if your original story was a three hundred page novel, and instead of a flier someone put up all the pages of War and Peace after your novel? Relocating all those pages would take a long time. Instead, it's easier to simply add your new pages wherever you can fit them and make note of it on the spreadsheet.
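If it helps to see the third option spelled out, here is a minimal sketch of the bulletin board in Python. It is a toy model, not how any real file system is implemented: the ten-place board, the `post` function, and the `table` dictionary standing in for the spreadsheet are all made up for illustration.

```python
# Toy model of the bulletin board. `board` is the wall, `table` is the
# spreadsheet. Every name here is hypothetical, chosen for illustration.

def post(board, table, name, pages):
    """Pin `pages` pages of `name` into the first free places, left to right."""
    free = [i for i, owner in enumerate(board) if owner is None][:pages]
    if len(free) < pages:
        raise RuntimeError("not enough free space on the board")
    for i in free:
        board[i] = name
    # Places are numbered from 1, as in the story above.
    table.setdefault(name, []).extend(i + 1 for i in free)

board = [None] * 10   # an assumed ten-place board
table = {}

post(board, table, "story", 3)   # your three-page story
post(board, table, "flier", 1)   # the band flier
post(board, table, "story", 1)   # the extra page: option three

print(table)  # {'story': [1, 2, 3, 5], 'flier': [4]}
```

Nothing already on the board gets moved; the only extra work is one more note in the spreadsheet, which is why this option is the fast one.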
At this point your story is fragmented because it's posted in multiple places on the bulletin board. Pretend that the rest of the board gets filled up. You no longer want your story up, so you erase it. Instead of taking down all your pages you can simply erase their entry from the spreadsheet, and other people can just put things up over your story.
Hard drives work the same way, which is why files can be recovered even after you've "erased" them. Writing zeros over your file on a hard drive would do the same thing as removing your pages from the board in this example; it takes a while but makes the file much harder to recover.
Now imagine you want to put up a two-page notice. Because you just erased your story you can put it up at the first two places. That leaves only places 3 and 5 open out of the ones your story used. If someone wants to post a large document they might have to split it across three or more small gaps like these, scattered about the board.
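Here is the erase-and-reuse step as a small sketch, under the assumption of a ten-place board whose places 6 through 10 have been taken by other people's documents. The `erase` and `free_places` helpers are hypothetical names for this example only.

```python
# The board after the story (1-3 and 5) and the flier (4) went up and the
# remaining places (assumed here to be 6-10) filled with other documents.
board = ["story", "story", "story", "flier", "story"] + ["other"] * 5
table = {"story": [1, 2, 3, 5], "flier": [4], "other": [6, 7, 8, 9, 10]}

def erase(table, name):
    """Erase only the spreadsheet entry; the pages stay pinned to the board."""
    return table.pop(name)

def free_places(board, table):
    """A place is free if no spreadsheet entry claims it, whatever is pinned there."""
    claimed = {p for places in table.values() for p in places}
    return [p for p in range(1, len(board) + 1) if p not in claimed]

erase(table, "story")
print(board[0])                   # story  (the old page is still on the wall)
print(free_places(board, table))  # [1, 2, 3, 5]

# Pin a two-page notice over the first two free places.
for place in free_places(board, table)[:2]:
    board[place - 1] = "notice"
    table.setdefault("notice", []).append(place)

print(free_places(board, table))  # [3, 5]
```

Until something is pinned over them, the erased pages are still readable, which is the whole idea behind file recovery tools.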
Starting to get the picture? Because it's easier to write to wherever there's free space than to shift files around, hard drives end up scattering data about the disk. The more files grow, shrink, and get deleted, the more fragmentation occurs.
You've probably already guessed the downside. If you wanted to read a long story from the bulletin board you might have to go to several different places and keep referring to the spreadsheet at the beginning to find where to read next. The same thing happens with hard drives, which is why read times are greatly increased on fragmented hard drives.
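One rough way to put a number on that downside is to count how many separate stretches of wall a document occupies, since each stretch means another walk down the hallway. The `extents` function below is a hypothetical helper for this purpose, and it assumes the places are listed in reading order.

```python
def extents(places):
    """Group place numbers (in reading order) into (start, end) contiguous runs."""
    runs = []
    for p in places:
        if runs and p == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], p)   # extend the current run
        else:
            runs.append((p, p))           # a gap: start a new run
    return runs

print(extents([1, 2, 3, 4]))          # [(1, 4)]          one trip
print(extents([1, 2, 3, 5]))          # [(1, 3), (5, 5)]  two trips
print(len(extents([3, 5, 11, 12])))   # 3
```

On a real hard drive each extra run costs a head seek, which is what makes fragmented reads slower in principle.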
Defragmenting, at its most basic level, serves to simply make all the documents on the bulletin board consecutive. There are other types of fragmentation that can be resolved, but that's beyond the scope of this explanation. If you could only rearrange the board by taking individual pages and moving them to unused places to shift things around and make everything contiguous, it would probably take you a very long time. The same thing is true for hard drives. And just as if you had a much longer bulletin board, a much larger hard drive takes considerably longer to defragment.
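As a sketch of what "making everything consecutive" means, the function below repacks a spreadsheet so every document sits in one unbroken run starting from place 1. It is a simplification with a made-up name (`defragment`): it only computes the final layout and counts how many pages end up somewhere new, and it ignores the shuffling through unused places that a real defragmenter has to do along the way.

```python
def defragment(table):
    """Return (new_table, pages_moved) with every document made contiguous.

    Documents are packed from place 1 in the order of their first page.
    `pages_moved` is a rough stand-in for how long the job would take.
    """
    new_table, moved, next_place = {}, 0, 1
    for name, places in sorted(table.items(), key=lambda kv: kv[1][0]):
        new_places = list(range(next_place, next_place + len(places)))
        moved += sum(old != new for old, new in zip(places, new_places))
        new_table[name] = new_places
        next_place += len(places)
    return new_table, moved

table = {"story": [1, 2, 3, 5], "flier": [4]}
print(defragment(table))  # ({'story': [1, 2, 3, 4], 'flier': [5]}, 2)
```

Even in this tiny case, fixing one stray page means moving two, and the count only grows with the length of the board.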
That raises the question of whether it's worth it to defragment, and if so, how often. I attempted to answer that question for HFS+, the Apple formatting scheme, in a previous article. I measured the "roboficiency" both before and after defragmenting, then compared the gain to the time the defragmentation took and calculated how much computer use it would take to make up for it. I found that defragmenting my hard drive at that time increased my overall computer speed by 11.52% and took about 2.5 hours. I calculated that after using my computer for 22 hours, running 11.52% faster would have saved me the 2.5 hours I spent defragmenting.
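The break-even arithmetic is simple enough to write down. This sketch assumes the simplest possible model, in which an 11.52% speedup saves 0.1152 hours for every hour of use; the function name is my own.

```python
def break_even_hours(defrag_hours, speedup):
    """Hours of use before the time saved equals the time spent defragmenting.

    `speedup` is a fraction (0.1152 for 11.52%). Assumes each hour of use
    saves `speedup` hours, and that the speedup never fades.
    """
    return defrag_hours / speedup

print(round(break_even_hours(2.5, 0.1152), 1))  # 21.7
```

That rounds to the 22 hours quoted above.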
What that didn't take into account was the fact that while you're using your computer for 22 hours, your hard drive is getting re-fragmented and hence reducing your roboficiency. That's why I recently did a study to find the fragmentation rate of an HFS+ hard drive.
Method:
I recorded the amount of fragmentation on my hard drive, then defragmented it with a tool called iDefrag. Then, for a month, I held the Shift key after login to keep any applications from starting, ran a tool called XBench (which I've previously shown to be the most accurate OS X benchmarker) to measure my computer and hard drive speed, and then used iDefrag to measure the amount of fragmentation on the disk.
Results:
What I found was that the number of fragments on my hard drive grew as fragments = 0.212x² + 3.5427x + 34.357, where "x" is measured in days. It's important to note that I wasn't using my computer for anything highly intensive at the time. I pretty much only wrote papers, surfed the internet, talked online and did basic web programming. I didn't install any software, I didn't download tons of files, I didn't do desktop programming, I didn't crack any encryption, I didn't run virtualization software, and I didn't edit a movie or a song. All of those actions require much more reading from and writing to the hard drive and are likely to cause lots of fragmentation.
Another important note is that I partitioned the hard drive I was doing this on. One of the main reasons I did that is that partitioning your hard drive separates fragmentation. In this particular configuration, which isn't exactly ideal, I have my operating system and applications on one partition and all of my files on another. That way my file storage partition doesn't get fragmented by all the files that are constantly being read and written by my operating system. For this test I found that the fragment count on my storage partition grew only as fragments = -0.0392x² + 1.9741x - 1.0912.
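To get a feel for what those two curves mean, here they are evaluated at a few points. The coefficients are the ones reported above; the function names and the choice of days 0, 7 and 30 are just for illustration.

```python
def system_fragments(days):
    """Fitted fragment count on the system partition after `days` days."""
    return 0.212 * days**2 + 3.5427 * days + 34.357

def storage_fragments(days):
    """Fitted fragment count on the storage partition after `days` days."""
    return -0.0392 * days**2 + 1.9741 * days - 1.0912

for day in (0, 7, 30):
    print(day, round(system_fragments(day), 1), round(storage_fragments(day), 1))
# 0 34.4 -1.1
# 7 69.5 10.8
# 30 331.4 22.9
```

These are curve fits to one month of data, so they shouldn't be pushed much past day 30. The storage fit, for instance, starts slightly below zero and, because of its negative x² term, would peak at around day 25 and then decline if extrapolated.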
Surprisingly, there seemed to be no correlation whatsoever between hard drive fragmentation and overall computer speed. While the initial defragmentation did speed up my computer by 7%, after only four days my computer was running slower than when I started! For the rest of the month my computer speed increased and decreased seemingly at random while my fragmentation rose steadily at the rate above.
Conclusions:
- Fragmentation is not highly related to computer speed on HFS+ volumes.
- The fragment count on a system partition grows as y = 0.212x² + 3.5427x + 34.357 (x in days) for light use.
- The fragment count on a storage partition grows as y = -0.0392x² + 1.9741x - 1.0912 (x in days) for light use.
It is possible that other maintenance software, like Unix cron scripts, was responsible for the random fluctuations in my system speed.
It is also important to remember that HFS+ drives do some automatic defragmentation for files smaller than 20 MB.