The Adaptec 2810SA RAID-5 controller in my FreeBSD fileserver has gone tits-up. Unfortunately, this happened in the final stretch of business school, right after my primary device (Powerbook G4) had also died, when I had neither time nor money nor nerves to fix the thing.
Once I got around to looking at it, I realized that I needed to educate myself from the ground up on how to deal with this. Data recovery services such as Kroll OnTrack (no, I’m not linking to them) were of no help, as they were either prohibitively expensive or incompetent, or both. One (very nice) tech from OnTrack France went so far as to tell me that if I managed to recover my data on my own, I could go work for them directly.
I finally managed to start recovering things thanks to a lot of sweat and panicking, teaching myself about forensics tools and disk layouts, some very kind donations and other material assistance from friends (still paying off MBA loans) and the excellent help of CGSecurity’s Christophe Grenier and my friend Adrian Steinmann, FreeBSD guru despite his claims to the contrary. I will try to lay out the steps, terminology and tools I came across very simply here, in the hopes that it will help someone else in my situation.
I. The Setup
The box is an Intel-based FreeBSD 6.3 server, with 8x250GB SATAII drives and about a gig of memory. This yielded a bootable 1629GB array. I had several blogs, a music streaming server, a backup mail server, all my digital photos from the last 8 years as well as apps, games, every personal document I ever scanned in and generally a shitload of stuff on it. Filesystem layout was approximately as follows:
/ (root) — approximately 10 gigs
/usr — approximately 20 gigs
/data — approximately 1580 gigs
The latter is the important one, as it contains almost all my data.
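A quick sanity check of the 1629GB figure, assuming (my guess, not something the controller documents here) that it reports binary gigabytes while the drives are sold in decimal ones. RAID-5 gives up one drive's worth of space to parity:

```shell
#!/bin/sh
# Usable RAID-5 capacity in binary gigabytes (2^30 bytes), rounded down.
# $1 = number of drives, $2 = size of each drive in decimal GB (10^9 bytes).
# One drive's worth of space goes to parity, hence (drives - 1).
raid5_usable_gib() {
    drives=$1
    size_gb=$2
    echo $(( ($drives - 1) * $size_gb * 1000000000 / 1073741824 ))
}

raid5_usable_gib 8 250
```

For eight 250GB drives this prints 1629, which matches the array size above.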
Friends had warned me about the inadvisability of relying on hardware RAID, as Adaptec might stop selling my controller model (this is exactly what happened), but as you will see, this was actually a very minor problem.
II. First Steps — The Array
I should have followed the instructions in the man page for the excellent application scan_ffs:
The basic operation of this program is as follows:
1. Panic. You usually do so anyways, so you might as well get it over
with. Just don’t do anything stupid. Panic away from your machine.
Then relax, and see if the steps below won’t help you out.
Sage advice.
The controller card’s failure manifested itself through a shrill whistling. Shutting down and rebooting the machine yielded a “no boot device found” error.
Entering the controller configuration menu, I was confronted with a dialog box stating that “The array configuration has changed — accept or reject changes.” Rejecting changes meant I could neither see the array nor work on it. Accepting changes showed me a degraded array that would not respond.
III. Rescue Boot Drive
Before booting again, I installed an IDE drive as a boot device (something I will do from now on anyway; booting from a non-RAID drive gives you some flexibility should something go badly wrong). I set up a bare-bones FreeBSD 6.3 installation on it. This was not easy, as my floppy drive had also died and I had no CD drive; I had to put the IDE drive into my Windows XP workstation to run the installation, and that machine’s power supply subsequently failed. Awesome. However, when I booted from the IDE drive and tried to attach the failed array, it found no aacd device.
IV. Re-Creating the Array
I found a website that advised me to enter the controller BIOS, rescan the drives, delete the array, and re-create it using the same settings and “Quick Init” in order to salvage it. This uses the existing disk contents and geometry without overwriting anything. For some reason, the Adaptec array creation menu locks up when you attempt to use tab to move between fields (maybe it was just my broken piece of shit card.)
V. Re-Building the Slice? Stupid Stupid Stupid!
However, when I booted again, there was a raw device /dev/aacd0 with nothing on it. At this point, failing to understand how FreeBSD/Intel partitioned drives/UFS work, I decided that re-creating the “partition” with its original geometry (the entirety of the drive) would just show me all my files again. This was partially correct, but I missed some important clues that would have saved me a lot of time.
VI. Disk Logic and Terminology
I used fdisk to recreate what FreeBSD calls a slice. Intel-partitioned hard disks have up to 4 of these; in other operating systems, this is known as a partition. In FreeBSD, however, a partition is equivalent to a file system. I had only one slice (no real need to create more unless you want to dual-boot).
The physical disk has a boot block at the beginning, which holds the table defining the drive’s slice layout; each slice in turn carries a disklabel that defines the partitions inside it. Under this, each Unix File System (UFS) is defined by a superblock, which contains layout information about the filesystem, including what is needed to find the disk inodes that describe where files and directories are and how big they are. The master superblock is located near the start of the partition. UFS also scatters backup superblocks all over the place within each partition/filesystem; in case the master superblock for the filesystem dies, you can find one of these, as they’re usually at fairly standard locations (counted by sectors from the beginning of the filesystem), and use it to rebuild the original.
VII. Device Nomenclature
UFS partitions are named as follows (using aacd, the device for Adaptec RAID arrays as an example):
/dev/aacd — the device name
/dev/aacd0 — the block device (array 0 in this case)
/dev/aacd0s1 — slice 1 on array 0
/dev/aacd0s1a — partition a on slice 1 on array 0
Important:
/dev/aacd0s1c — is a logical representation of the entire slice. This cannot be mounted.
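To make the naming scheme concrete, here is a small sketch that splits a device name into its parts. The function name describe_dev is made up for illustration, and it only understands the driver/unit/slice/partition pattern listed above, with partition letters a through h:

```shell
#!/bin/sh
# Split a name like /dev/aacd0s1a into driver, unit, slice and partition.
# Prints one line on success; prints an error and returns 1 otherwise.
describe_dev() {
    name=${1#/dev/}
    parts=$(printf '%s\n' "$name" | sed -n \
        's/^\([a-z][a-z]*\)\([0-9][0-9]*\)s\([0-9][0-9]*\)\([a-h]\)$/driver=\1 unit=\2 slice=\3 partition=\4/p')
    if [ -z "$parts" ]; then
        echo "not a slice/partition device name: $1" >&2
        return 1
    fi
    echo "$parts"
}

describe_dev /dev/aacd0s1a
```

A bare /dev/aacd0 or /dev/aacd0s1 is rejected on purpose, since neither names a partition.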
VIII. Initial Recovery and Further Fuckups
I was able to mount /dev/aacd0s1 (the slice spanning the entire disk), giving me access to what had been the root directory ( / ) and allowing me to copy off my mail spool and MySQL files, containing my blogs: the product of many hours of soul-pouring-outing during frustrated INSEAD evenings, as well as restauranting-trying-outing, the really hard stuff. Not something you want to lose. However, try as I would, I could not figure out how to re-create the other partitions.
At this point I did what was probably the stupidest thing I could have tried — I decided that, if fdisk could give me access to the slice when I manually re-defined it, then newfs could do the same for the superblocks and inodes — not realizing that it mercilessly overwrites these. And what do you know, all of a sudden I was left with not only a failing array but a bunch of “empty” partitions to boot.
IX. Assistance
I issued what, in retrospect, can only have been a fairly pathetic-sounding cry for financial assistance, asking friends and family to fork over a few bucks to let me get this done, hoping that somehow I might get enough exposure online to pay for professional recovery services; the deal was that if I ended up not needing the cash, I’d donate it to the EFF and send every kind soul who’d contributed a recovered photo from my collection. That still stands.
A colleague, purely by chance, mentioned that he was selling his used Buffalo Terastation Pro v1 dirt cheap; if I could get it over from the UK, it was mine for a pittance.
X. Setting up the NAS and External Storage
The most important thing in the equation was having enough storage to work with, as the IDE boot drive is only 500GB. Assuming I’d want to back up an entire 1629GB raw partition, I’d need more than the TSPro’s measly ~650GB storage. I replaced the drives with 4x750GB SATAII disks, giving me about 1850GB of RAID-5. In addition to this, I bought a 1TB external USB 2.0 drive for my initial backups.
Getting the TeraStation to where I wanted it was a pain in the ass, but I managed with help from the NAS-Central Wiki. Small tip: if you run tools based on broadcast packets in a virtual machine (Parallels on MacOS), make sure your virtual adapters are bridged correctly, or your software may not detect what it’s supposed to.
XI. Sector-by-Sector Data Recovery
I decided to do a low-level sweep of the drive, in order to find any file data in case something during the restoration went wrong. I ran CGSecurity’s photorec tool (which recognizes a wide variety of file types to recover) for several days; unfortunately, whenever the array is powered on for more than a few hours, the failure whistle starts again. It cannot be disabled. Photorec found most of the file data on the raw device (remember that, in Unix, everything is a file, so including the FreeBSD source tree, that’s a lot of files) but without directory structure or names. As I’d taken many of my photos (the most important part of the data) before I learned about EXIF, this was purely a failsafe in case I could not recover the “real” data.
XII. The Partitions Still Exist?
Now I ran CGSecurity’s testdisk, which found traces of partitions. Unfortunately, when I told it that aacd0 was Intel-partitioned, it found a lot of jumbled garbage, stemming from my multiple filesystem overwrites. When I specified “no partitioning”, testdisk found what looked like my partitions, with feasible-sounding size values. However, I could not write this to a disklabel file.
XIII. Analyzing the disk layout
After some pestering, Adrian Steinmann gave me an incredibly helpful tutorial about UFS, how it works, and what I could do to recover it. All sites I found were either pretty theoretical guides on the technical structure of data, or posts with very specific problems that didn’t match mine.
XIV. Dumping a Partition Image
The most important partition for me was my data partition. From the output of testdisk, I had several plausible values of where the data partition started and where it ended. I tried to dump a raw image of it to the NAS, using the command
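dd if=/dev/aacd0 of=/mnt/data/raw-partition bs=2048 skip=x count=y conv=noerror,sync
The idea here being to tell it “/data started at block ‘x’ and had length ‘y’ in blocks”. Note that dd counts skip and count in units of bs, while the offsets reported by tools like testdisk are normally 512-byte sectors, so either convert them or use bs=512. The first partition on an Intel-formatted disk will start at sector 63, so you can calculate from there. Furthermore, the noerror argument tells dd to carry on past failed blocks rather than exit, which is important since the RAID controller was giving errors halfway through. Adding sync makes dd pad each failed or short read with zeroes, so that everything after a bad block still lands at the right offset in the image.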
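Since dd counts skip and count in units of bs, while partition tools usually report 512-byte sectors, it is easy to be off by a factor. Below is a rough sketch of the conversion; the function names are mine, bs is assumed to be a multiple of 512, and the numbers in the example call are made up. It only prints the dd command, it never runs it:

```shell
#!/bin/sh
# Convert a count of 512-byte sectors into dd blocks of size bs.
# Assumes bs is a multiple of 512. Fails if the sector value does not
# fall on a block boundary, because dd could not express it exactly.
sectors_to_blocks() {
    sectors=$1
    bs=$2
    per_block=$(( bs / 512 ))
    if [ $(( sectors % per_block )) -ne 0 ]; then
        echo "sector $sectors is not a multiple of bs=$bs" >&2
        return 1
    fi
    echo $(( sectors / per_block ))
}

# Print (never run) a dd command for a region given in 512-byte sectors.
# Nothing here is checked against the real disk.
print_dd() {
    src=$1; dst=$2; start=$3; length=$4; bs=$5
    skip=$(sectors_to_blocks "$start" "$bs") || return 1
    count=$(sectors_to_blocks "$length" "$bs") || return 1
    echo "dd if=$src of=$dst bs=$bs skip=$skip count=$count conv=noerror,sync"
}

# Hypothetical region: starts at sector 63, 1048576 sectors long.
print_dd /dev/aacd0 /mnt/data/raw-partition 63 1048576 512
```

With bs=2048 the same call is refused, because sector 63 does not sit on a 2048-byte boundary; that is the kind of mismatch worth catching before a multi-day copy.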
XV. Tools
Note: many of these tools are in the FreeBSD ports collection, either under ‘security’ or ‘sysutils.’ FreeBSD also has a number of built-in tools, which you should thoroughly familiarize yourself with before starting:
- disklabel / bsdlabel
- newfs (particularly newfs -N, which creates nothing but prints out the parameters of the filesystem it would create on a given device, including where the backup superblocks would go)
- fdisk
In addition to these, I used photorec and testdisk; the latter appears to be a very useful and versatile tool on Windows and Linux partitions, but is a bit more limited under UFS.
I attempted to run sleuthkit and its web frontend, autopsy, on the drive, without usable results — these are great forensics tools if you’re looking for a needle in a haystack, but in the absence of any kind of filesystem information they do not seem to be real good at recovering entire filesystems.
The tool scan_ffs was an incredible help; it looks for superblocks in an unpartitioned space, and can write information about your filesystems out in disklabel format in case you want to try to write it directly to disk afterwards.
Similarly, find-sb (in /usr/src/tools/tools/find-sb on FreeBSD) is a very rudimentary, self-contained tool that searches for all information related to superblocks. The value for me lay mainly in seeing if there was anything around worth looking for.
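To give a feel for what “looking for a superblock” means, here is a minimal sketch that checks only the primary superblock location for a UFS magic number. It is not how scan_ffs or find-sb work internally. The offsets and magic values are assumptions based on the UFS on-disk layout as I understand it (superblock at byte 8192 for UFS1 and 65536 for UFS2, magic field 1372 bytes in, little-endian as on Intel), so verify them before relying on this. It reads whole 512-byte sectors, since raw disk devices generally refuse unaligned reads:

```shell
#!/bin/sh
# Print 4 bytes as hex, taken from byte offset $3 of the filesystem that
# starts at 512-byte sector $2 of file or device $1.
# Assumes the 4 bytes do not straddle a sector boundary.
bytes_at() {
    sector=$(( $2 + $3 / 512 ))
    within=$(( $3 % 512 ))
    dd if="$1" bs=512 skip="$sector" count=1 2>/dev/null |
        od -An -tx1 -j "$within" -N 4 2>/dev/null | tr -d ' \n'
}

# Report UFS2, UFS1 or none for the filesystem assumed to start at
# sector $2 (default 0) of $1. Only the primary superblock is checked.
ufs_version() {
    start=${2:-0}
    if [ "$(bytes_at "$1" "$start" 66908)" = "19015419" ]; then
        echo UFS2
    elif [ "$(bytes_at "$1" "$start" 9564)" = "54190100" ]; then
        echo UFS1
    else
        echo none
    fi
}
```

Something like `ufs_version /dev/aacd0 63` would then test a filesystem starting at sector 63. A result of none only means the primary superblock is missing or damaged; the backup copies the real tools hunt for may still be intact.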
XVI. Recovering the Superblock
I was able to paste the output from scan_ffs into a file, which looks something like this:
X: 1048576 0 4.2BSD 2048 16384 0 # /
X: 4165632 5189264 4.2BSD 2048 16384 0 # /var
X: 1048576 9354896 4.2BSD 2048 16384 0 # /tmp
X: 8388608 10403472 4.2BSD 2048 16384 0 # /usr
(This isn’t my disklabel, but a friend’s.) Just replace the “X” with what looks plausible, i.e. / -> /dev/aacd0s1a and work from there. I remembered that I had partitions /, /tmp, /usr and /data; since “c” (i.e. /dev/aacd0s1c) is always the entire slice, this means that /usr was on /dev/aacd0s1d and /data was /dev/aacd0s1e.
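Replacing the “X” entries by hand is error-prone on a long listing, so here is a small sketch that does it with awk. The function name assign_letters is made up, and the letters you pass are your own guess at the layout; the script has no way of knowing which partition was which:

```shell
#!/bin/sh
# Read scan_ffs-style lines on stdin and replace the leading "X:" of each
# one, in order, with the partition letters given in $1 (space separated).
# Lines beyond the supplied letters are passed through unchanged.
assign_letters() {
    awk -v letters="$1" '
        BEGIN { n = split(letters, l, " "); i = 0 }
        /^X:/ { i++; if (i <= n) sub(/^X:/, l[i] ":") }
        { print }
    '
}

# Example with two lines from the listing above and guessed letters a, d.
printf '%s\n' \
    'X: 1048576 0 4.2BSD 2048 16384 0 # /' \
    'X: 4165632 5189264 4.2BSD 2048 16384 0 # /var' |
    assign_letters "a d"
```

Redirect the output to a file and look it over before handing it to disklabel.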
XVII. Re-Writing the Superblocks and Mounting the Partition
The disklabel file is written to the disk as follows:
disklabel -R [-r] [-n] disk protofile
Where “disk” is the device, in my case /dev/aacd0. The -n option means that nothing is written; disklabel only prints out what it would otherwise write. The -r option historically made disklabel work on the on-disk label directly rather than the kernel’s in-core copy. Installing boot code is what -B is for, which was irrelevant in my case as I did not intend to boot from this device again.
After writing the disklabel, presto changeo, all the devices show up in /dev.
Unfortunately, since I had already screwed around with /dev/aacd0s1 in step VIII. (and in the process blown away the superblock for /dev/aacd0s1a) that partition was not mountable. Thankfully I had immediately copied off all data I saw during that step. However, /usr and /data mounted just fine.
XVIII. Copying Data
I had to copy individual files off multiple times; the RAID controller kept crapping out on me and filling syslog with panicky messages before dying with the expected shrill whistle. I solved this by re-seating the RAID controller in another slot on the motherboard. For some inexplicable reason that seemed to make it stable enough for me to copy off all my files without another crash.
I now have everything back, hooray. It cost me about 6 months of effort, lots of nerves, a bottle of champagne and about €1,000 for miscellaneous parts, disks, and shipping.
XIX. Lessons Learned
- Don’t panic. Easy to say, if your entire digital life has just been nuked.
- Research more. I did a ton of checking around for tools and “how to recover data” type articles, and did not find anything immediately useful. However, once you have a clue what you’re looking for, the information is out there.
- Bug your friends. Keep bugging them. Reward them if they help you. I am grateful beyond words to ast for turning me onto various tools and for giving me a really basic tutorial on how disks look. Stupidly enough, there is very little information that easily and understandably lays out how things work, especially in a manner that a panicky moron can grasp.
- Make sure you have multiple levels of backup. My current setup includes mails archived on gmail (still working out how to GPG/PGP-encrypt them), my NAS, an external USB backup device, and a plan to build a 500GB RAID 1+0 JBOD from the 4x250GB drives I yanked out of the NAS.
