Corrupt NTFS folder not accessible from either Win

Posted 2019-07-26 05:59

Question:

I have an external 2TB hard drive with a large number of video files from a GoPro and a Sony Handycam, as well as other backed-up content. Recently, while attempting to back up from my Mac (using a bit of a hack via OSXFUSE to allow writing to an NTFS filesystem, which had been working for me up until now), I found that one of my folders was missing some subfolders. I removed the external HD and tried to recover it from Ubuntu, but in Ubuntu I get even less visibility of the content. See the screenshot below. The 2 folders coloured purple no longer act as folders and their names have been shortened; they should read 'Navimag Ferry' and 'Sony Hanicam' (typo from Handycam).

When I run ls -al I get the following

It seems to me that the information is there because the available space has not changed.

So far I have tried the following:

sudo ntfsfix /dev/sda1

Which gives the following output

Mounting volume... OK
Processing of $MFT and $MFTMirr completed successfully.
Checking the alternate boot sector... OK
NTFS volume version is 3.1.
NTFS partition /dev/sda1 was processed successfully.

and

sudo testdisk /dev/sda1

In testdisk I ran the quick search function under analyse, followed by the deeper search, but both returned Structure: Ok.

Additionally I used the undelete function but could not find the missing files or folders.

It seems to me that the link between the data and the directory structure is missing, but I am unsure how I can get this link back.

Any ideas??

Thanks,

Stu.

Answer 1:

Small disclaimer / introduction

I am the author of an MSc thesis on forensic NTFS reconstruction when metadata is partially damaged, and the creator of RecuperaBit, an open source tool I will mention later in this answer.

What (likely) happened

The 2 folders coloured purple no longer act as folders and their names have been shortened

NTFS file records (called MFT entries) contain some crucial elements:

  • Flags → Some bits describing the file. In particular, one bit corresponds to the "Is this a folder?" question and another to "Is this deleted or still allocated?".
  • $FILE_NAME attribute(s) → Each file has one or more file names, because NTFS is compatible with DOS 8.3 names.
  • $STANDARD_INFORMATION attribute → This contains MAC (modification, access, creation) times and a bit more.

Moreover, each directory contains one $INDEX_ROOT and possibly several $INDEX_ALLOCATION attributes listing the names of its children (but not their MAC times).
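To make the flag bits concrete, here is a minimal sketch of how the start of a raw MFT entry could be decoded. It assumes the usual on-disk layout (the FILE signature at offset 0 and a 16-bit little-endian flags field at offset 0x16, with bit 0x01 meaning "in use" and bit 0x02 meaning "directory"). The function name describe_mft_header and the synthetic record are made up for illustration and are not taken from any tool.

```python
import struct

IN_USE = 0x0001        # record is allocated (not deleted)
IS_DIRECTORY = 0x0002  # record describes a folder
FLAGS_OFFSET = 0x16    # assumed position of the flags field in the header


def describe_mft_header(record: bytes) -> dict:
    """Decode the signature and the two flag bits of a raw MFT entry."""
    if len(record) < FLAGS_OFFSET + 2:
        raise ValueError("record too short to contain an MFT header")
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    return {
        "valid_signature": record[0:4] == b"FILE",
        "in_use": bool(flags & IN_USE),
        "is_directory": bool(flags & IS_DIRECTORY),
    }


# Synthetic 1024-byte record standing in for a healthy directory entry.
record = bytearray(1024)
record[0:4] = b"FILE"
struct.pack_into("<H", record, FLAGS_OFFSET, IN_USE | IS_DIRECTORY)
print(describe_mft_header(bytes(record)))
```

A record that has been overwritten or zeroed fails the signature check, which is roughly the situation a driver faces when it cannot show dates for a directory it still sees listed in an index.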

From your output, it seems to me that the MFT entries of those two directories have been lost. You still see them as elements inside camera uploads because they are found in one of the index attributes, but when the system tries to read the records to show you the dates, it fails.

The NTFS driver works like any other "normal" OS utility that accesses a file system: it goes top-down. Break a node and you lose the whole sub-tree below it (the contents of those directories, basically).

Here's where advanced data recovery software can help.

Recovering the files

Since this is a programming-related website, I will briefly explain how you would write software that is able to read an NTFS partition where some MFT entries are missing:

  • Scan the whole drive, attempting to parse any pair of sectors starting with FILE as a valid MFT entry (I am simplifying a bit here)
  • Build a tree bottom up by doing this for any node:
    • Read the id of the parent node
    • If you have a node with said id, link the child to the parent
    • Otherwise, create a Folder_<id> directory under Lost Files and link the child to it
  • Read the $DATA attributes of each file you want to recover and copy them somewhere else
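The steps above can be sketched roughly as follows. This is a simplified illustration, not RecuperaBit's actual code: it assumes 512-byte sectors, it only looks for the FILE signature instead of fully parsing entries, and it takes the (parent id, name) pairs as already extracted. The names find_candidate_entries, full_path and the sample nodes are hypothetical.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors
ROOT_ID = 5        # the root directory has id 5


def find_candidate_entries(image: bytes) -> list:
    """Return sector-aligned offsets that start with the FILE signature."""
    return [
        offset
        for offset in range(0, len(image), SECTOR_SIZE)
        if image[offset:offset + 4] == b"FILE"
    ]


def full_path(node_id: int, nodes: dict) -> str:
    """Walk parent ids upwards; unknown parents end up under Lost Files."""
    parts = []
    seen = set()
    current = node_id
    while True:
        if current not in nodes:
            # The parent record is missing: create a placeholder folder.
            parts.extend([f"Folder_{current}", "Lost Files"])
            break
        if current == ROOT_ID:
            parts.append("Root")
            break
        if current in seen:
            # Guard against corrupted records that form a loop.
            parts.append("Lost Files")
            break
        seen.add(current)
        parent_id, name = nodes[current]
        parts.append(name)
        current = parent_id
    return "/".join(reversed(parts))


# Fake disk image: two aligned signatures and one unaligned (ignored) one.
image = bytearray(4096)
image[1024:1028] = b"FILE"
image[2048:2052] = b"FILE"
image[3000:3004] = b"FILE"
print(find_candidate_entries(bytes(image)))

# id -> (parent id, name); the record with id 124 is missing.
nodes = {
    5: (5, "."),
    64: (5, "camera uploads"),
    200: (124, "clip001.mp4"),
    201: (124, "clip002.mp4"),
    300: (64, "notes.txt"),
}
for node_id in (200, 201, 300):
    print(full_path(node_id, nodes))
```

Files whose parent record is gone are still reachable this way, only under a placeholder folder named after the missing id rather than under their original directory name.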

For more details related to the algorithmic techniques for file system reconstruction, check out my thesis linked above.

Tools you can try

I have mentioned a few programs in this answer on the Software Recommendations website. Those were specifically targeted to heavily damaged drives, and they included:

  • DMDE (commercial, for Windows but has a console version for Linux)
  • Restorer Ultimate (commercial, for Windows and OS X)
  • RecuperaBit (open source, Python-based): It runs for sure on Linux, but it has only been lightly tested on Windows... like once. It should run on OS X as well.

Based on both my (biased) opinion and my test results, RecuperaBit is the best one for disks showing severe damage. Yours is only slightly damaged; nevertheless, I would like to provide brief guidance on how to recover the two specific folders.

Recovering those two directories

First of all, run RecuperaBit on the disk. I would strongly suggest running it on a bitstream copy, but it does not write anything to it, so you might try to run it directly on the device:

mkdir /media/user/External/recovered_files
cd [full path of recuperabit]
pypy main.py /dev/sdb -o /media/user/External/recovered_files -s /media/user/External/savefile.save

Here I assume /dev/sdb is the damaged drive and you want to save the files on another drive mounted at /media/user/External. If you run the tool on the block device directly, I think you'll need sudo.
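A bitstream copy is normally made with a dedicated imaging tool. Purely to illustrate what such a copy does, here is a minimal Python sketch that reads the source chunk by chunk and writes zeros for any chunk that raises a read error, so that offsets in the image still match the original. The function name image_device, the chunk size and the disk.img path in the comment are assumptions for the example.

```python
import os


def image_device(source: str, destination: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy source to destination in chunks; return the number of unreadable chunks."""
    bad_chunks = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # works for regular files and block devices
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(length)
            except OSError:
                # Unreadable area: remember it and fill the gap with zeros.
                data = b""
                bad_chunks += 1
            # Pad so every chunk keeps its original position in the image.
            dst.write(data.ljust(length, b"\0"))
            offset += length
    return bad_chunks


# Hypothetical usage (needs root for a block device and enough free space):
# image_device("/dev/sdb", "/media/user/External/disk.img")
```

The resulting image file could then be passed to the tool in place of /dev/sdb, which keeps all further experiments away from the damaged disk.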

The scanning process will take a long time (sit back and relax, 2TB is a lot!). However, the results are saved to savefile.save, so they can be reused should you run the tool a second time. Type recoverable to find out the identifier of the partition you need to restore. The identifier is assigned by RecuperaBit and does not reflect the partition table.

Assuming it is #2, save a CSV dump of the contents:

csv 2 contents.csv

The program will print the path of the saved file. Open it with LibreOffice and find the id of the folder(s) you want to restore. For example, the root directory would have id 5, but you probably don't want a copy of all files if you are missing only two directories.

Let's say the broken directory has id 124. Go back to RecuperaBit and type:

restore 2 124

Where #2 is still the partition identifier. It will list the files it is recovering. You can navigate to the output directory and check if what you want is there. If it is not, try again: you might have chosen the wrong identifier.