iOS Forensics: tool validation based on a known dataset - Preamble

Hello world, it’s been a while since my last series of blog posts! But now I am ready to share with you the results of my recent research.

I face many different challenges in my daily work as a digital forensics analyst dealing mainly with mobile devices. All modern smartphones are encrypted (usually with file-based encryption (FBE)), so obtaining or cracking the passcode is required to gain access to all the data stored on the device. And even if we know the passcode (or the user has not set one, which is increasingly rare these days), we still need an exploit to gain “root” access to the device to read and copy all the data and get our “best acquisition”, usually a full file system (FFS).

And then what? 

Then you have an enormous number of bytes stored in hundreds of thousands of files in which to search for relevant information for the case.

In simple terms, you have a box and you need to find a small piece of information in that box. The box also contains information that is completely useless for your investigation: think, for example, of operating system data or the known files installed by (native or third-party) applications. The other big challenge is that every single application, as well as every single setting, log, or database at the operating system level, can store its data in a different format. Sometimes it is an XML-style plist file or an SQLite database with only one table; in other cases, it may be a SEGB file embedded in a binary plist and stored as a BLOB in an SQLite database with more than 200 tables.
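To make that nesting concrete, here is a minimal Python sketch of one common layer: pulling BLOBs out of an SQLite table and checking whether each one holds an embedded binary plist. The database, table, and column names are hypothetical, chosen only for illustration:

    import plistlib
    import sqlite3

    DB_PATH = "example.db"                 # hypothetical database from an extraction
    QUERY = "SELECT payload FROM records"  # hypothetical table/column names

    con = sqlite3.connect(DB_PATH)
    for (blob,) in con.execute(QUERY):
        # Binary plists begin with the magic bytes b"bplist"
        if isinstance(blob, bytes) and blob.startswith(b"bplist"):
            data = plistlib.loads(blob)    # decode the embedded binary plist
            print(data)
    con.close()

In real artifacts, the decoded plist may in turn contain further serialized payloads (such as SEGB records), so an analyst often has to peel several layers before reaching the actual data.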

There is only one way to do this on a large scale: use tools to process the extraction, find the relevant files, parse them with an appropriate parser, and then analyze the results. Once the data has been processed by the tools, we can begin the investigation. We usually approach the analysis by selecting one or more pivot points, such as a timestamp, a location, a phone number, an email address, or a specific action performed by the user.

That is why it is important to have a whole arsenal of different tools and, in most cases, to process the data with all of them. It is hard for tool developers and analysts to constantly keep up with the latest and greatest features introduced by a new device, a new OS version, or the latest update to an app. A simple change in a database schema can bring a tool to a complete halt, leaving our investigation incomplete and perhaps even leading to incorrect conclusions.

Sometimes one tool will parse a particular app or log file while another does not, and vice versa. I have had cases where a new artifact was discovered months later, and I started thinking: ouch, that might have been critical in this case!

We walk on thin ice every day...

Over the past few years, I have been involved in several projects and research efforts to standardize digital evidence and facilitate the validation and comparison of parsing capabilities. I have always looked at the technical validation reports from NIST, and they remain one of the best methods we can use to validate tools. But in real cases, the differences are in the nuances, and we need to know exactly which artifact might be helpful to us and which tool can reliably parse that specific artifact.

To improve our internal investigation and analysis capabilities, about six months ago I started creating a huge and complex Excel file to track and compare the features and parsing capabilities of the different tools we use on a day-to-day basis. Based on this, I decided to take it a step further and create a comparison matrix for the tools based on a known image. With a known image, it is easier to compare and validate the results generated by different tools against the known information. A simple example: if I know that a particular SMS message was sent from the device to a specific recipient, with specific text, at a particular time, I can verify whether a tool finds and processes that information correctly.
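As a minimal sketch of that kind of check, the Python snippet below queries the native sms.db for a known outgoing message and converts its timestamp for comparison against the documentation. It assumes the standard message/handle schema and the Apple epoch used on recent iOS versions (nanoseconds since 2001-01-01 UTC); the known text and recipient here are illustrative placeholders, not values from the actual image:

    import sqlite3
    from datetime import datetime, timezone

    APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

    # Ground truth taken from the image documentation (placeholder values)
    KNOWN_TEXT = "Hello!"
    KNOWN_RECIPIENT = "+15555550100"

    # In a full file system extraction, the database typically lives at
    # private/var/mobile/Library/SMS/sms.db
    con = sqlite3.connect("sms.db")
    rows = con.execute(
        """
        SELECT message.text, message.date, handle.id
        FROM message
        JOIN handle ON message.handle_id = handle.ROWID
        WHERE message.is_from_me = 1 AND message.text = ?
        """,
        (KNOWN_TEXT,),
    )
    for text, raw_date, recipient in rows:
        # On recent iOS versions, message.date is nanoseconds since the Apple epoch
        sent = datetime.fromtimestamp(
            APPLE_EPOCH.timestamp() + raw_date / 1_000_000_000, tz=timezone.utc
        )
        print(f"{sent.isoformat()} -> {recipient}: {text!r} "
              f"(recipient match: {recipient == KNOWN_RECIPIENT})")
    con.close()

If a tool reports a different timestamp or recipient for the same message, that discrepancy is exactly the kind of nuance this comparison is meant to surface.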

As a known image, I chose Josh Hickman's public image of iOS 15 available on DigitalCorpora. Thank you, Josh, for your hard work: the amount of information you were able to generate on this iPhone is incredible!

I created a list of 5 commercial tools (in alphabetical order: Belkasoft X, Cellebrite PA, Magnet AXIOM, MSAB XAMN, and Oxygen Forensic Detective), 1 freeware tool (ArtEx), and 2 open source tools (Apollo and iLEAPP).

Based on Josh’s documentation, I created a huge Excel file where I entered, line by line, every action for every application mentioned in the document (application installation, application usage, messages sent and received, websites visited, etc.).

Then I looked at the results from all the tools and reviewed the following:

  • What information was analyzed by each tool?
  • Was the information analyzed as expected, or did a tool make an incorrect interpretation?
  • What specific details are available for each app and/or artifact? (e.g., for a particular messaging app, which tool can parse app contacts or app calls?)

The comparison really goes in-depth, not only on data generated by the user (like messages) but also on OS logs and databases that are useful to track the user's "pattern of life".

Over the next few weeks, I will publish a series of articles on this comparison. 

The work will be organized in this way:

  • Experiment setup: acquisition description and processing procedure
  • General device information: hardware and software information, accounts, and installed applications
  • Stock applications: Address book, call log, SMS/iMessage, calendar, notes and so on
  • Third-party apps: these are organized into categories and various sections based on application type, such as messaging, social networking, browser, cloud storage, etc.
  • Operating system logs and databases: these are also organized into categories and different posts based on the specific information source (e.g., KnowledgeC, InteractionC, PowerLog, Biome, etc.) or the type of information (e.g., location, app usage, device interaction, etc.)

Before I begin, I want to clarify the goals: to support the community and help tool developers improve their parsing capabilities. This research is in no way intended to be a competition for the “best tool”; rather, I want to provide a clear, verifiable, and objective view of the pros and cons of each tool.

I can already anticipate one of my conclusions: you will see that each tool has specific parsers that can be helpful in a particular case. So we need the support of tool developers to make this comparison easier, faster, and more accurate than manual validation.

See you soon with some real data!

