The current story of Gen. PETRAEUS and his affair with BROADWELL shines a light on the possibilities of digital surveillance and the tracing of crumbs of information. It can serve as an example of, and a warning against, insufficient digital tradecraft. Though news reports about the exact order and nature of the events are imprecise, unreliable and contradictory, we try to assemble them into a plausible series of events and give some background on techniques that were, or might have been, used to intrude on the privacy of both BROADWELL and PETRAEUS.
Phase I: Threatening emails
The case began when KELLEY received between 5 and 10 threatening emails that did not immediately identify the sender. The FBI was contacted through an agent who was a friend of KELLEY, and the matter was investigated by the FBI cybercrime unit.
To prevent confusion we will refer to the address these emails were sent from as EMAIL_ACCOUNT_A, since the story involves multiple accounts.
Phase II: Requesting account information
Since the emails in question did not immediately reveal the identity of the sender, the FBI most likely contacted the email provider of EMAIL_ACCOUNT_A first, requesting the registration data for the address in question (likely using a subpoena).
An email address consists of two parts: the “Local Part” or “user”, and the domain managing the account. Together these form the address, for example email@example.com. The information about the domain, and thus about who manages an email address, is publicly available through the domain registration system and can be looked up within seconds (using a whois service like www.whois.com).
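As a minimal sketch of this split, the following Python snippet separates an address into its local part and its domain. The function name split_address is hypothetical, and the address is the placeholder used above; real address syntax has more edge cases than this handles.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an email address into (local part, domain)."""
    # The domain is whatever follows the last "@", so split from the right.
    local_part, separator, domain = address.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not a usable address: {address!r}")
    # Domain names are case-insensitive; the local part is left untouched.
    return local_part, domain.lower()


local_part, domain = split_address("email@example.com")
print(local_part, domain)
```

The domain returned here is what one would then pass to a whois lookup to find out who manages the account.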
Since EMAIL_ACCOUNT_A was registered under a pseudonym (false user information) and not the real identity of the owner, the FBI had to resort to identifying the account owner through other means.
Phase III: Tracing access
At this point the FBI either:
- Requested and received historical login data for EMAIL_ACCOUNT_A from the email provider. This would include the dates/times when the account was accessed and which IP-Addresses were used by the user.
- Or the FBI relied on the IP-Address information included in most emails, in a section that most email programs hide from the user but that is nevertheless carried by the email itself and easily obtained through the email program. An example of what such an entry in an email looks like is shown here:
Received: from [188.8.131.52] by fmail.com via HTTP; Fri, 11 Nov 2011 11:11:11 PST
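Extracting the IP-Address and timestamp from such an entry is easy to automate. The following Python sketch parses the header line shown above; the pattern and the function name parse_received are assumptions made for this one example format, since real Received headers vary widely between providers.

```python
import re

# Matches the example format only: "from [IP] by HOST ...; DATE"
RECEIVED_RE = re.compile(
    r"from\s+\[(?P<ip>[0-9.]+)\]\s+by\s+(?P<host>\S+).*;\s*(?P<date>.+)$"
)


def parse_received(header_line: str) -> dict:
    """Return the sending IP, receiving host and date of a Received entry."""
    match = RECEIVED_RE.search(header_line)
    if match is None:
        raise ValueError("unrecognized Received header format")
    return match.groupdict()


line = "Received: from [188.8.131.52] by fmail.com via HTTP; Fri, 11 Nov 2011 11:11:11 PST"
info = parse_received(line)
print(info["ip"], info["host"], info["date"])
```

Run over every message in a mailbox, this yields exactly the kind of date/time and IP-Address list described in this phase.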
At this point the FBI had a list that showed on which dates/times the owner accessed EMAIL_ACCOUNT_A and from which IP-Address. From there the FBI used publicly available databases to identify the owners and/or locations of the IP-Addresses in question, which resulted in a list of the places and times at which EMAIL_ACCOUNT_A was used.
Phase IV: Identifying the sender
Apparently EMAIL_ACCOUNT_A was not used from a personal Internet connection to send the emails in question. This led the FBI to contact the owners of the IP-Addresses identified in Phase III – which included multiple hotels – and request information about potential users of the Internet accounts identified by the collected IP-Addresses. Apparently the FBI needed no subpoenas or even court orders to access this information; the hotels simply shared the guest records for the dates in question.
At this point the FBI had a list of persons that included the user of EMAIL_ACCOUNT_A. They then simply looked for persons who had been at all of the places at the times in question, leaving one suspect: BROADWELL.
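This cross-referencing step amounts to a set intersection. A minimal Python sketch, assuming hypothetical guest records (the hotel names, dates and guest labels are invented for illustration and not taken from the case):

```python
# Hypothetical records: (location, date) -> guests registered at that time.
guest_lists = {
    ("Hotel A", "2012-05-01"): {"guest_1", "guest_2", "guest_3"},
    ("Hotel B", "2012-05-14"): {"guest_2", "guest_4"},
    ("Hotel C", "2012-06-02"): {"guest_2", "guest_3", "guest_5"},
}


def common_guests(records: dict) -> set:
    """Return the guests who appear in every one of the guest lists."""
    lists = list(records.values())
    if not lists:
        return set()
    return set.intersection(*lists)


print(common_guests(guest_lists))
```

Every additional location shrinks the candidate set, which is why a handful of hotel stays can be enough to single out one person.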
Phase V: Widening the picture
At this point the FBI could convince a judge to issue a warrant to identify additional email accounts used by BROADWELL, who had been successfully identified as the owner of EMAIL_ACCOUNT_A.
It is unclear what technique the FBI used to find additional accounts of BROADWELL. Possible options are:
- Using FBI-controlled software installed on BROADWELL’s computer to identify additional email accounts accessed. BROADWELL’s modus operandi included accessing email accounts from changing Internet connections like those of hotels. Since this was to be expected in the future as well, FBI-controlled data collection software installed on BROADWELL’s laptop would have been a good choice, simply because she would likely use that machine during travels. Software like Magic Lantern, CIPAV or any of their successors would have been the most promising path, but would also have presented legal obstacles.
- Another approach would have been buying available data from various data traders like Acxiom that often have information about multiple email addresses used by the same person on file. This data is usually collected from various sources and aggregated based on common identifiers like IP-Addresses which together yield a surprisingly detailed picture of the person in question. However, this data is often less complete than required in such an investigation and also makes case information available to a third party.
- Because fewer legal obstacles are involved, simple communication surveillance of the Internet account used by BROADWELL at home – and potentially of her mobile phone – might have been the most likely route of investigation. A system similar to Carnivore (since replaced with more advanced implementations) could have been used to look specifically and exclusively for additional email accounts, as stated in the warrant.
- Asking BROADWELL: Sources are unclear about the point at which BROADWELL handed her computer over to the FBI for physical investigation of its contents. This would likely reveal other email accounts through traces left in the browser history & bookmarks, the configuration of email client software, and entries in automatic password managers or auto-fill records of the browser.
- [Update:] Some sources claim that both EMAIL_ACCOUNT_A and EMAIL_ACCOUNTS_B were managed by Google. It might be the case that the FBI only asked Google, as provider of EMAIL_ACCOUNT_A, to search for other email accounts that were accessed from the same IP-Addresses and at the same times. Google would then have searched the access logs it stores, discovered EMAIL_ACCOUNTS_B and made them known to the FBI. Sources are unclear in this regard, but it remains a possibility at this point.
By using any or all of the above methods, the FBI found more email accounts, EMAIL_ACCOUNTS_B, which were accessed regularly.
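The access-log correlation described in the last option can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the provider's actual procedure: the log entries, account names, IP-Addresses (taken from documentation ranges) and the one-hour window are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical access log entries: (account, IP-Address, time of access).
access_log = [
    ("account_a", "203.0.113.7", datetime(2012, 5, 1, 21, 0)),
    ("account_b1", "203.0.113.7", datetime(2012, 5, 1, 21, 10)),
    ("account_x", "203.0.113.7", datetime(2012, 5, 3, 9, 0)),
    ("account_a", "198.51.100.4", datetime(2012, 6, 2, 8, 30)),
    ("account_b1", "198.51.100.4", datetime(2012, 6, 2, 8, 45)),
    ("account_y", "192.0.2.9", datetime(2012, 6, 2, 8, 40)),
]


def related_accounts(log, known_account, window=timedelta(hours=1)):
    """Count, per account, accesses sharing an IP and time window with known_account."""
    known = [(ip, ts) for account, ip, ts in log if account == known_account]
    hits = {}
    for account, ip, ts in log:
        if account == known_account:
            continue
        # An access counts once if it matches any access of the known account.
        if any(ip == k_ip and abs(ts - k_ts) <= window for k_ip, k_ts in known):
            hits[account] = hits.get(account, 0) + 1
    return hits


print(related_accounts(access_log, "account_a"))
```

An account that merely shares a hotel IP-Address on another day (account_x here) drops out, while one that repeatedly follows the known account from place to place stands out quickly.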
Phase VI: Hitting Gold
The FBI at this point gained access to EMAIL_ACCOUNTS_B discovered in phase V. How exactly the access was gained is unclear and depends on the exact method(s) used in phase V. Either account access credentials were discovered, or additional subpoenas/warrants were issued to access the accounts with the help of their respective providers (see phase II).
When the content of these accounts stored on the providers’ servers was analyzed, a group of accounts, EMAIL_ACCOUNTS_C, stuck out due to two factors:
- Classified information was stored in the account. Multiple sources refer to this but it might be a confusion with files stored on BROADWELL’s computer which was at some point made available to the FBI.
- Excessive use of the “Drafts”-folder for communication
Especially the use of the Drafts-folder appears to have caught the attention of the media, and possibly the FBI, because it is a common method used to conceal communication.
This method is commonly referred to as a “Digital Dead Drop” (the term drop box is mostly a media error/invention). Here the communicating parties share the access credentials to an email account. By authoring emails and storing them in the Drafts-folder instead of sending them, the parties can exchange messages without actually generating additional traffic “on the wire”. This was popularized by reports about Al-Qaeda operatives using this method.
While it is true that additional traffic is not generated through this technique, the traffic for accessing the accounts and the data in the accounts is still available and often under lower legal protection than actual communication that involves multiple accounts. The method was mostly used out of fear that intelligence agencies would have automated access to international internet communication (true) but would have no access to email accounts stored on servers (false). Even access to email accounts leaves traces that can be scooped up by surveillance operations, and data stored on email accounts is no more secure than transmitted data if the intelligence agency can gain access to the servers – which it usually can.
Furthermore it concentrates all information about the account users in one place instead of spreading it over multiple networks that might not be equally surveilled. Due to the recording of access to email accounts a surveilling party only needs to secure the cooperation (or undermine the protections) of a single party to gain access to the IP-Addresses of communicating parties and times/dates when communication took place.
This appears to be what happened in this case.
Phase VII: Identifying other parties
It is unclear how PETRAEUS was linked to EMAIL_ACCOUNTS_C. Most likely the IP-Address information stored by the email provider at each access was used to identify other parties involved. For this, subpoenas to Internet service providers could have been used to identify the users of the IP-Addresses stored in the email account logfiles.
More likely, however, the FBI connected one or more of these IP-Addresses to the CIA immediately and left the final identification to their IT department.
Commentary on the case
Public knowledge about the case is very limited in both depth and reliability. What can be concluded, however, is that the FBI used a wide array of investigative methods and resources on a simple harassment case that escalated during the investigation into a case involving national security concerns.
Whether this was in any way justified remains to be seen.
Several lessons can be drawn from this story:
- Investigations that begin with a low interest and impact can escalate quickly, drawing in more and more potent methods and technologies.
- Most internet service providers, email providers and hospitality businesses are not sufficient guardians of one’s privacy.
- Context-Information and Meta-Data (email headers, access logs, IP-Addresses) are the prime source of information for intelligence and investigation operations. These can easily be processed automatically by software because they were created by computers for computers.
- Hearsay tradecraft (the Drafts-folder as digital dead drop), applied without understanding the technical background, is not only insufficient to protect one’s privacy but even counter-productive, as shown in this case.
Good digital tradecraft for E-Mail
Good tradecraft for protecting email communication does exist:
- Protect email content through message encryption, like GnuPG
- Do not rely on third party storage of emails. Download emails and delete them from the email server.
- Store email and other information (such as browser data) securely using Full Disk Encryption like TrueCrypt.
- Points 1-3 also mean that one should not use webmail services.
- Select an email provider that is privacy conscious: one that removes identifying header information from emails, protects whois/domain-data, or is registered in a jurisdiction other than your own.
- Use encryption to communicate with the email provider: Insist on TLS/SSL encrypted access to their SMTP (outgoing) or POP3/IMAP4 (incoming) servers.
- Only access the Internet with anonymization methods enabled that conceal your true IP-Address from third parties, like Tor/I2P/Multi-Hop VPNs.
- Do not draw unneeded attention towards yourself by harassing people needlessly.
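The rule about TLS/SSL encrypted access to the provider can be sketched with Python's standard library. This is a minimal sketch, not a complete mail client: the function names are hypothetical, port 993 is the conventional IMAP-over-TLS port, and requiring TLS 1.2 as a minimum is an assumption about what the provider supports.

```python
import imaplib
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a context that verifies the server certificate and hostname."""
    context = ssl.create_default_context()
    # Assumption: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_mailbox(host: str, user: str, password: str) -> imaplib.IMAP4_SSL:
    """Connect to an IMAP server over TLS only; never fall back to plaintext."""
    connection = imaplib.IMAP4_SSL(host, 993, ssl_context=make_tls_context())
    connection.login(user, password)
    return connection


context = make_tls_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

Note that this only protects the connection to the provider. It does not hide the client's IP-Address from the provider and does not protect messages stored on the server, which is why the other rules still apply.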
These are only the minimal tradecraft rules for secure and private email use.
But they would have been sufficient to protect PETRAEUS and BROADWELL.
Update 2012-11-15: Added option 5 to “Phase V: Widening the picture”.