Metadata: The Hacker's Breadcrumb Trail (And How to Wipe It Clean)

 



What is metadata?

Most common files carry ‘data about data’, known as metadata: attributes such as where an image was taken or who authored a PDF. These attributes are also valuable information that malicious actors can exploit.

Here is a table of common file types, the metadata they contain, and the purpose of that metadata.

File Type: Documents (.docx, .pdf, .xlsx, .pptx)

Common Metadata:
  • Author/Creator
  • Creation/Modification Dates
  • Last Saved By
  • Company/Organization Name
  • Comments/Tracked Changes
  • Document Revision Number
  • Software/Application Version used to create it
  • Print Date
  • Template name

Purpose/Details: Tracks document history, ownership, and creation environment. Comments and tracked changes are often hidden, but remain in metadata.

File Type: Images (.jpg, .png, .tiff)

Common Metadata: Exif Data (Exchangeable Image File Format):
  • Camera Make/Model
  • Date/Time the photo was taken
  • Geolocation (GPS coordinates)
  • Exposure settings (shutter speed, aperture)
  • Image resolution
  • Thumbnail

Purpose/Details: Provides details about the hardware and conditions under which the image was captured.

File Type: Audio/Video (.mp3, .mp4, .mov)

Common Metadata:
  • Artist
  • Album
  • Copyright/Licensing Information
  • Encoding details (codec, bitrate)
  • Creation Date
  • Duration

Purpose/Details: Used for media organization and copyright enforcement.

File Type: Emails (stored in .eml, .pst, or within mail servers)

Common Metadata:
  • Sender/Recipient IP Addresses
  • Timestamps
  • Email Headers (routing path)
  • Subject
  • Message-ID

Purpose/Details: Essential for email delivery; reveals the origin and route of the message.


Why should you clean metadata?

Because metadata gives attackers information about you. The initial phase of any cyberattack is reconnaissance, or information gathering. Metadata serves as a digital breadcrumb trail from which attackers gather sensitive intelligence to customize their attacks or to identify vulnerabilities.


How can hackers use your metadata? 

After reconnaissance, attackers can use the metadata to exploit a victim digitally, to support social engineering and phishing, or even to pinpoint a user’s physical location.


Metadata: Types and Exploitation

User / Creator Info

Attributes such as ‘Author’, ‘Last Modified By’, or ‘Creator’ often reveal valid usernames or email addresses.

If the metadata shows Author: John Doe, the email address could be John.Doe@company.com or jdoe@company.com or some other variation.

The hacker can then try brute-force attacks or credential stuffing against those accounts.
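To see what your own documents give away, here is a minimal Python sketch that reads the author fields from an Office file. It assumes a modern Office format (.docx, .xlsx, .pptx), which is a ZIP package that keeps these properties in docProps/core.xml. The function name and field labels are my own choices for illustration, not part of any tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels mapped to the XML elements that hold them.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}

def read_core_properties(path):
    """Return the non-empty identifying properties of an Office file."""
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}  # the package has no core properties part
    root = ET.fromstring(xml_bytes)
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling read_core_properties("report.docx") on a file of your own (the file name here is hypothetical) returns a dictionary of whatever identifying fields are still present. An empty dictionary suggests these particular fields are clean, though it says nothing about comments or tracked changes.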

Outdated Software or O/S

Metadata can disclose the version of the application or operating system that produced a file, which attackers use for software and OS fingerprinting. If the version is outdated, the hacker can use publicly known exploits to gain access.
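As a rough illustration, the Python sketch below shows where that version information lives in a modern Office file. It assumes the file is a ZIP package with a docProps/app.xml part. The function name is hypothetical, and which elements are present varies by application.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used inside docProps/app.xml (extended properties).
EXT_NS = {
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
}

def read_application_info(path):
    """Return the application-related properties stored in an Office file."""
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("docProps/app.xml")
        except KeyError:
            return {}  # the package has no extended properties part
    root = ET.fromstring(xml_bytes)
    info = {}
    for tag in ("Application", "AppVersion", "Company", "Template"):
        element = root.find(f"ep:{tag}", EXT_NS)
        if element is not None and element.text:
            info[tag] = element.text
    return info
```

If the result names an application version you know to be out of support, that is the same clue an attacker would pick up.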

Internal File Paths (i.e., Network Insight)

If a file path is embedded in the file (for example, C:\Users\John.Doe\Documents\Private\Internal_Docs.docx), it reveals the user’s home directory structure and possibly other information, such as server names or internal network naming conventions.
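One way to check a file for leftover paths is a simple pattern search over its raw bytes. The Python sketch below is a rough illustration: the pattern and function name are my own, and it only sees text stored uncompressed as single-byte characters. For compressed formats such as .docx you would need to run it on the extracted parts instead.

```python
import re

PATH_PATTERN = re.compile(
    r'(?:[A-Za-z]:\\|\\\\)'            # drive letter (C:\) or UNC prefix (\\)
    r'[^\\/:*?"<>|\x00-\x1f]+'         # first path component
    r'(?:\\[^\\/:*?"<>|\x00-\x1f]+)*'  # any further components
)

def find_embedded_paths(raw: bytes):
    """Return the distinct Windows-style paths found in raw file bytes."""
    text = raw.decode("latin-1")  # maps every byte to exactly one character
    return sorted(set(PATH_PATTERN.findall(text)))
```

A hit such as \\fileserver01\share\plan.xlsx (a made-up server name) would tell an outsider how your internal servers and shares are named.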

Social Engineering & Phishing

The collected metadata is useful for targeted attacks. Phishing relies on social engineering because it is easier to hack a person than to hack a computer.

Spear Phishing

Continuing the example above, a hacker collects metadata such as:

  • Employee name
  • Document’s internal project name
  • Software version

The hacker could then craft a legitimate-looking email citing that information:

“Please review the final version of this [Project Name] document that [Employee name] updated.”


Impersonation

Using an employee’s name (and title if also found in the metadata), hackers can create new, but similar email addresses or social media profiles to trick other employees into divulging information.

Geolocation

Digital photos (especially those taken by smartphones) contain EXIF (Exchangeable Image File Format) metadata, which often includes GPS coordinates (latitude and longitude).

For example, an employee may post what seems like an innocent photo of their home office or of a new sensitive facility. The EXIF data can reveal the exact physical location where the photo was taken, which puts that location at risk.
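To show how precise this is: EXIF stores each coordinate as degrees, minutes, and seconds plus a reference letter (N/S for latitude, E/W for longitude). The small Python sketch below converts those values into the decimal form that mapping sites accept. The function name is my own, and the sample values in the usage note are made up.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style degrees/minutes/seconds value to decimal degrees."""
    if ref not in ("N", "S", "E", "W"):
        raise ValueError("ref must be one of N, S, E, W")
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal
```

For instance, dms_to_decimal(40, 26, 46, "N") gives roughly 40.4461. One second of latitude is only about 30 metres on the ground, so the coordinates in a photo can point to a specific building.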

Creating False Metadata (Steganography)

A hacker can also use metadata fields to hide malicious code or an encrypted payload. These fields are typically not scanned by basic security tools!

A file may look completely normal, but if it is processed by a vulnerable program it could execute the hidden code. The hidden code can then install malware or establish a command-and-control connection.
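A basic defensive check is to look for metadata values that do not look like metadata at all. The Python sketch below is only a heuristic under my own assumptions: the length limit, the "base64-like" pattern, and the list of script hints are arbitrary illustrative choices, and it expects a dictionary of field names and values that you have already extracted by some other means.

```python
import re

# Illustrative thresholds and hints; tune these for your own files.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{40,}")
SCRIPT_HINTS = ("<script", "powershell", "cmd.exe", "eval(")

def flag_suspicious_fields(metadata, max_length=256):
    """Return {field name: [reasons]} for values that look like hidden payloads."""
    findings = {}
    for name, value in metadata.items():
        text = str(value)
        reasons = []
        if len(text) > max_length:
            reasons.append("unusually long value")
        if BASE64_RUN.search(text):
            reasons.append("long base64-like run")
        lowered = text.lower()
        if any(hint in lowered for hint in SCRIPT_HINTS):
            reasons.append("script-like content")
        if reasons:
            findings[name] = reasons
    return findings
```

A flagged field is not proof of anything malicious, and a clean result does not prove the file is safe. It is simply a prompt to look closer before opening or forwarding the file.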


When should you clean metadata?

Whenever you share any document, image, or media file outside your organization or personal trusted circle, scrub that metadata!

This includes:

  • Sharing files with clients or vendors
  • Emailing an attachment to an external address
  • Posting to a public website or to social media


What about internally?

When moving sensitive files from a highly restricted server to a location with broader access, such as a general shared internal drive, be sure to scrub the metadata.

Likewise, for the final, approved version of a file, remove the metadata so that it does not carry the history of potentially contradictory drafts or comments.


How should you clean metadata?

Here's the TL;DR version:

Documents (.docx, .xlsx, .pptx)
  • What to scrub: Tracked Changes, Comments, Author Name, Internal Paths, Last Saved By.
  • Recommended action/tool: Use the "Document Inspector" (built into Microsoft Office). For Excel, also review hidden rows/columns and macros.

Images (.jpg, .png, .tiff)
  • What to scrub: EXIF Data (GPS coordinates, Camera Make/Model, Date/Time).
  • Recommended action/tool: Use a dedicated EXIF scrubber tool (like ExifTool) or the built-in "Remove Properties and Personal Information" feature on Windows. On a Mac, use the Photos app or a third-party app.

Final Documents (any file)
  • What to scrub: All identifying and revision history metadata.
  • Recommended action/tool: Convert the file to a clean PDF. Note: PDFs can still carry metadata, so as a second step use a PDF editor's tool to check and strip the PDF metadata.


Important Tip: Always save the scrubbed file with a new name or in a separate folder. This ensures you retain the original version with all its useful metadata (such as author history and internal comments) in case you need to refer back to it later.

Detailed Version

To clean metadata from a Word document:

Windows users:

  1. Open the Document Inspector by going to File > Info > Check for Issues > Inspect Document.
  2. Tick the content types to check, such as "Document Properties and Personal Information", and click Inspect.
  3. The Inspector lists the types of metadata it found.
  4. Click the ‘Remove All’ button next to each type you wish to remove.

Mac users:

Go to Tools > Protect Document and check the box for "Remove personal information from this file on save".


To scrub EXIF data from an image:

For Windows users:
  1. Right-click on the photo file > select Properties.
  2. Go to the Details tab.
  3. Click Remove Properties and Personal Information.
  4. Choose to create a copy with metadata removed or delete the metadata from the original.

For macOS users:

Using the Photos app:
  1. Open the photo in the Photos app.
  2. Click the 'i' button or go to Image > Location > Hide Location.

Using a third-party app:
  1. Download and install an app like ImageOptim.
  2. Drag and drop the photos into the app's window.
  3. The app will automatically remove the EXIF data upon import or action.

For Photos and Images on Mobile Devices:

Android: 

  1. Open the photo in the Gallery app.
  2. Tap the info icon (or swipe up)
  3. Select "Remove location" or the edit icon next to the location to delete it.

iPhone: 

Use the built-in Photos app to remove location data. 
  1. Open the photo.
  2. Tap the 'i' icon.
  3. Select "No Location" from the location field.

Using Online tools:

  1. Navigate to an online EXIF remover website, such as Metadata2go.com. (ExifTool, mentioned above, is a command-line program that you install, not a website.)
  2. Upload your photo(s).
  3. Click the button to remove the EXIF data.
  4. Download the new file.
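If you would rather not upload a sensitive photo anywhere, the same idea can be sketched locally in Python. The sketch below relies on the fact that a JPEG file is a sequence of marked segments and that EXIF data is stored in the APP1 segment. It copies every segment except APP1 and leaves the image data untouched. The function name is my own, and this is an illustration rather than a replacement for a mature tool: other segments, such as comments or IPTC data in APP13, are left in place.

```python
import struct

SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG
SOS = 0xDA         # start-of-scan: compressed image data follows
APP1 = 0xE1        # segment type that carries EXIF (and XMP) data

def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG bytes with all APP1 segments removed."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it unchanged.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment_end = pos + 2 + length
        if marker != APP1:
            out += jpeg[pos:segment_end]
        pos = segment_end
    raise ValueError("reached end of data before the image scan (SOS)")
```

To use it, read a photo with open(path, "rb"), pass the bytes to strip_app1_segments, and write the result to a new file so the original stays intact.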

To Sanitize a PDF Using Adobe Acrobat:

  1. Open the PDF document
  2. Go to File > Properties.
  3. Edit or delete metadata properties. You can also check additional metadata fields in the Additional Metadata menu.
  4. Press OK and save the PDF.

Alternatively, go to Tools > Protection and click ‘Sanitize Document’.

How can you check that metadata was cleaned out?

You shouldn’t rely solely on the application to ensure that all the hidden data has been cleaned out. It’s best to use a separate, dedicated metadata viewing tool to verify that the cleaning process was successful. Metadata2go.com is one such tool which not only allows you to view metadata, but also edit and remove it.
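For PDFs, a quick independent first pass can be sketched in a few lines of Python. This is a deliberately crude check under stated assumptions: it only finds identifying values written as plain "/Author (…)" style text in the file. It will miss values that are hex-encoded, stored in compressed parts of the PDF, or kept in XMP metadata. The function name and key list are my own.

```python
import re

# Document-information keys that commonly identify a person or software.
INFO_KEYS = (b"Author", b"Creator", b"Producer", b"Keywords")

# Matches text such as "/Author (John Doe)" in the raw PDF bytes.
INFO_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\)])*)\)"
)

def find_pdf_info_strings(pdf_bytes: bytes):
    """Return (key, value) pairs for non-empty plain-text info entries."""
    return [
        (key.decode("ascii"), value.decode("latin-1"))
        for key, value in INFO_PATTERN.findall(pdf_bytes)
        if value
    ]
```

If this returns anything after you have cleaned a PDF, the cleaning missed something. If it returns an empty list, still confirm with a dedicated viewer, because of the blind spots listed above.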


Here is a quick summary table of Metadata Cleaning: Why, When, and How to Verify

Rationale (Why Clean)

  • Metadata serves as a digital breadcrumb trail that malicious actors exploit during the reconnaissance phase of a cyberattack. Cleaning it prevents the exposure of sensitive intelligence (like usernames, internal network paths, or physical location) that could be used to customize targeted attacks (e.g., spear-phishing) or identify system vulnerabilities.
  • Protecting Personally Identifiable Information (PII) and ensuring legal/regulatory compliance (e.g., GDPR, HIPAA) by avoiding the inadvertent sharing of confidential details (e.g., PII in revision history).
  • Maintaining document integrity by removing outdated, contradictory, or internal comments and tracked changes before final archival or sharing.

When to Scrub (Mandatory Times)

  • External Sharing: Anytime a file (document, image, or media) is sent outside the company or trusted personal network.
  • Public Posting: Uploading any file to a company website, social media, or public file-sharing service (critical for removing EXIF GPS data).
  • Legal/Discovery: When exchanging documents in a legal proceeding to avoid disclosing privileged information hidden in the document properties.
  • Internal Transfer: (Best practice) When moving a highly sensitive file from a restricted server to a more general internal shared drive.



Images (.jpg, .png, .tiff)
  • Primary metadata risk: EXIF Data (GPS coordinates, Camera/Device model, Serial number, Date/Time).
  • Verification check: Upload the cleaned file to a third-party viewer to confirm all location and device data fields are empty or generic.
  • Recommended tools: ExifTool (Command-Line) or Online EXIF Viewers (e.g., Jimpl).

Documents (.docx, .xlsx, .pptx)
  • Primary metadata risk: Revision History (Tracked Changes, Comments), Author/User Names, Internal File Paths, Application Version.
  • Verification check: (1) Re-run the Document Inspector in the application (File > Info). (2) Manually check the "Review" tab for any residual comments or tracked changes. (3) Right-click the file and check the Properties > Details tab.
  • Recommended tools: Microsoft Office's Document Inspector.

PDF Files (.pdf)
  • Primary metadata risk: Document Properties (Author, Creation Software, Keywords), Hidden Layers, Object Metadata.
  • Verification check: Open the PDF and navigate to File > Properties or Document Properties. Check the Description and Custom Metadata tabs to ensure identifying information is gone.
  • Recommended tools: Adobe Acrobat (Pro version) or specialized PDF metadata viewers.

All File Types (Universal Check)
  • Primary metadata risk: Any and all embedded and proprietary metadata tags.
  • Verification check: Use a powerful command-line utility to dump all metadata, including hidden and unknown tags, and confirm that the output is minimal.
  • Recommended tools: ExifTool (Command: exiftool -a -u -g1 <filename>)




Metadata, the "data about data", can accidentally become a hacker's favorite clue, giving them secrets like who you are, what software you use, or even where you took a photo.

Keeping your files clean is one of the most powerful ways to protect yourself and your information.

By making that quick scrub a habit—whether you're sending a resume, uploading a vacation photo, or finalizing a document—you instantly remove those little breadcrumbs. Remember to sanitize to keep your digital life safe.
