I’ve been poking at some metadata for information gathering lately for a project or two. One of the document types that I’ve been focused on is JPEG images. Why, you ask? Take a look at this web page. See all those pretty pictures? JPGs. Same with just about every other website on the planet.
Looks like we have plenty of fodder for our metadata cannon.
So, I began analyzing metadata on JPGs from random websites that struck my fancy. In a few cases, I came across some good information: the type of software used to produce the image (great for selecting a particular exploit), the author (great for selecting a target), dates of authorship (good for determining validity of attack and target) and finally some camera types (good for determining some basic financial commitment, and whose memory cards to steal on a physical assessment). Mostly, I came across a whole bunch of sanitized data. Clearly I needed a better set of JPGs to play with.
Then, 18 gigs of Myspace JPG images fell into my lap.
I figured that I’d be in metadata heaven. I also figured that I might be able to put an author name behind the image of the two dogs humping, or better, the hottie showing off her naughty bits.
I was mistaken.
I ran exiftool on about 10,000 images (with some fits and starts; exiftool is a Perl app, and feeding it too many images at once caused it to barf), all with the same result. Every image appears to have had its metadata stripped, so that only the metadata needed to correctly render the image is left. No author. No creation tool. No dates. No camera info.
Apparently, Myspace sanitizes all of the metadata when you upload your pics.
Good Myspace.
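For anyone who wants to repeat the bulk run without choking exiftool, here is a minimal sketch in Python that feeds it files in small batches. It assumes exiftool is on your PATH; the helper names (`chunked`, `run_exiftool`, `scan_directory`) and the batch size of 200 are my own illustrative choices, not anything exiftool requires, and this is not necessarily how the original run was done.

```python
import subprocess
from pathlib import Path
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield successive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_exiftool(paths: List[str]) -> str:
    """Run exiftool on one batch and return its raw text output.

    -t gives tab-delimited output and -s gives short tag names,
    the same flags used elsewhere in this post.
    """
    result = subprocess.run(
        ["exiftool", "-t", "-s", *paths],
        capture_output=True,
        text=True,
        check=False,  # don't raise if exiftool complains about a file
    )
    return result.stdout


def scan_directory(root: str, batch_size: int = 200) -> Iterator[str]:
    """Walk `root` for .jpg files and yield exiftool output per batch."""
    jpgs = sorted(str(p) for p in Path(root).rglob("*.jpg"))
    for batch in chunked(jpgs, batch_size):
        yield run_exiftool(batch)
```

Something like `for output in scan_directory("myspace_images"): print(output)` would then work through the whole pile a batch at a time (the directory name is hypothetical). If a batch still upsets exiftool, dropping `batch_size` is the first thing to try.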
Of course, I had to test, especially since the 18 Gigs of images could have been played with to protect the innocent, given that they originally came from some acquisition techniques that could be described as ethically questionable (they were not acquired by me in that fashion). Here’s how I tested:
First, I needed an image that I knew had good juicy metadata. How about the one from the news story about the hacker 0x80 that Slashdot folks used to track down some pretty scary info on the anonymous 0x80 using the intact metadata:


Yes, this image has the metadata intact.
Here’s the output from exiftool -t -s filename.jpg showing all of the metadata:

Now, I upload it to my Myspace account, and then use Firefox to “Save image as…” to the resulting image:

0x80 from myspace.jpg

Yes, I have a Myspace account. It’s my dirty little information gathering secret.
Here’s the resulting metadata from the Myspace image, using the same exiftool command:

That’s a BIG difference. Good Myspace. Yes, I know that putting those two words together in the same sentence seems…wrong.
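If you would rather not eyeball the two dumps, here is a rough Python sketch that compares them. It assumes you have captured the `exiftool -t -s` output for each file as text, with one tab-separated tag name and value per line. The function names and the sample strings are made up for illustration; they are not the real output from the images above.

```python
from typing import Dict, Set


def parse_exiftool_tab(output: str) -> Dict[str, str]:
    """Parse tab-delimited `exiftool -t -s` output into {tag: value}."""
    tags: Dict[str, str] = {}
    for line in output.splitlines():
        name, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank lines or anything without a tab
        tags[name.strip()] = value.strip()
    return tags


def stripped_tags(before: Dict[str, str], after: Dict[str, str]) -> Set[str]:
    """Return tag names present in `before` but missing from `after`."""
    return set(before) - set(after)


# Made-up sample data, NOT the real dumps from this post.
original_dump = "FileName\toriginal.jpg\nMake\tExampleCam\nArtist\tSomeone\nImageWidth\t640\n"
reuploaded_dump = "FileName\tcopy.jpg\nImageWidth\t640\n"

missing = stripped_tags(
    parse_exiftool_tab(original_dump),
    parse_exiftool_tab(reuploaded_dump),
)
print(sorted(missing))
```

The set difference only tells you which tags vanished; tags that survive with a changed value (FileName in the sample) would need a separate check.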
What about Facebook? I uploaded the same original image (with the juicy metadata) to my profile on Facebook. Here are the results:

0x80 from facebook.jpg

…and the resulting metadata (again, same exiftool command)?

Yes. Good Facebook.
Overall, I was shocked that both Myspace and Facebook had done this. Am I off base? Is this a common thing?
I guess I have a few more “social networks” to try: Twitter, Picasa, LinkedIn, Flickr (I KNOW they keep and analyze some metadata…), and more that I’m sure haven’t popped into my head yet.
Looks like I’m still in need of finding a good repository of metadata. Flickr, here I come.
- Larry “haxorthematrix” Pesce
larry /at/ pauldotcom.com

