I've been poking at some metadata for information gathering lately for a project or two. One of the document types that I've been focusing on is JPEG images. Why, you ask? Take a look at this web page. See all those pretty pictures? JPGs. Same with just about every other website on the planet.
Looks like we have plenty of fodder for our metadata cannon.
So, I began analyzing metadata on JPGs from random websites that struck my fancy. In a few cases, I came across some good information: the type of software used to produce the image (great for selecting a particular exploit), the author (great for selecting a target), dates of authorship (good for determining validity of attack and target) and finally some camera types (good for determining some basic financial commitment, and whose memory cards to steal on a physical assessment). Mostly, I came across a whole bunch of sanitized data. Clearly I needed a better set of JPGs to play with.
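A rough sketch of how those four categories could be pulled out of one image's tags in Python. The mapping of ExifTool short tag names to categories is an assumption for illustration, not a complete list, and `pick_interesting` is a hypothetical helper name.

```python
from typing import Dict, List

# Assumed mapping from the categories above to ExifTool short tag names.
# This is illustrative only; real images may use other tags.
INTERESTING_TAGS: Dict[str, List[str]] = {
    "software": ["Software", "CreatorTool"],
    "author": ["Artist", "Creator", "By-line"],
    "dates": ["CreateDate", "DateTimeOriginal", "ModifyDate"],
    "camera": ["Make", "Model"],
}


def pick_interesting(tags: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group the non-empty tags of one image by category."""
    found: Dict[str, Dict[str, str]] = {}
    for category, names in INTERESTING_TAGS.items():
        # Keep only tags that are present and have a non-empty value.
        hits = {name: tags[name] for name in names if tags.get(name)}
        if hits:
            found[category] = hits
    return found
```

An image that has been sanitized down to rendering data comes back as an empty dict, which makes the "whole bunch of sanitized data" case easy to count.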
Then, 18 gigs of Myspace JPG images fell into my lap.
I figured that I'd be in metadata heaven. I also figured that I might be able to put an author name behind the image of the two dogs humping, or better, the hottie showing off her naughty bits.
I was mistaken.
I ran exiftool on about 10,000 images (with some fits and starts; exiftool is a Perl app, and handing it too many images at once caused it to barf), all with the same result. Every image appears to have had its metadata stripped so that only the metadata needed to correctly render the image is left. No author. No creation tool. No dates. No camera info.
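One way to avoid feeding exiftool too many files at once is to hand it the list in smaller batches. This is a minimal sketch, assuming `exiftool` is on the PATH; the batch size of 500 and the helper names `chunked` and `run_exiftool_in_batches` are made up for illustration.

```python
import subprocess
from typing import Iterator, List, Sequence


def chunked(paths: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


def run_exiftool_in_batches(paths: Sequence[str], batch_size: int = 500) -> str:
    """Run `exiftool -t -s` once per batch and join the text output.

    The batch size is an arbitrary assumption; tune it to whatever the
    machine and shell tolerate.
    """
    outputs = []
    for batch in chunked(paths, batch_size):
        # check=False: keep going even if exiftool reports errors on some files.
        result = subprocess.run(
            ["exiftool", "-t", "-s", *batch],
            capture_output=True,
            text=True,
            check=False,
        )
        outputs.append(result.stdout)
    return "".join(outputs)
```

Called as `run_exiftool_in_batches(sorted(glob.glob("*.jpg")))`, this would make about 20 exiftool invocations for 10,000 images.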
Apparently, Myspace sanitizes all of the metadata when you upload your pics.
Of course, I had to test, especially since the 18 gigs of images could have been altered to protect the innocent, given that they originally came from some acquisition techniques that could be described as ethically questionable (they were not acquired by me in that fashion). Here's how I tested:
First, I needed an image that I knew had good juicy metadata. How about the one from the news story about the hacker 0x80, whose intact metadata Slashdot folks used to track down some pretty scary info on the supposedly anonymous 0x80:
Yes, this image has the metadata intact.
Here's the output from exiftool -t -s filename.jpg showing all of the metadata:
======== 0x80_cracker_with_laptop.jpg
ExifToolVersion	7.23
FileName	0x80_cracker_with_laptop.jpg
Directory	.
FileSize	44 kB
FileModifyDate	2007:12:14 16:05:51
FileType	JPEG
MIMEType	image/jpeg
JFIFVersion	1.1
ProfileCMMType	Lino
ProfileVersion	2.1.0
ProfileClass	Display Device Profile
ColorSpaceData	RGB
ProfileConnectionSpace	XYZ
ProfileDateTime	1998:02:09 06:49:00
ProfileFileSignature	acsp
PrimaryPlatform	Microsoft Corporation
CMMFlags	Not Embedded, Independent
DeviceManufacturer	IEC
DeviceModel	sRGB
DeviceAttributes	Reflective, Glossy, Positive, Color
RenderingIntent	Perceptual
ConnectionSpaceIlluminant	0.9642 1 0.82491
ProfileCreator	HP
ProfileID	0
ProfileCopyright	Copyright (c) 1998 Hewlett-Packard Company
ProfileDescription	sRGB IEC61966-2.1
MediaWhitePoint	0.95045 1 1.08905
MediaBlackPoint	0 0 0
RedMatrixColumn	0.43607 0.22249 0.01392
GreenMatrixColumn	0.38515 0.71687 0.09708
BlueMatrixColumn	0.14307 0.06061 0.7141
DeviceMfgDesc	IEC http://www.iec.ch
DeviceModelDesc	IEC 61966-2.1 Default RGB colour space - sRGB
ViewingCondDesc	Reference Viewing Condition in IEC61966-2.1
ViewingCondIlluminant	19.6445 20.3718 16.8089
ViewingCondSurround	3.92889 4.07439 3.36179
ViewingCondIlluminantType	D50
Luminance	76.03647 80 87.12462
MeasurementObserver	CIE 1931
MeasurementBacking	0 0 0
MeasurementGeometry	Unknown (0)
MeasurementFlare	0.999%
MeasurementIlluminant	D65
Technology	Cathode Ray Tube Display
RedTRC	(Binary data 2060 bytes, use -b option to extract)
GreenTRC	(Binary data 2060 bytes, use -b option to extract)
BlueTRC	(Binary data 2060 bytes, use -b option to extract)
ApplicationRecordVersion	2
Caption-Abstract	SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Writer-Editor	SLV
By-line	Sarah L. Voisin
By-lineTitle	STAFF
ObjectName	mag/hacker
Province-State	OK
Country-PrimaryLocationName	USA
OriginalTransmissionReference	175706
TimeCreated	12:38:30-06:00
DisplayedUnitsX	inches
DisplayedUnitsY	inches
GlobalAngle	30
GlobalAltitude	30
CopyrightFlag	False
PhotoshopThumbnail	(Binary data 3276 bytes, use -b option to extract)
PhotoshopQuality	12
PhotoshopFormat	Standard
ProgressiveScans	3 Scans
ExifByteOrder	Little-endian (Intel, II)
ImageDescription	SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Software	Adobe Photoshop CS2 Macintosh
Artist	Sarah L. Voisin
ComponentsConfiguration	YCbCr
Flash	On
UserComment
InteropIndex	R98 - DCF basic file (sRGB)
InteropVersion	0100
Compression	JPEG (old-style)
ThumbnailOffset	17196
ThumbnailLength	3276
Orientation	Horizontal (normal)
YCbCrPositioning	Co-sited
XResolution	200
YResolution	200
ResolutionUnit	inches
Make	Canon
Model	Canon EOS 20D
ModifyDate	2006:02:16 15:43:01-05:00
CreateDate	2006:02:16 15:43:01-05:00
MetadataDate	2006:02:16 15:43:01-05:00
CreatorTool	Adobe Photoshop CS2 Macintosh
ExifVersion	0221
FlashpixVersion	0100
ColorSpace	sRGB
ExifImageWidth	3504
ExifImageHeight	2336
DateTimeOriginal	2005:12:20 12:38:30-05:00
DateTimeDigitized	2005:12:20 12:38:30-05:00
ExposureTime	1/30
FNumber	5.0
ExposureProgram	Manual
ISO	100
ShutterSpeedValue	1/30
ApertureValue	5.0
ExposureCompensation	0
MeteringMode	Multi-segment
FlashFired	True
FlashReturn	No return detection
FlashMode	On
FlashFunction	False
FlashRedEyeMode	False
FocalLength	85.0 mm
FocalPlaneXResolution	3959.32203389831
FocalPlaneYResolution	3959.32203389831
FocalPlaneResolutionUnit	inches
CustomRendered	Normal
ExposureMode	Manual
WhiteBalance	Auto
SceneCaptureType	Standard
NativeDigest	36864,40960,40961,37121,37122,40962,40963,37510,40964,36867,36868,33434,33437,34850,34852,34855,34856,37377,37378,37379,37380,37381,37382,37383,37384,37385,37386,37396,41483,41484,41486,41487,41488,41492,41493,41495,41728,41729,41730,41985,41986,41987,41988,41989,41990,41991,41992,41993,41994,41995,41996,42016,0,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,28,30;3B11799D192F50186735EF6636B7FD47
DocumentID	uuid:5A82A660A09311DAB292D9FC4FB3D5EC
InstanceID	uuid:5A82A661A09311DAB292D9FC4FB3D5EC
DerivedFromInstanceID	uuid:5A82A65FA09311DAB292D9FC4FB3D5EC
DerivedFromDocumentID	uuid:5A82A65FA09311DAB292D9FC4FB3D5EC
Format	image/jpeg
Description	SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Creator	Sarah L. Voisin
Title	mag/hacker
CaptionWriter	SLV
AuthorsPosition	STAFF
Credit	TWP
Source	20051220
City	Roland
State	OK
Country	USA
TransmissionReference	175706
ColorMode	3
ICCProfileName	sRGB IEC61966-2.1
DateCreated	2005:12:20
History
ImageWidth	228
ImageHeight	153
EncodingProcess	Baseline DCT, Huffman coding
BitsPerSample	8
ColorComponents	3
YCbCrSubSampling	YCbCr4:4:4 (1 1)
Aperture	5.0
DateTimeCreated	2005:12:20 12:38:30-06:00
ImageSize	228x153
ScaleFactor35efl	1.6
ShutterSpeed	1/30
ThumbnailImage	(Binary data 3276 bytes, use -b option to extract)
CircleOfConfusion	0.019 mm
FOV	15.1 deg
FocalLength35efl	85.0 mm (35 mm equivalent: 136.1 mm)
HyperfocalDistance	77.02 m
LightValue	9.6
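Output like this is easier to work with once it is in a dictionary. Here is a minimal sketch of a parser for `exiftool -t -s` text, assuming one tab-separated tag and value per line and a `======== filename` header before each file; `parse_exiftool_ts` is a hypothetical helper name, not part of exiftool.

```python
from typing import Dict, Optional

FILE_HEADER = "======== "


def parse_exiftool_ts(text: str) -> Dict[str, Dict[str, str]]:
    """Parse `exiftool -t -s` text into {filename: {tag: value}}.

    Tags seen before any file header are stored under the key "".
    """
    results: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for line in text.splitlines():
        if line.startswith(FILE_HEADER):
            # A header line starts the tag list for a new file.
            current = results.setdefault(line[len(FILE_HEADER):].strip(), {})
            continue
        if "\t" not in line:
            # Skip blank lines and summary lines such as "2 image files read".
            continue
        if current is None:
            current = results.setdefault("", {})
        tag, _, value = line.partition("\t")
        current[tag.strip()] = value.strip()
    return results
```

With the output above, `parse_exiftool_ts(text)["0x80_cracker_with_laptop.jpg"]["Artist"]` would give the photographer's name, and tags with no value (like `UserComment`) map to an empty string.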
Now, I upload it to my Myspace account, and then use Firefox's "Save image as..." to grab the resulting image:
Yes, I have a Myspace account. It's my dirty little information gathering secret.
Here's the resulting metadata from the Myspace image, using the same exiftool command:
======== 0x08 from myspace.jpg
ExifToolVersion	7.23
FileName	0x08 from myspace.jpg
Directory	.
FileSize	6 kB
FileModifyDate	2008:04:01 13:59:33
FileType	JPEG
MIMEType	image/jpeg
JFIFVersion	1.1
ResolutionUnit	inches
XResolution	100
YResolution	100
ImageWidth	228
ImageHeight	153
EncodingProcess	Baseline DCT, Huffman coding
BitsPerSample	8
ColorComponents	3
YCbCrSubSampling	YCbCr4:2:0 (2 2)
ImageSize	228x153
That's a BIG difference. Good Myspace. Yes, I know that putting those two words together in the same sentence seems...wrong.
What about Facebook? I uploaded the same original image (with the juicy metadata) to my profile on Facebook. Here are the results:
...and the resulting metadata (again, same exiftool command)?
======== 0x80 form facebook.jpg
ExifToolVersion	7.23
FileName	0x80 form facebook.jpg
Directory	.
FileSize	6 kB
FileModifyDate	2008:04:04 14:25:48
FileType	JPEG
MIMEType	image/jpeg
JFIFVersion	1.1
ResolutionUnit	inches
XResolution	72
YResolution	72
ImageWidth	228
ImageHeight	153
EncodingProcess	Baseline DCT, Huffman coding
BitsPerSample	8
ColorComponents	3
YCbCrSubSampling	YCbCr4:2:0 (2 2)
ImageSize	228x153
Yes. Good Facebook.
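The before-and-after comparison can also be done mechanically instead of by eyeball. This is a small sketch, assuming each image's tags are already in a plain dict; `compare_tags` is a hypothetical helper, and the sample dicts hold only a handful of values copied from the outputs above.

```python
from typing import Dict, List, Tuple


def compare_tags(
    original: Dict[str, str], downloaded: Dict[str, str]
) -> Tuple[List[str], Dict[str, Tuple[str, str]]]:
    """Return (tags removed, tags whose value changed) between two images."""
    removed = sorted(tag for tag in original if tag not in downloaded)
    changed = {
        tag: (original[tag], downloaded[tag])
        for tag in original
        if tag in downloaded and original[tag] != downloaded[tag]
    }
    return removed, changed


# A few values taken from the original and Myspace outputs, not the full set.
original = {
    "Artist": "Sarah L. Voisin",
    "Software": "Adobe Photoshop CS2 Macintosh",
    "Make": "Canon",
    "Model": "Canon EOS 20D",
    "XResolution": "200",
    "YCbCrSubSampling": "YCbCr4:4:4 (1 1)",
    "ImageSize": "228x153",
}
myspace = {
    "XResolution": "100",
    "YCbCrSubSampling": "YCbCr4:2:0 (2 2)",
    "ImageSize": "228x153",
}

removed, changed = compare_tags(original, myspace)
```

The changed resolution and subsampling values suggest the image was re-encoded on upload rather than just having its tags deleted, though that is an inference from the output, not something either site documents here.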
Overall, I was shocked that both Myspace and Facebook had done this. Am I off base? Is this a common thing?
I guess I have a few more "social networks" to try. Twitter, Picasa, LinkedIn, Flickr (I KNOW they keep and analyze some metadata...), and more that I'm sure haven't popped into my head yet.
Looks like I'm still in need of finding a good repository of metadata. Flickr, here I come.
- Larry "haxorthematrix" Pesce
larry /at/ pauldotcom.com