I've been poking at some metadata for information gathering lately for a project or two. One of the document types that I've been focused on is JPEG images. Why, you ask? Take a look at this web page. See all those pretty pictures? JPGS. Same with just about every other website on the planet.
Looks like we have plenty of fodder for our metadata cannon.
So, I began analyzing metadata on JPGS from random websites that struck my fancy. In a few cases, I came across some good information: the type of software used to produce the image (great for selecting a particular exploit), the author (great for selecting a target), dates of authorship (good for determining validity of attack and target) and finally some camera types (good for determining some basic financial commitment, and whose memory cards to steal on a physical assessment). Mostly, I came across a whole bunch of sanitized data. Clearly I needed a better set of JPGS to play with.
Then, 18 gigs of Myspace JPG images fell into my lap.
I figured that I'd be in metadata heaven. I also figured that I might be able to put an author name behind the image of the two dogs humping, or better, the hottie showing off her naughty bits.
I was mistaken.
I ran exiftool on about 10,000 images (with some fits and starts; exiftool is a perl app, and providing it too many images at once caused it to barf), all with the same result. Every image appears to have had the metadata stripped so that only the metadata needed to correctly render the image is left. No author. No creation tool. No dates. No camera info.
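For anyone who hits the same "too many images at once" problem, here is a minimal sketch of feeding exiftool in smaller batches. This is not the script I used; the function names and the batch size of 500 are made up for illustration, and it assumes exiftool is on your PATH. The -t and -s flags are the same ones I use later in this post.

```python
import subprocess
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def build_command(batch: Sequence[str], exiftool: str = "exiftool") -> List[str]:
    """Build the argument list for one exiftool run (tab-delimited, short tag names)."""
    return [exiftool, "-t", "-s", *batch]


def run_exiftool(paths: Sequence[str], batch_size: int = 500,
                 exiftool: str = "exiftool") -> str:
    """Run exiftool once per batch and return the concatenated output.

    batch_size is an arbitrary guess; lower it if exiftool still chokes.
    """
    outputs = []
    for batch in chunked(paths, batch_size):
        # check=False: keep going even if one batch reports errors.
        result = subprocess.run(build_command(batch, exiftool),
                                capture_output=True, text=True, check=False)
        outputs.append(result.stdout)
    return "".join(outputs)
```

Hand run_exiftool() a list of image paths (say, from pathlib's rglob("*.jpg")) and redirect the returned text to a file for later grepping.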
Apparently, Myspace sanitizes all of the metadata when you upload your pics.
Good Myspace.
Of course, I had to test, especially since the 18 Gigs of images could have been played with to protect the innocent, given that they originally came from some acquisition techniques that could be described as ethically questionable (they were not acquired by me in that fashion). Here's how I tested:
First, I needed an image that I knew had good juicy metadata. How about the one from the news story about the hacker 0x80? Slashdot folks used its intact metadata to track down some pretty scary info on the supposedly anonymous 0x80:
Yes, this image has the metadata intact.
Here's the output from exiftool -t -s filename.jpg showing all of the metadata:
======== 0x80_cracker_with_laptop.jpg
ExifToolVersion 7.23
FileName 0x80_cracker_with_laptop.jpg
Directory .
FileSize 44 kB
FileModifyDate 2007:12:14 16:05:51
FileType JPEG
MIMEType image/jpeg
JFIFVersion 1.1
ProfileCMMType Lino
ProfileVersion 2.1.0
ProfileClass Display Device Profile
ColorSpaceData RGB
ProfileConnectionSpace XYZ
ProfileDateTime 1998:02:09 06:49:00
ProfileFileSignature acsp
PrimaryPlatform Microsoft Corporation
CMMFlags Not Embedded, Independent
DeviceManufacturer IEC
DeviceModel sRGB
DeviceAttributes Reflective, Glossy, Positive, Color
RenderingIntent Perceptual
ConnectionSpaceIlluminant 0.9642 1 0.82491
ProfileCreator HP
ProfileID 0
ProfileCopyright Copyright (c) 1998 Hewlett-Packard Company
ProfileDescription sRGB IEC61966-2.1
MediaWhitePoint 0.95045 1 1.08905
MediaBlackPoint 0 0 0
RedMatrixColumn 0.43607 0.22249 0.01392
GreenMatrixColumn 0.38515 0.71687 0.09708
BlueMatrixColumn 0.14307 0.06061 0.7141
DeviceMfgDesc IEC http://www.iec.ch
DeviceModelDesc IEC 61966-2.1 Default RGB colour space - sRGB
ViewingCondDesc Reference Viewing Condition in IEC61966-2.1
ViewingCondIlluminant 19.6445 20.3718 16.8089
ViewingCondSurround 3.92889 4.07439 3.36179
ViewingCondIlluminantType D50
Luminance 76.03647 80 87.12462
MeasurementObserver CIE 1931
MeasurementBacking 0 0 0
MeasurementGeometry Unknown (0)
MeasurementFlare 0.999%
MeasurementIlluminant D65
Technology Cathode Ray Tube Display
RedTRC (Binary data 2060 bytes, use -b option to extract)
GreenTRC (Binary data 2060 bytes, use -b option to extract)
BlueTRC (Binary data 2060 bytes, use -b option to extract)
ApplicationRecordVersion 2
Caption-Abstract SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Writer-Editor SLV
By-line Sarah L. Voisin
By-lineTitle STAFF
ObjectName mag/hacker
Province-State OK
Country-PrimaryLocationName USA
OriginalTransmissionReference 175706
TimeCreated 12:38:30-06:00
DisplayedUnitsX inches
DisplayedUnitsY inches
GlobalAngle 30
GlobalAltitude 30
CopyrightFlag False
PhotoshopThumbnail (Binary data 3276 bytes, use -b option to extract)
PhotoshopQuality 12
PhotoshopFormat Standard
ProgressiveScans 3 Scans
ExifByteOrder Little-endian (Intel, II)
ImageDescription SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Software Adobe Photoshop CS2 Macintosh
Artist Sarah L. Voisin
ComponentsConfiguration YCbCr
Flash On
UserComment
InteropIndex R98 - DCF basic file (sRGB)
InteropVersion 0100
Compression JPEG (old-style)
ThumbnailOffset 17196
ThumbnailLength 3276
Orientation Horizontal (normal)
YCbCrPositioning Co-sited
XResolution 200
YResolution 200
ResolutionUnit inches
Make Canon
Model Canon EOS 20D
ModifyDate 2006:02:16 15:43:01-05:00
CreateDate 2006:02:16 15:43:01-05:00
MetadataDate 2006:02:16 15:43:01-05:00
CreatorTool Adobe Photoshop CS2 Macintosh
ExifVersion 0221
FlashpixVersion 0100
ColorSpace sRGB
ExifImageWidth 3504
ExifImageHeight 2336
DateTimeOriginal 2005:12:20 12:38:30-05:00
DateTimeDigitized 2005:12:20 12:38:30-05:00
ExposureTime 1/30
FNumber 5.0
ExposureProgram Manual
ISO 100
ShutterSpeedValue 1/30
ApertureValue 5.0
ExposureCompensation 0
MeteringMode Multi-segment
FlashFired True
FlashReturn No return detection
FlashMode On
FlashFunction False
FlashRedEyeMode False
FocalLength 85.0 mm
FocalPlaneXResolution 3959.32203389831
FocalPlaneYResolution 3959.32203389831
FocalPlaneResolutionUnit inches
CustomRendered Normal
ExposureMode Manual
WhiteBalance Auto
SceneCaptureType Standard
NativeDigest 36864,40960,40961,37121,37122,40962,40963,37510,40964,36867,36868,33434,33437,34850,34852,34855,34856,37377,37378,37379,37380,37381,37382,37383,37384,37385,37386,37396,41483,41484,41486,41487,41488,41492,41493,41495,41728,41729,41730,41985,41986,41987,41988,41989,41990,41991,41992,41993,41994,41995,41996,42016,0,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,28,30;3B11799D192F50186735EF6636B7FD47
DocumentID uuid:5A82A660A09311DAB292D9FC4FB3D5EC
InstanceID uuid:5A82A661A09311DAB292D9FC4FB3D5EC
DerivedFromInstanceID uuid:5A82A65FA09311DAB292D9FC4FB3D5EC
DerivedFromDocumentID uuid:5A82A65FA09311DAB292D9FC4FB3D5EC
Format image/jpeg
Description SLUG: mag/hacker DATE: 12/20/2005 PHOTOGRAPHER: Sarah L. Voisin/TWP id#: LOCATION: Roland, OK.CAPTION: .PICTURED:
Creator Sarah L. Voisin
Title mag/hacker
CaptionWriter SLV
AuthorsPosition STAFF
Credit TWP
Source 20051220
City Roland
State OK
Country USA
TransmissionReference 175706
ColorMode 3
ICCProfileName sRGB IEC61966-2.1
DateCreated 2005:12:20
History
ImageWidth 228
ImageHeight 153
EncodingProcess Baseline DCT, Huffman coding
BitsPerSample 8
ColorComponents 3
YCbCrSubSampling YCbCr4:4:4 (1 1)
Aperture 5.0
DateTimeCreated 2005:12:20 12:38:30-06:00
ImageSize 228x153
ScaleFactor35efl 1.6
ShutterSpeed 1/30
ThumbnailImage (Binary data 3276 bytes, use -b option to extract)
CircleOfConfusion 0.019 mm
FOV 15.1 deg
FocalLength35efl 85.0 mm (35 mm equivalent: 136.1 mm)
HyperfocalDistance 77.02 m
LightValue 9.6
Now, I upload it to my Myspace account, and then use Firefox's "Save image as..." to grab the resulting image:
Yes, I have a Myspace account. It's my dirty little information gathering secret.
Here's the resulting metadata from the Myspace image, using the same exiftool command:
======== 0x08 from myspace.jpg
ExifToolVersion 7.23
FileName 0x08 from myspace.jpg
Directory .
FileSize 6 kB
FileModifyDate 2008:04:01 13:59:33
FileType JPEG
MIMEType image/jpeg
JFIFVersion 1.1
ResolutionUnit inches
XResolution 100
YResolution 100
ImageWidth 228
ImageHeight 153
EncodingProcess Baseline DCT, Huffman coding
BitsPerSample 8
ColorComponents 3
YCbCrSubSampling YCbCr4:2:0 (2 2)
ImageSize 228x153
That's a BIG difference. Good Myspace. Yes, I know that putting those two words together in the same sentence seems...wrong.
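Eyeballing two dumps works for one image, but if you want to repeat this against other sites, a quick diff helps. Here is a rough sketch that compares two saved exiftool -t -s outputs and lists which tags disappeared and which changed value. The function names are my own invention, and it assumes each dump covers a single file with one "TagName value" pair per line, like the listings above.

```python
from typing import Dict, List, Tuple


def parse_tags(exiftool_output: str) -> Dict[str, str]:
    """Turn `exiftool -t -s` text into a {tag: value} dict.

    Assumes one file per dump; "========" header lines are skipped.
    """
    tags: Dict[str, str] = {}
    for line in exiftool_output.splitlines():
        if not line.strip() or line.startswith("========"):
            continue
        # Short tag names contain no whitespace, so split on the first gap.
        parts = line.split(None, 1)
        tags[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return tags


def stripped_tags(before: str, after: str) -> List[str]:
    """Tags present in the original dump but missing after upload."""
    return sorted(set(parse_tags(before)) - set(parse_tags(after)))


def changed_tags(before: str, after: str) -> Dict[str, Tuple[str, str]]:
    """Tags present in both dumps whose values differ: {tag: (old, new)}."""
    old, new = parse_tags(before), parse_tags(after)
    return {tag: (old[tag], new[tag])
            for tag in old.keys() & new.keys()
            if old[tag] != new[tag]}
```

Save each dump to a text file, read them in, and stripped_tags() gives you the list of everything the site threw away. Expect changed_tags() to flag file-level fields like FileName and FileModifyDate every time; the interesting ones are things like resolution and subsampling, which hint that the image was re-encoded.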
What about Facebook? I uploaded the same original image (with the juicy metadata) to my profile on Facebook. Here are the results:
...and the resulting metadata (again, same exiftool command)?
======== 0x80 form facebook.jpg
ExifToolVersion 7.23
FileName 0x80 form facebook.jpg
Directory .
FileSize 6 kB
FileModifyDate 2008:04:04 14:25:48
FileType JPEG
MIMEType image/jpeg
JFIFVersion 1.1
ResolutionUnit inches
XResolution 72
YResolution 72
ImageWidth 228
ImageHeight 153
EncodingProcess Baseline DCT, Huffman coding
BitsPerSample 8
ColorComponents 3
YCbCrSubSampling YCbCr4:2:0 (2 2)
ImageSize 228x153
Yes. Good Facebook.
Overall, I was shocked that both Myspace and Facebook had done this. Am I off base? Is this a common thing?
I guess I have a few more "social networks" to try. Twitter, Picasa, LinkedIn, Flickr (I KNOW they keep and analyze some metadata...), and more that I'm sure haven't popped into my head yet.
Looks like I'm still in need of finding a good repository of metadata. Flickr, here I come.
- Larry "haxorthematrix" Pesce
larry /at/ pauldotcom.com