Here is another great post from Doug Burks.
For the purposes of this blog post, “evil User Agents” could be truly
evil User Agents like “Bob’s Evil Clown C&C Agent” or they could
simply be outdated and vulnerable browser software.



If you already proxy your outbound HTTP traffic, then this is a trivial
matter of parsing your existing proxy logs. That is left as an
exercise for the reader, but the command-line kung-fu at the bottom of
this blog post may help you get started. If you don’t already have a
proxy, read on for some other ideas.
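
For a proxy log, the report boils down to counting requests per client IP and User Agent pair. Here is a minimal sketch, assuming a tab-delimited log where you already know which columns hold the client IP and the User Agent. The function name ua_report and the column numbers in the usage line are hypothetical, so adjust them to whatever your proxy actually writes.

```shell
# ua_report FILE IP_COL UA_COL
# Count requests per (client IP, User Agent) pair in a tab-delimited log.
# The column positions are assumptions; check your proxy's log format.
ua_report() {
    awk -F'\t' -v ip="$2" -v ua="$3" '
        # Skip lines with an empty or placeholder ("-") User Agent
        $ua != "" && $ua != "-" { print $ip "\t" $ua }
    ' "$1" | sort | uniq -c | sort -nr
}

# Hypothetical usage: client IP in column 1, User Agent in column 5
# ua_report /path/to/proxy.log 1 5
```

Splitting on tabs only (rather than any whitespace) keeps User Agent strings that contain spaces in one piece.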

WPAD
WPAD is Web Proxy Auto Discovery. By default, most browsers will
attempt to retrieve a proxy configuration file from a local WPAD
server. Even if you don’t have a proxy server, you can create a
“wpad” A record in your internal DNS and point it at a web server
where you can monitor the logs. You don’t have to add a proxy
configuration file to the web server, just let the clients request the
file and then query the web server logs for their User Agent strings.
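
As a rough sketch of querying those logs: the function below assumes the web server writes the Apache/NCSA "combined" log format and that clients ask for /wpad.dat (the conventional WPAD file name). The function name wpad_agents is made up for this example, and a User Agent containing an escaped double quote would confuse the simple quote-based splitting.

```shell
# wpad_agents ACCESS_LOG
# List client IP and User Agent for every request for /wpad.dat,
# with a count of requests per pair, busiest first.
# Assumes "combined" log format: splitting each line on double quotes
# puts the request line in field 2 and the User Agent in field 6.
wpad_agents() {
    awk -F'"' '
        $2 ~ /^GET \/wpad\.dat/ {
            split($1, head, " ")      # head[1] is the client IP
            print head[1] "\t" $6
        }
    ' "$1" | sort | uniq -c | sort -nr
}

# Hypothetical usage; the log path depends on your web server
# wpad_agents /var/log/apache2/access.log
```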
This isn’t a 100% solution as some malware may not try to connect to
WPAD. [1] [2]

httpry
If you can span or tap your outbound Internet traffic to a box running
the httpry utility [3], it will create logs very similar to a proxy
but without requiring any reconfiguration of your network or clients
[4] [5]. A recent update to Security Onion added httpry and
configured it to run on all monitored interfaces, so you can have full
IDS/NSM *and* searchable HTTP logs in one box [6].
httpry’s output format is configurable. If you want just the client
IP and User Agent, you can configure httpry to log just those fields.
If running httpry in Security Onion, it logs in a multi-field,
tab-delimited format (for uploading into Sguil [7]) in which the client
IP is field 2 and the User Agent is field 10. We can use our old
friends cut, awk, sort, and uniq to pare this down to just client IP
and User Agent to produce an actionable report:

cut -f2,10 /nsm/sensor_data/*/httpry/`date +%Y-%m-%d`.log | awk '$2 != "-"' | sort | uniq -c | sort -nr

Let’s break the command down:
cut -f2,10 /nsm/sensor_data/*/httpry/`date +%Y-%m-%d`.log
Security Onion stores its httpry logs in
/nsm/sensor_data/NAME_OF_SENSOR/httpry/YYYY-MM-DD.log (where
NAME_OF_SENSOR is the actual sensor name and YYYY-MM-DD is the actual
date). We can get today’s date in YYYY-MM-DD format using the
backticked command `date +%Y-%m-%d`. So the full command extracts
fields 2 and 10 from today’s log on all sensors on the box.
| awk '$2 != "-"'
Take the output from the cut command and remove lines where the User
Agent field is just a hyphen.
| sort | uniq -c |sort -nr
Take the output from the awk command, sort the lines (by IP address,
then User Agent) so that duplicates are adjacent, collapse them into
unique entries (giving a count of each unique entry at the beginning
of the line), and then sort in reverse numerical order. This puts the
IP and User Agent pairs with the highest amount of traffic on top.
Here’s some sample output:

 2701  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
 1024  Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20081217 Firefox/
  992  Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
   39  Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20110921 Ubuntu/10.04 (lucid) Firefox/3.6.23
   78  Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
   11  Roku/DVP-3.0 (013.00E02227A)
    5  Wget/1.12 (linux-gnu)
    3  Bob's Evil Clown C&C Agent
    2  curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/ libidn/1.15

On the second line of the output, we see a client running a
vulnerable Firefox browser. On the next-to-last line of the
output, we find Bob’s Evil Clown C&C Agent.
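
Eyeballing the report works for a small network, but you may want to flag known-bad agents automatically. Here is a minimal sketch, assuming you keep a watchlist file with one extended regular expression per line. The function name flag_agents, the file name watchlist.txt, and the example patterns are all hypothetical.

```shell
# flag_agents PATTERN_FILE
# Read "IP<TAB>User Agent" lines on stdin and print only those whose
# User Agent matches at least one pattern in PATTERN_FILE
# (one extended regex per line; blank lines are ignored).
flag_agents() {
    awk -F'\t' -v watch="$1" '
        BEGIN {
            # Load the watchlist before any input is processed
            while ((getline line < watch) > 0)
                if (line != "") pats[++n] = line
        }
        {
            # Match against the User Agent (field 2) only, not the IP
            for (i = 1; i <= n; i++)
                if ($2 ~ pats[i]) { print; break }
        }
    '
}

# Hypothetical usage with a watchlist containing lines such as
#   Evil Clown
#   Firefox/[12]\.
# cut -f2,10 /nsm/sensor_data/*/httpry/`date +%Y-%m-%d`.log | sort -u | flag_agents watchlist.txt
```

Matching on the User Agent field alone avoids false hits where a pattern happens to match part of an IP address.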
[1] –
[2] –
[3] –
[4] –
[5] –
[6] –
[7] –
