Archive for May, 2008

tattletale search forms

Posted in web on 2008-05-14 by docsmith

Many sites embed ads, and the company serving the most ads on the net is doubleclick.net. It’s said that up to 80% of the web sites out there display ads hosted by DoubleClick (“dc”).

The story behind this has been told often already: by placing a cookie on your computer the first time you load a dc ad, they can track your path through the web and display ads matching your interests. If they see you visiting travelchannel.com and opodo.com, they could display ads leading to yet another travel company.

But sometimes there’s more than meets the eye …

Some sites pass the data you entered in a search form to the next page by appending it to the address line, as travelchannel.com does:
http://search.travelchannel.com/search?w=holiday+caribbean+sea&searchFormSubmit.x=0&searchFormSubmit.y=0

This address is visible to dc in the HTTP “Referer” header, which identifies the page that embeds the ad. This time they see not only which site you’ve visited but also what you’ve been searching for, without you noticing. Searching for “holiday caribbean sea” doesn’t seem to do any harm, but what if you’ve been looking for “nudist camp” or “blood cancer therapy”? Do you really want an ad company like dc to know what’s on your mind?
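To get a feeling for how little effort this takes on the receiving side, here is a minimal Python sketch of reading a search phrase out of a referer. It only uses the travelchannel.com address from above; the function name `search_terms_from_referer` is made up for illustration, and the parameter name `w` is simply what that one site happens to use. Nothing here is known to be how dc actually processes its logs.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms_from_referer(referer, param="w"):
    """Return the search phrase found in a referer URL, or None.

    `param` is the query parameter that holds the search text; "w" is an
    assumption taken from the travelchannel.com example, other sites differ.
    """
    query = urlsplit(referer).query
    values = parse_qs(query).get(param, [])  # parse_qs decodes "+" to spaces
    return values[0] if values else None

referer = ("http://search.travelchannel.com/search?w=holiday+caribbean+sea"
           "&searchFormSubmit.x=0&searchFormSubmit.y=0")

print(urlsplit(referer).hostname)          # the site you visited
print(search_terms_from_referer(referer))  # what you searched for there
```

Two standard library calls are enough to turn the address line into a site name plus a plain-text search phrase, which can then be stored next to the cookie that came with the same request.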

At least there’s no direct link to your person. Well, there shouldn’t be. Since there’s no real chance of escaping the dc ads on the modern web, the most practical defense is to clear your browser’s cookies frequently, so that no substantial surfing or search profile can build up. The same goes for search engines. Some browsers, like Firefox, allow you to deny cookies from certain domains, but it’d be real hard work to build a list of all ad companies.

If you’ve got an Apache web server at hand and you’d like to get an impression of which sites embed dc ads, including the information dc receives via the referer etc., you may try to set up a little trap with the two steps described in the attached PDF document. Roughly speaking, it’s just a redirection of the best-known dc hostnames to your own Apache, where a little cgi-bin script logs all the calls.
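The logging half of such a trap could look roughly like the Python sketch below. This is not the script from the PDF, just an illustration of the idea under a few assumptions: the log path `/tmp/dc-trap.log` and the names `format_log_line` and `FIELDS` are invented here, the hostname redirection is already in place, and Apache is configured to run the file as a CGI script. The environment variables are the ones Apache normally hands to CGI programs (`REQUEST_URI` is an Apache extension rather than part of the CGI standard).

```python
#!/usr/bin/env python3
import os
import sys
from datetime import datetime, timezone

# Hypothetical log location; it must be writable by the web server user.
LOG_PATH = "/tmp/dc-trap.log"

# Request details worth keeping: who asked, for which ad host and path,
# from which embedding page, and with which cookies.
FIELDS = ("REMOTE_ADDR", "HTTP_HOST", "REQUEST_URI", "HTTP_REFERER", "HTTP_COOKIE")

def format_log_line(environ, now=None):
    """Build one tab-separated log line from a CGI environment mapping."""
    now = now or datetime.now(timezone.utc)
    parts = [now.isoformat()]
    for name in FIELDS:
        # Missing values become "-"; tabs and newlines are flattened so
        # every request stays on exactly one line.
        value = environ.get(name, "-").replace("\t", " ").replace("\n", " ")
        parts.append(f"{name}={value}")
    return "\t".join(parts)

def main():
    with open(LOG_PATH, "a", encoding="utf-8") as log:
        log.write(format_log_line(os.environ) + "\n")
    # Send an empty reply, so the embedding page just gets a blank ad slot.
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")

# Only act when actually started by a web server as a CGI program.
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    main()
```

Each redirected ad request then leaves one line in the log, and the `HTTP_REFERER` field shows exactly what the embedding page would have handed over to dc.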