Now if that headline doesn’t get me some search ranking juice, nothing will - though my contextual ads (left) are likely to be less impressed.
I was going to post this earlier in the week, but Eric Peterson’s swashbuckling defense of cookies (and my hand-wringing response) intervened. As it turns out, though, that debate is very relevant to this post, which concerns the latest build of Internet Explorer 8, which hit the web this week. (Internet Explorer is still used by around 80% of the world’s web users, though not by you lot, who seem to favor Firefox by a nose.)
I’ve already posted once about IE8, and talked about its new “InPrivate” features (also known as “porn mode”) that allow you to surf the web without leaving a trace (on the machine you’re using, of course – the websites you visit can still track your behavior). It’s worthy of another post because the specific feature that piqued my interest the last time – InPrivate Blocking – has a new name and somewhat different behavior now.
The new name for InPrivate Blocking is InPrivate Filtering, which is certainly a better name. You may recall that InPrivate Blocking was a feature that allowed the user to tell IE to block requests to third-party websites, either manually, or if content from those sites had been served in a third-party context more than 10 times. Examples of this kind of content? Web analytics tracking tag code; ads; widgets; embedded YouTube videos. The idea is to enable users to opt out of this kind of content because it enables third parties to track user behavior (with or without cookies) without them really knowing.
So what’s new in RC1, apart from a friendlier name? Well, a couple of things. The first is that InPrivate Filtering can be turned on even if you’re not browsing in “InPrivate mode”, via the Safety menu, or a handy little icon in the status bar:
Click it, and InPrivate Filtering is on. There’s no way to turn this on by default; you have to click the icon every time you start a new IE instance.
The other major change is that there’s more control over how third-party content is blocked. In the previous beta, content was automatically blocked if it turned up more than 10 times (i.e. on 10 different sites) as third-party content. That number is now tunable, to anywhere between 3 and 30:
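In case it helps to see the heuristic spelled out, here's a rough Python sketch of the idea - my own reconstruction, not Microsoft's code, and all the names are invented:

```python
from collections import defaultdict

# A sketch of the InPrivate Filtering heuristic: a third-party domain is
# flagged once it has been observed on more than `threshold` distinct
# first-party sites. The class and method names here are my invention.

class FilteringHeuristic:
    def __init__(self, threshold=10):  # RC1 lets users tune this from 3 to 30
        self.threshold = threshold
        self.seen_on = defaultdict(set)  # third-party domain -> first-party sites

    def observe(self, first_party, third_party):
        if first_party != third_party:
            self.seen_on[third_party].add(first_party)

    def is_blocked(self, third_party):
        return len(self.seen_on[third_party]) > self.threshold

h = FilteringHeuristic(threshold=3)
for site in ["news.example", "shop.example", "blog.example", "mail.example"]:
    h.observe(site, "tracker.example")
print(h.is_blocked("tracker.example"))  # True: seen on 4 sites, threshold is 3
```

The point of the tunable threshold is simply where you draw the line: at 3, almost every analytics tag and ad server gets caught; at 30, only the most ubiquitous ones do.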
The idea of InPrivate Filtering Subscriptions still exists – a user can import an appropriately formatted XML file (or click on a link on a site, such as this one) to subscribe to a list of blocked third-party content. I’ve not seen any public subscriptions pop up, however, in the time since IE8 beta 2 came out.
In my previous post on IE8, I wrote about how, as someone whose job depends on being able to track users, I am conflicted about this functionality. This revision makes it slightly easier for privacy hawks to block third-party content, and whilst I welcome it, my original prediction – that it will be relatively lightly used in practice – still stands.
Interestingly, since IE8 beta 2 was announced in August, other browser manufacturers have followed suit – most notably, Mozilla, which will be including InPrivate-style functionality in Firefox 3.1 – though without the third-party content blocking feature. Apple’s Safari browser has had similar functionality for some time.
Eric Peterson has an impassioned post on his blog in which he defends the Obama Administration’s decision to use persistent cookies for tracking behavior on the Whitehouse.gov site. He directs particular ire at an article by Chris Soghoian at CNET from November which questioned whether it was a smart move for the (then) Obama Transition Team to be using embedded YouTube videos for streaming Obama’s weekly addresses on the Change.gov site.
Eric’s post is a follow-up to his post from November in which he called upon Barack Obama to relax the burdensome rules around the use of persistent cookies on Government websites. And let me say this: those rules suck. They ban the use of persistent cookies altogether, both first- and third-party. And I stand firmly behind Eric’s stance that those rules should be re-written – Government can’t be effective in providing services online if it can’t track the usage of those services.
But in his enthusiasm, Eric does actually conflate two somewhat separate issues - cookies on the one hand, and third-party content & tracking on the other. And third-party tracking & content deserves at least as much attention as cookies (if not more, in fact).
Whilst it's no skin off my nose to send this data to Webtrends and Google, this is partly because a) I know and trust those organizations, and b) the content on the Whitehouse.gov site is pretty uncontentious. But what if I were looking at detailed information about entitlement programs, or applying online for some Government help with my mortgage? There is at least a valid question to be asked about how this kind of behavior data is shared with third parties, separate from the cookie discussion.
My view? I don’t really think Government websites should be sending tracking data to third parties, or retrieving content from third-party sites (other than other Government sites). There are plenty of first-party analytics solutions which offer just as much functionality as hosted solutions and would allow the Government to maintain control of this data and to be definitive about how it is stored and used.
Eric also makes the point that, with everything else that’s going on right now, it’s borderline irresponsible to be chewing up the new administration’s time with pedantic questions about cookies or third-party tracking. But I don't think it's inappropriate at this stage to flag this to the Obama administration, because I imagine that at this moment (or very shortly) a variety of Federal agencies are looking at how they can put more information and services online.
Helping the administration to set sensible policies now will stop precious money being wasted if policies have to be changed later. And besides, wasn’t it Eric who called on Obama’s team to take the time to review the rules in the first place? Could they not churn out some websites with some simple log-based tracking now and then focus on E-government policy when the economy’s calmed down?
Another issue addressed in Chris’s original post is the wisdom of using YouTube (or indeed any third-party streaming service) for the videos on the Change.gov site (YouTube is also used on Whitehouse.gov). This raises a number of questions, such as how Google was chosen over, say, Vimeo, or Hulu, or MSN Video, and whether there are any SLAs in place to ensure this material remains available on an ongoing basis.
Let me make it clear that I don't object to Obama's addresses being available on YouTube - they should be there, and on every other video streaming website. But for information published through the Whitehouse.gov website itself, I'm not sure that a third-party streaming site is the best choice. How confident can we be about the integrity of this information? After all, we wouldn't want Obama to be RickRolled, now would we?
You’re probably thinking “Jeez, what a kill-joy” as you read this post. And it’s true that privacy wonks (which I would not fully consider myself to be) do have a rather Cassandra-ish quality, always looking for the bad. But this is an essential part of the dynamics of the debate on topics like this – which means that Eric’s robust post is essential too (and welcome, I should add). But we did get into rather hot water with the previous administration’s disregard for privacy. So it only makes sense that the new guys should get to hear these concerns now.
Does entropic de-anonymization of sparse microdata set your pulse racing? If so, you’re gonna love this paper [PDF] by Arvind Narayanan and Vitaly Shmatikov of the University of Texas at Austin. Even if your stats math is as rusty as mine, however, the paper makes fascinating reading - and is surprisingly readable, if you skip over the algorithm-heavy bit in the middle.
For those of you who don’t have time to read an academic paper, here’s a summary. The paper presents a method for taking an ‘anonymized’ data set – for example, the Netflix Prize data – and locating the record for a user about whom you have a limited set of approximate data. If, for example, I know that you’re a fan of The Bourne Ultimatum, Minority Report and Delicatessen but that you absolutely hated Hitch, Music and Lyrics and Along Came Polly (can’t blame you for the last one, by the way), then there’s about an 80% chance I can find your entry in the Netflix Prize dataset (assuming it’s there – it’s only a 10% sample of Netflix’s total ratings data). And I can do this even if I don’t know anything else about you.
The reason this is possible is that the data is so-called ‘sparse’ data – each record (which represents a Netflix user) has many, many fields (each field represents a particular movie), of which only a tiny fraction are non-null (because even the most prolific Netflix user has only rated a tiny fraction of Netflix’s total library). So the chances of two or more users giving the same rating to the same set of movies is actually quite small.
A lot of the detail in the paper relates to the fact that the information you start with doesn’t even have to be 100% accurate – for example, even though I know that you loved Minority Report, I may not know if you gave it 4 or 5 stars on Netflix. The algorithms are surprisingly robust in this environment. If you know just a little bit more (specifically, when the ratings were entered, to within some tolerance of accuracy), it becomes even easier to locate a record based upon some starting data. Especially if the person is interested in less popular movies (the inclusion of Delicatessen in the list above would dramatically increase the chance of a match).
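To make the intuition concrete, here's a toy Python sketch of that style of matching. This is emphatically not the paper's actual algorithm - just the core idea, run over invented data: score each record by how well it agrees with your auxiliary information, tolerate some rating error, and weight rare movies more heavily than popular ones.

```python
import math

records = {  # user_id -> {movie: stars}; a toy stand-in for the Netflix data
    "u1": {"Bourne": 5, "Minority Report": 4, "Delicatessen": 5, "Hitch": 1},
    "u2": {"Bourne": 5, "Hitch": 4, "Along Came Polly": 3},
    "u3": {"Minority Report": 2, "Hitch": 2},
}

def popularity(movie):
    # How many records rated this movie at all
    return sum(1 for r in records.values() if movie in r)

def score(record, aux, tolerance=1):
    s = 0.0
    for movie, stars in aux.items():
        if movie in record and abs(record[movie] - stars) <= tolerance:
            # A match on a rare movie is far more identifying than a match
            # on a blockbuster, so down-weight popular titles
            s += 1.0 / math.log(1 + popularity(movie))
    return s

aux = {"Bourne": 5, "Delicatessen": 4, "Hitch": 1}  # approximate knowledge
best = max(records, key=lambda uid: score(records[uid], aux))
print(best)  # "u1", despite the off-by-one Delicatessen rating
```

Even this crude version shows why Delicatessen matters so much: only one record rated it, so a tolerant match on it dominates the score.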
Why is this interesting? Well, when Netflix released this data they confidently said that it had been shorn of all personally identifiable information – the implication being that you couldn’t link a specific record to an individual. But this paper gives the lie to that – it drives, if not a truck, then certainly a decent-sized minivan through Netflix’s claims.
As the AOL Search data debacle in 2006 showed, simply removing identifiers from this kind of data is not enough to render it properly anonymous. And if you’re thinking that Netflix preference data is hardly sensitive data, then remember that media consumption has a long and inglorious history of being the basis for discrimination and persecution in society – and there are certain idiot politicians who even today still seem to think this kind of stuff is ok.
[Update, 10/3/08: One of the authors of the paper, Arvind Narayanan, has very kindly commented on this post, and points me to a blog that he has started to discuss this topic and its impact, which you can find at http://33bits.org. The blog has already helped me to understand eccentricity better, so go take a look.]
[Update 10/1/08: BT has announced that it will commence a new trial with Phorm to start September 30 in the UK. The trial, in accordance with the conditions below, is opt-in]
Beleaguered behavioral targeting outfit Phorm appears finally to have caught a bit of a lucky break - the UK Government has (belatedly) responded to the EU's queries about Phorm's business practices by saying that Phorm does not break EU data collection/retention laws. But the Department for Business, Enterprise and Regulatory Reform (BERR) - the Government department tasked with assessing Phorm's business and responding to the EU - has placed the following conditions on its approval (from an excerpt of the full letter sent to the EU which is reproduced on The Register - my highlighting added):
The two key bullets here are the last two - Phorm will be required to operate this service as an opt-in service only, with clear language and functionality enabling even opted-in users to opt out at any time. And BERR states that it will be keeping a close eye on Phorm to ensure that it continues to comply with these conditions.
The news may do a little to shore up Phorm's deflating stock price, which has lost about 80% of its value since the heady days of March. But it's hard to imagine Phorm building much of a sustainable business on the back of an opt-in only system - it's going to be an incredibly hard sell for the ISPs that Phorm partners with (BT, TalkTalk and Virgin Media being the only ones mentioned so far). The only model I can think of is that the ISPs offer reduced rates in exchange for opting into the targeting system; but that negates the very purpose of implementing the system in the first place - to shore up sagging ISP revenues in the wake of the last few years' broadband price wars. I fear that Phorm is not out of the woods yet - especially if the recent happenings at its competitor NebuAd are anything to go by.
Yahoo is not letting the grass grow under its feet with its integration of IndexTools. Today IndexTools partners received an e-mail from Yahoo informing them of a change to the terms & conditions of the service, which need to be agreed to by October 15 in order to retain access to IndexTools.
The e-mail calls out a change to the Ts & Cs which require IndexTools partner customers (i.e. the site owners themselves) to place the following (or equivalent) language on their websites (my highlighting):
“Third-Party Web Beacons: We use third-party web beacons from Yahoo! to help analyze where visitors go and what they do while visiting our website. Yahoo! may also use anonymous information about your visits to this and other websites in order to improve its products and services and provide advertisements about goods and services of interest to you. If you would like more information about this practice and to know your choices about not having this information used by Yahoo!, click here.”
Yahoo goes on to say that it will be auditing client sites and will disable accounts where this verbiage has not been included on the site (I wonder how effective this will be in practice - it may just be sabre-rattling). Partners and client sites have until October 15 to comply.
The comment from the IndexTools partner who forwarded on this information was that it would be a challenge for their clients to implement this - from a logistical perspective, if nothing else. But I can understand Yahoo's move here - part of the benefit of a company like Yahoo (or Microsoft, or Google) offering a web analytics service is the secondary use of the resulting data for ad targeting purposes (something that Yahoo is very good at).
For comparison, here is (a shortened version of) the paragraph that Google requests its customers insert onto their sites:
“[...] Google Analytics uses “cookies”, which are text files placed on your computer, to help the website analyze how users use the site. [...] Google will use this information for the purpose of evaluating your use of the website, compiling reports on website activity for website operators and providing other services relating to website activity and Internet usage. Google may also transfer this information to third parties where required to do so by law, or where such third parties process the information on Google's behalf. Google will not associate your IP address with any other data held by Google. [...] By using this website, you consent to the processing of data about you by Google in the manner and for the purposes set out above.”
This wording does not seem to imply that Google will reuse the data for other purposes, including ad targeting (IANAL, however); though Google did introduce some reuse of data (and some options for controlling it) with their data sharing feature that they launched back in March.
The corresponding paragraph from adCenter Analytics is:
Microsoft may retain and use user data subject to the terms of the Microsoft privacy statement and publish in aggregate or average form such information in combination with information collected from others’ use of adCenter Analytics except that Microsoft will not disclose to any third parties any user data collected by adCenter Analytics from your websites in a manner that (i) contains or reveals any personally-identifiable information or (ii) is specifically attributable to you or your websites.
The Microsoft privacy statement does say that we may use the information we collect to deliver services, "including personalized content and advertising".
So Yahoo is not doing anything here that hasn't been done before; and, as I've said several times before, you can't expect a company to provide a free web analytics service of the quality of IndexTools and not attempt to monetize it in some way. What is a little different about Yahoo's approach, though, is that it's taking a sterner line on actual implementation of the data reuse language, and actually threatening to disable accounts where the wording hasn't been added. This implies that Yahoo anticipates that it may need to defend its usage of this data (at least from a PR perspective), and wants to ensure that it can point to this wording on any site that uses IndexTools, so that users can't complain that their behavior data is being reused without their consent.
[Update 9/11/08: Added a reference to Google data sharing]
[Update 9/12/08: Corrected IndexTools' name - duh]
Once upon a time, when I was a young turk, I would assiduously download every last doodad that my employer created as soon as it shipped - or often long before, happily reaching for the pile of floppy disks as I rebuilt my computer for the umpteenth time following the latest toxic combination of untested software.
Age (and a need to still be able to work on my computer) has slowed me down. So I passed over IE8 beta 1, preferring to read about others' experiences of the new "standards mode" that is the default rendering mode for the new browser.
But last week, only hours after its public availability, I downloaded and installed IE8 beta 2. Why? Because it contains a raft of new features for protecting user privacy. I've blogged previously about the eternal tension between user privacy on the web, and the measurement and tracking that is so essential to many websites' business models. Put simply, if users' behavior could not be measured online, a lot of online businesses would go out of business.
So how does IE8 contribute to the debate? Well, there are a number of minor features to protect users, and one major one. The minor ones include a nice feature in the address bar to highlight the actual domain of the site you're looking at:
This makes it much easier to spot phishing attacks, since many phishing sites try to confuse users by including familiar looking domains as subdomains of the real site, e.g.:
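To see why the highlighting helps, here's a naive Python sketch of extracting the registrable domain - the part IE8 renders in black while graying out the rest. Real browsers consult the Public Suffix List for this; my two-label shortcut is an oversimplification that mishandles ccTLDs like .co.uk, and the URLs are invented:

```python
from urllib.parse import urlsplit

def registrable_domain(url):
    # Naive heuristic: take the last two labels of the hostname.
    # Good enough to illustrate the point; not good enough to ship.
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host

# The lookalike URL actually belongs to evil-site.com, not paypal.com:
print(registrable_domain("http://www.paypal.com.evil-site.com/login"))  # evil-site.com
print(registrable_domain("http://www.paypal.com/login"))                # paypal.com
```

The first URL is exactly the sort of thing phishers send: "paypal.com" is right there at the front of the hostname, but the domain that answers the request is the attacker's.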
Another nice feature, related to phishing, is the "SmartScreen Filter". This allows the user to check the current website against a known list of bad sites. It's essentially a UI into the automatic phishing filter that was built into IE7 - but it allows users to report sites as well as check them, adding a Cloudmark-like element of user contribution to the process of spotting evil sites.
The other small enhancement worth noting is that the "browsing history deletion" feature has become smarter - you can elect to delete the cookies etc. for all sites except those in your favorites list. This is a step forward, but it still mystifies me that IE has no easy way for browsing the cookies (and their content) on your computer, and selectively deleting them (as Firefox has had since v2, it pains me to say).
The big new security/privacy feature in IE8 is called InPrivate Browsing (others have dubbed it "porn mode", but I am above such lewdness). InPrivate Browsing allows the user to browse without storing any cookies or browsing history, or locally cached files. It's good for when you're borrowing someone else's computer, or if you share a computer and don't want the other people who use the computer to know what you've been up to (now you are starting to understand where the "porn mode" nickname comes from).
The naming of the InPrivate functionality is somewhat confusing. Once you turn on InPrivate Browsing (either from the Safety menu or using Ctrl+Shift+P), something called InPrivate Blocking is also activated. InPrivate Blocking prevents your browser from sending requests for third-party content that it thinks are principally for the purpose of tracking your behavior. The big difference here is that this isn't just blocking third-party cookies - it's third-party content. That's tracking pixels, third-party JS calls, and yes, ads.
InPrivate Blocking will block third-party requests if one of the following two conditions has been met:
To understand the first condition, take a look at the screenshot below, which is the dialog that comes up if you select InPrivate Blocking from the Safety menu when InPrivate Browsing is active:
You'll notice that there are some third-party request URLs that come up, well, a lot. googlesyndication.com is the domain that Google AdSense ads are served from; and you will doubtless know what comes from google-analytics.com. In the dialog above, the four URLs across these two sites have each been requested at least 20 times in a third-party context, and I've only been using IE8 for a few days. With the default settings ("Automatically block"), these URLs are blocked when I am in InPrivate mode.
The other way of adding a URL to the blocked list is to subscribe to an InPrivate Blocking list. This is an RSS or Atom feed of URLs that IE8 should block in InPrivate mode. I have created a subscription list which blocks third-party requests to analytics.live.com - the domain for adCenter Analytics's tracking JS and pixel. You can try it out by clicking here.
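I haven't seen the subscription schema documented anywhere, but assuming it really is a plain RSS feed whose items name the things to block, consuming one is trivial. The feed below is invented for illustration (only analytics.live.com comes from my own list):

```python
import xml.etree.ElementTree as ET

# A guess at the shape of an InPrivate Blocking subscription: an ordinary
# RSS 2.0 feed where each <item>'s <title> names a domain to block.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blocking list</title>
    <item><title>analytics.live.com</title></item>
    <item><title>google-analytics.com</title></item>
  </channel>
</rss>"""

def blocked_domains(feed_xml):
    root = ET.fromstring(feed_xml)
    return [item.findtext("title") for item in root.iter("item")]

print(blocked_domains(FEED))  # ['analytics.live.com', 'google-analytics.com']
```

The attraction of reusing RSS here is obvious: the subscription machinery (polling, updating) already exists in the browser.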
The power of the feed-based approach to InPrivate Blocking is that privacy advocacy sites can post a single link to a feed XML file which users subscribe to; if that file changes, the users' blocking lists change. So you can expect to find "click here to block ALL tracking pixels and ads" links on such sites in the not-too-distant future. You can take a look at your InPrivate Subscriptions through the Manage Add-ons option in the Tools menu:
Whether news of this functionality sends a shiver down your spine or warms the cockles of your heart depends on whether your business depends on online advertising or web analytics. Popular third-party analytics systems like Google Analytics, or third-party ad servers like Atlas Enterprise, will lose data on users who enable InPrivate Browsing; and even a less popular service that might not normally be blocked automatically could end up on common "Opt-out" feeds and have its tracking blocked, especially if it had a poor reputation for privacy.
I must admit that when I first read of this functionality, I was - ahem - a little apprehensive, for the reasons above. And in truth, only time will tell what proportion of users are engaging InPrivate browsing (although, given the nature of the functionality, we'll not be gathering this data). But my gut feel is that, whilst this capability is a welcome addition to the privacy and security arsenal of Internet Explorer, actual take-up of the feature will be low. It needs to be invoked explicitly, of course, and the blocking of persistent cookies means that some desirable features of websites (such as being able to remember you from visit to visit) will be disabled. So I imagine it will be used sparingly by the vast majority of users.
Even so, this feature could easily add another 1 - 2% to the existing disparity between different measurement systems (such as an in-house web analytics system and a third-party ad server). Though there are techniques that vendors could use to work around the automatic blocking - the best example being the use of CNAME DNS entries to make the third-party tracking URLs look like first-party URLs - these techniques will add complexity to the implementation of such systems; so it might be easier for us all to live with a little less certainty.
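For the curious, the CNAME trick looks something like this in a DNS zone file (all the hostnames here are invented for illustration): the site owner creates a subdomain of their own domain and aliases it to the vendor's collection servers, so the browser sees a first-party request.

```
; Hypothetical zone fragment for example.com.
; Requests to metrics.example.com look first-party to the browser,
; but actually resolve to the analytics vendor's infrastructure.
metrics.example.com.   IN  CNAME  collector.analytics-vendor.example.
```

The cost of this approach is exactly the complexity I mention above: the site owner now has DNS records to maintain, and cookie and SSL handling get fiddlier for the vendor.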
If you'd like to read more about the new features in IE8, there's a ton of stuff over at the IE blog. And, with my Microsoft hat firmly on my head, I should say that the IE team has done an outstanding job with this beta, which is performing really well for me, and rendering most sites flawlessly, with just a few slight layout differences cropping up here and there. Well done, guys.
There's been a lot of chatter recently about the "dark side" of online advertising, in particular, the activities of companies like NebuAd and Phorm using somewhat shady techniques to gather behavioral data about users and using this data to target ads. I've even blogged about it myself. And click fraud remains a significant challenge to confidence in online advertising.
But whilst the term "click fraud" generates about 25 million results on the world's best search engine, the term "malvertising" generates only 2,170. Since you may not be familiar with the term, I'll offer you the definition I found on urbandictionary.com (sadly, there's no Wikipedia entry for Malvertising):
An Internet-based criminal method for the installation of unwanted or malicious software through the use of Internet advertising media networks and exchanges.
So Malvertising = malware + advertising. See? Clever (if ugly). But despite its goofy name and low profile, malvertising arguably represents a greater threat to the online advertising industry than either unscrupulous behavioral targeting or click fraud.
Malvertising can take a number of forms, typically along the following lines:
The enormous reach of modern ad networks, plus the ability to place malicious code on thousands of otherwise innocent sites, makes distributing malware via advertising networks a very attractive proposition.
The malware itself is usually focused on stealing users' personal data (e.g. login details for broker accounts), taking control of the user's machine for distributed denial-of-service attacks (turning it into a zombie), or convincing the user to spend their own money buying malware "removal" software after they have been "infected".
But it's not just the end user that suffers. The publisher who has unwittingly hosted the malvertising can find themselves besieged by angry users demanding to know why they've been served malware from their site. If the ad was served via an ad network, the publisher will possibly cancel their contract, depriving the ad network of their business (ESPN has already ditched ad networks altogether, although not ostensibly for this reason). And advertisers who want to use increasingly sophisticated ads with high levels of interaction may find that they are unable to because these ads are some of the ones most likely to contain malware, and so are blocked by the ad networks and publishers the advertiser wants to deal with.
Furthermore, if end users lose confidence in the ads they're being shown, either in terms of where a click will lead, or whether the ad itself is malicious, this will drive down ad clicks and drive up the installation of ad blocking software - both of which will have a disastrous effect on the industry.
The malvertising problem is not insoluble, but it will demand a concerted effort from all industry participants to fix (or, at least, contain) it. I'll blog about these topics again in more detail, but the main areas of attention will need to be:
Creative/URL scanning: Ad networks and third-party ad servers will need to start scanning creatives and destination URLs as a matter of course. The technical challenge of scanning Flash or Silverlight-based creatives is considerable, since malicious ads will take steps to cover their tracks, such as obfuscating code, and behaving normally if they detect they're being scanned. Ultimately, the co-operation of Adobe and Microsoft may be required to put in place more robust systems for determining an ad's provenance.
URL scanning is a more manageable problem - all ad networks should ensure that ad click destinations do not lead to sites which are known to host malware.
Creative template quality: Malware has been known to sneak into ads through sloppy management of creative templates - if an agency uses an infected template, then of course all ads created using that template will be infected. This problem will grow as larger numbers of smaller advertisers start to use online services which provide Flash templates that are customized to order - the advertisers will not have the technical sophistication to determine whether the resulting ads are safe or not. Some kind of 'quality seal' may be required for these services, though that will not stop bogus ones springing up.
Outlawing redirect-based tracking: At the moment, many ad networks use redirects to track ad clicks, meaning that a single ad click can be passed around many ad networks before the user is finally deposited at the advertiser site. This system is open to abuse via "click hijacking", where a bogus network sends some clicks for legitimate ads to malware sites. Publishers should inform ad networks that redirects for tracking are unacceptable, which will mitigate this problem.
Ad isolation: At the moment, an ad which is served with a page (rather than via an iframe) has access to that page's DOM, which means that if the ad is malicious, it can crawl the DOM, looking for user PII (such as usernames and passwords for the site the ad is on, or credit card details). Microsoft is working on some technology to isolate ads that are served on its network, so that even if they're served in a first-party context (i.e. not via an iframe or redirect), they are unable to access the page DOM. Other publishers & networks should consider doing the same.
Industry co-operation: Currently, very little specific information about malware is shared within the industry, partly for noble reasons (it can be difficult to be specific about a malware instance without revealing user PII) but mostly for ignoble ones (no ad network wants to advertise the fact that they've been subject to a malware attack). This must change - the industry needs to find a way to share this kind of data without an individual network or publisher having to step into the firing line.
As I said, I'll return to this subject with some more thoughts on some of the above issues. In the meantime, a great resource for information on malvertising is Spyware Sucks, a blog run by Microsoft MVP Sandi Hardmeier, who tirelessly chronicles various malvertising outbreaks. It makes for sobering reading.
There's been plenty of buzz (more of the angry hornet variety rather than the just-inhaled-a-lungful-of-dope variety) about Phorm of late, precipitated by a press release that the company put out on Feb 14 in the UK, announcing partnerships with three major UK ISPs to provide a system "...which ensures fewer irrelevant adverts and additional protection against malicious websites". Critics of the system (led by noted UK cage-rattler, The Register) claim that the technology is little more than spyware by another name. The negative press around Phorm's announcement has caused at least one of their ISP partners to back away from the deal, and caused their stock to plummet by more than 30%. It looks like this could be the latest in an increasingly long line of bungled targeting announcements from the industry (Beacon, anyone?). But what went wrong?
Phorm as a company is the new name for 121Media, a UK AIM-listed company which started out producing a browser toolbar that tracked your page usage to provide a social media environment, connecting you with other people who were looking at the same page. Ad-funded, the toolbar quickly picked up a reputation for being spyware (even though I agree with Phorm's protestations that it was really adware, which is better, but still tarred with the same brush), so it was dropped and the company renamed itself Phorm.
The new service Phorm has launched is called Webwise (not to be confused with the BBC site of the same name). Essentially it is technology, installed at the ISP's data center, that analyzes the URL and textual content of the web pages being served and uses that information to place users into interest categories so that they can be served behaviorally-targeted ads. It does this by intercepting each page request and sending a copy to a "Profiling" server, which extracts keywords and uses them to assign the user to interest groups:
The same technology has a function to alert the user to phishing websites; since the URL and content are being examined, phishing sites can be spotted and blocked. This functionality forms a core part of Webwise's value proposition to users.
The other part of the alleged value to users is that the profiling process does not permit the ISP to associate a user's profile with their IP address; that means that the ISP (and any government agency that subpoenaed the ISP's records) could not re-associate the Phorm data with a customer record (ISPs can tell which IP address was assigned to which customer at a particular time). The Phorm system also does not store any of the page information or extracted keywords; once the interest "channel" has been arrived at, all the rest of the data is deleted.
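To make the flow described above concrete, here's a minimal sketch in Python of a Webwise-style profiler, based purely on Phorm's public description. The channel names, keyword lists, and function names here are my own invention for illustration, not Phorm's actual taxonomy or implementation:

```python
import random
import re

# Hypothetical interest channels and their keyword vocabularies
CHANNELS = {
    "travel": {"flight", "hotel", "holiday", "airline"},
    "finance": {"mortgage", "loan", "savings", "credit"},
    "motoring": {"car", "dealership", "mpg", "hatchback"},
}

def extract_keywords(page_text):
    """Pull lowercase word tokens out of the page body."""
    return set(re.findall(r"[a-z]+", page_text.lower()))

def assign_channels(page_text):
    """Match extracted keywords against each channel's vocabulary."""
    words = extract_keywords(page_text)
    return {name for name, vocab in CHANNELS.items() if words & vocab}

class Profile:
    """A user profile keyed by a random ID - never by IP address."""
    def __init__(self):
        # Random identifier with no link back to the ISP's customer records
        self.uid = "%016x" % random.getrandbits(64)
        self.channels = set()

    def observe(self, url, page_text):
        # Only the derived interest channels are retained; the URL,
        # page text, and extracted keywords are discarded immediately.
        self.channels |= assign_channels(page_text)

profile = Profile()
profile.observe("http://example.com/deals", "Cheap flight and hotel holiday deals")
print(profile.channels)  # prints {'travel'}
```

The key design point, as Phorm describes it, is in `observe`: nothing about the page itself survives the profiling step, only the coarse channel membership.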
So Phorm claims that its system is a real step forward for user privacy on the Internet, whilst at the same time enabling advertisers to reach their audience more effectively. But the industry (and the public) haven't really seen it like this.
Phorm's announcement was always bound to generate a certain amount of controversy, because it's in the sensitive area of behavioral profiling & targeting. But there has been a particularly virulent reaction in the UK, which, whilst started by sites like the Register, has now spread to the "mainstream" media.
Some of the reasons for the fuss are (comparatively) silly things - for example, the renaming of the company from 121Media, which has simply made people nervous given the previous company's adware history, or the fact that the company operates out of serviced offices in the UK and doesn't really have a physical address in the US.
A more serious blunder on Phorm's part is their failure to anticipate the scrutiny that this kind of system would be placed under. In this kind of environment, given the firm's history, absolute transparency is essential, and Phorm hasn't provided this. There are still unanswered technical questions about Phorm's system, such as how it manages the opt-out (does data still get collected, or not?), and there have been inconsistencies in the claims that Phorm has made about third-party privacy audits of their software.
Phorm has also made the mistake of launching prematurely, with many of its partnerships still only half-baked. At the moment no benefit is actually being delivered to users, because none of the systems that Phorm has announced are live within ISPs, so all the focus is on the downside. Phorm would have done much better to wait until the service was fully baked with at least one of its partners, with some real users on board who could testify to the increased relevance of the ads and to how comfortable they were with their privacy, before making a big splash. The press release looks like the product of an over-zealous PR agency looking to ensure that its monthly coverage targets were hit. Well, they've certainly done that.
The main problem here is a poorly thought-out balance of benefits against 'costs' in this offer. Phorm has claimed that the system protects user privacy, but it doesn't really; it's just an ad targeting system with a better-than-average approach to protecting privacy. Users who are opted into Phorm will still receive cookies and targeted ads from other ad networks, and their behavior will still be tracked by those networks.
Apart from the phishing protection (which is already baked into IE7 and Firefox anyway, and turned on by default), there's nothing in the Phorm system which provides users with protection of their personal data across the Internet. The only way that Phorm's entry into this market can elevate user privacy overall is if other providers of targeted ads who are storing more data decide to pack up and go home - which I doubt will happen.
The furore also highlights the challenges of partnering with ISPs for this kind of service. Because ISPs are the gatekeepers of the Internet (and because, for many people, switching ISPs is a pain in the a**), users are very sensitive to any perceived exploitation of this relationship by the ISPs. In the UK, ISPs are some of the best-known Internet brands, but also some of the least liked. Ironically, the cause of this dislike (poor customer service) is a direct result of the price war that has also precipitated ISPs' interest in this kind of service - they receive a cut of the ad revenues, of course.
Ultimately the tale makes clear how careful any company has to be in launching a service like this - the balance of benefits has to be clearly stacked in favor of the user. As Chris Williams of The Register put it during an interview with Phorm's CEO, Kent Ertegrul:
"a big difference I see between what you're doing and what Google does is that people feel that they're getting a service from Google. I don't think people feel they'll be getting a service from you"
It will be interesting to see how the Phorm saga plays out. Perhaps one day it'll find its way onto an online marketing MBA module syllabus.
Sigh. Blog post topics seem to be like buses - you wait ages for one to come along, and then three come along all at once. Actually, I've got four things to post about, but I'm going to leave two until after the weekend. Here are the other two. Funnily enough, they're related - both are about benchmark data.
Online traffic benchmarking service Compete.com has been bought by UK-based market research firm TNS (Taylor Nelson Sofres). This is a good result for the folks at Compete, who have been waging a four-way battle with Quantcast, Alexa, and Comscore. Despite the significant attention that Compete (and benchmarking services in general) has been getting recently, the deal isn't stellar - it's only a guaranteed $75m, with another $75m payable on achievement of revenue targets. Compete Inc has taken about $43m in investment since it started in 2000, so I guess the investors are pleased but not delighted.
The rest of TNS's business is pretty traditional market research stuff, so it'll be interesting to see how they integrate and exploit Compete's capabilities. Moving the footprint outside the US seems like one obvious goal they may look to achieve in the not-too-distant future.
Logging onto Google Analytics this week, I was interested to see the new data sharing options that the product is making available:
So the key option in the above list is #2 - allowing GA to share your data with its "benchmarking service", where data from sites in a similar industry will be aggregated together for benchmark reports, like the sample below:
This is a smart thing for Google to do, as it provides an incentive for GA users to share their data by providing them with a solid benefit in return. It will be interesting to see how GA determines which industry a site is in; I guess they will mine the search index for those sites and use some behavioral targeting-type techniques to drop a site into a category based upon the words that appear on the site's pages. I have no idea how they'll categorize my site - they'll probably drop it into a "blogs" industry segment, since Google already knows that my site is a blog.
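If Google does take the keyword-mining route I'm guessing at above, the categorization step might look something like this sketch. To be clear, the segment names and vocabularies here are entirely hypothetical - Google hasn't published how the benchmarking service actually classifies sites:

```python
import re
from collections import Counter

# Hypothetical industry segments and telltale vocabulary for each
SEGMENTS = {
    "blogs": ["post", "comment", "blog", "archive"],
    "retail": ["cart", "checkout", "shipping", "price"],
    "news": ["breaking", "editor", "headline", "reporter"],
}

def classify(page_texts):
    """Score each segment by keyword frequency across a crawl of the site,
    and return the best-matching segment."""
    words = Counter()
    for text in page_texts:
        words.update(re.findall(r"[a-z]+", text.lower()))
    scores = {seg: sum(words[w] for w in vocab) for seg, vocab in SEGMENTS.items()}
    return max(scores, key=scores.get)

print(classify(["New blog post", "Leave a comment on this post"]))  # prints 'blogs'
```

A real system would presumably use far richer signals (link structure, query data, existing directory classifications), but even this crude frequency scoring would confidently drop a site like mine into the "blogs" bucket.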