October 13, 2008

Clouds, Impressions and Pork Bellies

With Microsoft’s (sort of) biennial PDC on the horizon, my mind (and the minds of many of my colleagues) turns to our cloud computing efforts, which will have their coming-out party in Los Angeles at the end of the month. Like anybody else here at Microsoft, there’s little specific that I can say about these efforts before the conference; all I can say is that we’re working on ways of making it much, much easier to develop, deploy and pay for apps in the cloud.

One of the things I’ve been thinking about, however, in the context of cloud computing, is how it may or will change the way that IT infrastructure (by which I mean, processing power, storage and bandwidth) is bought and paid for. Most cloud or utility-computing offerings in the marketplace today are priced on a consumption basis (that is, you pay for what you use, and no more) – indeed, some people hold that you’re not really doing cloud computing if you’re not charging for it on this basis.

This model for charging represents a significant transfer of risk from the customer to the vendor: whereas an enterprise might today purchase so many servers, and so many OS, database and other software licenses to support a particular service, knowing that some will not be used, now it is up to the cloud vendor to predict demand for their services and purchase the appropriate hardware and software.

But I think that cloud computing may yet enable its customers to further reduce the risk they face, by enabling the trading of futures positions in compute power and storage. And in this respect, cloud computing shares some very interesting characteristics with the online advertising business (Ah, you say, now I understand where he’s going with this). Please note,  by the way, that nothing that follows is intended to indicate any specific Microsoft plan in this area. This is just me riffing.

Clouds as commodities

So, imagine you’re running a news and current affairs website. And further imagine that, oh, there’s an election coming up later in the year which you’re confident will generate a big spike in traffic. If you’re running your site on an on-demand cloud infrastructure, then you’ll be confident that your site will scale elegantly if you get traffic spikes – but at what cost? You may be able to buy compute capacity at (say) $0.10 per processor-hour (or whatever measure of compute capacity emerges) on a spot basis; but if you were to reserve this capacity on a forward basis (i.e. a few months in advance), you could pay only $0.05 per processor-hour.

But what if the spike never materializes? You could just release that capacity back to the cloud vendor and get some number of cents on the dollar for it. But an alternative is that you could in theory sell that pre-reserved capacity to someone whose need is greater than yours, potentially at some profit.
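To make the trade-off concrete, here is a minimal sketch of the arithmetic, using the illustrative rates from the example above ($0.10 spot, $0.05 forward). The resale rate is purely an assumption on my part, as are the function names; no vendor prices capacity exactly this way.

```python
# All amounts are in cents, so the arithmetic stays exact.
SPOT_RATE = 10     # cents per processor-hour (illustrative figure from the post)
FORWARD_RATE = 5   # cents per processor-hour (illustrative figure from the post)


def spot_cost(hours_used):
    """Cost of buying every processor-hour on the spot market."""
    return hours_used * SPOT_RATE


def forward_cost(hours_reserved, hours_used, resale_rate):
    """Net cost of reserving capacity in advance.

    Any shortfall is topped up at the spot rate; any unused reservation
    is sold on at `resale_rate` (a hypothetical figure, in cents).
    """
    reserved = hours_reserved * FORWARD_RATE
    if hours_used >= hours_reserved:
        return reserved + (hours_used - hours_reserved) * SPOT_RATE
    return reserved - (hours_reserved - hours_used) * resale_rate


if __name__ == "__main__":
    # Reserve 10,000 hours, then see what happens under three traffic outcomes.
    for used in (4_000, 10_000, 12_000):
        hedged = forward_cost(10_000, used, resale_rate=3)
        print(f"{used} hours used: spot {spot_cost(used)}c, forward {hedged}c")
```

With a resale rate above the forward rate, the unused reservation turns a profit, which is the "potentially at some profit" case.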

Now consider the same business from an advertising perspective. Anticipating the spike in traffic, you want to sell your anticipated inventory for the best price – which means striking a number of ‘guaranteed’ deals, where you commit to delivering the impressions during the time period (and, given the nature of advertising during an election campaign, you really don’t want to be delivering make-good ads after November 4). So to hedge the risk of not meeting your projected impression goals, you buy a block of inventory that you can use, if necessary, to fulfill your obligations.

As the election looms, however, you discover that your traffic is exceeding your expectations – so you don’t need the inventory hedge. You could choose to take a little revenue from this inventory by serving discretionary ads into it, or you could sell it on to someone else whose inventory prediction was not so on-the-money, and needs inventory to fulfill a guaranteed deal. You could potentially get a better rate doing this than by serving remnant ads into the inventory yourself.

What these two examples have in common is that the publisher is taking a forward position on a commodity in order to mitigate risk on the supply or demand side of their business. Of course, this kind of hedging is nothing new – the Chicago Mercantile Exchange has been enabling it for years, for commodities as diverse as pork bellies, oil and coffee. And since energy futures are such an important part of that market, it shouldn’t be a far-fetched idea that compute power (which many have described as moving to be a utility, like electricity) could move to being traded in the same way.

More options

The model even lends itself to the idea of options trading – in both the above examples, the publisher could pay for the option to purchase compute capacity or advertising inventory at a particular price, rather than reserving the capacity or inventory itself; and those options could then be sold on later (or exercised, or left to expire, of course).

The next logical step from there is that folks who have nothing to do with online advertising or cloud computing could start buying and selling these commodities and securities with a view to making a profit on price changes. The economy’s current woes notwithstanding, I can see this happening in the next 5 – 10 years.

To make either scenario a reality, however, there need to be functioning exchanges for the buying and selling of the commodities. This is close to becoming a reality for online ad inventory – the likes of Right Media Exchange, DoubleClick Exchange and our own AdECN are close to providing open trading platforms for advertisers, publishers and networks to buy and sell inventory. Though there is no talk of futures trading in these environments right now.

It’s significantly further off for cloud computing capacity. For a start, the industry lacks standards for measurement and billing – will it be the processor-hour, or the Gbyte-day, or the Gbit-month, or some combination of the above? Secondly, unlike the online ad market, where a given ad will run on most publisher sites (with the exception of rich media ads), there is illiquidity between different technology platforms in cloud computing – so an app written for Amazon Web Services will not run unmodified on Salesforce.com’s cloud platform, or Google’s. This may never change, in which case any kind of market or exchange for compute capacity will be limited to a single vendor’s system, greatly limiting the effectiveness of such an approach. But interesting to think about, nonetheless.

March 22, 2007

How much are you worth?

I came across a very interesting article in last week's Economist Technology Quarterly the other day (which I was reading a week late, thanks to the efficiencies of the US Postal Service). The article mentioned a couple of sites such as AttentionTrust and Agloco which have sprung up to help users take ownership of their own online behavior data and sell this data to advertisers who want to target them with ads.

Both sites use a browser plug-in which captures browsing behavior and stores it online where it can be aggregated and sold on to advertisers. It's an interesting idea; since the user generates the valuable data about their own preferences, it seems fair that they should get a cut of the advertising revenues generated by this information (according to Agloco, up to 90%).

The only problem is that these users are already getting something for nothing - content. In the current model of ad-supported websites, publishers take money from advertisers who want to reach their readers, and use this money to pay for web hosting, design, maintenance,  content authoring, editing and all the other myriad expenses associated with publishing on the web. As a "thank you" to their users, they offer their content for free (ironically, even the Economist is doing this now).

But if users start taking a big piece of the revenue pie just for the privilege of making their eyes available to be presented with ads, ad-supported publisher business models could collapse. The only way out of this bind is if these "attention" networks can take so much of the weight and expense of managing user profiles off the publishers that they (the publishers) can afford to give away such a big chunk of the ad revenues to the users themselves. And it will be a long time before a sufficiently large number of users are in such networks to make it worthwhile for publishers to abandon their own behavioral targeting efforts. And as a publisher I'm not sure I would want to have to deal with multiple attention networks - so consolidation around a single (ideally not-for-profit) network seems like another prerequisite.

But the development is interesting, nevertheless. At the very least, Agloco's claimed 10 million users is testament to the fact that users are becoming much more savvy about their personal information and even their browsing behavior, and are looking to monetize themselves (what a great phrase that is: "Honey, I'm off to monetize myself for the day. I'll be back around 6.30"). How much are you worth?

December 11, 2006

Swivel - the YouTube of data?

Should have blogged about this last week, but other demands on my time prevailed.

There's an article on TechCrunch (brought to my attention by my colleague Justin) about the launch of Swivel, whose founders Dmitry Dimov and Brian Mulloy describe as the "YouTube of data". What they mean by this is that they've created a place where users can upload interesting data sets and then plot them against other data sets from other users to look for correlations, such as the interesting one below:

[Swivel chart, including a rising plot of oil prices]

Unfortunately I don't have much particularly interesting data to upload (and the data that I do have that is interesting is confidential), so I wasn't able to try this with some of my own data. Apparently when the site launches, you will be able to upload data and keep it private - though I don't know how many people will be happy to trust their precious data to a relatively unknown third party (not to mention the legal aspects).

If Swivel can overcome this obstacle, however (and they need to - charging for private data is their main revenue source, apparently), then they could be onto something. They're building out significant data center capability to perform correlations behind the scenes and suggest data sets that you might want to compare. But it will be interesting to see whether the correlations they come up with are anything more than just of the 'happy coincidence' variety (for example, the rising plot of oil prices in the chart above could appear to correlate nicely with the usage of World of Warcraft, if you're careful to pick the right range, etc). So perhaps Swivel should have a little tutorial on how correlation does not imply causation on their home page.
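As a quick illustration of how easily a 'happy coincidence' arises, here is a small sketch that computes a Pearson correlation for two series that merely both rise over time. The numbers are made up for illustration and are not real oil-price or game-usage data; the function name is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Two invented series that simply trend upwards over six periods.
series_a = [30, 35, 42, 50, 61, 70]
series_b = [1.0, 2.0, 3.5, 5.0, 6.5, 7.5]

if __name__ == "__main__":
    print(f"r = {pearson(series_a, series_b):.3f}")
```

Any two upward-trending series will score close to 1 here, whether or not they have anything to do with each other.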

The site's other challenge is the cleanliness of the data - even when trying to compare data that was date-based, the site choked several times (doubtless these are problems that the team is working out), but there is a larger issue of 'standardization' of axes or segments. Date is (relatively) easy - you can make some assumptions about the date range that a particular data point relates to - but other ranges/segments are harder, such as:

  • Country (problems with old vs new names, regions, etc)
  • Age (lots of data is grouped into age ranges, e.g. 16-24, 25-34, but these are not consistent)
  • Income (same problem as above, plus currency fluctuations thrown into the mix)

And that's just the axes/segments for humans - other entities like companies have their own characteristics which are not measured in a standard way, especially not internationally.

It'll be interesting to come back to Swivel in a few months when there's some more data in there (and when they have their private data service up and running). I wish them well.

December 04, 2006

Whither the page view?

There's an amusing graphic on Steve Rubel's Micro Persuasion blog lamenting the demise of the page view (1994 - 2010, says Steve). Steve highlights the fact that, as web applications move beyond the traditional 'page by page' model, utilizing technology like Ajax and Flash, the page view - which, together with the click, is the cornerstone of how all online media is sold - is on its last legs.

The reason for this, if you haven't already read this on a hundred other blogs, is that these new (actually, not new at all, but let's gloss over that) technologies allow content to be refreshed in one part of the screen without the whole page (and its attendant ads) being refreshed. Steve describes this problem as online advertising's "dirty little secret". I don't agree with that - sure, there are a number of competing initiatives to solve the problem, such as the Web Analytics Association's standards committee, and the IAB's version of the same thing - but plenty of noise is being made about the issue.

Steve rightly identifies that the issue is more a business issue than a technology one, though there is a bit of the "we're all doomed!" in his tone. In my opinion, this is a market efficiency issue, and those tend to sort themselves out (with the odd casualty here and there) by themselves without too much bother. But the post got me thinking: what alternatives to the page view exist? Here's a not-at-all-definitive list from the recesses of my own brain:

  1. Click events. Unless you're into radically rethinking application design, user interaction will still be facilitated by clicking the mouse. Not all mouse clicks are created equal, of course; so a site couldn't just publish its click numbers - that would be a bit like the old hits metric, since it would be so easy to game the system by creating a click-heavy site (lots of drop-down boxes, radio buttons etc). So you'd need some way of identifying which clicks returned content, and which didn't, which would lead you to a...
  2. Content events. Although page views are being broken down into smaller pieces, there is still some intuitive mileage in the idea of the 'content event'; a package of activity where the user requests some new content, and the content is displayed to them. This is a bit like a mini page-view. Content retrieval in Ajax apps is usually done by asking for a piece of XML from the server, and then rendering it (using JavaScript) in the browser. However, the thing that makes Ajax interesting is the fact that content can be retrieved from the server in the background (don't forget, the "A" is for "Asynchronous"), whilst the user's doing something else - for example, the next e-mail in the list is retrieved whilst you're reading the current one. So you couldn't just count the number of XML 'pages' retrieved from the server, because any app could game this by pre-fetching.
  3. Time on site. For relatively static parts of an Ajax app interface, the amount of time the user spends interacting with the site can be relevant, because ads could be coded to auto-rotate every 30 seconds or so. There are several challenges with this, however: firstly, it's actually quite hard to determine how long a user spends on a site because when they leave, they just disappear - their final trackable action is the last thing they did before they left, which could be many minutes before they actually left the site. Paradoxically, the serving of timed auto-rotate ads could help here, because if you know that you served 10 ads on a 30-second rotation to a user, they were at the site for at least 4 minutes 30 seconds, and for no longer than 5 minutes. The second challenge is that ads in static parts of a site design tend to be ghettoized by users - that is, they quickly learn where the ads are and ignore them. This is not a new problem - it's why the banner has suffered as an online ad format. Finally, auto-rotating ads are much less well-suited to contextual advertising, since the ad rotation can't take the currently displayed content into account.
  4. 'Ad refresh events'. This is a sort of combination of the above three measures - a media owner designs their app and builds in some technology for automatically inserting and rotating ads on some basis, linked to clicks/content events. So to take the example of a mail or feedreader app, display ads might refresh once every three content events, whilst a smaller contextual placement might refresh every time the 'content pane' (however defined) updates. It doesn't actually matter how the media owner does the refreshing, or on what basis - merely that they stick to what they've said they will do, and that this can be measured. Those two last things are, of course, the sticking point - how to make sure the owner of an app behaves honestly? Plus, the media owner will have to be able to publish information about refresh mechanisms for the various types of placement on offer; I guess you'd see something like:

  Placement                      Ad display events
  Home page top banner           1,534,346
  Home page contextual           6,324,235
  Login page display             3,232,453
  App interface banner           768,255
  App content page contextual    2,850,235

Now that I've written those out, it's clearer to me what a can of worms this is. It's an irony that, having been the most accurately measurable marketing medium of all time, advances in technology mean that online is likely to get less, not more, measurable in the future.
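Two of the ideas in the list are easy to sketch in code: the time-on-site bounds you get from timed ad rotation, and the 'ad refresh event' rule where each placement refreshes after a set number of content events. This is only a rough sketch; the function names, placement names and refresh intervals are my own assumptions, not any ad server's actual behaviour.

```python
def time_on_site_bounds(ads_served, rotation_seconds=30):
    """Return (minimum, maximum) seconds on site implied by rotating ads.

    The first ad is served on arrival and each later ad one rotation
    after the previous one, so the user stayed long enough to see the
    last ad start, but left before the next one was served.
    """
    if ads_served < 1:
        raise ValueError("at least one ad must have been served")
    lower = (ads_served - 1) * rotation_seconds
    upper = ads_served * rotation_seconds
    return lower, upper


def ad_display_events(content_events, refresh_every):
    """Count ad refreshes per placement for a number of content events.

    `refresh_every` maps a placement name to the number of content
    events between refreshes (hypothetical rules set by the media owner).
    """
    return {
        placement: content_events // interval
        for placement, interval in refresh_every.items()
    }


if __name__ == "__main__":
    print(time_on_site_bounds(3))
    rules = {"display banner": 3, "contextual": 1}
    print(ad_display_events(9, rules))
```

For instance, 3 ads on a 30-second rotation bound the visit between 60 and 90 seconds, and 9 content events produce 3 display refreshes and 9 contextual ones under the rules shown.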

Have I missed any options? Let me know in the comments box if I have.

October 13, 2006

What is Web 2.0? (Part III)

You remember I mentioned in my first post about Web 2.0 that I was preparing to deliver a presentation on the topic to some folks at McCann Erickson in London? No? Ah. Well, I was. And I promised to post the presentation on this blog. So here it is, courtesy of slideshow-hosting site Slideshare, which itself is a good example (they hope) of a Web 2.0 site - kind of a YouTube for PowerPoint slides. Monetization strategy seems to be Google AdSense revenues, as you'd expect.

October 12, 2006

What is Web 2.0? (Part II)

Following on from my lengthy post on the topic of what the heck Web 2.0 actually is, an excellent paper on Web 2.0 [PDF format] has appeared on Pew Internet's website. It draws on Hitwise data about the growth of usage of certain "Web 2.0" sites (such as Wikipedia) compared to the stagnation of "Web 1.0" sites (such as, ahem, Encarta):

They have a number of other comparisons they make which quite nicely characterize Web 2.0 vs Web 1.0:

They also provide some data that indicates that Web 2.0 is being driven by the younger generation - though a post on the Hitwise blog (via which I found this report) has some interesting data which indicates that YouTube's audience has been getting older during 2006:

I imagine this data makes Google feel a bit better about shelling out $1.65bn for YouTube - 35-44 year-olds usually have more spending power than the 18-24's.

September 26, 2006

What is Web 2.0, anyway?

One of the questions I'm asked pretty frequently is, "What exactly is Web 2.0, anyway?" I'm going to have to answer this question in some detail for an audience at McCann Erickson  next month (I'm hoping to post my material here after the fact), but I thought I'd blog something brief here, which - who knows - may help to answer the question for some people.

The main thing you need to know is that Web 2.0 is not one single thing. In fact, the 2.0 moniker is a bit counter-productive: it implies a big, single product 'release', conjuring up images of lock-step development and the software release cycles of old. In fact, Web 2.0 refers to a collection of changes to software, the web, and related business models, that are ushering in a 'new era' of business opportunity. Vague enough for you?

At the heart of Web 2.0 is a move from a centralized, 'old-media' view (large publishers & retailers telling people what they want) to a new empowered/participative model where users are in charge: defining what they want, reading and publishing, buying and selling online. Where Web 1.0 was about hubs and spokes, Web 2.0 is about networks (see this post).

The best way to explain Web 2.0 is simply to enumerate its major elements. I've listed them here in their order of importance; at least as far as I'm concerned. Feel free to disagree.

1.  Participation & the network effect
In the Web 1.0 days (that is, the bad old days), content was created for you by large organizations like the BBC, NBC, AOL etc and you 'consumed' (read/watched/listened to) it - a one-way relationship. Web 1.0 content business models were all about aggregating large numbers of users together, like a newspaper or TV station does, and selling them to advertisers. In Web 2.0, users create their own content, as well as consuming content created by other users.

User-created content and participation create a virtuous network effect - the more users who are participating, the better the system. Systems like BitTorrent take this to another level to co-opt users' computers to help with media distribution itself; the more computers that are joined into the network, the more nodes there are to distribute the content, so performance scales smoothly.

Who's leading the way? Blog systems like TypePad, Blogger and Windows Live Spaces; video publishing sites like YouTube and Google Video; photo-sharing site Flickr; self-publishing sites like Lulu.com; and music publishing sites Msoundz and Burnlounge, as well as MySpace; Wikis like Wikipedia and Wikihow.

What's the impact? Traditional revenue models for large content aggregators, based around cutting large-scale advertising deals with a few large advertisers, are being undermined. Increasingly, the best way for an advertiser to reach their desired audience is to advertise on the hundreds or thousands of small sites which serve that audience's needs. But finding those sites is pretty challenging.

2. Personalization & collaboration
Hand in hand with no. 1 above is personalization. Whilst not a new concept, as the number and range of sources of information has expanded, users have had to become cleverer at selecting the information they want, and tools have grown up to help them do this.

The thing that makes this a Web 2.0 thing is that personalization is now a collaborative effort - by sharing their opinions and preferences, users create a very rich map of the web in a truly democratic way. One of the best examples of this is Amazon's customer reviews; this is known as 'collaborative filtering'.

Who's leading the way? Social bookmarking/tagging sites like del.icio.us, Digg and ma.gnolia; blog tagging/search sites like Technorati and Bloglines; Amazon.com's customer reviews; eBay's seller reviews; custom home pages from My Yahoo!, Live.com, PageFlakes and Google; PVR technology like TiVo, Slingbox and Sky+.

What's the impact? As users become more demanding and more savvy with the personalization tools to hand, they will spend a greater proportion of their time within their own 'interest zone'; opportunities to interest them in other things become scarcer. Such opportunities will rely on understanding likely correlations between known interests and possible new ones.

3. Democratization of market access
This clumsy title refers to the complete transformation in the ability that individuals and businesses have to promote themselves that has come about in the past few years. This whole thing is really dependent on one key technological development: search marketing. Rather than a few advertisers using large media spends to push a (relatively) small number of products, search marketing has allowed thousands of advertisers to promote a vast array of products - the so-called 'long tail'.

Who's leading the way? Google and Overture (now Yahoo! Search Marketing) created the paid search market; MSN is looking to catch up with adCenter. eBay and Amazon Marketplace achieve the same kinds of things within their own (big) worlds. Blogads extends self-service marketing to the blogosphere.

What's the impact? Small businesses can now have advertising budgets - even if they're only $50 a month, and can track the effectiveness of that advertising every day. If you're prepared to ship internationally, you can address a global market, even if you're selling left-handed widgets for Virgoans. 0.001% of the total market is still 10,000 people (based upon a billion Internet users today).

Coupled with the growth in participation and collaboration, this provides a fertile environment for small businesses suddenly to become big - if something captures the public imagination, and gets promoted via collaborative filtering and participation, it can become very big very quickly.

4. Richer apps
A lot of commentators would put the new generation of funky (in the good sense) apps as central to what defines Web 2.0, but many of the Web 2.0 'leaders' (Google, eBay, Amazon, YouTube) are (for the most part) built using Web 1.0-style development methods. But the emergence of AJAX (itself a catch-all expression for a range of things, not a specific technology) does mean that web applications can behave more like the desktop apps that most people are used to; enabling them to help people achieve more complex tasks online with far fewer clicks and less waiting time.

What's more interesting in the application development space is the emergence of web services and public APIs for things like Google Maps, which has enabled a new breed of web app to emerge: the mashup. Mashups allow new apps to be built out of existing components available on the web, to combine the best of both apps. A good example is www.housingmaps.com, which combines Craigslist housing information with Google Maps to help potential buyers or renters to find properties in their interest area easily.

The new opportunities to create cool apps have led to an enormous number of Web 2.0 startups, creating a mini-bubble. Most such startups have little or no concrete revenue model, beyond placing Google AdSense ads on their sites (which seems to be a monetization panacea these days). Most will die, or be picked up by a bigger player, as was the case with Kiko.

Who's leading the way? Google, Yahoo! and Microsoft all make APIs available for their web platforms, and so do Flickr, eBay, Amazon and FedEx. Notable mashups include WeatherBonk, HotCaptcha and Read All About It; and GoogleMapsMania has a whole lot more (for some reason, about 80% of all mashups involve Google Maps).

What's the impact? Major providers of original content & services (like those listed above) will seek to build developer eco-systems around their products. What will be interesting will be the revenue models for those developers - but major content providers will end up sharing their advertising revenues, that's for certain.

To summarize...
A lot of folks have focused on some very narrow things (e.g. AJAX) as 'defining' Web 2.0. But for me, it is much more about changing business models, supported by new technology, than the technology itself. What this means is that Web 2.0 cannot be dismissed as a fad, since it is really just (just!) an evolution in the way people make money out of the web.

References:
What is Web 2.0? By Tim O'Reilly (who popularized the term)

September 15, 2006

Into the den

Was intrigued and amused to see an ex-colleague of mine, Steve Johnston, on the BBC programme Dragons' Den on Thursday night, pitching his StoryCode start-up. Steve was admirably sanguine about the moderate savaging he got on the show - at least his idea was deemed interesting enough to get one of the four or so 15-minute treatments on the show, rather than just being an also-ran dismissed within 10 seconds by Evan Davis. The highlight for me was when Steve admitted that the only business that StoryCode had done, and which he'd valued at £1m, was a £500 order from Foyle's. It was one of the more skewed valuations I've seen on the show (P/E ratio of 2,000, anyone?).

For those of you who didn't see the show, StoryCode is a collaborative filtering system for book recommendations, where readers rate the books they've read along 40 axes which describe the content of the book; the information is then used to correlate between books which have similar qualities and make recommendations to readers.
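I don't know how StoryCode actually does its correlation, but the general idea can be sketched with a simple similarity measure. The sketch below assumes each book has an averaged score per axis and uses cosine similarity to find the closest match; the titles, the three axes (instead of 40) and the scores are all invented for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(title, profiles):
    """Return the other book whose axis scores are closest to `title`."""
    target = profiles[title]
    others = (name for name in profiles if name != title)
    return max(others, key=lambda name: cosine_similarity(target, profiles[name]))


# Invented averaged reader scores on three made-up axes
# (say: pace, humour, darkness), each on a 1-5 scale.
profiles = {
    "Book A": [5, 1, 3],
    "Book B": [4, 1, 3],
    "Book C": [1, 5, 1],
}

if __name__ == "__main__":
    print(most_similar("Book A", profiles))
```

A reader who liked Book A would be pointed at Book B, whose scores sit much closer along every axis than Book C's.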

I happen to know, having known Steve for about 6 years, that the StoryCode idea isn't new; in fact, Steve was working on it when I first met him. It still retains some uniqueness, but the online bookselling industry (i.e. Amazon.com) has moved on quite a lot in that time, with Amazon in particular pioneering collaborative filtering ("People who bought this book also bought...") plus a lot of recommendation functionality in their site. But the company has other challenges, too. Here's my impression of them:

  1. The technology may be clever, but could probably be relatively easily replicated by Amazon
  2. The company's business model (licence software) is stuck in the '90s
  3. It's not a sustainable business at the moment, so no one is being paid for their efforts, meaning that investors would have to fund salaries (which never goes down well)
  4. It's not clear why people would bother to spend time rating books in the system

And here, for what they're worth, are my recommendations (not all of these are completely original thoughts):

  1. The exit strategy should be to sell the technology to Amazon (or possibly a competitor), and soon. They may buy it if by doing so they can save themselves time building something; building relationships with lots of book retailers will make this more difficult. So cultivating Amazon would seem like a good approach.
  2. The monetization strategy for the company needs to come from affiliate deals on sales of books (as pointed out by ex-Dragon Doug Richard), with a little contextual advertising (always the last refuge of the monetization scoundrel) thrown in. The market for the actual software itself is tiny.
  3. The company needs to start making (or at least taking) some money. This may (will) require some investment to drive more traffic to the site; if storycode.com starts to make a few thousand pounds of revenue a month through affiliate deals, it'll be an awful lot more attractive. Plus, it doesn't need a bloated 'management team'. StoryCode is a two-guys-in-a-garage thing which gets sold on for a couple of million (this isn't a pejorative remark) - not an old-style dotcom behemoth.
  4. They need to come up with an incentive system for entering ratings, and quick. Relying on people's goodwill is not enough. Some kind of points system, redeemable against actual books perhaps, is what comes to mind.

That's my twopence-worth. Steve, if you're reading this, I thought you did a great pitch - but there were rather a lot of holes.

September 06, 2006

Mint

No, not the UK credit card. Nor the shop on Wigmore Street in London where I bought my wife a very nice chandelier a few years ago. Mint is a nice little web stats app written by Shaun Inman. Unusually for such things in this day and age, it's not a hosted service but a little chunk of PHP/MySQL/JavaScript which costs $30 to install for a single site.

Given the bazillions of teeny tiny web stats packages out there, this wouldn’t normally be worth a mention, but it has a number of features which are pretty cool and not often found at this level:

Interface - the Mint interface is very, very simple (there aren't even any charts, at least in the version I've seen), but does have a nice feature that lays out multiple tables on the screen intelligently even as the screen is resized. This is paired with a jump-to navigator in the top bar of the interface, which makes it easy to get to a particular table of data (and the jump is nicely animated, too - or that may just be IE7).

API – the most interesting thing about Mint is its API, Pepper, which allows people to write plug-ins for the app which display specific kinds of data, such as outbound clicks on Google AdSense ads (something Google Analytics doesn't do). Pepper has become pretty popular – see Peppermint Tea for a list of current plug-ins - and is a stroke of genius on Shaun's part, since the effective functionality of Mint is now far greater than anything he could have created on his own.

RSS – information can be made available as an RSS feed. This is fairly obvious, since RSS is becoming the new e-mail, and web stats packages have been able to e-mail reports out for some time now (and many of them make their data available in XML format). But it's a nice little touch, and given that Mint is aimed squarely at the self-hosting/blogging community (even though you can't run it on a shared blogging service like TypePad), it's very much up with the zeitgeist.

(via Vitamin)


August 31, 2006

Measuring Web 2.0

Eric Peterson has an interesting post on his blog about a couple of ideas he's had for measuring 'Web 2.0' usage - by which he means sites that use AJAX for in-page events, and (in particular) mash up other apps or web services.

Essentially, Eric suggests two methods:

  1. Web service apps (e.g. Google Maps) implement the ability to have a user identifier passed in when they're called, and then expose an API to extract usage data for this identifier at a later date, which can then be rolled into the usage data for that user on the calling site
  2. Apps expose an API for passing in a tag destination URL and a user identifier which the app will ping with activity data for the user (so that the app can contribute usage data to whichever web analytics app the creator of the mashup is using)
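To make the second method a little more concrete, here is a minimal sketch in Python of the request the remote app might send to the tag destination. The parameter names (`uid`, `event`, `detail`) and the example URL are assumptions made up for illustration; Eric's post doesn't specify a format.

```python
from urllib.parse import urlencode, urlsplit


def build_ping_url(tag_destination: str, user_id: str, event: str, detail: str = "") -> str:
    """Build the URL a remote app could request to report one user action.

    The query parameter names here are hypothetical, not part of any standard.
    """
    params = {"uid": user_id, "event": event}
    if detail:
        params["detail"] = detail
    # Append to any query string the tag destination already carries.
    separator = "&" if urlsplit(tag_destination).query else "?"
    return f"{tag_destination}{separator}{urlencode(params)}"


# Example: a map widget reporting a zoom action for one user.
print(build_ping_url("https://stats.example.com/tag.gif", "user-123", "map_zoom", "level=7"))
```

The point of the design is that the mashup's creator only hands over a destination and an identifier; the remote app does the pinging, so nothing has to be retrieved in bulk afterwards.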

I think both of these ideas have merit, but the second seems way more practical to me, since the first would carry an enormous overhead in retrieving the usage data - the analytics tool would have to pass in a list of user IDs which could be millions long.

We went through this kind of pain at WebAbacus when a client implemented a SOAP API for retrieving CMS data; because the API was designed to return information about a document at a time, it was appallingly slow when you wanted to retrieve data about thousands of documents in one go.
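The cost of a one-item-at-a-time API is easy to see by counting round trips. The sketch below uses a stand-in class that counts calls rather than making real requests; the batched `get_documents` call is hypothetical (the API described above had no such thing) and is there only for comparison.

```python
class FakeDocumentApi:
    """Stand-in for a remote API; counts round trips instead of making them."""

    def __init__(self) -> None:
        self.calls = 0

    def get_document(self, doc_id: int) -> dict:
        # One round trip per document, like the SOAP API described above.
        self.calls += 1
        return {"id": doc_id}

    def get_documents(self, doc_ids: list) -> list:
        # Hypothetical batched call: one round trip for many documents.
        self.calls += 1
        return [{"id": doc_id} for doc_id in doc_ids]


def fetch_one_at_a_time(api: FakeDocumentApi, doc_ids: list) -> list:
    return [api.get_document(doc_id) for doc_id in doc_ids]


def fetch_in_batches(api: FakeDocumentApi, doc_ids: list, batch_size: int = 500) -> list:
    results = []
    for start in range(0, len(doc_ids), batch_size):
        results.extend(api.get_documents(doc_ids[start:start + batch_size]))
    return results


doc_ids = list(range(5000))
single, batched = FakeDocumentApi(), FakeDocumentApi()
fetch_one_at_a_time(single, doc_ids)
fetch_in_batches(batched, doc_ids)
print(single.calls, batched.calls)  # 5000 round trips versus 10
```

Each round trip carries its own network and parsing overhead, so the difference in call counts is a rough proxy for the difference in elapsed time.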

The challenges facing the second approach are also considerable, but manageable. The key thing would be constructing the right tag request with the information in it. The method that Eric suggests is too primitive; a better method would be to pass through the name of a JS file that the remote app could include, and the name of a JS function that would be called. But there would still be the problem of capturing the individual events within the app.

A third method that Eric doesn't suggest is to implement an API in the remote app which can pass back usage events to the 'main' app, which can then log them using its own measurement technology. My home-grown knowledge of JavaScript becomes fuzzy at this point, but I know there would be some issues with passing activity data from one site to another (most mashuppable apps place themselves in an iframe, which is technically another site). The other challenge would be a security/privacy one; apps which were prepared to pass click info to calling pages would potentially be open to attack from malicious sources.
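The shape of this third method can be sketched as a simple callback arrangement. This is Python rather than browser JavaScript, so it deliberately ignores the cross-site iframe problem mentioned above; the class and method names are invented for illustration only.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


class EmbeddedApp:
    """Hypothetical mashup component that passes usage events back to its host."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[Event], None]] = []

    def on_usage_event(self, listener: Callable[[Event], None]) -> None:
        # The calling page registers a function to receive usage events.
        self._listeners.append(listener)

    def simulate_user_action(self, name: str, detail: str) -> None:
        # In a real app this would be triggered by the user, not called directly.
        event = {"name": name, "detail": detail}
        for listener in self._listeners:
            listener(event)


class HostPageLogger:
    """Stands in for the main app's own measurement technology."""

    def __init__(self) -> None:
        self.logged: List[Event] = []

    def log(self, event: Event) -> None:
        self.logged.append(event)


app = EmbeddedApp()
logger = HostPageLogger()
app.on_usage_event(logger.log)
app.simulate_user_action("map_pan", "north")
print(logger.logged)
```

Because the embedded app hands events to whatever function it is given, a real version would need some way of deciding which callers to trust, which is the security concern raised above.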

Some of this commentary rather misses the point of Eric's post, which is to propose some standards for Web 2.0 analysis and reporting; something I wholeheartedly endorse.
