Under the bonnet: Publishing live Wimbledon stats to multiple platforms
For some years now we've been publishing the annual Wimbledon tennis statistics to multiple platforms. Since last year, with our colleagues in FM&T Journalism, we're publishing to . For your delight, here's a peek at how this is implemented.
1) We receive a single feed from IBM. We have a direct connection to their DB2 database which runs on-site at Wimbledon.
2) Our client code, written in Perl, runs as a daemon. It receives a notification from the DB2 database, triggered every time the content changes. The stats client then interrogates the database to see what has changed; if the change is deemed 'interesting', it extracts the useful data and publishes it downstream. The business logic that decides what counts as interesting lives in the client. For our purposes, we're looking for changes to set points, and matches starting and finishing.
3) The stats client generates two XML feeds - one for the red button platforms, and one for the others. The red button feed contains abbreviated player names so that they can physically fit onto the TV screen. Historically, we had a special feed for Ceefax - in 2007, this was upgraded to XML to aid publishing to other non-broadcast platforms (see below).
4) The red button transcoder repurposes the content so it's ready for broadcast as in-band content. Each platform has its own data format, but visually the results are pretty much the same to the viewer regardless of the platform. The transcoder is actually implemented as a complex array of components in a combination of Perl and Java, which I hope to write more on in a future posting. For 2008, we've overhauled these components to join up the formerly separate video and text services.
5) The second transcoder targets the IP platforms and Ceefax. From 2007, the new XML feed was used, I believe, to publish live stats to for the first time. New to 2008 is the Wimbledon widget for the homepage.
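As a rough illustration of step 3, the sketch below renders one match into two XML documents, one with full player names and one with names shortened for the red button. It is only a sketch: the record layout, the element and attribute names, the abbreviate_name rule and the 12-character limit are all assumptions made up for the example, and the players are fictitious.

```perl
use strict;
use warnings;

# Hypothetical match record; the field names are illustrative, not IBM's schema.
my $match = {
    court   => 'Court 1',
    players => [ 'Anna Example', 'Beth Longsurname-Smith' ],
    sets    => [ [ 6, 4 ], [ 3, 6 ], [ 7, 5 ] ],
};

# Escape the five characters that are special in XML text and attributes.
sub xml_escape {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                   '"' => '&quot;', "'" => '&apos;' );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Assumed abbreviation rule: first initial plus surname, cut to a maximum width.
sub abbreviate_name {
    my ($name, $max_len) = @_;
    my @parts   = split ' ', $name;
    my $surname = pop @parts;
    my $short   = @parts ? substr($parts[0], 0, 1) . '. ' . $surname : $surname;
    return substr($short, 0, $max_len);
}

# Render one match as XML, passing each player name through $name_filter.
sub match_to_xml {
    my ($m, $name_filter) = @_;
    my $xml = sprintf qq{<match court="%s">\n}, xml_escape($m->{court});
    for my $i (0 .. $#{ $m->{players} }) {
        my $scores = join ' ', map { $_->[$i] } @{ $m->{sets} };
        $xml .= sprintf qq{  <player name="%s" sets="%s"/>\n},
            xml_escape($name_filter->($m->{players}[$i])), $scores;
    }
    return $xml . "</match>\n";
}

# Same data, two renderings: full names for IP platforms, short ones for TV.
my $full_feed       = match_to_xml($match, sub { $_[0] });
my $red_button_feed = match_to_xml($match, sub { abbreviate_name($_[0], 12) });

print $full_feed, $red_button_feed;
```

Passing the name handling in as a callback keeps a single rendering routine for both feeds, so the two outputs cannot drift apart in anything except the player names.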
Have you spotted the curious cats? Firstly you may note the combination of Perl and Java, and secondly the two transcoders each targeting a distinct set of platforms. This might be seen as a real-life example of . Character encodings and the quality of service also need to be dealt with. Having a direct link to the onsite Wimbledon network helps with the latter, and differentiates this type of solution from using data feeds through the Internet.
The stats client is special, since it needs to have a semantic understanding of tennis, or at least of the specific Wimbledon schema. Due to test data, or database failovers, erroneous data can still occasionally make its way into the database, which emphasises the need for defensiveness in any type of feed handler. Our client attempts to strike a middle ground so as not to inflate the code complexity; nevertheless, watching for dodgy Wimbledon stats has become something of an annual event here in the Bush!
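To give a flavour of that defensiveness, here is a minimal sketch of a sanity check a feed handler might run before publishing. The record layout and every individual check are assumptions for illustration, not the rules the real client applies.

```perl
use strict;
use warnings;

# Return a list of problems found in a match record; empty means it looks sane.
# The record layout and each check are assumptions made for illustration.
sub validate_match {
    my ($m) = @_;
    my @problems;

    push @problems, 'missing court'
        unless defined($m->{court}) && length($m->{court});

    my @players = @{ $m->{players} || [] };
    push @problems, 'expected two sides'
        unless @players == 2;
    push @problems, 'blank player name'
        if grep { !defined($_) || $_ !~ /\S/ } @players;

    my @sets = @{ $m->{sets} || [] };
    push @problems, 'more than five sets' if @sets > 5;
    for my $set (@sets) {
        my @games = ref($set) eq 'ARRAY' ? @$set : ();
        # Each set needs exactly two non-negative integer game counts.
        push @problems, 'malformed set score'
            unless @games == 2
                && !grep { !defined($_) || $_ !~ /\A\d+\z/ } @games;
    }
    return @problems;
}

# A record with an empty player name and a negative game count is rejected.
my @bad = validate_match({
    court   => 'Court 1',
    players => [ 'A. Example', '' ],
    sets    => [ [ 6, -1 ] ],
});
print "skipping update: $_\n" for @bad;
```

Returning a list of problems rather than a simple true/false makes it easier to log why an update was dropped, which helps when chasing dodgy data during the tournament.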
Comments
The following comments were originally posted on the BBCi Labs blog
At 2:09pm on 01 Jul 2008, mattcopp wrote:
This is very interesting stuff. I have always figured the systems must work like this.
Does it work the same for football too? Often a commentator will say something like "that hasn't happened for 27 years". Is that the business logic picking it out?
I also wonder if you have, or ever will make, the XML data available to the general public? This idea excites me that the users can make their own stats and figures, and get live data on games they aren't watching, or -- as I have found irritating a few times -- have the score displayed on screen at all times.
This would also be perfect for the BBC's newly purchased F1 rights.
At 5:07pm on 08 Jul 2008, rob_hardy wrote:
Just had a look at the code (implemented in Perl). The business logic is as follows:
my @rules = (
    { has_become_completed_or_suspended => 1 },
    { is_match_inactive => 0 },
    { has_score_changed => 1 },
    { has_become_active => 1 },
    { has_courts_changed => 1 },
    { is_court_invalid => 0 },
);
Each hashkey is a method, containing logic which compares matches, scores etc; the hash values are the outcome. We don't want to publish score changes on inactive matches or invalid courts - we do in the other cases. For example:
sub has_score_changed
{
    my ($self, $old_match, $new_match) = @_;
    # No previous snapshot of the match: report the score as changed.
    if (not defined $old_match) {
        return 1;
    }
    return $old_match->score_equals($new_match) ? 0 : 1;
}
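The comment above does not say how the rules are combined. One plausible reading, sketched here purely as an assumption, is that the rules are tried in order and the first method that returns true supplies the outcome (1 to publish, 0 to skip). The MatchRules package, the should_publish method, the plain-hash match records and the stub predicates are all hypothetical, and only two of the six rules are included.

```perl
use strict;
use warnings;

package MatchRules;

sub new { return bless {}, shift }

# Stub predicates over plain hashes; the real methods are not shown in the comment.
sub is_match_inactive {
    my ($self, $old, $new) = @_;
    return $new->{active} ? 0 : 1;
}

sub has_score_changed {
    my ($self, $old, $new) = @_;
    return 1 if not defined $old;
    return $old->{score} eq $new->{score} ? 0 : 1;
}

# Assumed semantics: try each single-key rule in order; the first predicate
# that returns true decides the outcome.
sub should_publish {
    my ($self, $rules, $old, $new) = @_;
    for my $rule (@$rules) {
        my ($method, $outcome) = %$rule;
        return $outcome if $self->$method($old, $new);
    }
    return 0;    # nothing matched: assume the change is not interesting
}

package main;

my @rules = (
    { is_match_inactive => 0 },
    { has_score_changed => 1 },
);

my $checker = MatchRules->new;
my $old = { active => 1, score => '6-4 2-1' };
my $new = { active => 1, score => '6-4 3-1' };

# The match is active and its score differs, so this prints "publish".
print $checker->should_publish(\@rules, $old, $new) ? "publish\n" : "skip\n";
```

Under this reading the order of the list matters: placing is_match_inactive before has_score_changed is what stops score changes on inactive matches from being published.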
As for making the XML public - I suspect IBM and the AELTC would need to agree to this.