Advogato Data Recovery
13 January 2005
My account vanished a few days ago. I had been hoping that waiting a few days would see it restored from backup, but then I noticed that I was not alone and that recreating the account would get my data back.
So, here I am with a new account very similar to the old one, but lacking all the certification.
A couple of days ago, zeenix mentioned he had overwritten his diary (and I would have responded sooner if not for the lost account issue).
The good news is that I archive the HTML pages I have rawdog generate from RSS et al. This means I have 88MB of HTML from all the feeds I subscribe to going back to last June. So, I thought it would be a fairly trivial operation to recover the lost entries with a splash of Perl.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;
use HTML::TreeBuilder;

foreach my $file_name (@ARGV) {
    my $tree = HTML::TreeBuilder->new; # empty tree
    $tree->parse_file($file_name);
    my @elements = $tree->find_by_attribute("class", "item feed-6e2302e9");
    # feed-6e2302e9 is the id of elements that a third
    # party scrapes from advogato recent log and provides
    # as an RSS feed to a few people
    # Since it has all Advogato entries in it we need to
    # parse the HTML to look at the name of the person
    # who wrote it. Advogato puts that in the title of the
    # entry.
    foreach my $node (@elements) {
        my @title = $node->find_by_tag_name('h3');
        next unless @title; # skip items that have no h3 title
        if ($title[0]->as_text =~ /zeenix/) {
            print $node->as_HTML;
        }
    }
    $tree->delete;
}
It seemed like a good idea, but it looks like the missing entries never made it to my feedparser, and perhaps not even onto Advogato, so the results weren't great. (2013: 8 years later, I'm moving to a new CMS and don't think that data is important enough to import into it.)