Dorward

Initial Investigation Incorrect

03 February 2008

I reran the tests for my initial XHTML experiments this morning to see if results had been generated for any of the other search engines, or if the changes in file extension had had any impact. The results were not what I was expecting.

<table>
<caption>Results (as of February 3, 2008)</caption>
<thead>
<tr>
<th scope="col" rowspan="2">Search engine</th>
<th scope="colgroup" colspan="3">Indexed pages</th>
</tr>
<tr>
<th scope="col">Control</th>
<th scope="col">XHTML</th>
<th scope="col">Not Well-formed</th>
</tr>
</thead>
<tbody>
<tr>
<td scope="row"><a href="http://www.google.co.uk/search?q=site%3Astone.thecoreworlds.net+Crazy+dancing+telephone+people">Google</a></td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td scope="row"><a href="http://uk.search.yahoo.com/search;_ylt=A0geumL5rZdHLLoA4mJLBQx.?p=Crazy+dancing+telephone+people+site%3Astone.thecoreworlds.net&amp;y=Search&amp;fr=yfp-t-501&amp;ei=UTF-8&amp;rd=r1">Yahoo!</a></td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td scope="row"><a href="http://uk.altavista.com/web/results?sc=off&amp;q=Crazy+dancing+telephone+people+domain%3Astone.thecoreworlds.net">Altavista</a></td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td scope="row"><a href="http://search.live.com/results.aspx?q=Crazy+dancing+telephone+people&amp;go=Search&amp;form=QBRE">LiveSearch</a></td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td scope="row"><a href="http://uk.ask.com/web?qsrc=167&amp;o=312&amp;l=dir&amp;siteid=&amp;q=%22Crazy+dancing+telephone+people%22&amp;search=search&amp;dm=all">Ask.com</a></td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
</tbody>
</table>

Although the control page and the content pages were added to the web at the same time, some of them took longer than others to appear in results. In the last week the other types of document have started showing up, including the XHTML served as application/xhtml+xml with well-formedness errors. The only exceptions are LiveSearch (but, given the track record of the other search engines, I'm going to give it more time to pick up the XHTML pages) and Ask (which hasn't picked up the control page yet).

I don't think indexing these pages is a very good approach for search engines to take, since users of Internet Explorer will be presented with download dialogue boxes, and users of browsers which support XHTML will receive error messages when they view the documents that are not well formed.

I have, of course, updated the project page.
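
As a rough illustration of the distinction between the "XHTML" and "Not Well-formed" columns, the sketch below checks whether a document parses as XML, which is roughly what a browser does before rendering content served as application/xhtml+xml. The two sample documents and the `is_well_formed` helper are hypothetical stand-ins, not the actual test pages from the experiment.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample pages (assumptions for illustration only).
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>XHTML</title></head>"
    "<body><p>Crazy dancing telephone people</p></body>"
    "</html>"
)

# The <p> element is never closed, so an XML parser must reject this.
NOT_WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Not well-formed</title></head>"
    "<body><p>Crazy dancing telephone people</body>"
    "</html>"
)

if __name__ == "__main__":
    for name, page in [("well-formed", WELL_FORMED),
                       ("not well-formed", NOT_WELL_FORMED)]:
        print(name, "parses as XML:", is_well_formed(page))
```

A strict XML parser stops at the first error, which is why a visitor whose browser supports XHTML would see an error message rather than the page for the second sample.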