Painfully Parsing RSS
on 08.13.02, 07:15am in weblog
[Ziv Caspi]: Mark Pilgrim writes (and rants) of the pain that is parsing RSS. For some reason, even people well-aware of XML, HTML and the gap that’s between them don’t bother to check that their RSS feeds are actually well-formed XML. People leave stray ‘&’ around; people include HTML entities (such as ") although XML only has five built-in entities; they make other mistakes.

I totally agree with you guys. While writing Pocket Feed, I found that the MSXML parser on Windows CE crashes if you so much as look at it funny. The RSS differences, coupled with variations in RDF and OPML (for example, Radio uses subtly different tags from AmphetaDesk, Aggie, etc.), have caused me nothing but aggravation when importing subscriptions and reading feeds.
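
For anyone who wants to see the two mistakes Ziv mentions in action, here is a rough sketch of a well-formedness check in Python using the standard library's xml.etree.ElementTree. This is not how Pocket Feed works. The function name check_well_formed and the sample feed strings are made up for illustration, and &eacute; stands in for any HTML entity that XML does not define on its own.

```python
import xml.etree.ElementTree as ET


def check_well_formed(feed_text):
    """Return None if feed_text parses as XML, else the parser's error message.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(feed_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Made-up sample feeds, trimmed down to a single title element.
good = "<rss><channel><title>Tom &amp; Jerry</title></channel></rss>"
stray_amp = "<rss><channel><title>Tom & Jerry</title></channel></rss>"
html_entity = "<rss><channel><title>caf&eacute;</title></channel></rss>"

for name, feed in [("good", good),
                   ("stray_amp", stray_amp),
                   ("html_entity", html_entity)]:
    problem = check_well_formed(feed)
    print(name, "->", "well-formed" if problem is None else problem)
```

Only the first sample should come back clean. The other two are rejected by the parser, which is the right behavior for an XML parser and exactly why a feed should be run through a check like this before it is published.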



