• MonkderZweite@feddit.ch
    10 months ago

    Actually, you can’t even parse HTML5 reliably with specialized tools, or by converting it to XML and running XML linters on the result (they bail out because of too many errors). The only tools that can parse HTML reliably (mostly) are the big three browser engines. That’s my experience from converting saved webpages to Asciidoctor: it still takes manual cleanup, even with tidy and pandoc.

    • kevincox@lemmy.ml
      10 months ago

      This isn’t true. HTML5 defined a very strict set of parsing rules, and there are a good number of compliant parsers. But yes, you absolutely can’t use an XML parser. You can’t even use an XML emitter, since it can emit valid XML that means something completely different when read as HTML.

      …what a fucking disaster. I still wish XHTML had won.
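
A small sketch of the XML emitter point above, using only Python's standard `xml.etree.ElementTree`. The element names and `app.js` are made up for illustration. The XML serializer writes the empty `script` element in self-closing form; an HTML parser ignores that trailing slash on a non-void element, so the `script` would stay open and swallow the markup that follows it. The same tree serialized with `method="html"` gets an explicit end tag instead.

```python
import xml.etree.ElementTree as ET

# Build a tiny tree: <div> containing an empty <script> and a <p>.
# "app.js" is a hypothetical file name used only for this example.
root = ET.Element("div")
ET.SubElement(root, "script", {"src": "app.js"})  # empty, non-void element
p = ET.SubElement(root, "p")
p.text = "hello"

# XML serialization: empty elements are written in self-closing form.
as_xml = ET.tostring(root, encoding="unicode", method="xml")

# HTML serialization: non-void elements always get an explicit end tag.
as_html = ET.tostring(root, encoding="unicode", method="html")

print(as_xml)   # <div><script src="app.js" /><p>hello</p></div>
print(as_html)  # <div><script src="app.js"></script><p>hello</p></div>
```

Both outputs describe the same tree to an XML reader, but only the second one means the same thing once it is fed to an HTML parser.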