And I’ll show you YAML

(a continuation of this post)

  • magic_lobster_party
    53
    7 months ago

    JSON for serialization all the way. It’s simple and to the point. It does one thing and does it well. There’s little room for annoying surprises. Any JSON can easily be minified and prettified back and forth. If you want it in binary format you can convert it to BSON.
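
For illustration, the "minified and prettified back and forth" point can be sketched with Python's standard `json` module. The `record` value is made up for the example; the only claim is that both forms decode to the same data.

```python
import json

# Hypothetical sample data, invented for this sketch.
record = {"name": "example", "tags": ["a", "b"], "nested": {"enabled": True, "retries": 3}}

# Minified: no whitespace between tokens.
minified = json.dumps(record, separators=(",", ":"))

# Prettified: the same data, indented for human readers.
pretty = json.dumps(json.loads(minified), indent=2)

# Going back to the compact form gives the original minified text again.
reminified = json.dumps(json.loads(pretty), separators=(",", ":"))

print(minified)
print(pretty)
```

Because whitespace between tokens carries no meaning in JSON, the conversion is lossless in both directions.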

    YAML has too much feature creep. It tries to do way too many things at the same time. There are so many traps to fall into if you’re not cautious enough. The same thing can be written in a multitude of ways.

  • interolivary
    47
    7 months ago

    There’s a special place in hell for the inventor of semantically significant whitespace.

    YAML itself is one of the circles of hell. You have to copy-paste YAML from the web and other sources with dubious formatting for all eternity, and the editor doesn’t have YAML support. Also, you can only use Python.

    • magic_lobster_party
      14
      7 months ago

      Indenting copy-pasted YAML is always a pain in the butt. Almost any indentation you give it is likely to be valid YAML, which is especially bad considering that indentation is significant. You have to double-check back and forth to ensure nothing bad has sneaked in.

      With JSON there are no such discrepancies. It’s likely the editor has figured it out for you already. If it hasn’t it’s easy to prettify the JSON yourself.

      • interolivary
        5
        7 months ago

        Semantic whitespace problems can easily be literally impossible to solve automatically. One of the dumbest fucking ideas anybody ever came up with in computing, and its inventor, if anyone, belongs in YAML Hell. As a fuckup it’s not quite as bad as null, but that ain’t exactly a high bar.

      • interolivary
        4
        7 months ago

        I’m not sure which thought is scarier: that you don’t know what you’re signing up for, or that you do know and you enjoy fixing undecidable formatting fuckups manually

        • synae[he/him]
          English
          2
          7 months ago

          There’s a bonus third option: I started writing Python professionally in 2007 and nowadays spend 75% of my “hands on keyboard” time working on Kubernetes YAML, and I am indeed having a good time.

          I admit, I hastily misread the tail end of your comment as (e.g.) “A reason YAML is bad because you have to copy-paste from the web and that sucks”; not as you probably meant it “in this special hell, you must deal with copy-pasted nbsp and other trash”. So maybe I did not know exactly what I was signing up for ;)

          I don’t deal with anything like that and not entirely sure how it happens to people enough that it is a common complaint. “undecidable formatting fuckups” are a non-issue in my life, I really don’t understand how people encounter such things. Maybe they need to fix their editor/IDE/tools? Skill issue? IDK.

          As a tangent- I don’t care what language code is written in, it had better be indented properly (and linted, and follow the project’s codestyle, …). Our juniors learn pretty early that their change requests will be blocked on formatting alone by CI, and a human won’t even bother reviewing the substance of their change if they don’t follow convention. I don’t hear them ever complaining about any of these things, least of all semantic whitespace … and we have a rich culture of bitching about menial/pedantic things ;)

  • @[email protected]
    40
    7 months ago

    My problem with yaml is if you truncate it at any random spot, there’s a high likelihood it’s still valid yaml. I don’t like the idea that things can continue without even knowing there’s a problem. The single opening and closing curly braces enclosing a json object is all it takes to at least know you didn’t receive the entire message. Toml has the same issue. I’ll stick with json when it makes sense.
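
A rough sketch of the JSON half of that argument, using only Python's standard `json` module: cut a serialized object at every position and check whether any truncated copy still parses. The `message` payload is invented for the example, and the YAML and TOML side is not shown because it would need a parser outside the standard library assumed here.

```python
import json

# Hypothetical payload, invented for this sketch.
message = json.dumps({"user": "alice", "roles": ["admin", "dev"], "active": True})


def parses(text: str) -> bool:
    """Return True if text is a complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Every proper prefix of the message, i.e. every possible truncation point.
truncated_copies = [message[:i] for i in range(len(message))]

# Truncated copies that a strict parser would still accept.
still_valid = [text for text in truncated_copies if parses(text)]

print("complete message parses:", parses(message))
print("truncated copies that still parse:", len(still_valid))
```

For a top-level object the closing brace is the last character, so a strict parser rejects every truncated copy. A line-oriented format has no equivalent end marker, which is the concern raised above.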

      • @[email protected]
        13
        7 months ago

        Much like YAML, XML has too much stuff in it. While a lot of parsers are not standards-compliant and are therefore safe, if there’s any chance the one you include in your code can evolve into a fully featured parser, including it is something to avoid.

        There is this language called KDL that looks interesting.

  • @[email protected]
    39
    7 months ago

    Serializing? For serializing you probably want performance above all else. I’m saying this without checking any benchmark, but I’m sure YAML is more expensive to parse than other formats where indentation doesn’t have meaning.

    For human readability: it has to be readable (and writeable) by all humans. I know a lot of people who dislike YAML, TOML and XML. I don’t know of a single person that struggles to read/write json, there is a clear winner.

    • Dr M
      20
      7 months ago

      JSON would be perfect if it allowed for comments. But it doesn’t and that alone is enough for me to prefer YAML over JSON. Yes, JSON is understandable without any learning curve, but having a learning curve is not always bad. YAML provides a major benefit that is worth the learning curve and doesn’t have the issues that XML has (which is that there is no way to understand an XML without also having the XSD for it)

      • @Michal
        25
        7 months ago

        JSON should also allow for trailing commas. There’s no reason for it not to. It’s annoying having to maintain commas.
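
For context, a small sketch of how a strict parser treats a trailing comma today, using Python's standard `json` module. `try_parse` is a helper made up for this example, and the exact error message varies between Python versions.

```python
import json


def try_parse(text: str):
    """Parse text as JSON, or return a short note if the parser rejects it."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return f"rejected: {err.msg}"


print(try_parse('["a", "b"]'))    # no trailing comma
print(try_parse('["a", "b",]'))   # trailing comma in an array
print(try_parse('{"a": 1,}'))     # trailing comma in an object
```

This is why appending or reordering entries in hand-edited JSON means touching the comma on the neighbouring line as well.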

      • Kogasa
        1
        7 months ago

        If a comment isn’t part of the semantic content of a JSON object it has no business being there. JSON models data, it’s not a markup language for writing config files.

        You can use comments in JSON schema (in a standardized way) when they are semantically relevant: https://json-schema.org/understanding-json-schema/reference/comments

        For the data interchange format, comments aren’t part of the JSON grammar but the option to parse non-JSON values is left open to the implementation. Many implementations do detect (and ignore) comments indicated by e.g. # or //.
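
As a minimal sketch of what "detect and ignore comments" can look like, assuming an application that pre-processes its own config text before handing it to a strict parser: `strip_comments` and `config_text` are hypothetical names for this example, not part of any library, and only `//` and `#` line comments are handled.

```python
import json


def strip_comments(text: str) -> str:
    """Remove // and # line comments that appear outside JSON strings."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            # Inside a string: copy everything, tracking backslash escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif ch == "#" or text.startswith("//", i):
            # Comment start: skip to the end of the line, keep the newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical config file contents, invented for this sketch.
config_text = '''{
  // where the service listens
  "url": "http://localhost:8080/#home",  # trailing note
  "retries": 3
}'''

config = json.loads(strip_comments(config_text))
print(config)
```

The `//` and `#` inside the URL string survive because the scanner tracks whether it is inside a string. Passing `config_text` straight to `json.loads` raises `json.JSONDecodeError`.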

        • @[email protected]
          2
          7 months ago

          JSON models data, it’s not a markup language for writing config files.

          JavaScript package management promptly said otherwise. JSON is a config format no matter if you like it or not.

          • Kogasa
            1
            7 months ago

            I’ve disagreed with JavaScript before, what makes you think I won’t do it again?

            Anyway, anything using JSON as a config language will also certainly use a JSON interpreter that can ignore comments. Sure that’s “implementation specific,” but so is a config file. You wouldn’t use “MyApplication.config.json” outside the context of MyApplication loading its own configuration, so there’s no need for it to be strictly compliant JSON as long as it plays nicely with most text editors.

    • @[email protected]
      16
      7 months ago

      I don’t know of a single person that struggles to read/write json, there is a clear winner.

      Really? Any JSON over 80 chars becomes a nightmare to read for me, especially if indentation is not used to make it more readable.

    • Kogasa
      1
      7 months ago

      Serializing isn’t necessarily about performance, or we’d just use protobuf or similar. I agree JSON is a great all-rounder. Combine it with JSON Schema to define sophisticated DSLs that are still readable, plain JSON. TOML is nice as a configuration language, but its main appeal (readability) suffers when applied to complex modeling tasks. XML is quite verbose and maybe takes the “custom DSL” idea a little too far. YAML is a mistake.

    • @[email protected]
      English
      1
      7 months ago

      I don’t know why we’re fucking about trying to use text editors to manipulate structured data.

      Yeah, it’s convenient to just be able to use a basic text editor, but we’re not trying to cram it all on a floppy disk here. I’m sure we could have a nice structured data editor somewhere for all those XML, JSON and YAML files we’re supposed to maintain every day.

  • synae[he/him]
    English
    25
    7 months ago

    For serializing? I’d probably just go with json.

    For content meant to be written or edited by humans? YAML all day baby

    • @Andy
      5
      7 months ago

      Ever tried NestedText? It’s like basic YAML but everything is a string (types are up to the code that ingests it), and you never ever need to escape a character.

      • synae[he/him]
        English
        9
        7 months ago

        I’ve got too many consumers that I don’t control which dictate their input formats. And to be quite honest, “types are up to the code that ingests it” sounds like a huge negative to me.

        • @Andy
          3
          7 months ago

          Ah, well I love that policy (types being in code, not configs). FWIW I sometimes use it as a hand-edited document, with a small type-specifying file, to generate json/yaml/toml for other programs to load.
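
A rough sketch of that workflow in Python, under stated assumptions: `raw` stands in for what an everything-is-a-string loader might return, `schema` stands in for the "small type-specifying file", and `to_bool` is a helper invented here. None of this is the NestedText API.

```python
import json

# Stand-in for a loaded document where every leaf value is a string.
raw = {"port": "8080", "debug": "no", "timeout": "2.5", "name": "demo"}


def to_bool(text: str) -> bool:
    """Interpret a few common spellings of true; anything else is false."""
    return text.strip().lower() in {"yes", "true", "on", "1"}


# Hypothetical type specification: one converter per key, kept in code.
schema = {"port": int, "debug": to_bool, "timeout": float, "name": str}

# Apply the converters, then emit JSON for other programs to load.
typed = {key: schema[key](value) for key, value in raw.items()}
print(json.dumps(typed, indent=2))
```

The trade-off discussed above shows up directly: the config file never guesses a type, but every consumer needs its own `schema`.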

    • Terrasque
      14
      7 months ago

      puts the json in the yaml parser

      Your move, foolish mortal

      • @derpgon
        7
        7 months ago

        For the uninitiated: every JSON document is valid YAML, since YAML (as of version 1.2) is designed to be a superset of JSON.

        • TechNom (nobody)
          English
          1
          7 months ago

          Protobuf is also not a proper binary alternative for YAML. Protobuf needs a schema in the form of its IDL. CBOR and MessagePack might be more analogous.

  • @[email protected]
    6
    7 months ago

    YAML is a great, human-readable file format. Unless there’s an exclamation point in it, then it is an illegible eldritch horror.

  • @[email protected]
    5
    7 months ago

    Genuinely curious what features OP is looking for, specifically for serialization as per the post, that has resulted in the conclusion being yaml.

  • JoYo
    English
    1
    7 months ago

    I didn’t even know there was a difference.