LLMs are solving the MCAT, the bar exam, the SAT, etc. like they’re nothing. At this point their performance is superhuman. However, they’ll often trip on super simple common-sense questions, and they’ll struggle with creative thinking.

Is this literally proof that standard tests are not a good measure of intelligence?

  • Z3k3@lemmy.world
    9 months ago

    When I was at uni, lecturers would often state that exams were the worst measure of grasping the subject material, but they're all we have at the moment.

    I saw this myself: some of my classmates tested very well, but when discussing or problem solving outside of class there was nothing there.

    I think LLMs fall into this category, but with way better recall.

      • Z3k3@lemmy.world
        9 months ago

        I’m not in the US. Colleges are generally vocational here, with both colleges and universities being less (though not totally un)concerned with the money side.

        For example, where I live, university courses are free for those in the country; those from outside pay fees.

        Dunno how it’s done elsewhere, but our courses are usually measured in 3 parts: 1) exam, 2) practical, 3) essay/investigation. Everyone hates exams.

      • brianorca@lemmy.world
        9 months ago

        It’s also the only way that is portable. A professor could evaluate each student, but has no way to transmit that kind of evaluation in a way that schools or employers across the country would trust. They don’t know who the professor is, or what his standards are, or even if he is being bribed to pass somebody. (Which would happen much more if the professor’s opinion had the weight that the standardized test does.)

    • Fermion@mander.xyz
      9 months ago

      I had a lot of professors who put most of the grade weight on large projects. It made for a very heavy workload, but projects/papers give a much better picture of how capable someone is of not only reciting knowledge, but also applying it.

      • Z3k3@lemmy.world
        9 months ago

        Most of my grades were split 40/40/20

        With the third being a written component.