• FlexibleToast@lemmy.world
    1 month ago

    A 120B-parameter model is small compared to the models running in datacenters. However, this does seem like the current “Moore’s Law” for AI: finding ever more efficient ways to run larger-parameter models.