• xcjs
    5 months ago

    I just wanted to update this to mention that Llamafile includes a lot of custom low-level performance improvements for CPU-based inference: https://justine.lol/matmul/