• Aloso
    2 years ago

    The Reddit thread has some interesting discussion, including a solution that uses no SIMD intrinsics and is more than 200x faster: it iterates with .chunks_exact() and lets the compiler auto-vectorize the loop.
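
For anyone who hasn't seen the pattern, here is a rough sketch of what a .chunks_exact() loop tends to look like. This is not the code from the Reddit thread; the function names `sum_chunked` and `sum_naive`, the `LANES` constant, and the choice of a wrapping u32 sum are all assumptions made up for illustration. The idea is that fixed-size chunks let the compiler see a constant trip count on the inner loop, which usually makes it easier to turn into SIMD instructions.

```rust
/// Hypothetical example (not from the thread): wrapping sum of a u32 slice.
/// Straightforward version, one element at a time.
pub fn sum_naive(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Same result, but processed in fixed-size chunks with one accumulator
/// per lane. LANES = 8 is an arbitrary choice for this sketch.
pub fn sum_chunked(data: &[u32]) -> u32 {
    const LANES: usize = 8;
    let mut acc = [0u32; LANES];

    let chunks = data.chunks_exact(LANES);
    // Elements that do not fill a whole chunk are handled separately below.
    let remainder = chunks.remainder();

    for chunk in chunks {
        // Every chunk has exactly LANES elements, so this inner loop has a
        // constant trip count and the indexing is known to be in bounds.
        for i in 0..LANES {
            acc[i] = acc[i].wrapping_add(chunk[i]);
        }
    }

    // Combine the per-lane accumulators, then add the leftover tail.
    let mut total = acc.iter().fold(0u32, |a, &b| a.wrapping_add(b));
    for &x in remainder {
        total = total.wrapping_add(x);
    }
    total
}
```

Whether this actually vectorizes, and how much faster it gets, depends on the target CPU, the optimization level, and the operation inside the loop, so it is worth checking the generated assembly or benchmarking rather than assuming a speedup.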