• cucumberbob · 1 year ago

If they were only looking at the previous word, they would. But they have an attention mechanism that combines the “meanings” of all the previous words into a context vector, with the mixing weights computed by another learned part of the network.
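
For intuition, here's a minimal sketch of scaled dot-product attention, the mechanism transformers use for this. The shapes and numbers are made up for illustration, and real models derive Q, K, V from learned linear projections rather than reusing the embeddings directly:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Each position mixes the value vectors of all positions,
    weighted by how relevant their keys are to its query."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to every key
    weights = softmax(scores, axis=-1)   # per-query weights that sum to 1
    return weights @ V                   # weighted sum of value vectors

# Toy example: 4 tokens, 8-dimensional embeddings (random stand-ins).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
# In a real transformer Q, K, V come from separate learned projections of X;
# reusing X for all three keeps this sketch short.
out = attention(X, X, X)
print(out.shape)  # (4, 8): one context-aware vector per token
```

So each word's new vector is a weighted blend of every earlier word's vector, not just the one right before it.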