(For context, I’m basically referring to Python 3.12 “multiprocessing.Pool Vs. concurrent.futures.ThreadPoolExecutor”…)

Today I read that multiple cores (parallelism) help with CPU-bound operations, whereas multiple threads (concurrency) are the better fit when the tasks are I/O-bound.

Is this correct? Would anyone care to elaborate?

At least from a theoretical standpoint. Of course, much real-world work is a mix of both, and I’d better start by profiling to see where the bottlenecks really are.

In case a concrete “algorithm” helps: say I have a function that applies a map-reduce strategy, reading chunks of data from a file on disk, computing some averages from them, and saving the result to a new file.
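To make that concrete, here is a rough sketch of what I have in mind, not a tested design. It assumes a text file with one number per line. The names `read_chunks`, `partial_sum` and `average_file` are made up for illustration, and I’m using `concurrent.futures.ProcessPoolExecutor` in place of `multiprocessing.Pool` only because it shares an interface with `ThreadPoolExecutor`, so the two can be swapped and timed against each other.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterator


def read_chunks(path: str, lines_per_chunk: int) -> Iterator[list[str]]:
    """Yield successive lists of lines from the file (the I/O part)."""
    with open(path) as f:
        while chunk := list(islice(f, lines_per_chunk)):
            yield chunk


def partial_sum(lines: list[str]) -> tuple[float, int]:
    """Map step: parse one chunk and return (sum, count) (the CPU part)."""
    values = [float(line) for line in lines if line.strip()]
    return sum(values), len(values)


def average_file(
    in_path: str,
    out_path: str,
    lines_per_chunk: int = 10_000,
    executor_factory: Callable[[], Executor] = ProcessPoolExecutor,
) -> float:
    """Compute the mean of all numbers in in_path and write it to out_path."""
    with executor_factory() as pool:
        # Note: Executor.map consumes its input iterable eagerly, so every
        # chunk is read and queued up front rather than streamed lazily.
        partials = list(pool.map(partial_sum, read_chunks(in_path, lines_per_chunk)))

    # Reduce step: combine per-chunk sums and counts, so chunks of
    # different sizes are weighted correctly.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    mean = total / count if count else float("nan")

    with open(out_path, "w") as out:
        out.write(f"{mean}\n")
    return mean
```

Calling `average_file("data.txt", "mean.txt")` (hypothetical file names) would use worker processes, and passing `executor_factory=ThreadPoolExecutor` would run the same pipeline on threads, which should make it easy to measure which one actually helps for a given file. With processes, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn workers.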

  • milicent_bystandr@lemm.ee
    1 month ago

    For small, highly parallel operations, Python probably isn’t the right language, and something like Rust is worth exploring.

    You could also try Julia, which, if I’m not mistaken, handles concurrency and parallelism well, and is also interactive and easy to write, like Python.

      • milicent_bystandr@lemm.ee
        1 month ago

        I don’t think so. There was some discussion about why writing Julia as a Python transpiler wouldn’t work as well. But it does supposedly have very good interoperability both ways: calling Julia functions from Python or vice versa.