python - Defining appropriate number of processes -


I have Python code that processes a lot of Apache logs (decompressing, parsing, crunching numbers, regex matching, etc.). One parent process takes a list of files (up to a few million) and sends the files to parse to workers, using a multiprocessing pool.

I wonder if there are any guidelines, benchmarks, or advice that can help me estimate the ideal number of child processes? I.e., is having one process per core better than launching a few hundred of them?

Currently, 3/4 of the script's execution time is spent reading files and decompressing them. In terms of resources, the CPU is 100% loaded, while memory and I/O are OK. So I assume there is a lot that can be done with proper multiprocessing settings. The script will be running on different machines and operating systems, so OS-specific hints are welcome, too.

Also, is there any benefit in using threads rather than multiprocessing?

I wonder if there are any guidelines, benchmarks, or advice that can help me estimate the ideal number of child processes?

No.

Is having one process per core better than launching a few hundred of them?

You can never know in advance.

There are too many degrees of freedom.

You can only discover it empirically, by running experiments until you get the level of performance you desire.
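As a rough illustration of such an experiment, here is a minimal sketch that times the same sample of files with several pool sizes. The names `count_lines` and `time_worker_counts`, the `logs/*.gz` path, and the chunk size are all hypothetical, and `count_lines` is only a stand-in for the real parsing work. It assumes the logs are gzip-compressed.

```python
import glob
import gzip
import multiprocessing
import time


def count_lines(path):
    """Stand-in for the real per-file work: decompress one log and count its lines."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return sum(1 for _ in handle)


def time_worker_counts(task, items, worker_counts, pool_factory=multiprocessing.Pool):
    """Run the same workload once per pool size and return {pool_size: seconds}."""
    timings = {}
    for workers in worker_counts:
        start = time.perf_counter()
        with pool_factory(workers) as pool:
            # chunksize batches the items so per-task hand-off overhead stays small
            for _ in pool.imap_unordered(task, items, chunksize=16):
                pass
        timings[workers] = time.perf_counter() - start
    return timings


def main():
    # Hypothetical location; use a sample that is representative of the real logs.
    sample = glob.glob("logs/*.gz")[:2000]
    if not sample:
        print("no sample files found")
        return
    cores = multiprocessing.cpu_count()
    counts = sorted({1, max(1, cores // 2), cores, cores * 2, cores * 4})
    results = time_worker_counts(count_lines, sample, counts)
    for workers, seconds in sorted(results.items()):
        print(f"{workers:4d} workers: {seconds:.2f}s")


if __name__ == "__main__":
    main()
```

Run it on each target machine, since the best pool size may differ between them. Passing `multiprocessing.pool.ThreadPool` as `pool_factory` times a thread pool on the same workload, which is one way to check the threads question for your own data.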

Also, is there any benefit in using threads rather than multiprocessing?

Rarely.

Threads don't help much. Multiple threads doing I/O will be locked up waiting while the process (as a whole) waits for the O/S to finish the I/O request.

Your operating system does a very, very good job of scheduling processes. When you have I/O-intensive operations, you really want multiple processes.

