linux - High Kernel CPU when running multiple Python programs -
I developed a Python program that does heavy numerical calculations. I run it on a Linux machine with 32 Xeon CPUs, 64 GB of RAM, and Ubuntu 14.04 64-bit. I launch multiple Python instances with different model parameters in parallel, using multiple processes so I don't have to worry about the global interpreter lock (GIL). When I monitor the CPU utilization using htop, I see that all cores are used, but most of the time by the kernel. Generally, the kernel time is more than twice the user time. I'm afraid there is a lot of overhead going on at the system level, but I'm not able to find the cause of it.
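For reference, the instances are launched roughly like this (a simplified sketch; model.py and the parameter values are placeholders for the real model code):

    import subprocess

    # one independent interpreter per parameter set, so the GIL is never shared
    params = ["0.1", "0.2", "0.5", "1.0"]   # placeholder model parameters
    procs = [subprocess.Popen(["python3", "model.py", p]) for p in params]
    for proc in procs:
        proc.wait()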
How can one reduce the high kernel CPU usage?
Here are some observations I made:
- The effect appears independently of whether I run 10 jobs or 50. If there are fewer jobs than cores, not all cores are used, but the ones that are used still show high CPU usage by the kernel.
- I implemented the inner loop using numba, but the problem is not related to this, since removing the numba part does not resolve the problem.
- I thought it might be related to using Python 2, since a similar problem is mentioned in another question, but switching from Python 2 to Python 3 did not change much.
- I measured the total number of context switches performed by the OS, which is about 10,000 per second. I'm not sure whether this is a large number.
- I tried increasing the Python time slices by setting sys.setcheckinterval(10000) (for Python 2) and sys.setswitchinterval(10) (for Python 3); neither helped (see the snippet after this list).
- I tried influencing the task scheduler by running schedtool -B PID, but that didn't help either.
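For completeness, the interpreter tweak mentioned in the list was applied like this (a minimal sketch using the exact values I tried):

    import sys

    if sys.version_info[0] >= 3:
        sys.setswitchinterval(10)      # Python 3: seconds between GIL switch requests
    else:
        sys.setcheckinterval(10000)    # Python 2: bytecode instructions between checks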
EDIT: Here is a screenshot of htop:
[htop screenshot: all 32 cores busy, most of the load shown as kernel time]
I ran perf record -a -g and created a report with perf report -g graph:
Samples: 1M of event 'cycles', Event count (approx.): 1114297095227
-  95.25%  python3  [kernel.kallsyms]                           [k] _raw_spin_lock_irqsave
   - _raw_spin_lock_irqsave
      - 95.01% extract_buf
           extract_entropy_user
           urandom_read
           vfs_read
           sys_read
           system_call_fastpath
           __GI___libc_read
-   2.06%  python3  [kernel.kallsyms]                           [k] sha_transform
   - sha_transform
      - 2.06% extract_buf
           extract_entropy_user
           urandom_read
           vfs_read
           sys_read
           system_call_fastpath
           __GI___libc_read
-   0.74%  python3  [kernel.kallsyms]                           [k] _mix_pool_bytes
   - _mix_pool_bytes
      - 0.74% __mix_pool_bytes
           extract_buf
           extract_entropy_user
           urandom_read
           vfs_read
           sys_read
           system_call_fastpath
           __GI___libc_read
    0.44%  python3  [kernel.kallsyms]                           [k] extract_buf
    0.15%  python3  python3.4                                   [.] 0x000000000004b055
    0.10%  python3  [kernel.kallsyms]                           [k] memset
    0.09%  python3  [kernel.kallsyms]                           [k] copy_user_generic_string
    0.07%  python3  multiarray.cpython-34m-x86_64-linux-gnu.so  [.] 0x00000000000b4134
    0.06%  python3  [kernel.kallsyms]                           [k] _raw_spin_unlock_irqrestore
    0.06%  python3  python3.4                                   [.] PyEval_EvalFrameEx
It seems as if most of the time is spent calling _raw_spin_lock_irqsave. I have no idea what that means, though.
If the problem exists in the kernel, you should narrow it down using a profiler such as OProfile or perf.
I.e. run perf record -a -g, then read the profiling data saved in perf.data using perf report. See also: Linux perf: how to interpret and find hotspots.
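A typical session looks like this (the 30-second duration is illustrative; perf record writes its samples to perf.data in the current directory):

    # sample the whole system, with call graphs, for 30 seconds
    $ perf record -a -g -- sleep 30
    # browse the recorded samples as a call graph
    $ perf report -g graph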
In your case the high CPU usage is caused by competition for /dev/urandom -- it only allows one thread to read from it, but multiple Python processes are doing so. The Python module random only uses it for initialization. I.e.:
    $ strace python -c 'import random
    while True:
        random.random()'
    open("/dev/urandom", O_RDONLY)     = 4
    read(4, "\16\36\366\36}"..., 2500) = 2500
    close(4)                               <--- /dev/urandom is closed
You may also explicitly ask for /dev/urandom by using os.urandom or the SystemRandom class, so check your code that deals with random numbers.
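For example, if the hot loop draws from the kernel's entropy pool on every call (as random.SystemRandom does via os.urandom), a minimal workaround is to touch /dev/urandom once per process for a seed and do the bulk of the draws in userspace; this Python 3 sketch uses illustrative names:

    import os
    import random

    # One read from /dev/urandom per process, only for seeding.
    seed = int.from_bytes(os.urandom(8), "big")

    # Userspace Mersenne Twister: subsequent draws make no syscalls,
    # so parallel processes no longer serialize on the entropy-pool lock.
    rng = random.Random(seed)
    samples = [rng.random() for _ in range(1000000)]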