2007-02-15

dtrace & java: part 1

We have an application written in Java which acts as a sort of proxy between mobile devices and our database. Every time an end user commits a transaction on a mobile device, it is sent to the application which, on behalf of the device, commits it in the database. It works very well and doesn't overload the server. In fact the CPU is idle almost all the time (sar output; the columns are %usr, %sys, %wio and %idle):

...
08:00:00 1 0 0 98
08:10:00 3 1 0 97
08:20:00 2 1 0 97
08:30:00 3 1 0 97
08:40:00 3 1 0 97
08:50:00 2 1 0 97
09:00:00 2 1 0 97
09:10:00 2 1 0 97
09:20:00 2 1 0 97
09:30:00 2 1 0 98
09:40:00 2 0 0 98
09:50:00 2 0 0 98
10:00:00 1 0 0 98
10:10:00 2 0 0 98
10:20:00 1 0 0 98
10:30:01 1 0 0 98
10:40:00 1 0 0 99
10:50:00 1 0 0 99
11:00:00 1 0 0 99
11:10:00 1 0 0 99
11:20:00 1 0 0 99
11:30:00 1 0 0 99
11:40:00 1 0 0 99
11:50:00 1 0 0 99
12:00:00 1 0 0 99
...
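For reference, the utilization figures above come from sar's CPU report; an invocation like the following would produce them (the time window here is illustrative):

```shell
# Report CPU utilization (%usr, %sys, %wio, %idle) from today's
# system activity data; -s/-e narrow the reporting window
sar -u -s 08:00 -e 12:00
```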
This workload looks the same for many days at a time.
But from time to time I noticed strange behaviour: the CPU becomes very busy (and, again, stays that way for many days):
...
06:10:00 54 1 0 45
06:20:00 54 2 0 45
06:30:00 53 1 0 46
06:40:00 52 1 0 47
06:50:00 52 1 0 47
07:00:00 52 1 0 47
07:10:00 52 1 0 47
07:20:00 52 1 0 47
07:30:00 52 1 0 47
07:40:00 52 1 0 46
07:50:00 52 1 0 46
08:00:00 53 1 0 46

...
As you can see, even at night! I didn't expect that, and the type of workload doesn't justify such behaviour (I have also installed the latest Java patch: 1.5.0_11).
I was curious and ran prstat to see which application might be doing that (I was almost sure the Java app was guilty, but wanted to be certain):
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
4498 przemol 964M 804M cpu0 0 10 62:18:01 51% java/81
26850 root 7600K 3796K sleep 59 0 0:00:00 0.1% sshd/1
26861 root 1076K 900K sleep 59 0 0:00:00 0.0% sh/1
26866 root 2832K 2608K cpu1 59 0 0:00:00 0.0% prstat/1
136 daemon 3932K 1704K sleep 59 0 0:00:01 0.0% kcfd/3
26865 root 2452K 1680K sleep 49 0 0:00:00 0.0% bash/1
26853 przemol 7372K 2000K sleep 59 0 0:00:00 0.0% sshd/1
26855 przemol 1160K 932K sleep 59 0 0:00:00 0.0% ksh/1

...
Indeed, the application used most of the CPU time. This time prstat -mL -p 4498, to get per-thread microstate accounting:
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
4498 przemol 48 0.0 0.0 0.0 0.0 21 31 0.6 77 165 77 1 java/5897
4498 przemol 31 0.0 0.0 0.0 0.0 20 49 0.4 103 88 96 4 java/6948
4498 przemol 19 0.0 0.0 0.0 0.0 80 0.5 0.0 40 58 178 0 java/4
4498 przemol 1.5 0.2 0.0 0.0 0.0 0.0 98 0.0 93 4 304 0 java/7629
4498 przemol 1.0 0.1 0.0 0.0 0.0 99 0.0 0.1 2K 3 2K 0 java/3
4498 przemol 0.9 0.1 0.0 0.0 0.2 99 0.0 0.0 511 1 521 0 java/2
4498 przemol 0.0 0.0 0.0 0.0 0.0 0.9 99 0.0 152 1 89 0 java/16
4498 przemol 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 3 0 51 0 java/7634
4498 przemol 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 24 0 35 0 java/7
4498 przemol 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 3 0 3 0 java/17
[...]

What caught my eye was the LCK column: the application spent a significant amount of time waiting for user-level locks. I remembered that in Solaris 10 plockstat can reveal user-level lock statistics:
bash-3.00# plockstat -n 20 -A -e 600 -p `pgrep java`

[...]

Count nsec Lock Caller
-------------------------------------------------------------------------------
574 30515 libc.so.1`libc_malloc_lock libjvm.so`__1cCosEfree6Fpv_v_+0x18

[...]

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
1 16340 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 15707 libdvmti.so`.XAKUgFjQxPCE3Rm.data libdvmti.so`class_create_entry+0xf7
144 15211 0xd02b8900 libjvm.so`__1cCosTget_native_priority6F
690 14849 0xd02ac1c0 libjvm.so`__1cCosTget_native_priority6F
1 14822 0xd02b4000 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 14364 libc.so.1`libc_malloc_lock libjava_crw_demo.so`method_write_all+0x
543 14317 0xd02b8240 libjvm.so`__1cCosTget_native_priority6F
10107 14224 libc.so.1`libc_malloc_lock libnet.so`Java_java_net_SocketInputStre
1 13958 0xd02b42c0 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13940 0xd02b4180 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13809 0xd02b4280 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13804 libdvmti.so`.XAKUgFjPxPCEnMm.data+0x8 libdvmti.so`cbClassFileLoadHook+0x3e
1 13800 0xd02b3f80 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13697 0xd02b40c0 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13679 0xd02b4080 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13677 0xd02b4100 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13661 0xd02b4240 libjvm.so`__1cCosPpd_start_thread6FpnGT
6 13655 libc.so.1`_uberdata libc.so.1`_thrp_create+0xc8
1 13644 0xd02b4200 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 13601 libc.so.1`libc_malloc_lock libnsl.so.1`free_entry+0x1d

[...]

Mutex spin

Count nsec Lock Caller
-------------------------------------------------------------------------------
1 13180 libc.so.1`libc_malloc_lock libjava.so`JNU_GetStringPlatformChars+0
1 13053 libc.so.1`libc_malloc_lock libc.so.1`dgettext+0x5b
493 12796 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
408 12779 libc.so.1`_uberdata+0x3c0 0xffffffff
10107 12549 libc.so.1`libc_malloc_lock libnet.so`Java_java_net_SocketInputStre
18 12461 libnsl.so.1`netpp_mutex libnsl.so.1`setnetconfig+0x3e
1 12429 0xd02b3fc0 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 12388 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 12255 libc.so.1`libc_malloc_lock libjava_crw_demo.so`java_crw_demo+0x437
1 12142 libc.so.1`_uberdata+0x440 0xffffffff

Mutex spin

Count nsec Lock Caller
-------------------------------------------------------------------------------
2 12115 libc.so.1`libc_malloc_lock libjava.so`JNU_ReleaseStringPlatformCha
1 12038 libc.so.1`libc_malloc_lock libnsl.so.1`netconfig_dup+0x1e

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
440 11918 libc.so.1`_door_lock libnsl.so.1`_door_gethostbyname_r+0x7e

Mutex spin

Count nsec Lock Caller
-------------------------------------------------------------------------------
1 11892 libc.so.1`libc_malloc_lock libmanagement.so`Java_com_sun_managemen

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
1 11887 0xd02b4300 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 11830 0xd02b3f40 libjvm.so`__1cCosPpd_start_thread6FpnGT
1 11789 0xd02b4000 libc.so.1`thr_create+0x42
1 11403 0xd02b41c0 libjvm.so`__1cCosPpd_start_thread6FpnGT
2 10895 libc.so.1`_uberdata+0x40 0xd0582481
1 10830 0xd02b3fc0 0
252 10440 libc.so.1`libc_malloc_lock libjava.so`JNU_GetStringPlatformChars+0
18 10396 libnsl.so.1`netpp_mutex libnsl.so.1`getnetconfigent+0x42

[...]

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
2 9967 libc.so.1`libc_malloc_lock libjvm.so`__1cCosEfree6Fpv_v_+0x18
238 9828 libc.so.1`_door_lock libnsl.so.1`_door_gethostbyaddr_r+0x73
220 9575 libnsl.so.1`netpp_mutex libnsl.so.1`getnetconfigent+0x42
1 9491 libc.so.1`_uberdata+0xac0 libjvm.so`__1cDhpiEread6FipvI_I_+0xa8

Mutex spin

Count nsec Lock Caller
-------------------------------------------------------------------------------
2 9386 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
1 9351 libc.so.1`libc_malloc_lock libdvmti.so`class_create_entry+0x107
2 9240 libc.so.1`libc_malloc_lock libc.so.1`dgettext+0x5b
2 9167 libc.so.1`libc_malloc_lock libjava_crw_demo.so`method_init+0x89
148 8997 libc.so.1`libc_malloc_lock libmanagement.so`Java_com_sun_managemen
309 8928 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 8862 libc.so.1`libc_malloc_lock libnsl.so.1`free_entry+0x1d
1 8834 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 8765 0xd02b39c0 0
3536 8747 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 8727 libc.so.1`libc_malloc_lock libnsl.so.1`free_entry+0x1d
1 8496 0x8071a50 libdvmti.so`hash_lookup+0x63
1 8489 0xd02b3fc0 libjvm.so`__1cG_start6Fpv_0_+0x6a
1 8380 0xd02b3ec0 libjvm.so`__1cG_start6Fpv_0_+0x6a
1 8380 0xd02b3d80 0
148 8357 libc.so.1`libc_malloc_lock libmanagement.so`Java_com_sun_managemen
1 8334 0xd02b4140 libjvm.so`__1cG_start6Fpv_0_+0x6a
1 8321 0xd02b4200 libjvm.so`__1cG_start6Fpv_0_+0x6a
1 8308 0xd02b40c0 libc.so.1`thr_create+0x42
1 8305 0xd02b4240 libjvm.so`__1cG_start6Fpv_0_+0x6a
1 8280 0xd02b4080 libjvm.so`__1cG_start6Fpv_0_+0x6a

Mutex spin

Count nsec Lock Caller
-------------------------------------------------------------------------------
3 7443 libc.so.1`libc_malloc_lock libjava.so`JNU_ReleaseStringPlatformCha

Mutex hold

Count nsec Lock Caller
-------------------------------------------------------------------------------
2 7302 libc.so.1`_uberdata+0x40 0xd0582481
148 7291 libc.so.1`libc_malloc_lock libmanagement.so`Java_com_sun_managemen
2 7248 libc.so.1`_uberdata+0xe00 libc.so.1`setlocale+0x2b
1 7200 0xd02b2a80 0
2 7119 libc.so.1`libc_malloc_lock libjava_crw_demo.so`method_init+0x133
1 7101 0xd02b4040 0
91058 7057 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 6940 0x8071a50 libdvmti.so`hash_add+0x70
1 6935 libc.so.1`libc_malloc_lock libdvmti.so`class_set_method_count+0x6e
148 6922 libc.so.1`libc_malloc_lock libmanagement.so`Java_com_sun_managemen
19 6894 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
2 6888 libc.so.1`libc_malloc_lock libc.so.1`dgettext+0x5b
17 6846 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
8 6831 libc.so.1`_uberdata+0x3c0 0xc9c0f8b3
309 6818 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x28
1 6809 0xd02b3e80 libc.so.1`_thrp_create+0xc8
1 6772 0xd02ac4c0 libjvm.so`__1cCosTget_native_priority6F
1 6742 0xd02b3680 0
1 6735 0xd02b3fc0 libc.so.1`thr_create+0x42
11 6734 libc.so.1`_uberdata+0x3c0 0

(BTW: I have installed the Java DTrace probes from https://solaris10-dtrace-vm-agents.dev.java.net/, which explains the libdvmti entries in the 'Caller' column.)
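If plockstat's fixed report format gets in the way, the same data is available directly from the underlying plockstat provider. A sketch of a minimal D script (probe and argument layout as I understand them; not verified against this exact JVM):

```
/* Sketch: count how often each mutex blocks a thread in the target
   process. Run as: dtrace -s blocks.d -p 4498 */
plockstat$target:::mutex-block
{
    /* arg0 is the mutex address; usym() resolves it to a symbol
       such as libc.so.1`libc_malloc_lock when one exists */
    @blocks[usym(arg0)] = count();
}
```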

And the winner is: libc_malloc_lock! As Adam Leventhal said: "In multi-threaded apps, lock contention can be huge performance killer" (http://blogs.sun.com/ahl/entry/baby_s_first_dtrace). Because the default malloc on Solaris performs well only in single-threaded applications, tomorrow I will try to run the Java program with another memory allocator, mtmalloc, which was designed for multi-threaded apps (you will find a comparison of the Solaris memory allocators at http://developers.sun.com/solaris/articles/multiproc/multiproc.html):

LD_PRELOAD=/usr/lib/libmtmalloc.so; export LD_PRELOAD

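The full plan is to restart the JVM with the alternate allocator preloaded and then double-check that the library was really picked up (the start command and paths below are illustrative, not our actual ones):

```shell
# Preload the multi-threaded allocator before starting the JVM
LD_PRELOAD=/usr/lib/libmtmalloc.so
export LD_PRELOAD
/usr/bin/java -jar proxy.jar &      # hypothetical start command

# Verify that the new process actually mapped libmtmalloc
pldd `pgrep -n java` | grep mtmalloc
```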
Because the problem with locks shows up only after a few days of running, I will have to wait again.
But two questions remain:
  • which Java operations are causing such heavy memory pressure (frequent allocations/deallocations)?
  • is it a bug in the JVM, or is the JVM genuinely doing hard work driven by the Java app?
Since the Java internals are beyond my knowledge, I have asked our Java guru for help.
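One way to start on the first question would be to attribute the malloc() traffic to stacks with DTrace. A sketch of such a script (the stack depth, interval and truncation count are arbitrary choices of mine):

```
#!/usr/sbin/dtrace -s
/* Sketch: count malloc() calls in the traced JVM and show the
   hottest user stacks. Run as: ./allochot.d -p 4498 */

pid$target:libc.so.1:malloc:entry
{
    @stacks[ustack(12)] = count();
}

profile:::tick-60s
{
    /* keep only the 10 hottest allocation stacks, then exit;
       the aggregation is printed automatically on exit */
    trunc(@stacks, 10);
    exit(0);
}
```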

Can you imagine doing the same analysis without DTrace?


