2012-09-12

DTrace & ioctl under Solaris


Just recently we had some higher then usually syscalls on our Solaris server. Even though people don't complain it is good to know why it is higher. I have used the following DTrace script:



#!/usr/sbin/dtrace -s
syscall:::entry
{
        self->start = vtimestamp;
}
syscall:::return
/(int)arg0 != -1 && self->start/
{
        this->time = vtimestamp - self->start;
        @Time[probefunc] = sum(this->time);
        @Time["TOTAL:"] = sum(this->time);
        self->start = 0;
}
profile:::tick-5s
{
        trunc (@Time, 20);
        normalize (@Time, 1000000);
        printa(@Time);
        clear(@Time);
        exit (0);
}


The following part:

profile:::tick-5s
{
        ...
        exit (0);
}

forces the script to finish after 5 seconds. It summarizes time spent in each successful syscall:

dtrace: script '/tmp/syscall-all.d' matched 475 probes
CPU     ID                    FUNCTION:NAME
 36  80431                         :tick-5s
  pread                                                            13
  stat64                                                           13
  getdents                                                         14
  semsys                                                           16
  pwrite                                                           23
  pollsys                                                          48
  times                                                            60
  resolvepath                                                      66
  close                                                            66
  stat                                                             79
  mmap                                                             98
  write                                                           111
  read                                                            120
  open                                                            155
  exece                                                           156
  munmap                                                          159
  memcntl                                                         190
  fork1                                                           357
  ioctl                                                          2565
  TOTAL:                                                         4420

As we can see the the most of syscall time is spent in ioctl. Since the syscalls are used mostly on behalf of applications let's see which software  is reponsible for such behaviour:

dtrace -n 'syscall::ioctl: { @[execname] = count();}' -n tick-5sec'{trunc(@,20); printa(@); trunc(@); exit(0);}'
dtrace: description 'syscall::ioctl: ' matched 2 probes
dtrace: description 'tick-5sec' matched 1 probe
CPU     ID                    FUNCTION:NAME
 64  79810                       :tick-5sec
  vxstat                                                           40
  sqlplus                                                          42
  sh                                                               56
  cron                                                             76
  sshd                                                             92
  tnslsnr                                                         108
  emagent                                                         110
  init                                                            146
  top                                                             203
  tqbnp                                                           660
  vxconfigd                                                       668
  nmupm                                                           730
  tqbsp                                                           778
  vmstat                                                         2660
  dtrace                                                         6107
  oracle                                                        20495

It seems that Oracle is responsible for most of ioctls.
Using DTrace ustack() action we can trace the user thread's stack. Let's see why it is calling ioctl:

dtrace -n 'syscall::ioctl: / execname == "oracle" /{ @[ustack()] = count();}' -n tick-5sec'{trunc(@,5); printa(@); trunc(@); exit(0);}'
dtrace: description 'syscall::ioctl: ' matched 2 probes
dtrace: description 'tick-5sec' matched 1 probe
CPU     ID                    FUNCTION:NAME
 34  79810                       :tick-5sec

              libc.so.1`ioctl+0x8
              libodm.so.1`odm_io+0x2d4
              oracle`ksfd_odmwat+0x2ac
              oracle`ksfdwtio+0x464
              oracle`ksfdwat1+0x9c
              oracle`ksfdrwat0+0x230
              oracle`kcrfw_post+0x1e0
              oracle`kcrfw_redo_write+0xba0
              oracle`ksbabs+0x320
              oracle`ksbrdp+0x3f8
              oracle`opirip+0x344
              oracle`opidrv+0x4b0
              oracle`sou2o+0x50
              oracle`opimai_real+0x10c
              oracle`main+0x98
              oracle`_start+0x17c
              132

              0xffffffff7c1dbf30
              0xffffffff7b50400c
              0x1006bfddc
              0x1006ae5d8
              0x1006adfe0
              0x102ac5620
              0x10102e618
              0x100fcf260
              0x1013ac3c8
              0x10152230c
              0x10154687c
              0x102d56034
              0x101f9d218
              0x103216584
              0x101fa0f28
              0x1034cfb94
              0x1034d8ae4
              0x1012e4940
              0x1025c07e4
              0x1041acf48
              140

              libc.so.1`ioctl+0x8
              libodm.so.1`odm_io+0x2d4
              oracle`ksfd_odmio+0x6dc
              oracle`ksfdread1+0x5d8
              oracle`ksfdread+0x20
              oracle`kcfrbd+0x360
              oracle`kcbzib+0x7f8
              oracle`kcbgtcr+0x1ad0
              oracle`ktrget+0x280
              oracle`kdiixs1+0x3d8
              oracle`qerixtFetch+0x1c0
              oracle`qerjoFetch+0x3e8
              oracle`qertbFetchByRowID+0x218
              oracle`rwsfcd+0x78
              oracle`rwsfcd+0x78
              oracle`qerhjFetch+0x1b0
              oracle`qergsFetch+0x194
              oracle`opifch2+0x2150
              oracle`kpoal8+0xc8c
              oracle`opiodr+0x600
              194

              libc.so.1`ioctl+0x8
              libodm.so.1`odm_io+0x2d4
              oracle`ksfd_odmio+0x6dc
              oracle`ksfdread1+0x5d8
              oracle`ksfdread+0x20
              oracle`kcfrbd+0x360
              oracle`kcbzib+0x7f8
              oracle`kcbgtcr+0x1ad0
              oracle`ktrget+0x280
              oracle`kdsgrp+0x198
              oracle`kdsfbr+0xec
              oracle`qertbFetchByRowID+0x3bc
              oracle`qergsFetch+0x194
              oracle`opifch2+0x21c0
              oracle`opiefn0+0x1d8
              oracle`opipls+0xe50
              oracle`opiodr+0x600
              oracle`rpidrus+0xc4
              oracle`skgmstack+0xa8
              oracle`rpidru+0xb8
              228

              libc.so.1`ioctl+0x8
              libodm.so.1`odm_io+0x2d4
              oracle`ksfd_odmio+0x6dc
              oracle`ksfdread1+0x5d8
              oracle`ksfdread+0x20
              oracle`kcfrbd+0x360
              oracle`kcbzib+0x7f8
              oracle`kcbgtcr+0x1ad0
              oracle`ktrget+0x280
              oracle`kdsgrp+0x198
              oracle`kdsfbr+0xec
              oracle`qertbFetchByRowID+0x3bc
              oracle`rwsfcd+0x78
              oracle`rwsfcd+0x78
              oracle`qerhjFetch+0x1b0
              oracle`qergsFetch+0x194
              oracle`opifch2+0x2150
              oracle`kpoal8+0xc8c
              oracle`opiodr+0x600
              oracle`ttcpip+0x4a4
              333

As we can see the "fetch by rowid" is the source of Oracle's ioctls. Since it occurs when rows are fetched using a ROWID (usually recovered from an index) we shouldn't worry too much ;-)