2011-07-21

Investigating Memory Leaks with Dtrace

Solaris and fast crash dump

Just recently we had a Solaris panic.
Yes - it happens ...

Since this server has a lot of memory, I waited quite a long time (a few hours ...) for the crash dump to finish. Of course, after the panic I looked for information on how to speed up the crash dump. The first page I found was
Oracle Solaris 10 9/10 What's New (scroll down to "Fast Crash Dump"). Since Solaris 10 9/10, crash dump performance has been improved a lot
by utilizing lightly used CPUs on large systems.

But just recently there was a new blog post by Steve Sistare about Fast Crash Dump. It describes the internals (I love it :-) ) of the whole optimized process.

There is also an additional document on the Oracle support page: How to Use the Oracle Solaris Fast Crash Dump Feature

2011-07-18

When was a Red Hat / CentOS server installed?

Just recently I was looking for the date on which a particular server was installed. We can check it in two ways (both using rpm):

First way: rpm -qi basesystem

Name        : basesystem                   Relocations: (not relocatable)
Version     : 8.0                               Vendor: Red Hat, Inc.
Release     : 5.1.1                         Build Date: Wed 12 Jul 2006 09:08:04 AM CEST
Install Date: Thu 05 Aug 2010 08:42:38 PM CEST      Build Host: ls20-bc2-14.build.redhat.com
Group       : System Environment/Base       Source RPM: basesystem-8.0-5.1.1.src.rpm
Size        : 0                                License: public domain
Signature   : DSA/SHA1, Thu 18 Jan 2007 04:33:57 PM CET, Key ID 5326810137017186
Packager    : Red Hat, Inc. 
Summary     : The skeleton package which defines a simple Red Hat Linux system.
Description :
Basesystem defines the components of a basic Red Hat Linux system (for
example, the package installation order to use during bootstrapping).
Basesystem should be the first package installed on a system, and it
should never be removed.

As you can see in the "Install Date" line, this system was installed on Thu 05 Aug 2010.

Second way: rpm -qa --last | tail

gnome-mime-data-2.4.2-3.1                     Thu 05 Aug 2010 08:42:51 PM CEST
gnome-backgrounds-2.15.92-1.fc6               Thu 05 Aug 2010 08:42:51 PM CEST
emacs-leim-21.4-20.el5                        Thu 05 Aug 2010 08:42:51 PM CEST
desktop-backgrounds-basic-2.0-37              Thu 05 Aug 2010 08:42:51 PM CEST
comps-extras-11.1-1.1                         Thu 05 Aug 2010 08:42:51 PM CEST
mailcap-2.1.23-1.fc6                          Thu 05 Aug 2010 08:42:48 PM CEST
termcap-5.5-1.20060701.1                      Thu 05 Aug 2010 08:42:38 PM CEST
cracklib-dicts-2.8.9-3.3                      Thu 05 Aug 2010 08:42:38 PM CEST
basesystem-8.0-5.1.1                          Thu 05 Aug 2010 08:42:38 PM CEST
setup-2.5.58-7.el5                            Thu 05 Aug 2010 08:42:37 PM CEST
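The second column shows the exact date and time of installation. Because --last lists the most recently installed packages first, tail shows the oldest ones, i.e. those installed together with the system.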
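
If you only need the date itself (e.g. in a script), the "Install Date" field can be cut out of the rpm -qi output. This is just a minimal sketch: install_date_from_rpm_qi is a helper name I made up, and it assumes the field is labelled exactly "Install Date:" as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm): read `rpm -qi` output on stdin
# and print only the value of the "Install Date" field.
install_date_from_rpm_qi() {
    # 1st sed: keep only the "Install Date:" line and drop the label.
    # 2nd sed: drop the "Build Host: ..." part that shares the line in
    #          this rpm version, then strip the trailing spaces.
    sed -n 's/^Install Date: *//p' |
        sed -e 's/Build Host:.*$//' -e 's/ *$//'
}
```

Usage: rpm -qi basesystem | install_date_from_rpm_qi. rpm can also print the value directly with a query format, e.g. rpm -q --qf '%{INSTALLTIME:date}\n' basesystem, which avoids parsing altogether.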

2011-06-03

sftp - how to log file transfers

Just recently I needed to log file transfers made over sftp. Here is a short description of how to achieve it (I am using RedHat/CentOS in my example). Edit the file

/etc/ssh/sshd_config


and change line:


Subsystem       sftp    /usr/libexec/openssh/sftp-server

to

Subsystem       sftp    /usr/libexec/openssh/sftp-server -l INFO



and then restart the sshd service (on RedHat/CentOS the service is called sshd):

service sshd restart


Having done that, you can watch transferred files in /var/log/messages (they are identified by the string "sftp-server").
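
To pull only the sftp entries out of the log, a simple filter is enough. This is a small sketch, assuming the log lines contain the string "sftp-server" as described above. sftp_transfers is a helper name of my own, and the default path is just the one from this post.

```shell
#!/bin/sh
# Hypothetical helper: print the lines logged by sftp-server.
# $1 is the log file to scan; it defaults to /var/log/messages.
sftp_transfers() {
    grep 'sftp-server' "${1:-/var/log/messages}"
}
```

Usage: sftp_transfers, or sftp_transfers /var/log/messages.1 for a rotated log. To follow transfers live, tail -f /var/log/messages | grep sftp-server does the same job.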

2011-04-04

Solaris 10 zones - increase timeout while shutting down

Recently I encountered a problem while shutting down a Solaris 10 zone running an application that takes quite long to stop - a few times the application did not have enough time to shut down. We can give a zone more time by changing the properties of the zones service using svccfg:


# svccfg
svc:> select system/zones
svc:/system/zones> listprop stop/timeout_seconds
stop/timeout_seconds  count    100
svc:/system/zones> setprop stop/timeout_seconds=900
svc:/system/zones> listprop stop/timeout_seconds
stop/timeout_seconds  count    900
svc:/system/zones>

Don't forget about the last important (and necessary !!!) step:

# svcadm refresh svc:/system/zones 

That's it - now the zones have 15 minutes to shut down before they are halted.
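
The same change can probably be scripted without the interactive session. The sketch below only prints the commands (a dry run), so nothing is changed until you run them yourself. zone_timeout_cmds is a name I made up, and I am assuming that the one-shot svccfg -s form behaves like the interactive session shown above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the commands that would set the
# zones service stop timeout. $1 is the timeout in minutes.
zone_timeout_cmds() {
    minutes=$1
    seconds=$((minutes * 60))   # the property is stored in seconds
    echo "svccfg -s system/zones setprop stop/timeout_seconds=$seconds"
    echo "svcadm refresh svc:/system/zones"
}
```

Usage: zone_timeout_cmds 15 prints the two commands for a 900 second timeout. Review them and then run them as root in the global zone.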

2011-02-10

Netbackup - how to customize job report (CLI)

Instead of using the GUI to list all jobs, we can use the CLI command /usr/openv/netbackup/bin/admincmd/bpdbjobs, e.g.:

bash-3.00# /usr/openv/netbackup/bin/admincmd/bpdbjobs -report|head -3
JobID           Type State Status                 Policy    Schedule   Client Dest Media Svr Active PID FATPipe
42857         Backup  Done      0        server1-db      Differ   server1         server1      26952      No
42856         Backup  Done      0        server2-app      Differ   server2         server2      26600      No
42855         Backup  Done      0        server3-all      Differ   server3         server3      26074      No
This allows you to write your own scripts that analyze what is going on with all the NBU jobs.
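
As an example of such a script, here is a minimal sketch that lists jobs which ended with a non-zero status. It assumes the default column layout shown above (Status is the 4th whitespace-separated field and Policy the 5th), and failed_jobs is just a name I picked. With a customized layout the field numbers have to be adjusted.

```shell
#!/bin/sh
# Hypothetical helper: read `bpdbjobs -report` output on stdin and print
# "JobID Policy Status" for every job whose status is not 0.
failed_jobs() {
    # NR > 1 skips the header line; the regex skips rows where the
    # 4th field is not a number (e.g. jobs without a status yet).
    awk 'NR > 1 && $4 ~ /^[0-9]+$/ && $4 != 0 { print $1, $5, $4 }'
}
```

Usage: /usr/openv/netbackup/bin/admincmd/bpdbjobs -report | failed_jobs
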
But you can also customize the bpdbjobs output. If you want to add or remove particular columns, edit the /usr/openv/netbackup/bp.conf file and add the needed fields. In my case it was:

BPDBJOBS_COLDEFS = JOBID 7 true
BPDBJOBS_COLDEFS = PARENTJOBID 13 true
BPDBJOBS_COLDEFS = TYPE 7 true
BPDBJOBS_COLDEFS = STATE 5 true
BPDBJOBS_COLDEFS = STATUS 6 true
BPDBJOBS_COLDEFS = POLICY 8 true
BPDBJOBS_COLDEFS = SCHEDULE 8 true
BPDBJOBS_COLDEFS = CLIENT 8 true
BPDBJOBS_COLDEFS = DSTMEDIA_SERVER 14 true
BPDBJOBS_COLDEFS = ELAPSED 10 true
BPDBJOBS_COLDEFS = FILES 8 true
BPDBJOBS_COLDEFS = KBPERSEC 10 true
BPDBJOBS_COLDEFS = KILOBYTES 12 true
BPDBJOBS_COLDEFS = RETENTION 10 true
BPDBJOBS_COLDEFS = LASTBACKUP 18 true

This way you can display whatever you want to:

bash-3.00# /usr/openv/netbackup/bin/admincmd/bpdbjobs -report|head -3
  JobID  Parent JobID           Type State Status                 Policy    Schedule   Client Dest Media Svr    Elapsed    Files KB Per Sec    Kilobytes  Retention        Last Backup
  42857         42834         Backup  Done      0        server1      Differ   server1         server1  000:03:23       21      80000          160    2 weeks  02/09/11 06:07:15
  42856         42835         Backup  Done      0        server2      Differ   server2         server2  000:00:56      217        756        20544    2 weeks  02/09/11 06:10:34