Thursday, June 23, 2011

What's new in RHEL6

Details are provided at http://www.europe.redhat.com/products/rhel/server/details/

Some important ones are:
1. Red Hat Enterprise Linux 6 supports more sockets, more cores, more threads, and more memory.
2. Memory pages with errors can be declared as "poisoned", and will be avoided.
3. The new default file system, ext4, is faster, more robust, and scales to 16TB.
4. Red Hat cluster nodes can re-enable themselves after a failure, without administrative intervention, using unfencing.
5. iSCSI partitions may be used as either root or boot filesystems.
6. The new System Security Services Daemon (SSSD) provides centralized access to identity and authentication resources, and enables caching and offline support.
7. Dracut has been introduced as a replacement for mkinitrd.
8. Xen has been replaced with KVM.

Wednesday, June 22, 2011

Live upgrade of ZFS root file system

1.       Create a new boot environment to patch or upgrade
# lucreate -n newBE

2.       Upgrade the new Boot environment
# luupgrade -u -n newBE -s /cdrom

3.       Activate the new Boot Environment
# luactivate newBE

4.       Boot from new boot environment
# init 6
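
The four steps above can be collected into a small wrapper. This is only a sketch: the names lu_plan, run and DRY_RUN are my own and are not part of Live Upgrade, and newBE and /cdrom are the same placeholders used in the steps. By default it only prints the commands, so nothing is changed until DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: print (or run) the Live Upgrade steps for a ZFS root.
# lu_plan, run and DRY_RUN are hypothetical names, not part of Live Upgrade.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@" || { echo "failed: $*" >&2; return 1; }
    fi
}

lu_plan() {
    be=$1      # name of the new boot environment
    media=$2   # path to the OS image
    # Each step runs only if the previous one succeeded.
    run lucreate -n "$be" &&
    run luupgrade -u -n "$be" -s "$media" &&
    run luactivate "$be" &&
    run init 6
}

lu_plan newBE /cdrom
```

Chaining the steps with && means a failed lucreate or luupgrade stops the run before luactivate and the reboot.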

Live upgrade of Veritas mirrored system

The system had two disks, c1t0d0 and c1t1d0, so first break the mirror and remove disk c1t1d0 from the disk group.

1.       # vxplex -g rootdg dis rootvol-02 swapvol-02 home-02
2.       # vxdg -g rootdg -k rmdisk rootdg02
3.       Create a file /tmp/autoreg containing the entry "auto_reg=disable"
4.       Modify the vxlustart script to disable auto registration by adding -k /tmp/autoreg to the luupgrade command it runs; otherwise the upgrade will fail.
5.       ./vxlustart -u 5.10 -d c1t1d0s2 -g rootdg -s /cdrom
6.       ./vxlufinish -u 5.10 -g <new root disk group name>
7.       shutdown -y -g0 -i6
8.       After a few reboots the system will come up in the new boot environment. Mirror the root disk again and remove the old rootdg.
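
Step 3 can be done with a single printf. A minimal sketch, assuming the file lives at /tmp/autoreg as in step 4; make_autoreg is a name I made up for illustration.

```shell
# Sketch: write the auto-registration file that is passed to luupgrade with -k.
# make_autoreg is a hypothetical helper name; the default path matches step 4.
make_autoreg() {
    file=${1:-/tmp/autoreg}
    printf 'auto_reg=disable\n' > "$file"
}

# Usage: make_autoreg /tmp/autoreg
```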

Create users in different Unix Flavors

In AIX
# mkuser id='<id>' pgrp='<group name>' groups='<secondary group>' su='true' home='/home/mqm' gecos='<Comment>' maxage='0' <user id>

In Solaris/RHEL

# useradd -g <group> -d <home directory> -c <Comment> <user id>
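
If the same account has to be created on several flavors, the choice of command can be keyed on the output of uname -s. The sketch below only prints the command instead of running it. adduser_cmd is a hypothetical helper name, the mqm values are placeholders, and only a subset of the mkuser attributes shown above is used.

```shell
# Sketch: print the user-creation command for a given OS name (as reported by uname -s).
# adduser_cmd is a hypothetical helper; it prints the command and does not run it.
adduser_cmd() {
    os=$1; user=$2; group=$3; home=$4; comment=$5
    case $os in
        AIX)
            printf "mkuser pgrp='%s' home='%s' gecos='%s' %s\n" \
                "$group" "$home" "$comment" "$user"
            ;;
        SunOS|Linux)
            printf "useradd -g %s -d %s -c '%s' %s\n" \
                "$group" "$home" "$comment" "$user"
            ;;
        *)
            echo "unsupported OS: $os" >&2
            return 1
            ;;
    esac
}

# Usage: adduser_cmd "$(uname -s)" mqm mqm /home/mqm 'MQ user'
```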

Friday, March 25, 2011

How to add a Solaris server to Nagios for monitoring

1. Create the nagios user and group

2. Download the Nagios plugins and the nrpe daemon for the appropriate Solaris version from http://www.guntram.de/nagios/

3. Unzip and untar the files under /usr/local/nagios

4. Create an XML manifest to import nrpe as an SMF service, or take one from here:
http://unixjournal.org/tag/nrpe/

5. Import the XML file into SMF
# svccfg import /var/svc/manifest/network/nrpe.xml

6. Keep the service disabled for the time being
# svcs -a|grep nrp
disabled 17:09:11 svc:/network/nrpe:default

7. Create an nrpe method script under /lib/svc/method
# more /lib/svc/method/nrpe
#!/sbin/sh
#

LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
PIDFILE=/var/run/nrpe.pid
NRPE_BIN=/usr/local/nagios/bin/nrpe
CONFIG_FILE=/usr/local/nagios/etc/nrpe.cfg

case "$1" in
# SMF arguments (start and restart [really "refresh"])
'start')
    "$NRPE_BIN" -d -c "$CONFIG_FILE"
    ;;

'restart')
    if [ -f "$PIDFILE" ]; then
        /usr/bin/kill -HUP `/usr/bin/cat "$PIDFILE"`
    fi
    ;;

'stop')
    if [ -f "$PIDFILE" ]; then
        /usr/bin/kill `/usr/bin/cat "$PIDFILE"`
    fi
    ;;
*)
    echo "Usage: $0 { start | stop | restart }"
    exit 1
    ;;
esac

exit $?

8. Start the nrpe service
# svcadm enable nrpe
# svcs -a|grep nrp
online 17:17:31 svc:/network/nrpe:default
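
The service can take a moment to reach the online state, and it drops to maintenance if the method script fails. A small sketch that polls svcs instead of checking by hand; wait_online is a hypothetical helper name and the default of 10 tries is arbitrary.

```shell
# Sketch: poll an SMF service until it is online.
# wait_online is a hypothetical helper name.
# Returns 0 when online, 2 when in maintenance, 1 when the tries run out.
wait_online() {
    fmri=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        state=$(svcs -H -o state "$fmri" 2>/dev/null)
        if [ "$state" = "online" ]; then
            return 0
        fi
        if [ "$state" = "maintenance" ]; then
            return 2
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage: svcadm enable nrpe && wait_online svc:/network/nrpe:default
```

If it returns 2, svcs -x nrpe normally points at the log file that explains the failure.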

9. Now modify the /usr/local/nagios/etc/nrpe.cfg file to define what will be monitored on this server.

command[check_local_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_local_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_local_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_local_procs]=/usr/local/nagios/libexec/check_procs -w 350 -c 400
command[check_zfs_rpool]=/usr/local/nagios/libexec/check_zfs1 rpool 1
command[check_zfs_datapool]=/usr/local/nagios/libexec/check_zfs1 datapool 1
command[check_fmd]=/usr/local/nagios/libexec/check_fmd
command[check_meta]=/usr/local/nagios/libexec/check_meta

Remember to download the check_fmd, check_meta and check_zfs plugins from

http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Solaris
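
A command[...] entry that points at a plugin which was never downloaded only shows up later as an error on the Nagios server, so it can help to check the paths first. This is a sketch under the assumption that the plugin path is the first word after the equals sign, as in the entries above; check_nrpe_cfg is a name I made up.

```shell
# Sketch: list plugins referenced in nrpe.cfg that are missing or not executable.
# check_nrpe_cfg is a hypothetical helper name.
check_nrpe_cfg() {
    cfg=$1
    rc=0
    # The first word after "command[name]=" is taken as the plugin path.
    for plugin in $(sed -n 's/^command\[[^]]*]=\([^ ]*\).*/\1/p' "$cfg"); do
        if [ ! -x "$plugin" ]; then
            echo "missing or not executable: $plugin"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_nrpe_cfg /usr/local/nagios/etc/nrpe.cfg
```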

10. Now a configuration change has to be made on the Nagios server. In the /usr/local/nagios/etc/objects/solaris.cfg file add the following:

# Define a host for hostname.domain.org

define host{
use solaris-server
host_name hostname.domain.org
alias hostname
address 100.100.100.15
contact_groups Admins
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
# Define a service to "ping" the local machine

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description PING
check_command check_ping!100,20%!500,60%
}


# Define a service to check the disk space of the root partition
# on the Remote machine. Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Root Partition
check_command check_nrpe_1arg!check_local_disk
}

# Define a service to check the number of currently logged in
# users on the Remote machine. Warning if > 5 users, critical
# if > 10 users (the thresholds set in nrpe.cfg on the client).

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Current Users
check_command check_nrpe_1arg!check_local_users
}

# Define a service to check the number of currently running procs
# on the Remote machine. Warning if > 350 processes, critical if
# > 400 processes (the thresholds set in nrpe.cfg on the client).

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Total Processes
check_command check_nrpe_1arg!check_local_procs
}
# Define a service to check the load on the Remote machine.

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Current Load
check_command check_nrpe_1arg!check_local_load
}
define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Solaris Fault Manager
check_command check_nrpe_1arg!check_fmd
}
define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description Meta Device Status
check_command check_nrpe_1arg!check_meta
}

# Define a service to check SSH on the Remote machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
use Remote-service ; Name of service template to use
host_name hostname.domain.org
service_description SSH
check_command check_ssh
}
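
The NRPE-backed service blocks above differ only in their description and command name, so they can also be generated. A rough sketch: gen_services is a hypothetical helper, the "description|command" pair format is my own convention, and Remote-service and check_nrpe_1arg are the names used in this post.

```shell
# Sketch: emit one NRPE-backed service block per "description|command" pair.
# gen_services is a hypothetical helper name.
gen_services() {
    host=$1
    shift
    for pair in "$@"; do
        desc=${pair%%|*}   # text before the first |
        cmd=${pair#*|}     # text after the first |
        printf 'define service{\n'
        printf 'use Remote-service\n'
        printf 'host_name %s\n' "$host"
        printf 'service_description %s\n' "$desc"
        printf 'check_command check_nrpe_1arg!%s\n' "$cmd"
        printf '}\n\n'
    done
}

gen_services hostname.domain.org \
    'Root Partition|check_local_disk' \
    'Current Users|check_local_users'
```

The output can be reviewed and then appended to solaris.cfg.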

11. Note that the check_nrpe_1arg command is used instead of the default check_nrpe, since I am not passing any arguments. There should be a command named check_nrpe_1arg in commands.cfg, like the one below.

# 'check_nrpe_1arg' command definition
define command{
command_name check_nrpe_1arg
command_line /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

12. Restart the nagios service on the Nagios server; within a few minutes the Solaris client will be monitored.

Problems encountered.

1. Could not complete SSL handshake

The downloaded nrpe is compiled with SSL support. Make sure that on the server the check_nrpe_1arg command does not include -n, as that option disables SSL.

2. no output returned from plugin

Try running the command from the server on the command line, for example:
# pwd
/usr/local/nagios/libexec
# ./check_nrpe -H hostname.domain.org -c check_meta
OK - No disk failures detected

If you get the above output, it means the service definition is using check_nrpe instead of check_nrpe_1arg. check_nrpe expects arguments, which we do not need here.
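
To run the same manual test for every command defined in nrpe.cfg, a loop like the one below can be used on the Nagios server. It is a sketch: probe_nrpe and the CHECK_NRPE variable are names I made up, and the path is the one used throughout this post.

```shell
# Sketch: run each NRPE command once from the Nagios server and show its exit status.
# probe_nrpe and CHECK_NRPE are hypothetical names.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

probe_nrpe() {
    host=$1
    shift
    for cmd in "$@"; do
        rc=0
        out=$("$CHECK_NRPE" -H "$host" -c "$cmd") || rc=$?
        printf '%s [exit %s]: %s\n' "$cmd" "$rc" "$out"
    done
}

# Usage: probe_nrpe hostname.domain.org check_local_disk check_fmd check_meta
```

Exit status 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN, following the usual Nagios plugin convention.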

Monday, March 21, 2011

How to install a Nagios server on Unix

1. Download the Nagios software and the Nagios plugins from http://www.nagios.org/download/

2. Install gcc and libgcc for compilation.

3. Create the user nagios and the group nagios

4. Install Nagios

# tar -xvf nagios-3.2.3.tar
# cd nagios-3.2.3
# ./configure --with-command-group=nagcmd
# make all
# make install
# make install-config
# make install-commandmode

5. Configure the web interface
# make install-webconf
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Enter the password when prompted.

6. Compile and install the Nagios plugins
# tar -xvf nagios-plugins-1.4.11.tar
# cd nagios-plugins-1.4.11
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install

7. Start Nagios
# chkconfig --add nagios
# chkconfig nagios on
# service nagios start

8. Test the Nagios configuration file
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

9. Log in to the Nagios web interface at http://hostname/nagios/ using the nagiosadmin account.
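
Before starting step 4 it is worth confirming that the build tools and groups are in place, since ./configure in step 4 refers to a nagcmd group in addition to nagios. The sketch below only reports what is missing. missing_cmds and missing_groups are hypothetical helper names, and the group check reads /etc/group only, so groups served from NIS or LDAP would be reported as missing.

```shell
# Sketch: report missing build prerequisites before running ./configure.
# missing_cmds and missing_groups are hypothetical helper names.
missing_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || echo "$c"
    done
}

# Looks only at local groups in /etc/group.
missing_groups() {
    for g in "$@"; do
        grep -q "^$g:" /etc/group || echo "$g"
    done
}

# Prints one line per missing item; no output means everything was found.
missing_cmds gcc make htpasswd
missing_groups nagios nagcmd
```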

I set up my Nagios server with help from Ramesh Natarajan's blog post at http://www.thegeekstuff.com/2008/05/nagios-30-jumpstart-guide-for-red-hat-overview-installation-and-configuration/

Friday, October 29, 2010

How to take a snapshot on an M4000 for Oracle support

To take a snapshot for Oracle support:
1. Go to the XSCF and log in using an admin account.
2. Connect a USB drive (FAT32) to the Service Processor.
3. XSCF> snapshot -d usb0

Copy the zip file from the USB drive to the Sun FTP site.