The VMS Shark

"Resource Waits" in the OpenVMS Operating System

Any use of the information presented here is entirely at the reader's own risk. Always back up your system before attempting any procedure which could cause your VMS system to hang or crash. Though VMS (now known as OpenVMS) is very robust, some of the techniques presented here involve unusual kernel mode operations which are extremely risky on a production system.
 

 
 
 

Resource Waits in the OpenVMS Operating System
or
What to do when you are R-WASTed by OpenVMS

DECUS Spring '93 Atlanta Symposia VS060
David L. Cathey Montagar Software Concepts
P. O. Box 260772 Plano, TX 75026-0772 (972) 578-5036

davidc@montagar.com


 

David L. Cathey, Montagar Software Concepts, P.O.Box 260772, Plano, TX 75026-0772 Spring'93 DECUS Symposia Slide No. 1


Session Outline



What is a "Resource Wait"



Resource Waits and MUTEXes (Examples)



Resource Waits and MUTEXes (Data Structures)

   31                           16 15                           0 
  +----------------------------+--+------------------------------+
  |              MBZ           | 1|          Owner Count         |
  +----------------------------+--+------------------------------+
                                Write-in-progress or 
                                Write-pending status bit 
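
As a rough illustration of the layout drawn above, the following Python sketch splits a 32-bit mutex longword into the three fields shown: the owner count in bits 0-15, the write bit in bit 16, and the must-be-zero bits above it. The function name `decode_mutex` and the dictionary keys are made up for this example, and the owner count is returned as the raw 16-bit field because the slide does not say how it should be interpreted.

```python
# Portable model of the mutex longword drawn on the slide.
# decode_mutex and its key names are hypothetical, not VMS symbols.

OWNER_COUNT_MASK = 0xFFFF   # bits 0-15: owner count (raw field)
WRITE_BIT = 1 << 16         # bit 16: write-in-progress / write-pending


def decode_mutex(longword):
    """Split a 32-bit mutex value into the fields shown in the diagram."""
    if not 0 <= longword <= 0xFFFFFFFF:
        raise ValueError("mutex value must fit in 32 bits")
    return {
        "owner_count": longword & OWNER_COUNT_MASK,
        "write_flag": bool(longword & WRITE_BIT),
        "mbz_clear": (longword >> 17) == 0,   # bits 17-31 must be zero
    }


if __name__ == "__main__":
    print(decode_mutex(0x00010000))
```

Fed a longword read with SDA, a helper like this only restates what the diagram says; it does not touch the live data cell, which still requires kernel mode.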



Example of Accessing a MUTEX

; Note that this code must be in Kernel mode, 
; in order to access the MUTEX data cell for R/W access. 
; R4 must hold the current PCB address for SCH$LOCKW and 
; SCH$UNLOCK, so the scan below uses R5 for the listhead. 
; 
; Grab the Intrusion Queue mutex, so we can 
; scan it safely... 

        moval   g^CIA$GL_MUTEX,r0       ; Get MUTEX address 
        jsb     g^SCH$LOCKW             ; Lock MUTEX 

        movl    g^CIA$GQ_INTRUDER,r3    ; Get first intrusion blk 
        moval   g^CIA$GQ_INTRUDER,r5    ; Get listhead address 
1$:     cmpl    r3,r5                   ; If r3 is listhead, bail 
        beql    5$ 
        ...                             ; Do lots of neat stuff 
        movl    CIA$L_FLINK(r3),r3      ; Get next intrusion blk 
        brw     1$ 
5$: 
        moval   g^CIA$GL_MUTEX,r0       ; Unlock MUTEX 
        jsb     g^SCH$UNLOCK 



Resource Waits and MUTEXes (Processes)



Resource Waits and MUTEXes (Reason Codes)

State   Reason Code      Value  Meaning
-----   -----------      -----  -------
RWAST   RSN$_ASTWAIT         1  Wait for AST event
RWMBX   RSN$_MAILBOX         2  Mailbox I/O
RWNPG   RSN$_NPDYNMEM        3  Nonpaged Dynamic Memory
RWPAG   RSN$_PGDYNMEM        5  Paged Dynamic Memory
RWMPE   RSN$_MPLEMPTY       11  Waiting for Modified List to empty
RWMPB   RSN$_MPWBUSY        12  Modified Page Writer Busy
                                (ReallyWantedMyProcessBack - Pat O.)
RWSCS   RSN$_SCS            13  System Communications Services
RWCAP   RSN$_CPUCAP         15  CPU Capability (Vectors, etc)
RWCSV   RSN$_CLUSRV         16  Cluster Server Process Busy
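
For quick lookups while reading SDA output, the reason codes listed on this slide can be kept in a small table. The Python sketch below is only a convenience model: the values and names are copied from the slide, while `RESOURCE_WAITS` and `describe_reason` are names invented for this example. Codes that the slide does not list are reported as such rather than guessed.

```python
# Reason codes exactly as listed on the slide; nothing else is filled in.
# RESOURCE_WAITS and describe_reason are hypothetical helper names.

RESOURCE_WAITS = {
    1: ("RWAST", "RSN$_ASTWAIT", "Wait for AST event"),
    2: ("RWMBX", "RSN$_MAILBOX", "Mailbox I/O"),
    3: ("RWNPG", "RSN$_NPDYNMEM", "Nonpaged Dynamic Memory"),
    5: ("RWPAG", "RSN$_PGDYNMEM", "Paged Dynamic Memory"),
    11: ("RWMPE", "RSN$_MPLEMPTY", "Waiting for Modified List to empty"),
    12: ("RWMPB", "RSN$_MPWBUSY", "Modified Page Writer Busy"),
    13: ("RWSCS", "RSN$_SCS", "System Communications Services"),
    15: ("RWCAP", "RSN$_CPUCAP", "CPU Capability (Vectors, etc)"),
    16: ("RWCSV", "RSN$_CLUSRV", "Cluster Server Process Busy"),
}


def describe_reason(code):
    """Return a one-line description for a resource wait reason code."""
    entry = RESOURCE_WAITS.get(code)
    if entry is None:
        return f"reason code {code}: not listed on the slide"
    state, symbol, meaning = entry
    return f"{state} ({symbol}): {meaning}"


if __name__ == "__main__":
    for code in (1, 3, 12):
        print(describe_reason(code))
```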



Resource Waits and MUTEXes (Executive)

   31                                    12                     0 
  +-------------------------------------+--+---------------------+
  |                                     | 1|                     |
  +-------------------------------------+--+---------------------+



Example of Putting a Process in a Resource Wait

; 
;       Put self in a resource wait (RWNPG) if unable to allocate 
;       the required non-paged pool... 
; 
;       Assume R4 holds the address of the current PCB 
1$: 
        movl    #GOOF$C_LENGTH,r1 

        jsb     g^EXE$DEBIT_BYTCNT_ALO  ; Allocate GOOF$C_LENGTH bytes 
        blbs    r0,5$ 
        movl    #RSN$_NPDYNMEM,r0       ; Can't do it, wait until 
        jsb     g^SCH$RWAIT             ; the system frees some and 
        brb     1$                      ; then try it again... 
5$: 
        movw    r1,GOOF$W_SIZE(r2)      ; Play with our new buffer 
        ... 



RWAST Causes and Descriptions



Using SDA to Determine RWAST Cause



Sample SDA Output: SHOW PROCESS/INDEX

SDA> SHOW PROCESS/INDEX=0CF 

Process index: 000F   Name: DAVIDC_1   Extended PID: 000000CF 
-------------------------------------------------------------
Status : 02040001 res,phdres
Status2: 00000001 quantum_resched
PCB address              805659E0    JIB address              806D2F80 
PHD address              808F9000    Swapfile disk address    00000000 
Master internal PID      00020019    Subprocess count                1 
Internal PID             0003000F    Creator internal PID     00020019 
Extended PID             000000CF    Creator extended PID     00000099 
State                       RWAST    Termination mailbox          002F 
Current priority                6    AST's enabled                KESU 
Base priority                   4    AST's active                 NONE 
UIC                [00002,000001]    AST's remaining               148 
Mutex count                     0    Buffered I/O count/limit        0/40 <---+
Waiting EF cluster              0    Direct I/O count/limit         40/40     |
Starting wait time       1B001B1B    BUFIO byte count/limit      30800/30800  |
Event flag wait mask     00000001<-+ # open files allowed left     147        |
Local EF cluster 0       E0000000  | Timer entries allowed left     20        |
Local EF cluster 1       00000000  | Active page table count         0        |
Global cluster 2 pointer 00000000  | Process WS page count         161        |
Global cluster 3 pointer 00000000  | Global WS page count           40        |
                                   |                                          |
                                   |                    Zero remaining quota--+ 
                                   |
                                   +- Event Flag Mask == 1 == RSN$_ASTWAIT 
                                      if 8nnnnnnn, then it would indicate which MUTEX 
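
The annotation above gives a simple rule for reading the "Event flag wait mask" field of a process in a resource or mutex wait: a small value is a reason code, while a value of the form 8nnnnnnn identifies a mutex. A minimal Python sketch of that rule follows; `classify_wait_mask` is a hypothetical name, and the sample value 80001234 is made up purely to exercise the second branch.

```python
# Model of the annotation on the slide: small value = reason code,
# 8nnnnnnn (high bit set) = mutex. classify_wait_mask is a made-up name.

def classify_wait_mask(mask_hex):
    """Classify the 'Event flag wait mask' hex string shown by SDA."""
    mask = int(mask_hex, 16)
    if not 0 <= mask <= 0xFFFFFFFF:
        raise ValueError("wait mask must fit in 32 bits")
    if mask & 0x80000000:
        # High bit set: the value indicates which MUTEX is involved.
        return ("mutex", f"{mask:08X}")
    # Otherwise treat the value as a resource wait reason code.
    return ("resource", mask)


if __name__ == "__main__":
    print(classify_wait_mask("00000001"))   # value from the sample display
    print(classify_wait_mask("80001234"))   # hypothetical mutex value
```

In the sample display the mask is 00000001, which matches RSN$_ASTWAIT and the RWAST state shown for the process.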



Sample SDA Output: SHOW PROCESS/CHANNEL

SDA> SHOW PROCESS/INDEX=0CF/CHANNEL 

Process index: 000F   Name: DAVIDC_1   Extended PID: 000000CF 
-------------------------------------------------------------

                            Process active channels
                            -----------------------

Channel  Window           Status        Device/file accessed
-------  ------           ------        --------------------
  0010  00000000                        DUA0: 
  0020  8071C470                        DUA0:[DAVIDC.RWAST]RWAST_BIO.EXE;4 
  0030  00000000            Busy        MBA50: <-+
  0040  00000000                        TWA3:    |
  0050  00000000                        TWA3:    |
                                                 |
 Mailbox I/O incomplete, probably needs flushing-+
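
When a process has many channels, picking out the busy ones by eye gets tedious. The Python sketch below scans text captured from a channel display like the one above and returns the channels marked Busy. It assumes rows are whitespace-separated and that a row carries a status only when it has four fields; `busy_channels` and `SAMPLE` are names invented for this example, and the sample rows are the ones from the slide with the hand-drawn arrows removed.

```python
# Find channels marked "Busy" in captured SHOW PROCESS/CHANNEL text.
# Assumption: fields are whitespace-separated; a status column is present
# only on rows that have four fields. busy_channels is a hypothetical name.

SAMPLE = """\
  0010  00000000                        DUA0:
  0020  8071C470                        DUA0:[DAVIDC.RWAST]RWAST_BIO.EXE;4
  0030  00000000            Busy        MBA50:
  0040  00000000                        TWA3:
  0050  00000000                        TWA3:
"""


def busy_channels(text):
    """Return (channel, device) pairs for rows whose status is Busy."""
    busy = []
    for line in text.splitlines():
        fields = line.split()
        # channel, window, status, device
        if len(fields) == 4 and fields[2].lower() == "busy":
            busy.append((fields[0], fields[3]))
    return busy


if __name__ == "__main__":
    print(busy_channels(SAMPLE))
```

For the sample rows this reports channel 0030 on MBA50:, the mailbox with the incomplete I/O.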



Breaking Processes Out of RWAST



Network Devices

$ MCR NCP SHOW KNOW LINKS
$! kill the link that seems to be connected to the RWAST'd process
$ MCR NCP DISCONNECT LINK
 
$ MCR NCP SHOW KNOW LINK
 
Known Link Volatile Summary as of 6-APR-1993 20:17:47
 
   Link       Node           PID     Process     Remote link  Remote user

   8193    1.42 (AVATAR)   21600033  REMACP          8445     DAVIDC
 
$ MCR NCP DISCONNECT LINK 8193
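
The manual step here is matching the PID of the stuck process against the link summary and then disconnecting that link. As a sketch only, the Python below does the matching on captured text. The row layout is an assumption based on the single row shown above (link, node address with the node name in parentheses, PID, process, remote link, remote user); `links_for_pid`, `ROW`, and `SAMPLE` are names invented for this example, and the script only prints the DCL command rather than running anything.

```python
import re

# Assumed row layout, based on the one row shown on the slide:
#   link  area.node (NAME)  PID  process  remote-link  remote-user
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+\(([^)]+)\)\s+([0-9A-Fa-f]{8})"
    r"\s+(\S+)\s+(\d+)\s+(\S+)\s*$"
)

SAMPLE = """\
   Link       Node           PID     Process     Remote link  Remote user

   8193    1.42 (AVATAR)   21600033  REMACP          8445     DAVIDC
"""


def links_for_pid(text, pid):
    """Return the link numbers whose PID column matches pid (hex string)."""
    wanted = pid.upper()
    found = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(4).upper() == wanted:
            found.append(int(match.group(1)))
    return found


if __name__ == "__main__":
    for link in links_for_pid(SAMPLE, "21600033"):
        print(f"$ MCR NCP DISCONNECT LINK {link}")
```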
 



Mailbox Devices

$ COPY MBA1284: NLA0:
or
$ COPY LOGIN.COM MBA1284:
 
It's probably better practice to copy from the mailbox before copying to it...
 



Tape Devices

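        $IODEF                          ; Define IO$_ function codes

devnam: .ascid  /MUA0:/                 ; Tape drive to acknowledge
chan:   .word   0                       ; Channel assigned to the drive

        .entry  packack,0
        $ASSIGN_S       chan=chan,-
                        devnam=devnam
        blbc    r0,9$                   ; Bail out if the assign failed
        $QIOW_S         chan=chan,-
                        func=#IO$_PACKACK
9$:     ret                             ; Return status in R0
        .end    packack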



Disk Devices



Line Printer Devices (believe it or not!)

A printer got stuck at the same time the SYMBIONT had a lock on a RIGHTSLIST entry... and was stopped. The SYMBIONT was RWASTed while holding a blocking lock on the RIGHTSLIST, which ended up locking up everyone on the system (600+ angry users) in LOGINOUT, DIR/OWNER, etc.

The solution? Close the door to the 15-year-old-washing-machine-sized LP27 printer :-(
 



Brute Force Approaches to Getting out of RWAST



For those that need the code example...

          .title   DISABLE_RW 
;++ 
;     DISABLE_RW -- Disable Resource Wait of another process 
; 
; Author:          David L. Cathey 
;                  Montagar Software Concepts 
;                  P. O. Box 260772 
;                  Plano, TX 75026-0772 
;                  davidc@montagar.com 
; 
          .link             "SYS$SYSTEM:SYS.STB"/SE 
          .library /SYS$LIBRARY:LIB/ 
          $PCBDEF           ; Process Control Block definitions 

asc_pid: .ascid    "xxxxxxxx"                   ; Save space for PID 
bin_pid: .long     0 
prompt:  .ascid    "Process ID: "               ; Prompt string 

         .entry    Main,0 

         pushaw    asc_pid 
         pushaq    prompt 
         pushaq    asc_pid 
         calls     #3,g^LIB$GET_FOREIGN         ; Get PID from user 
         blbc      r0,999$ 

         pushal    bin_pid 

         pushaq    asc_pid                      ; Convert ascii hex to binary 
         calls     #2,g^OTS$CVT_TZ_L 
         blbc      r0,999$ 

         $CMKRNL_S routin=do_it                 ; Play with the process... 
999$:    ret 


         .entry    Do_It,^M<> 

         movl      bin_pid,r0 
         jsb       g^EXE$EPID_TO_PCB            ; Get PCB from EPID 
         tstl      r0                           ; Did we??? 
         beql      99$                          ; Nope, bail out 
         bisl2     #PCB$M_SSRWAIT,PCB$L_STS(r0) ; Set SSRWAIT disable 
         bicl2     #PCB$M_DELPEN,PCB$L_STS(r0)  ; Clear delete pending 
         $DELPRC_S pidadr=bin_pid               ; And delete again. 
         ret                                    ; Bye... 
99$:     movl      #SS$_NONEXPR,r0              ; Non-existent process! 
         ret 
         .end      Main 



Neil Rieck
Kitchener - Waterloo - Cambridge, Ontario, Canada.