OpenVMS Notes: Apache HTTPd

  1. The information and software presented on this web site are intended for educational use only by OpenVMS application developers and OpenVMS system attendants.
  2. The information and software presented on this web site are provided free of charge.
  3. The information and software presented on this web site are presented to you as-is. I will not be held responsible in any way if the information and software presented on this web site damages your computer system, business or organization (sounds like the legal warning from a Microsoft shrink-wrap seal, eh?)
 

Introductory Notes

  1. Apache got its name by being "a patched" NCSA web server. Back in the early 1990s system managers would download the public domain NCSA web server code then apply patches to it so that it worked properly. After a period of years it made more sense to download the patched product as a complete kit, so the not-for-profit Apache Software Foundation was created. Apache is open source code, which is the same development process that makes Linux so powerful and popular:
    "You take the best computer people on the planet and let them collaborate in a world wide public forum (the internet) to produce a product that is better than any commercial variety"

    According to www.netcraft.com, 65% of the web servers in the world are based upon Apache while only 15% are based upon products from Microsoft.

  2. Compaq's port of Apache HTTPd for OpenVMS Alpha is called CSWS (Compaq Secure Web Server) which is pronounced "C-Swiss" by OpenVMS Engineers.
     
  3. After HP took over (er, merged with) Compaq, CSWS was unofficially rebranded to SWS (pronounced "Swiss").
     
  4. CSWS-1.3-1 is based upon Apache 1.3.26 and OpenSSL 0.9.7d
     
  5. SWS-2.0 is based upon Apache 2.0.47 and OpenSSL 0.9.7d
     
  6. SWS-2.1 is based upon Apache 2.0.52 and OpenSSL 0.9.7d
     
  7. SWS-2.2 is based upon Apache 2.0.63 and OpenSSL 0.9.8h
     
  8. CGI is an acronym which means Common Gateway Interface.
  9. Caveat: Apache for OpenVMS only runs on Alpha or Itanium. If you're still running on VAX then you'll need one of the following:
  10. If you are overly cautious and do not want to mess up your OpenVMS system, download Apache HTTPd for Windows and install it on your PC. Note that Microsoft's IIS (Internet Information Services) is much easier to set up and configure than Apache. However, Apache allows programmers to get out of the box (by learning) while IIS is locked down to serve the majority of Microsoft's customers.
     
  11. Just like the previous example, if you are going to play with Java Server Pages (jsp) or Java Servlets, then you should download and install Apache Tomcat for Windows. This is a standalone server which normally runs on port 8080 and doesn't require Apache. Be sure to configure the Tomcat connector which allows Apache HTTPd to divert accesses to Java documents (with extensions like ".jsp") to Tomcat over a back channel.
     
  12. Unfortunately, PHP is not part of any Apache release so Windows versions are available here

Problems With CGI (common gateway interface)

Although quite a bit of Apache documentation exists, info about CGI is lacking. On top of this, a few poorly documented quirks have been thrown into the OpenVMS extensions.

p.s. be careful when Googling the phrase "CGI". You might accidentally return information for www.cgi.com which happens to be the name of a computer services company headquartered in Montreal, Quebec, Canada.

CGI Tip-1 (evolution from beginner to expert)

  1. Many older VMS programmers begin developing CGI-based web applications using this inefficient model:
     
    1. Triggered by a browser event, Apache accesses a document in one of the script directories (this is usually a DCL script; ignore PHP and server-side Java for now)
       
    2. The DCL script attempts to detect REQUEST_METHOD (or WWW_REQUEST_METHOD if not Apache) which should be "GET" or "POST".
       
    3. If your script detected "GET" or "POST" then it calls an external program to read the CGI data which is then translated into either "DCL symbols" or "process-level logical names".
       
    4. The script now runs the desired web application.

      Here is an example DCL-based CGI script sitting in one of Apache's script directories:
      $  set noon					! do not stop on errors
      $  say :== write sys$output			!
      $  debug = f$trnlnm("NEIL$DEBUG","LNM$SYSTEM_TABLE")
      $  if (debug .eqs. "Y")				! if CGI debugging is desired
      $  then						!
      $    say "Status: 200"				! start of document header
      $    say "Content-Type: text/html"		! mime type declaration
      $    say ""					! end of document header
      $    say "<html><head></head>"			! start of HTML
      $    say "<body><pre>"				!
      $  endif
      $! temps = "''WWW_REQUEST_METHOD'"		! purveyor method
      $  temps = "''REQUEST_METHOD'"			! apache method
      $  if (temps .eqs. "POST") .or.  -
            (temps .eqs. "GET")			!  
      $  then						!
      $     run csmis$exe:read_html_apache.exe	! create DCL symbols (and sets $status to 1 if ok)
      $  endif					!
      $  run csmis$exe:desired_application.exe	! this is our desired web application
      $  if debug .eqs. "Y"				!
      $  then						!
      $    say "</pre>"				!
      $    say "</body></html>"			!
      $  endif					!
  2. In the previous example you executed one DCL script and two images. As your platform begins to serve hundreds of hits per hour, you might not want to incur this amount of overhead. By moving the symbol-reading logic from "read_html_apache.exe" into your application, you can drop this down to one DCL script and one image like this:
      $  set noon					! do not stop on errors
      $  say :== write sys$output			!
      $  debug = f$trnlnm("NEIL$DEBUG","LNM$SYSTEM_TABLE")
      $  if (debug .eqs. "Y")				! if CGI debugging is desired
      $  then						!
      $    say "Status: 200"				! start of document header
      $    say "Content-Type: text/html"		! mime type declaration
      $    say ""					! end of document header
      $    say "<html><head></head>"			! start of HTML
      $    say "<body><pre>"				!
      $  endif					!
      $  run csmis$exe:desired_application.exe	! this is our desired application
      $  if debug .eqs. "Y"				!
      $  then						!
      $    say "</pre>"				!
      $    say "</body></html>"			!
      $  endif					!
      or this:
      $  set noon					! do not stop on errors
      $  run csmis$exe:desired_application.exe	! this is our desired application (must return 1)
      $  rc = f$integer($STATUS)			!
      $  if ((rc .and. 7) .neq. 1)			! 1=success, 2=error, 4=fatal
      $  then						!
      $    say :== write sys$output			!
      $    say "Status: 500"				! start of document header
      $    say "Content-Type: text/html"		! mime type declaration
      $    say ""					! end of document header
      $    say "<html><head></head>"			! start of HTML
      $    say "<body><pre>"				!
      $    say "Script: 2001"				!
      $    say "error: ",rc				!
      $    say "</pre></body></html>"			!
      $  endif					!
      or this:
      $ run csmis$exe:desired_application.exe		! note: "csmis$exe" is a directory on my platform
      But your ultimate goal is to run the binary application directly from a specially configured Apache directory. This can be done by enabling one or more directories to run executables. In the NCSA web server, some directories were enabled to run applications by default. In Apache, scripting and running applications are disabled by default for security reasons. You turn them on by creating or modifying <Directory> declarations in a file named APACHE$COMMON:[000000.CONF]HTTPD.CONF
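
The `(rc .and. 7)` test in the third DCL script of this tip looks at the low three bits of the OpenVMS condition value, which hold its severity. As a rough illustration only, the same decoding can be sketched in Python; the function names `severity` and `is_success` are hypothetical and not part of any SWS kit:

```python
# Sketch: decode the severity field (bits 0-2) of an OpenVMS condition value.
# The mapping below is the standard OpenVMS severity encoding.
SEVERITY = {
    0: "warning",
    1: "success",
    2: "error",
    3: "informational",
    4: "fatal",
}


def severity(status: int) -> str:
    """Return the severity name held in the low three bits of 'status'."""
    return SEVERITY.get(status & 7, "reserved")


def is_success(status: int) -> bool:
    """Mirror the DCL test '(rc .and. 7) .eq. 1' (strict success only)."""
    return (status & 7) == 1
```

Note that the DCL script treats anything other than severity 1 as a failure, so an informational status (3) would also trigger the "Status: 500" page.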

CGI Tip-2 (debugging)

If you're familiar with writing CGIs for either Purveyor (Process Software Corporation) or OSU DECthreads then you'll probably recognize the "show symbol" and "show logical" lines in the following CGI script.

$  set noon					! do not stop on errors
$  say :== write sys$output			!
$  say "Status: 200"				! start of document header
$  say "Content-Type: text/html"		!
$  say ""					! end of document header
$  say "<html><head></head>"			! start of HTML
$  say "<body><pre>"				!
$  show symbol /local/all			! show web-server process-level local symbols
$  show symbol /global/all			! show web-server process-level global symbols 
$  show logical/proc/job			! show web-server job-level logical names 
$! temps = "''WWW_REQUEST_METHOD'"		! purveyor method
$  temps = "''REQUEST_METHOD'"			! apache method
$  if (temps .eqs. "POST") .or.  -
      (temps .eqs. "GET")  
$  then
$     run csmis$exe:read_html_apache.exe	! create DCL symbols (and sets $status to 1 if ok)
$  endif
$  show symbol /local/all			! show all process-level local symbols
$  show symbol /global/all			! show all process-level global symbols 
$  show logical/proc/job			! show all job-level logical names 
$  say "</pre></body></html>"			! end of HTML

The program just listed will not work properly with SWS because the interface has been locked down to limit hacking. This means that the "$ show symbol /all" statements won't display any environment variables passed to the CGI by the server (but you will see FORM variables which were created by your application). To view server-created variables you must explicitly request them by name like this:

$ show symbol/local  SYMBOL-NAME
$ show symbol/global SYMBOL-NAME

... which means you need to know their names ahead of time. Inspect the contents of script "TEST-CGI-VMS.COM" to see what I mean. Note that this incomplete file is missing environment variables AUTH_TYPE, HTTP_COOKIE, REMOTE_USER (and probably a few more).
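
The same "ask for each variable by name" idea can be sketched outside DCL. The Python below is only an illustration: the names in `WANTED` are the ones mentioned on this page (the list is not complete), the helper names `lookup` and `report` are hypothetical, and it is an assumption that a Python build on your platform exposes the server's CGI variables through `os.environ`.

```python
import os

# Variable names mentioned on this page; illustrative, not a complete list.
WANTED = ["REQUEST_METHOD", "CONTENT_LENGTH", "AUTH_TYPE", "HTTP_COOKIE", "REMOTE_USER"]


def lookup(names, env=None):
    """Return {name: value or None} for each explicitly requested name."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in names}


def report(names, env=None):
    """Format one line per requested name, flagging the missing ones."""
    lines = []
    for name, value in lookup(names, env).items():
        if value is None:
            lines.append(f"{name} is not defined")
        else:
            lines.append(f"{name} = {value!r}")
    return "\n".join(lines)
```

Calling `print(report(WANTED))` from a CGI would list each requested variable, which makes it easy to spot ones (like REMOTE_USER) that never arrived.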

CGI Tip-3 (controlling the server with logical names)

Starting CSWS with "certain system-level logical names" will modify Apache's operation. Put these declarations in script sys$manager:SYSTARTUP_VMS.COM just before you invoke @sys$startup:APACHE$STARTUP.COM

  1. Inspect APACHE$CGI_MODE in the table below. You will need to start this with setting "1" or "2" if you want to process cookies larger than 970 characters (pretty much standard fare these days). I always use "1" but needed to write a stand-alone function to detect when the symbol overflows into multi-item logicals.
     
  2. I don't think that APACHE$SHOW_CGI_SYMBOL actually does a "wildcard show symbol" operation; it seems to run program TEST-CGI-VMS.EXE which was compiled from TEST-CGI-VMS.C, which means that environment variables are probably missing from this program too.
     
  3. If you're porting CGI from Purveyor to SWS, you might want to define logical APACHE$PREFIX_DCL_CGI_SYMBOLS_WWW to "YES" so that the environmental variables don't need to be changed.

This table was copied from Compaq's "V2.1-1 Installation and Configuration Guide" (more logical names were added with 2.2)

Table 3-5 User Defined Logical Names

  Logical Name
    Description

  APACHE$BG_PIPE_BUFFER_SIZE (New in Version 2.0)
    System logical name that is used to set the socket pipe buffer size for exec functions. If this logical is not set, the default is 32767.

  APACHE$CGI_BYPASS_OWNER_CHECK (Obsolete in V2.x)
    If defined to any value, this logical name causes the Secure Web Server to bypass the file owner check of the CGI script file. The default is to enforce the owner check on CGI script files for security purposes.

  APACHE$CGI_MODE
    System logical name that controls how CGI environment variables are defined in the executing CGI process. There are three different options. Note that only one option is available at a time.
      0  Default. Environment variables are defined as local symbols and are truncated at 970 (limitable with DEC C).
      1  Environment variables are defined as local symbols unless they are greater than 970 characters. If the environment value is greater than 970 characters, it is defined as a multi-item logical.
      2  Environment variables are defined as logicals. If the environment value is greater than 255 characters, it is defined as a multi-item logical.

  APACHE$CREATE_SYMBOLS_GLOBAL
    If defined, this system logical name causes CGI environment symbols to be defined globally. They are defined locally by default.

  APACHE$DAV_DBM_TYPE (New in Version 2.0)
    Used to define the desired DBM organization to use for MOD_DAV. The valid options for this logical are: GDBM, SDBM, VDBM. If this logical is not set, the default is VDBM.

  APACHE$DEBUG_DCL_CGI
    If defined, this system logical name enables APACHE$VERIFY_DCL_CGI and APACHE$SHOW_CGI_SYMBOL.

  APACHE$DL_CASE (New in Version 2.0)
    System logical name that controls how Apache will locate shareable entry points. There are four different options. Note that only one option is available at a time.
      1  Entry points are located using upper case search.
      2  Entry points are located using mixed case search.
      3  Default. Entry points are located using upper case search, then mixed case search.
      4  Entry points are located using mixed case search, then upper case search.

  APACHE$DL_FORCE_UPPERCASE (Obsolete in Version 2.x)
    If defined to be true (1, T, or Y), this system logical name forces case-sensitive dynamic image activation symbol lookups. By default, symbol lookups are first done in a case-sensitive manner and then, if failed, a second attempt is made using case-insensitive symbol lookups. This fallback behavior can be disabled with APACHE$DL_NO_UPPERCASE_FALLBACK.

  APACHE$DL_NO_UPPERCASE_FALLBACK (Obsolete in Version 2.x)
    If defined to be true (1, T, or Y), this system logical name disables case-insensitive symbol name lookups whenever case-sensitive lookups fail. See APACHE$DL_FORCE_UPPERCASE

  APACHE$FIXBG (Obsolete in Version 2.x)
    System executive mode logical name pointing to installed, shareable images. Not intended to be modified by the user. Replaced by APACHE$SET_CCL.EXE

  APACHE$FLIP_CCL (New in Version 2.0)
    Used by APACHE$SET_CCL.EXE, which replaces APACHE$FIXBG.EXE

  APACHE$INPUT
    Used by CGI programs for PUT/POST methods of reading the input stream.

  APACHE$MB_PIPE_BUFFER_SIZE (New in Version 2.0)
    Used to set the mailbox pipe buffer size for exec functions. If this logical is not set, the default is 4096.

  APACHE$PLV_ENABLE_<username> (Obsolete in Version 2.x)
    System executive mode logical name defined during startup and used to control access to the services provided by the APACHE$PRIVILEGED image. Not intended to be modified by the user.

  APACHE$PLV_LOGICAL (Obsolete in Version 2.x)
    System executive mode logical name defined during startup and used to control access to the services provided by the APACHE$PRIVILEGED image. Not intended to be modified by the user.

  APACHE$PREFIX_DCL_CGI_SYMBOLS_WWW
    If defined, this system logical name prefixes all CGI environment variable symbols with "WWW_". By default, no prefix is used.

  APACHE$PRIVILEGED (Obsolete in Version 2.x)
    System executive mode logical name pointing to installed, shareable images. Not intended to be modified by the user.

  APACHE$READDIR_NO_DOT_FILES (New in Version 2.0)
    Used to disable the simulating of dot files when processing directories. There is no default value.

  APACHE$READDIR_NO_NULL_TYPE (New in Version 2.0)
    Used to disable the elimination of the null type which contains a single dot when processing directories. There is no default value.

  APACHE$READDIR_NO_UNIX_OPEN (New in Version 2.0)
    Used to disable the processing of unix files when processing directories. There is no default value.

  APACHE$SET_CCL (New in Version 2.0)
    Used by APACHE$SET_CCL.EXE, which replaces APACHE$FIXBG.EXE

  APACHE$SHOW_CGI_SYMBOL
    If defined, this system logical name provides information for troubleshooting the CGI environment by dumping all of the symbols and logicals (job/process) for a given CGI. Use with APACHE$DEBUG_DCL_CGI.

  APACHE$SSL_DBM_TYPE (New in Version 2.0)
    Used to define the desired DBM organization to use for MOD_SSL. The valid options for this logical are: GDBM, SDBM, VDBM. If this logical is not set, the default is VDBM.

  APACHE$SPL_DISABLED (New in Version 2.0)
    Used to determine whether Shared Process Logging is to be disabled. There is no default value.

  APACHE$SPL_MAX_BUFFERS (New in Version 2.0)
    Used to determine the maximum buffer quota for each Shared Process Logging mailbox. If this logical is not set, the default is 10.

  APACHE$SPL_MAX_MESSAGE (New in Version 2.0)
    Used to determine the maximum message size for each Shared Process Logging mailbox. If this logical is not set, the default is 1024.

  APACHE$SPL_FLUSH_INTERVAL (New in Version 2.0)
    Used to determine the maximum message count per Shared Process Logging file before data is flushed to disk. If this logical is not set, the default is 256.

  APACHE$USE_CUSTOM_STAT (New in Version 2.0)
    System logical name that is used to indicate that the custom apache stat function should be used rather than the run-time stat function.

  APACHE$USER_HOME_PATH_UPPERCASE (Obsolete in Version 2.x)
    If defined to be true (1, T, or Y), this system logical name uppercases device and directory components for user home directories when matching pathnames in <DIRECTORY> containers. This provides backward compatibility for sites that specify these components in uppercase within <DIRECTORY> containers. See the UserDir directive in Modules and Directives section for more information.

  APACHE$VERIFY_DCL_CGI
    If defined, this system logical name provides information for troubleshooting DCL command procedure CGIs by forcing a SET VERIFY before executing any DCL CGI. Use with APACHE$DEBUG_DCL_CGI.

CGI Tip-4 (authorization problems)

I was recently working on an SWS application for my employer to do the following:

  1. upon accessing a given web page, prompt the user for a username and password and then validate it against SYSUAF in OpenVMS
     
  2. display the web page, allowing the user to fill out a form
     
  3. upon clicking submit we would run a DCL script to:
    1. determine the REQUEST_METHOD
    2. determine the USERNAME previously entered
    3. run a VMS-BASIC program to handle the request and respond with some HTML

First off, I enabled these two lines in HTTPD.CONF

LoadModule auth_module modules/mod_auth.exe
LoadModule auth_openvms_module modules/mod_auth_openvms.exe 

...then restarted the server. Everything seemed to work as expected except that I couldn't get any USERNAME info to show up in the DCL script. Furthermore, the DCL version of CGI debugging script "TEST-CGI-VMS.COM" seems to be missing some environmental variables like REMOTE_USER. To make matters worse, the otherwise excellent book "OpenVMS with Apache, OSU, and WASD" states that variable HTTP_AUTHORIZATION should be available in all servers including SWS.

(I was looking for this environmental as a sign that the base64 encoded username and password were making it through to my CGI; I have since discovered that neither SWS nor "Apache on UNIX" ever passed HTTP_AUTHORIZATION to a CGI program).
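
For reference, an HTTP Basic "Authorization" header value is the word "Basic", a space, and the base64 encoding of "username:password". The Python sketch below shows how such a value decodes; it is only an illustration of the header format (the function name `parse_basic_auth` is hypothetical), and as noted above SWS does not hand this header to a CGI.

```python
import base64


def parse_basic_auth(header_value):
    """Decode 'Basic <base64>' into (username, password), or None if malformed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        # validate=True rejects characters outside the base64 alphabet
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, colon, password = decoded.partition(":")
    if not colon:
        return None
    return username, password
```

With the well-known sample value "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==" this returns ("Aladdin", "open sesame"), which also shows why Basic authentication should only be used over SSL: the password is encoded, not encrypted.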

CGI Tip-4 (preparing for authorization)

According to some helpful folks at Compaq (now HP), in order to get authorization information like REMOTE_USER transferred to the CGI you must do the following:

  1. Add the "AuthAuthoritative On" directive to the ".HTACCESS" file which protects the directory containing the protected files (web pages and/or scripts)
     
  2. Since both the document directory and scripts directory require identical authentication data, copy the document ".HTACCESS" file to the scripts directory which will contain the CGI program requiring authentication info (this meant creating a "protected_scripts" directory because we have other CGI programs which do not require authentication).
     
  3. Modify file APACHE$COMMON:[000000.CONF]HTTPD.CONF to specify either "AllowOverride AuthConfig" or "AllowOverride All" for both the document directory and the protected scripts directory. Failure to do this means that many "Auth" directives in ".HTACCESS" will be ignored by the server. However, "AuthOpenVMSUser" and "AuthOpenVMSAuthoritative" aren't affected, so seeing a username and password dialog doesn't mean that everything is working.
    ServerRoot "/apache$root"
    DocumentRoot "/apache$common/main"
    LoadModule auth_openvms_module /apache$common/modules/mod_auth_openvms.exe_alpha
    
    ScriptAlias /cgi-bin/ "/apache$root/cgi-bin/"
    ScriptAlias /scripts/ "/apache$root/scripts/"
    ScriptAlias /bell_private_scripts/ "/apache$root/bell_private_scripts/"
    ScriptAlias /ics_private_scripts/ "/apache$root/ics_private_scripts/"
    
    Alias /bell_private/ "/apache$root/bell_private/"
    <Directory "/apache$root/bell_private">
        Options FollowSymLinks
        AllowOverride All         <-- enable .HTACCESS (this remark should not be in your file)
        Order allow,deny
        Allow from all
    </Directory>
    
    Alias /ics_private/ "/apache$root/ics_private/"
    <Directory "/apache$root/ics_private">
        Options Indexes FollowSymLinks Multiviews
        AllowOverride All         <-- enable .HTACCESS (this remark should not be in your file)
        Order allow,deny
        Allow from all
    </Directory>
    
    <Directory "/apache$root/bell_private_scripts">
        AllowOverride AuthConfig  <-- enable .HTACCESS (this remark should not be in your file)
        Options ExecCGI
        Order allow,deny
        Allow from all
    </Directory>
    
    <Directory "/apache$root/ics_private_scripts">
        AllowOverride AuthConfig  <-- enable .HTACCESS (this remark should not be in your file)
        Options ExecCGI
        Order allow,deny
        Allow from all
    </Directory>
  4. File ".HTACCESS" after modification
    1. you can have different files in different directories
    2. since the file name begins with a period, it will not appear in any "file-view" in a browser
    AuthType Basic
    AuthAuthoritative On
    AuthName "ICSIS Bell-ATS Authentication"
    AuthOpenVMSUser On
    AuthOpenVMSAuthoritative On
    require valid-user
  5. Apache 2.0 Update: For security and performance reasons, Apache 2.0 documentation recommends that you place these directives inside <Directory> containers in APACHE$COMMON:[000000.CONF]HTTPD.CONF rather than using ".HTACCESS" files.
    #
    ScriptAlias /cgi-bin/                   "/apache$documents/cgi-bin/"             <-- enable scripting
    ScriptAlias /scripts/                   "/apache$documents/scripts/"             <-- enable scripting  
    ScriptAlias /ics_private_scripts/       "/apache$documents/ics_private_scripts/" <-- enable scripting 
    #
    <Directory "/apache$documents/cgi-bin">
        AllowOverride None
        Options None
        Order allow,deny
        Allow from all
    </Directory>
    #
    <Directory "/apache$documents/scripts">
        AllowOverride None
        Options None
        Order allow,deny
        Allow from all
    </Directory>
    #
    <Directory "/apache$documents/ics_private_scripts">
        AllowOverride AuthConfig   <-- enable authorization (this remark should not be in your file)
        Options None
        Order allow,deny
        Allow from all
    </Directory>
    #

CGI Tip-5 (dealing with large uploads)

Most form data sent to Apache will be less than or equal to 32,767 bytes. But if you are supporting file-upload like so:

<form method="post" enctype="multipart/form-data" action="/scripts/upload_test_neil">
  <input type="text"   name="textline">
  <input type="file"   name="datafile">
  <input type="submit" name="Send">
</form>
...then you might experience uploads larger than 32,767 bytes. Okay so what's the big deal? Well, if your CGI program was written in "C" then it would be no big deal to read Apache symbol "CONTENT_LENGTH" then malloc that amount (if you have enough memory). Next, you would call fopen(fname,"rb") then read the whole amount all in one operation.

This is not possible if your CGI program was written in BASIC since that language limits strings to a maximum size of 32,767. This means you would need to do multiple reads until you have extracted CONTENT_LENGTH bytes.

Caveat: I am currently (2010-05-30) working on a better CGI program which will properly deal with form data larger than 32k and will post the solution soon. As always, it will be available free-of-charge. The solution will be written in HP BASIC for OpenVMS but I will also post some working stubs written in "C".

CSWS/SWS Installation Tips

Installation Tip #1 (using TELNET to test a new installation)

Testing with HEAD

Telnet www.bellics.net 80					<<<--- type this then hit <enter>
HEAD / HTTP/1.0							<<<--- type this then hit <enter>
								<<<--- hit <enter> (a blank line to end the HTTP header)
HTTP/1.1 200 OK							<<<--- start of the HTTP response header
Date: Mon, 08 Jun 2009 20:17:47 GMT				<<<--- server's current time stamp
Server: Apache/2.0.52 (OpenVMS) mod_ssl/2.0.52 OpenSSL/0.9.7d	<<<--- web server flavor and version; ssl version
Last-Modified: Thu, 13 Aug 2009 16:59:51 GMT			<<<--- web page time stamp (for receiver's caching logic)
Accept-Ranges: bytes						<<<--- server accepts "bytes"
Connection: close						<<<--- one request and one response
Content-Type: text/html						<<<--- the following document is HTML formatted
								<<<--- notice the blank line to end the HTTP header

Testing with "GET and HTTP/1.0" (simpler than HTTP/1.1)

Telnet www.bellics.net 80					<<<--- type this then hit <enter>
GET / HTTP/1.0							<<<--- type this then hit <enter>
								<<<--- hit <enter> (a blank line to end the HTTP header)
HTTP/1.1 200 OK							<<<--- start of the HTTP response header
Date: Mon, 08 Jun 2009 20:17:47 GMT                     	<<<--- server's current time stamp
Server: Apache/2.0.52 (OpenVMS) mod_ssl/2.0.52 OpenSSL/0.9.7d	<<<--- web server flavor and version; ssl version
Last-Modified: Thu, 13 Aug 2009 16:59:51 GMT			<<<--- web page time stamp (for receiver's caching logic)
Accept-Ranges: bytes						<<<--- server accepts "bytes"
Content-Length: 982						<<<--- the HTML content block is 982 bytes
Connection: close						<<<--- one request and one response
Content-Type: text/html						<<<--- the following document is HTML formatted
								<<<--- notice the blank line to end the HTTP header
<html>								<<<--- start of HTML content (the web page)
<head>
<title>Integrated Convergence Support Information System</title>

Command Notes:

  • line 1: telnet to "www.bellics.net" using TCP/IP port 80 (telnet defaults to port 23)
  • line 2:
    • GET will pull back the whole web page; use HEAD or OPTIONS to only pull back server data
    • "/" requests the server's default document found in the root directory; you could have also entered something like "/default.htm" or "/login.html" or "/scripts/whatever"
    • HTTP/1.0 indicates we do not want a persistent connection etc. (keep things really simple in this demo)
  • line 3: a blank line indicates the end of the sender's HTTP request block

Response Notes:

  • line 1:
    • HTTP/1.1 means "I am able to support HTTP version 1.1" (persistent connections, etc.)
    • 200 OK is the HTTP status message indicating that everything went as planned
  • line 2: Date is the server's current date + time, usually in international format
  • line 3: Server is the server software and installed security modules
  • line 4: Last-Modified is the last modified date of the file I am sending you (for your cache)

Testing with "GET and HTTP/1.1"

Telnet www.bellics.net 80					<<<--- type this then hit <enter>
GET / HTTP/1.1							<<<--- type this then hit <enter>
host: www.bellics.net						<<<--- mandatory with HTTP/1.1
accept: text/html						<<<--- optional: says "I can process HTML documents"
connection: close						<<<--- optional: return a page then close (do not persist)
								<<<--- hit <enter> (a blank line to end the HTTP header)
HTTP/1.1 200 OK							<<<--- start of the HTTP response header
Date: Mon, 08 Jun 2009 20:17:47 GMT                     	<<<--- server's current time stamp
Server: Apache/2.0.52 (OpenVMS) mod_ssl/2.0.52 OpenSSL/0.9.7d	<<<--- web server flavor and version; ssl version
Last-Modified: Thu, 13 Aug 2009 16:59:51 GMT            	<<<--- web page time stamp (for receiver's caching logic)
Accept-Ranges: bytes						<<<--- server accepts "bytes"
Content-Length: 982						<<<--- the HTML content block is 982 bytes
Connection: close						<<<--- one request and one response
Content-Type: text/html						<<<--- the following document is HTML formatted
								<<<--- notice the blank line to end the HTTP header
<html>								<<<--- start of HTML content (the web page)
<head>
<title>Integrated Convergence Support Information System</title>
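
If you would rather script these checks than type them into telnet, the same HTTP/1.1 exchange can be sketched with a plain socket. This is only a rough illustration: the function names `build_request`, `split_response` and `fetch` are hypothetical, and the host name is the one used in the transcripts above.

```python
import socket


def build_request(host, path="/", method="GET"):
    """Build the request typed by hand above: request line, headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",          # mandatory with HTTP/1.1
        "Connection: close",      # return one response then close
        "",
        "",                       # the trailing blank line ends the header
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status line, header dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


def fetch(host, port=80, path="/", timeout=10.0):
    """Send one request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        received = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            received.append(data)
    return split_response(b"".join(received))
```

Calling `fetch("www.bellics.net")` would return the status line, the headers and the page body; inspecting the "server" entry in the header dictionary is a quick way to confirm which Apache build answered.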

Testing through a proxy server (just to show how it is done)

Caveats:

  1. a proxy server is a device employed by large enterprises to protect an intranet from an extranet (the real world internet).
  2. a proxy server is not the same as a router/firewall appliance which usually employs NAT (network address translation)

<ur>	Telnet 192.168.210.220 80					! connect to proxy server on port 80
<sr>	%TCPWARE_TELNET-I-TRYING, trying concealed.ca,http (192.168.210.220,80) ...
	%TCPWARE_TELNET-I-ESCCHR, escape (attention) character is "^\"
<ur>	CONNECT www.bellics.com:80 HTTP/1.1				! connect to node on port 80 using HTTP/1.1
									! blank line ends HTTP header
<sr>	HTTP/1.1 200 Connection established				! proxy has connected
<ur>	GET / HTTP/1.1							!
	host: www.bellics.com						!
	accept: text/html						!
	connection: close						!
									! blank line ends HTTP header
<sr>	HTTP/1.1 200 OK
	Date: Mon, 08 Jun 2009 20:17:47 GMT
	Server: Apache/2.0.52 (OpenVMS) mod_ssl/2.0.52 OpenSSL/0.9.7d
	Last-Modified: Thu, 13 Aug 2009 16:59:51 GMT
	Accept-Ranges: bytes
	Content-Length: 982
	Connection: close
	Content-Type: text/html

	<html>
	<head>
	<title>Integrated Convergence Support Information System</title>

Installation Tip #2 (ODS-5 and the system disk)

Installation Tip #3 (ODS-5 + SSL)

Compaq states that an ODS-5 volume is not required for a non-JAVA installation of SWS, but I've found that the online documentation for "mod ssl User" will be corrupt due to the presence of filenames with multiple dots (which is supported in the optional ODS-5 but not the standard ODS-2). What's worse is that you won't see any error messages during installation to an ODS-2 disk. Because I thought that other things could have become compromised, I decided to only install to ODS-5 volumes.

ODS-5 and DCL

This paragraph has nothing to do with the web servers but I decided to mention it here anyway. If you've enabled ODS-5 then you might notice a few strange changes:

If a file is created by an application program written in a high level language (e.g. BASIC, C, C++, etc.) and the file name was defined using lower case characters, and the file doesn't yet exist, then when the file is created you will see a lower case name in the associated directory. If the file already exists, a new file of the same name and location will match the case of the original file.

This DCL command (in effect by default) will make your interactive session work the ODS-2 way on an ODS-5 system.

$set proc/parse_style=traditional/case_lookup=blind

This command will allow you to make your process case-sensitive (so you can rename a file changing its case):

$set proc/parse_style=extended/case_lookup=sensitive

CAVEAT: If your system has been running for years in case-blind mode, then it would be a real bad idea to place this case-sensitive entry into system file "SYS$MANAGER:SYLOGIN.COM". However, it is a completely different situation if your system is new AND all users will be running in case-sensitive mode. Consult HP OpenVMS System Manager's Manual, Volume 1: Essentials before you make any changes affecting more than one account. I have encountered situations where a user couldn't even log off.

SWS-2.2 (based upon Apache 2.0.63 and OpenSSL 0.9.8h)

HPQ says this is only a maintenance release but I wanted it anyway. Why?

My Apache Tweaks

What's up with favicon.ico ?

Analyzing Apache log files will show a huge number of references to a missing file named favicon.ico
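
Here is a minimal sketch for measuring the problem on your own logs. It assumes the access log uses the Apache Common Log Format and that the missing file is requested as "/favicon.ico"; the function name and the sample lines are hypothetical, not taken from my server.

```python
import re

# Assumption: Common Log Format, e.g.
# 10.0.0.1 - - [27/Nov/2005:20:52:11 -0500] "GET /favicon.ico HTTP/1.1" 404 209
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\b'
)

def count_missing(lines, path="/favicon.ico"):
    """Count requests for `path` that the server answered with 404."""
    hits = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("path") == path and match.group("status") == "404":
            hits += 1
    return hits

if __name__ == "__main__":
    sample = [
        '10.0.0.1 - - [27/Nov/2005:20:52:11 -0500] "GET /favicon.ico HTTP/1.1" 404 209',
        '10.0.0.1 - - [27/Nov/2005:20:52:12 -0500] "GET /index.html HTTP/1.1" 200 1024',
    ]
    print(count_missing(sample))
```

To run it against a real log, pass an open file object instead of the sample list.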

Don't let SSL and IE kill your system
#
# file: [.conf]ssl.conf
#

{ ...snip... }

#
#	this old Apache declaration is no longer 100% true (why treat all IE browsers badly?)
#
#SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
#
#	experimental hack to speed up SSL for IE users - NSR (2012-08-13)
#
#BrowserMatch "MSIE [1-4]" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
#BrowserMatch "MSIE [5-9]" ssl-unclean-shutdown
#
#	experimental hack to speed up SSL for IE users - NSR (2012-08-14)
#
#	note:	"MSIE 1" (above) didn't support SSL so it is superfluous.
#       	Meanwhile, "MSIE 1" would probably break MSIE 10 when it finally appears
#
BrowserMatch ".*MSIE [2-5].*" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
Ruthless Browser Caching (caveat: do not use this code without first testing it on a captive system)

Our site employs AJAX and I just noticed (2012) that certain apps are continually (every minute) pulling back three little gifs. What is worse is that these gifs are sent over the encrypted channel (which adds additional overhead) and the problem is multiplied by the fact that ~120 employees are using the app at the same time. You can't control how a user sets up his browser "cache-wise", but this little tweak has reduced the problem on my system:

Caveats:
  1. I have removed the 2012 hack which didn't work well in 2013 after most of our clients upgraded their browsers from IE7 to IE8
    (why such old browsers when Microsoft is just introducing IE11? Our corporate IT people said so)
  2. The following 2013 hack is for modern browsers and implements ruthless caching which means that once a file is cached, your browser will never check back for changes (cache validation) until expiry. By ending validation tests you will reduce a lot of traffic and will also stop all those "HTTP/1.1 304 Not Modified" messages seen in your Apache access logs.
  3. Sites implementing this kind of caching will run into problems if "software is constantly under development" or "data is constantly changing". The recommended way out is to embed dates (or version numbers) into the filename like these examples:
    1. my_app_name_20131231.css
    2. your_app_name_20131231.js
    3. message_of_the_day_20131231.js
  4. This tweak has been tested to work as-is with Apache/2.0.63 on OpenVMS (examined the Apache logs; examined both client and server packets with WireShark; tested with YSlow)
#
# file: [.conf]httpd.conf
#

[...snip...]

#----------------------------------------------------------------------------------------
# 2013 tweak to force browsers to cache more stuff (NSR 2013-12-16)
#
# notes:
# 0) verified with Apache/2.0.63 on OpenVMS
# 1) Expires requires mod_expires
# 2) Header requires mod_headers
# 3) Apache docs indicate that Expires will set the necessary header fields so keep your
#    header directives to a minimum
# 4) Increasing the amount of meta data sent to the browser will increase the likelihood
#    that your browser will do more cached-data validation tests (switching from ruthless
#    caching to careful caching) so send less
# 5) typing CTRL-R or clicking page reload will force browsers to do cache revalidation
#    (cool)
#----------------------------------------------------------------------------------------
ExpiresActive                           On
#
# By default, cache all files for 30 days after access (A)
# If your system uses dates (or version numbers) in filenames (e.g. logo_20131231.gif)
#       then consider changing ExpiresDefault to one year
#
ExpiresDefault                          A2592000
#
# 24 hour cache (note: "YSlow v2" would prefer you set these to at least 72 hours)
# If your system uses embedded dates (or version numbers) in the filenames
#       (e.g. slide_menu_20131231.js), then do not override ExpiresDefault here
#
ExpiresByType   text/css                A86400
# ExpiresByType text/javascript         A86400 # not for our system (see conf/mime.types)
# ExpiresByType text/x-javascript       A86400 # not for our system (see conf/mime.types)
ExpiresByType   application/javascript  A86400
#
# Do not cache dynamically generated pages (if you don't employ AJAX then you might not
# need these next three lines on YOUR system, but I recommend keeping text/html under 4
# hours so you can detect changes which might require page component name changes, e.g.
# .gif .css .js etc.)
#
ExpiresByType text/html                 now
ExpiresByType text/xml                  now
ExpiresByType text/plain                now
#----------------------------------------------------------------------
#       this will override (if a static file) something set above
#
# note: changes in browser caching policy might make you think this logic is not firing.
# To verify that it is, uncomment the "header set MyHeader" line below to see the
# associated text in the data packet (you will need a packet sniffer like WireShark)
#----------------------------------------------------------------------
# 30 days
<FilesMatch "\.(gif|jpg|jpeg|ico|png|pdf)$">
Header set Cache-Control "public, max-age=2592000"
#       note: 'Header set Pragma "public"' is only used by IE8 and lower
Header set Pragma "public"
#       sending ETags will disable Ruthless Caching
FileETag none
Header unset ETag
#       note: uncomment next line to see this message in the packet (WireShark)
# Header set MyHeader "trigger block #1 (images)"
</FilesMatch>
#----------------------------------------------------------------------
#       this will override (if a static file) something set above
#----------------------------------------------------------------------
# 24 hours
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public, max-age=86400"
#       note: 'Header set Pragma "public"' is only used by IE8 and lower
Header set Pragma "public"
#       sending ETags will disable Ruthless Caching
FileETag none
Header unset ETag
#       note: uncomment next line to see this message in the packet (WireShark)
# Header set MyHeader "trigger block #2 (js,css)"
</FilesMatch>
#----------------------------------------------------------------------
#       this will override (if a static file) something set above 
#----------------------------------------------------------------------
# 4 hours
<FilesMatch "\.(htm|html)$">
ExpiresByType text/html                 A14400
Header set Cache-Control "public, max-age=14400"
#       note: 'Header set Pragma "public"' is only used by IE8 and lower
Header set Pragma "public"
#       sending ETags will disable Ruthless Caching
FileETag none
Header unset ETag
#       note: uncomment next line to see this message in the packet (WireShark)
# Header set MyHeader "trigger block #3 (html)"
</FilesMatch>
#-----------------------------------------------------------------------------------------

{ ...snip... }
Post Script: Rather than get into a religious discussion on client-side web-caching let me just pass on the 10,000 foot view. Client-side caching comes in three basic flavors:
  1. No Caching
    • every page visit or page reload will result in the client (usually a browser) requesting every supporting page component from the server
    • Out-of-the-box web servers tell browsers to use this method. YOU need to adjust YOUR server accordingly.
  2. Careful Caching (the apparent default for modern browsers in 2013 provided you set up your server properly)
    • the first page visit will cause the client to request page components from the server
    • the server will send page components along with caching suggestions (meta data)
    • if the browser sees any caching tips it does not like (these rules appear to be browser specific) then it will do one of the following:
      • not cache the page at all (think of what might happen if you inadvertently expire a page in the past)
      • cache the page then send a freshness validation request before the cached copy is reused locally (hey, has the page changed since last time?). Although these packets are small compared to doing full file transfers, these secondary requests must be processed by a software driven system. This added overhead will be in the way of serving primary requests. You can confirm their existence by inspecting your Apache request logs looking for "HTTP/1.1 304" response messages.
    • an increase in the amount of meta data sent to the browser will increase the likelihood that your browser will engage in Careful Caching (so consider sending less if you want client browsers to resort to Ruthless Caching)
  3. Ruthless Caching
    • the first page visit will cause the client to request page components from the server
    • as long as the client thinks the cached data is fresh (even if it is 30 days old), it will never send a freshness validation test.
    • caveat: now there is an increased likelihood that the browser will contain stale data. The only way out of this situation is to embed dates or version numbers in your file names. Here are some examples:
      • corporate_logo_20131231.gif
      • slide_menu_20131231.js
      • main_page_20131231.css
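
The dated-filename convention is easy to automate in a build or publishing step. Here is a minimal sketch, assuming the stamp goes just before the extension in YYYYMMDD form as in the examples above; the helper name versioned_name is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath

def versioned_name(filename, stamp):
    """Return `filename` with a _YYYYMMDD stamp inserted before the extension."""
    path = PurePosixPath(filename)
    return f"{path.stem}_{stamp.strftime('%Y%m%d')}{path.suffix}"

if __name__ == "__main__":
    for name in ("corporate_logo.gif", "slide_menu.js", "main_page.css"):
        print(versioned_name(name, date(2013, 12, 31)))
```

Remember that every page referencing the file must be updated to the new name, which is why text/html itself should not be cached ruthlessly.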

Mission Impossible (not)

Your mission, should you decide to accept it, is to use various hacker tools along with your wits to implement Ruthless Caching. Your system will appear much faster because primary transactions (user requests) will not be blocked by secondary transactions (freshness validations); your network costs will drop; your users will think you are a genius; your boss will give you a raise; women will offer to sleep with you; you will be master of your domain. (note: these superlatives were meant to attract the attention of web spiders like Googlebot )

More Suggestions:
  1. Your continuing mission is to make sure every new iteration of a half-dozen browsers continues to work properly with your solution.
    Caveat: watch out for clients using IE8, IE9 or IE10 who have inadvertently set their browsers (via the F12 key or some other Compatibility View mechanism) back into IE7 mode.
  2. You will need to check this functionality every time you upgrade Apache.
  3. Check your Apache logs weekly. If you don't have time to read them line-by-line then just inspect their daily sizes (a sudden jump will indicate something has changed; could be hackers; could be people are using browsers you haven't tested, etc.) 
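
The "inspect their daily sizes" check can be scripted. This is only a sketch under stated assumptions: you have already collected one (day, size) pair per daily log, and "sudden jump" means at least double the previous day. Both the function name and the threshold are hypothetical, so tune them to your own traffic.

```python
def sudden_jumps(daily_sizes, factor=2.0):
    """Return (day, size) pairs whose size is >= factor times the previous day's size."""
    flagged = []
    for (_, prev_size), (day, size) in zip(daily_sizes, daily_sizes[1:]):
        if prev_size > 0 and size >= factor * prev_size:
            flagged.append((day, size))
    return flagged

if __name__ == "__main__":
    sizes = [
        ("2013-12-01", 1000),
        ("2013-12-02", 1100),
        ("2013-12-03", 5000),   # worth a closer look
        ("2013-12-04", 4800),
    ]
    print(sudden_jumps(sizes))
```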
Sensible Browser Caching
#----------------------------------------------------------------------------------------
# 2013 tweak to force browsers to cache more stuff (NSR 2013-12-06)
#----------------------------------------------------------------------------------------
# notes:
# 0) verified with Apache/2.0.63 on OpenVMS
# 1) Expires requires mod_expires
# 2) Header requires mod_headers
# 3) Specs say Expires is for older browsers whilst max-age=value is for newer browsers
#    (you can never tell when browser vendors will change browser caching policies)
# 4) typing CTRL-R or clicking page reload forces browsers to do cache revalidation so
#    test caching (after the initial page load) by pasting a URL into a new browser tab
# 5) Our system contains mixed-case filenames. (ugh!)
#----------------------------------------------------------------------------------------
#
# 30 days
<FilesMatch "\.(ico|gif|png|jpg|jpeg|pdf|flv|swf|ICO|GIF|PNG|JPG|JPEG|PDF|FLV|SWF)$">
ExpiresActive On
ExpiresDefault A2592000
Header set Pragma "public"
Header set Cache-Control "public, max-age=2592000"
Header unset ETag
FileETag none
</FilesMatch>
#
# 24 hours
<FilesMatch "\.(xml|txt|html|htm|js|css|XML|TXT|HTML|HTM|JS|CSS)$">
ExpiresActive On
ExpiresDefault A86400
Header set Pragma "public"
Header set Cache-Control "public, max-age=86400"
Header unset ETag
FileETag none
</FilesMatch>
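
The numbers in these directives are plain second counts: the "A" prefix in the Expires values means "seconds after access", and max-age takes the same number. Rather than doing the arithmetic by hand, here is a small sketch that derives both values from one lifetime; the helper name cache_directives is hypothetical.

```python
from datetime import timedelta

def cache_directives(lifetime):
    """Return the ExpiresDefault and Cache-Control values for one cache lifetime."""
    seconds = int(lifetime.total_seconds())
    return {
        "ExpiresDefault": f"A{seconds}",
        "Cache-Control": f"public, max-age={seconds}",
    }

if __name__ == "__main__":
    for label, lifetime in (("30 days", timedelta(days=30)),
                            ("24 hours", timedelta(hours=24)),
                            ("4 hours", timedelta(hours=4))):
        print(label, cache_directives(lifetime))
```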
Better Apache config (for corporate use on an intranet)
#
# file: [.conf]httpd.conf
#

{ ...snip... }

#
#	enable HTTP/1.1 keepalives so clients do not open/close on every page component
#
KeepAlive		On

#
#	more efficient than the default of 15 (should never exceed 60 seconds)
#
KeepAliveTimeout	60

#
#	more efficient than the default of 100
#
MaxKeepAliveRequests	999

#
#	since Apache only grows once per second, start off with MaxSpareServers
#
StartServers		10
MinSpareServers		5
MaxSpareServers		10

#
#	you will never have more server processes than MaxClients
#
# If you set MaxClients too large then you might run out of VMS process slots when
# hackers use Apache to probe our system (I have seen this happen).
# Consider using SYSGEN to increase MaxProcessCnt when you increase MaxClients
#
MaxClients		100

#	MaxRequestsPerChild defaults to 0 (which is good when there are no memory leaks)
#
MaxRequestsPerChild	999

#
# send less text on the server line (affects every web page, and page component)
# this:
#	Server: Apache/2.0.63 (OpenVMS) mod_ssl/2.0.63 OpenSSL/0.9.8w
# becomes this:
#	Server: Apache/2.0.63
#
ServerTokens		Min

[...snip...]
Changes to OpenVMS SYSGEN (for Apache)

Apache needs resources for interprocess communications (VMS mailboxes are like UNIX pipes)

legend:	<sr> = system response
	<ur> = user response
-------------------------------------------------------------------------
	recording your current sysgen settings to a file
<sr>	$
<ur>	def/user sys$output sysgen_20131031.txt	! output will be diverted
<sr>	$
<ur>	mcr sysgen
<sr>	SYSGEN>
<ur>	sho /all				! output goes to file
<sr>	SYSGEN>
<ur>	exit
<sr>	$
<ur>	type sysgen_20131031.txt
<sr>	...file contents are displayed...
-------------------------------------------------------------------------
	making changes to your running system (DANGER)
<sr>	$
<ur>	mcr sysgen
<sr>	SYSGEN>
<ur>	sho maxbuf
<sr>	Parameter Name Current Default    Min.    Max.  Unit Dynamic
        -------------- ------- ------- ------- -------  ---- -------
        MAXBUF            8192    8192    4096   64000 Bytes       D
<ur>	set maxbuf 64000
<sr>	SYSGEN>
<ur>	set defmbxbufquo 64000
<sr>	SYSGEN>
<ur>	set defmbxmxmsg 64000
<sr>	SYSGEN>
<ur>	write current
<sr>	SYSGEN>
<ur>	write active
<sr>	SYSGEN>
<ur>	exit
<sr>	$
--------------------------------------------------------------------------
	ensure you add these overrides to file sys$system:modparams.dat
Webpage hit-counters (2013-11-29)
Webmasters have always been told to never use hit-counters because they consume precious resources. But many rinky-dink sites, especially some on corporate intranets sitting behind a firewall, need them for good P.R. with other departments.

Image-based Counters

These seem to be the gold-standard in counters. Between 1995 and 2000 (when web servers were powerful platforms serving up mostly static webpages to underpowered desktop PCs sporting Pentium-2 or Pentium-3 processors) most sites used a bit of freeware written by Muhammad Muquit named "count". Needless to say, I am envious of his programming skills. But this solution places a more-than-trivial computational burden on the server. Why? Updating a count-file is the easy part so no problem here. However, using the count-file digits to reference a library of individual digit graphics, then assembling binary slices scan-line by scan-line into a resultant GIF, is the harder part.

Needless to say, Muquit's program works on many web server flavors, including those for VMS and OpenVMS such as CSWS (Apache), OSU HTTPd, Purveyor, and WASD. Some of those distributions can be found here
Text-based Counters (2013-11-xx)

I'm running an overworked decade-old server (an AlphaServer DS20e installed in 2002) and now think the time has come to shift the computational burden from the server to the client's browser. I have a little "C" program ready which increments the counter file then sends back the plain-text result. The calling webpage uses a small amount of AJAX (~20 lines) to send the increment request, receive the plain-text count, then inject the value into the browser's DOM for rendering. The browser may not have access to all the cool image libraries seen in Muhammad's offering but that may not be as big a deal as you might think. At least your boss (and user community) will get instant feedback.

On my systems, most users can't tell one font from another so they never noticed anything other than a faster response.
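
The server side of a text-based counter is tiny. The following is not my "C" program, just a Python sketch of the same idea under stated assumptions: the count lives in a small text file, and a missing or corrupt file is treated as zero. The function name is hypothetical, and the sketch does no file locking, so two simultaneous hits could lose an increment.

```python
from pathlib import Path

def increment_counter(counter_file):
    """Add one to the count stored in `counter_file` and return it as plain text."""
    path = Path(counter_file)
    try:
        count = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        count = 0          # assumption: start over if the file is missing or unreadable
    count += 1
    path.write_text(str(count))
    return str(count)      # this string is what the AJAX caller would receive
```

The calling page would then place the returned text into the DOM, as described above.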

Freeing Up Disk Space / Log Rotation

Note: this will aid in analyzing your Apache log files so you can improve your web server system. Failure to do this places all your transactions in one huge file.

<sr>	$
<ur>	sh time
<sr>	27-NOV-2005 20:52:11
	$ 
<ur>	dir [...]*.*/siz=all/date/sel=siz=min=10000
<sr>	Directory APACHE$COMMON:[000000.SPECIFIC.KAWC15.LOGS]

	ACCESS_LOG.;1        1061068/1061095  16-SEP-2003 19:48:46.00
	ERROR_LOG.;1          455636/455700   16-SEP-2003 19:48:44.50
	SSL_ENGINE_LOG.;1     238230/238280   16-SEP-2003 19:48:44.56

	$
<ur>	del APACHE$COMMON:[000000.SPECIFIC.KAWC15.LOGS]*_log*.*;*

Wow! These files have been growing for 26 months

Notes:

  1. These files grow forever. I have not been able to get "built-in Apache log rotation" working on OpenVMS (it would have added overhead to Apache anyway) so it is probably a good idea to stop the server every month to delete the files. I use a script and do it on Sunday mornings at 1:00 AM.

    -or-
  2. this might be a better method (you do not need to stop the server)
    • start by creating 31 subdirectories named [.log01] to [.log31] which would hold log files for each day of the month
    • run a batch job every night (perhaps at one second after midnight) to do the following:
      1. rename the files into the desired subdirectory while they are open (this is legal in OpenVMS and writing will continue in the new location)
      2. now execute this job (a DCL command template for day 31)
        $ set def sys$common:[000000.specific.www.logs]	!
        $ ren *_log*.* [.log31]				! okay to do this while files are open
        $ @APACHE$COMMON:[000000]APACHE$SETUP		! create Apache symbols for our process
        $ httpd -k flush ! tell Apache to flush the log buffers to disk
        $ httpd -k new ! tell Apache to close current files then open new ones
    -or-
  3. you could just use this automated DCL script    <<<---***
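
For method 2, the only thing that changes from night to night is the target subdirectory. Here is a sketch of that calculation. One assumption is mine and not stated above: a job running just after midnight files the logs under the previous day's number, since that is the day the entries belong to. The function name is hypothetical, and the output is only the text of the rename command from the template.

```python
from datetime import date, timedelta

def rotation_target(run_date):
    """Return (subdirectory, DCL rename command) for a job run just after midnight."""
    log_day = run_date - timedelta(days=1)   # assumption: logs belong to yesterday
    subdir = f"[.log{log_day.day:02d}]"
    return subdir, f"$ ren *_log*.* {subdir}"

if __name__ == "__main__":
    print(rotation_target(date(2014, 1, 1)))
```

If you would rather file the logs under the run date itself, drop the timedelta subtraction.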

Neil Rieck
Kitchener - Waterloo - Cambridge, Ontario, Canada.