SDSU CS 596 Client-Server Programming
CGI

San Diego State University -- This page last updated February 22, 1996

Contents of CGI Lecture

  1. CGI
    1. Uses of CGI
    2. CGI: the protocol
      1. Trivial CGI program
    3. MIME types and CGI
    4. HTML Forms
      1. Environment variables
      2. Notable CGI variables
      3. Example QUERY_STRING
  2. Java vs. CGI
    1. jcgi
    2. Java properties
    3. sdsu.CGI
  3. Web applications
    1. Class schedule web application
    2. Command dispatching
  4. Issues with web applications
  5. CGI programs for non-HTML

CGI

CGI = Common Gateway Interface

Standard for interfacing external applications with information servers.

Most common application: HTTP servers

Why CGI?

Static pages served by HTTP servers are boring...

CGI allows for dynamic generation of web documents.

Good CGI reference material at

http://hoohoo.ncsa.uiuc.edu/cgi/


Uses of CGI


Some examples of dynamic documents at SDSU:



Why a dynamic SDSU homepage?


CGI: the protocol


Basic steps in the life of a CGI program:

  1. Web server gets request
  2. Web server figures out that the request is for a document that it knows is a CGI program
  3. Web server builds an environment with several special purpose variables
  4. Web server starts the CGI program in this environment
  5. CGI program interprets the environment variables
  6. CGI program sends document type information to STDOUT
  7. CGI program sends generated document to STDOUT
  8. CGI program quits.
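
The server side of steps 3 and 4 can be sketched in Java. This is only an illustration under assumptions, not how any particular web server is implemented: the class CgiLauncher and its method names are hypothetical, and only three of the environment variables are set.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of what a web server does in steps 3 and 4.
class CgiLauncher {

    // Step 3: build the special purpose variables for one request.
    static Map<String, String> buildEnvironment(String method,
                                                String pathInfo,
                                                String queryString) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("REQUEST_METHOD", method);
        env.put("PATH_INFO", pathInfo);
        env.put("QUERY_STRING", queryString);
        return env;
    }

    // Step 4: start the CGI program in that environment, then collect
    // everything it writes to STDOUT (steps 6 and 7).
    static byte[] run(String program, Map<String, String> env)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(program);
        builder.environment().putAll(env);
        // Let error messages go to the server's own STDERR.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();
        // This sketch sends no request body, so close the program's STDIN.
        process.getOutputStream().close();

        ByteArrayOutputStream output = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                output.write(buffer, 0, n);
            }
        }
        process.waitFor(); // Step 8: the CGI program has quit.
        return output.toByteArray();
    }
}
```

A server would then relay the returned bytes to the browser, for example after calling run() with the path of a CGI program and the map from buildEnvironment("GET", "", "someText=hello").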



Trivial CGI program


Here is possibly the most trivial CGI program that can be written under Unix:

#!/bin/sh
echo "Content-type: text/plain"
echo ""
echo "Hello, World"


If this program were to be referenced by a web browser, the resulting page would show just

Hello, World


at the top left corner.

Things to note:


MIME types and CGI


When a CGI program creates a dynamic document, it has to tell the client what the document type is.

Some MIME types:


Most of the time text/html is used
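
As a rough sketch, the output of a CGI program is the type line, an empty line, and then the document. The helper class CgiResponse below is hypothetical and only mirrors what the shell example does with echo:

```java
// Hypothetical helper: builds the complete output of a CGI program.
class CgiResponse {

    // The first line names the MIME type; an empty line separates
    // it from the document itself.
    static String build(String mimeType, String document) {
        return "Content-type: " + mimeType + "\n\n" + document;
    }

    public static void main(String[] args) {
        // Same greeting as the shell example, but sent as HTML.
        System.out.print(build("text/html",
            "<html><body><h1>Hello, World</h1></body></html>\n"));
    }
}
```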


HTML Forms


Besides generating documents dynamically, the main use of CGI is processing HTML forms.

Sample HTML form:

<form action="/cgi-bin/doSomething">
<input type="text" name="someText">
<input type="submit" value="Enter">
</form>


This will produce a text field and a button labeled "Enter".

The action attribute specifies the CGI program that gets run when the button is clicked.


Environment variables


The CGI program does NOT get any information from its command line.

Environment variables are used:


Notable CGI variables


REQUEST_METHOD
This is either "GET" or "POST"
The difference is in how form values are retrieved.

QUERY_STRING
If the REQUEST_METHOD is "GET", this is a list of name/value pairs separated by `&'.
Each name is separated from its value by `='.

CONTENT_LENGTH
If the REQUEST_METHOD is "POST", this contains the number of bytes of form data available on STDIN. That data is encoded, and must be interpreted, the same way as the QUERY_STRING data above.

PATH_INFO
Is used to pass extra information that was encoded in the URL that started the CGI program.
This is normally in the form of values separated by `/'.
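
A minimal sketch of how a program might pick up the form data for either request method. It assumes a Java runtime where System.getenv() is available and the program is started directly by the server; the class name FormData is made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper that fetches the still-encoded form data.
class FormData {

    static String read(Map<String, String> env, InputStream stdin) {
        if ("POST".equals(env.get("REQUEST_METHOD"))) {
            // POST: exactly CONTENT_LENGTH bytes are waiting on STDIN.
            int length = Integer.parseInt(env.get("CONTENT_LENGTH"));
            byte[] data = new byte[length];
            int offset = 0;
            try {
                // read() may return fewer bytes than asked for, so loop.
                while (offset < length) {
                    int n = stdin.read(data, offset, length - offset);
                    if (n == -1) {
                        break; // input ended early
                    }
                    offset += n;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return new String(data, 0, offset, StandardCharsets.US_ASCII);
        }
        // GET: the name/value pairs arrive in QUERY_STRING.
        String query = env.get("QUERY_STRING");
        return query == null ? "" : query;
    }

    public static void main(String[] args) {
        System.out.println("Content-type: text/plain");
        System.out.println();
        System.out.println(read(System.getenv(), System.in));
    }
}
```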


Example QUERY_STRING


If our CGI program was called in response to the following HTML form:

<form action="/cgi-bin/doSomething">
<input type="text" name="someText">
<input type="text" name="someMoreText">
<input type="submit" value="Enter">
</form>


and the user had entered "hello there" in the first text field and "Goodbye" in the second, the QUERY_STRING would be:

someText=hello+there&someMoreText=Goodbye


Note that the QUERY_STRING data is URL encoded: characters that would confuse the syntax (such as `&', `=' and `%') are encoded as a `%' followed by two hex digits.
In addition, all spaces in values are replaced with `+'.
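
To make the encoding rules concrete, here is a small hand-written decoder. It is a sketch only: the class name UrlDecoding is hypothetical, and treating the decoded bytes as UTF-8 is an assumption, since the notes do not name a character set.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder for one URL-encoded name or value.
class UrlDecoding {

    static String decode(String encoded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '+') {
                bytes.write(' ');              // `+' stands for a space
            } else if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int high = Character.digit(encoded.charAt(i + 1), 16);
                int low = Character.digit(encoded.charAt(i + 2), 16);
                if (high < 0 || low < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                bytes.write(high * 16 + low);  // two hex digits make one byte
                i += 2;
            } else {
                bytes.write(c);                // encoded data is plain ASCII
            }
        }
        // Assumption: the decoded bytes are UTF-8 text.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this sketch, decode("hello+there") gives back the text typed into the first field of the form above.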


Java vs. CGI


Problems with using a Java program as a CGI program:


Solution:

jcgi

jcgi is a little C program that takes care of these problems:


jcgi


To use the jcgi program, create a symbolic link to the jcgi executable and name the link after the class that contains main().

Example:

import java.io.PrintStream;

public class TestThis
{
  public static void main(String a[])
  {
    PrintStream out = System.out;
    // Document type, then an empty line, then the document
    out.println("Content-type: text/plain");
    out.println("");
    out.println("Hello, World");
  }
}

% ls
TestThis.java    TestThis.class
% ln -s /opt/local/lib/java/jcgi TestThis.cgi
%


The .cgi extension is used so that the program will be seen as a CGI program by the web server on moria.


Java properties


Once we have a Java CGI program, we want to access the information passed to it from a form.

The relevant environment variables are available as properties in the Java program.

Use System.getProperty() to get these.

For example:

String queryString = System.getProperty("QUERY_STRING");

Once the query string is available, we can use the StringTokenizer class to split it up into a set of name-value pairs.

Each name-value pair can then be split up into the name and value and placed into a Hashtable for easy access.
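
The splitting described above might look roughly like this. The class name QueryParser is hypothetical, and decoding with java.net.URLDecoder as UTF-8 is an assumption. Each pair is split with indexOf() rather than a second tokenizer so that an empty value (as in "name=") is not lost.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Hashtable;
import java.util.StringTokenizer;

// Hypothetical helper: turns a query string into a Hashtable.
class QueryParser {

    static Hashtable<String, String> parse(String queryString) {
        Hashtable<String, String> values = new Hashtable<>();
        // Name-value pairs are separated by `&'.
        StringTokenizer pairs = new StringTokenizer(queryString, "&");
        while (pairs.hasMoreTokens()) {
            String pair = pairs.nextToken();
            // The name and the value are separated by the first `='.
            int equals = pair.indexOf('=');
            String name = equals < 0 ? pair : pair.substring(0, equals);
            String value = equals < 0 ? "" : pair.substring(equals + 1);
            values.put(decode(name), decode(value));
        }
        return values;
    }

    // Undo the URL encoding (assumption: the text is UTF-8).
    private static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

For the example form, parse(queryString).get("someText") would then return the text of the first field.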


sdsu.CGI


The sdsu.CGI class will take care of all these things.

The notable method is

String get(String), which returns the value for a given name.

To use sdsu.CGI in a program you can do the following:

import sdsu.CGI;
import java.io.PrintStream;
public class CGITest
{
  public static void main(String a[])
  {
    PrintStream out = System.out;
    CGI cgiVariables = new CGI();
    out.println("Content-type: text/plain");
    out.println("");
    out.println("The text was: " +
               cgiVariables.get("someText"));
  }
}


Web applications


All the CGI stuff is pretty neat, but how can it really be used?

The WWW browser together with a CGI program can be seen as the GUI to an application.

Notable issues here:


This state information needs to be mostly hidden from the user:


Class schedule web application

The class schedule is a CGI program located at http://www.sdsu.edu/cgi-bin/schedule

To get to the Summer 96 schedule, the following URL is used:

http://www.sdsu.edu/cgi-bin/schedule/semester=summer96

The CGI program will be run with PATH_INFO set to

/semester=summer96

This information is then used by the CGI program to select the database to use.

All links from that point on will use that URL as the base, so that the operations are performed on that semester.

For example if we browse the courses offered in the Biology department we get the following URL:

http://www.sdsu.edu/cgi-bin/schedule/browse=dept/command=search/dept=BIOL/semester=summer96
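
A program could take a PATH_INFO of this shape apart along the following lines. This is a sketch with a hypothetical class name (PathInfo); the name=value convention inside each segment is the one the schedule URLs appear to use, not something CGI itself requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits PATH_INFO into name/value pairs.
class PathInfo {

    static Map<String, String> parse(String pathInfo) {
        Map<String, String> values = new LinkedHashMap<>();
        if (pathInfo == null) {
            return values;
        }
        // Values are separated by `/'.
        for (String segment : pathInfo.split("/")) {
            if (segment.isEmpty()) {
                continue; // the leading `/' produces an empty first segment
            }
            int equals = segment.indexOf('=');
            if (equals < 0) {
                values.put(segment, "");
            } else {
                values.put(segment.substring(0, equals),
                           segment.substring(equals + 1));
            }
        }
        return values;
    }
}
```

Calling parse("/semester=summer96").get("semester") would then give the name of the semester to look up.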


Command dispatching


Since a web application will probably have several different things it can do, there needs to be some sort of command in the URL that calls the CGI program.

Either PATH_INFO or hidden form elements can be used for this.

The main program will then need to dispatch to different code depending on the command.

In C or Perl this can easily be done by using a table that maps commands to functions.
In Java this can be done by mapping commands to objects.
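
One possible way to map commands to objects in Java is sketched below. The names Command and Dispatcher, the "command" parameter and the "search" command are all made up for illustration, and the lambda syntax assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Each command the application understands is represented by an object.
interface Command {
    String run(Map<String, String> parameters);
}

// Hypothetical dispatcher: a table that maps command names to objects.
class Dispatcher {
    private final Map<String, Command> commands = new HashMap<>();

    void register(String name, Command command) {
        commands.put(name, command);
    }

    // Looks up the object for the requested command and runs it.
    String dispatch(Map<String, String> parameters) {
        Command command = commands.get(parameters.get("command"));
        if (command == null) {
            return "Unknown command";
        }
        return command.run(parameters);
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register("search", p -> "Searching " + p.get("dept"));

        // In a real program these would come from PATH_INFO or the form.
        Map<String, String> parameters = new HashMap<>();
        parameters.put("command", "search");
        parameters.put("dept", "BIOL");

        System.out.println("Content-type: text/plain");
        System.out.println();
        System.out.println(dispatcher.dispatch(parameters));
    }
}
```

Adding a new operation then only means registering one more object; the main program does not change.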


Issues with web applications


There are several things to be aware of when designing web applications:


CGI programs for non-HTML


As we have seen, the first thing a CGI program needs to do is to advertise what type of document it is going to send.

The access counter that Mark Boyns wrote uses this feature to create a dynamic GIF image when it is run.

The CGI program keeps track of the number of times it was called for that particular page and then it creates the image.

Look at http://www.sdsu.edu/~boyns/counter.html for more information on this program.
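
As a rough illustration of sending a non-HTML document, the sketch below writes the type line followed by raw image bytes. It is not the access counter itself: counting and drawing the image are left out, the class name ImageResponse is hypothetical, and the GIF is simply read from a file named on the command line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: builds a CGI response that carries binary data.
class ImageResponse {

    static byte[] build(String mimeType, byte[] data) {
        // The type line and the empty line are plain ASCII text.
        byte[] header = ("Content-type: " + mimeType + "\n\n")
            .getBytes(StandardCharsets.US_ASCII);
        // The document follows as raw bytes, with no text conversion.
        byte[] response = new byte[header.length + data.length];
        System.arraycopy(header, 0, response, 0, header.length);
        System.arraycopy(data, 0, response, header.length, data.length);
        return response;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path of an existing GIF file.
        byte[] gif = Files.readAllBytes(Paths.get(args[0]));
        byte[] response = build("image/gif", gif);
        System.out.write(response, 0, response.length);
        System.out.flush();
    }
}
```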