CGI = Common Gateway Interface
Standard for interfacing external applications with information servers.
Most common application: HTTP servers
Why CGI?
Static pages served by HTTP servers are boring...
CGI allows for dynamic generation of web documents.
Good CGI reference material at
http://hoohoo.ncsa.uiuc.edu/cgi/
Some examples of dynamic documents at SDSU:
Why a dynamic SDSU homepage?
Basic steps in the life of a CGI program:
Here is possibly the most trivial CGI program that can be written under
Unix:
#!/bin/sh
echo "Content-type: text/plain"
echo ""
echo "Hello, World"
If this program were to be referenced by a web browser, the resulting page
would show just
Hello, World
at the top left corner.
Things to note:
When a CGI program creates a dynamic document, it has to tell the client what
the document type is.
Some MIME types:
Most of the time text/html is used
Besides generating documents dynamically, CGI is mainly used to process
HTML forms.
Sample HTML form:
<form action="/cgi-bin/doSomething">
  <input type="text" name="someText">
  <input type="submit" value="Enter">
</form>
This will produce a text field and a button labeled "Enter".
The action attribute specifies the CGI program that gets run when the button is
clicked.
The CGI program does NOT get any information from its command line.
Environment variables are used:
REQUEST_METHOD
This is either "GET" or "POST"
The difference is in how form values are retrieved.
QUERY_STRING
If the REQUEST_METHOD is "GET", this is a list of name/value pairs
separated by `&'.
Each name is separated from its value by `='.
CONTENT_LENGTH
If the REQUEST_METHOD is "POST", this contains the length of data
available on STDIN. This input then needs to be interpreted the same as the
data from QUERY_STRING above.
PATH_INFO
Is used to pass extra information that was encoded in the URL that
started the CGI program.
This is normally in the form of values separated by `/'.
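The rules above for locating the form data can be sketched in Java as follows. This is only a minimal illustration: the class and method names (FormData, readRaw) are hypothetical, and the environment values are passed in as plain arguments so the sketch makes no assumption about how a particular server or launcher exposes them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: returns the raw, still URL-encoded form data.
class FormData {
    static String readRaw(String requestMethod, String queryString,
                          String contentLength, InputStream stdin) {
        if ("POST".equals(requestMethod)) {
            // POST: CONTENT_LENGTH bytes of form data are waiting on STDIN.
            int length = Integer.parseInt(contentLength.trim());
            byte[] buffer = new byte[length];
            int read = 0;
            try {
                while (read < length) {
                    int n = stdin.read(buffer, read, length - read);
                    if (n < 0) break; // input ended early
                    read += n;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return new String(buffer, 0, read, StandardCharsets.ISO_8859_1);
        }
        // GET: the form data arrives in QUERY_STRING.
        return queryString == null ? "" : queryString;
    }

    public static void main(String[] args) {
        InputStream body = new ByteArrayInputStream(
                "someText=hi".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readRaw("POST", null, "11", body));
        System.out.println(readRaw("GET", "someText=hello+there", null, null));
    }
}
```

Either way the result is the same kind of string, so the rest of the program can parse it without caring which method was used.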
If our CGI program was called in response to the following HTML form:
<form action="/cgi-bin/doSomething">
  <input type="text" name="someText">
  <input type="text" name="someMoreText">
  <input type="submit" value="Enter">
</form>
and the user had entered "hello there" in the first text field and "Goodbye" in
the second, the QUERY_STRING would be:
someText=hello+there&someMoreText=Goodbye
Note that the QUERY_STRING data will be URL encoded, meaning that characters
that would confuse the syntax need to be encoded as a `%' followed by two hex
digits.
In addition, all spaces in values are replaced with `+'.
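As a small illustration of the decoding step, the standard java.net.URLDecoder class reverses both rules. The class name DecodeDemo is hypothetical, and the Charset overload of decode() used here assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes one URL-encoded name or value: '+' becomes a space and
    // %XX becomes the character with that hex code.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("hello+there")); // prints "hello there"
        System.out.println(decode("a%26b%3Dc"));   // prints "a&b=c"
    }
}
```

Decoding should happen only after the string has been split on `&' and `=', since a decoded value may itself contain those characters.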
Problems with using a Java program as a CGI program:
Solution:
jcgi
jcgi is a little C program that takes care of these problems:
To use the jcgi program, you need to create a symbolic link that points to the
jcgi executable and is named after the class that contains main().
Example:
import java.io.PrintStream;

public class TestThis {
    public static void main(String a[]) {
        PrintStream out = System.out;
        // The header and the blank line must come before the document body.
        out.println("Content-type: text/plain");
        out.println("");
        out.println("Hello, World");
    }
}
% ls
TestThis.java  TestThis.class
% ln -s /opt/local/lib/java/jcgi TestThis.cgi
%
The .cgi extension is used so that the program will be seen as a CGI program by
the web server on moria.
Once we have a Java CGI program, we want to access the information passed to
it from a form.
The relevant environment variables are available as properties in the Java
program.
Use System.getProperty() to get these.
For example:
String QueryString = System.getProperty("QUERY_STRING");
Once the query string is available, we can use the StringTokenizer class to
split it up into a set of name-value pairs.
Each name-value pair can then be split up into the name and value and placed
into a Hashtable for easy access.
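A minimal sketch of that parsing step is shown here. The class name QueryParser is hypothetical, and this is not the code of any existing library. It assumes Java 10 or later for the Charset overload of URLDecoder.decode().

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Hashtable;
import java.util.StringTokenizer;

class QueryParser {
    // Splits "a=1&b=2" into a table of decoded names and values.
    static Hashtable<String, String> parse(String queryString) {
        Hashtable<String, String> values = new Hashtable<>();
        if (queryString == null) return values;
        StringTokenizer pairs = new StringTokenizer(queryString, "&");
        while (pairs.hasMoreTokens()) {
            String pair = pairs.nextToken();
            // indexOf is used for the inner split so that an empty
            // value ("name=") is kept instead of being skipped.
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            values.put(decode(name), decode(value));
        }
        return values;
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Hashtable<String, String> form =
                parse("someText=hello+there&someMoreText=Goodbye");
        System.out.println(form.get("someText"));     // prints "hello there"
        System.out.println(form.get("someMoreText")); // prints "Goodbye"
    }
}
```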
The sdsu.CGI class will take care of all these things.
The notable method is
String get(String) which will get the value for a name.
To use sdsu.CGI in a program you can do the following:
import sdsu.CGI;
import java.io.PrintStream;

public class CGITest {
    // main() must be static, or the class cannot be started as a program.
    public static void main(String a[]) {
        PrintStream out = System.out;
        CGI cgiVariables = new CGI();
        out.println("Content-type: text/plain");
        out.println("");
        out.println("The text was: " + cgiVariables.get("someText"));
    }
}
All the CGI stuff is pretty neat, but how can it really be used?
The WWW browser together with a CGI program can be seen as the GUI to an
application.
Notable issues here:
This state information needs to be mostly hidden from the user:
The class schedule is a CGI program located at
http://www.sdsu.edu/cgi-bin/schedule
To get to the Summer 96 schedule, the following URL is used:
http://www.sdsu.edu/cgi-bin/schedule/semester=summer96
The CGI program will be run with PATH_INFO set to
/semester=summer96
This information is then used by the CGI program to select the database to
use.
All links from that point on will use that URL as the base, so that the
operations are performed on that semester.
For example if we browse the courses offered in the Biology department we get
the following URL:
http://www.sdsu.edu/cgi-bin/schedule/browse=dept/command=search/dept=BIOL/semester=summer96
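How the schedule program itself decodes PATH_INFO is not shown here, but one plausible approach is to split the path on `/' and then split each segment on `='. The sketch below does this. The class name PathInfoParser is hypothetical, and the name=value segment layout is assumed from the URLs above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PathInfoParser {
    // Turns "/name=value/name=value" into an ordered map.
    static Map<String, String> parse(String pathInfo) {
        Map<String, String> values = new LinkedHashMap<>();
        if (pathInfo == null) return values;
        for (String segment : pathInfo.split("/")) {
            if (segment.isEmpty()) continue; // the leading '/' yields an empty segment
            int eq = segment.indexOf('=');
            if (eq < 0) {
                values.put(segment, "");
            } else {
                values.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> state =
                parse("/browse=dept/command=search/dept=BIOL/semester=summer96");
        System.out.println(state.get("semester")); // prints "summer96"
        System.out.println(state.get("dept"));     // prints "BIOL"
    }
}
```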
Since a web application will probably have several different things it can do,
there needs to be some sort of command in the URL that calls the CGI
program.
Either PATH_INFO or hidden form elements can be used for this.
The main program will then need to dispatch to different code depending on the
command.
In C or Perl this can easily be done by using a table that maps commands to
functions.
In Java this can be done by mapping commands to objects.
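One way to map commands to objects is sketched below. The Command interface, the Dispatcher class, and the command names "search" and "list" are all hypothetical and chosen only for illustration. The lambdas assume Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: each command object produces a page body.
interface Command {
    String run(Map<String, String> params);
}

class Dispatcher {
    private final Map<String, Command> commands = new HashMap<>();

    void register(String name, Command command) {
        commands.put(name, command);
    }

    // Looks up the object registered for the command name and runs it.
    String dispatch(String name, Map<String, String> params) {
        Command command = commands.get(name);
        if (command == null) {
            return "Unknown command: " + name;
        }
        return command.run(params);
    }

    // Builds a dispatcher with two illustrative commands.
    static Dispatcher example() {
        Dispatcher d = new Dispatcher();
        d.register("search", params -> "Searching department " + params.get("dept"));
        d.register("list", params -> "Listing all departments");
        return d;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("dept", "BIOL");
        System.out.println(example().dispatch("search", params));
        System.out.println(example().dispatch("bogus", params));
    }
}
```

Adding a new operation then means registering one more object. The main program does not change.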
There are several things to be aware of when designing web applications:
As we have seen, the first thing a CGI program needs to do is to advertise what
type of document it is going to send.
The access counter that Mark Boyns wrote uses this feature to create a dynamic
GIF image when it is run.
The CGI program keeps track of the number of times it was called for that
particular page and then it creates the image.
Look at http://www.sdsu.edu/~boyns/counter.html for more information on this
program.