by Michael Erwin
So far you've seen several FORM-based interactions and a few uses for IMAGE MAPs. Although you already have a good start on understanding what makes CGI tick, you need to look at one area that seems to be overlooked a lot on the Web: the end user and how to use CGI scripts to interact with this unknown variable.
As you'll see in this chapter, the more invisible a CGI script is, the better off you'll be. You can make your scripts more invisible by making the interaction with them intuitive. By making your scripts and the associated interfaces intuitive, the end user will move seamlessly within your Web site, thus increasing the flow of work and knowledge.
After all, isn't this one of the reasons you bought this book? You wanted to go beyond HTML and create true client/server interactivity on your Web site.
In this chapter, you'll learn about the following:
You've already seen CGI interaction through basic HTML forms in Chapter 8, "Modifying CGI Scripts," in such scripts as Guestbook and WWWBoard (see figs. 17.1 and 17.2). However, all of us should strive to take this interaction further. How? By enabling multiple users to interact with each other through a CGI script.
Figure 17.1 : Here's the Add Entry HTML Interface to the guestbook CGI script covered in Chapter 8.
Figure 17.2 : Shown here is some of the HTML output from Matt Wright's WWWBoard CGI script.
With Guestbook and WWWBoard, you started allowing other users to do some kind of interaction with each other. In the Guestbook example, users could post simple messages and leave their e-mail addresses, which allowed your users to send notes to one another. In WWWBoard, you took this example a little further, enabling users not only to read the postings of others, but also to reply to the original posting without starting an e-mail program. Doing this gives you a repository of related information within the message threads. This also adds significant value to the archived information.
This allowed your users, in a fairly simple manner, to interact with many other users through the CGI script. However, the communication between users can be taken even further with CGI. What if you could give your users a way to communicate with others on your Web site in pseudo real time? This would provide additional flexibility that some groupware designers only wish their products had.
WWW Interactive Talk (WIT) is an HTML forms-based discussion system
that's very similar in most cases to the way Lotus Notes can be
used for group discussion and comments. (This kind of software
is also referred to as groupware.) It was created to allow
individuals to comment on various areas within a fairly structured
environment. It's a way that you can provide an HTML page that
others can append their comments to, so individuals immediately
see whether a specific matter has been brought to the surface
before it's resolved.
NOTE |
For more information on WIT, check out http://www.w3.org/hypertext/WWW/WIT/. WIT is also included on the CD-ROM accompanying this book. |
This is a far different approach than what happens in Usenet newsgroups or mailing list servers. In fact, it's far superior for group discussions, such as workgroups. It can handle the process of discussion in a manner that many managers would appreciate. Instead of a drawn-out process, in which everyone must go read the FAQ of that area and follow threads, WIT items are divided into three groups: discussion areas, topic documents, and proposals.
Unlike newsgroups, everyone participating in a WIT forum must follow a few rules for everything to work the way it should. First, a discussion area can be created only by the system manager. For example, as the system administrator, I created a discussion area called CGI Discussion Area. This will be the area in which we discuss items related to CGI and, of course, can cross-reference related discussion areas. The CGI discussion area might be cross-referenced to the Using Perl for CGI area. Get the idea? Figure 17.3 shows a typical discussion area.
Figure 17.3 : The WIT user information documentation can be found at http://www.w3.org/.
Under each discussion area is a slightly more specific area called topics (see fig. 17.4). Now the way the CGI script is written, anyone can create a topic document under any of the discussion areas. Here's where the rules come in:
An example of a topic would be, "What should we do about secure Web transactions?" This would be a great topic to be discussed under the CGI Discussion Area, and maybe "What language should we use for CGI?" would also fit.
After you create the basic discussion areas and users start putting topics for discussion within the topic area, the next type of document falls into place: proposals. In this section, people post their ideas for addressing a topic or problem (see fig. 17.4). For this to work with the other areas, these proposals are posed as statements with which users can either agree or disagree, such as the following:
"We should use SSL."
"We need a new, secure Web server, such as Netscape."
"We should restrict access to some parts of the Web site."
By using such a system as WWW Interactive Talk, the knowledge accumulated over time can be referenced and modified at any time. You can look through the sections and specific documents to see whether a matter has already been resolved. For example, the work flow in a CGI discussion area would go something like the following:
"What should we do about secure Web transactions?"
"We should use SSL."
"Can we afford a new secure Web server, such as Netscape?"
"We should restrict access to some parts of the Web site."
"What language should we use for CGI?"
That's the progression of the simple guest book in a workgroup
organizational system. Such a system gives your Web site added
value because of the information stored on it. And while we're
here, one other environment that this type of CGI application
would easily fit into is a corporate intranet Web server. Then
you've given management a way to follow ideas and processes from
conception to implementation, both easily and affordably.
NOTE |
An intranet comes about when a company uses Internet technology, such as Web servers and CGI, on the corporate LAN. Access to intranet services is limited to workstations inside the company via the LAN or even a wide area network (WAN). Because so many companies use TCP/IP as their network protocol of choice, especially if they're in a WAN environment, they can use an Internet firewall and Web server configurations to keep outsiders out. All the CGI scripts in this book can be used in intranet environments. Again, you see the use of widely used, affordable Internet technology to replace proprietary and sometimes very costly communication software. |
Another way of enabling human-to-human interaction is simple communication. One of the most popular features of the Net is Internet Relay Chat (IRC), which allows many users in various locations to chat with each other. Think of IRC as a text-based telephone party line through which many people are sometimes talking about the same general thing. Although IRC is as close to real-time human-to-human interaction as is currently available, the communication is very unorganized. Information that has been discussed is lost unless one of those in the conversation has been capturing the flow of text.
Chat systems are an area where CGI scripts have faded a little because of the dedicated IRC client software that has become increasingly available on the Net. This isn't to say that chat systems implemented on a Web server are less functional; in fact, I would say just the opposite is true. HTML and CGI have given many people a great form of entertainment. Not only can they attach small pictures of themselves to the text they write, they can even create other little HTML worlds in which to chat with one another (see fig. 17.5).
Before I get too deep into CGI chat scripts, I feel I should cover the area of performance of these types of CGI- and HTML-based systems. Because of the nature of real-time chat, you have to update the client's browser by using Netscape's extensions of PUSH/PULL or have the user keep clicking some sort of update button.
The way these scripts work is quite simple. Think of it this way: A user receives an HTML document like the one in figure 17.5. At the bottom of the screen the user can type a message along with a user name of some kind. When the user submits the HTML form, a CGI script takes the inputted message, appends it to a chat file, and normally deletes the first message in the file. It then composes the HTML header for the page and inserts an HTML version of the chat file. Then the CGI script composes the bottom of the HTML file while passing information, such as the user's nickname, back into the form.
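The flow just described can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the code of any real chat package: the function names, the one-line-per-message chat file, and the chat.cgi form target are all hypothetical, and the sketch does no file locking. The META tag is the client-pull refresh mentioned earlier.

```shell
#!/bin/sh
# Sketch only: function names, file layout, and chat.cgi are hypothetical.
# No file locking is done, so simultaneous posts could collide.

# append_message FILE NICK TEXT [MAX]
# Add one "nick: text" line, then keep only the newest MAX lines.
append_message() {
    chat_file=$1; nick=$2; text=$3; max=${4:-10}
    printf '%s: %s\n' "$nick" "$text" >> "$chat_file"
    tail -n "$max" "$chat_file" > "$chat_file.tmp" &&
        mv "$chat_file.tmp" "$chat_file"
}

# render_page FILE NICK [SECONDS]
# Print the CGI response: header, chat lines, and a form carrying NICK back.
# NICK is assumed to be already checked for quotes and markup.
render_page() {
    chat_file=$1; nick=$2; refresh=${3:-30}
    echo "Content-type: text/html"
    echo ""
    echo "<HTML><HEAD><TITLE>Chat</TITLE>"
    # Client pull: ask the browser to reload the page after REFRESH seconds
    echo "<META HTTP-EQUIV=\"Refresh\" CONTENT=\"$refresh\"></HEAD><BODY><PRE>"
    # Escape &, <, and > so a message cannot inject markup
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' "$chat_file"
    echo "</PRE><FORM METHOD=\"POST\" ACTION=\"chat.cgi\">"
    echo "<INPUT TYPE=\"hidden\" NAME=\"nick\" VALUE=\"$nick\">"
    echo "<INPUT TYPE=\"text\" NAME=\"message\">"
    echo "<INPUT TYPE=\"submit\" VALUE=\"Chat\"></FORM></BODY></HTML>"
}

# Example run against a scratch file
chat=$(mktemp)
append_message "$chat" mikee "Hello, anyone here?" 10
append_message "$chat" guest "Just <lurking>" 10
render_page "$chat" mikee 30
rm -f "$chat"
```

Keeping only the newest lines with tail is what "deletes the first message" amounts to once the file is full; the limit of 10 is an assumed default.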
It's a pretty simple program flow, except that it's a huge resource hog! Why? Because every user has to get the newly updated screen. Suppose that 50 people are using this chat system. That means the server will need to send the HTML document 50 times so that everyone will get the updated HTML document. It also means that you have to put HTML extensions in the document to cause the users' browsers to request the chat document every so often or to have them keep reloading the page. And you would need to do both because not everyone is going to be using a PUSH/PULL compatible browser. However, those who are using a compatible browser are going to be screaming, "Why don't you use PUSH/PULL technology?"
So if you add PUSH/PULL, how long are you going to wait for the browser to PULL the next update? Five, 10, 20 or more seconds? For the sake of argument, you set the META header to pull the document every 30 seconds. That means your server hits will be adding up at the rate of 100 per minute, and that doesn't include graphic images. If three of the 10 or so chat file messages have associated graphics with them, you've just added an additional 300 hits per minute on your server, for a total of 400 hits per minute. That works out to be 24,000 hits an hour.
All of this is contingent on updating the file only every 30 seconds. This gets even wilder: only 10 messages are in the chat file at any given time. This means that no more than 20 or so messages can be entered every minute, or else everyone is going to miss a few messages every refresh. On top of that, the average file, including graphics files, will be around 4K in size. This leads to another problem. You know you'll have 24,000 hits an hour. So if you multiply 24,000 hits by 4K of data, you're going to use up 96,000,000 bytes, or 96M, of bandwidth an hour (1.6M per minute). A T-1 runs at 1.544 megabits per second, which is roughly 11.6M per minute at best, so this one small chat room claims about 14 percent of the circuit before any other traffic is served. Shorten the refresh interval, add larger graphics, or double the number of users, and it's time to be adding another T-1 data circuit, or no one will be happy.
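The hit and byte counts above are easy to rerun with your own numbers. The following sketch simply repeats the chapter's arithmetic in shell; the variable names are made up for illustration, and 4K is taken as 4,000 bytes to match the 96,000,000-byte figure.

```shell
#!/bin/sh
# Back-of-the-envelope load estimate using the figures from the text.
users=50          # people in the chat room
refresh=30        # seconds between client pulls
images=3          # graphics fetched along with each page
avg_bytes=4000    # average size of one hit ("4K")

pulls_per_min=$(( 60 / refresh ))
page_hits=$(( users * pulls_per_min ))        # page requests per minute
image_hits=$(( page_hits * images ))          # image requests per minute
hits_per_min=$(( page_hits + image_hits ))
hits_per_hour=$(( hits_per_min * 60 ))
bytes_per_hour=$(( hits_per_hour * avg_bytes ))
bytes_per_min=$(( bytes_per_hour / 60 ))

echo "page hits per minute:  $page_hits"
echo "image hits per minute: $image_hits"
echo "hits per hour:         $hits_per_hour"
echo "bytes per hour:        $bytes_per_hour"
echo "bytes per minute:      $bytes_per_min"
```

Changing refresh to 10 or users to 100 shows how quickly the totals climb.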
Is it unrealistic to imagine 50 people using a chat system? Maybe a little; it might be even higher. Due to the nature of the Web server, there also is no way of reliably limiting the users. So always do the math on some of these little projects. It doesn't take long for things to get out of hand. Remember two things when calculating bandwidth requirements. First, the IP bandwidth numbers are finite. Second, speed costs money. How fast do you want to go?
One of the nicer chat systems available is WebChat. This CGI-based system is very flexible for most Webmasters. It's written in Perl, so it should be fairly easy to modify to suit your specific needs.
This CGI chat system consists of a couple of GIF images, two Perl scripts, and an HTML form interface. You can FTP the archived tar file from ftp://ftp.webchat.org/pub/webchat, or from the CD-ROM accompanying this book. One nice feature of this system is that all the popular Web browsers can be used to interface with it because this system uses only an HTML form for the interface (see fig. 17.6).
Another feature of WebChat is the capability to link to images
anywhere in the world. This takes an unnecessary burden off the
Web server because it allows the client's browser to get the file
from someone else's Web server. The downside of this is that it
may take a while for the requested corresponding image to be returned
to your browser.
NOTE |
This area of CGI scripts-chat systems-is becoming more and more commercialized. In fact, many of the publicly available CGI software packages are becoming commercial. WebChat even has a bigger brother that isn't cheap; however, the "commercial" version does do some impressive things, like WebChatCam. Some CGI-based chat systems are selling for well over $1,000 for a 10-user license. I have a hard time justifying that, unless it's going to be used for corporate Web sites. |
After you retrieve the software, you'll need to untar the archive into the cgi-bin directory on your Web server. Now, this version works only on UNIX-based Web servers. Another version is available that might be modifiable to run on a Windows NT Web server with Perl.
After following the installation directions included with the distribution tar file, the CGI scripts will need only a small amount of modification to be able to use them. Mostly, this will be the editing of paths and executables. It shouldn't take you any more than 30 minutes to get this system up and running.
After you install the CGI scripts, you can test the system by loading the URL of the chat form. Enter information into the HTML form and submit it. You should be sent another HTML form similar to that shown in figure 17.7.
NOTE |
If you're using Netscape's browser, you'll be glad to know that the WebChat system of CGI scripts uses client PULL to get the new updated HTML document. If your browser doesn't support Netscape's PULL feature, click the chat button to update your page. |
Now that you've installed a highly interactive CGI system, what can you really use this type of chat system for?
No matter what you use a chat system for, I'm sure you'll find even more uses for it than cited here.
One way you can make your CGI scripts more interactive is by using something fairly new to the world called magic cookies or just cookies.
So what are these cookies? They're just small text files stored on the client side of the Web. That means you can actually have your CGI scripts make a cookie, and then have your Web server send this information to the client's browser. When the client's browser gets the information, it will store the data on the client's hard drive. Then at a later date, when the client revisits your Web site and uses a CGI script that requests this cookie, the client's browser will look to see whether it has the requested cookie. If it does, the browser will send the information stored in the cookie.
There's a possible downside to using cookies. Currently, only Netscape, Netcruiser v3.0, Microsoft Internet Explorer, and Quarterdeck's Mosaic v2.0 browsers support using cookies. So you'll probably have to make sure that your CGI script is going to be compatible with the other browsers in the world. This shouldn't be a problem, though, if you require the users of your service to use one of these browsers or if you're using cookies on an intranet Web site where the company regulates what browser software is running within the company, and the company chooses to use a "cookie"-compatible browser.
Compared with using CGI to build a custom HTML form that has hidden input data for forms, cookies have a much greater prospective use. You could use cookies to support a CGI-based shopping system in which the customers' selected items are put into a virtual shopping cart, which is really stored in the cookie.
For other services, such as those that require registration, you could store your users' registration information in a cookie so that when they return to the service, a CGI script can check to see whether they already have an appropriate cookie. If they do, you could have the CGI script retrieve it from the client side and use the cookie data to build a custom HTML interface. That would seem to the users as though the service already knew who they were. And if a client has rights to only certain features of the service, your CGI script would already have that information.
Think about it this way: The client needs to fill out a registration form only once. This information is stored on the client side rather than on your Web server in some huge data file, which will become unmanageable. Talk about behind-the-scenes invisible CGI user interaction!
You could even use cookies as a kind of virtual coupon. This could be a little incentive for users to fill out a questionnaire. After the form is filled out the way you wanted, you could give users virtual coupons to be redeemed for some type of Web-based service. In fact, you could even set an expiration date so that if a client didn't use the cookie/coupon by a certain date, it would be void.
That's enough about what you could use cookies for; I'm sure you've even thought up a couple of other uses for them.
A cookie is made up of several items: a NAME=VALUE pair, an expiration date, a path, a domain, and a secure flag. This information is actually sent in the HTTP header of a document. The format for a cookie is as follows:
Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure
To break this format down, Set-Cookie: tells the client's browser that a cookie is getting ready to be handed to it. The next attribute is the cookie's name. This name can be anything you want and, of course, the value associated with this name can also be anything, such as NAME_OF_BAKERY=Torlones or ITEM_NUMBER=CC295. There's a limit to how much you can put into a NAME and the associated VALUE. You're limited to 4K of data. That should handle just about anything you'll need. This is also the only required attribute of a Set-Cookie: header for Netscape. However, Microsoft's Internet Explorer requires a full cookie header.
The next attribute of a cookie is expires=DATE. When this expiration date is reached, the client's browser will delete the associated cookie and no longer give it out. The following is an example of expires=DATE:
Set-Cookie: USERID=Michael_T_Erwin; expires=Tuesday, 31-Dec-96 23:59:59 GMT
In this cookie, my stored USERID name will no longer
be valid after 11:59:59 p.m. GMT Tuesday, December 31, 1996. This
cookie will expire at that point, and the browser won't send it
out.
TIP |
If you need to use spaces in the stored value of the cookie, use %20. For example, if I wanted the USERID to actually be Michael T Erwin without the underlines within the value, I could have written the following: Set-Cookie: USERID=Michael%20T%20Erwin; |
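A script can apply the %20 substitution automatically rather than by hand. Here is a minimal sketch; the function name is hypothetical, and encoding semicolons, commas, and percent signs as well is an added assumption, made because those characters would otherwise collide with the cookie syntax or with the escapes themselves.

```shell
#!/bin/sh
# encode_cookie_value STRING
# Percent-encode the characters a cookie value cannot carry literally.
# "%" goes first so the escapes added afterwards are not encoded twice.
encode_cookie_value() {
    printf '%s\n' "$1" |
        sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/;/%3B/g' -e 's/,/%2C/g'
}

# prints: Set-Cookie: USERID=Michael%20T%20Erwin
echo "Set-Cookie: USERID=$(encode_cookie_value 'Michael T Erwin')"
```

The script that later reads the cookie has to reverse the same substitutions before using the value.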
The path attribute can get a little confusing, so bear with me. It tells the browser what directories are valid for this cookie, as follows:
Set-Cookie: USERID=mikee; path=/bbs
This tells the browser that any time it requests a URL from the site and the URL is below /bbs, send the cookie, USERID=mikee, to the Web server. For example, if you requested /bbs/mainmenu.html from the Web server with the request for the document mainmenu.html, it would have also sent USERID=mikee. What's more, it also would send USERID=mikee if the URL had been /bbsdocs/index.html because you told it that the cookie is valid for any URL using the path /bbs.
Now if you had specified path=/, any URL you requested
from the cookie's originating Web site would cause the browser
to send the cookie to the Web server with the request for any
URL at that site. If you hadn't specified a path, the cookie would
be sent only if the directory was the same as the originating
URL.
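The path rule described above is a plain prefix comparison, which is why /bbsdocs also qualifies for a cookie set with path=/bbs. The sketch below mimics the browser's decision in shell; the function name is hypothetical and it models only the path test, not the domain or expiration checks.

```shell
#!/bin/sh
# path_matches COOKIE_PATH REQUEST_PATH
# Succeeds when REQUEST_PATH begins with COOKIE_PATH (simple prefix match).
path_matches() {
    case "$2" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

for url in /bbs/mainmenu.html /bbsdocs/index.html /index.html; do
    if path_matches /bbs "$url"; then
        echo "$url: send cookie"
    else
        echo "$url: no cookie"
    fi
done
```

If you want /bbs but not /bbsdocs, set the attribute to path=/bbs/ so the trailing slash becomes part of the prefix.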
CAUTION |
There's a bug in Netscape Navigator version 1.1 and earlier. Cookies that have the path attribute set to / will be saved only if they have an expires attribute. |
The domain attribute tells the browser what domain names this cookie is valid for. If you set domain=.mcp.com, the browser would send that cookie to any of the Web servers at mcp.com. However, this also depends on the contents of the path attribute.
Another related issue with domain is that only hosts within the
same domain may set cookies to be used within that domain. To
carry this even further, you have to have at least two periods
in the domain attribute. This prevents someone from doing something
lame like domain=.com, and if you use a regional type
domain name, such as .k12.wv.us, you need to have three
periods in the domain attribute.
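The two domain rules, a minimum number of periods and a match against the tail of the host name, can be checked with a short shell sketch. The function names are hypothetical, and passing the required period count as a third argument (2 by default, 3 for regional domains) is a simplification; a browser would decide that from the top-level domain itself.

```shell
#!/bin/sh
# count_dots STRING: print how many periods STRING contains.
count_dots() {
    dots=$(printf '%s' "$1" | tr -cd '.')
    echo "${#dots}"
}

# domain_allowed COOKIE_DOMAIN SETTING_HOST [MIN_DOTS]
# Succeeds when the domain has enough periods and the host setting the
# cookie ends with that domain (tail match).
domain_allowed() {
    min=${3:-2}
    [ "$(count_dots "$1")" -ge "$min" ] || return 1
    case "$2" in
        *"$1") return 0 ;;
        *)     return 1 ;;
    esac
}

domain_allowed .mcp.com www.mcp.com && echo ".mcp.com from www.mcp.com: ok"
domain_allowed .com www.mcp.com || echo ".com: rejected, too few periods"
domain_allowed .mcp.com www.example.org || echo ".mcp.com from www.example.org: rejected"
domain_allowed .k12.wv.us school.k12.wv.us 3 && echo ".k12.wv.us: ok"
```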
NOTE |
If a browser is requesting a URL that meets the criteria of several stored cookies, it will send every cookie that meets the domain and path criteria with the URL request. That results in the Web server receiving a Cookie header like this: Cookie: NAME=VALUE; NAME=VALUE; NAME=VALUE;... |
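On the server side, a CGI script receives that Cookie header in the HTTP_COOKIE environment variable, following the CGI convention of prefixing request headers with HTTP_. A minimal sketch of pulling one value back out might look like this; the function name is hypothetical, and the sample value is assigned by hand only so the example runs outside a Web server.

```shell
#!/bin/sh
# get_cookie NAME
# Print the value of cookie NAME from $HTTP_COOKIE, or nothing if absent.
get_cookie() {
    printf '%s\n' "${HTTP_COOKIE:-}" |
        tr ';' '\n' |          # one NAME=VALUE pair per line
        sed 's/^ *//' |        # drop the space that follows each semicolon
        while IFS='=' read -r name value; do
            if [ "$name" = "$1" ]; then
                printf '%s\n' "$value"
                break
            fi
        done
}

# Sample data for a stand-alone run; a real server sets this variable.
HTTP_COOKIE='UserId=mikee; Password=guess'
echo "UserId is $(get_cookie UserId)"
```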
So how do you set a cookie in your CGI scripts? Well, you'll need to have a section of script that looks something like listing 17.1. In this example of a UNIX shell script, the CGI script creates the HTML header information and then sends the cookie, which is then followed by the rest of the HTML document.
Listing 17.1 A UNIX Shell Script for Sending the Cookie
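#!/bin/sh
# CGI response headers: content type first, then one Set-cookie line per cookie
echo "Content-type: text/html"
echo "Set-cookie: UserId=mikee; expires=Wednesday, 31-Jan-96 12:00:00 GMT"
echo "Set-cookie: Password=guess; expires=Wednesday, 31-Jan-96 12:00:00 GMT"
# A blank line ends the headers; the HTML document follows
echo ""
echo "<HTML><HEAD><TITLE>Welcome to WWW BBS</TITLE>"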
The CGI script in listing 17.1 is hard coded. That means you have to write a new shell script for every cookie you want to send, and you don't want that to happen. First, you must decide what information you need to put in the form of a cookie. After you do that, you need to decide where this information is to come from. Is the information going to be generated on the Web server, from an HTML form the client filled out, or possibly both?
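One way around the hard coding is to move the header format into a small function and pass in the pieces. The sketch below assumes a hypothetical set_cookie helper that takes the attributes in the order of the format shown earlier; empty arguments are simply left out of the header.

```shell
#!/bin/sh
# set_cookie NAME VALUE [EXPIRES] [PATH] [DOMAIN] [secure]
# Print one Set-Cookie header, skipping any attribute left empty.
set_cookie() {
    header="Set-Cookie: $1=$2"
    if [ -n "${3:-}" ]; then header="$header; expires=$3"; fi
    if [ -n "${4:-}" ]; then header="$header; path=$4"; fi
    if [ -n "${5:-}" ]; then header="$header; domain=$5"; fi
    if [ "${6:-}" = "secure" ]; then header="$header; secure"; fi
    echo "$header"
}

# Same output shape as the hard-coded listing, built from arguments
echo "Content-type: text/html"
set_cookie UserId mikee "Wednesday, 31-Jan-96 12:00:00 GMT" /bbs
set_cookie ITEM_NUMBER CC295
echo ""
echo "<HTML><HEAD><TITLE>Welcome to WWW BBS</TITLE></HEAD></HTML>"
```

The values could just as easily come from form input or a server-side lookup as from literals; the function does not care where they originate.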
Look at figure 17.8. This flow chart shows how the user will interact with a simple CGI registration service using cookies as the form of authentication. The first step is to request an URL, which is really a front door to the service. To keep things simple, make this HTML document a combination of items (see fig. 17.9).
When the client submits the form shown in figure 17.9, a CGI script is started to process the information. This CGI application will take the information the user inputted into the HTML form and generate several cookies, including one you won't tell the user about. The CGI script generates cookies for Userid and Password, and then sets the user's initial security or access level. It will then send a thank you and a welcome HTML document (see fig. 17.10). What your user doesn't know is that, at this point, he has just received three different cookies.
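The registration step can be sketched as a function that takes the already-decoded form fields and answers with three cookies plus the welcome page. Everything named here is an assumption for illustration: the register_user function, the AccessLevel cookie name, the starting level of 1, and the /bbs path. Decoding the submitted form is left out to keep the sketch short.

```shell
#!/bin/sh
# register_user USERID PASSWORD EXPIRES
# Print the CGI response for a new registration: three cookies and a
# welcome page. AccessLevel is the cookie the user is not told about.
register_user() {
    user=$1; pass=$2; expires=$3
    echo "Content-type: text/html"
    echo "Set-Cookie: UserId=$user; expires=$expires; path=/bbs"
    echo "Set-Cookie: Password=$pass; expires=$expires; path=/bbs"
    echo "Set-Cookie: AccessLevel=1; expires=$expires; path=/bbs"
    echo ""
    echo "<HTML><HEAD><TITLE>Welcome to WWW BBS</TITLE></HEAD>"
    echo "<BODY><H1>Thank you for registering, $user.</H1></BODY></HTML>"
}

register_user mikee guess "Tuesday, 31-Dec-96 23:59:59 GMT"
```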
Now when a user reloads the welcome screen from his hotlist, the browser will notice that there are three cookies for this URL. It will then send the three cookies with the request to load the URL. When the Web server gets the request, it sees that it's to start up a CGI script.
The CGI script will actually look at the cookies' contents to see whether the user has rights to access this page. The CGI script will also add an entry to the logs for this visit.
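A rough sketch of that check follows. The helper names, the AccessLevel cookie, and the log line format are hypothetical, and the sketch takes the cookie at face value as the text describes; because cookies live on the client, a production service would want to confirm the level against its own records.

```shell
#!/bin/sh
# cookie_value NAME: print cookie NAME from $HTTP_COOKIE, or nothing.
cookie_value() {
    printf '%s\n' "${HTTP_COOKIE:-}" | tr ';' '\n' | sed 's/^ *//' |
        while IFS='=' read -r name value; do
            if [ "$name" = "$1" ]; then printf '%s\n' "$value"; break; fi
        done
}

# check_access REQUIRED_LEVEL LOGFILE
# Succeeds, and logs the visit, when the cookies name a user whose
# AccessLevel is at least REQUIRED_LEVEL.
check_access() {
    required=$1; log=$2
    user=$(cookie_value UserId)
    level=$(cookie_value AccessLevel)
    # Treat a missing or non-numeric level as 0
    case "$level" in
        ''|*[!0-9]*) level=0 ;;
    esac
    if [ -n "$user" ] && [ "$level" -ge "$required" ]; then
        echo "$(date -u '+%Y-%m-%d %H:%M:%S') $user level=$level" >> "$log"
        return 0
    fi
    return 1
}

# Sample data for a stand-alone run; a real server sets HTTP_COOKIE.
HTTP_COOKIE='UserId=mikee; Password=guess; AccessLevel=2'
log=$(mktemp)
if check_access 2 "$log"; then echo "access granted"; else echo "access denied"; fi
rm -f "$log"
```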
Users no longer need to worry about their registration information because it just became seamless to them. As the administrator, you see not only the Web hit, but you can look at the logs to see who's actually using the system. This gives users the freedom of not worrying about passwords and such.
You also get the capability to increase or decrease your user's security level because you've included it in a cookie. This creates a nice flexible system that's easily navigated by the client and manageable by the Webmaster.
As I stated before, more and more software is becoming commercial. One of the better commercial cookie-based software packages available on the Web is OopShop Shopping Cart System (see fig. 17.11). Because the system is being developed for commercial accounts, expect to pay around $500 for it.
Figure 17.11 : The OopShop home page is located at http://www.ids.net/~oops.
Jerry Yang, author of this system, has even released a smaller version of the software that he published under the GNU General Public License, version 2, which is included on the CD-ROM accompanying this book. He calls this software OopShop Free Cart. For more information on this cookie-based system, check out http://www.ids.net/~oops/tech/make-cookie.html.