Chapter 3 Security and Encryption

by Billy Barron

CONTENTS

Security
Firewalls
Encryption and Digital Signatures
Summary

This chapter touches on a variety of issues and technologies to help you maintain security and privacy for yourself and your clients while programming. You learn what to watch out for when writing your Internet programs. Additionally, the chapter provides some background material on security and data encryption. You don't learn specific details of how to deal with every security and encryption issue. It would be impossible to cover everything in a single chapter; whole books have been written about firewalls and encryption alone. I attempt to refer you to other resources, such as other books or Web sites, to get more information.

Security

When you are programming on the Internet, never skip security. People really do crack Internet systems. If you are on the Internet long enough, someone will try to break into your system. Even the U.S. Department of Justice has had its Web site broken into. By taking appropriate steps to secure your systems and your programs, you (hopefully) will be able to avoid break-ins. Take the word of experience: The time spent up front is well worth it. A single security incident can easily eat up weeks of your time and potentially even millions of dollars if your systems are critical or have confidential information.

If setting up security is going to take an insane amount of time and money, you need to do a risk assessment to determine whether you should go ahead and do it. In a risk assessment, you weigh the likelihood of a security breach and the cost of such a breach to your organization (remembering to include indirect factors such as loss of trust by your clients and related lost business) against the cost of securing your system. Often, this analysis makes the answer obvious.
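The comparison in a risk assessment comes down to simple arithmetic. The following Python sketch is only an illustration: the function names and all the figures are hypothetical, and a real assessment would have to estimate the probabilities with far more care.

```python
def expected_annual_loss(breach_probability, direct_cost, indirect_cost):
    """Expected yearly loss from a breach: likelihood times total cost."""
    return breach_probability * (direct_cost + indirect_cost)


def worth_securing(breach_probability, direct_cost, indirect_cost,
                   security_cost, residual_probability=0.0):
    """Return True when the security work costs less than the loss it avoids."""
    before = expected_annual_loss(breach_probability, direct_cost, indirect_cost)
    after = expected_annual_loss(residual_probability, direct_cost, indirect_cost)
    return security_cost < (before - after)


# Hypothetical figures: a 10% yearly chance of a breach costing 400,000 in
# cleanup plus 600,000 in lost business, against 50,000 of security work
# that cuts the chance to 1%.
print(worth_securing(0.10, 400_000, 600_000, 50_000, residual_probability=0.01))
```

With these made-up numbers the security work avoids roughly 90,000 of expected loss for 50,000 of spending, so the answer is to go ahead. Note that the indirect cost is larger than the direct one, which is why leaving it out can reverse the conclusion.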

NOTE

No security system is ever 100 percent secure. This doesn't mean that you should just give up on security. Your goal is to secure your system by spending less money on security than a break-in would cost to fix.

People might try to break into your system or program for a variety of reasons. Some do it for fun. Others, unfortunately, do it to steal information or damage your machines. Therefore, security is almost always important. This is especially true of your Internet programs. Because they are likely one of the first things that will be attacked, spend some time thinking about the security implications of any code you are going to deploy. If you are an Internet novice, show your design and code to an Internet expert to have him verify that your program appears to be secure.

NOTE

Although this chapter focuses on Internet security, you need to remember that your Internet programs should try to deal with security on other fronts. In particular, do your best to defend against attacks from the other employees at your workplace. More security breaches are caused by employees than by crackers out on the Internet. Even if you think you can trust your coworkers, keep the number of people who have access to bypass the security mechanisms to an absolute minimum. This strategy improves security and accountability.

General Internet Security

When you look at Internet security, you must look from the bottom of the protocol stack (the physical wire) all the way up to end-user applications. Any level can be attacked; therefore, every level needs some security mechanisms in place. This section mainly covers the potential weaknesses of the security mechanisms, so that you can be expecting them and possibly can deal with them in your Internet programs.

One common kind of Internet attack is to eavesdrop on packets as they cross the network, whether on the Internet or your local LAN. Networking people use the term sniffing instead of eavesdropping, however. Just about any computer can be turned into a sniffing device. The scariest thing is that any cracker who becomes root on your UNIX systems can use your UNIX box from her remote cracking site to see packets on your network. By doing this, she can gain more passwords to get into even more machines.

NOTE

Throughout this chapter, you will notice that the word cracker is used instead of hacker. The term hacker has two meanings. Originally, it was (and still is in many circles) a positive term, meaning "someone who plays with the internal workings of a system." The media then came along and twisted it to mean "someone who breaks into computer systems." Many people have recommended the term cracker for the second definition. It's a good fit, as a safe cracker is comparable to a computer cracker.

Besides preventing the cracker from getting root on your system, you can do a few things to minimize the threat of this attack:

Don't use Ethernet networks for your LAN if security is your number one goal because eavesdropping on an Ethernet is easy to do. However, Ethernet is a very good, inexpensive technology for LANs in general.
If you do use Ethernet, the best thing to do from a security standpoint is to use switching hubs instead of repeater hubs. The advantage is that a single machine on a switched hub doesn't see traffic on the network that isn't supposed to go to that machine.
The final approach is to encrypt traffic as it goes across your LAN or the Internet. This is very labor-intensive and shouldn't be done lightly. (You will learn much more about encryption later in this chapter.)

One organization that can potentially help you is CERT (Computer Emergency Response Team). CERT makes announcements about what bugs exist in Internet-related software and where to acquire patches to solve the bug problems. If patches aren't available, it usually at least suggests some workarounds. It also maintains some security tools on its FTP site. You might want to take a look before starting on Internet programming to make sure that the tools you use in your development are secure. Its URL is http://www.cert.org/.

Web Security

The Web can be fairly secure, completely insecure, or anywhere in between, depending on how it's configured. Because different Web servers have different security mechanisms, the types of Web security available can range widely from system to system. Most of the popular Web servers (for example, NCSA, Apache, and Netscape Communications/Commerce Server) offer reasonably good security if configured correctly.

One of the security options controls whether someone can upload files via the PUT command. I strongly recommend turning off PUT unless you absolutely need it and fully understand what it's doing. You can still get information via the Common Gateway Interface (CGI). CGI is another optional feature to which you might want to restrict access on the server. Poorly written CGI programs can create large security holes, as discussed later in this chapter.

If you need advanced security, you can run one of the Web servers that offers encryption capabilities, such as Apache/SSL, Open Market Secure Web Server, NCSA, or Netscape FastTrack Server. It won't solve all of your security problems, but it definitely can help, especially against packet sniffing attacks.

General Programming Security

Over the 20-plus years that the Internet has been in existence, many security bugs have arisen in Internet programs. The same types of security bugs, however, tend to crop up over and over again. For example, the same kind of security bug has shown up in the finger daemon, sendmail, Gopher, and the NCSA Web server. Another set of similar bugs has appeared in sendmail, Gopher, and many CGI scripts. In other words, in Internet programming, history does repeat itself.

Preventing Buffer Overflow

The most common bug in Internet programs involves buffers. Internet programs must be prepared to accept input of any length; the mistake that programmers commonly make is to assume that the input has a maximum length and then write it into a fixed-length buffer, even when the input exceeds that maximum. This overflow can be stopped automatically if your compiler inserts code into the executable to make sure that buffer (array) boundaries aren't crossed. However, in C, the traditional Internet programming language, array boundaries aren't checked.

A cracker can exploit this problem by sending more input than the buffer can hold. The excess input then can overwrite whatever lies in memory beyond the end of the buffer, including parts of the running program. By this overwriting, the cracker then can have the target system execute his own code. The hard part of this attack for the cracker is that he must change his attack for every different type of machine, if not every version of a particular operating system.

To prevent this problem, you must make sure that your buffers don't overflow under any circumstance. The first solution is to always dynamically allocate your buffers as needed. Some languages (for example, Perl) can automatically do this for you. Another approach is to look at the size of the input; if it's too long, you treat it as an error or you truncate it. Each of these methods works if implemented correctly, but you need to decide which is the best method for your application.
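The second approach (check the length and treat oversized input as an error) can be sketched as follows. The example is in Python, which, like Perl, grows its buffers automatically, so it illustrates only the length check itself. The names read_request_line, InputTooLong, and MAX_LINE, as well as the 512-byte limit, are hypothetical choices for this sketch.

```python
import io

MAX_LINE = 512  # hypothetical limit, counted including the line terminator


class InputTooLong(Exception):
    """Raised when a client sends a line longer than the limit."""


def read_request_line(stream, limit=MAX_LINE):
    """Read one line, refusing anything longer than `limit` bytes."""
    # readline(n) never returns more than n bytes, so the read is bounded
    # no matter how much data the client sends.
    line = stream.readline(limit + 1)
    if len(line) > limit:
        raise InputTooLong("request line exceeds %d bytes" % limit)
    return line.rstrip(b"\r\n")


print(read_request_line(io.BytesIO(b"GET / HTTP/1.0\r\n")))
```

Treating the long line as an error, rather than silently truncating it, is usually the easier choice to reason about: a truncated request leaves the rest of the oversized line in the stream, where it would be read as if it were the next request.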

Preventing Shell Command Attacks

Another common mistake is to allow a cracker to send commands to a shell. This problem occurs when the program accepts some input and then uses it in a shell command without performing adequate checking. On UNIX, it's absolutely critical to eliminate some special characters, such as backticks (`) and quotes ("). The characters in UNIX that are safe to pass to a shell are the alphanumerics, underscore (_), minus (-), plus (+), space, tab, forward slash (/), at (@), and percent (%). For DOS and other systems, a different set of characters might be valid. In the CGI section of this chapter, I come back to this security problem and explain some ways to solve it for CGI programs.
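A check against the list of safe UNIX characters given above might look like the following Python sketch. The function name is_shell_safe is hypothetical; the character set is taken directly from the list in the text.

```python
import re

# Characters listed as safe on UNIX: alphanumerics, underscore, minus,
# plus, space, tab, forward slash, at sign, and percent.
SAFE_FOR_SHELL = re.compile(r"[A-Za-z0-9_\-+ \t/@%]*")


def is_shell_safe(text):
    """Return True only if every character is on the safe list."""
    return SAFE_FOR_SHELL.fullmatch(text) is not None


print(is_shell_safe("alice@host"))        # True
print(is_shell_safe("alice; rm -rf /"))   # False: semicolon is not listed
print(is_shell_safe("`id`"))              # False: backticks are not listed
```

The list is deliberately strict. Even a period is rejected, so a program that needs filenames or full e-mail addresses would have to decide carefully which extra characters to admit.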

Changing the Root Directory

One way to minimize all Internet security programming problems is to use a chrooted environment on a UNIX system to run the server program. If you are not on UNIX, this method is not possible and you might want to go on to the next section. The name comes from the chroot (change root) command in UNIX, which changes the root directory for the program. Setting up this kind of environment correctly is somewhat troublesome, but the security benefits can be huge if security is critical for your program. If a cracker does break through your program's security, he will be in a "rubber room" environment on your system, without access to the real operating system files or anything else you don't want to expose. This isolation will enormously reduce his ability to cause damage.
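As a rough sketch, a server can put itself into such an environment at startup with the chroot system call, which Python exposes as os.chroot. The function name enter_jail and its parameters are hypothetical, and the sketch assumes that the process starts as root (chroot requires it) and that the jail directory has already been populated with everything the server needs. The sys_calls parameter exists only so that the sequence can be exercised without root privileges.

```python
import os


def enter_jail(jail_dir, uid, gid, sys_calls=os):
    """Confine the current process to jail_dir, then give up root."""
    sys_calls.chroot(jail_dir)   # jail_dir becomes "/" for this process
    sys_calls.chdir("/")         # don't keep a working directory outside the jail
    sys_calls.setgroups([])      # drop supplementary groups while still root
    sys_calls.setgid(gid)        # group first: after setuid it can't be changed
    sys_calls.setuid(uid)        # finally become an unprivileged user
```

The order matters. A process that stays root inside the jail can often escape from it, so dropping privileges immediately after the chroot call is part of the technique rather than an optional extra.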

If your program works with confidential or sensitive information, it's critical that your Internet programs can keep that information out of the hands of people who shouldn't see it. Beyond your program's security, you need to protect the data from access outside of your program. For example, you might have a program that receives credit card numbers. If a cracker breaks into your system and becomes root, he then can look at the file where you have stored the credit card numbers. A solution to this problem is to keep all this data encrypted whenever you store it. (Before you ignore this occurrence as being unlikely, you should know that it has already happened to a major Internet service provider.)

If your program demands absolute security, it is a good idea to keep an audit trail; sending this audit trail to a different machine (possibly via syslog) also is advisable. With this method, if the machine that runs your program is compromised, you still have the audit trail from which to recover (as well as a backup, hopefully). Audit trails also are good for finding program bugs and abnormal use patterns, which are often a sign of compromise or attempted compromise. On the negative side, some audit trails can use an enormous amount of disk space, and it can be hard to find useful information in them.
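One way to send an audit trail to another machine via syslog is sketched below, using the SysLogHandler class from Python's standard logging package. The logger name, the "myserver" tag, the record layout, and the function names are all hypothetical; the sketch also assumes that a syslog daemon on the other machine is configured to accept messages over UDP.

```python
import logging
import logging.handlers


def build_audit_logger(log_host, log_port=514):
    """Return a logger that forwards audit records to a remote syslog daemon."""
    logger = logging.getLogger("audit")
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(log_host, log_port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,
    )
    handler.setFormatter(logging.Formatter("myserver: %(message)s"))
    logger.addHandler(handler)
    return logger


def audit(logger, user, action, outcome):
    """Record who did what, and whether it was allowed."""
    logger.info("user=%s action=%s outcome=%s", user, action, outcome)
```

Keeping each record to a single line of key=value pairs is one way to reduce the "hard to find useful information" problem, because ordinary text-searching tools can then pick out, say, every denied action for one user.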

Java Security

Java is probably the most secure of all of the Internet programming languages in existence; Java was built from the ground up to be ultra-secure. The Java security model is somewhat complex, but it incorporates security on several levels. If it were bug-free, it would be an extremely strong security mechanism that would greatly reduce worries about the security of running programs on the Internet. Unfortunately, so far a couple of bugs have been found in the security protection of the language, which would enable people to circumvent the security. The good news is that they are just bugs, not fundamental flaws in the design, and Sun is quickly addressing them. The other good news is that the bugs haven't been exploited to cause damage.

Byte Code Verifier

The lowest level of security is the byte code verifier in the Java Interpreter, which implements the Java Virtual Machine. The byte code verifier makes sure that a Java program is valid and doesn't perform any operations that might give the program access to the machine underlying the interpreter. Checks performed include checking for code that will overflow or underflow the stack and code that tries to access objects that it isn't allowed to access.
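To give a feel for the stack checks, here is a toy verifier written in Python. It is not how Sun's verifier is implemented: the three-instruction machine is invented for this example, and it handles only straight-line code, whereas real Java byte code has branches, types, and many more instructions.

```python
# Hypothetical three-instruction machine: each opcode maps to
# (values popped, values pushed).
STACK_EFFECT = {"push": (0, 1), "add": (2, 1), "pop": (1, 0)}


def verify(program, max_stack):
    """Reject a program that would underflow or overflow the operand stack."""
    depth = 0
    for position, opcode in enumerate(program):
        if opcode not in STACK_EFFECT:
            return "illegal opcode at %d" % position
        popped, pushed = STACK_EFFECT[opcode]
        if depth < popped:
            return "stack underflow at %d" % position
        depth = depth - popped + pushed
        if depth > max_stack:
            return "stack overflow at %d" % position
    return "ok"


print(verify(["push", "push", "add", "pop"], max_stack=2))  # ok
print(verify(["push", "add"], max_stack=2))                 # stack underflow at 1
```

The important property is that the check happens before the program runs. The program is never executed; only the effect of each instruction on the stack depth is tracked.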

Sun is trying hard to make sure that its byte code verifier is totally secure. When other vendors release their implementations of the Java Virtual Machine, can you trust their byte code verifiers? I don't have the answer to this question, but it's something worth paying attention to.

The byte code verifier is invoked by the class loader (java.lang.ClassLoader). The class loader is responsible for loading classes from whatever source it needs, whether the local disk or across the network. The class loader protects a class from being replaced by another class from a less-secure source.

Security Manager

One level above the Class Loader is the Security Manager. The Security Manager implements the security policy for the running Java programs. Stand-alone Java applications can set their own security policy by overriding the Security Manager and instantiating it. Web browsers instantiate their own Security Managers, and applets aren't allowed to modify them. Netscape Navigator, in particular, implements a very strict Security Manager. For example, accessing the local disk is impossible from any class loaded from the network. With HotJava and AppletViewer, accessing the local disk is possible for a class loaded off the network, if the user allows it.

By modifying the Security Manager, you can change the way security is handled in your applications. With the Security Manager, you can control file access. You can control where network connections can be made. You even can control access to threads through the Security Manager.

Writing your own Security Manager is a very advanced topic, worthy of a book in its own right. Needless to say, I won't cover it here. If you need some material on writing a Security Manager, the book Tricks of the Java Programming Gurus by Sams Publishing has several chapters on how to do it.

Many people are now talking about putting digital signatures on applets (digital signatures are covered later in this chapter). With this system, you will be able to know who wrote a particular applet. Then, if you trust the author, you can relax the limits that the Security Manager imposes. Watch for this technology; it will be important to you, because you probably will need to sign your applets at some point in the future.

JavaScript Security

Although JavaScript has the word Java in it, don't expect the same level of security from it as from Java. JavaScript originally was called LiveScript and had nothing to do with Java. The name change was almost strictly a marketing ploy on the part of Netscape.

Java offers a multilevel security model that protects Java code from doing dangerous things, but JavaScript does nothing of the sort. Netscape just tried to design a language with no dangerous commands. In Java, it's possible to control the level of security by overriding the Security Manager. JavaScript offers no such flexibility.

Apparently JavaScript wasn't designed with security and privacy as primary design principles. If they were primary design principles, they were poorly implemented. People already have managed to exploit features in the language to do things such as steal URL history and forge e-mail. Netscape is gradually trying to address these problems.

To be honest, many people doubt that JavaScript will ever be completely secure because of its design. In fact, I don't feel comfortable with it personally, and I have disabled it in my copy of Netscape Navigator.

VBScript Security

When Java and, to a lesser extent, JavaScript, burst on the scene, Microsoft had to strike back. It did so with Visual Basic Script (VBScript). Visual Basic Script is nothing more than Visual Basic with the parts that Microsoft saw as being potentially dangerous ripped out of the language. The end result is another language that, from a security point of view, is similar to JavaScript. Whether it's secure is somewhat of a mystery at this time, because it hasn't been widely deployed on the Internet. Many systems seem secure until enough people try to break them. We will just have to see how VBScript holds up.

CGI Security

You can look at CGI security from several angles. I'll cover some of the specific programming problems first and then get around to some of the more global issues related to CGI scripts later.

Handling Input to a Shell

First, go back to the passing-commands-to-a-shell problem discussed earlier. Because Perl is the most commonly used language for CGI programs, let's look at how to fix this problem in Perl. If the input should never be used in a shell command in any way, the version of Perl known as taintperl should be used instead of normal Perl. taintperl is included with the standard Perl distribution and doesn't allow any input to be used in a shell command unless you go through a troublesome process of untainting it first.

However, in many cases you need to use some of the input as part of a shell command. You have two choices:

You can write your own error-checking routines. If you do so, I would strongly recommend that you look for characters you know you can trust and then assume that anything else is an error.
Looking for characters you can trust is a much safer method than the second alternative: looking specifically for characters you know you can't trust. The second method isn't as safe because you might miss one.
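The difference between the two choices is easy to demonstrate. The chapter's subject here is Perl, but to keep the examples in one language the sketch below is in Python; the same logic applies in any language. Both character lists are hypothetical, and the list of untrusted characters is deliberately incomplete in the way such lists tend to be.

```python
import string

# Characters someone remembered to forbid (a hypothetical, incomplete list).
DENIED = set(";|&`\"'")

# The only characters that will ever be accepted.
ALLOWED = set(string.ascii_letters + string.digits + "_-+ \t/@%")


def passes_denylist(text):
    """Accept anything that contains no known-bad character."""
    return not any(ch in DENIED for ch in text)


def passes_allowlist(text):
    """Accept only text made up entirely of known-good characters."""
    return all(ch in ALLOWED for ch in text)


attack = "report.txt\nrm -rf /tmp/data"  # a newline starts a second command
print(passes_denylist(attack))   # True: the list of bad characters forgot newline
print(passes_allowlist(attack))  # False: newline is simply not on the good list
```

The check for trusted characters fails safe. When it is wrong, it rejects something harmless, which is an annoyance; when the check for untrusted characters is wrong, it lets an attack through.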

If you think this is too much trouble, you're right, but you have an easy alternative. Libraries are available that check the input for you automatically, as well as parsing it into an easy-to-use format. For Perl 4, CGI-LIB is an excellent library. It's available at the following address:


ftp://ftp.ncsa.uiuc.edu/Web/httpd/Unix/ncsa_httpd/cgi/cgi-lib.pl.Z

Don't be confused by the fact that it's included with the NCSA httpd distribution. It should work with any UNIX-based Web server and possibly any Web server supporting CGI on any platform that has Perl. For Perl 5, you can get CGI.pm at the following address:


ftp://ftp.ncsa.uiuc.edu/Web/httpd/Unix/ncsa_httpd/cgi/CGI.pm-1.53.tar.Z

CGI Scripts and userid

If you are writing CGI programs and they don't run, the problem might be due to the security setup on your Web server. Many system administrators allow only specific people to execute CGI scripts for certain directories, to minimize the security exposure. The reason for this is that CGI scripts typically run as the userid of the Web server. A poorly written program (by any user) that can create files on the Web server can cause damage to it. If the Web server runs as root on UNIX or administrator on Windows NT, the damage could be even greater. It is never a good idea to run a Web server as either of these two user IDs. Anyway, if your script doesn't run at all, talk to your Web administrator.

To solve the problem just discussed, many Web administrators install a program called CGI-WRAP. CGI-WRAP runs on a UNIX Web server and is a setuid program that performs a variety of beneficial tasks:

First, it runs user-written CGI scripts as that user. This scheme protects the Web server from these programs.
Second, it eliminates all the dangerous characters previously discussed (before the CGI script ever runs), so that the system administrator doesn't need to worry about whether users are writing scripts in a secure fashion.
It also has an option to automatically kill off CGI scripts if they use too much CPU time.
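The last item relies on a standard UNIX facility: a per-process limit on CPU time. How CGI-WRAP itself does this isn't covered here; the following Python sketch only illustrates the underlying mechanism, using the standard resource and subprocess modules. The function name run_with_cpu_limit is hypothetical, and the sketch works only on UNIX.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv as a child process that the kernel stops after cpu_seconds."""
    def limit_cpu():
        # Runs in the child just before the program starts. The soft limit
        # triggers a signal; the hard limit one second later is the backstop.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))

    return subprocess.run(argv, preexec_fn=limit_cpu).returncode


# A script stuck in an endless loop is killed after about one CPU second.
status = run_with_cpu_limit([sys.executable, "-c", "while True: pass"], 1)
print(status)  # negative: the child was terminated by a signal
```

Because the limit counts CPU time rather than elapsed time, a script that is merely waiting on a slow network connection isn't penalized; only one that is actually consuming the processor is stopped.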

If you are on a system using CGI-WRAP, you need to use different URLs, and you might need to modify your program slightly. Find some local document or talk to your Web administrator to find out how.

NOTE

A common and deadly mistake that many Web administrators and programmers alike have been making recently is to place a copy of Perl itself in a directory where CGI programs can execute. A knowledgeable cracker can use this mistake to do just about anything he wants to your system. The best strategy is to always keep Perl outside any directory from which the Web server can read.

Firewalls

A firewall is a critical part of the security of any Internet site. Basically, a firewall improves the security of a site by limiting access to that site to an absolute minimum. It's important to remember that although a firewall doesn't solve all Internet security problems, no Internet site should be without a firewall. This section covers only the basics of firewalls and how they affect Internet programming. If you need more detailed information, the book Building Internet Firewalls by O'Reilly & Associates, Inc., is an excellent source.

Types of Firewalls

Three basic types of firewalls exist: the bastion host, the packet filter, and the proxy gateway. Before you do any Internet programming, you need to find out which kind of firewall you have. Each kind affects Internet programming in different ways. Also, your site's particular implementation factors into the programming picture. A firewall can be completely transparent to your program in some cases. In other cases, it might make your program impossible to write. If this is the case, you need to ask yourself and the security people at your site whether a way exists to relax the restrictions your firewall imposes, without compromising the security of your site. Sometimes the answer to this question is no, and you just have to give up on your project.

It's also possible to combine or modify the three basic firewall types to make more complex security setups. If this is done correctly, you can get exactly the security you want. The variations are endless and each has its own effect on Internet programming. Therefore, I cover only the effects of the basic types. If you have a more complex setup, extrapolate the information here into your own environment.

Building Your Own Firewall

If you don't have a firewall, you can either buy one or build your own. Internet routers, such as Cisco or Bay Networks, offer all the packet-filtering options you could want. Because you need a router to connect to the Internet anyway, this is a good, inexpensive option. The only negative is that these devices can be hard to configure. You also can purchase a variety of products, such as Firewall-1, Gauntlet, CheckPoint, and Sidewinder. These products offer advanced firewall configurations but are very expensive.

If you build your own firewall, take a look at Trusted Information Systems' (TIS) Firewall Toolkit as a base on which to build. Building your own firewall can be relatively inexpensive until you factor in personnel costs, but it probably won't be as high-quality. The good news about building your own firewall is that you can customize it highly to meet the needs of your programs.

Bastion Hosts

The bastion host firewall scheme is usually simple to implement, but it is very noticeable and cumbersome to users and programmers alike. The bastion host sits between the Internet and the internal network (see Figure 3.1). The Internet can talk to the bastion host. The internal network can talk to the bastion host. However, the Internet and the internal network can't talk directly to each other.

Figure 3.1 : The bastion host.

For any communication to occur between the Internet and the internal network, a login must exist to the bastion host. In the old days of the Internet, this kind of firewall was potentially very secure and the loss of functionality was acceptable in many cases. Over time, though, the network has changed, with the explosion of the Web and PCs with Internet access. In today's world, a true bastion host would make using a graphical Web browser on a PC impossible without going to extreme lengths.

Any services your site plans to make available to the Internet must reside on the bastion host. This includes your e-mail routing server and your Web server. But these services can be too much load for one machine; you might need to have multiple bastion hosts to handle all the servers.

For Internet programmers, the bastion host can stop many programming projects cold. Basically, your programs must reside on the bastion host itself, or they won't be able to talk to the Internet at all. One trick is to have a program on the bastion host that relays information between the Internet and your main program running on the internal network. If you do this, however, you no longer have a true bastion host; you have created a proxy gateway. Proxy gateways are covered in their own section later in this chapter.

Packet Filters

Instead of trying to force all traffic between the Internet and the internal network to be authenticated through one host, like the bastion host system, the packet filter tries to screen out harmful traffic between the Internet and the internal network. In addition, the packet filter allows traffic directly between the Internet and the internal network (see Figure 3.2).

Figure 3.2 : A packet filter.

Packet filters often are implemented as part of the router that ties together the Internet and the internal network. Less often, you'll find them implemented in a host computer. Packet filters are implemented differently at every site. Their security can range from dangerously lax to incredibly strict.

With a packet filter, you can almost always restrict access on protocol and IP addresses (both source and destination addresses). With this level of control, it's possible to allow another company you work with often to have more access to your systems, while not giving any access to the rest of the Internet. Some packet filters will even look inside packets and only forward packets based on some content inside the packet.
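The partner-company arrangement just described can be modeled as an ordered list of rules, where the first rule that matches a packet decides its fate. This Python sketch is a model only, not the configuration language of any real router; the Rule type, the decide function, and the addresses (taken from ranges reserved for documentation) are all hypothetical.

```python
import ipaddress
from typing import NamedTuple


class Rule(NamedTuple):
    action: str       # "allow" or "deny"
    protocol: str     # "tcp", "udp", or "any"
    source: str       # network in CIDR notation
    destination: str  # network in CIDR notation


# Hypothetical policy: a partner company may reach one internal server
# over TCP; the rest of the Internet gets nothing.
RULES = [
    Rule("allow", "tcp", "198.51.100.0/24", "192.0.2.10/32"),
    Rule("deny", "any", "0.0.0.0/0", "0.0.0.0/0"),
]


def decide(protocol, source, destination, rules=RULES):
    """Return the action of the first rule that matches the packet."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if (src in ipaddress.ip_network(rule.source)
                and dst in ipaddress.ip_network(rule.destination)):
            return rule.action
    return "deny"  # nothing matched: refuse by default


print(decide("tcp", "198.51.100.7", "192.0.2.10"))  # allow
print(decide("tcp", "203.0.113.5", "192.0.2.10"))   # deny
```

Ending the list with a rule that denies everything, and also denying when no rule matches, reflects the strict end of the range mentioned earlier: anything not explicitly permitted is refused.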

In terms of how the firewall will affect your programs, some sites might have access to your running code, while others don't. This scheme can be helpful to you at times to help manage program security.

Proxy Gateways

A proxy gateway is similar to a bastion host except for one major difference. The proxy gateway has a program or a set of programs running on it to relay packets between the Internet and the internal network.

The idea is basically a game of smoke-and-mirrors, giving the user the perception that she's directly connected to services on the other side of the firewall, when in reality she isn't (see Figure 3.3).

Figure 3.3 : A proxy gateway.

Proxies are very good security mechanisms and are great at logging Internet activity. The bad news is that they're a lot of extra work. Your client and server programs must normally be modified to be able to deal with the proxy. The good news is that the major Web browsers already support proxying, so they won't need modification.
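At its core, a proxy is a program that reads from a connection on one side of the firewall, writes to a connection on the other side, and notes what passed through. The following Python sketch shows that core for one direction only. The relay function and its log format are hypothetical, and a real proxy would also understand the protocol being relayed, handle both directions at once, and decide whether the connection should be allowed at all.

```python
import socket


def relay(source, destination, log, bufsize=4096):
    """Copy bytes from one socket to another until the sender closes."""
    total = 0
    while True:
        chunk = source.recv(bufsize)
        if not chunk:            # empty read: the sender has finished
            break
        destination.sendall(chunk)
        total += len(chunk)
    log.append("relayed %d bytes" % total)
    return total
```

Because every byte passes through the relay, the proxy is a natural place to log activity, which is why proxies are so good at it. It is also why the client has to cooperate: it must connect to the proxy rather than to the real server.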

Another problem with proxies is that the proxies have to be written for almost every different protocol. They're already written for almost every protocol currently in wide use on the Internet, but new releases of protocols can break proxies, and they might need to be rewritten. Also, for a few protocols (for example, talk), no proxy has been written, and writing one might even be impossible.

Encryption and Digital Signatures

Encryption, in one form or another, has been around since almost the dawn of civilization. Computers and networks have made advanced forms of encryption possible. The kinds of encryption historically used (for example, secret decoder rings or German Enigma machines) are trivial to break on a computer, taking only seconds. Almost all versions of UNIX come with a program called crypt, which is a software implementation of the German Enigma machine used in World War II. A program exists on the Internet that breaks this code in seconds. Therefore, when dealing with encryption, always insist on using high-quality algorithms; weak ones can be broken easily.

Digital signatures are a way of securely signing documents. This strategy is useful for two reasons. The first is like a regular signature: to indicate that you agree with the terms of the document. The second is to detect whether someone has changed the document after you signed it (something that was unlikely with traditional paper documents). Digital signatures are implemented by making a digest of the message, using an algorithm such as MD5. This digest is encrypted with the signer's private key. The recipient can decrypt the encrypted digest with the signer's public key and compare it to a digest computed from the document as received. If the two don't match, the document has been tampered with or has a forged signature.
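The digest-then-encrypt scheme can be shown in a few lines of Python. This is a toy for illustration only: it assumes an RSA-style key pair (one key to sign, a matching key to verify), the key is far too small to be secure, and the function names are hypothetical. The MD5 digest comes from Python's standard hashlib module.

```python
import hashlib

# Toy key built from two well-known Mersenne primes. It only needs to be
# larger than the 128-bit MD5 digest; real keys are much larger.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # verification (public) exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # signing (private) exponent; Python 3.8+


def digest(message):
    """MD5 digest of the message, as an integer."""
    return int.from_bytes(hashlib.md5(message).digest(), "big")


def sign(message):
    """Encrypt the digest with the signing key."""
    return pow(digest(message), D, N)


def verify_signature(message, signature):
    """Decrypt the signature and compare it with a freshly computed digest."""
    return pow(signature, E, N) == digest(message)


signature = sign(b"I agree to pay 100 dollars")
print(verify_signature(b"I agree to pay 100 dollars", signature))  # True
print(verify_signature(b"I agree to pay 900 dollars", signature))  # False
```

Changing a single character of the document changes its digest, so the old signature no longer matches. Signing the short digest instead of the whole document is what keeps the scheme fast.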

Legal Issues

When writing Internet programs that involve encryption, it's very important to be aware of the legal issues involved. If you aren't, you could end up in prison on felony charges.

First, some countries ban outright any use of encryption. France and Iran are two notable examples. Therefore, you shouldn't write any encryption code, import any encryption programs, or even use encryption in these two countries. Other countries might have such laws on the books. I recommend checking with an attorney unless you are absolutely sure of the legal status in your country.

The United States is another country with strange laws about encryption. No problem exists with using any form of encryption you want within the country. The trouble starts when you want to write a piece of encryption software in the United States and then export it to another country (except possibly Canada) in electronic form. The U.S. government has labeled certain types of encryption software as munitions, just as if these programs were missiles or something. The law is known as ITAR (International Traffic in Arms Regulations).

It's possible to write certain types of weaker encryption programs for export. You then have to request an exemption from the export law from the U.S. government. The level of encryption allowed is strong enough to be time-consuming for a cracker to break the code, but it can be done in a few months on some machines. For public key algorithms, covered in the section "Public Key Encryption," the Software Publishers Association (SPA) and the government agreed to give quicker approval to algorithms using keys of 40 bits or less. A message encrypted with a 40-bit key takes only about 200 MIPS-years of CPU time to crack (that is, a 200 MIPS computer would take one year to crack it). Therefore, any highly secure, non-time-sensitive communication is potentially at risk. Keep this fact in mind.
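To put the 200 MIPS-year figure in perspective, the short Python calculation below scales it to other key lengths and to more than one machine. It rests on two assumptions: that the figure quoted above is accurate, and that the attack simply tries every possible key, so that each additional bit doubles the work. The function name is hypothetical.

```python
def years_to_search(bits, machines=1):
    """Years for 200 MIPS machines to try every key of the given length.

    Scaled from the figure in the text: one such machine needs one year
    for a 40-bit key, and each extra bit doubles the number of keys.
    """
    return 2 ** (bits - 40) / machines


print(years_to_search(40))                # 1.0
print(years_to_search(40, machines=365))  # about a day with 365 machines
print(years_to_search(56))                # 65536.0
```

The calculation cuts both ways. A few hundred ordinary machines working together bring a 40-bit key within reach in about a day, whereas just 16 more bits multiply the effort by 65,536. On average the right key turns up after half the search, so typical times are half the figures shown.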

A few ways exist to get around this law. The best is to just write your software outside the U.S. Some people ask the question, "What if my machine is out of the country, but I'm in the U.S.?" or vice versa. My answer is that there's no legal precedent on this issue, so don't take chances. The other alternative is that apparently you can print out the source code (or binary) to the program and carry the printouts out of the country without violating the law. Then, in the foreign country, you can scan them back in. Again, check with your lawyer before attempting this. Another method is for you to write a U.S. version of the program that you tightly control, and then have an associate write an international version outside the U.S. This has been done with PGP (Pretty Good Privacy), which is discussed later in this chapter.

To make matters even worse, the law even precludes giving the program to people in the U.S. who are not U.S. or Canadian citizens or permanent residents. I know this part of the law is widely violated, but my recommendation, as always, is to play it safe.

Another legal issue is that the major public key encryption algorithms are all patented. The patent owners are pretty aggressive about protecting their patents; however, many of the key patents expire in the next couple of years. The release of these patents will break the stranglehold the patent holders have had on public key encryption. I predict that after the patents expire we'll see many more public key encryption implementations than we have today.

Private Key Encryption

Private key encryption is the oldest form of encryption in existence. In private key encryption, there is one key. You encrypt the message, using this key. The person decrypting the message must have the same key. It works just like the secret decoder rings you might have played with when you were a child.
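The decoder-ring idea can be sketched as a simple letter-shift cipher. This is a toy for illustration only and is trivial to crack; the function names are hypothetical. The point it shows is that the same key is used on both ends.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_encrypt(message: str, key: int) -> str:
    """Shift every letter forward by `key` positions; leave other characters alone."""
    out = []
    for ch in message.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def shift_decrypt(ciphertext: str, key: int) -> str:
    """Undo shift_encrypt by shifting backward with the same key."""
    return shift_encrypt(ciphertext, -key)


if __name__ == "__main__":
    secret = shift_encrypt("ATTACK AT DAWN", 3)
    print(secret)
    print(shift_decrypt(secret, 3))
```

Anyone who learns the key (here, the number 3) can read every message, which is why keeping the single key secret is the whole problem in private key encryption.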

The strength of private key encryption algorithms varies greatly. Some are trivial to crack; others are computationally impossible to crack, given today's computers. Some vendors, notably WordPerfect, have included some private key encryption code in their products. It turns out that many of these are almost completely useless. You can download programs from the Internet to crack the keys, run them, and have the original file back in under 15 minutes (the download is the time-consuming part, too!).

The most famous private key encryption algorithm is DES (Data Encryption Standard). It's very popular, and you can find implementations for almost any platform available today. However, DES is aging (originally published in 1975) and computers are getting faster. Attacks on DES are becoming increasingly possible. A modified version of DES known as Triple-DES is available and is harder to crack.

Public Key Encryption

The big problem with private key encryption is making sure that the people on both ends of the communication have the keys, without anyone else having them. The only ways to exchange keys in a secure fashion are to already have a secure communication channel available, which means that both parties already have encryption available; or both people must be in the same place without anyone else around, so that they can be sure they're not being bugged. This is very cumbersome for most people.

Public key encryption eliminates the need for a secure key-exchange mechanism. Each person has a private key, which he uses to decrypt or digitally sign messages, and a public key, which others use to send messages to him or to verify his digital signature. Each person keeps his private key private to himself. The public key is public information and can be known by anyone.
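The two roles of each key can be sketched with a toy version of RSA, one well-known public key algorithm. The chapter does not name a specific algorithm here, so treat RSA as an illustrative choice. The primes below are tiny textbook values with no padding, so this offers no real security, and the function names are hypothetical.

```python
# Toy RSA key pair from two tiny primes (illustration only, not secure).
P, Q = 61, 53
N = P * Q                     # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E = 17                        # public exponent: (N, E) is the public key
D = pow(E, -1, PHI)           # private exponent: (N, D) is the private key (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt to the key owner using the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the owner of the private key can decrypt."""
    return pow(ciphertext, D, N)


def sign(message: int) -> int:
    """The owner signs with the private key."""
    return pow(message, D, N)


def verify(message: int, signature: int) -> bool:
    """Anyone can check a signature using the public key."""
    return pow(signature, E, N) == message


if __name__ == "__main__":
    c = encrypt(65)
    print("decrypted:", decrypt(c))
    print("signature valid:", verify(65, sign(65)))
```

Messages here are small integers below N; real packages encode and pad the data first and use keys hundreds of digits long.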

Key Length

The length of the key is a critical issue in the security of a public key encryption algorithm. The longer the key, the safer your encrypted messages but the longer it takes to encrypt or decrypt them. My advice is to go ahead and use a long key (1024 bits or more). Most people today have enough CPU horsepower to encrypt messages to you quickly, even with a long key. This isn't true when breaking messages without knowing the key, however. The longest key known to have been broken was 429 bits long. Breaking it required an international effort, involving 600 sites. The good news is that the amount of time needed with current cracking algorithms doubles with every additional 10 bits of key length. However, computer performance and algorithms to break public key encryption algorithms are improving, making cracking easier. Therefore, a long key is essential in protecting your messages for a long period of time.
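The doubling rule of thumb above can be turned into a quick estimate. This sketch simply applies the "doubles every 10 bits" rule relative to the 429-bit effort mentioned in the text; real cracking algorithms do not scale this neatly, so the numbers are only an order-of-magnitude guide, and the function name is hypothetical.

```python
def relative_effort(key_bits: int, baseline_bits: int = 429, bits_per_doubling: int = 10) -> float:
    """How many times harder than the baseline key a longer key is to break,
    assuming the effort doubles with every `bits_per_doubling` extra bits."""
    return 2 ** ((key_bits - baseline_bits) / bits_per_doubling)


if __name__ == "__main__":
    for bits in (429, 439, 512, 1024):
        print(bits, "bits:", relative_effort(bits), "times the 429-bit effort")
```

Even under this simple rule, a 1024-bit key comes out many billions of times harder to break than the 429-bit key, which is the reasoning behind choosing a long key.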

Key Exchange

With public key encryption, the main implementation difficulty changes from key exchange to identity verification. If someone e-mails you her public key across the Internet, how do you know it's really from her? The answer is that you don't. The secure way is to get together with the person, in person, and exchange public keys. Well, if you do this, you almost might as well have used private key encryption and exchanged keys. Also, this model doesn't scale well. Obviously, you can't meet with every single person with whom you need to exchange public keys.

Two basic models for key exchange have been developed. The first is a hierarchical approach. At the top is a person or organization that everyone has to trust. This person/organization hands out authority to other organizations or people who can authenticate certain groups of people. These second-tier authenticators then publicly publish the public keys of the people they authenticate. This is the simplest form of this scheme. It's possible (and probably necessary) to have many more levels than this.

Organizations or individuals can issue a special document known as a "Digital Certificate" saying that they have authenticated a certain person. Then that person can show the digital certificate to others as proof of her identity on the Internet. These digital certificates are usually in a format known as X.509. One company issuing digital certificates is VeriSign (http://www.verisign.com/). To use the Netscape FastTrack Server, you have to acquire a digital certificate from VeriSign, though it should be possible to use other companies in the near future (probably by the time you are reading this).
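The hierarchical model amounts to following a chain of issuers until you reach an authority you already trust. The sketch below shows only that structure. It does not parse X.509 or check any cryptographic signatures, and the names and data layout are hypothetical.

```python
def chain_to_root(subject, issued_by, trusted_roots, max_depth=10):
    """Return the list [subject, issuer, ..., root] if the chain ends at a
    trusted root, or None if it does not.

    issued_by maps each certificate holder to whoever issued its certificate.
    """
    chain = [subject]
    current = subject
    for _ in range(max_depth):
        if current in trusted_roots:
            return chain
        issuer = issued_by.get(current)
        if issuer is None or issuer in chain:   # unknown issuer or a loop
            return None
        chain.append(issuer)
        current = issuer
    return chain if current in trusted_roots else None


if __name__ == "__main__":
    issued_by = {"alice": "Department CA", "Department CA": "Root CA"}
    print(chain_to_root("alice", issued_by, {"Root CA"}))
    print(chain_to_root("mallory", issued_by, {"Root CA"}))
```

A real implementation would also verify each certificate's signature with its issuer's public key and check expiration dates at every step.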

The other model, known as the Web of Trust, doesn't depend on being able to trust these higher-level people and organizations. In the Web of Trust, you initially exchange keys with at least a few people you meet in person. Both of you can digitally sign each other's keys with your names. Then, when others later get your key over the network, they see that it's signed by the person with whom you exchanged keys in person. If they have exchanged keys in person with that person and trust that person, they then know that they can trust your key.

To make this system work well, you have to decide how much you trust other people's ability to validate and exchange keys correctly. If you exchange keys with someone you don't think validates people correctly, you can later ignore any key you get over the network that is signed by that person and not signed by someone you do trust. If you trust their validation techniques, you take any keys signed by them. You can also partially trust them with some implementations of this model. When you start thinking about scaling this model, your head will probably start to spin, but it's a powerful way for small groups of people to communicate without a lot of overhead.
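The Web of Trust decision described above can be sketched as a small function. The names, the data layout, and the rule that two partially trusted signers are enough are assumptions for illustration; actual packages make these thresholds configurable, and they also verify the signatures themselves, which this sketch skips.

```python
def key_is_acceptable(owner, signatures, verified_in_person,
                      full_trust, partial_trust=frozenset(), partial_needed=2):
    """Decide whether to accept `owner`'s key received over the network.

    signatures maps a key owner to the set of people who signed that key.
    full_trust and partial_trust are the people whose validation you trust
    fully or partially (you are assumed to hold their keys already).
    """
    if owner in verified_in_person:
        return True                       # exchanged keys face to face
    signers = set(signatures.get(owner, ()))
    if signers & set(full_trust):
        return True                       # vouched for by someone fully trusted
    # Otherwise require several partially trusted signers (assumed threshold).
    return len(signers & set(partial_trust)) >= partial_needed


if __name__ == "__main__":
    signatures = {"carol": {"bob"}, "dave": {"eve"}, "frank": {"gina", "hal"}}
    print(key_is_acceptable("carol", signatures, {"bob"}, full_trust={"bob"}))
    print(key_is_acceptable("dave", signatures, {"bob"}, full_trust={"bob"}))
    print(key_is_acceptable("frank", signatures, {"bob"}, full_trust={"bob"},
                            partial_trust={"gina", "hal"}))
```

In this example Dave's key is rejected because its only signer, Eve, is not someone whose validation you trust.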

Most people tend to be rather religious about which model of key exchange is the best. However, in reality, it has been proven mathematically that either model can emulate the other model with a little work. In fact, converting the Web of Trust into the hierarchical approach is almost trivial. An organization only needs to create a key and then sign the keys of people whom they can authenticate. If people know they can trust that organization, they will accept keys that are signed by the organization as being valid.

My personal belief is that both methods are flawed in some ways. Therefore, this ability to twist one model into acting like the other is essential. Sometimes you want to get keys from a central authenticated database, especially when you can't meet this person in person or can't meet someone who can meet them to handle the key exchange. At other times, you want to get a key directly from someone (or slightly indirectly, from a source you personally trust), so that you know it's absolutely correct.

Popular Public Key Packages

The most popular piece of public key encryption software on the Internet is PGP (Pretty Good Privacy). PGP is available for numerous platforms and uses the Web of Trust model of key exchange. To get around the legal restrictions, there's a U.S. version of the program and a separate international version that was written outside the U.S. Therefore, you can use PGP without worrying about the export controls. (Just don't carry a copy on your laptop from inside the U.S. to other countries!) In reality, at least two U.S. versions of PGP exist. A version for noncommercial use is freely available by signing some documents with MIT. There's also a commercial version known as ViaCrypt PGP.

PEM (Privacy-Enhanced Mail) is another standard for public key encryption. PEM is based on the hierarchical key exchange model. RIPEM (Riordan's Internet Privacy-Enhanced Mail) is the reference implementation, which is available from RSA Data Security. PEM seemed destined for greatness a couple of years ago, but it really has taken a back seat to PGP in actual use.

MOSS (MIME Object Security Services) is intended to correct a couple of the flaws of PEM, one being that PEM's rigid hierarchies are too strict on many occasions. MOSS relaxes the restrictions somewhat. MOSS is designed to handle MIME messages, unlike PEM. MOSS has too many options, and it might be possible for two different vendors to write MOSS implementations that can't speak to each other. Some people at the Internet Engineering Task Force (IETF) told me and others to forget PEM, and that MOSS is the algorithm for the future. However, upon researching for this chapter, I found material on the Web indicating that MOSS is a niche system and that PEM was still alive and well. I recommend watching market trends on this debate to see which side is better to use and is gaining market share.

SSL

SSL (Secure Sockets Layer) is a security protocol developed by Netscape. In the protocol stack, it runs above the TCP protocol but below the application layer protocols such as NNTP, HTTP, and FTP. As I am writing this, SSL has been implemented only in uses related to HTTP.

SSL enables the client to authenticate the server. It also enables data being transmitted over the Internet to be encrypted. If you are using Netscape Navigator, you can know you are talking to an authenticated server when the key in the lower-left corner of the window is in one piece and not broken. If you have the exportable version of Netscape Navigator, it uses only the 40-bit key discussed earlier (in the section dealing with U.S. export restrictions). So, this isn't a very secure system.

The current version that is implemented is SSL 2.0. However, the SSL 3.0 specification is available; 3.0 enables the client, as well as the server, to be authenticated.

An SSL server runs on two ports. First, it answers on its normal port (80 by default) for unencrypted transactions, as always. It also answers on a second port, 443 by default, for encrypted transactions. If your URL for unencrypted transactions is http://www.utdallas.edu/, for example, your URL for encrypted transactions is https://www.utdallas.edu/. The only difference is at the beginning of the address: https instead of http.
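The URL and port relationship can be sketched with Python's standard urllib.parse module. The helper names are hypothetical, and the sketch assumes the server uses the default ports unless the URL names one explicitly.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def encrypted_url(url: str) -> str:
    """Turn an http:// URL into the matching https:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("expected an http:// URL")
    return parts._replace(scheme="https").geturl()


def port_for(url: str) -> int:
    """Port a client connects to: the explicit one, or the scheme's default."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    plain = "http://www.utdallas.edu/"
    secure = encrypted_url(plain)
    print(plain, "-> port", port_for(plain))
    print(secure, "-> port", port_for(secure))
```

A server on nonstandard ports would need the port written into both URLs, which this sketch does not try to translate.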

At the current time, SSL is implemented in the Netscape FastTrack Server, recent versions of the NCSA server, and Open Market's Secure Web Server. Patches are also available for the popular, free Apache Web server. SSL also is implemented in Netscape Navigator on the client side. For more information, see the following address:


http://home.netscape.com/newsref/std/SSL.html

If you need to combine SSL with a proxy-based firewall, see this address for a specification of how to do it:


ftp://ds.internic.net/internet-drafts/draft-luotonen-ssl-tunneling-02.txt

S-HTTP

S-HTTP (Secure HyperText Transfer Protocol) is a higher-level encryption scheme than SSL to protect Web transmission. Although S-HTTP and SSL seem to be in competition, nothing prevents you from using both in conjunction with each other. In fact, Open Market has implemented both in its Secure Web Server product. Netscape is considering support of S-HTTP as well as SSL in its products. URLs using S-HTTP start with shttp://.

The S-HTTP specification is being developed by CommerceNet and can be seen at the following addresses:


http://www.commerce.net/internet/standards/drafts/shttp.txt

ftp://ds.internic.net/internet-drafts/draft-ietf-wts-shttp-01.txt

You can find more information at http://www.eit.com/creations/s-http/. An S-HTTP server can be found at http://www.commerce.net/software/Shttpd. Patches for the CERN httpd server can also be found at these locations. There's also a version of Mosaic called Secure Mosaic, but it's available only to CommerceNet members.

Shen

Shen is a proposal similar in nature to S-HTTP. It hasn't received widespread support. However, it's being developed by Phillip Hallam-Baker of the W3 Consortium, and because the consortium is one of the key players in the Web standards world, you should keep an eye on it just in case. The message format it uses is inspired by PEM but unfortunately is not compatible with it. Shen is discussed at these addresses:


http://www.w3.org/hypertext/shen/ref/security_spec.html

http://www.w3.org/hypertext/WWW/Shen/ref/shen.html

S/MIME

S/MIME (Secure/Multipurpose Internet Mail Extensions) is a standard to exchange e-mail in encrypted form. The specification can be found at http://www.rsa.com/rsa/S-MIME.

As with PGP, public key encryption is used only to manage key exchange; the bulk of the encryption is done with private key encryption algorithms. S/MIME is flexible and enables the use of DES, Triple-DES, and RC2 as private key encryption algorithms.
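This split between the two kinds of encryption can be sketched as follows: a random session key encrypts the message body with a private key cipher, and only that short session key is encrypted with the recipient's public key. Everything here is a toy stand-in. The public key part is unpadded RSA with a small modulus, and the private key cipher is a hash-based XOR stream rather than DES, Triple-DES, or RC2, so none of it is secure or compatible with S/MIME or PGP.

```python
import hashlib
import secrets

# Toy RSA key pair built from two known Mersenne primes (far too small for real use).
P = 2 ** 127 - 1
Q = 2 ** 89 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy private key cipher: XOR the data with a SHA-256-derived keystream.
    Running it twice with the same key restores the original data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def hybrid_encrypt(message: bytes):
    """Return (session key wrapped with the public key, encrypted body)."""
    session_key = secrets.token_bytes(16)
    wrapped = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped, keystream_xor(session_key, message)


def hybrid_decrypt(wrapped: int, body: bytes) -> bytes:
    """Recover the session key with the private key, then decrypt the body."""
    session_key = pow(wrapped, D, N).to_bytes(16, "big")
    return keystream_xor(session_key, body)


if __name__ == "__main__":
    wrapped, body = hybrid_encrypt(b"Meet me at the usual place.")
    print(hybrid_decrypt(wrapped, body))
```

The slow public key operation touches only 16 bytes no matter how long the message is, which is the practical reason for this design.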

GSS-API

GSS-API (Generic Security Service Application Program Interface) is a programming interface for security that includes both client and server authentication as well as data encryption. It's "generic" because it was designed to work with any Internet service that needs security. When used with HTTP, the URLs start with gss-http://. GSS-API has its supporters, but hasn't been deployed much. It's an interesting approach, though. More information can be found at this address:


ftp://ietf.cnri.reston.va.us/internet-draft/draft-ietf-wts-gssapi-00.txt

SET

SET (Secure Electronic Transaction) is a standard for conducting credit card transactions across the Internet. It was developed by a group including MasterCard, Visa, Netscape, IBM, Microsoft, VeriSign, and GTE. American Express has shown support for it more recently. The major credit card companies state that they don't see encryption technology as a point of difference between them. They all agree that all transactions should be secure, whether handled by them or by their competition.

Because SET has the backing of so many major players in the electronic commerce world, it definitely bears watching. Expect that you will see the beginning of deployment in late 1996. It also is using X.509 certificates, just like SSL.

The SET standard is documented at this location:


http://www.visa.com/cgi-bin/vee/sf/standard.html

More technical information is available at


http://www.visa.com/cgi-bin/vee/set/settech?2+0

Summary

This chapter covered a lot of security-related material. It is enough to get you started and headed in the correct direction but by no means covers all the details. If you are working in any particular area, please use the references given to get more information.

However, one thing should be apparent to you now: In all of your Internet programming work, you should never ignore security; always think about the security implications of what you are doing. It is also a good practice to discuss your approach with someone else familiar with security to make sure you do not make an oversight. If you do these things, you will eliminate 99 percent of the potential security problems in advance.
