Chapter 30

When and Why to Use Native Methods



The goal for this chapter is to introduce you to Java's native methods. If you are new to Java, you may not know what native methods are, and even if you are an experienced Java developer, you may not have had a reason to learn more about them. At the conclusion of this chapter you should have a better understanding of what native methods are, when and why you may want to use them, and the consequences of using them. You should also have a basic understanding of how native methods work. You will then be more than ready to tackle the next three chapters, which dive into the nitty-gritty details of Java's native methods.

What Is a Native Method?

Simply put, a native method is the Java interface to non-Java code. It is Java's link to the "outside world." More specifically, a native method is a Java method whose implementation is provided by non-Java code, most likely C (see Figure 30.1). This feature is not special to Java. Most languages provide some mechanism to call routines written in another language. In C++, you must use an extern "C" declaration to tell the C++ compiler that the functions being called are C functions. Many C compilers accept a pascal qualifier to signal that a function should be called with the Pascal calling convention rather than the C convention. FORTRAN and Pascal have similar facilities, as do most other languages.

Figure 30.1 : A native method is a Java method whose implementation is provided by non-Java code.

In Java, this is done via native methods. In your Java class, you mark the methods you wish to implement outside of Java with the native method modifier, much as you would use the public or static modifiers. Then, rather than supplying the method's body, you simply place a semicolon in its place. As an example, the following class defines a variety of native methods:

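public class IHaveNatives
{
  // Instance method taking a primitive argument.
  public native void Native1( int x ) ;
  // Static method: callable without creating an object.
  public static native long Native2() ;
  // Synchronized: the VM locks the object's monitor before entering the native code.
  private synchronized native float Native3( Object o ) ;
  // Default (package) access, array argument, and a declared exception.
  native void Native4( int[] ary ) throws Exception ;
}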

This sample class shows a number of possible native methods. As you may have noticed, native methods look much like any other Java method, except a single semicolon is in the place of the method body. Naturally, the body of the method is implemented outside of Java. What you basically define is the interface into this external method. This method declaration describes the Java view of some foreign code.

The only thing special about this declaration is that the keyword native is used as a modifier. Every other Java method modifier can be used along with native, except abstract. This is logical, because the native modifier implies that an implementation exists, and the abstract modifier insists that there is no implementation. Your native methods can be static methods, thus not requiring the creation of an object (or instance of a class). This is often convenient when using native methods to access an existing C-based library. Naturally, native methods can limit their visibility with public, private, protected, or the unspecified default access. Native methods can also be synchronized (see Chapter 7, "Concurrency and Synchronization"). In the case of a synchronized native method, the Java VM performs the monitor locking before entering the native method implementation code. So, as with any other Java method, the developer is not burdened with doing the actual monitor locking and unlocking.
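
The modifier combinations described above can be inspected from Java itself. The following is a minimal sketch that uses the reflection classes of newer JDKs; ModifierSample, NativeModifierDemo, and modifiersOf() are names invented for this illustration. No native library is needed, because the native methods are only examined, never called.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical class used only for this illustration.
class ModifierSample {
    public native void plain(int x);
    public static native long shared();
    private synchronized native float guarded(Object o);
    // abstract native void broken();   // rejected by the compiler: abstract and native conflict
}

class NativeModifierDemo {
    // Returns the declared modifiers of the named method as text.
    static String modifiersOf(Class<?> type, String name) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return Modifier.toString(m.getModifiers());
            }
        }
        throw new IllegalArgumentException("no such method: " + name);
    }

    public static void main(String[] args) {
        // Print each method of the sample class with its modifiers.
        for (Method m : ModifierSample.class.getDeclaredMethods()) {
            System.out.println(m.getName() + ": " + Modifier.toString(m.getModifiers()));
        }
    }
}
```

Each line printed includes the word native alongside whatever other modifiers the method carries, which is all the Java side ever records about where the implementation lives.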

The example uses a variety (although not all) of types. This is because a native method can be passed any Java type. There is no special procedure within the Java code to pass data to the native method. However, the developer of native methods must be careful that his native methods behave properly when manipulating Java datatypes. Native methods do not undergo the same kinds of checking as a Java method, and they can easily corrupt a Java datatype if care is not taken (see Chapter 31, "The Native Method Interface").

A native method can accept and return any of the Java types, including class types. Of course, the power of exception handling is also available to native methods. The implementation of the native method can create and throw exceptions much as a Java method can. When a native method receives complex types, such as class types (Object in the example) or array types (the int[] in the example), it has access to the contents of those types. However, the method used to access the contents may vary depending on the Java implementation being used. The major point to remember is that you can access all the Java features from your native implementation code, but the access may be implementation-dependent and will surely not be as convenient or easy as it is from Java.

The presence of native methods does not affect how other classes call those methods. The caller does not even realize it is calling a native method, so no special code is generated, and the calling convention is the same as for any other method: the call depends on whether the method is virtual or static. The Java virtual machine handles all the details of making the call into the native method implementation. One minor exception may be methods marked with the final modifier. The Java implementation may take advantage of a final method and choose to inline its code. It is doubtful that this could be achieved with a native final method, but this is an optimization issue, not one of functionality. When a class containing native methods is subclassed, the subclass inherits the native method and can also override it, even with a Java method (that is, the overriding method can be implemented in Java). If a native method is also marked with the final modifier, a subclass is still prevented from overriding it.
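
As a minimal sketch of overriding a native method with a Java method, consider the classes below. NativeBase, JavaOverride, and OverrideDemo are hypothetical names, and no native library exists for them. The example runs anyway, because the overriding Java method is the one that virtual dispatch selects.

```java
// Hypothetical classes for illustration. No native library is ever loaded here.
class NativeBase {
    // The implementation would live in non-Java code.
    native int compute();

    // Callers use compute() like any other virtual method.
    int computeTwice() {
        return compute() + compute();
    }
}

class JavaOverride extends NativeBase {
    // A pure-Java override of the inherited native method.
    @Override
    int compute() {
        return 21;
    }
}

class OverrideDemo {
    // Calls through a NativeBase reference; dispatch reaches the Java override.
    static int run() {
        NativeBase b = new JavaOverride();
        return b.computeTwice();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that computeTwice() is written without any knowledge of whether compute() is native, which is the point made above about callers.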

Native methods are very powerful, because they effectively extend the Java virtual machine. In fact, your Java code already uses native methods. In the current implementation from Sun, native methods are used in many places to interface to the underlying operating system. This enables a Java program to go beyond the confines of the Java Runtime. With native methods, a Java program can do virtually any application-level task.

Uses for Native Methods

Java is a wonderful language to use. However, there are times when you either must interface with existing code, can't express the task in Java, or need the absolute best performance.

Accessing Outside the Java Environment

There are times when a Java application (or applet) must communicate with the environment outside of Java. This is, perhaps, the main reason for the existence of native methods. For starters, the Java implementation will need to communicate with the underlying system. That underlying system may be an operating system such as Solaris or Win32, or it may be a Web browser, or it may be custom hardware, such as a PDA, set-top device, and so forth. Regardless of what is under Java, there must be a mechanism to communicate with that system. At some point in a Java program, Java meets the outside world, and there must be an interface between the Java and non-Java worlds. Native methods provide a simple, clean approach to providing this interface without burdening the rest of the Java application with special knowledge.

Accessing the Operating System

The Java virtual machine describes a system that the Java program can rely on to be there. This virtual machine supports the Java Language and its runtime library. It may be composed of an interpreter or can be libraries linked to native code. Regardless of its form, it is not a complete system and often relies on an existing system underneath to provide a lot of support. More than likely, a full-fledged operating system, such as Solaris or Win32, resides beneath it. The use of native methods enables the Java Runtime to be written in Java yet have access to the underlying operating system, or even the parts of the Java virtual machine that may be written in a language such as C. Further, if a Java feature does not encapsulate an operating system feature needed by an application, native methods can be used to access this feature.

Embedded Java

It is conceivable to see a Java virtual machine embedded inside another program. Several WWW browsers come to mind. Perhaps this enclosing program is not implemented in Java. The Java Runtime may need to access the enclosing program for services to support the Java environment. Once again, native methods provide a clean interface for this access to the surrounding program. Furthermore, the vendor of the program may wish to expose some features of the program to a Java applet. The vendor would simply need to create a set of Java classes containing native methods, which provide the interface for the Java application into the program. The native method implementation would then be the "glue" between the Java applet and the internals of the enclosing program.

Custom Hardware

Another important use of native methods to reach the non-Java world is giving Java programs access to custom hardware. Perhaps a Java virtual machine is running within a PDA or set-top device. A lot of what would normally be in an operating system may exist in hardware or software embedded in ROM, or other custom chip sets. Another possibility is that a computer may be equipped with a dedicated graphics card. It would be ideal to have Java make use of the graphics hardware. A set of Java classes with native methods defined would provide the Java program access to these features.

Sun's Java

In the current implementation from Sun, the Java interpreter is written in C and can thus talk to the outside environment as any normal C program can. A majority of the Java Runtime is written in Java and may make calls into the interpreter or directly to the outside environment, all via native methods. The application deals mostly with the Java Runtime, but it may also talk to the outside environment via native methods. For example, in the class java.lang.Thread the setPriority() method is implemented in Java but calls the method setPriority0(), which is a native method in the Thread class. This native method is implemented in C and resides within the Java virtual machine itself. On the Windows 95 platform this native method will then call (eventually) the Win32 SetThreadPriority() API. This is an example where the native method implementation is provided by the Java virtual machine directly. In most cases the native method implementation resides in an external dynamic link library (discussed in a following section), but the call still goes through the Java virtual machine.
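
The runtime's reliance on native methods can be observed with reflection. The following is a minimal sketch, not taken from the Sun sources; RuntimeNatives and nativeMethodNames() are names invented for this illustration. The list it prints depends on the Java implementation and version, so a helper such as setPriority0() may or may not appear under that name on your system.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class RuntimeNatives {
    // Collects the names of the methods a class declares with the native modifier.
    static List<String> nativeMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The exact contents vary between Java implementations and versions.
        System.out.println(nativeMethodNames(Thread.class));
    }
}
```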

Performance

Another major reason for native methods is performance. The Java language trades some performance for features like its dynamic nature, garbage collection, and safety. Some Java implementations, like the current crop, may be interpreters, which add extra overhead. The lost performance may shrink as the implementation technology for Java systems improves, but until then, and even after, there may always be a small performance overhead for certain functionality a Java program may need. This functionality can be pushed down into a native method. That native method can then be implemented efficiently at the native lower level of the system on which the Java virtual machine is running. Once at the native implementation level, the developer can use the best-suited language, such as C or even assembler. In this way, maximum performance can be achieved in those specific areas while the bulk of the application is done within the safe and robust Java virtual machine. One area where you may choose to implement some parts of an application in native methods is time-intensive computations, such as graphics rendering, simulation models, and so forth.

Accessing Existing Libraries

The fact that Java is targeted at the production of platform-neutral code means that the current implementations may not give you access to every system feature or library you need. An example is a database engine. If you need to, you can use the native method facility to provide your own interface to such libraries. Further, you may want to use Java to write applications that use existing in-house libraries. Again, the use of native methods enables you to make such an interface. This enables you to leverage your existing code base as well as gradually introduce Java-based applications among your other applications coded in an older language.

Benefits and Trade-Offs

The presence of native methods offers many benefits, the biggest being the extension of Java's power. However, there is always a downside to all good things, and native methods definitely have their downsides. Depending on what the goals of your application are, the downsides may not be that terrible. Foremost is the fact that, by definition, the use of native methods defeats several of Java's main goals: platform neutrality, system safety, and security.

Some of Java's attractive features help minimize the downsides, however. The best feature of all is that Java is such a nice language to develop in that you won't want to use native methods unless you have to.

Platform Neutrality

Because a native method is implemented in a foreign language, the platform neutrality is limited to the language being used. Most likely, native methods are implemented in C or C++. Although those languages have standards, these standards leave a lot of room for implementation-defined attributes (even compilers on the same system may differ), so your mileage may vary. If the native method accesses the underlying system, you are tying your implementation to that system. For example, the file systems of UNIX and Win32 have some differences. There may even be differences between flavors of UNIX and Win32 (Win95 and WinNT are not identical). Once again, you may sacrifice your platform neutrality with your native method. This may cause you to have to support a limited number of platforms (rather than all Java platforms). Further, for each platform you choose to support, you may (probably will) have to implement several flavors of the native method.

The Java language and runtime provide a number of features that make applications more robust and safe. Java's memory management, synchronization features, and lack of address manipulation help prevent common programming mistakes from slipping through the development and testing phases of your product. However, once you drop out of Java into a native method, you are, once again, at the mercy of the language and system in which you are implementing the native method. If your native method implemented in C chooses to manipulate an address directly, you risk corrupting some part of memory, perhaps even the Java virtual machine itself.

Security Concerns

Additionally, the Java Language provides features to aid in the writing of secure applications. A Java virtual machine is much more capable of detecting an "evil" Java program than an application in other languages. Once you drop into a native method, the Java virtual machine can no longer verify, catch, or prevent the program from violating the security of the environment in which the Java virtual machine is running. This is the reason a Java-enabled Web browser typically does not allow a nontrusted native method to be called. In today's browsers, a trusted native method must be present on the local system in a certain location to be executed from an arriving applet (in other words, one loaded from a remote site). For more information on security, in general, see Part 6, "Security." For more information on how security applies to native methods, see Chapter 33, "Securing Your Native Method Libraries."

System Safety

Another potential hazard is the fact that a native method is not isolated. When a native method is entered, it not only accesses the environment outside the Java virtual machine, it also freely accesses the Java virtual machine directly. This is a necessary evil. It gives the native method quite a bit of power and flexibility, because it may need access to information kept within the virtual machine to do its job. This flexibility, however, exposes the internals of the Java virtual machine to the native method.

Dependence on the Java Implementation

It should be obvious that the implementation of native methods is also dependent on the Java implementation itself. This means that the native methods you write today for use with the Sun implementation of Java may not work with a Java implementation from another vendor.

The interface used for the Java virtual machine to call out to native methods and the interface that native methods use to access the internal functions and data structures of the Java virtual machine are not, currently, defined by either the Java Language Specification or the Java Virtual Machine Specification. A lot of native methods call back into the Java virtual machine for instantiating new objects, calling Java methods, throwing Java exceptions, and so forth. Further, the method used to lay out Java types is also not defined. So, although your native method of today knows how to access the fields of an object, this could be different on the Java virtual machine of tomorrow. The situation would improve greatly if a standard API were defined for both how a Java program interacts with a native method and how a native method accesses data within the Java virtual machine. Even with such an API, implementation-defined behavior will likely still be present.

Java to the Rescue!

Recall that Java helps to minimize the damage of native methods. When you find yourself in the position that you must use native methods, you can take advantage of Java's features to help isolate the usage and perhaps maintain a fair amount of Java's advantages.

The Java Class System

Because Java narrows the use of native facilities to within the confines of a method, it does not affect the design of the program. A program is still a collection of classes, and all classes still communicate with each other via their defined interface (that is, the classes' methods). Thus the callers of native methods do not know they are calling native methods. Because methods are discrete operations on the data of a specific object, they tend to be small chunks of code. This implies that native methods tend to be conceptually small, easily managed pieces of code.

Java Still Works for You

Java will still perform a variety of duties (such as parameter checking, stack checking, synchronization, and so forth) before entering the actual native code. This greatly aids the developer in writing correct native methods. A native method is capable of creating new objects and calling Java methods, and it can even cause exceptions to be thrown. In the current implementation from Sun, an exception can be created by a native method and registered for throwing. When the Java virtual machine regains control from the native method, usually because that method returns, the exception is then thrown.

It's a good idea to make your native methods as small as possible and have them do a specific task. Do the work that needs to be done and pass the information back into the Java method. It's also wise to have your Java classes make the native methods private, then provide a public Java method that will call the private native method. This enables the Java method to perform error checks and other data manipulations, freeing your actual native method implementation to focus on its simple task.
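
The advice above can be sketched as follows. The class Checksum, its methods, and the idea of a native checksum routine are all assumptions made for this illustration; no such library accompanies this chapter. The public Java method validates its arguments, so the private native method can assume they are sane.

```java
// Hypothetical class: the names and the library behind checksum0() are assumptions.
class Checksum {
    // The native method stays private, so its implementation can assume valid input.
    private native int checksum0(byte[] data, int offset, int length);

    // Public Java entry point: validates arguments before crossing into native code.
    public int checksum(byte[] data, int offset, int length) {
        if (data == null) {
            throw new IllegalArgumentException("data must not be null");
        }
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new IllegalArgumentException("range out of bounds");
        }
        return checksum0(data, offset, length);
    }
}

class WrapperDemo {
    // Returns true if the wrapper rejects the arguments before the native method is reached.
    static boolean rejects(byte[] data, int offset, int length) {
        try {
            new Checksum().checksum(data, offset, length);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Valid arguments reached the native call; no library is loaded in this sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(new byte[4], 2, 8));
    }
}
```

Because bad ranges are caught in Java, the C side never has to defend against an offset that would walk off the end of the array, which is exactly the kind of mistake that is hardest to diagnose in native code.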

How Does This Magic Work?

Much of the magic of making native methods work is provided in the next three chapters. This section provides an introduction, which neglects many of the details but should give you a good frame of reference for understanding the following chapters. If you don't really want to use native methods, but just want a basic understanding, this discussion should satisfy your needs.

Sun's Implementation

Due to the lack of a well-defined interface between a Java implementation and its surrounding environment, the details of writing native methods will most likely be specific to the implementation of the Java system you are using. The next sections are based on the implementation provided by Sun on the Solaris-Sparc and Win32-Intel platforms.

Using Dynamic Linking

Sun's Java implementation interfaces to native methods by using the dynamic linking capabilities of the underlying operating system. The Java virtual machine is a complete program, which is already compiled for its respective platform. The nature of Java enables it to absorb a Java class easily and execute its behavior. However, for a compiled native method, things are not so simple. Somehow, the Java virtual machine must be taught how to call this native method. This is done by requiring the implementation of native methods to reside in a dynamic link library, which the operating system magically loads and links into the process that is running the Java virtual machine. On the Solaris platform, such libraries are often called shared objects, shared libraries, or simply dot-so's (.so's). On Win32 platforms, they are called dynamic link libraries (DLLs). This chapter uses DLL to refer to both.

Both Solaris and Win32 provide the necessary capabilities to achieve this dynamic linking. The dynamic linking facilities of both Solaris and Win32 are similar in concept, but differ in their details. This chapter does not attempt to describe the two in detail; however, if you wish to do native method programming, you should understand the mechanism used by your platform. On Solaris, you can begin by viewing the manual page on the dlopen() function and its relatives. On Win32, start with the help file on LoadLibrary() and its relatives. Further, you should understand the calling conventions and linking convention used by your compiler.

Sometime before a native method is invoked, the Java virtual machine must be told to find, load, and link the necessary DLLs, which contain the native method implementations. This is conveniently achieved by using the static method java.lang.System.loadLibrary( "mystuff" ). It is worth noting here that the name passed is not the actual filename of the DLL. Java maps the passed name into an expected filename, appropriate for the underlying system, of the DLL. In the call described previously, the string "mystuff" is mapped to a DLL named libmystuff.so on Solaris and mystuff.dll on Win32. If you run the Java program under a debugger, Java conveniently maps the same name "mystuff" to libmystuff_g.so and mystuff_g.dll. This enables you to supply two versions of the DLL-one with debug symbols, one without. Java magically finds the right one, dependent on whether you run under a debugger or not.
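
The name mapping can be observed directly, assuming a Java runtime that provides System.mapLibraryName(), a method found in JDKs newer than the one this chapter describes. The following is a minimal sketch; LibraryLoading, fileNameFor(), and tryLoad() are names invented here, and no mystuff library is assumed to exist. The debug (_g) variants are not shown.

```java
class LibraryLoading {
    // Returns the platform-specific file name the runtime derives from a library name.
    static String fileNameFor(String libraryName) {
        return System.mapLibraryName(libraryName);
    }

    // Attempts to load a library, reporting failure instead of propagating the error.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when no matching file is found on the library search path.
            return false;
        }
    }

    public static void main(String[] args) {
        // The mapped name differs by platform, for example a .so or .dll file name.
        System.out.println(fileNameFor("mystuff"));
        System.out.println(tryLoad("mystuff"));
    }
}
```

In real code the loadLibrary() call usually sits in a static initializer of the class that declares the native methods, so the DLL is loaded before any of them can be called.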

Defining the Calling Convention

In essence, Sun defines the method its Java virtual machine will use to call external functions. In order to dynamically link and call the implementation of a native method successfully, the Java virtual machine must know several details. It must know the name of the function within the DLL (the implementation of the native method) to locate the symbol and its entry point. It also must know how to call that function (its return type, number of parameters, and types of parameters). The Java virtual machine expects the functions to be coded in C using the calling conventions appropriate for the underlying architecture (and compiler).

In simple terms, this means the functions Java calls in the DLL must have names Java knows. If you are trying to get Java to call into your existing library, unless your functions magically match the names Java expects (unlikely), you will usually have glue code, which sits between Java and your real functions. Java will call the glue functions, which in turn call your functions. Alternatively, you can modify your functions to use the names and parameters Java expects, thus eliminating this extra call; however, in practice this is not always feasible, especially when calling existing libraries. Figure 30.2 shows the most likely scenarios of how your code will be segmented.

Figure 30.2 : Java's use of Dynamic Link Libraries.

The Sun JDK provides a tool, named javah, to help you create your native method implementation functions. The developer of native methods runs javah, passing it the name of a class. javah emits both a header file (.h) and a code file (.c) containing information about each native method and relevant type declarations. The .h file contains the prototypes of the functions Java will call, and thus expects to find in the DLL. The .c file contains stubs for each function. Thus, the developer needs to fill in only the details of the functions in the .c file and build the DLL appropriately.

How the Virtual Machine Makes It Work

When a class is first used by Java, its class descriptor is loaded into memory. The class descriptor can be thought of as a directory for all services provided by the class-there is only one class descriptor loaded, regardless of how many instances of that class exist. Among its entries is a list of method descriptors, which contain information specific to methods, including where the code is, what parameters they take, and method modifiers.

If a method descriptor has its native modifier set, the descriptor will include a pointer to the function that implements that native method. This function resides in some DLL but will be loaded into the Java process's address space by the operating system. At the time the class descriptor with native methods is loaded, the associated DLL does not have to be loaded, and thus the function pointer will not be set. Sometime prior to a native method being called, the associated DLL should be loaded. This is done via a call to java.lang.System.loadLibrary(). When this call is made, Java will find and load the DLL but will still not resolve symbols; the resolution phase is delayed until the point of use. At the time of a call to a native method, Java will first check to see whether the native method implementation function has already been resolved (that is, its pointer is not null). If it has been previously resolved, the call is performed; otherwise, the resolution of the symbols is attempted. The resolution is performed by making an operating system call to see whether the symbol exists in the caller's address space. This includes the Java process and any DLLs loaded on its behalf. On Win32, this is done via a GetProcAddress() call and on Solaris via a dlsym() call.
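
The resolve-on-first-use behavior described above can be modeled in a few lines of Java. This is a toy model written for this discussion, not the virtual machine's actual code: LazyLinker and all its members are hypothetical, a Map stands in for each loaded DLL, and a second Map stands in for the function pointers held by method descriptors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// A toy model of lazy symbol resolution. Every name here is hypothetical.
class LazyLinker {
    // Each map stands in for a loaded DLL: symbol name to entry point.
    private final List<Map<String, IntUnaryOperator>> loadedLibraries = new ArrayList<>();
    // Stands in for the function pointers in method descriptors; absent means unresolved.
    private final Map<String, IntUnaryOperator> resolved = new HashMap<>();
    // Counts how often the (modeled) operating system lookup is performed.
    int lookups = 0;

    // Models loadLibrary(): the library becomes visible, but no symbol is resolved yet.
    void loadLibrary(Map<String, IntUnaryOperator> symbols) {
        loadedLibraries.add(symbols);
    }

    // Models a native call: resolve on first use, then reuse the cached pointer.
    int call(String symbol, int arg) {
        IntUnaryOperator entry = resolved.get(symbol);
        if (entry == null) {
            entry = lookup(symbol);
            resolved.put(symbol, entry);
        }
        return entry.applyAsInt(arg);
    }

    // Models the dlsym()/GetProcAddress() search across everything loaded so far.
    private IntUnaryOperator lookup(String symbol) {
        lookups++;
        for (Map<String, IntUnaryOperator> library : loadedLibraries) {
            IntUnaryOperator entry = library.get(symbol);
            if (entry != null) {
                return entry;
            }
        }
        throw new UnsatisfiedLinkError(symbol);
    }
}

class LazyLinkerDemo {
    public static void main(String[] args) {
        LazyLinker vm = new LazyLinker();
        Map<String, IntUnaryOperator> library = new HashMap<>();
        library.put("twice", x -> x * 2);
        vm.loadLibrary(library);                  // loaded, nothing resolved yet
        System.out.println(vm.call("twice", 21)); // first call triggers the lookup
        System.out.println(vm.call("twice", 5));  // second call uses the cached entry
        System.out.println(vm.lookups);
    }
}
```

The design choice worth noticing is that the cost of the symbol search is paid once per method, at the first call, rather than for every symbol when the library is loaded.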

If the symbols are correctly resolved, the call is performed as if the Java virtual machine were making a standard C call to its own internal functions. If the resolution fails, the exception java.lang.UnsatisfiedLinkError will be thrown at the point of the native method call.
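
The failure case is easy to reproduce. In the minimal sketch below, Unlinked and LinkErrorDemo are hypothetical names and no library supplying ping() is ever loaded. Creating the object succeeds; the error appears only when the native method is actually called.

```java
// Hypothetical class: no library providing ping() is ever loaded.
class Unlinked {
    native int ping();
}

class LinkErrorDemo {
    // Returns true if calling the unresolved native method raised UnsatisfiedLinkError.
    static boolean callFails() {
        Unlinked u = new Unlinked();   // creating the object succeeds
        try {
            u.ping();                  // resolution is attempted here and fails
            return false;
        } catch (UnsatisfiedLinkError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callFails());
    }
}
```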

Summary

You should now have a basic understanding of how native methods enable a Java program to access the outside environment. Whether that consists of an operating system, a browser, or your own existing libraries, your Java code can reach them. It should now be clear that native methods do not come without some cost. You lose a lot of the benefits of the Java language. When there is no choice, however, native methods are there to be used. With the basic understanding of how native methods work you should be ready to tackle the next chapters, which provide more in-depth examples of native methods in action, as well as more tips and tricks to help you.