In a language like C or C++, the compilation model is fairly simple.
You compile a source file to an object file, and then later invoke a
linker to combine the object files with libraries into an executable
program. More recently, shared libraries and so forth have been used
in the compilation model.
Java is a bit different in its approach. We will be describing how it
handles program packaging and compilation in this and subsequent
issues.
The first thing to mention is that the Java compiler (javac) will
compile more than just the one source program given to it on the
command line. It will go off and do other compilation as necessary.
The result of compilation is an xxx.class file, containing the
bytecodes and a description of the interface (set of methods and so
on) offered by the class contained within.
A public class should be defined within a file of the same name as the
class, with lower/upper case significant. A public class uses the
keyword "public". An example of a public class is given in the above
example on applets. "ga" is a public class, "AppFrame" is not. We
will mention more about public classes in a moment. A Java source
file may contain only one public class definition.
The next item of importance is the CLASSPATH environment variable.
This is a set of directories that is used by the compiler and
interpreter to find classes. For example, in a Windows NT
installation, the setting might be:

    CLASSPATH=".;d:/java/lib"

meaning that the current directory and then the directory d:/java/lib
are searched. In UNIX the separator is ":" instead of ";".
Searched for what? Source files, class files, and packages. What are
packages? Packages are groupings of related classes. If I say:

    // file A.java
    package X;
    public class A {}

    // file B.java
    package X;
    public class B {}

then A and B are grouped in the package X. A somewhat similar feature
in C++ is namespaces.
Packages are tied in with the CLASSPATH variable and the file system.
Declaring a package X as in the example means that somewhere along the
CLASSPATH directories there will be a directory X, with files A.java,
B.java, A.class, and B.class in it. If the current directory is first
in the CLASSPATH list, then this means creating subdirectories of the
current directory, each subdirectory having the name of a package.
This works for system directories as well. For example, looking under
d:/java/lib, there are directories java/applet, java/awt, java/io,
java/lang, and so on. These tie in directly with import directives of
the form:

    import java.io.*;
    import java.awt.*;

and so on. Note that:

    import java.lang.*;

is supplied implicitly and so does not have to be specified by the
programmer. "*" means to import all classes in the package.
Package names can be used to qualify program entities. For example, I
can say:

    java.lang.System.out.println("Howdy!");

In fact, there is some discussion about making the highest levels of
the package hierarchy correspond to Internet domain names, as in:

    COM.glenmccl.java.lang.System.out.println("Howdy!");

If this is done, then you won't have to worry about creating packages
and classes that interfere with those produced by others!
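
To make the point about qualification concrete, here is a minimal sketch that uses no import directives at all; the one class outside java.lang is named by its full package path. The names QualifiedNames and describe() are made up for illustration and do not come from the discussion above.

```java
// Hypothetical example class; note that there are no import directives.
class QualifiedNames {
    // Creates a Vector through its fully qualified name and reports
    // the name of the class that was actually instantiated.
    static String describe() {
        java.util.Vector<String> v = new java.util.Vector<String>();
        v.addElement("Howdy!");
        return v.getClass().getName() + " holds " + v.size() + " element";
    }

    public static void main(String[] args) {
        // java.lang is imported implicitly, but can still be spelled out
        java.lang.System.out.println(describe());
    }
}
```

Qualifying every name like this gets verbose quickly, which is why import directives are the usual choice.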
We mentioned above the distinction between public and non-public
classes. Non-public classes in packages cannot be imported into a
program. For example, this sequence is illegal:

    // file y1.java in Z/y1.java relative to current directory
    package Z;
    /*public*/ class y1 {}

    // file y2.java in current directory
    import Z.*;
    public class y2 {
        public static void main(String args[]) {
            y1 y = new y1();
        }
    }

To wrap up this discussion for now, here is a longer example. There
are two classes that do statistical analysis, one for descriptive
statistics like mean and standard deviation and the other for doing
correlation. Don't worry too much if you don't know statistics; this
example is really about packaging.
We use a package called Stat for grouping these classes.

    // file Stat/descr.java in subdirectory
    package Stat;

    public class descr {
        private long n;      // count of numbers seen
        private double x;    // sum of X
        private double x2;   // sum of X^2

        // constructor
        public descr() {
            n = 0;
            x = x2 = 0.0;
        }

        // add a number to the pool
        public void add(double d) {
            n++;
            x += d;
            x2 += d * d;
        }

        // retrieve the count of numbers seen
        public long cnt() {
            return n;
        }

        // return mean (average)
        public double mean() {
            if (n < 1)
                throw new ArithmeticException();
            return x / (double)n;
        }

        // return standard deviation
        public double stdev() {
            if (n < 2)
                throw new ArithmeticException();
            double d1 = (double)n * x2 - x * x;
            double d2 = (double)n * (double)(n - 1);
            return Math.sqrt(d1 / d2);
        }
    }

    // file Stat/corr.java in subdirectory
    package Stat;

    public class corr {
        private long n;      // number of values seen
        private double x;    // sum of X
        private double y;    // sum of Y
        private double x2;   // sum of X^2
        private double y2;   // sum of Y^2
        private double xy;   // sum of X*Y

        // constructor
        public corr() {
            n = 0;
            x = y = x2 = y2 = xy = 0.0;
        }

        // add in a pair of numbers
        public void add(double a, double b) {
            n++;
            x += a;
            y += b;
            x2 += a * a;
            y2 += b * b;
            xy += a * b;
        }

        // return count
        public long cnt() {
            return n;
        }

        // get correlation
        public double getcorr() {
            if (n < 2)
                throw new ArithmeticException();
            double d0 = (double)n * xy - x * y;
            double d1 = Math.sqrt((double)n * x2 - x * x);
            double d2 = Math.sqrt((double)n * y2 - y * y);
            return d0 / (d1 * d2);
        }

        // get a for y = ax + b
        public double geta() {
            if (n < 2)
                throw new ArithmeticException();
            return ((double)n * xy - x * y) /
                ((double)n * x2 - x * x);
        }

        // get b for y = ax + b
        public double getb() {
            if (n < 2)
                throw new ArithmeticException();
            return y / (double)n - x / (double)n * geta();
        }
    }

    // file stest.java in current directory
    import Stat.*;

    public class stest {
        public static void main(String args[]) {
            // test descriptive statistics
            descr sd = new descr();
            for (int i = 1; i <= 10; i++)
                sd.add((double)i);
            System.out.println("count = " + sd.cnt());
            System.out.println("mean = " + sd.mean());
            System.out.println("std dev = " + sd.stdev());

            // test correlation
            corr co = new corr();
            for (int j = 1; j <= 10; j++)
                co.add((double)j, (double)j * 2.5 + 9.0);
            System.out.println("count = " + co.cnt());
            System.out.println("correlation = " + co.getcorr());
            System.out.print("y = " + co.geta());
            System.out.println("x + " + co.getb());
        }
    }

Java 1.1 also introduces inner classes, which are classes defined within
another class. This is a complicated topic that we will spend several
issues covering in full. One type of inner class is an anonymous class,
which we can illustrate with an example:

    import java.awt.*;
    import java.awt.event.*;

    public class button {
        public static void main(String args[]) {
            Frame f = new Frame("testing");
            Panel p = new Panel();
            Button b = new Button("OK");

            b.addActionListener(
                new ActionListener() {
                    public void actionPerformed(ActionEvent e) {
                        System.err.println("got here");
                        System.exit(0);
                    }
                }
            );

            p.add(b);
            f.add(p);
            f.pack();
            f.setVisible(true);
        }
    }

This example sets up an AWT frame with a panel, and adds a button
containing "OK" to the panel. When the button is selected, the program
terminates. With the 1.1 event delegation model, we want to bind an
action listener to the button, such that a method actionPerformed() will
be called when the button is selected. That is, we want to implement
the interface ActionListener.
This could be done via a separate class, or by having button implement
ActionListener. We've seen examples of such usage before. But a more
contained approach is to use an anonymous class. When I said:

    new ActionListener() { ... }

I in fact declared a new class, with no name, that implements the
interface ActionListener. To actually implement this interface,
actionPerformed() must of course be defined.
If ActionListener were a class, rather than an interface, saying something like:

    new ActionListener(arg1, arg2, ...) { ... }

would declare a subclass of ActionListener, and the arguments would be
passed to the superclass constructor.
An anonymous class has no name, and therefore has no constructor. An
anonymous class may refer to fields and methods in its containing class.
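
The two points above can be tried without any AWT code. The sketch below is only an illustration with made-up names (Greeter, AnonDemo, prefix): it creates an anonymous subclass of an ordinary class, passes an argument through to the superclass constructor, and reads a field of the containing class from inside the anonymous class.

```java
// Hypothetical superclass with a one-argument constructor.
class Greeter {
    protected final String name;

    Greeter(String name) {
        this.name = name;
    }

    String greet() {
        return name;
    }
}

class AnonDemo {
    private static String prefix = "Hello, ";   // field of the containing class

    static String run() {
        // Anonymous subclass of Greeter. "World" is handed to the
        // superclass constructor, because the anonymous class cannot
        // declare a constructor of its own.
        Greeter g = new Greeter("World") {
            String greet() {
                // refers to a field of the containing class
                return prefix + name;
            }
        };
        return g.greet();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "Hello, World"
    }
}
```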
Whether anonymous classes are a "good" feature is a hard one to call.
They are very convenient, as this example illustrates. But they make
code harder to read, cause problems for debuggers and documentation
tools, and so on.
Another type of class is what is called a nested
class, where a class is declared within another class as a sort of
helper. Nested classes are declared using the "static" modifier. To
see what this looks like, consider an example of managing school records
consisting of name/grade pairs:

    import java.util.Vector;

    public class RecordList {
        private Vector recs = null;

        public RecordList() {
            recs = new Vector();
        }

        public void addRecord(String name, int grade) {
            recs.addElement(new rec(name, grade));
        }

        public void dumpAll() {
            int sz = recs.size();
            for (int i = 0; i < sz; i++) {
                rec r = (rec)recs.elementAt(i);
                System.out.println(r.name + " " + r.grade);
            }
        }

        public static void main(String args[]) {
            RecordList rl = new RecordList();
            rl.addRecord("Jane Smith", 57);
            rl.addRecord("John Jones", 43);
            rl.addRecord("Nancy Williams", 51);
            rl.dumpAll();
        }

        private static class rec {
            private String name = null;
            private int grade = 0;

            private rec(String n, int g) {
                name = n;
                grade = g;
            }
        }
    }

There is a public class RecordList that we create an instance of and
then feed name/grade pairs to. It in turn uses a helper class "rec" to
actually record the pairs. The individual records are stored in a
Vector object for later retrieval.
rec is a separate class, nested within RecordList. It has access to the
static members, if any, of its containing class. But it does not have
access to the non-static members in RecordList, because this would have
no meaning -- object instances of rec are created independently of those
of RecordList (another variation, the member class, does allow such
access).
There are various other ways in which this application could be
implemented. For example, rec could be broken out as a separate
top-level public class. But sometimes it's better to hide a helper
class, if its use is limited to a narrow application.
The 1.1 edition of David Flanagan's "Java in a Nutshell" has a good
discussion of nested/member/local/anonymous classes.
In the previous two issues we discussed the use of anonymous and nested
classes. These provide for the nesting of one class inside of another.
Another variation on this feature is local classes, where a class is
defined within a method.
As a simple example of how local classes work, consider an application
that needs to create a listener for an AWT object:

    import java.awt.*;
    import java.awt.event.*;

    public class local1 {
        public static void main(String args[]) {
            Frame f = new Frame("testing");
            Panel p = new Panel();

            class listen implements ActionListener {
                public void actionPerformed(ActionEvent e) {
                    System.exit(0);
                }
            }

            Button b = new Button("Exit");
            b.addActionListener( new listen() );

            p.add(b);
            f.add(p);
            f.pack();
            f.setSize(100, 100);
            f.setVisible(true);
        }
    }

For our example, the listener must implement the ActionListener
interface, and define actionPerformed(). But realize that the listener
class could be local1 itself, a top-level non-public class in the same
compilation unit, an anonymous class, or a local class. We have chosen
the last of these.
A local class is visible only within the method where it's defined, and
cannot have public/protected/private modifiers on the class definition
(they wouldn't mean anything). Such a class can use fields from its
enclosing class, and final (unchangeable) local variables from its
enclosing method. For example:

    public class local2 {
        private static int a = 37;
        private int b = 47;

        public void f() {
            int c = 57;
            final int d = 67;

            class Z {
                Z() {
                    System.out.println(a);
                    System.out.println(b);
                    //System.out.println(c);  // error
                    System.out.println(d);
                }
            }

            Z z = new Z();
        }

        public static void main(String args[]) {
            local2 x = new local2();
            x.f();
        }
    }

The reason that non-final local variables are not allowed is that the
implementation of local classes makes a hidden copy of local variables
for use by the class. So the local class is accessing a copy of the
local variables rather than the variables themselves, and thus only
access to unchanging variables makes sense.
Anonymous and local classes have some overlap, and both can be used to
solve similar problems. An anonymous class is somewhat cryptic in
nature, while a local class is more verbose and easier to understand.
In Java, a class organization such as:

    class A {}
    class B extends A {}

results in a superclass (A) and a subclass (B). References to B objects
may be assigned to A references, and if an A reference "really" refers
to a B, then B's methods will be called in preference to A's. All of
this is a standard part of the object-oriented programming paradigm
offered by Java.
But there is a way to modify this type of organization, by declaring a
class to be final. If I say:

    final class A {}

then that means that A cannot be further extended or subclassed.
This feature has a couple of big implications. One is that it allows
control over a class, so that no one can subclass the class and possibly
introduce anomalous behavior. For example, java.lang.String is a final
class. This means, for example, that I can't subclass String and
provide my own length() method that does something very different from
returning the string length.
There is also a big performance issue with final classes. If a class is
final, then all of its methods are implicitly final as well, that is,
each method is guaranteed not to be overridden in any subclass. A Java
compiler may be able to inline a final method. For example, this
program:

    final class A {
        private int type;
        public int getType() {return type;}
    }

    public class test {
        public static void main(String args[]) {
            int N = 5000000;
            int i = N;
            int t = 0;
            A aref = new A();
            while (i-- > 0)
                t = aref.getType();
        }
    }

runs about twice as fast when the class is declared final.
Of course, much of the time it's desirable to use the superclass /
subclass paradigm to the full, and not worry about wringing out the last
bit of speed. But sometimes you have heavily used methods that you'd
like to have expanded inline, and a final class is one way of achieving
that.
In previous issues we've discussed nested, local, and anonymous classes,
all part of the new inner class feature of Java 1.1. There is one more
type of inner class to consider, the member class.
A member class is a class defined within another class. Each instance
of the member class is associated with a corresponding instance of the
defining class. Unlike a nested class, methods of a member class can
access instance fields in a containing class.
A nested class must be declared static, and is used to group related
classes together. A member class, on the other hand, implies a more
intimate relationship, with each instance of the class corresponding to
an instance of the enclosing class.
To see how this works, consider an example like:

    public class mem1 {
        private int i = 37;

        class mem11 {}

        public static void main(String args[]) {
            mem1 ref1 = new mem1();
            mem11 ref11 = ref1.new mem11();
        }
    }

mem1 is an ordinary top-level class, and mem11 a member class within it.
New instances of mem1 are created in the usual way, but new instances of
mem11 require new syntax:

    mem11 ref11 = ref1.new mem11();

That is, an instance of mem11 is being created in the context of a
specific enclosing instance of mem1, which is referred to by ref1.
A further example of how member classes are used is illustrated by this
code:

    public class mem2 {
        int i = 0;

        class mem22 {
            int i = 0;

            void f() {
                this.i = 37;
                mem2.this.i = 47;
                System.out.println(this.i);
                System.out.println(mem2.this.i);
            }
        }

        public static void main(String args[]) {
            mem2 m2 = new mem2();
            mem22 m22 = m2.new mem22();
            m22.f();
        }
    }

We create the new top-level and member class instances, and then call
the f() method in the member class. Both the enclosing and member
classes have an "i" field, and to access each of these, we say:

    this.i = 37;
    mem2.this.i = 47;

"mem2.this" is the reference to the enclosing instance of mem2.
Member classes are most useful as helper classes to other classes, in
situations where the helper class needs to get at the instance variables
of the containing class. An example might be some type of a data
structure class like a tree, that defines a member class implementing
java.util.Enumeration to traverse the structure.
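
Here is a rough sketch of that idea. To keep it short, a fixed-size array stands in for the tree, and the names IntBag, Walker, and sum() are invented for the example. The member class Walker implements java.util.Enumeration and reads the private instance fields of the IntBag that created it.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical container; a plain array stands in for a real tree.
class IntBag {
    private int[] items;
    private int count = 0;

    IntBag(int capacity) {
        items = new int[capacity];
    }

    void add(int value) {
        items[count++] = value;
    }

    // Member class: each Walker is tied to the IntBag instance that
    // created it, and uses that instance's private fields directly.
    class Walker implements Enumeration<Integer> {
        private int pos = 0;

        public boolean hasMoreElements() {
            return pos < count;
        }

        public Integer nextElement() {
            if (pos >= count)
                throw new NoSuchElementException();
            return Integer.valueOf(items[pos++]);
        }
    }

    Enumeration<Integer> elements() {
        return new Walker();   // short for this.new Walker()
    }

    // Adds up everything the enumeration yields.
    static int sum(IntBag bag) {
        int total = 0;
        for (Enumeration<Integer> e = bag.elements(); e.hasMoreElements(); )
            total += e.nextElement().intValue();
        return total;
    }

    public static void main(String[] args) {
        IntBag bag = new IntBag(4);
        bag.add(37);
        bag.add(47);
        bag.add(57);
        System.out.println(sum(bag));   // prints 141
    }
}
```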
Java 1.2 adds a new feature for manipulating object references, the
Reference class and related support classes. This feature is a little
hard to describe, but we could say that "Reference is to object
references as Class is to Java classes". That is, a Reference
represents an object reference.
To see how this feature works, let's look at an example:

    import java.lang.ref.*;

    public class ref1 {
        public static Reference ref = null;
        public static ReferenceQueue rq = new ReferenceQueue();

        public static void f() {
            Object obj = new Object();
            System.out.println(obj);
            ref = new GuardedReference(obj, rq);
        }

        public static void main(String args[]) {
            f();

            try {
                Reference r = rq.remove(500);  // 500ms timeout
                Object obj = (r == null ? null : r.get());
                System.out.println(obj);
            }
            catch (InterruptedException e) {
                System.err.println(e);
            }

            System.gc();

            try {
                Reference r = rq.remove(500);
                Object obj = (r == null ? null : r.get());
                System.out.println(obj);
            }
            catch (InterruptedException e) {
                System.err.println(e);
            }
        }
    }

In this example, main() calls a method f(), and f() creates a local
Object instance. We then create a GuardedReference wrapper for this
local object, and specify a ReferenceQueue as well. The wrapper has the
usual get() and set() methods for obtaining the object (the "referent")
that has been set, and for setting a new one.
But there's another aspect of Reference that goes beyond simple support
for wrapping a reference in a wrapper class. When we created the
GuardedReference object, we also specified a queue. This process is
known as "registering" the Reference object.
In this example, when f() returns, the object created locally is
garbage, that is, has no valid references to it. At some point the
garbage collector will realize this, and will add the Reference object
to the specified queue ("enqueue" it).
Why is this feature useful? One example would be a caching mechanism,
where objects represent in-memory disk files, and it's important to know
when a cached object is no longer in use (so that it can be replaced in
the cache by a higher-priority object).
Queues are used to hold enqueued Reference objects, so that for example a
separate thread can continually check the queue. Queues can also be
polled using the mechanism illustrated above (500 millisecond timeout in
this example).
There are several types of Reference objects. GuardedReference is one,
and WeakReference and PhantomReference two others. These differ in
their various properties, for example in whether the no-longer-reachable
referent can actually be reclaimed by the garbage collector.
This mechanism can be used for various types of caching and for
implementing object cleanup schemes.
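
GuardedReference is not part of java.lang.ref in released JDKs, so the sketch below substitutes WeakReference to show the same register-and-enqueue idea. The class and method names are made up. Note also that System.gc() is only a request, so the second line of output is not guaranteed and may differ between virtual machines.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical sketch; WeakReference is swapped in for GuardedReference.
class WeakRefSketch {
    // While a strong reference is still held, get() returns the referent.
    static boolean visibleWhileHeld() {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<Object>(obj);
        return ref.get() == obj;
    }

    // Registers a reference with the queue. The referent becomes
    // garbage as soon as this method returns.
    static WeakReference<Object> register(ReferenceQueue<Object> rq) {
        return new WeakReference<Object>(new Object(), rq);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("visible while held: " + visibleWhileHeld());

        ReferenceQueue<Object> rq = new ReferenceQueue<Object>();
        WeakReference<Object> ref = register(rq);

        // Ask for a collection and poll the queue, retrying a few times
        // because the collector is free to ignore the request.
        Reference<? extends Object> r = null;
        for (int i = 0; i < 10 && r == null; i++) {
            System.gc();
            r = rq.remove(200);   // 200ms timeout
        }
        System.out.println("enqueued by collector: " + (r == ref));
    }
}
```

With a weak reference, the referent has already been cleared by the time the Reference object shows up on the queue, so get() returns null at that point.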
In a previous issue we saw an example of static initializers for
classes, where one-time initialization of a class can be performed.
Java 1.1 extends this idea to instance initialization, that is, blocks
of code in a class that are executed every time a new class instance is
created. To get an idea of how this works, consider an example such as:

    class A {
        public A() {
            System.out.println("A ctor");
        }
    }

    class B extends A {
        static {
            System.out.println("B static init");
        }

        public B() {
            super();
            System.out.println("B ctor");
        }

        {
            System.out.println("B instance init");
        }
    }

    public class init {
        public static void main(String args[]) {
            B b1 = new B();
            B b2 = new B();
        }
    }

Here we have a superclass A, and a subclass B. We create two instances
of B within main(). The output from running this program is:

    B static init
    A ctor
    B instance init
    B ctor
    A ctor
    B instance init
    B ctor

The static code block in B is executed first, and only one time. Then,
for each instance of B that is created, the superclass constructor is
run, then the block of code representing an instance initializer in B,
and then B's constructor.
Such an instance initializer is similar to a no-argument constructor.
So why would you use an initializer like this instead of a constructor?
One reason is for initializing instances of anonymous classes (see next
section), which do not have names and cannot define their own
constructors.
Another use of instance initializers is to support initialization of
object fields near the definition of those fields, rather than
performing initialization in a constructor which may be some distance
away.
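
As a small sketch of this second use (the class Squares and its members are hypothetical), the table below is filled by an instance initializer placed right next to the field, and the initializer runs no matter which constructor is called.

```java
// Hypothetical class; the initializer block sits beside the field it fills.
class Squares {
    private int[] table = new int[5];
    {
        // instance initializer: runs for every new Squares object,
        // after the superclass constructor and before the constructor body
        for (int i = 0; i < table.length; i++)
            table[i] = i * i;
    }

    private String label;

    Squares() {
        label = "default";
    }

    Squares(String label) {
        this.label = label;
    }

    int get(int i) {
        return table[i];
    }

    String label() {
        return label;
    }

    public static void main(String[] args) {
        System.out.println(new Squares().get(3));        // prints 9
        System.out.println(new Squares("mine").get(4));  // prints 16
    }
}
```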
Speaking of the use of final variables, another new feature in 1.1 is
the ability to initialize a final class field after its declaration. It
used to be that you'd have to say:

    final int BUFSIZ = 1024;

with the initializer in the declaration itself. That restriction has
been relaxed, so that a final variable can be initialized via an
instance initializer or a constructor. As an example, consider this:

    public class blank {
        final int x;

        public blank(int i) {
            x = i;     // OK
            x = 47;    // error here
        }

        public blank()     // error here
        {
        }

        public void f() {
            x = 57;    // error here
        }
    }

The final variable in this example needs to be initialized exactly once
in each of the constructors, and cannot be initialized in a method such
as f().
This feature offers some flexibility when initializing member fields,
for example via constructor arguments.
Java uses exceptions to signal error conditions. Some of these errors
are at user level, for example when a data file cannot be accessed.
Others originate in the Java runtime system, for example when memory is
exhausted or an invalid subscript is applied to an array. There are
four different exception classes defined in java.lang that are important
to understand when using exceptions.
Throwable is the superclass for all exceptions. If a program catches an
exception of type Throwable:

    try {
        ...
    }
    catch (Throwable e) {
        ...
    }

it will catch all exceptions.
Error is a subclass of Throwable, used to group together exceptions that
indicate serious problems which a normal application should not try to
catch. An example of an exception subclass of Error is
VirtualMachineError.
Exception is a subclass of Throwable used to group "normal" kinds of
exceptions, such as IOException thrown when an I/O problem occurs.
RuntimeException is a subclass of Exception, and indicates an exception
that is not required to be mentioned in a "throws" clause of a method.
Exceptions in this category are known as "unchecked" exceptions. For
example, if I have a method:

    void f(String fn) throws IOException {
        FileOutputStream fos = new FileOutputStream(fn);
        ...
    }

I must either catch any IOException myself, or else declare that f()
propagates this exception to its caller. On the other hand, there is no
requirement that f() declare that it might throw NullPointerException
(which could happen if fn is null). NullPointerException is a subclass
of RuntimeException and thus is unchecked.
These classes have constructors that allow the specification of an error
message, as in:

    throw new Error("out of memory");

There is also a provision for dumping out a stack traceback:

    try {
        ...
    }
    catch (Error e) {
        e.printStackTrace();
    }

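
The difference between checked and unchecked exceptions can be seen in a few lines. This is only a sketch: the class Checked, its methods, and the file name are invented, and the file name is assumed not to exist on the machine running the code.

```java
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical example of checked versus unchecked exceptions.
class Checked {
    // IOException is checked, so it must appear in the throws clause.
    // NullPointerException is unchecked and is not mentioned.
    static int firstByte(String fn) throws IOException {
        FileInputStream fis = new FileInputStream(fn);
        try {
            return fis.read();
        } finally {
            fis.close();
        }
    }

    // Reports which kind of exception, if any, a file name provokes.
    static String probe(String fn) {
        try {
            firstByte(fn);
            return "ok";
        } catch (IOException e) {
            return "checked";
        } catch (RuntimeException e) {
            return "unchecked";
        }
    }

    public static void main(String[] args) {
        // assumed not to exist
        System.out.println(probe("no-such-dir/no-such-file.txt"));  // checked
        System.out.println(probe(null));                            // unchecked
    }
}
```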
One of the interesting issues that comes up with Java programming is how
to combine Java classes with code written in other languages such as C
or C++. It's common to have a way to do mixed-language programming on a
given platform, and so it's worth asking how this can be done in Java.
This is a complex topic, with some system-specific aspects to it. But
we will attempt to illustrate some of the basics with a simple example.
Suppose that you have a Java class:

    public class test {
        public static native short f(short a, short b);

        public static void main(String args[]) {
            short num1 = 37;
            short num2 = 47;

            // load the DLL by library name; System.load() would
            // need a complete path name instead
            System.loadLibrary("test");

            short s = f(num1, num2);
            System.out.println(s);
        }
    }

and you'd like to call a method defined in another language ("native"
method). In the above example f() represents such a method.
The first thing to do is declare the method, using the native modifier.
A native method has no body, because the body will be provided by some
other module written in some other language. Declaring the method in
Java allows for type checking and so on to be performed.
We then compile this class:

    javac test.java

The JDK tool "javah" is next run over the class file:

    javah -jni test

The output of this is a file test.h:

    /* DO NOT EDIT THIS FILE - it is machine generated */
    #include <jni.h>
    /* Header for class test */

    #ifndef _Included_test
    #define _Included_test
    #ifdef __cplusplus
    extern "C" {
    #endif
    /*
     * Class:     test
     * Method:    f
     * Signature: (SS)S
     */
    JNIEXPORT jshort JNICALL Java_test_f
      (JNIEnv *, jclass, jshort, jshort);

    #ifdef __cplusplus
    }
    #endif
    #endif

This header describes the prototype for a C function to be
implemented. We then create test.c to implement the function:

    #include <jni.h>
    #include "test.h"

    /* env and cls are not used here, but a C function
       definition needs names for its parameters */
    JNIEXPORT jshort JNICALL Java_test_f(JNIEnv *env, jclass cls,
        jshort a, jshort b)
    {
        return (jshort)(a * b);
    }

and make a shared library (DLL) out of it by saying (Borland C++ 5.2):

    bcc32 -tWD -DWIN32 -Ij:/java/include -Ij:/java/include/win32 test.c

picking up JNI headers found in j:/java/include and j:/java/include/win32.
Finally, we execute the Java program:

    java test

The program loads the DLL and then calls the f() method within it.
This approach involves a certain amount of magic. The details of
creating DLLs, picking up JNI header files, and actually loading shared
libraries into a running Java program will vary from system to system.
There are also big issues with parameter passing, return values,
accessing object instances, and so on. The above example gives the
flavor of how JNI works. The basic idea is to declare native methods,
use javah to create a header that declares function prototypes for them,
implement the prototypes, create a shared library, and then load the
library into a Java program.

    public class test1 {
        public static void main(String args[]) {
            long a = 123456;
            short b = (short)a;
        }
    }

and Java allows the long to be converted to a short only via an explicit
cast.
The cast rule is suspended in some cases where the value to be assigned
is known to the compiler:

    public class test2 {
        public static void main(String args[]) {
            short a = 12345;
        }
    }

In this example, "12345" is a constant expression (see 15.27 in the Java
Language Specification), and the compiler knows that this value is
representable in a short (which supports values from -32768 to 32767).
But implicit narrowing is not done on method invocation, so that:

    public class test3 {
        public static void f(short s) {}

        public static void main(String args[]) {
            f(0);  // error because "0" is an int, not short
        }
    }

is invalid without a cast. This restriction is intended to simplify the
overloaded method matching process. In C++, with many more conversion
possibilities, argument matching of overloaded functions is very
complex.
Narrowing also applies to reference types. For example, if A is a
superclass of B, then converting an A reference to a B reference is a
narrowing conversion. In this case, the converted reference value is
checked as to whether it is actually a legitimate value of type B.
Narrowing is often quite useful. Java provides mechanisms for narrowing
values, with the desirable requirement that such narrowing be explicitly
identified via casts.
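
Both kinds of narrowing can be seen in one short sketch. The class names here (NarrowA, NarrowB, Narrowing) are made up for the example. The primitive cast keeps only the low 16 bits of the value, while the reference cast is checked at run time and fails with a ClassCastException if the object is not really of the target type.

```java
// Hypothetical superclass / subclass pair.
class NarrowA {}
class NarrowB extends NarrowA {}

class Narrowing {
    // Primitive narrowing: only the low 16 bits survive the cast.
    static short toShort(long value) {
        return (short)value;
    }

    // Reference narrowing: the cast is verified when it executes.
    static boolean canNarrow(NarrowA ref) {
        try {
            NarrowB b = (NarrowB)ref;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(toShort(123456L));          // prints -7616
        System.out.println(canNarrow(new NarrowB()));  // prints true
        System.out.println(canNarrow(new NarrowA()));  // prints false
    }
}
```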
Float and Double are wrapper classes defined in java.lang. They can be
used to wrap individual values of float and double types, and insert
such values into object collections such as those represented by Vector.
These classes are also used to convert floating-point values to and from
strings.
But these classes have another purpose, which is to represent properties
of floating-point types. For example, they define the constant
MAX_VALUE that specifies the maximum float or double value, and
POSITIVE_INFINITY to represent an infinite value.
One of the most interesting features of these classes is the set of methods
that convert to and from bit representations of floating-point values. For
example, this code:

    public class floatbits {
        public static void main(String args[]) {
            int bits = Float.floatToIntBits(12.34f);
            int sign = bits >>> 31;
            int exp = (bits >>> 23) & 0xff;
            int mant = bits & 0x7fffff;
            System.out.println("sign = " + sign);
            System.out.println("exponent = " + exp);
            System.out.println("mantissa = " + mant);
        }
    }

picks apart a 32-bit floating-point value, and displays its sign,
exponent, and mantissa. Representation of floating-point values uses
the IEEE 754 format (see 4.2.3 in the Java Language Specification). In
this particular case, the sign is in bit 31, the exponent in bits 30-23,
and the mantissa in bits 22-0.
For a 32-bit floating-point value, the bit representation of the value
also doubles as the hash code, used for example when inserting a Float
object into a Hashtable.
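
A value whose bit pattern is easy to work out by hand makes these points simple to check. In the sketch below (class and method names are invented), -2.5 is -1.25 times 2 to the power 1, so the sign bit is 1, the biased exponent is 127 + 1 = 128, and the fraction 0.25 sets bit 21 of the mantissa.

```java
// Hypothetical helper class for picking a float apart.
class FloatParts {
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    static int exponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xff;
    }

    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7fffff;
    }

    public static void main(String[] args) {
        float f = -2.5f;
        int bits = Float.floatToIntBits(f);

        System.out.println("sign = " + sign(f));          // sign = 1
        System.out.println("exponent = " + exponent(f));  // exponent = 128
        System.out.println("mantissa = " + mantissa(f));  // mantissa = 2097152

        // the conversion works in the other direction as well
        System.out.println(Float.intBitsToFloat(bits) == f);        // true

        // the hash code of the wrapper object is the same bit pattern
        System.out.println(Float.valueOf(f).hashCode() == bits);    // true
    }
}
```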
Here is a very simple one. The core Java libraries contain several
interfaces of this type:

    public interface Cloneable {}

alone in a source file. Why would you want to do this, given that the
interface doesn't define anything?
This technique can be used to "mark" a class, to specify that it has a
given property, and sometimes goes by the name of "marker interface".
In the case of Cloneable, Object.clone(), a native method for making a
copy of an object, requires that the object implement the Cloneable
interface, with a CloneNotSupportedException thrown if not.
So if you'd like a class that you develop to be cloneable, you need to
say:

    public class MyClass implements Cloneable { ... }

and this property can then be tested using the "instanceof" operator.
Similar considerations apply for Serializable, that is, a class that
supports conversion to/from a byte stream needs to implement
Serializable.
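
To see the marker at work, here is a sketch with invented class names. Plain and Marked share exactly the same code for copying; the only difference is that Marked carries the Cloneable marker, and that alone decides whether Object.clone() succeeds.

```java
// Hypothetical class without the marker.
class Plain {
    int value = 37;

    // Object.clone() is protected, so it is exposed through a method.
    Object copy() throws CloneNotSupportedException {
        return super.clone();
    }
}

// Same behavior, plus the marker interface.
class Marked extends Plain implements Cloneable {}

class MarkerDemo {
    static String tryCopy(Plain p) {
        try {
            Plain c = (Plain)p.copy();
            return "cloned, value = " + c.value;
        } catch (CloneNotSupportedException e) {
            return "not cloneable";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Marked() instanceof Cloneable);  // true
        System.out.println(new Plain() instanceof Cloneable);   // false
        System.out.println(tryCopy(new Marked()));  // cloned, value = 37
        System.out.println(tryCopy(new Plain()));   // not cloneable
    }
}
```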
We are going to spend a few columns looking at some of the new 1.1
language features. The biggest of these is inner classes, which
warrants its own set of columns.
One new feature is anonymous arrays. In Java 1.0, you could say
things like:

    int x[] = {1, 2, 3};

while saying:

    int x[];
    x = {1, 2, 3};

was illegal, because the {} was supported only in initializer
expressions following a declaration.
This usage is not legal in 1.1 either, but instead you can say:

    int x[];
    x = new int[] {1, 2, 3};

That is, instead of using "new int[3]", leave out the size and follow
the [] with a list of the actual initializer values.
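
Anonymous arrays are handy when an array is needed only as an argument, with no variable to hold it. A minimal sketch, with made-up names:

```java
// Hypothetical example of passing an anonymous array to a method.
class AnonArray {
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++)
            total += values[i];
        return total;
    }

    public static void main(String[] args) {
        // the array is created and initialized right in the call
        System.out.println(sum(new int[] {1, 2, 3}));   // prints 6

        // an existing variable can also be given a fresh array
        int x[] = {1, 2, 3};
        x = new int[] {10, 20};
        System.out.println(sum(x));                     // prints 30
    }
}
```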
Imagine that you have a complex data structure in memory, one with
many internal links and fields in it. At the end of the execution of
an application, you'd like to somehow save this data structure
permanently and then restore it for the next execution of the
application.
Java serialization, new in version 1.1, is a way of doing this. It's a
mechanism for turning a data structure into a stream of bytes, which
can be written to a file, and then read back in to another data
structure.
Let's look at a simple example of this:

    // file write.java
    import java.io.*;

    public class write {
        private static final int N = 25;

        public static void main(String args[]) {
            int x[][] = new int[N][2];

            for (int i = 0; i < N; i++) {
                x[i][0] = i;
                x[i][1] = i * i;
            }

            try {
                FileOutputStream fos = new FileOutputStream("xxx");
                ObjectOutputStream oos = new ObjectOutputStream(fos);
                oos.writeObject(x);
                oos.flush();
                fos.close();
            }
            catch (Throwable e) {
                System.err.println("exception thrown");
            }
        }
    }

    // file read.java
    import java.io.*;

    public class read {
        public static void main(String args[]) {
            int x[][] = null;

            try {
                FileInputStream fis = new FileInputStream("xxx");
                ObjectInputStream ois = new ObjectInputStream(fis);
                x = (int[][])ois.readObject();
                fis.close();
            }
            catch (Throwable e) {
                System.err.println("exception thrown");
            }

            for (int i = 0; i < x.length; i++)
                System.out.println(x[i][0] + " " + x[i][1]);
        }
    }

In this example, we build a table of squares in a two-dimensional
array, then write the array to a file using writeObject(). We then
read the array back into a separate program using readObject(), casting
the Object reference to the appropriate type.
This operation may not seem like much, but it's hard to do in many
other languages, where you have to devise ad hoc methods for saving
and restoring each data structure.
For a class to be serializable, it must implement the Serializable
interface:
    public class xxx implements java.io.Serializable {
        // stuff
    }
This interface is empty, and simply serves as a flag to allow you to
specify which classes are serializable. In the example above, the
object we serialized was an array, treated as a class type by Java.
Classes that require special handling during serialization can
implement their own private writeObject() and readObject() methods,
which the object streams call in place of the default mechanism.
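As a rough sketch of such special handling, consider a class with a field that is not worth saving because it can be recomputed. The class Name and its members are hypothetical, and the round trip goes through a byte array instead of a file to keep the example self-contained:

```java
import java.io.*;

// Hypothetical class, for illustration only.
class Name implements Serializable {
    private String text;
    private transient int cachedLength;    // not written by default

    Name(String text) {
        this.text = text;
        this.cachedLength = text.length();
    }

    String getText() { return text; }
    int getCachedLength() { return cachedLength; }

    // Called by ObjectOutputStream when a Name is written.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();          // writes the non-transient fields
    }

    // Called by ObjectInputStream when a Name is read back.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // reads the non-transient fields
        cachedLength = text.length();      // rebuild the transient field
    }

    // Serializes n to a byte array and reads a copy back.
    static Name roundTrip(Name n) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(n);
            oos.close();
            ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
            Name copy = (Name)ois.readObject();
            ois.close();
            return copy;
        } catch (Exception e) {
            throw new RuntimeException(e.toString());
        }
    }

    public static void main(String args[]) {
        Name copy = roundTrip(new Name("squares"));
        System.out.println(copy.getText() + " " + copy.getCachedLength());
    }
}
```

Without the readObject() method, the transient field would come back as 0 in the copy, since it is never written to the stream.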
There are several interesting quirks with serialization which we may
discuss at some point.
One of the interesting 1.1 features is something known as reflection,
where it is possible to query a class at run time to determine its
properties. For example, with this code:
    import java.lang.reflect.*;

    public class Dump {
        public static void main(String args[]) {
            try {
                String s = "java.lang." + args[0];
                Class c = Class.forName(s);
                Method m[] = c.getMethods();
                for (int i = 0; i < m.length; i++)
                    System.out.println(m[i].toString());
            } catch (Throwable e) {
                // report the problem rather than failing silently
                System.err.println(e);
            }
        }
    }
one can query a class for the names and properties of its public
methods, using the new package "java.lang.reflect". There is also
support for accessing private and package level methods.
Additionally, you can dynamically invoke methods on a given object.
Running this program by saying:
    $ java Dump Object
results in:
    public final native java.lang.Class java.lang.Object.getClass()
    public native int java.lang.Object.hashCode()
    public boolean java.lang.Object.equals(java.lang.Object)
    public java.lang.String java.lang.Object.toString()
    public final native void java.lang.Object.notify()
    public final native void java.lang.Object.notifyAll()
    public final native void java.lang.Object.wait(long)
    public final void java.lang.Object.wait(long,int)
    public final void java.lang.Object.wait()
This type of feature simply doesn't exist in a language like C or
C++. Certain programming environments or debuggers may offer an
equivalent, but not as part of the language or its core libraries.
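The dynamic invocation of methods mentioned earlier might look something like the following sketch. The class Invoke and its call() helper are made up for illustration; getMethod() and invoke() are the actual reflection calls, and only public methods with no arguments are handled here:

```java
import java.lang.reflect.Method;

// Hypothetical example class; only getMethod() and invoke() are library calls.
class Invoke {
    // Looks up a public no-argument method by name and calls it on target.
    static Object call(Object target, String name) {
        try {
            Method m = target.getClass().getMethod(name);
            return m.invoke(target);
        } catch (Exception e) {
            throw new RuntimeException(e.toString());
        }
    }

    public static void main(String args[]) {
        System.out.println(call("hello", "length"));       // prints 5
        System.out.println(call("hello", "toUpperCase"));  // prints HELLO
    }
}
```

The method name is an ordinary string here, so it could just as well come from the command line or a configuration file.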
Java has no "goto" statement, though this identifier is reserved in
the language. There are several ways in which goto is used in C and
C++, and it's interesting to consider the Java alternatives to such
usage. In this section we will discuss one alternative, and in the
next section another.
One way that goto is used is to jump to the end of a function body,
where cleanup can be done. For example, suppose that we are
manipulating calendar dates, and have a function where we want a year
in the range 1900-1999 and evenly divisible by 4. In C, we might have:
    #include <stdio.h>

    void f(int d)
    {
        if (d < 1900)
            goto err;
        if (d > 1999)
            goto err;
        if (d % 4)
            goto err;

        /* do stuff with date ... */

        return;

    err:
        fprintf(stderr, "invalid date %d\n", d);
    }
In Java, we can achieve a similar end without goto, by using a form of
the try-catch-finally statement used in exception handling:
    public class test {
        public static void f(int d) {
            boolean err = true;

            try {
                if (d < 1900)
                    return;
                if (d > 1999)
                    return;
                if (d % 4 != 0)
                    return;
                err = false;

                // do stuff with date ...
            } finally {
                if (err)
                    System.err.println("invalid date " + d);
            }
        }

        public static void main(String args[]) {
            f(1852);
            f(1976);
            f(1989);
        }
    }
The code within the try block is "tried", that is, executed. After
the execution of this code, the finally block is executed -- no matter
what happens in the try block. In the example above, we exit the
method (return) for various error conditions, but when the method is
exited, the finally block is executed. In this way, we can execute a
series of Java statements, and guarantee that no matter what happens
in those statements, some other processing will follow.
Whether this programming style is "good" is a matter of opinion.
Saying "return" with the idea that some cleanup will be done by a
finally block could be viewed as a little sneaky or confusing, or
alternatively this might turn out to be a common Java idiom in a few
months or years. At the least, you might put in a comment like:
    if (condition)
        return;    // proceed to cleanup phase of method
If an exception is thrown in a try block, and there is a local catch
block to handle it, the catch block is executed, and then the finally
block. If there is not a local catch block, the finally block is
executed, and then the exception is propagated to the nearest catch
block that can handle the exception (that is, the stack is unwound, as
in other exception processing).
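The two orderings just described can be observed with a small sketch like this one, which records the order in which the blocks run. The class and method names are made up for illustration:

```java
// Hypothetical example class that traces the order of execution.
class Order {
    // Exception thrown and caught locally: try, then catch, then finally.
    static String withLocalCatch() {
        StringBuffer trace = new StringBuffer();
        try {
            trace.append("try ");
            throw new RuntimeException("oops");
        } catch (RuntimeException e) {
            trace.append("catch ");
        } finally {
            trace.append("finally");
        }
        return trace.toString();
    }

    // No local catch: finally runs before the exception leaves the method.
    static void inner(StringBuffer trace) {
        try {
            trace.append("try ");
            throw new RuntimeException("oops");
        } finally {
            trace.append("finally ");
        }
    }

    // The caller's catch block sees the exception after inner's finally.
    static String withoutLocalCatch() {
        StringBuffer trace = new StringBuffer();
        try {
            inner(trace);
        } catch (RuntimeException e) {
            trace.append("caller-catch");
        }
        return trace.toString();
    }

    public static void main(String args[]) {
        System.out.println(withLocalCatch());     // prints try catch finally
        System.out.println(withoutLocalCatch());  // prints try finally caller-catch
    }
}
```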
We will say more about Java exception handling at some future point.
In the last issue we talked about CLASSPATH and Java packages. In
this and subsequent issues, we'll be discussing visibility specifiers
for instance (per object) and class (shared across all object
instances) variables and methods.
The first two of these specifiers are "public" and "private".
Specifying public means that the variable or method is accessible
everywhere, and is inherited by any subclasses that extend from the
class. "Everywhere" means subclasses of the class in question and
other classes in the same or other packages. For example:
    // file A.java

    public class A {
        public void f() {}
        public static int x = 37;
    }

    // file B.java

    public class B {
        public static void main(String args[]) {
            A a = new A();
            a.f();        // calls public method
            A.x = -19;    // sets static variable in A
        }
    }
By contrast, "private" means that no other class anywhere can access a
method or variable. For example:
    // file A.java

    public class A {
        private void f() {}
        private static int x = 37;
    }

    // file B.java

    public class B {
        public static void main(String args[]) {
            A a = new A();
            a.f();        // illegal
            A.x = -19;    // illegal
        }
    }
Private instance variables are not inherited. This means something
slightly different in Java than in C++. In both languages private
data members are in fact part of any derived class, but in Java the
term "not inherited" in reference to private variables does double
duty to mean "not accessible".
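A small sketch may make the distinction concrete. The private field below is physically part of every Derived object, but Derived can reach it only through a public method of Base. All of the names are hypothetical:

```java
// Hypothetical classes, for illustration only.
class PrivateDemo {
    public static void main(String args[]) {
        System.out.println(new Derived().twice());  // prints 84
    }
}

class Base {
    private int secret = 42;

    public int getSecret() { return secret; }
}

class Derived extends Base {
    int twice() {
        // return secret * 2;     // would not compile: secret is private in Base
        return getSecret() * 2;   // the field is still stored in this object
    }
}
```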
One crude way of figuring out just how much space an object instance
requires is to use a technique like this one, where the amount of free
memory is saved, many object instances are allocated, and then a
calculation is done to determine the number of bytes per object
instance:
    class x1 {
        private double d1;
        private double d2;
        private double d3;
        private double d4;
        private double d5;
    }

    public class x2 extends x1 {
        public static void main(String args[]) {
            int N = 10000;
            x2 vec[] = new x2[N];

            long start_mem = Runtime.getRuntime().freeMemory();

            for (int i = 0; i < N; i++)
                vec[i] = new x2();

            long curr_mem = Runtime.getRuntime().freeMemory();

            long m = (start_mem - curr_mem) / N;

            System.out.println("memory used per object = " + m);
        }
    }
This technique is not without its pitfalls (notably issues related to
garbage collection), but sometimes can provide useful information
about object sizes.
In future issues we will be talking about other kinds of visibility,
such as the default visibility level and "protected".
Java has no global variables or functions, unlike some other common
languages. Every variable and function must be part of some class.
Within a class, a variable or function ("method") may be a regular
member of that class, or may be a class variable or class method.
Class methods and variables do not operate on particular object
instances of a class. A class method is typically used as a utility
function within the class, while a class variable is shared by all the
instances of the class. For example, if you're writing a Date class
to represent calendar dates, a class method might be used for the
method that determines whether a given year is a leap year.
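A minimal sketch of such a class follows. The name CalDate and its members are made up for illustration, and the leap year test uses the usual Gregorian rule:

```java
// Hypothetical date class showing class members next to instance members.
class CalDate {
    private static int count = 0;    // class variable: one copy, shared
    private int year;                // instance variable: one per object

    CalDate(int year) {
        this.year = year;
        count++;
    }

    // Class method: needs no CalDate object to be called.
    static boolean isLeapYear(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }

    // Instance method: operates on this object's year.
    boolean inLeapYear() {
        return isLeapYear(year);
    }

    // Class method reporting how many objects have been created.
    static int created() {
        return count;
    }

    public static void main(String args[]) {
        System.out.println(CalDate.isLeapYear(1996));          // prints true
        System.out.println(CalDate.isLeapYear(1900));          // prints false
        System.out.println(new CalDate(1997).inLeapYear());    // prints false
    }
}
```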
Using class methods and variables, it is possible to synthesize
variables and methods somewhat similar to globals. For example, you
can say:
    // file "Global.java"

    public final class Global {
        public static int x = 37;
        public static int f() {
            return 47;
        }
    }

    // file "globtest.java"

    public class globtest {
        public static void main(String args[]) {
            int i = Global.x + Global.f();
            System.out.println("i = " + i);
            Global.x = 0;
            System.out.println("x = " + Global.x);
        }
    }
"static" is the keyword used to denote that methods or variables are
class ones. The Global class is declared as final, meaning that it
cannot be extended (derived from). The class variable and method
names in Global are denoted by prepending "Global." to them.
If we wanted to get a bit fancier, we could add a line to Global:
    private Global() {}
This declares a private constructor for Global, meaning that anyone
who tries to create an object instance of Global will get an error
message at compile time. The error will occur because they are trying
to create an object whose constructor is inaccessible. This is
reasonable behavior since we don't care about creating object
instances of Global; it's just a wrapper for some variables and
methods.
Another change that could be made would be to have all access to class
variables in Global done through methods:
    private static int x = 47;
    public static void setx(int i) {x = i;}
    public static int getx() {return x;}
This makes it easy to trap all changes to the global variable.
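As one illustration of trapping changes, the set method can count how many times the variable has been written. This is only a sketch; the class name TrackedGlobal and the writes() method are hypothetical additions:

```java
// Hypothetical variant of the wrapper class that counts writes.
final class TrackedGlobal {
    private static int x = 47;
    private static int writes = 0;    // number of calls to setx()

    private TrackedGlobal() {}        // no object instances allowed

    public static void setx(int i) {
        writes++;                     // every change passes through here
        x = i;
    }

    public static int getx() {
        return x;
    }

    public static int writes() {
        return writes;
    }

    public static void main(String args[]) {
        TrackedGlobal.setx(0);
        TrackedGlobal.setx(5);
        System.out.println("x = " + TrackedGlobal.getx());
        System.out.println("writes = " + TrackedGlobal.writes());
    }
}
```

A debugging print statement or a range check could be placed in setx() in the same way.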
A technique similar to that shown in this section is possible in C++,
but is not required. C has no similar mechanism at all, though you
can enforce your own rules via naming conventions and access functions.
It is possible to get into a long argument about the desirability of
global variables. They are best avoided, except to track
application-wide information, such as the current program state or
resources that are to be shared between threads. The Java I/O system
uses the technique illustrated here to set up System.out for use in
code like:
    System.out.println("xxx");
System is a final class with a private constructor, and out is a class
variable defined within that class.
In issue #004 we talked about public and private fields. When public
is applied to a method or variable:
    public int x = 37;
    public void f() {}
it means that the method or variable is visible everywhere, while a
private method or variable:
    private int x = 37;
    private void f() {}
is visible only within the class where it is defined.
Two other levels of visibility are protected:
    protected int x = 37;
and the default when no keyword is specified:
    void f() {}
These are identical except in one case. For both of these levels, the
method or variable is visible in the class where it's defined and to
subclasses and non-subclasses from the same package. For example:
    // file pack1/A.java

    package pack1;

    public class A {
        protected int x = 0;
        int f() {return x;}
        public static void main(String args[]) {}
    }

    class B extends A {
        void g() {
            A p = new A();
            int i = p.x + p.f();
        }
    }

    class C {
        void g() {
            A p = new A();
            int i = p.x + p.f();
        }
    }
while neither is accessible through a reference to the defining class
from another package, even from within a subclass:
    // file pack1/AA.java

    package pack1;

    public class AA {
        protected int x = 0;
        int f() {return x;}
        public static void main(String args[]) {}
    }

    // file pack2/BB.java

    package pack2;

    import pack1.AA;

    class BB extends AA {
        void g() {
            AA p = new AA();
            int i = p.x + p.f();    // error here
        }
    }

    class CC {
        void g() {
            AA p = new AA();
            int i = p.x + p.f();    // error here
        }
    }
Where protected and the default differ is in whether they are
inherited by a subclass in a different package. For example:
    // file pack1/D.java

    package pack1;

    public class D {
        int x = 37;
        protected int y = 47;
        public static void main(String args[]) {}
    }

    // file pack2/E.java

    package pack2;

    import pack1.D;

    class E extends D {
        void f() {
            int i = x;    // error here
            int j = y;    // OK here
        }
    }
There are a couple more issues with packaging that we will explore in
future issues.