
2.9. Reference Types

Now that we've covered arrays and introduced classes and objects, we can turn to a more general description of reference types. Classes and arrays are two of Java's five kinds of reference types. Classes were introduced earlier and are covered in complete detail, along with interfaces, in Chapter 3. Enumerated types and annotation types are reference types introduced in Java 5.0 (see Chapter 4).

This section does not cover specific syntax for any particular reference type, but instead explains the general behavior of reference types and illustrates how they differ from Java's primitive types. In this section, the term object refers to a value or instance of any reference type, including arrays.

2.9.1. Reference vs. Primitive Types

Reference types and objects differ substantially from primitive types and their primitive values:

  • Eight primitive types are defined by the Java language. Reference types are user-defined, so there is an unlimited number of them. For example, a program might define a class named Point and use objects of this newly defined type to store and manipulate X,Y points in a Cartesian coordinate system. The same program might use an array of characters (of type char[ ]) to store text and might use an array of Point objects (of type Point[ ]) to store a sequence of points.

  • Primitive types represent single values. Reference types are aggregate types that hold zero or more primitive values or objects. Our hypothetical Point class, for example, might hold two double values to represent the X and Y coordinates of the points. The char[ ] and Point[ ] array types are obviously aggregate types because they hold a sequence of primitive char values or Point objects.

  • Primitive types require between one and eight bytes of memory. When a primitive value is stored in a variable or passed to a method, the computer makes a copy of the bytes that hold the value. Objects, on the other hand, may require substantially more memory. Memory to store an object is dynamically allocated on the heap when the object is created and this memory is automatically "garbage-collected" when the object is no longer needed. When an object is assigned to a variable or passed to a method, the memory that represents the object is not copied. Instead, only a reference to that memory is stored in the variable or passed to the method.

This last difference between primitive and reference types explains why reference types are so named. The sections that follow are devoted to exploring the substantial differences between types that are manipulated by value and types that are manipulated by reference.

Before moving on, however, it is worth briefly considering the nature of references. A reference is simply an opaque value that identifies an object. The representation of a reference is an implementation detail of the Java interpreter. If you are a C programmer, however, you can safely imagine a reference as a pointer or a memory address. Remember, though, that Java programs cannot manipulate references in any way. Unlike pointers in C and C++, references cannot be converted to or from integers, and they cannot be incremented or decremented. C and C++ programmers should also note that Java does not support the & address-of operator or the * and -> dereference operators. In Java, primitive types are always handled exclusively by value, and objects are always handled exclusively by reference: the . operator in Java is more like the -> operator in C and C++ than it is like the . operator of those languages.

2.9.2. Copying Objects

The following code manipulates a primitive int value:

int x = 42;
int y = x;

After these lines execute, the variable y contains a copy of the value held in the variable x. Inside the Java VM, there are two independent copies of the 32-bit integer 42.

Now think about what happens if we run the same basic code but use a reference type instead of a primitive type:

Point p = new Point(1.0, 2.0);
Point q = p;

After this code runs, the variable q holds a copy of the reference held in the variable p. There is still only one copy of the Point object in the VM, but there are now two copies of the reference to that object. This has some important implications. Suppose the two previous lines of code are followed by this code:

System.out.println(p.x);  // Print out the X coordinate of p: 1.0
q.x = 13.0;               // Now change the X coordinate of q
System.out.println(p.x);  // Print out p.x again; this time it is 13.0

Since the variables p and q hold references to the same object, either variable can be used to make changes to the object, and those changes are visible through the other variable as well.

This behavior is not specific to instances of classes; the same thing happens with arrays, as illustrated by the following code:

char[] greet = { 'h','e','l','l','o' };  // greet holds an array reference
char[] cuss = greet;                     // cuss holds the same reference
cuss[4] = '!';                           // Use reference to change an element
System.out.println(greet);               // Prints "hell!"

A similar difference in behavior between primitive types and reference types occurs when arguments are passed to methods. Consider the following method:

void changePrimitive(int x) {
    while(x > 0)
        System.out.println(x--);
}

When this method is invoked, the parameter x receives a copy of the argument passed by the caller. The code in the method uses x as a loop counter and decrements it to zero. Since x is a primitive type, the method has its own private copy of this value, so this is a perfectly reasonable thing to do.

On the other hand, consider what happens if we modify the method so that the parameter is a reference type:

void changeReference(Point p) {
    while(p.x > 0)
        System.out.println(p.x--);
}

When this method is invoked, it is passed a private copy of a reference to a Point object and can use this reference to change the Point object. Consider the following:

Point q = new Point(3.0, 4.5);  // A point with an X coordinate of 3
changeReference(q);             // Prints 3.0, 2.0, 1.0 and modifies the Point
System.out.println(q.x);        // The X coordinate of q is now 0!

When the changeReference( ) method is invoked, it is passed a copy of the reference held in variable q. Now both the variable q and the method parameter p hold references to the same object. The method can use its reference to change the contents of the object. Note, however, that it cannot change the contents of the variable q. In other words, the method can change the Point object beyond recognition, but it cannot change the fact that the variable q refers to that object.
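
The distinction between changing the object and changing the caller's variable can be checked with a small sketch. The class and method names here (ReassignDemo, replace, moveToOrigin) are hypothetical, and the nested Point is only a stand-in for the chapter's Point class:

```java
class ReassignDemo {
    // Stand-in for the chapter's Point class (an assumption for this sketch).
    static class Point {
        double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Reassigns only the method's private copy of the reference.
    static void replace(Point p) {
        p = new Point(100.0, 200.0);
    }

    // Uses the copied reference to modify the object the caller also refers to.
    static void moveToOrigin(Point p) {
        p.x = 0.0;
        p.y = 0.0;
    }

    public static void main(String[] args) {
        Point q = new Point(3.0, 4.5);
        Point original = q;

        replace(q);
        System.out.println(q == original);  // true: q still refers to the same object
        System.out.println(q.x);            // 3.0: the object was not touched

        moveToOrigin(q);
        System.out.println(q.x);            // 0.0: the shared object was modified
    }
}
```

Assigning a new object to the parameter has no effect outside the method, whereas assigning to a field of the object it refers to does.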

The title of this section is "Copying Objects," but, so far, we've only seen copies of references to objects, not copies of the objects and arrays themselves. To make an actual copy of an object, you must use the special clone( ) method (inherited by all objects from java.lang.Object):

Point p = new Point(1,2);    // p refers to one object
Point q = (Point) p.clone(); // q refers to a copy of that object
q.y = 42;                    // Modify the copied object, but not the original

int[] data = {1,2,3,4,5};           // An array
int[] copy = (int[]) data.clone();  // A copy of the array

Note that a cast is necessary to coerce the return value of the clone( ) method to the correct type. There are a couple of points you should be aware of when using clone( ). First, not all objects can be cloned. Java only allows an object to be cloned if the object's class has explicitly declared itself to be cloneable by implementing the Cloneable interface. (We haven't discussed interfaces or how they are implemented yet; that is covered in Chapter 3.) The definition of Point that we showed earlier does not actually implement this interface, so our Point type, as implemented, is not cloneable. Note, however, that arrays are always cloneable. If you call the clone( ) method for a noncloneable object, it throws a CloneNotSupportedException. When you use the clone( ) method, you may want to use it within a try block to catch this exception.
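
As a rough illustration, a cloneable variant of Point might look like the sketch below. The wrapper class CloneDemo is hypothetical, and this Point is an assumption rather than the definition shown earlier in the chapter. Because Object.clone( ) is declared protected, the class overrides it as a public method so that other code can call it. The override also declares Point as its return type, so callers of this particular class do not need the cast:

```java
class CloneDemo {
    // A hypothetical cloneable variant of the chapter's Point class.
    static class Point implements Cloneable {
        double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }

        // Object.clone() is protected, so the override widens it to public.
        @Override
        public Point clone() {
            try {
                return (Point) super.clone();  // field-by-field copy made by Object.clone()
            } catch (CloneNotSupportedException e) {
                // Not expected here, because this class implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.clone();         // q refers to a separate copy
        q.y = 42;
        System.out.println(p.y);     // 2.0
        System.out.println(q.y);     // 42.0
        System.out.println(p == q);  // false
    }
}
```

Catching the exception inside clone( ) is one common design choice; it spares every caller from writing its own try block.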

The second thing you need to understand about clone( ) is that, by default, it creates a shallow copy of an object. The copied object contains copies of all the primitive values and references in the original object. In other words, any references in the object are copied, not cloned; clone( ) does not recursively make copies of the objects referred to by those references. A class may need to override this shallow copy behavior by defining its own version of the clone( ) method that explicitly performs a deeper copy where needed. To understand the shallow copy behavior of clone( ), consider cloning a two-dimensional array of arrays:

int[][] data = {{1,2,3}, {4,5}};        // An array of 2 references
int[][] copy = (int[][]) data.clone();  // Copy the 2 refs to a new array
copy[0][0] = 99;                         // This changes data[0][0] too!
copy[1] = new int[] {7,8,9};            // This does not change data[1]

If you want to make a deep copy of this multidimensional array, you have to copy each dimension explicitly:

int[][] data = {{1,2,3}, {4,5}};        // An array of 2 references
int[][] copy = new int[data.length][];  // A new array to hold copied arrays
for (int i = 0; i < data.length; i++)
    copy[i] = (int[]) data[i].clone();
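
The same idea applies to a class that overrides clone( ) to perform a deeper copy. The following is a minimal sketch under stated assumptions: Polyline and DeepCloneDemo are hypothetical names, and the class holds a single array field so that the difference between the shallow and deep steps is easy to see:

```java
class DeepCloneDemo {
    // Hypothetical class whose only field is a reference to a mutable array.
    static class Polyline implements Cloneable {
        double[] coords;
        Polyline(double[] coords) { this.coords = coords; }

        @Override
        public Polyline clone() {
            try {
                Polyline copy = (Polyline) super.clone();  // shallow: copy.coords is the same array
                copy.coords = (double[]) coords.clone();   // deepen: give the copy its own array
                return copy;
            } catch (CloneNotSupportedException e) {
                // Not expected here, because this class implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    public static void main(String[] args) {
        Polyline a = new Polyline(new double[] {0.0, 1.0, 2.0});
        Polyline b = a.clone();
        b.coords[0] = 99.0;
        System.out.println(a.coords[0]);  // 0.0: the original array is unaffected
        System.out.println(b.coords[0]);  // 99.0
    }
}
```

Without the second assignment in clone( ), both objects would share one array and the change to b.coords[0] would be visible through a as well.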

2.9.3. Comparing Objects

We've seen that primitive types and reference types differ significantly in the way they are assigned to variables, passed to methods, and copied. The types also differ in the way they are compared for equality. When used with primitive values, the equality operator (==) simply tests whether two values are identical (i.e., whether they have exactly the same bits). With reference types, however, == compares references, not actual objects. In other words, == tests whether two references refer to the same object; it does not test whether two objects have the same content. For example:

String letter = "o";
String s = "hello";                      // These two String objects
String t = "hell" + letter;              // contain exactly the same text.
if (s == t) System.out.println("equal"); // But they are not equal!

byte[] a = { 1, 2, 3 };                  // An array.
byte[] b = (byte[]) a.clone();           // A copy with identical content.
if (a == b) System.out.println("equal"); // But they are not equal!

When working with reference types, there are two kinds of equality: equality of reference and equality of object. It is important to distinguish between these two kinds of equality. One way to do this is to use the word "identical" when talking about equality of references and the word "equal" when talking about two distinct objects that have the same content. To test two nonidentical objects for equality, pass one of them to the equals( ) method of the other:

String letter = "o";
String s = "hello";                      // These two String objects
String t = "hell" + letter;              // contain exactly the same text.
if (s.equals(t))                         // And the equals() method
    System.out.println("equal");         // tells us so.

All objects inherit an equals( ) method (from Object), but the default implementation simply uses == to test for identity of references, not equality of content. A class that wants to allow objects to be compared for equality can define its own version of the equals( ) method. Our Point class does not do this, but the String class does, as indicated in the code example. You can call the equals( ) method on an array, but it is the same as using the == operator, because arrays always inherit the default equals( ) method that compares references rather than array content. You can compare arrays for equality with the convenience method java.util.Arrays.equals( ).
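
A sketch of what a content-based equals( ) might look like for a Point-like class follows, together with the array comparison just described. The class names are hypothetical, and this Point is an illustrative variant rather than the chapter's definition. It also overrides hashCode( ), because the contract of Object requires equal objects to report equal hash codes:

```java
import java.util.Arrays;

class EqualsDemo {
    // Hypothetical Point variant that defines equality by content.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;               // identical references
            if (!(o instanceof Point)) return false;  // also rejects null
            Point other = (Point) o;
            return Double.compare(x, other.x) == 0
                && Double.compare(y, other.y) == 0;
        }

        @Override
        public int hashCode() {
            return 31 * Double.hashCode(x) + Double.hashCode(y);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1.0, 2.0);
        Point q = new Point(1.0, 2.0);
        System.out.println(p == q);               // false: two distinct objects
        System.out.println(p.equals(q));          // true: same content

        byte[] a = { 1, 2, 3 };
        byte[] b = a.clone();
        System.out.println(a.equals(b));          // false: inherited reference comparison
        System.out.println(Arrays.equals(a, b));  // true: element-by-element comparison
    }
}
```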

2.9.4. Terminology: Pass by Value

I've said that Java handles objects "by reference." Don't confuse this with the phrase "pass by reference." "Pass by reference" is a term used to describe the method-calling conventions of some programming languages. In a pass-by-reference language, values, even primitive values, are not passed directly to methods. Instead, methods are always passed references to values. Thus, if the method modifies its parameters, those modifications are visible when the method returns, even for primitive types.

Java does not do this; it is a "pass by value" language. However, when a reference type is involved, the value that is passed is a reference. But this is still not the same as pass-by-reference. If Java were a pass-by-reference language, when a reference type is passed to a method, it would be passed as a reference to the reference.
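
A swap method is a common way to see the difference. In a pass-by-reference language, a method like swap( ) below would exchange the caller's variables; in Java it exchanges only its own copies of the references. This is a sketch with hypothetical names (PassByValueDemo, swap, swapContents) and a stand-in Point class:

```java
class PassByValueDemo {
    // Stand-in for the chapter's Point class (an assumption for this sketch).
    static class Point {
        double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Exchanges only the method's own copies of the two references.
    static void swap(Point a, Point b) {
        Point tmp = a;
        a = b;
        b = tmp;
    }

    // Exchanges the contents of the two objects through the copied references.
    static void swapContents(Point a, Point b) {
        double tx = a.x, ty = a.y;
        a.x = b.x; a.y = b.y;
        b.x = tx;  b.y = ty;
    }

    public static void main(String[] args) {
        Point p = new Point(1.0, 2.0);
        Point q = new Point(3.0, 4.0);

        swap(p, q);
        System.out.println(p.x + " " + q.x);  // 1.0 3.0: the caller's variables are unchanged

        swapContents(p, q);
        System.out.println(p.x + " " + q.x);  // 3.0 1.0: the objects themselves were modified
    }
}
```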

2.9.5. Memory Allocation and Garbage Collection

As we've already noted, objects are composite values that can contain a number of other values and may require a substantial amount of memory. When you use the new keyword to create a new object or use an object literal in your program, Java automatically creates the object for you, allocating whatever amount of memory is necessary. You don't need to do anything to make this happen.

In addition, Java also automatically reclaims that memory for reuse when it is no longer needed. It does this through a process called garbage collection. An object is considered garbage when no references to it are stored in any variables, the fields of any objects, or the elements of any arrays. For example:

Point p = new Point(1,2);           // Create an object
double d = p.distanceFromOrigin();  // Use it for something
p = new Point(2,3);                 // Create a new object

After the Java interpreter executes the third line, a reference to the new Point object has replaced the reference to the first one. No references to the first object remain, so it is garbage. At some point, the garbage collector discovers this and reclaims the memory used by the object.

C programmers, who are used to using malloc( ) and free( ) to manage memory, and C++ programmers, who are used to explicitly deleting their objects with delete, may find it a little hard to relinquish control and trust the garbage collector. Even though it seems like magic, it really works! There is a slight, but usually negligible, performance penalty due to the use of garbage collection. However, having garbage collection built into the language dramatically reduces the occurrence of memory leaks and related bugs and almost always improves programmer productivity.

2.9.6. Reference Type Conversions

Objects can be converted between different reference types. As with primitive types, reference type conversions can be widening conversions (allowed automatically by the compiler) or narrowing conversions that require a cast (and possibly a runtime check). In order to understand reference type conversions, you need to understand that reference types form a hierarchy, usually called the class hierarchy.

Every Java reference type extends some other type, known as its superclass. A type inherits the fields and methods of its superclass and then defines its own additional fields and methods. A special class named Object serves as the root of the class hierarchy in Java. All Java classes extend Object directly or indirectly. The Object class defines a number of special methods that are inherited (or overridden) by all objects.

The predefined String class and the Point class we discussed earlier in this chapter both extend Object. Thus, we can say that all String objects are also Object objects. We can also say that all Point objects are Object objects. The opposite is not true, however. We cannot say that every Object is a String because, as we've just seen, some Object objects are Point objects.

With this simple understanding of the class hierarchy, we can return to the rules of reference type conversion:

  • An object cannot be converted to an unrelated type. The Java compiler does not allow you to convert a String to a Point, for example, even if you use a cast operator.

  • An object can be converted to the type of its superclass or of any ancestor class. This is a widening conversion, so no cast is required. For example, a String value can be assigned to a variable of type Object or passed to a method where an Object parameter is expected. Note that no conversion is actually performed; the object is simply treated as if it were an instance of the superclass.

  • An object can be converted to the type of a subclass, but this is a narrowing conversion and requires a cast. The Java compiler provisionally allows this kind of conversion, but the Java interpreter checks at runtime to make sure it is valid. Only cast an object to the type of a subclass if you are sure, based on the logic of your program, that the object is actually an instance of the subclass. If it is not, the interpreter throws a ClassCastException. For example, if we assign a String object to a variable of type Object, we can later cast the value of that variable back to type String:

Object o = "string";    // Widening conversion from String to Object
// Later in the program...
String s = (String) o;  // Narrowing conversion from Object to String
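
When the program logic does not guarantee the runtime type, one option is to test it with the instanceof operator before casting. The sketch below uses hypothetical names (CastDemo, lengthIfString, castFails) and also shows the ClassCastException that an invalid narrowing cast produces at runtime:

```java
class CastDemo {
    // Returns the length of o if it is a String, or -1 otherwise.
    static int lengthIfString(Object o) {
        if (o instanceof String) {
            String s = (String) o;  // safe: the runtime type was just checked
            return s.length();
        }
        return -1;
    }

    // Reports whether narrowing o to String fails at runtime.
    static boolean castFails(Object o) {
        try {
            String s = (String) o;  // compiles, but is checked at runtime
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object text = "string";               // widening conversion, no cast needed
        Object number = Integer.valueOf(42);  // an Integer is not a String

        System.out.println(lengthIfString(text));    // 6
        System.out.println(lengthIfString(number));  // -1
        System.out.println(castFails(text));         // false
        System.out.println(castFails(number));       // true
    }
}
```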

Arrays are objects and follow some conversion rules of their own. First, any array can be converted to an Object value through a widening conversion. A narrowing conversion with a cast can convert such an object value back to an array. For example:

Object o = new int[] {1,2,3};  // Widening conversion from array to Object
// Later in the program...
int[] a = (int[]) o;           // Narrowing conversion back to array type

In addition to converting an array to an object, an array can be converted to another type of array if the "base types" of the two arrays are reference types that can themselves be converted. For example:

// Here is an array of strings.
String[] strings = new String[] { "hi", "there" };
// A widening conversion to CharSequence[] is allowed because String
// can be widened to CharSequence
CharSequence[] sequences = strings;
// The narrowing conversion back to String[] requires a cast.
strings = (String[]) sequences;
// This is an array of arrays of strings
String[][] s = new String[][] { strings };
// It cannot be converted to CharSequence[] because String[] cannot be
// converted to CharSequence: the number of dimensions doesn't match
sequences = s;  // This line will not compile
// s can be converted to Object or Object[], however, because all array types
// (including String[] and String[][]) can be converted to Object.
Object[] objects = s;

Note that these array conversion rules apply only to arrays of objects and arrays of arrays. An array of primitive type cannot be converted to any other array type, even if the primitive base types can be converted:

// Can't convert int[] to double[] even though int can be widened to double
double[] data = new int[] {1,2,3};  // This line causes a compilation error
// This line is legal, however, since int[] can be converted to Object
Object[] objects = new int[][] {{1,2},{3,4}};
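
One consequence of array widening is worth keeping in mind, although it goes slightly beyond the rules listed above: the array object still remembers its real element type, so storing an incompatible element through the widened reference compiles but fails at runtime with an ArrayStoreException. A small sketch, with hypothetical names (ArrayStoreDemo, storeFails):

```java
class ArrayStoreDemo {
    // Tries to store value in array[0]; reports whether the runtime rejected it.
    static boolean storeFails(Object[] array, Object value) {
        try {
            array[0] = value;  // checked against the array's actual element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] strings = { "hi", "there" };
        Object[] objects = strings;  // widening conversion: same array, wider type

        System.out.println(storeFails(objects, "hello"));             // false: a String fits
        System.out.println(storeFails(objects, Integer.valueOf(1)));  // true: an Integer does not
        System.out.println(strings[0]);                               // hello
    }
}
```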

2.9.7. Boxing and Unboxing Conversions

Primitive types and reference types behave quite differently. It is sometimes useful to treat primitive values as objects, and for this reason, the Java platform includes wrapper classes for each of the primitive types. Boolean, Byte, Short, Character, Integer, Long, Float, and Double are immutable classes whose instances each hold a single primitive value. These wrapper classes are usually used when you want to store primitive values in collections such as java.util.List:

List numbers = new ArrayList();               // Create a List collection
numbers.add(new Integer(-1));                 // Store a wrapped primitive
int i = ((Integer)numbers.get(0)).intValue(); // Extract the primitive value

Prior to Java 5.0, no conversions between primitive types and reference types were allowed. This code explicitly calls the Integer( ) constructor to wrap a primitive int in an object and explicitly calls the intValue( ) method to extract a primitive value from the wrapper object.

Java 5.0 introduces two new types of conversions known as boxing and unboxing conversions. Boxing conversions convert a primitive value to its corresponding wrapper object and unboxing conversions do the opposite. You may explicitly specify a boxing or unboxing conversion with a cast, but this is unnecessary since these conversions are automatically performed when you assign a value to a variable or pass a value to a method. Furthermore, unboxing conversions are also automatic if you use a wrapper object when a Java operator or statement expects a primitive value. Because Java 5.0 performs boxing and unboxing automatically, this new language feature is often known as autoboxing.

Here are some examples of automatic boxing and unboxing conversions:

Integer i = 0;    // int literal 0 is boxed into an Integer object
Number n = 0.0f;  // float literal is boxed into Float and widened to Number
i = 1;            // this is also a boxing conversion
int j = i;        // i is unboxed here
i++;              // i is unboxed, incremented, and then boxed up again
Integer k = i+2;  // i is unboxed and the sum is boxed up again
i = null;
j = i;            // unboxing here throws a NullPointerException

Automatic boxing and unboxing conversions make it much simpler to use primitive values with collection classes. The list-of-numbers code earlier in this section can be translated as follows in Java 5.0. Note that the translation also uses generics, another new feature of Java 5.0 that is covered in Chapter 4.

List<Integer> numbers = new ArrayList<Integer>(); // Create a List of Integer
numbers.add(-1);                                  // Box int to Integer
int i = numbers.get(0);                           // Unbox Integer to int
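
Automatic unboxing of values taken from a collection deserves some care, because a collection can hand back null. The following sketch assumes a hypothetical word-count map and a hypothetical helper named countOf; it shows both the NullPointerException that unboxing null produces and one way to guard against it:

```java
import java.util.HashMap;
import java.util.Map;

class UnboxDemo {
    // Returns the stored count for key, or 0 if the map has no entry for it.
    static int countOf(Map<String, Integer> counts, String key) {
        Integer boxed = counts.get(key);     // null when the key is absent
        return (boxed == null) ? 0 : boxed;  // unboxed only when non-null
    }

    // Reports whether unboxing the map's value for key throws.
    static boolean unboxingThrows(Map<String, Integer> counts, String key) {
        try {
            int n = counts.get(key);         // automatic unboxing of the result
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        counts.put("a", 2);                  // int 2 is boxed to Integer

        System.out.println(countOf(counts, "a"));         // 2
        System.out.println(countOf(counts, "b"));         // 0
        System.out.println(unboxingThrows(counts, "a"));  // false
        System.out.println(unboxingThrows(counts, "b"));  // true: null cannot be unboxed
    }
}
```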
