What's in a Name?


`The name of the song is called ``Haddocks' Eyes''.'
`Oh, that's the name of the song, is it?' Alice said, trying to feel interested.
`No, you don't understand,' the Knight said, looking a little vexed. `That's what the name is called. The name really is ``The Aged Aged Man''.'
`Then I ought to have said ``That's what the song is called''?' Alice corrected herself.
`No, you oughtn't: that's quite another thing! The song is called ``Ways and Means'': but that's only what it's called, you know!'
`Well, what is the song, then?' said Alice, who was by this time completely bewildered.
Lewis Carroll, Through the Looking-Glass

We now have quite a sophisticated little program, and we also now have the tools to make it ever more sophisticated. We could keep adding new types by extending downward from class Lamp, or class DimmerControlledLamp, or both, or we could extend upward by implementing new interfaces besides Luminous, then extend sideways by creating new abstract or concrete classes to implement those interfaces, then extend downward again by creating new classes to extend those classes, and so on.

All of this would be amusing enough and would certainly take us very far, but we would eventually run up against fundamental problems. There is a deep part of Java that we have touched on several times but that we have yet to fully explore. It is the seemingly simple idea of object names. There are many different ways to refer to objects, and we must master them all to clamber up to the next stage in complexity.
 
Actors and Names

In the movies we frequently mix the name of the character an actor is playing with the name of the actor who's playing that character. We easily refer to both Rambo and Rocky as if they were real people, and not both characters played by Sylvester Stallone. Movies like Last Action Hero, in which Arnold Schwarzenegger both plays himself and also plays a character starring in a movie within the movie in which he himself stars, are rare. Usually we needn't distinguish between an actor and the role the actor plays.

As Java programmers, however, we need to be more precise. When talking to the stage manager we sometimes have to distinguish between the following eight things:

* an object (an actor),
 
* a variable referencing the object (a small box),
 
* the name of the variable referencing the object (the label on the outside of the box),
 
* the name of the object (something written on a piece of paper inside the box),
 
* the type of the object (the script the actor follows),
 
* the type of the variable referencing the object (it's a reference variable, as opposed to a boolean, int, or double variable),
 
* the type of the name of the variable referencing the object (a property of the label on the box), and
 
* the type of the name of the object (a property of the name on the piece of paper inside the box, which is inaccessible to us, so its type, if any, doesn't matter).

There are four different things: an object, a box, a piece of paper inside the box, and a label on the outside of the box. Each of those four things has a type, which adds four more things. All eight of those things are different. Further, any object may have many such boxes. Each such box can have a different label, and the types of those labels can sometimes vary as well (we'll see how soon), but all such boxes must have the same name written on the piece of paper inside them, because they all reference the same object.

For most purposes, we can mush together the box, the label on the box, the name inside the box, and the actor who answers to the name inside the box into one thing, which we then call "the actor"; but they are all quite separate in reality.

This level of precision usually isn't important, but it's critical when programming---if nothing else, it's what prevents us from getting so frustrated that we throw our computers out the window. Precision helps us see that what we told the Java interpreter to do isn't what we meant for the Java interpreter to do.
 
Actors and Characters

Here is part of the main() method of our very first Java program, FredsScript:


   FredsScript fred;
   fred = new FredsScript();
   double portionsPerPerson;
   portionsPerPerson = fred.divideApplesAmongPeople(6, 2);

The first two statements look like we're creating an actor and naming it fred, but that's not what we're doing. The actor we create has a name, yes, but it's hidden---only the stage manager knows it. What we're instead doing is asking the stage manager to do the following three things:

* create a box that can hold an actor's name, and label the box fred,
* create an actor,
* put that actor's true name into the box for us.

All we will ever know is the label on the box (the name of the reference variable); we will never know the actor's true name.

In theater terms, the label on the box is the name of the character the actor is playing in our script. This indirection lets us say "Romeo kisses Juliet" in our script without knowing, or caring, whether the actors' real names are indeed Romeo and Juliet.

So when we say "fred please execute your divideApplesAmongPeople() method" we're really saying: "the actor whose true name is in the reference variable labelled fred (that is, the actor who's playing the fred character), please execute your divideApplesAmongPeople() method".
 
Actors and Aliases

Suppose we change the FredsScript() code as follows:


   FredsScript fred;
   fred = new FredsScript();
   FredsScript jim;
   jim = fred;
   double portionsPerPerson;
   portionsPerPerson = jim.divideApplesAmongPeople(6, 2);

This looks like we've created two actors, named fred and jim. What we've instead done, though, is created two boxes (that is, reference variables) that can point to actors following this particular script.

We now have a new box, labelled jim, that can contain the name of any FredsScript-following actor. Into that box we put a copy of the true name that's inside the box labelled fred using the assignment statement:


   jim = fred;

That doesn't damage the contents of the box labelled fred we started with. The new box is labelled jim, but both boxes contain the same thing---a reference to the sole actor we've created so far. Consequently, we can refer to the actor using the box labelled fred or the box labelled jim. So the two method execution requests:

   jim.divideApplesAmongPeople(6, 2);

and

   fred.divideApplesAmongPeople(6, 2);

have the same effect. All we've done is given the (sole) actor an alias.

In theater terms, we now have one actor playing two different characters.
 
Actors and Roles

In the last example, although we had two characters, each character was playing the same role. Both will follow the same set of instructions in their (single) class. Each class is the same as a theatrical role; it specifies the state and behavior of one particular type of actor.

If now we were to change the code in the following way:


   FredsScript fred;
   fred = new FredsScript();
   FredsScript jim;
   jim = new FredsScript();
   double portionsPerPerson;
   portionsPerPerson = jim.divideApplesAmongPeople(6, 2);

we would end up with two actors, whose respective names are in the boxes labelled fred and jim. In this version, although there are now two separate actors, only one, jim, is being asked to do anything.

In theater terms, we now have two actors playing two different characters, but each character has the same role, just as before. Both are following the same script (or class).
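
The difference between an alias and a second actor can be checked directly. The following is only a sketch: this FredsScript is a hypothetical stand-in (only the method name comes from our earlier program; the request counter, getRequests(), and the third name bob are assumptions added for illustration). Comparing two reference variables with == asks whether they hold the same true name.

```java
// Hypothetical stand-in for FredsScript; the counter is an assumption
// added so we can see which actor actually received a request.
class FredsScript {
    private int requests = 0;

    double divideApplesAmongPeople(int apples, int people) {
        requests++;
        return (double) apples / people;
    }

    int getRequests() {
        return requests;
    }
}

class AliasDemo {
    public static void main(String[] args) {
        FredsScript fred = new FredsScript();
        FredsScript jim = fred;               // alias: a second box, same actor
        FredsScript bob = new FredsScript();  // a second, separate actor

        jim.divideApplesAmongPeople(6, 2);

        System.out.println(fred == jim);         // true: one actor, two names
        System.out.println(fred == bob);         // false: two actors
        System.out.println(fred.getRequests());  // 1: the request made via jim
        System.out.println(bob.getRequests());   // 0: bob was never asked
    }
}
```

The request made through jim shows up when we ask through fred, because both boxes hold the same name; bob's actor is untouched.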
 
The Theater Metaphor

All the pieces of our theater metaphor should now be clear.
 
Theater                        Computer

actors                         objects
roles                          classes and interfaces
props                          variables
actor names                    reference values
character names                reference variables
lines                          messages
stage directions               statements
scenes                         methods
scripts                        programs
audience                       program users
playwright and director        programmer
stage manager and producer     Java interpreter

Classes define roles for actors to play. Each actor plays one or more characters (each named with a different reference variable). Each character that an actor plays has a fixed role (whose actions are specified by its class). Actors may have several aliases (that is, reference variables), but they always each have one true name (the single reference value stored in each of the reference variables that reference them).

We, the playwright/directors, will never know an actor's true name; only the stage manager/producer will know that. We will, however, be the ones to decide how many roles there will be, how many actors will play characters playing those roles, and how many character names, if any, each of those actors has.

The actors play out their roles by following the role's stage directions (statements) and speaking their lines (sending messages). Those messages ask other actors to do things. An actor may ask another actor to access its props (variables) or to execute its scenes (methods).

An actor may have special (local) props in each scene its role defines, but will always have the (global) props its role calls for all actors of that type to have. We can make those global props, as well as the scenes themselves, visible or invisible to other actors, depending on the access modifiers of the props or scenes.

We can put the roles (classes and interfaces) themselves into packages of roles, and we can make roles within a package visible or invisible to actors playing other roles in other packages. Some of those roles may extend other roles, and some of the roles may be little more than drafts (abstract classes), or mere declarations about expectations (interfaces) for any actors implementing those roles.

The entire complex of roles we define makes up the program we're writing. The stage manager/producer will then perform the play by creating the actors we require to follow the roles we specify. If we do our jobs properly, the audience then breaks into wild applause.
 
Using Shorthand for Naming

So far we've been writing the following two statements to create a reference variable, then to create an object for that reference variable to point to:


   Lamp bauhaus;
   bauhaus = new Lamp();

Instead, we can write the following:

   Lamp bauhaus = new Lamp();

This statement has the same effect as the previous two. First, the stage manager creates a reference variable named bauhaus, then it creates a Lamp object (whose true name we won't ever know), then it puts the object's true name into the reference variable bauhaus.

By itself, the statement:


   Lamp bauhaus;

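asks the Java interpreter to create a reference variable whose type is Lamp (that is, it can only store reference values that point to Lamp objects). If the variable belongs to an object or a class, the Java interpreter initially gives it the value null, which means that the variable points to no object. A variable declared inside a method gets no initial value at all, and the compiler won't let us use it until we assign one.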

By itself, the statement:


   new Lamp();

asks the Java interpreter to create an object whose type is Lamp (that is, an object that knows how to do all the things defined in class Lamp). Initially, no reference variable points to it.

By connecting the two creation requests with an assignment, as in:


   bauhaus = new Lamp();

we ask the Java interpreter to create an object whose type is Lamp and to put its true name in a reference variable whose type is Lamp. If we don't do so we will never have any way to refer to the new object. Having no name we can refer to, it might as well not exist at all.
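
Here is a small sketch of the three cases. The empty Lamp class and the Studio class are hypothetical stand-ins, not the Lamp we wrote earlier. The sketch uses an instance variable because Java guarantees that those start out as null.

```java
// Hypothetical, empty stand-in for class Lamp.
class Lamp {
}

// Hypothetical class that owns one Lamp reference variable.
class Studio {
    Lamp bauhaus;   // instance variable: holds null until we assign to it

    public static void main(String[] args) {
        Studio studio = new Studio();
        System.out.println(studio.bauhaus == null);  // true: no object yet

        new Lamp();                    // a Lamp that no variable refers to
        studio.bauhaus = new Lamp();   // a Lamp whose name we keep in a box

        System.out.println(studio.bauhaus != null);  // true
    }
}
```

The Lamp created by the bare new Lamp(); statement can never be asked to do anything, since no reference variable holds its name.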
 
Arrays of Names

Understanding naming becomes even more important when there are too many variables to name individually, as in arrays. An array is an object that can hold any particular number of variables, all of the same type. Those variables may hold object names (that is, they may be reference variables holding reference values) or they may be int, double, or boolean variables holding int, double, or boolean values, respectively.

The code


   Object[] array = new Object[3];
   array[0] = new Object[2];
   array[1] = new Object[4];
   array[2] = new Object[1];

declares an array reference variable array then creates an object for it to point to. This object has three slots in it to hold Object reference variables (that is, reference variables that can point to any Object object). Since every object is of type Object, this means that the newly created object can point to any three objects at all.

The code then creates three more objects whose names are stored in the first object. Each of the three is itself an array object that can hold Object reference variables: the first can hold two of them, the second can hold four, and the third can hold only one.

The code creates the same arrangement of objects as the following code, although the slots of the outermost array are now declared to hold Object[] references rather than plain Object references:


   Object[][] array = new Object[3][];
   array[0] = new Object[2];
   array[1] = new Object[4];
   array[2] = new Object[1];

Reference variable array now points to a two-dimensional array of reference variables. Similarly, we can declare higher-dimensioned arrays. If we use this style, however, we must fill in at least the first (leftmost) dimension of the array since the Java interpreter has to know how many slots the very first object will contain before it can create that object.
 
Understanding Arrays of Names

We can refer to the third element of the second element of the above array array with array[1][2]. The reference variable array points to an object. That object has three slots. The second such slot is the reference variable array[1]. It points to yet another object. That object has four slots. The third such slot is the reference variable array[1][2]. It points to nothing at present (its default value is null), but we could make it point to any object at all, since its type is Object.
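
The walk through the slots can be checked with a short sketch. The class name JaggedArrayDemo and the helper method build() are assumptions made for illustration; the array itself is the one described above.

```java
class JaggedArrayDemo {
    // Builds the array from the text: three rows of length 2, 4, and 1.
    static Object[][] build() {
        Object[][] array = new Object[3][];
        array[0] = new Object[2];
        array[1] = new Object[4];
        array[2] = new Object[1];
        return array;
    }

    public static void main(String[] args) {
        Object[][] array = build();

        System.out.println(array.length);     // 3: slots in the first object
        System.out.println(array[1].length);  // 4: slots in the second row
        System.out.println(array[1][2]);      // null: points to nothing yet

        array[1][2] = "any object will do";   // its type is Object
        System.out.println(array[1][2]);
    }
}
```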

The following code fills a 4x4 two-dimensional array of Strings with the string "Hi":


   String[][] arrayOfStringArrays = new String[4][4];
   int index1, index2;
   for (index1 = 0; index1 < arrayOfStringArrays.length; index1++)
      for (index2 = 0; index2 < arrayOfStringArrays[index1].length; index2++)
         arrayOfStringArrays[index1][index2] = "Hi";

Or we could do it like this:

   String[][] arrayOfStringArrays = new String[4][4];
   int index1, index2;
   for (index1 = 0; index1 < arrayOfStringArrays.length; index1++)
      {
      String[] temporaryArray = new String[4];
      for (index2 = 0; index2 < temporaryArray.length; index2++)
         temporaryArray[index2] = "Hi";
      arrayOfStringArrays[index1] = temporaryArray;
      }

The code:


   Object[] array = new Object[1];
   array[0] = new Object[1];

does not create the same set of objects as the code:

   Object[] array = new Object[1];
   array[0] = new Object();

The first piece of code creates an object that has a slot for a reference variable. The reference variable in that slot can point to another object that has a slot. The reference variable in that last slot can hold the name of any object.

The second piece of code also creates an object that has a slot for a reference variable. The reference variable in that slot can point to another object, but that second object does not have a reference variable slot. It's an Object object, not an array of Object objects of length one.
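
One way to see the difference is to ask each element whether it is an array. This is a sketch only; the class name SlotDemo and the helper methods nested() and flat() are hypothetical.

```java
class SlotDemo {
    // First version from the text: the element is itself a one-slot array.
    static Object[] nested() {
        Object[] array = new Object[1];
        array[0] = new Object[1];
        return array;
    }

    // Second version from the text: the element is a plain Object.
    static Object[] flat() {
        Object[] array = new Object[1];
        array[0] = new Object();
        return array;
    }

    public static void main(String[] args) {
        System.out.println(nested()[0] instanceof Object[]);  // true
        System.out.println(flat()[0] instanceof Object[]);    // false

        // The declared type of the element is Object, so reaching the
        // inner slot needs a cast to Object[] first.
        Object[] inner = (Object[]) nested()[0];
        inner[0] = "stored in the inner slot";
        System.out.println(inner[0]);
    }
}
```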
 
Initializing Arrays of Names

When we create an array object the Java interpreter secretly initializes the variables it holds. If it holds boolean variables, they are each initialized to hold false. If it holds int or double variables, they are each initialized to hold 0. If it holds reference variables, they are each initialized to hold null.

We can also state the elements of an array directly using braces ({}). For example,


   String[] names = { "Shannon", "Jonathan", "Erin", "Moreena" };

is short for

   String[] names = new String[4];
   names[0] = "Shannon";
   names[1] = "Jonathan";
   names[2] = "Erin";
   names[3] = "Moreena";

Before we create the four String objects, each of the elements of the names array (that is, every String reference variable slot in the object) contains the non-typed value null; they point to no object at all.

Similarly, we can initialize, say, int arrays the same way. For example,


   int[] numbers = { 1, 2, 4, 8 };

is short for

   int[] numbers = new int[4];
   numbers[0] = 1;
   numbers[1] = 2;
   numbers[2] = 4;
   numbers[3] = 8;

Before we insert the four int values, each of the elements of the numbers array (that is, every int variable slot in the object) contains the int value 0.
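
The default values are easy to observe. In this sketch the class name DefaultsDemo and the method describeDefaults() are assumptions; the method reads the first slot of four freshly created arrays before anything has been stored in them.

```java
class DefaultsDemo {
    // Reads one untouched slot from each kind of array.
    static String describeDefaults() {
        boolean[] flags = new boolean[1];
        int[] counts = new int[1];
        double[] ratios = new double[1];
        String[] names = new String[1];
        return flags[0] + " " + counts[0] + " " + ratios[0] + " " + names[0];
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());  // false 0 0.0 null

        int[] numbers = { 1, 2, 4, 8 };          // brace shorthand
        System.out.println(numbers.length);      // 4
        System.out.println(numbers[3]);          // 8
    }
}
```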
 
Naming Variables by Passing Parameters

When we ask an object to execute one of its methods that takes a parameter we are asking the stage manager to first create a copy of the parameter then pass that copy to the object that's about to execute the method. In other words, we're creating yet another name for whatever we're passing.

If that variable is an int, double, or boolean variable, then we're asking the stage manager to make a copy of the int, double, or boolean value inside the variable and pass that copy to the object executing the method. If it's a reference variable, we're asking the stage manager to do exactly the same thing. It's going to make a copy of the reference variable and pass that copy to the object executing the method.

The effects, however, can be different depending on what the object executing the method does with that copied variable.

Suppose, for instance, we had a method like this:


   public int twiddle(int fiddle)
      {
      fiddle = 5;

      return fiddle;
      }

Wherever we use the twiddle() method, assigning 5 to fiddle inside the method can't affect the value of fiddle outside the method. We can twiddle() the fiddle variable as much as we want; it won't affect the original outside the method. Remember that we're first making a copy of fiddle before sending it to be twiddle()ed. Any changes to that copy inside the method don't matter outside the method.

Suppose, however, the method looked like this:


   public int[] twiddle(int[] fiddle)
      {
      fiddle[0] = 5;

      return fiddle;
      }

This method looks much the same as the previous one, but there is an important difference. We're now passing in an array of int variables, not an int, and all arrays are objects. So fiddle is now a reference variable, not an int variable.

The Java interpreter will copy it and pass it to the object about to execute the method, just as before. This time, however, when we alter fiddle[0] inside the method we will also be altering fiddle[0] outside the method.

Although we're still making a copy and passing it in, the things stored in reference variables are object names, that is, reference values, not int, double, or boolean values. So we're making a copy of the name of the object that reference variable fiddle refers to.

So when we alter fiddle[0] inside the method we're asking the object executing the method to alter the object that fiddle points to. It makes no difference whether the name of the variable is fiddle or something else; once it refers to the same object it will access that object. So once we return from the method, fiddle[0] will have the new value assigned to it inside the method.

Finally, suppose the method looked like this:


   public int[] twiddle(int[] fiddle)
      {
      fiddle = new int[5];

      return fiddle;
      }

This time, the method would return a completely new array (all filled with 0s), not the old fiddle array that we passed in to the method. The fiddle variable outside the method, however, would still be unaffected by the shenanigans inside twiddle(). It would still point to the same array as before.
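
The three cases can be put side by side and observed from the caller's point of view. This is a sketch under some assumptions: the methods are static so the example fits in one class, and the two array versions are renamed twiddleElement() and twiddleReference() (hypothetical names) because a class can't hold two methods with the same name and parameter types.

```java
class TwiddleDemo {
    static int twiddle(int fiddle) {
        fiddle = 5;             // changes only the method's own copy
        return fiddle;
    }

    static int[] twiddleElement(int[] fiddle) {
        fiddle[0] = 5;          // changes the one array both names refer to
        return fiddle;
    }

    static int[] twiddleReference(int[] fiddle) {
        fiddle = new int[5];    // repoints only the method's own copy
        return fiddle;
    }

    public static void main(String[] args) {
        int number = 1;
        twiddle(number);
        System.out.println(number);           // 1: the caller's int is untouched

        int[] array = { 1, 2, 3 };
        twiddleElement(array);
        System.out.println(array[0]);         // 5: the shared array was altered

        int[] result = twiddleReference(array);
        System.out.println(result == array);  // false: a brand new array
        System.out.println(array.length);     // 3: the caller's variable is untouched
    }
}
```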


 
The this Name

The this variable is a reference variable, since it contains the name of an object. But it's a very special kind of reference variable. To make sure that this always refers to the object executing the method that this is used in, the stage manager won't let us assign anything to this, even though it's a reference variable. So, the following, for instance, is illegal:


   this = kandinsky;

Although the following is perfectly legal:

   kandinsky = this;

As is this:

   if (kandinsky == this)

And this:

   return this;

Think of this as a final variable---that is, a constant. We can't assign anything to it, but we can assign its value to any other reference variable of the same type.

Although it's a constant when any particular object is executing the method it's used in, its value changes when another object executes the same method. Each time a different object executes the method it behaves like a constant, but its value varies from object to object.
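
A short sketch makes the point that this is fixed for one object but differs between objects. The class Mirror, its methods self() and isSameAs(), and the second name klee are all hypothetical.

```java
// Hypothetical class whose methods do nothing but use this.
class Mirror {
    Mirror self() {
        return this;             // hand back the name of the executing object
    }

    boolean isSameAs(Mirror other) {
        return other == this;    // compare another name with our own
    }
}

class ThisDemo {
    public static void main(String[] args) {
        Mirror kandinsky = new Mirror();
        Mirror klee = new Mirror();

        System.out.println(kandinsky.self() == kandinsky);  // true
        System.out.println(klee.self() == klee);            // true
        System.out.println(kandinsky.self() == klee);       // false
        System.out.println(kandinsky.isSameAs(klee));       // false
    }
}
```

The same self() method returns a different value depending on which object executes it, even though neither object can ever change what its own this refers to.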
 
The super Name

So far, we've only seen super used to refer to the superclass' constructor in the statement:


   super();

We can also use super to refer to non-private variables and methods in the superclass. For example, we could rewrite the turnOn() method in class DimmerControlledLamp like this (comments deleted):


   public void turnOn()
      {
      super.turnOn();
      this.setBrightness(1);
      }

This method first asks a DimmerControlledLamp object to execute the turnOn() method defined in class Lamp, then to execute the setBrightness() method in class DimmerControlledLamp.

We can refer to the current class with this and that class' superclass with super. There is no direct way to refer to the super-superclass (if there is one) or any higher ancestors.
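
To try the turnOn() method on its own, here is a pared-down sketch. These versions of Lamp and DimmerControlledLamp are hypothetical stand-ins that keep only what the example needs; the isOn() and getBrightness() methods and the brightness variable are assumptions.

```java
// Hypothetical, pared-down Lamp.
class Lamp {
    private boolean lampIsOn = false;

    public void turnOn() {
        lampIsOn = true;
    }

    public boolean isOn() {
        return lampIsOn;
    }
}

// Hypothetical, pared-down DimmerControlledLamp.
class DimmerControlledLamp extends Lamp {
    private int brightness = 0;

    public void setBrightness(int brightness) {
        this.brightness = brightness;
    }

    public int getBrightness() {
        return brightness;
    }

    @Override
    public void turnOn() {
        super.turnOn();          // run the turnOn() defined in class Lamp
        this.setBrightness(1);   // then do the dimmer-specific part
    }
}

class SuperDemo {
    public static void main(String[] args) {
        DimmerControlledLamp kandinsky = new DimmerControlledLamp();
        kandinsky.turnOn();
        System.out.println(kandinsky.isOn());           // true
        System.out.println(kandinsky.getBrightness());  // 1
    }
}
```

Writing turnOn() instead of super.turnOn() inside the method would make the method request itself again without end, which is why the super name is needed here.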
 
Initializing Objects

Although we can use super in any constructor or instance method, we can only use super() in a constructor. Further, if we use it, it must be the first non-comment statement in the constructor.

Similarly, if a class has more than one constructor we can refer to other constructors of the same class using this(). Again, if we do so, we can only use it in a constructor and it must be the first line of the constructor we use it in. Since there are multiple constructors, the Java interpreter uses the signature of each constructor to figure out which one we mean, just as it does for overloaded methods.

For example, here, minus comments, are the two constructors of class Lamp:


   public Lamp()
      {
      lampIsOn = false;
      wattage = Lamp.DEFAULT_WATTAGE;
      name = "Lamp";
      }

   public Lamp(int wattage)
      {
      lampIsOn = false;

      if (wattage > Lamp.MAXIMUM_WATTAGE)
         this.wattage = Lamp.MAXIMUM_WATTAGE;
      else
         this.wattage = wattage;

      name = "Lamp";
      }

We could rewrite them as follows:


   public Lamp()
      {
      this(Lamp.DEFAULT_WATTAGE);
      }

   public Lamp(int wattage)
      {
      lampIsOn = false;

      if (wattage > Lamp.MAXIMUM_WATTAGE)
         this.wattage = Lamp.MAXIMUM_WATTAGE;
      else
         this.wattage = wattage;

      name = "Lamp";
      }

Now all the variable settings happen in one place. If we later decide to add a new variable, or delete one of the current ones, or modify something else, we only need to do it in one place. That increases class Lamp's internal encapsulation, thereby making future changes easier.

If we don't add an explicit super() or this() as the first line of each constructor, the Java interpreter will secretly add super() as the first line of the constructor. Since it always does this in every class (which, of course, includes the superclass) the effect is for each constructor to first either request execution of another constructor in the same class, or to request execution of some one of its superclass' constructors. Each class' constructors work their way up until they're requesting execution of the topmost superclass' constructor---which is the constructor in class Object, since everything extends from Object either explicitly or implicitly.

Class Object then initializes itself, then the next superclass down initializes itself, and so forth, until we fall all the way back down to the lowest level subclass of the object we're trying to initialize, at which time it initializes itself. Consequently, all the constructors initialize themselves in the order: class Object, then the first superclass below that, then the next below that, and so on, until we get all the way down to the lowest subclass that we're presently creating an object for.
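
The order can be made visible by having each constructor record its own name as it runs. This is a sketch; the classes Base, Middle, and Leaf and the shared log list are hypothetical. Leaf has two constructors so that we can see this() and the secret super() working together.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    // Shared record of which constructor bodies ran, in order.
    static final List<String> log = new ArrayList<>();

    Base() {
        log.add("Base");              // runs after Object's constructor
    }
}

class Middle extends Base {
    Middle() {
        log.add("Middle");            // the secret super() has already run
    }
}

class Leaf extends Middle {
    Leaf() {
        this("default");              // hand off to the other constructor
        log.add("Leaf()");
    }

    Leaf(String name) {
        log.add("Leaf(String)");      // the secret super() has already run
    }
}

class ConstructorOrderDemo {
    public static void main(String[] args) {
        new Leaf();
        System.out.println(Base.log);
        // prints [Base, Middle, Leaf(String), Leaf()]
    }
}
```

Only Leaf(String) gets a secret super(), since Leaf() already starts with this(); either way the superclass portions are finished before any Leaf constructor body runs.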
 
Initializing Objects: An Example

To understand what happens when objects are initialized, let's look at some classes with protected variables (even though using such access modification for variables is a style crime):


public class A
   {
   protected int x = 1;
   protected int y = 2;

   public A()
      {
      y = x;
      }
   }


public class B extends A
   {
   protected int z = 3;

   public B()
      {
      y = y + z;
      System.out.println("y = " + y);
      }

   public static void main(String[] parameters)
      {
      B someB = new B();
      }
   }

Superficially, this code looks like it will print 5, but it instead prints 4. Why?

First, recall that the Java interpreter secretly adds super() as the first line of both constructors. Now, here's what happens when we ask the interpreter to execute B's main() method.

The interpreter creates a secret object associated with class B. That secret object begins execution of main(). It asks the Java interpreter to create a new B object. Then it starts to initialize all the variables in that object.

To us, the object appears to have one variable (that is, z) but actually it has at least three: z in the B portion of the object, and x and y in the A portion of the object. It may also have inherited other variables all the way from class Object, but we won't consider them here.

After creating the new B object, the Java interpreter secretly sets all its variables to their default values (false for boolean variables, null for reference variables, and 0 for int and double variables). In this case, all variables are int so they are all set to 0.

Once it does that, the Java interpreter hands the newly created object, which so far holds nothing but default values, to the secret object. The variables we have explicitly given values to in their declarations are not set yet; each class's declaration values are assigned only when that class's constructor runs, right after its super() request returns.

The secret object then begins to execute the object's B constructor, which, remember, the Java interpreter has already secretly altered to first execute the A constructor. So it begins to execute the A constructor, but that has also already been secretly altered to first execute the Object constructor. So that's where the secret object ends up.

Once it finishes initializing all the Object parts of the new B object, it runs out of things to do in the Object constructor and continues execution right after the super() statement in the A constructor. Before doing anything else there, it assigns the values given in class A's declarations, so x gets the value 1 and y gets the value 2. At this point, x holds 1, y holds 2, and z still holds its default value, 0. The constructor then assigns the value of x to y, so y now holds 1.

Then the superclass constructor ends and execution returns to the B constructor, right after the super() request. Now the value given in class B's declaration is assigned, so z gets the value 3. The constructor sums y and z, giving 4, and assigns that value to y. So y now holds 4.

In short, the initialization order is: every variable gets its default value first; then, starting with class Object and descending from superclass to subclass, each class first assigns the values given in its variable declarations and then executes the rest of its constructor, until arriving at the lowest-level constructor. Each portion of the newly created object may depend on its superclass' values to be properly initialized. The Java interpreter secretly inserts all those super() requests to make sure that always happens for every object.
 
Egg Diagrams

Although they are very useful, the this and super references have kicked our tidy little world of objects into a cocked hat. Just what does it mean for an object to execute a constructor or method "in its superclass" or "in its class" or "in class Object"? As far as we know so far, only objects can do things. Even class methods, we found, are executed by an (admittedly secret) object. So what could it possibly mean for a class to execute a method or constructor?

The answer is that there is no such thing; it's simply a way of speaking. There still are only objects. So far we've been thinking about objects as formless bags that hold variables and execute methods, but there's more structure to them than that.

Each subclass we extend from yet another subclass makes the set of objects producible from that subclass more complex. Each subclass we extend adds one more layer of functionality to the objects producible from that class. We can address each of those parts with different names. We get those names through casting.
 
New Names Through Reference Casting

Given a reference variable and a type we can cast the reference variable to produce a new reference of the given type by enclosing the type name in parentheses before the reference variable.

The following code creates a new object, and a reference variable to that object, then creates another reference variable pointing to the same object. The second reference variable, however, has a different type than the first one:


   DimmerControlledLamp kandinsky = new DimmerControlledLamp();
   Lamp bauhaus = (Lamp) kandinsky;

If type B extends or implements type A, then type B is a subtype of type A. Conversely, type A is a supertype of type B. If one type is a subtype of another we can always upcast from the subtype to the supertype. That is, given a reference variable pointing to an object of the subtype we can always create another reference variable whose type is the supertype by casting to the supertype.


   //A is a supertype of B

   //create a B object and refer to it with a B reference
   B b = new B();

   //since A is a supertype of B then
   //given a reference variable to the B object
   //we can create another reference to the same object
   //and the type of that reference variable will be A
   A a = (A) b;

Saying that A extends (or implements) B, and B extends (or implements) C, is just another way of saying that A is-a-special-type-of B and B is-a-special-type-of C. So if A extends B, and B extends C, for example, then we can happily upcast from A to B, from B to C, and from A to C.

Since upcasting is always legal, the Java interpreter lets us simply do the assignment, and not bother with the cast at all, like so:


   //A is a supertype of B

   //create a B object and refer to it with a B reference
   B b = new B();

   //create another reference to the same object
   //and the type of that reference variable will be A
   A a = b;

This creates one object and two references to that one object. The first reference is of type B and the second is of type A. The one object is a B object, and B is a subtype of A, so it is not only a B, it is also an A. Consequently, we can give it a B name or an A name, or both.

For example,


   //Actor is a supertype of Butler

   //create a Butler object
   //and refer to it with a Butler reference
   Butler jeeves = new Butler();

   //create another reference to the same object
   //and the type of that reference variable will be Actor
   Actor genericActor = jeeves;

Here we've created one object. It's a Butler. All Butlers are also Actors. We've created one name, jeeves, for the object and its type is Butler. Then we create another name for the same object, genericActor, but this time its type is Actor. We can refer to the one object using either of those two names. Each name is a different view on the same object.

The following, however, is illegal:


   //Actor is a supertype of Butler

   //create an Actor object
   //and refer to it with an Actor reference
   Actor genericActor = new Actor();

   //create another reference to the same object
   //and the type of that reference variable will be Butler
   Butler jeeves = genericActor; //illegal: it isn't a Butler

Just because Actor is a superclass of Butler doesn't mean that an actor must be a butler. Actors can have lots of roles, not just butlers. Certainly though, every butler is most definitely an actor.
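
Both directions can be tried out in a small sketch. The Actor and Butler classes here are hypothetical, as are the announce() method and the helper downcastSucceeds(). The sketch also goes one step past the text: if we insist on the downward assignment by writing an explicit (Butler) cast, the code compiles, but the Java interpreter checks the object when the cast executes and refuses with a ClassCastException if the object isn't really a Butler.

```java
// Hypothetical supertype and subtype.
class Actor {
}

class Butler extends Actor {
    String announce() {
        return "Dinner is served";
    }
}

class CastingDemo {
    // Reports whether the object behind the Actor name is really a Butler,
    // by attempting the explicit downcast.
    static boolean downcastSucceeds(Actor actor) {
        try {
            Butler butler = (Butler) actor;  // compiles; checked when executed
            return butler != null;
        } catch (ClassCastException e) {
            return false;                    // the object wasn't a Butler
        }
    }

    public static void main(String[] args) {
        Butler jeeves = new Butler();
        Actor genericActor = jeeves;                  // upcast: always legal

        System.out.println(genericActor == jeeves);   // true: one object
        System.out.println(downcastSucceeds(genericActor));  // true
        System.out.println(downcastSucceeds(new Actor()));   // false
    }
}
```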
 
Types and Subtypes

Reference casting is different from creating one object and two reference variables pointing to the same object (two aliases) as we've done before. We still have that but we also now have that the types of the reference variables can be different. They point to the same object, yes, but they can point to different parts of it.

Consequently, we can use one actor in several roles. Even though so far it's looked like each actor follows only one role at a time, we now know that each actor can follow several different roles simultaneously. Every ancestor class of the actor's class defines a different role that actor can follow, and the actor can follow all of them at any time.

We cannot, however, cast between two types neither of which is a subtype of the other. So if A extends B and C extends B, we can't cast back and forth between A and C. We can, however, cast either of them up to their ancestor type B.

Class A may define methods that no class C object would know how to execute. But since A and C are both B (that is, they both know how to execute every method that B defines) then we can always cast objects of either class to the type B. This would be true even if A and C were much more distant descendants, as long as there was a chain of types all the way back to B. Consequently, we can create an array of B and put B objects in it, or A objects, or C objects. All of them are B objects of one subtype or another.
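The sibling rule can be sketched in a few lines. The classes below are hypothetical stand-ins named after the letters above (B is the common ancestor; A and C each extend it), and the method names describe() and onlyAKnowsThis() are assumptions made up for this illustration.

```java
// A sketch only: A, B, and C are hypothetical classes named after the letters in the text.
class SiblingDemo {
    // Asks every element to describe itself through a B reference.
    static String describeAll(B[] things) {
        StringBuilder result = new StringBuilder();
        for (B thing : things) {
            result.append(thing.describe()).append(' ');
        }
        return result.toString().trim();
    }

    public static void main(String[] args) {
        // An array of B can hold B objects, A objects, and C objects.
        B[] things = { new B(), new A(), new C() };
        System.out.println(describeAll(things));   // prints "B A C"

        // A a = (A) new C();   // would not compile: A and C are unrelated siblings
    }
}

class B {
    String describe() { return "B"; }
}

class A extends B {
    @Override String describe() { return "A"; }
    void onlyAKnowsThis() { }   // no C object could execute this method
}

class C extends B {
    @Override String describe() { return "C"; }
}
```

Every element is reached through a B reference, so inside describeAll() we can ask only for what B promises, even though each object answers in its own way.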
 
Types of Types

Each way of defining types carries different meaning for the Java interpreter as opposed to other objects in the program. Since the Java interpreter runs the whole show, it must know everything that every type promises, including everything the type promises itself (its private variables and methods). Other objects, however, should know only each type's external promises: its public, protected, or package promises.

Each class of objects may have an interest in a different subset of these promises. Objects with no relation to the class at all can only rely on its public promises. Objects of subclasses, whether in the same package or not, can rely on all of its protected and public promises. Objects of classes in the same package, whether subclasses or not, can rely on all of its protected, package, and public promises. Objects of the class itself can rely on all its promises.

Package access and protected access are similar. The difference is that a subclass can always access a protected variable or method in its superclass, whether the superclass is in the same package or not. If the variable or method had package access (the default), subclasses in different packages would not be able to access it. Classes in the same package that don't extend the class can access its package-access variables and methods, but classes in different packages cannot, even if they are subclasses of the class.
 
The instanceof operator

We can check whether an object is of a particular type using the instanceof operator. The boolean expression:


   [reference variable] instanceof [class or interface]

is true if the object pointed to by the named reference variable is an object of the named type, or if it belongs to a subtype of the named type. This also works if the named type is an interface, in which case the instanceof expression is true if the object belongs to a class that implements the named interface, or if it belongs to a subclass of that class.

So the code:


   DimmerControlledLamp kandinsky = new DimmerControlledLamp();
   if (kandinsky instanceof Lamp)
      System.out.println("kandinsky is a Lamp");

prints that kandinsky is a Lamp object. Similarly, if either class Lamp, or DimmerControlledLamp, implemented the Luminous interface then the code:

   DimmerControlledLamp kandinsky = new DimmerControlledLamp();
   if (kandinsky instanceof Luminous)
      System.out.println("kandinsky is a Luminous");

prints that kandinsky is a Luminous object.
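To try both tests in one place, here is a minimal compilable sketch. The empty Luminous, Lamp, and DimmerControlledLamp declarations are bare stand-ins for the chapter's types, assuming Lamp is the one that implements Luminous; the typesOf() helper is invented for this illustration.

```java
// A sketch only: empty stand-ins for the chapter's types, assuming Lamp implements Luminous.
class InstanceofDemo {
    // Lists which of the three types the referenced object belongs to.
    static String typesOf(Object thing) {
        StringBuilder found = new StringBuilder();
        if (thing instanceof Luminous) found.append("Luminous ");
        if (thing instanceof Lamp) found.append("Lamp ");
        if (thing instanceof DimmerControlledLamp) found.append("DimmerControlledLamp ");
        return found.toString().trim();
    }

    public static void main(String[] args) {
        // A subclass object passes the test for its class, its superclass,
        // and any interface its superclass implements.
        System.out.println(typesOf(new DimmerControlledLamp()));
        // prints "Luminous Lamp DimmerControlledLamp"

        // A plain Lamp is a Lamp and a Luminous, but not a DimmerControlledLamp.
        System.out.println(typesOf(new Lamp()));   // prints "Luminous Lamp"

        // An unrelated object passes none of the tests.
        System.out.println(typesOf("a string").isEmpty());   // prints "true"
    }
}

interface Luminous { }

class Lamp implements Luminous { }

class DimmerControlledLamp extends Lamp { }
```

A null reference points to no object at all, so instanceof is false for it whatever the named type.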
 
Type Diagrams

Take all the types (concrete classes, abstract classes, and interfaces) in any Java program and draw a set of rectangles each representing a different type. Connect those rectangles with arrowed lines whenever one type descends directly from another type (either by extending or implementing). Draw the descendant rectangle lower than the ancestor rectangle and let the arrow point from the ancestor to the descendant. This produces a diagram of rectangles and lines with class Object at the top and all non-extended classes and non-implemented interfaces at the bottom.

Some interfaces used in the program would also either be at the top along with class Object or they would be somewhere off to the side. Those interfaces off to the side would either extend from some other interface, or interfaces, or they would not have any ancestors at all.

Any rectangle could have many descendants and many ancestors but it isn't possible for any arrowed line to point up the page; all of them must point down the page. Subtyping means "special case of", so there's only one direction an arrow can go in. If type A is a subtype of type B then type B cannot also be a subtype of type A. One can be a special case of the other; they can't both be special cases of each other.

Every type in the diagram is either a descendant of some other type, or types, in the diagram, or it will have no ancestors at all. There's no way for a type to be one of its own ancestors. We can't have type A being a supertype of type B and type B being a supertype of type C, and so on, and end up with type Z being a supertype of type A. If that were possible then type A would be a subtype of itself.

By reference casting we move up or down this diagram. We can always move up (that is, go from a type to any of its supertypes) but we can't always move down (that is, go from a type to one of its subtypes). We can always do so, however, when the object the reference is pointing to is an object of the target subtype.

The instanceof operator tells us whether an object belongs to a particular type, no matter where that type is on the type diagram. So we can always use it to predict whether we can do any particular cast, whether up or down. Since we don't need it when we're upcasting, it's only useful when we're downcasting.
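As a rough sketch of that prediction, the code below compares what instanceof says with what happens when we attempt the downcast anyway. The two classes are empty stand-ins for the chapter's lamps and blindDowncastWorks() is a made-up helper; the one fact relied on is that a failed downcast throws a ClassCastException at run time.

```java
// A sketch only: empty stand-ins for the chapter's Lamp and DimmerControlledLamp.
class DowncastDemo {
    // Attempts the downcast without checking first and reports whether it was accepted.
    static boolean blindDowncastWorks(Lamp lamp) {
        try {
            DimmerControlledLamp dimmable = (DimmerControlledLamp) lamp;
            return dimmable != null;
        } catch (ClassCastException rejected) {
            // the object was not a DimmerControlledLamp after all
            return false;
        }
    }

    public static void main(String[] args) {
        Lamp plain = new Lamp();
        Lamp dimmable = new DimmerControlledLamp();   // upcast: no cast needed

        // instanceof predicts the outcome of each downcast
        System.out.println(plain instanceof DimmerControlledLamp);      // prints "false"
        System.out.println(blindDowncastWorks(plain));                  // prints "false"
        System.out.println(dimmable instanceof DimmerControlledLamp);   // prints "true"
        System.out.println(blindDowncastWorks(dimmable));               // prints "true"
    }
}

class Lamp { }

class DimmerControlledLamp extends Lamp { }
```

Testing with instanceof first means we never have to catch the exception at all.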
 
Reference Casting II: Revenge of the SuperHeroes

Suppose we have a class hierarchy descending from Object with first subclass Swashbuckler, which has a subclass RobinHood, which has a subclass ScarletPimpernel, which has a subclass Zorro, which has subclasses IndianaJones, Batman, Superman, and LoneRanger.

Class Zorro is a subtype of class ScarletPimpernel, which is a subtype of RobinHood, which is a subtype of Swashbuckler. So we can happily cast any references to an object of class Zorro up to either ScarletPimpernel, RobinHood, or Swashbuckler---or even Object, since Swashbuckler secretly is a subtype of Object just like every other Java class.

All Swashbuckler objects (that is, any object created from class Swashbuckler or any of its subclasses) know how to "swashbuckle" (that is, they understand how to rescueMaiden() and rightWrongs()). Similarly, any object created from subclass RobinHood, or any of its subclasses, knows how to romanceMaiden(), how to stealFromTheRich(), and so on.

Once all that code is in place, another programmer could use the type hierarchy. For example:


public void swash(Swashbuckler person)
   {
   //the Java interpreter won't let us pass in an object
   //that wasn't at least of type Swashbuckler,
   //so inside this method we know that person references
   //an object of type at least Swashbuckler.

   //do Swashbuckler stuff with the object referenced
   //by the reference variable person

   person.rescueMaiden();
   person.rightWrongs();
   System.out.println(person.toString());

   //now find out whether person references,
   //not just a Swashbuckler object, but a
   //subtype---the ScarletPimpernel subtype

   if (person instanceof ScarletPimpernel)
      {
      //inside this block we know that person
      //points to an object at least of type ScarletPimpernel
      //(or any of its subtypes) so we can create a new
      //reference variable to that object and make the type
      //of the new reference be ScarletPimpernel

      ScarletPimpernel sirPercy = (ScarletPimpernel) person;

      //this cast does not alter the object,
      //all it changes is the name and type of the
      //reference variable we're using to point to that object

      //now do ScarletPimpernel stuff with the object
      //pointed to by person, which is now also pointed to
      //by the new reference variable sirPercy
      sirPercy.donDisguise();
      sirPercy.rescuePrisoners();
      System.out.println(sirPercy.toString());
      }

   //now see if person references an even more
   //specific subtype of objects---namely, objects of
   //subtype Zorro, or any of its subtypes

   if (person instanceof Zorro)
      {
      //inside this block we know that person
      //points to an object at least of type Zorro
      //(or any of its subtypes) so we can create a new
      //reference variable to that object and make the type of
      //the reference be Zorro:

      Zorro donDiego = (Zorro) person;

      //now we have one object pointed to by person, sirPercy,
      //and the new reference variable donDiego. however,
      //the reference variable sirPercy is now out of scope
      //so we can't use it here, but we could use person
      //since it's still in scope

      //do Zorro stuff with donDiego as the reference
      donDiego.rideToronado();
      donDiego.writeZs();
      System.out.println(donDiego.toString());
      }

   //now let's view the object that person references
   //as a plain old Object, rather than a Swashbuckler,
   //or any of its subtypes

   Object object = (Object) person;
   System.out.println(object.toString());

   //throughout the above shenanigans,
   //we have only had one object;
   //all we have done is create new references to it
   //so all the toString() requests will print the same thing,
   //since they're all requests to the same object.
   //the toString() method, unless overridden in one of the
   //subtypes, comes all the way down from class Object.

   //also, each of the above casts is safe at run time.
   //the upcast to Object needs no explicit cast at all;
   //the two downcasts must keep theirs, or the code won't compile
   }

Upcasting always works. Downcasting works when the object we thought was just a plain old Swashbuckler is much more specialized. It's not just any old Swashbuckler, it's a Zorro (or a RobinHood, or a ScarletPimpernel, or whatever subtype of Swashbuckler we're trying to downcast to).

So the above method would work if we sent in a Swashbuckler, RobinHood, ScarletPimpernel, or Zorro object. It would also work if we sent in an IndianaJones, Batman, Superman, or LoneRanger object. All such objects have type Swashbuckler.
 
Accessing Variables versus Executing Methods

The Java interpreter treats variables and methods differently. When we ask an object to access a variable, the variable we gain access to (assuming it is accessible from that point) depends on the type of the reference variable we use to name the object. When we ask an object to execute a method, however, the type of the reference variable we use to name the object almost never matters.

Only when we use super does the type of the reference affect the version of the method we execute. For every other reference, no matter its type, the search for the method starts from the lowest level of the object and works its way up. When we use super, the search starts one level above the class in which the super call is written and works its way up from there.
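The difference is easiest to see with one object and two references of different types. In this sketch BasicLamp and DimLamp are hypothetical classes (not the chapter's Lamp classes), each declaring a variable called label and a method called describe().

```java
// A sketch only: BasicLamp and DimLamp are hypothetical classes invented for this example.
class DispatchDemo {
    public static void main(String[] args) {
        DimLamp dimLamp = new DimLamp();
        BasicLamp sameObject = dimLamp;   // second reference to the same object

        // variables: the type of the reference decides which one we get
        System.out.println(dimLamp.label);      // prints "dim field"
        System.out.println(sameObject.label);   // prints "basic field"

        // methods: the object decides, whatever the type of the reference
        System.out.println(dimLamp.describe());      // prints "dim method"
        System.out.println(sameObject.describe());   // prints "dim method"

        // super is the exception: it starts the search one level up
        System.out.println(dimLamp.describeParent());   // prints "basic method"
    }
}

class BasicLamp {
    String label = "basic field";
    String describe() { return "basic method"; }
}

class DimLamp extends BasicLamp {
    String label = "dim field";   // hides BasicLamp's label
    @Override String describe() { return "dim method"; }
    String describeParent() { return super.describe(); }
}
```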
 
Climbing the Scope Ladder

Recall the statement from one of the constructors for class Lamp:


   this.wattage = wattage;

The left-hand side specifies the wattage the lamp keeps, and the right-hand side specifies the wattage that was passed into the constructor. The statement thus says to copy the value of the parameter passed into the constructor into the lamp's wattage variable.

Every local variable hides every global variable of the same name. This is true inside methods or constructors, but it's also true (in a broader sense) from one portion of an object to an ancestor portion. The ancestor portion has greater scope, and the more ancient the ancestor the broader the scope. We can access its variables from the "local" or lowest-level, most-specific portion, but if we declare a variable with the same name as an ancestor's variable in the most-specific portion, it hides the ancestor's version. We can, however, get at the ancestor's version of the variable with super or a cast.

The type of the reference makes no difference when it comes to methods, but it does for variables. The type of the reference determines the variable access, but every variable also has a scope. The scope of variables in ancestor classes contains all variables in any descendant classes. So if a reference variable points to a particular portion of the object then that's where the search for the variable starts but if it doesn't find the variable there it scans up the type hierarchy just as for methods.

When we declare a variable private in a superclass, we can't access it even by creating a subclass object and creating a reference to that portion of the object. Although normally it would be within scope, declaring it private strictly limits its scope precisely to methods in the class alone. There's no way for any descendants to get at it.
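Here is a small sketch of climbing that ladder from inside a subclass. Fixture and Bulb are hypothetical classes invented for this example (they are not the chapter's Lamp classes), and the wattage values and serial number are arbitrary.

```java
// A sketch only: Fixture and Bulb are hypothetical classes invented for this example.
class ScopeDemo {
    public static void main(String[] args) {
        Bulb bulb = new Bulb(100);
        System.out.println(bulb.ownWattage());            // prints "100"
        System.out.println(bulb.ancestorViaSuper());      // prints "60"
        System.out.println(bulb.ancestorViaCast());       // prints "60"
        System.out.println(bulb.reportSerialNumber());    // prints "12345"
    }
}

class Fixture {
    protected int wattage = 60;
    private int serialNumber = 12345;   // visible only inside Fixture

    int reportSerialNumber() { return serialNumber; }
}

class Bulb extends Fixture {
    protected int wattage;   // hides Fixture's wattage

    Bulb(int wattage) {
        // the parameter hides the variable, so we disambiguate with this
        this.wattage = wattage;
    }

    int ownWattage() { return wattage; }
    int ancestorViaSuper() { return super.wattage; }
    int ancestorViaCast() { return ((Fixture) this).wattage; }

    // int peek() { return super.serialNumber; }   // would not compile: private
}
```

The private serialNumber is still part of every Bulb object; the Bulb portion can reach it only by asking a Fixture method such as reportSerialNumber().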
 
Nameless Actors

We can cast a reference to a subtype we know the object belongs to (perhaps because we're inside an instanceof test) and use that temporary reference directly. We don't have to create a temporary reference variable. For example, suppose the type of the reference variable value is Object but we know that it actually points to a String object. We can then do the following:


   ((String) value).startsWith("a")

This statement takes the reference variable value (which presently has type Object) and downcasts it to a nameless reference variable of type String. Then it asks the object that variable references to execute the startsWith() method (which all String objects understand).

We haven't altered the object that value points to. We've just accessed some more of its functionality because we happen to know (thanks to an instanceof test, say) that it's not just any old Object, it is a special subtype, it's a String.

The above statement is the same as the two statements:


   String temporaryStringReference = (String) value;
   temporaryStringReference.startsWith("a");

which is the same as the three statements:

   String temporaryStringReference;
   temporaryStringReference = (String) value;
   temporaryStringReference.startsWith("a");

We don't have to name and create a temporary reference variable that we will only need for one statement.

Similarly, the statement


   (new Lamp()).turnOn();

creates a lamp and asks it to turnOn(). In this case, we never name the lamp at all. In theater terms, some actors can be so incidental that they never have character names, even though they appear in the play. They're bit-players, or walk-ons, or extras.
 
Choosing Good Variable Names

Once upon a time, programming meant keypunching cards. In those ancient days it was an insane effort to change even a single line of code because it meant the programmer had to repunch the entire card (not to mention not having such a thing as "backspace").

Mistyping a variable wasn't the trivial matter it is today. Back then even the tiniest change meant having to resubmit the job and wait half an hour (or even days sometimes). So the earliest coding style used simple and short variable names like i and j and x and y.

No one thought much about it because in those antediluvian days pretty much all the books also used that style of naming variables. Nowadays, however, decades later, good stylists name their variables numberOfRows, partialTotal, transformedMatrix, and things like that. Why the difference?

Well, to bring meaning to largely meaningless variable names like i and x and so forth, conscientious programmers regularly found themselves writing the following kinds of comments:

   //multiply the number of rows by the default number of columns
   i = j * 15;

   //transform each pixel using the new colormap
   for (int k = 0; k < i; k++)

This is just plain stupid. It was mostly unavoidable, though, because most of the early languages placed strict limits on the lengths of names. The Java interpreter, however, places no such limits on us, so we should instead write our variables like this:

   colormapSize = numberOfRows * 15;

   for (int pixel = 0; pixel < colormapSize; pixel++)

To name the variables any less meaningfully would be a style crime.

The Java interpreter doesn't care what the names are, and while it may take more time to type more meaningful names, it takes less time overall because:

* We don't have to write those pointless explanatory comments anymore since the meaning of the code is clear from the code itself.
 
* We're much less likely to make an error when variable names are this explicit; and that has incalculable benefit when it comes to preventing bugs.
 
* Other programmers can follow our code much more easily, so the code is easier to share and to modify.

 
Naming All Constants

Look at that magic number 15 in the above piece of code. This is a style crime because it makes the program less flexible and because we can easily forget what the number means six months after we first wrote it. Is it the 15 that stands for the default number of columns, or is it the 15 that stands for the number of pictures to display? Oops.

Instead, we should make it into a named constant, as follows:

   public static final int DEFAULT_NUMBER_OF_COLUMNS = 15;

   colormapSize = numberOfRows * DEFAULT_NUMBER_OF_COLUMNS;
   for (int pixel = 0; pixel < colormapSize; pixel++)

The line length has gone up, but the code is much cleaner, much easier to read, and much easier to modify than the original. Further, the line length really isn't that much more when we count the silly comments the first version needed simply to be intelligible.
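To show where such a constant might live, here is one possible complete arrangement. The class name ColormapDemo and the method colormapSize() are made up for this sketch; only the constant and the variable names come from the fragment above.

```java
// A sketch only: ColormapDemo and colormapSize() are hypothetical names.
class ColormapDemo {
    // declared once, at class level, so every method can share it
    static final int DEFAULT_NUMBER_OF_COLUMNS = 15;

    static int colormapSize(int numberOfRows) {
        return numberOfRows * DEFAULT_NUMBER_OF_COLUMNS;
    }

    public static void main(String[] args) {
        int numberOfRows = 4;
        int colormapSize = colormapSize(numberOfRows);

        int pixelsVisited = 0;
        for (int pixel = 0; pixel < colormapSize; pixel++) {
            pixelsVisited++;   // stand-in for the real per-pixel work
        }
        System.out.println(pixelsVisited);   // prints "60"
    }
}
```

If the default ever changes, we edit one line and every use of the constant follows.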

This little lesson extends to methods, classes, abstract classes, interfaces, and packages. We shouldn't name classes MyClass or Display; we should name them descriptively: ParenthesizedExtractor, AnimationSequenceController, and WebPageParser.
 
Conventional Names

It's good style for all variable, method, class, interface, and package names to follow this naming scheme:
aaaa.aaaaa... for packages,
AaaaAaaa... for classes and interfaces,
aaaaAaaa... for methods and variables,
AAAA_AAAA_... for constants.

We shouldn't invent another name for a method parameter simply because the obvious choice clashes with a global variable. We should use exactly the same name and disambiguate the two of them with this. We should choose each name with great care; it should capture the essence of the variable. We shouldn't undo all that work by making up fake variants of the name as well.

We should avoid using negative boolean variable names; we can use the positive version instead. For example, instead of notReady, notFull, and notBuffered we can use isReady, full, and buffered. People have a lot of trouble with negatives. (It's not hard not to see why not.) We shouldn't add unnecessary cognitive complexity to our code.

We should make boolean variable names predicative verbs, that is, they should make sense when prefixed with 'if' (done, isReady, isHighSpeed). Other variable names should be nouns (pixelSetting, characterBuffer, frame).

Variable accessor methods should be prefixed with 'get' and 'set', and variable query methods should be prefixed with 'is' (getInstance(), setBuffer(), isValid()). Other method names should be verbs or verb phrases (initializeScreen(), createBuffer(), run()).

Interface names should be adjectives (Runnable, Cloneable, Comparable). Class names should be nouns or noun phrases (DoubleBufferedApplet, WidgetFactory, Component).
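Put together, the conventions might look something like the following sketch. Dimmable and DimmerSwitch are invented for this illustration and are not part of the chapter's lamp program.

```java
// A sketch only: Dimmable and DimmerSwitch are hypothetical types invented for this example.
class NamingDemo {
    public static void main(String[] args) {
        DimmerSwitch dimmerSwitch = new DimmerSwitch();
        dimmerSwitch.setLevel(25);   // clamped to MAXIMUM_LEVEL
        System.out.println(dimmerSwitch.getLevel());   // prints "10"
        System.out.println(dimmerSwitch.isOn());       // prints "true"
        dimmerSwitch.turnOff();
        System.out.println(dimmerSwitch.isOn());       // prints "false"
    }
}

// interface name: an adjective
interface Dimmable {
    int getLevel();
    void setLevel(int level);
}

// class name: a noun phrase
class DimmerSwitch implements Dimmable {
    // constant: AAAA_AAAA
    static final int MAXIMUM_LEVEL = 10;

    // variables: a noun, and a positive boolean
    private int level;
    private boolean on;

    // accessors: 'get' and 'set'; the parameter keeps the obvious name
    public int getLevel() { return level; }

    public void setLevel(int level) {
        this.level = Math.max(0, Math.min(level, MAXIMUM_LEVEL));
        this.on = this.level > 0;
    }

    // query method: 'is'
    public boolean isOn() { return on; }

    // any other method: a verb phrase
    public void turnOff() { setLevel(0); }
}
```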
 
The Keypuncher's Assumption

All the constraints that avoiding style crime forces on us might seem terribly onerous the first time we're exposed to them. It might seem much easier to program lackadaisically---choosing variable names with no real thought, making variables public so that we can get at them from any other class, having multiple access points to each variable, and so on. But we would pay for that short-term ease with long-term misery.

The stylistic constraints work together to make our programs easier to read, and so easier to understand, easier to debug, and easier to change. We must work hard so that our readers don't have to.

A lot of really bad style crime stems from the "keypuncher's assumption", the belief, going back decades, that we must minimize the time we take to create our programs by using short variable names, by not protecting our variables, and so on. Add to that the widespread belief that we must sacrifice everything else to get faster code, even if the program becomes completely incoherent. What these beliefs ignore is that the time to create any old program isn't important; what's important is the time it takes to develop a correct, readable, and long-lived program.

We all do things a certain way because we did them that way when we first learned how. Tying our shoes, for example; most of us still tie our shoes the way our mothers taught us. Those ways of doing things made sense to somebody at the time. Maybe long ago it was hard to do things some other way (as in the keypunching example), or maybe someone decided that it was more efficient, or maybe that was simply the way our teachers were themselves taught.

Unlike professionals in almost every other discipline, good programmers can't afford to be slack since the technology we use is changing constantly. Assumptions that were true five years ago are ancient history today. If we never reexamine why we're doing whatever it is that we're doing, we're going to be ancient history right along with it.

