B669: Personalized Data Mining and Mapping
Coding Rules



* All variables should be private, except those few that absolutely must be protected (for subclass access).
* The sole exception to the rule that no variable should be public is final static variables, since they are constants.
* All methods, except those in the public API of the class, should be private. Those to be used by subclasses must be protected (try to have as few of them as possible).
* Do not use package access (which, stupidly, is the default if you don't specify any access modifier) for variables or methods. Explicitly make everything private, public, or (if you must) protected.
* Minimize the use of static non-final variables. Class-globals are a Bad Thing.
* Minimize the use of instance-globals. They are a Bad Thing, Too. Instead, write your methods to explicitly import as many of them as possible as parameters, even though the methods are in the same class. This keeps your methods as encapsulated as possible.
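
As a rough illustration of the access rules above, here is a minimal sketch. The class Thermostat, its constant, and its clamp() helper are hypothetical names invented for this example; they are not part of any course code.

```java
// Hypothetical example class; all names are illustrative only.
final class Thermostat {
    // A constant: the one kind of variable that may be public.
    public static final double MAXIMUM_CELSIUS = 30.0;

    // Instance state stays private.
    private final double targetCelsius;

    public Thermostat(final double targetCelsius) {
        this.targetCelsius =
            Thermostat.clamp(targetCelsius, Thermostat.MAXIMUM_CELSIUS);
    }

    public double getTargetCelsius() {
        return this.targetCelsius;
    }

    // The helper receives everything it needs as parameters instead of
    // reading instance variables, so it can be understood in isolation.
    private static double clamp(final double value, final double maximum) {
        double result = value;

        if (value > maximum) {
            result = maximum;
        }

        return result;
    }
}
```

Because clamp() touches no instance variables, it can be read and tested without knowing anything else about the class.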


* All identifiers should be meaningful names. If a variable has no special meaning then don't be afraid to use its type as its name. For example:
private static DataFormat dataFormat
* No variable, method, class, interface, or package name should use jargony or obscure abbreviations (for example, obj, cls, and, most especially, Impl).
* All variable, method, class, interface, and package names should follow the standard Java naming scheme (never mind that Java already messed this up with instanceof and the names of all Color constants):
aaaaa... for packages,
AaaaAaaa... for classes and interfaces,
aaaaAaaa... for methods and variables,
AAAA_AAAA_... for constants.

* No variable name should have an underscore (unless it is a constant, in which case it must be in all uppercase).
* Do not try to invent another name for a parameter simply because the obvious choice clashes with a class or instance variable. Use precisely the same name and disambiguate the two of them with "this". (A name should be chosen with great care; it should capture the essence of the variable. Don't undo all that work by making up fake variants of the name as well.)
* Do not give the same name to a method and to a variable.
* Don't name a thread reference runner or kicker. Call it what it is: thread.
* Avoid using negative boolean variable names; use the positive version instead. For example, instead of notReady, notFull and notBuffered use isReady, full, and buffered.
* Boolean variable names should be predicative verbs (that is, they should make sense when prefixed with 'if')
(done, isReady, isHighSpeed);
other variable names should be nouns
(pixelSetting, characterBuffer, frame);
accessor methods should be prefixed with 'get' and 'set', and query methods should be prefixed with 'is'
(getInstance(), setBuffer(), isValid());
other method names should be verbs or adverbial phrases
(initializeScreen(), createBuffer(), run());
interface names should be adjectives
(Runnable, Cloneable, Observable);
class names should be nouns or noun phrases
(DoubleBufferedApplet, WidgetFactory, Component)
exceptions should end with the word "Exception"
(BufferCorruptionException, TooSmallToSpawnException).
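
The naming rules above could be applied along these lines. This is only a sketch; PixelSetting and its members are hypothetical and chosen just to show a noun class name, get/set/is prefixes, a verb method, and a parameter that shares its name with the instance variable it sets.

```java
// Hypothetical class used only to illustrate the naming rules.
final class PixelSetting {
    private int brightness;
    private boolean highlighted;

    public PixelSetting(final int brightness) {
        // Same name as the instance variable, disambiguated with "this".
        this.brightness = brightness;
        this.highlighted = false;
    }

    // Accessors are prefixed with "get" and "set".
    public int getBrightness() {
        return this.brightness;
    }

    public void setBrightness(final int brightness) {
        this.brightness = brightness;
    }

    // Query methods are prefixed with "is" and phrased positively.
    public boolean isHighlighted() {
        return this.highlighted;
    }

    // Other methods are verbs.
    public void highlight() {
        this.highlighted = true;
    }
}
```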


* Of the primitive types, you should only use boolean, int, long, and double. Do not use byte, char, short, and float. (Unfortunately we can't get rid of int in favor of long thanks to two bad Java design decisions---reads and writes of non-volatile long variables aren't guaranteed to be atomic across threads, and longs can't be used as array indexes or to control switch statements.)
* Do not use the ternary operator.
* Do not use the binary shift operators.
* Do not use the bitwise operators. (Unless you really must.)
* Do not compare Strings with ==.
* Always use braces to indicate scope in a do-while statement.
* Try to rewrite do-while statements into while statements if you can do so without thoroughly mangling your code. The fewer different types of statements you use, the lower the chance that you will confuse yourself, or your reader (and, no, I'm not talking about the compiler).
* Fully parenthesize every expression containing multiple operators. Don't rely on complex precedence rules to produce the correct parenthesization. Either you or your reader, or both, will get it wrong.
* Avoid using labels, break (outside of switch statements), and continue.
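
To make the String comparison rule above concrete, here is a small sketch. The class StringComparison and the method hasSameText() are hypothetical names; the point is only that equals() compares contents, whereas == compares references.

```java
// Hypothetical helper class; the names are illustrative only.
final class StringComparison {
    private StringComparison() {
    }

    // Compares the characters of the two Strings. Using == here would
    // only ask whether both references point at the same object.
    public static boolean hasSameText(final String first, final String second) {
        return ((first != null) && first.equals(second));
    }
}
```

A String built at run time (for example with new String("java")) is a different object from the literal "java", so == reports false for the pair even though equals() reports true.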

Switch statements

* Do not use "fall-through" cases in switch statements. If the code being shared is significant then make it into a separate method---even if it's only a line or two.
* Always provide a default case for switch statements. If you don't expect it to ever be fired, write code to report that it did fire (and therefore that there is a bug).
* Always put the default case last in the switch statement.
* Always include a break statement for the default case, even if it is last in the list of cases. Future programmers may rearrange the cases and not notice the missing break.
* If you find yourself writing a switch statement to do what can be done with polymorphic classes or overloaded methods (especially if you find yourself writing a lot of instanceof tests to check for every subtype of a type), you are still asleep in C-land. Wake up and smell the Java. Use polymorphism instead.
* Avoid switch statements; they are a fertile home for bugs. All programs can be written using only if, for, and while for flow control.
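
The advice to replace switch statements and instanceof chains with polymorphism might look like the following sketch. Measurable, Square, Rectangle, and AreaCalculator are hypothetical types made up for this illustration.

```java
// Hypothetical types; each shape knows how to compute its own area.
interface Measurable {
    double getArea();
}

final class Square implements Measurable {
    private final double side;

    public Square(final double side) {
        this.side = side;
    }

    public double getArea() {
        return (this.side * this.side);
    }
}

final class Rectangle implements Measurable {
    private final double width;
    private final double height;

    public Rectangle(final double width, final double height) {
        this.width = width;
        this.height = height;
    }

    public double getArea() {
        return (this.width * this.height);
    }
}

final class AreaCalculator {
    private AreaCalculator() {
    }

    // No switch and no instanceof: the call is dispatched to the
    // right getArea() by the object's own class.
    public static double sumAreas(final Measurable[] shapes) {
        double total = 0.0;

        for (int index = 0; index < shapes.length; index++) {
            total = total + shapes[index].getArea();
        }

        return total;
    }
}
```

Adding a new shape then means adding one class, not hunting down every switch that enumerates the shapes.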


* Do not try to make a method do more than one well-defined task; its name should mirror that task. If you cannot name a method easily then either you don't know what you're doing and should stop for thought, or it is doing too much---break it up.
* If a method is longer than about half a page (including comments), it's probably too long and should be broken up into several methods.
* If a method requires more than about 3 or 4 unrelated parameters it is probably too big; break it up.
* Do not avoid one-line methods on the grounds of "efficiency." A one-line method is perfectly reasonable if it allows you to clean up some other methods that use it.
* Make as many method parameters as you can final.
* Make as many local variables as you can final.
* Each method should have at most one return statement (unless it would seriously harm the method's logic to do so).
* Methods should never contain 'magic numbers' aside from 0, 1, and possibly 2; all constants should be safely hidden in final static variables.
* Never use 0 and 1 when you really mean true and false.
* The last statement in a finalize method should always be: super.finalize().
* If you override equals() you should also override hashCode(), so that objects that test equal also produce equal hash codes.
* If you override equals() give some thought to what you'll do about checking whether the superclass portion of the objects will test equal as well.
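
As an illustration of overriding equals() with care, here is a minimal sketch of an equals() override together with a hashCode() that stays consistent with it. GridPoint and HASH_MULTIPLIER are hypothetical names; the class is final, so there is no superclass or subclass state to compare in this simple case.

```java
// Hypothetical value class; the names are illustrative only.
final class GridPoint {
    private static final int HASH_MULTIPLIER = 31;

    private final int column;
    private final int row;

    public GridPoint(final int column, final int row) {
        this.column = column;
        this.row = row;
    }

    @Override
    public boolean equals(final Object other) {
        boolean isEqual = false;

        if (other instanceof GridPoint) {
            final GridPoint point = (GridPoint) other;
            isEqual = ((this.column == point.column) && (this.row == point.row));
        }

        return isEqual;
    }

    // Built from the same variables that equals() compares, so equal
    // points always produce equal hash codes.
    @Override
    public int hashCode() {
        return ((GridPoint.HASH_MULTIPLIER * this.column) + this.row);
    }
}
```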

Methods and Variables

* Class variables and methods should always be invoked through their class, not through an instance. Note that using nothing at all is equivalent to access through an instance, since "this" is automatically added as a prefix.


* (Almost always) provide at least one constructor and make sure that it initializes instance variables to a consistent state.
* (Almost always) provide a zero-argument constructor if you also provide non-zero-argument constructors. Otherwise someone may subclass your class one day and end up with a very obscure compiler error. (Java adds an empty public zero-argument constructor if you don't have any constructors at all to partly avoid this problem; but it doesn't do a complete job.)
* If you have multiple constructors, avoid code duplication (and subsequent maintenance nightmares) by making one of them the setter of all variables and pass special cases to it from the others; or create a special private method to handle initializations and call it from each constructor.
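
One way to follow the multiple-constructor rule above is to chain constructors with this(...). The sketch below assumes a hypothetical Countdown class with made-up defaults; only one constructor actually assigns the instance variables.

```java
// Hypothetical class; names and default values are illustrative only.
final class Countdown {
    private static final int DEFAULT_START = 10;
    private static final String DEFAULT_LABEL = "countdown";

    private final int start;
    private final String label;

    // The one constructor that sets every variable.
    public Countdown(final int start, final String label) {
        this.start = start;
        this.label = label;
    }

    // Special cases pass their defaults along instead of repeating
    // the assignments.
    public Countdown(final int start) {
        this(start, Countdown.DEFAULT_LABEL);
    }

    public Countdown() {
        this(Countdown.DEFAULT_START);
    }

    public int getStart() {
        return this.start;
    }

    public String getLabel() {
        return this.label;
    }
}
```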


* Do not try to make a class do more than one well-defined task; its name should mirror that task. If you cannot name a class easily, or if you cannot describe its purpose in one short declarative sentence, it is doing too much; break it up.
* If a class has more than about a page's worth of method descriptions it's probably too big and should be broken into several classes.
* Do not subclass standard classes (Thread, String, Vector, Hashtable, etc); use composition rather than inheritance (that is, use a Vector, don't be a Vector; implement Runnable, don't be a Thread).
* Do not reinvent the wheel: use the standard collection classes already provided in Java rather than rolling your own (unless you have an extremely demanding application).
* Avoid inheritance as much as possible. Use interfaces and composition and delegation instead.
* If you must inherit, try to make your superclasses as abstract as possible. For example, common but widely variable methods in subclasses should be made abstract in the superclass rather than be made into dummy methods; this forces subclasses to explicitly provide an implementation.
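
The "use a collection, don't be a collection" advice above could be sketched as follows. TaskQueue is a hypothetical class; it holds a standard ArrayList privately and exposes only the few operations its clients need, rather than inheriting the whole list API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: it HAS a list, it IS NOT a list.
final class TaskQueue {
    private final List<String> tasks = new ArrayList<String>();

    public void addTask(final String task) {
        this.tasks.add(task);
    }

    // Assumes the caller has checked isEmpty() first; on an empty
    // queue the underlying list throws IndexOutOfBoundsException.
    public String removeNextTask() {
        return this.tasks.remove(0);
    }

    public boolean isEmpty() {
        return this.tasks.isEmpty();
    }
}
```

Because the list is private, the collection type can later be swapped for another one without touching any client code.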

Files and Packages

* Put every class in its own file. Use explicit import statements to name collaborator classes.
* No import statement should use the global class descriptor (the '*'); all imports should be explicit. Importing more than 10 or so classes is a warning sign that your class is trying to do too much.
* Make sure that every imported class is actually used.
* Use packages to group related classes together, not to create a single umbrella for a bunch of unrelated classes (sole exception: a toolbox package).
* Write a README file for every package of classes to give the reader an overview of the package. (Unfortunately, Java has no mechanism in place for this common problem. Java's poor support of modules larger than classes is its weakest aspect.)


* Every class should start with a CRC: the name of the class, the responsibilities of the class (what the class does), and the collaborators of the class (which other classes the class must work with to do its job). If a CRC lists more than 2 or 3 collaborators it is doing too much; redesign your architecture. If a CRC responsibility cannot be expressed in a short declarative sentence the class is doing too much; break it up.
* At the top of every class there should be a description of its overall function, a list of all its public methods (that is, its API) together with descriptions of their function, and a list of all classes it collaborates with to accomplish its function together with descriptions of those classes.
* At the top of every method there should be a description of its function, any class-global or instance-global variables it depends on, and any special actions it takes (for example, any exceptions it throws).
* At the top of every non-trivial chunk of code within a method there should be a description of its overall function.
* Write comments as you code; each level should proceed in a stepwise manner. From topmost level down to the lowest and simplest line of code the reader should be able to read your comments only and fully understand your program without also having to read the code itself. A well-commented program is like a well-lit room; bugs hate it.
* Don't ever let the code and the comments disagree.
* In comments and other documentation, always refer to methods by appending parentheses to their names to show that they are methods.
* Use /** comments for class-level comments, /* for method-level comments, and // for declaration-level and statement-level comments.
* Don't make spelling or grammatical errors.


* Indent all scopes consistently.
* Comments should also obey the local scope.
* Indent statement continuations.
* Do not mix spaces and tabs in one line. Different people use different tab settings and that completely messes up your layout.
* Do not use tabs inside a line (that is, between two pieces of text---for example a declaration and a comment); use spaces instead. (See above.)
* Put a space after each comma in any statement (declarations, for loops, and comments).
* Put a space on each side of every binary operator, including assignments.
* Collect declarations of related variables together and use a group comment to describe them and their relationships to each other. Separate that group from other groups with blank lines.
* Separate declaration-only statements from executable statements by at least one blank line.
* Separate each related chunk of code from other chunks with blank lines. If there are repeating chunks, consider putting them in their own method.
* Separate each method's return statement from the rest of the method with a blank line.
* If you find yourself writing code nested deeper than about 3 levels, you've gone too deep and are about to drown. Rewrite to use a separate method or methods.
* Separate methods from each other either with several blank lines or with a blank line and a solid line (for example: "////////...").
* Within classes, place declarations before constructors before methods. Within methods, place declarations before executables except for temporary variables that should be declared in or near the loops that use them. Within declarations, place public before protected before private variables, unless that would hamper the organization of variables by relation or function, in which case go with relation instead.


* Every non-abstract class should have a main() method that contains test code for the rest of the class. Such test-code main() methods are also very useful for showing prospective users of the class how to use it. (Note that a class can contain constructors (or be an Applet) and still contain a main() method.)
* main() methods and major inter-class interface methods should always validate their input---even when being called from classes you wrote. One day someone else will modify your code and break the invariants by mistake.
* Use assertions to continually validate your methods' inputs.
* Use repeatable pseudo-random number sequences when debugging probabilistic code by using the same seed over and over.
* Don't ever catch all exceptions [with "catch (Exception exception)"] and then not handle them. More than anything else you could do this shows that you just got off the C boat, and have no idea how to program in Java.
* Do not place try blocks around every statement that can throw an exception. Place the entire block of code in a try block and handle all exceptions at the end of the block.
* Catch as many exceptions as you can and handle them in one block. Try to keep client classes from having to know about class-specific exceptions, and therefore from having to write exception-handling code, unless the exceptions are something the client classes must know about to do their job adequately.
* Create your own meaningfully named exceptions rather than using the generic exception classes. This makes your programs more self-documenting.
* Exceptions should be rare---not commonplace. (Exception! I can't find the disk drive! Exception! The machine is on fire!) Don't use exceptions to handle otherwise normal flow of control that you can easily predict (for example, end-of-file).
* When dealing with files, always use a finally clause to close them.
* Do not place exception-throwing code in a finally clause.
* Do not use stop(), suspend(), destroy(), or resume() on threads. They are all unsafe or deadlock-prone. Use flags instead.
* Do not use yield(); use a preemptive scheduler class to force preemption on all JVMs instead.
* Do not use notify(). Use notifyAll() instead and have each wait()ing thread check its condition to see if it is the one that should awaken.
* Embed all wait()s inside of whiles that test their wake up condition. Don't rely on an if test; the condition may change between the time the test was passed and the wait() was re-encountered.
* Synchronize all thread-unsafe methods even if you are presently using only one thread. One day you will add multithreading, forget all about the lack of safety, and blow your foot off.
* Do not fail to synchronize methods on the theory that data corruption "will never happen" and, besides, "synchronization makes things take longer." Subtle bugs make things take even longer.
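
The wait()/notifyAll() rules above can be sketched with a one-slot mailbox. Mailbox is a hypothetical class invented for this example; it assumes that a null message means "empty". Each wait() sits inside a while loop that re-tests its condition, and every state change calls notifyAll() rather than notify().

```java
// Hypothetical one-slot mailbox; a null message means the slot is empty.
final class Mailbox {
    private String message;

    public synchronized void put(final String message)
            throws InterruptedException {
        // Re-test the condition after every wake-up; another thread
        // may have refilled the slot in the meantime.
        while (this.message != null) {
            this.wait();
        }

        this.message = message;
        this.notifyAll();
    }

    public synchronized String take() throws InterruptedException {
        while (this.message == null) {
            this.wait();
        }

        final String taken = this.message;
        this.message = null;
        this.notifyAll();

        return taken;
    }
}
```

With notifyAll(), both waiting producers and waiting consumers wake up, and the while loops send back to sleep any thread whose condition still does not hold.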


* Do not use busy-waits. Use sleep() instead.
* Use multithreading to improve apparent responsiveness. If you have a long task, spin it off into a separate thread and return immediately for more processing.
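
Spinning a long task off into its own thread might look like the sketch below. SumTask and BackgroundSum are hypothetical names, and the summing loop merely stands in for real long-running work. Note that the task implements Runnable rather than subclassing Thread, and that the caller blocks in join() instead of busy-waiting on a flag.

```java
// Hypothetical task; the loop stands in for a genuinely long computation.
final class SumTask implements Runnable {
    private final int limit;
    private int sum;

    public SumTask(final int limit) {
        this.limit = limit;
        this.sum = 0;
    }

    public void run() {
        int total = 0;

        for (int number = 1; number <= this.limit; number++) {
            total = total + number;
        }

        this.sum = total;
    }

    public int getSum() {
        return this.sum;
    }
}

final class BackgroundSum {
    private BackgroundSum() {
    }

    public static int compute(final int limit) {
        final SumTask task = new SumTask(limit);
        final Thread thread = new Thread(task);

        // start() returns immediately; the caller is free to do other
        // work here while the task runs.
        thread.start();

        try {
            // Block (no busy-wait) until the result is ready.
            thread.join();
        } catch (InterruptedException exception) {
            // Preserve the interrupt for whoever called us.
            Thread.currentThread().interrupt();
        }

        return task.getSum();
    }
}
```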


* Optimization for amateurs: don't.
* Optimization for experts: not yet.
* Optimization for gurus: buy a faster machine.
* In these days of quite smart optimizing compilers, the only reasonable case where hand optimization might still be warranted is for some tricky cases of tail recursion. Do not use tail recursion instead of a simple while loop (unless the method becomes deeply inelegant without the recursion).
* Do not try to do Java's garbage collection for yourself, or to defeat it, or to reschedule it, or to alter the heap size, or monkey with any other system variable unless you really, really know what you're doing.
* Do not try to unravel a multiply-recursive method to an iterative one where you manage the stack yourself. That's the compiler's job, and if you try it you will either bungle it, get in the compiler's way, or horribly mangle your code---or all three.
* Do not do trivial "optimizations" like unrolling loops or creating temporary variables for constants inside loops. That's the compiler's job and it is quite good at it. (See above.)
* If you ever feel the urge to optimize, lie down. It will pass. If it doesn't pass, then first make sure your program is correct in all respects. Then profile your code. Then figure out why the hottest spots are so hot. Before performing trivial optimizations (and mangling your code) look for a better algorithm. Only if all else fails (and you've had another good long lie-down) and you really, really know what you're doing, should you try any real "optimizations".


* Do not embed your own system's specific constants into your programs (for example, do not assume that the parts of a file path must be separated by "/" or by "\" or by ":"). Use Java's System properties to make your code portable. That's what they're for.
* Read a good Java book. (Unfortunately, there are only two, possibly three.)
* Study the source code provided with the JDK or with almost any Java book. Laugh your head off. Then resolve never to produce such terrible code yourself.
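
As a small illustration of the portability rule above, the sketch below builds a path from the standard "file.separator" system property instead of a hard-coded "/" or "\". PathBuilder and join() are hypothetical names used only for this example.

```java
// Hypothetical helper; the names are illustrative only.
final class PathBuilder {
    private PathBuilder() {
    }

    // Asks the running JVM for its separator rather than assuming one.
    public static String join(final String directory, final String fileName) {
        final String separator = System.getProperty("file.separator");

        return (directory + separator + fileName);
    }
}
```

The same value is also available as the constant java.io.File.separator.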