Dave Yost - Java switch statements should allow Strings, not just ints

Java `switch` statements should allow `String`s, not just `int`s

(see Update at end)

From Arnold/Gosling 3rd Edition:

The switch expression must be of type char, byte, short, or int. All case labels must be constant expressions—the expressions must contain only literals or named constants initialized with constant expressions—and must be assignable to the type of the switch expression.

Proposal: the switch statement should be enhanced thus:

The switch expression must be of type char, byte, short, int, or String. All case labels must be constant expressions—the expressions must contain only literals or named constants initialized with constant expressions—and must be assignable to the type of the switch expression. If a String switch expression cannot be proven by static analysis to be interned, then the compiler will generate a call to the String's intern() method.

Rationale

Lisp and Scheme have a symbol type (equivalent to Java's interned Strings), which programmers are accustomed to using for selector constants because symbols speak for themselves when displayed in a debugger or a debug printout, whereas ints display as unexpressive numbers. The omission of String from the set of legal switch expressions in Java would seem to be an oversight born of the C / C++ tradition, one that overlooks the usefulness of the new-to-Java String interning mechanism.
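To make the debugging point concrete, here is a minimal Java sketch. The class, constant, and method names are hypothetical, chosen only for this illustration. It formats the same selector once as an int constant and once as a String, the way a debug printout would.

```java
// Illustrative only: every name here is made up for this sketch.
final class SelectorDisplaySketch {
    // Traditional C-style int selector constants.
    static final int FISH = 1;
    static final int FOWL = 2;

    // What a debug printout shows for an int selector.
    static String describeInt(int variety) {
        return "variety=" + variety;
    }

    // What a debug printout shows for a String selector.
    static String describeString(String variety) {
        return "variety=" + variety;
    }

    public static void main(String[] args) {
        System.out.println(describeInt(FOWL));       // prints "variety=2"
        System.out.println(describeString("fowl"));  // prints "variety=fowl"
    }
}
```

The int version forces the reader to look up what 2 means; the String version does not.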

Efficiency

On the matter of interning the switch expression value, the intern() method is extremely efficient for Strings that are already interned and not too costly for those that are not. I claim it is reasonable to expect a programmer who cares about maximum efficiency to ensure that all String values passed into a switch expression are already interned. Even when they are not, the performance penalty for interning, in the trivial case or the nontrivial one, is not grave enough to kill the proposed String switch feature.
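The argument relies on how intern() behaves, so a small sketch may help. It does not measure cost; it only shows that intern() hands back the same canonical instance for equal content, which is what makes a reference comparison safe afterwards. The class and method names are hypothetical.

```java
// Illustrative only: InternSketch and isInterned are names invented for this sketch.
final class InternSketch {
    // True when s is already the canonical (interned) instance.
    static boolean isInterned(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        String literal = "fish";           // string literals are always interned
        String copy = new String("fish");  // a distinct, non-interned object

        System.out.println(isInterned(literal));       // prints "true"
        System.out.println(isInterned(copy));          // prints "false"
        System.out.println(copy.intern() == literal);  // prints "true"
    }
}
```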

In comparing execution efficiency between an int switch statement and a String switch statement, three cases are relevant:

Case 1: The table-lookup `switch` statement

If there are enough int case labels to warrant it, and if their values are sufficiently densely packed, the compiler can generate a table lookup to control the flow of execution. This is not possible with Strings and thus is a drawback of the proposal; however, Case 2 can always be used in place of Case 1. This drawback is not important enough to make the String switch statement undesirable.
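As a rough picture of what a table lookup amounts to, the following sketch writes one out by hand with an array. It is an analogy for what a compiler might generate, not actual compiler output, and the class and method names are hypothetical. The case value itself is the index, which is exactly what a String reference cannot provide.

```java
// Illustrative only: a hand-written stand-in for a compiler-generated jump table.
final class TableLookupSketch {
    // One slot per packed case value 0..3; the index is the case value itself.
    private static final String[] TABLE = {"zero", "one", "two", "three"};

    static String name(int selector) {
        // A bounds check plus one indexed load replaces a chain of comparisons.
        if (selector < 0 || selector >= TABLE.length) {
            return "other";  // plays the role of the default branch
        }
        return TABLE[selector];
    }

    public static void main(String[] args) {
        System.out.println(name(2));   // prints "two"
        System.out.println(name(7));   // prints "other"
    }
}
```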

Case 2: The hash-lookup `switch` statement

If the cases of the switch statement do not qualify for a table lookup but are numerous enough to warrant it, some compilers may choose to transform the switch into a hash table lookup. This technique would work as well with interned Strings as with ints.
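One way to picture a hash lookup over interned Strings is the sketch below. It uses java.util.IdentityHashMap, which hashes and compares keys by reference rather than by content, so it behaves correctly only because both the stored keys and the looked-up key are interned. This is an illustration under that assumption, not a description of what any compiler emits, and the class name is hypothetical.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative only: a hand-written stand-in for a hash-based switch.
final class HashLookupSketch {
    // Keys are string literals, which are always interned, so the map can
    // hash and compare them by reference instead of by content.
    private static final Map<String, Integer> CASES = new IdentityHashMap<>();
    static {
        CASES.put("fish", 1);
        CASES.put("fowl", 2);
    }

    static int animalNumber(String variety) {
        // Intern first so that equal content implies the same reference.
        Integer n = CASES.get(variety.intern());
        return n != null ? n : 3;  // 3 plays the role of falling out of the switch
    }

    public static void main(String[] args) {
        System.out.println(animalNumber(new String("fish")));  // prints "1"
        System.out.println(animalNumber("fowl"));              // prints "2"
        System.out.println(animalNumber("beast"));             // prints "3"
    }
}
```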

Case 3: The if-else-if-chain `switch` statement

If the cases of a switch statement do not qualify for Case 1 or Case 2, then the switch statement degenerates into a simple if-else-if chain of equality tests, for which interned Strings are as speed-efficient as ints.
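Written out by hand, such a chain might look like the sketch below. It assumes the proposal's rule that the switch value is interned first, which is what lets each test be a single reference comparison, just as an int test is a single value comparison. The class name is hypothetical.

```java
// Illustrative only: a hand expansion of a String switch into an if-else-if chain.
final class IfElseChainSketch {
    static int animalNumber(String variety) {
        // Under the proposal the compiler would insert this call when it
        // cannot prove that the value is already interned.
        String key = variety.intern();

        // String literals are always interned, so == is a valid equality
        // test once key is interned too.
        if (key == "fish") {
            return 1;
        } else if (key == "fowl") {
            return 2;
        }
        return 3;
    }

    public static void main(String[] args) {
        System.out.println(animalNumber(new String("fowl")));  // prints "2"
        System.out.println(animalNumber("beast"));             // prints "3"
    }
}
```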

Examples

Example 1

    String variety = condition ? "fish" : "fowl";
    switch (variety) {
        case "fish": return 1;
        case "fowl": return 2;
    }
    return 3;

In Example 1, it's clear from path analysis that variety can evaluate only to one of two String constants, and since String constants are always interned, the compiler doesn't generate code to intern the switch expression's String value.

Example 2

    public int animalNumber(String variety) {
        switch (variety) {
            case "fish": return 1;
            case "fowl": return 2;
        }
        return 3;
    }

In Example 2, it could easily be beyond the compiler's ability to know whether the variety variable will have been interned before being passed to animalNumber, so the compiler inserts a call to intern() in the compiled code, as if the source had been written like this:

    switch (variety.intern()) {

The alternative would be for the compiler to complain that intern() must be called on the switch expression value and leave it to the programmer to call it explicitly. I think that choice would be inelegant, but others may have a different opinion.

Update: Java 5 now has enums, which can be used in switch statements. This is great, but why not also allow String?

http://Yost.com/Computers/java/string-switch - this page
2000-11-26 Created
2001-03-18 Published here
2005-07-11 Modified
