Immutability in Java

WRITTEN BY Dave Nicolette

This post isn’t a tutorial on how to design immutable objects in Java. It’s more of a lament, or perhaps an extended whine. Why is it that Java applications seem to suffer the effects of mutability more than those written in other languages, when it’s no more difficult to design immutable objects in Java than in any other language?

So what?

The first question, I guess, is “So what?” Who cares about immutability, anyway?

An immutable object can’t be modified after it has been created. When a new value is needed, the accepted practice is to make a copy of the object that has the new value.
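Here’s a minimal sketch of that practice. The Temperature class and its withDegrees() method are names I made up for illustration; they don’t come from any library.

```java
// Hypothetical example class; the name and fields are illustrative only.
final class Temperature {
    private final double degrees;   // final: assigned once, in the constructor
    private final String scale;

    Temperature(double degrees, String scale) {
        this.degrees = degrees;
        this.scale = scale;
    }

    double degrees() { return degrees; }
    String scale() { return scale; }

    // No setter. A "change" produces a new object and leaves this one alone.
    Temperature withDegrees(double newDegrees) {
        return new Temperature(newDegrees, scale);
    }
}
```

Calling morning.withDegrees(21.0) hands back a second Temperature; anything still holding a reference to morning sees the value it always had.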

Functional languages support immutability by design. In an object-oriented language, objects have to be designed explicitly to protect against unwanted modification; in a procedural language, the same goes for the code that accesses data structures.

Immutable objects offer a number of advantages for building reliable applications. As we don’t need to write defensive or protective code to keep application state consistent, our code can be simpler, more concise, and less error-prone than when we define mutable objects.

Some of the key benefits of immutable objects are:

  • Thread safety
  • Atomicity of failure
  • Absence of hidden side-effects
  • Protection against null reference errors
  • Ease of caching
  • Prevention of identity mutation
  • Avoidance of temporal coupling between methods
  • Support for referential transparency
  • Protection from instantiating logically-invalid objects
  • Protection from inadvertent corruption of existing objects

Note that immutability as a design goal doesn’t make it literally impossible to mutate an object. It means that the normal mechanisms for accessing objects in a given programming language don’t allow modification. It’s always possible to go around normal mechanisms, but that is not recommended practice for business application code (although it is normal for certain other kinds of solutions).
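To make “going around normal mechanisms” concrete, here’s a rough sketch that uses the reflection API to overwrite a private final field. Label and ReflectionBackdoor are hypothetical names. This is shown only to illustrate the point, and it assumes a JDK that still permits reflective writes to final instance fields; newer releases may warn about it or restrict it.

```java
import java.lang.reflect.Field;

// Hypothetical class that looks immutable through its normal API.
final class Label {
    private final String text;
    Label(String text) { this.text = text; }
    String text() { return text; }
}

final class ReflectionBackdoor {
    // Overwrites Label's private final field by going around the language's
    // access rules. Not something business application code should do.
    static void overwriteText(Label label, String newText) {
        try {
            Field field = Label.class.getDeclaredField("text");
            field.setAccessible(true);   // suppress the access check
            field.set(label, newText);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective write was refused", e);
        }
    }
}
```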

Why is this an issue for Java?

Immutability is generally a good idea for most categories of software. It isn’t limited just to Java or just to object-oriented design. Like any other generally good idea, it isn’t a “rule” or “dogma.” Software needs to do whatever is necessary to solve the problem at hand. There are plenty of cases when it’s perfectly okay to have mutable entities. Still, in general, immutability helps us build habitable code. So, why pick on Java?

We spend considerably more time working with existing code bases than doing greenfield development. Java is probably the single most common language we find in production. Most of the Java code we find in the wild doesn’t adhere to the guideline to use immutable objects, and it isn’t because the authors have carefully considered the trade-offs for mutability. If anything, it seems to be because they have not done so. Runtime problems associated with mutability are common in production Java applications.

Why does Java seem to be more susceptible to this issue than other languages? After all, there’s plenty of information about the value of immutability and plenty of guidance on how to design immutable objects in Java. Oracle’s official Java tutorials include a lesson on immutability. How do developers learn Java without working those tutorials? They’re free and they’re straight from the horse’s mouth. An obvious starting point for learning the language.

A lot of developers have shared their knowledge about immutable objects in Java, too. For instance, David O’Meara has a good explanation on JavaRanch that’s based directly on the basic Java tutorial (without attribution; tsk, tsk). His walkthrough and examples are clearer than those in the Java tutorial, in my opinion. That’s only one example; information abounds.

There’s no obvious reason why Java applications, in particular, should suffer from mutability problems to a greater extent than applications written in any other language…or is there?

I will suggest the following possible causes:

  • Most Java programmers believe it is “necessary” or “required” to write getter and setter methods for all fields in a class. They also seem to believe any method whose name begins with “get” or “set” is not allowed to contain any logic other than retrieving or modifying the value of a field. It’s the way they initially learn to write Java code, and they never question it. In effect, this is hardly better than declaring every field public. The practice may be a throwback to the Java Beans convention, applied inappropriately to classes that are not meant to be Java Beans.
  • Early implementations of dependency injection containers that were used with Java-based webapp frameworks required classes to be defined with no-arg constructors. Java programmers have gotten into the habit of writing no-arg constructors or allowing the default no-arg constructor to remain in place, even in cases when the solution design includes no such requirement.
  • Many programmers have an exaggerated view of the cost of object creation. They trade off readability, reliability, simplicity, and other desirable characteristics of habitable code in order to avoid instantiating new objects.

The Java Beans convention

According to the specification, “The goal of the JavaBeans APIs is to define a software component model for Java, so that third party ISVs can create and ship Java components that can be composed together into applications by end users.” That approach to building applications was formulated very early in the history of the Java language; the “current” version of the Java Beans specification document is dated August 8, 1997, and applies to Java 1.1.

The Java Beans convention was embraced by companies that produce visual GUI application builders. These tools allow developers to create a “screen” by dragging and dropping GUI widgets (AWT components) onto a visual workspace. The AWT components are implemented as Java Beans. This enables the tools to use the Java reflection API to discover any methods whose names begin with “set” so that developers can customize each widget by adding text, icons, actions, and so forth. For several years, this was a popular way to build Swing-based standalone applications and Java applets. The concept proved successful in the field.
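A simplified sketch of that discovery step, using only java.lang.reflect. SampleButton and SetterFinder are hypothetical names, and a real builder tool does considerably more than this (the standard library’s java.beans.Introspector exists for that purpose); the point is just how little a tool needs beyond the naming convention.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a GUI widget written as a Java Bean.
class SampleButton {
    private String text;
    private boolean enabled;

    public SampleButton() { }

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }
    public boolean isEnabled() { return enabled; }
    public void setEnabled(boolean enabled) { this.enabled = enabled; }
}

final class SetterFinder {
    // Returns the property names implied by public one-argument "setXxx" methods.
    static List<String> writableProperties(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        for (Method method : beanClass.getMethods()) {
            String name = method.getName();
            if (name.startsWith("set") && name.length() > 3
                    && method.getParameterCount() == 1) {
                // "setText" becomes "text"
                names.add(Character.toLowerCase(name.charAt(3)) + name.substring(4));
            }
        }
        return names;
    }
}
```

SetterFinder.writableProperties(SampleButton.class) yields the names text and enabled, in no guaranteed order.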

But Sun Microsystems had a broader vision for Java Beans than just GUI widgets. From the specification: “Some JavaBean components will be more like regular applications, which may then be composed together into compound documents. So a spreadsheet Bean might be embedded inside a Web page.” In fact, I have seen (and written) solutions in which Swing components were composed into spreadsheet-like elements in a GUI app. On the whole, this approach never caught on for general application development.

The Java Bean phenomenon gained a lot of traction. Sun Microsystems extended the idea beyond the front end and applied it to mid-tier and back end applications in the form of Enterprise Java Beans (EJBs). Like many ideas in our field of work, EJBs must have seemed like a good idea at the time.

The use of Java Beans declined as simpler and/or more robust alternatives came along. The key design goals of the Java Beans convention were (according to the specification):

  • Component granularity
  • Portability
  • A uniform, high-quality API
  • Simplicity

All those goals are usually achieved by other means today. Component granularity and simplicity are supported by generally-accepted software design principles, such as SOLID and GRASP. Portability and uniform, high-quality API goals are supported by API design principles that have matured since the advent of the Java Beans convention, including Google’s API design guidelines, Swift.org’s API guidelines, the REST API guidelines, and Heroku’s Twelve-Factor application design guidelines.

Java Swing is an event-based framework. Java Beans are meant to work within that framework. A Bean has three aspects:

  • Properties
  • Methods
  • Events

A Bean’s properties are its fields. The methods are the operations that enable GUI builder tools to get and set the fields. The Bean fires and responds to events in response to user gestures on the UI.
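For contrast with the classes discussed later, here’s a sketch of a class that really is a Bean in this sense: it has a property, methods to get and set it, and an event fired when it changes. TitleBean is a hypothetical name; PropertyChangeSupport and PropertyChangeListener come from the standard java.beans package.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical bean, for illustration only.
class TitleBean {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String title = "";

    public TitleBean() { }                        // beans need a no-arg constructor

    public String getTitle() { return title; }    // property: read

    public void setTitle(String newTitle) {       // property: write
        String oldTitle = this.title;
        this.title = newTitle;
        // Event: tell registered listeners that "title" changed.
        changes.firePropertyChange("title", oldTitle, newTitle);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }
}
```

Here the setter earns its keep: a builder tool or another component can register a listener and react when the title changes.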

If you are writing code that is different from that, then the Java Beans convention is probably irrelevant.

When a bean is not a bean

Most Java programmers working today routinely declare a class’ fields private and expose them through public getter and setter methods, exactly as one would do in a Java Bean. Most applications are not event-driven, and the classes in the majority of production applications don’t fire and respond to events.

Here’s an example of a Java class with all its fields declared public:

public class Shoe {
	public ShoeStyle style;
	public double size;
	public ShoeWidth width;
	public int heel;
	public Color[] availableColors;
	public String photoFilename;
}

We’re assuming the existence of ShoeStyle and ShoeWidth types, which would most likely be enums if this were a real application.

You could use that class definition in an application. There’s nothing naughty or evil about it. The compiler doesn’t care if you declare all the fields public. Maybe you don’t care, either, if the class is just a bag of fields. If the class has any behavior, or if there are any negative implications to modifying the fields at runtime, then you might care a little.

Here’s how a typical Java programmer might write it:

import java.util.List;

public class Shoe {
	private ShoeStyle style;
	private double size;
	private ShoeWidth width;
	private int heel;
	private List<Color> availableColors;
	private String photoFilename;

	public double getSize() {
	    return size;
	}
	public void setSize(double size) {
	    this.size = size;
	}
	// etc.
}

Now the fields are declared private, so we gain a sense of security about the robustness of objects created from this class. We also have public setters for all the fields, so we know our sense of security is false. The names of the getter methods reveal the internal names of the fields, so client code has to know more about the innards of the class than we might prefer. A change to any of the field names would also require a change to its getter and setter method names, and thus would require changes in all the client code that uses the class.

Sounds like a lot of busy-work just waiting to happen. Don’t know about you, but I’d rather not.

A slightly more object-oriented approach might be:

import java.util.ArrayList;
import java.util.List;

public class Shoe {
    private final ShoeStyle style;
    private final double size;
    private final ShoeWidth width;
    private final int heel;
    private final List<Color> availableColors;
    private final String photoFilename;

    public Shoe(ShoeStyle style, 
            double size,
            ShoeWidth width,
            int heel,
            List<Color> availableColors,
            String photoFilename) {
        if (style == null ||
            width == null ||
            availableColors == null ||
            size < 1 || 
            size > 15 || 
            heel < 1 ||
            heel > 12) {
            throw new IllegalArgumentException();
        }
        this.style = style;
        this.size = size;
        this.heel = heel;
        this.width = width;
        // Copy the list so the caller can't change our colors later.
        this.availableColors = new ArrayList<>(availableColors);
        this.photoFilename = photoFilename;
    }

    public double size() {
        return size;
    }

    public boolean comesIn(Color color) {
        return availableColors.contains(color);
    }

    public Shoe addColor(Color color) {
        // Copy first; adding to the existing list would change this shoe, too.
        List<Color> newAvailableColors = new ArrayList<>(availableColors);
        newAvailableColors.add(color);
        return new Shoe(style, size, width, heel, 
                        newAvailableColors, photoFilename);
    }
    // etc.
}

This version has a constructor that ensures a logically-invalid instance won’t be created. It prevents a Shoe object from being created unless all mandatory fields are populated (it allows a null reference for photoFilename, indicating that is an optional value).

It has accessors (the one shown is the size() method), but doesn’t name them “getThis()” and “getThat()”. There’s no purpose in doing that, as this isn’t a Java Bean. Other methods are named in a way that suggests their normal usage.

Could the code be prettier? Sure. Those magic numbers aren’t so great, for instance. They could be replaced by int constants, like MIN_SHOE_SIZE and so forth, or by enums or classes that know their own upper and lower bounds and ensure only valid objects can be instantiated. But this is just a quick and dirty example; we don’t want to get carried away here.
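If we did get carried away, a class that knows its own bounds might look something like this sketch. ShoeSize and MAX_SHOE_SIZE are names I’m assuming for the sake of the example, and the bounds of 1 and 15 are simply the ones used in the Shoe constructor.

```java
// Hypothetical value type: it carries its own bounds, so callers can't
// construct a size outside them.
final class ShoeSize {
    static final double MIN_SHOE_SIZE = 1.0;
    static final double MAX_SHOE_SIZE = 15.0;

    private final double value;

    ShoeSize(double value) {
        // Written as a negated range check so that NaN is rejected as well.
        if (!(value >= MIN_SHOE_SIZE && value <= MAX_SHOE_SIZE)) {
            throw new IllegalArgumentException(
                "Shoe size must be between " + MIN_SHOE_SIZE
                + " and " + MAX_SHOE_SIZE + ", got " + value);
        }
        this.value = value;
    }

    double value() { return value; }
}
```

With a type like this, a constructor that takes a ShoeSize instead of a double no longer needs its own range check for the size.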

To find out if the shoe is available in red, you don’t call “shoe.getAvailableColors()” and dig through the List. You just write, “shoe.comesIn(Color.RED)”. You don’t have to know the class maintains the available colors as a List, and you shouldn’t have to know that. Client code becomes more loosely coupled and more expressive of intent:

    public void expressExtremeShoeColorBias(Shoe shoe) {
        if ( shoe.comesIn(Color.RED) ) {
            System.out.println("Let's go shopping!");
        } else {
            System.out.println("Let's go barefoot!");
        }
    }

There are no setters. To add an attribute to the shoe, we create a new instance of Shoe that has the new attribute (illustrated by the addColor() method).

There’s more you could do; this is meant to illustrate that an object that isn’t a Java Bean need not look like a Java Bean and doesn’t benefit from trying to look like one.

Conclusion: The overuse of getters and setters appears to be a hold-over from the era of Java Beans and EJBs.

Dependency injection

Early implementations of dependency injection containers were not able to chase down a chain of dependencies and resolve all the constructor arguments. They required developers to provide a no-arg constructor on classes that might be injected.

The downside was that logically-invalid objects could be instantiated; that is, objects that were missing values necessary for the object to be used. In the Java world, this led to frequent, unannounced visits by our dear friend, NullPointerException. Other friends dropped by from time to time, as well, such as Mysterious Intermittent Runtime Error That I Can’t Reproduce, and good old Works On My Machine But Not In The Test Environment, and his cousin But The Unit Tests Passed.

That early limitation did not last long. Containers today can instantiate any object. You can write well-structured, object-oriented code that doesn’t allow invalid objects to be created. The habit of providing a no-arg constructor and relying on setter injection can and should be broken.
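As a sketch of what breaking the habit looks like, here’s a class that takes its dependency through the constructor. PriceSource and CheckoutService are hypothetical names, and no particular container is assumed; the same class works whether a container, a test, or plain calling code supplies the argument.

```java
import java.util.Objects;

// Hypothetical collaborator, for illustration only.
interface PriceSource {
    double priceOf(String sku);
}

final class CheckoutService {
    private final PriceSource prices;   // required dependency, never null

    // No no-arg constructor and no setter, so an instance without a
    // PriceSource cannot exist.
    CheckoutService(PriceSource prices) {
        this.prices = Objects.requireNonNull(prices, "prices");
    }

    double total(String sku, int quantity) {
        return prices.priceOf(sku) * quantity;
    }
}
```

If the dependency is missing, the failure happens at construction time, where it’s easy to find, rather than later as a NullPointerException somewhere else.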

Fear the Walking Object Instantiation Overhead

The Java runtime uses references to access objects in memory. What does that have to do with immutability?

Assume we want two bouquets of two flowers each. The first bouquet should have flowers with 6 petals and 9 petals. The second bouquet should have flowers with 4 petals and 8 petals. (Pretty romantic, eh?) After this code runs…

    import java.util.ArrayList;
    import java.util.List;

    public class Flower {
        private int petalCount;
        public int getPetalCount() {
            return petalCount;
        }
        public void setPetalCount(int petalCount) {
            this.petalCount = petalCount;
        }
    }

    public class Bouquet {
        private List<Flower> flowers;
        public Bouquet() {
            flowers = new ArrayList<>();
        }
        public void setFlowers(List<Flower> flowers) {
            this.flowers = flowers;
        }
    }

    public class SomeClientClass {
    . . .
        public List<Bouquet> makeBouquets() {
            // build first bouquet
            Flower flower1 = new Flower();
            flower1.setPetalCount(6);
            Flower flower2 = new Flower();
            flower2.setPetalCount(9);
            List<Flower> flowers = new ArrayList<>();
            flowers.add(flower1);
            flowers.add(flower2);
            Bouquet bouquet1 = new Bouquet();
            bouquet1.setFlowers(flowers);

            // build second bouquet
            flower1.setPetalCount(4);
            flower2.setPetalCount(8);  
            Bouquet bouquet2 = new Bouquet();
            bouquet2.setFlowers(flowers);

            List<Bouquet> bouquets = new ArrayList<>();
            bouquets.add(bouquet1);
            bouquets.add(bouquet2);
            return bouquets;
        }
    }

…both bouquets have flowers with 4 and 8 petals. We never created new Flower objects for the second bouquet; flower1 and flower2 still refer to the original two instances, and both bouquet1 and bouquet2 hold the same list of those instances. So, when we modify flower1 and flower2 to set the petal counts for bouquet2, those changes are visible in bouquet1 as well.

Making a shallow copy of the Bouquet object doesn’t help, because the copy still refers to the same Flower instances:

    public class Bouquet {
        . . .
        public Bouquet clone() {
            Bouquet newBouquet = new Bouquet();
            newBouquet.setFlowers(this.flowers);
            return newBouquet;
        }
    }

The flower1 and flower2 references still point to the same objects in memory.

Depending on the data types of the fields, making a safe copy of an object may involve making a deep copy of certain fields in the object. Depending on how the application is designed, several layers of deep copies may be necessary to ensure a data value somewhere down in the pile doesn’t get modified after the object has been instantiated. (Fancy people call that a hidden side-effect; a change in the application’s state that we don’t expect.)

    public class Bouquet {
        . . .
        // Assumes flowers is declared as List<Flower> and that
        // Flower has a getPetalCount() accessor.
        public Bouquet clone() {
            Bouquet newBouquet = new Bouquet();
            List<Flower> newFlowers = new ArrayList<>();
            for (Flower flower : this.flowers) {
                // Copy each flower so the two bouquets share nothing.
                Flower newFlower = new Flower();
                newFlower.setPetalCount(flower.getPetalCount());
                newFlowers.add(newFlower);
            }
            newBouquet.setFlowers(newFlowers);
            return newBouquet;
        }
    }

Even this trivial example looks like it would incur significant compute overhead to instantiate the new Flower objects. Imagine a more realistic example that contains numerous non-trivial objects. Looks pretty scary, eh?

Looks scary, but isn’t.

Many Java developers worry excessively about the performance implications of object instantiation. The problem is so common that the Java tutorial on immutability calls it out explicitly: “Programmers are often reluctant to employ immutable objects, because they worry about the cost of creating a new object as opposed to updating an object in place. The impact of object creation is often overestimated, and can be offset by some of the efficiencies associated with immutable objects. These include decreased overhead due to garbage collection, and the elimination of code needed to protect mutable objects from corruption.”
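For comparison, here’s a sketch of the bouquet example rebuilt with immutable objects. ImmutableFlower and ImmutableBouquet are hypothetical names chosen so they don’t collide with the classes above. Because a flower can’t change after construction, bouquets can safely share flower instances, and the only copying left is of the list itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutableFlower {
    private final int petalCount;
    ImmutableFlower(int petalCount) { this.petalCount = petalCount; }
    int petalCount() { return petalCount; }
}

final class ImmutableBouquet {
    private final List<ImmutableFlower> flowers;

    ImmutableBouquet(List<ImmutableFlower> flowers) {
        // Copy the caller's list so later changes to it can't reach this
        // bouquet, then wrap the copy so nobody can modify it through us.
        this.flowers = Collections.unmodifiableList(new ArrayList<>(flowers));
    }

    List<ImmutableFlower> flowers() { return flowers; }

    // "Adding" a flower returns a new bouquet; this one is unchanged.
    ImmutableBouquet with(ImmutableFlower flower) {
        List<ImmutableFlower> copy = new ArrayList<>(flowers);
        copy.add(flower);
        return new ImmutableBouquet(copy);
    }
}
```

There’s no clone() method and no deep copy here, because there’s nothing a client could do to a flower that would need protecting against.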

Programmers really should worry more about I/O overhead. A database query coded inside a loop involves a lot more overhead than the instantiation of an object in memory. Network delay is likely to swallow up any compute overhead in the application, anyway. Instantiation overhead is not a valid reason to avoid creating immutable objects.

If your application has truly severe performance requirements, then my first question to you will be, “Why did you choose Java for this application?” Java is a great language for general business application programming, and its performance characteristics are suitable for such applications. Java is not designed for truly severe performance requirements.

Why do I keep writing the phrase “truly severe”? Because most Java programmers seem to have an unrealistic concept of what “high performance” means. High performance software isn’t for pulling one record out of a back-end data store and displaying its fields on a screen. High performance software is for things like rapid searching of massive data stores and realtime cruise missile guidance. If you’re developing a webapp or a business-oriented microservice, you don’t have truly severe performance requirements.

Get over it. String literals are interned. Small values of wrapper types like Integer are cached in much the same way. Java compilers optimize code better than you can by hand. JVMs optimize memory usage and garbage collection better than you can by hand. Don’t sacrifice reliability and simple design to try to gain performance improvements that don’t matter.
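A quick sketch of that kind of reuse, using nothing beyond the standard library (ReuseDemo is a made-up name). It relies on two documented behaviors: identical string literals share one interned object, and Integer.valueOf() returns cached objects for values from -128 to 127.

```java
final class ReuseDemo {
    // Two identical string literals refer to one interned String object.
    static boolean literalsShareOneObject() {
        String a = "red";
        String b = "red";
        return a == b;   // identity comparison, used here on purpose
    }

    // Integer.valueOf returns cached objects for values from -128 to 127.
    static boolean smallIntegersShareOneObject() {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        return a == b;   // identity comparison, used here on purpose
    }
}
```

Both methods return true. This kind of sharing is only safe because String and Integer are immutable, which is rather the point.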

Conclusion

The benefits of immutability are well-established and well-known. I think it’s fair to say that we need justification not to use immutable objects in our solutions, rather than an affirmative reason to use them.
