POJO, POCO, POPO - WHEN IS AN ANEMIC DOMAIN MODEL PREFERABLE?

By Djambong T. Hank-debaim (khdzhambongtenke@edu.hse.ru)

Software engineering is the development of software products using established practices, principles, and methods. This essay focuses on what is considered good and bad practice. In the next few pages, we will take a quick look at the notion of design patterns and then critically analyze the so-called Anemic Domain Model (ADM). We will assess why some label it an anti-pattern. After that, we will consider the good aspects of ADM and why software engineers still use it. In conclusion, I will present my stance on when ADM is preferable.

Design patterns are reusable solutions to common design problems that occur over and over in development. A pattern works as a solution template: it describes an abstract solution to a common problem, which the user then applies and adapts to their own problem. An anti-pattern is just the opposite of a pattern: a commonly used solution that tends to create more problems than it solves. Implementing an anti-pattern within a program can negatively affect system performance, security, scalability, and sustainability.

Over the years, multiple approaches have been created to help software engineers develop complex systems. One of those approaches is Domain-Driven Design (DDD). DDD is an approach to software development that centers the development on programming a domain model that has a rich understanding of the processes and rules of a domain [1]. In DDD, the development process boils down to creating software abstractions called domain models, which are models of the domain that incorporate both behavior and data [3]. This clearly sets them apart from Entity Objects, which represent only the data stored in a database, while the behavior lives in separate classes.

In theory, DDD seems logical and should suit all complex enterprise solutions. However, most developers still do not apply it to its full extent. Rather, they use a “semi-version” of DDD, which leads to the so-called Anemic Domain Model (ADM). An ADM is a design approach that uses pure-data Entity Classes, with the behavior extracted to another layer, usually called the “service layer”. In various programming languages, such classes bear well-known names that follow the pattern POXO (Plain Old X Object), where X stands for the language or platform. For example, they are called POJO in Java, POCO on the CLR, and POPO in PHP.
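As a rough illustration of the split described above, here is a minimal Python sketch of a pure-data entity and a service that operates on it. The names Account and AccountService and the withdrawal rule are hypothetical and chosen only for this example; they do not come from any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Pure-data entity (hypothetical): fields only, no business behavior."""
    owner: str
    balance: float


class AccountService:
    """Service layer (hypothetical): holds the behavior that operates on Account data."""

    def withdraw(self, account: Account, amount: float) -> None:
        # Business rules live here, not on the Account class.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > account.balance:
            raise ValueError("insufficient funds")
        account.balance -= amount
```

The entity can be created, serialized, or mapped to storage without dragging any business rules along, which is the property the POXO naming refers to.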

Why, despite its wide adoption, do some prominent figures in software engineering, such as Martin Fowler [2], still consider ADM an anti-pattern? There are several reasons:

  • ADM violates the encapsulation and abstraction principles of OOP, since the models contain only data and not the logic that operates on those data (except for some getters and setters).
  • In the context of model/database mapping, ADM is prone to difficulties, especially as the model evolves and becomes more complex.
  • ADM turns domain models into mere carriers of domain data.
  • ADM is a failed attempt at object-oriented programming that ends up as procedural programming.

All those reasons seem good enough for developers to avoid this anti-pattern, right? Well, not exactly. In the next few sections, we will consider the good aspects of ADM and why software engineers still use it.

In ADM, each model bears the single responsibility of representing an object with its data. This is in accordance with the Single Responsibility Principle in SOLID. In contrast, a model containing both data and operations on those data (a Rich Domain Model, RDM) has at least two responsibilities: holding the business data and holding the business logic that operates on those data, all within a single abstraction.

In ADMs, the domain rules and infrastructural concerns (such as persistence and object construction) are encapsulated in their own services (and presented via abstract interfaces). Consequently, coupling is reduced. This means that if we would like to modify the behaviors associated with a model, we do not have to modify the model itself. All that needs to be done is to modify the thick service layer, which is responsible for holding operations. This property of ADMs allows us to build models that are open for extension but closed for modification, in accordance with the Open/Closed Principle in SOLID.
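One way to picture this is the following Python sketch, in which a new pricing behavior is added as a new service while the data-only model stays untouched. Invoice, PricingService, StandardPricing, DiscountPricing, and checkout are hypothetical names introduced purely for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Invoice:
    """Anemic model (hypothetical): data only."""
    total: float


class PricingService(ABC):
    """Abstract interface through which pricing rules are exposed."""

    @abstractmethod
    def final_price(self, invoice: Invoice) -> float:
        ...


class StandardPricing(PricingService):
    def final_price(self, invoice: Invoice) -> float:
        return invoice.total


class DiscountPricing(PricingService):
    """New behavior added as a new service; Invoice is left unchanged."""

    def __init__(self, rate: float) -> None:
        self.rate = rate

    def final_price(self, invoice: Invoice) -> float:
        return invoice.total * (1 - self.rate)


def checkout(invoice: Invoice, pricing: PricingService) -> float:
    # The caller depends only on the abstract interface.
    return pricing.final_price(invoice)
```

Under these assumptions, introducing DiscountPricing requires no edit to Invoice or to checkout, which is the sense in which the model is closed for modification.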

In the RDM approach, models usually construct and initialize objects of other models as part of their own business operations. The model also contains the domain logic and checks the state of other models' objects. Finally, by providing persistence (CRUD) operations through a base class, the model entity is also bound to the persistence context. With all of these responsibilities combined, the Rich Domain Model exhibits a poor separation of concerns [4].

In computer engineering, object-relational mapping (ORM) is a technique used to convert data between incompatible type systems, typically a relational database and an object-oriented programming language. There are two main ORM patterns: Active Record and Data Mapper [5]. A Data Mapper is, by nature, a Data Access Layer that performs bidirectional transfers of data between a persistent data store and an in-memory data representation. The goal of the pattern is to keep the in-memory representation and the persistent data store independent of each other and of the data mapper itself. To achieve that, the layer is composed of mappers that perform the data transfer. Not surprisingly, ADM works well with ORM, especially with Data Transfer Objects (DTOs).
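The Data Mapper idea can be sketched with Python's standard sqlite3 module. This is a hand-written illustration, not the API of any ORM library; the Customer fields, the customers table, and the CustomerMapper class are assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """In-memory representation: knows nothing about the database."""
    id: int
    name: str
    region: str


class CustomerMapper:
    """Data Mapper (hypothetical): moves data between Customer objects and a table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
        )

    def insert(self, customer: Customer) -> None:
        # Object -> row
        self.conn.execute(
            "INSERT INTO customers (id, name, region) VALUES (?, ?, ?)",
            (customer.id, customer.name, customer.region),
        )

    def find(self, customer_id: int) -> Optional[Customer]:
        # Row -> object (or None when no row matches)
        row = self.conn.execute(
            "SELECT id, name, region FROM customers WHERE id = ?",
            (customer_id,),
        ).fetchone()
        return Customer(*row) if row else None
```

A short usage example: `mapper = CustomerMapper(sqlite3.connect(":memory:"))`, then `mapper.insert(Customer(1, "Ada", "EU"))` and `mapper.find(1)`. Only the mapper knows about SQL, so the data-only Customer class stays independent of the store.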

In ADM, we observe highly cohesive, loosely coupled components that communicate via abstract interfaces. Those components are composed via dependency injection, which allows for trivial mocking of dependencies. Therefore, scenario construction is easier in automated tests. In RDM, scenarios are harder to construct because tight coupling proliferates, which makes ADM's automated tests the more maintainable of the two. For illustration, let's consider a Rich Domain Model with two models, Customer and Item, in which Customer exposes an IsItemPurchasable(Item item) method that relies on Item's ShipsToRegion method.
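The original listing is not available here, so what follows is only a minimal Python sketch of what such a Rich Domain Model might look like. The fields (funds, region, price, allowed_regions) are assumptions, and Python naming is used, so IsItemPurchasable appears as is_item_purchasable and ShipsToRegion as ships_to_region.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    price: float
    allowed_regions: set = field(default_factory=set)

    def ships_to_region(self, region: str) -> bool:
        # Domain logic owned by Item.
        return region in self.allowed_regions


@dataclass
class Customer:
    funds: float
    region: str

    def is_item_purchasable(self, item: Item) -> bool:
        # Calls Item's logic directly, so Customer is coupled
        # to Item's concrete implementation.
        return self.funds >= item.price and item.ships_to_region(self.region)
```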

Suppose we want to write unit tests for the IsItemPurchasable(Item item) method. The domain rules require that, for an item to be purchasable, the customer has enough funds and is in an allowed shipping region for that specific product. Let's consider a test that checks the scenario in which a customer has sufficient funds but is not in an allowed shipping region for that product (meaning the item is not purchasable). With the RDM, this test would be written as follows:

  • construct a Customer and an Item,
  • configure the customer to have enough funds,
  • configure the customer's region to be outside the allowed shipping regions for that item,
  • finally, assert that customer.IsItemPurchasable(item) returns false.

However, the IsItemPurchasable method depends directly on the implementation of the ShipsToRegion method of the Item domain model. Any change in the domain logic in Item can change the result of the test. This is not the desired behavior, as the test should only exercise the logic of the customer's IsItemPurchasable method. This problem is the result of tight coupling, and it makes the code difficult to test, which is a familiar experience when using RDM. This simple example uses only a few methods, and difficulties are already emerging. Imagine the same RDM with dozens of tightly coupled domain models. Testing in this case becomes a nightmare.

On the other hand, ADM requires us to:

  • express the IsItemPurchasable logic within a separate service, which depends only on an abstract interface (the ShipsToRegion method of IItemShippingRegionService),
  • provide a stub implementation of IItemShippingRegionService for this test, whose ShipsToRegion method always returns false.
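The two steps above can be sketched in Python as follows. This is an illustration under assumptions: PurchaseService, NeverShipsStub, and the model fields are hypothetical names, and IItemShippingRegionService is rendered as a typing.Protocol called ItemShippingRegionService.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Customer:
    funds: float
    region: str


@dataclass
class Item:
    price: float


class ItemShippingRegionService(Protocol):
    """Abstract interface (IItemShippingRegionService in the text)."""

    def ships_to_region(self, item: Item, region: str) -> bool:
        ...


class PurchaseService:
    """Holds the IsItemPurchasable logic outside the models."""

    def __init__(self, shipping: ItemShippingRegionService) -> None:
        self.shipping = shipping  # injected dependency

    def is_item_purchasable(self, customer: Customer, item: Item) -> bool:
        return customer.funds >= item.price and self.shipping.ships_to_region(
            item, customer.region
        )


class NeverShipsStub:
    """Test double: always reports that the item does not ship."""

    def ships_to_region(self, item: Item, region: str) -> bool:
        return False


# The scenario: sufficient funds, but the region is not allowed.
service = PurchaseService(NeverShipsStub())
result = service.is_item_purchasable(
    Customer(funds=100.0, region="anywhere"), Item(price=10.0)
)
assert result is False
```

Because the stub fixes the shipping answer, the test exercises only the purchasability rule; a change to the real shipping-region logic would not affect it.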

Within the software engineering community, there is a long-running discussion about what logic a domain model should implement to interact with itself. The problem is that, when implementing RDM, we quickly reach a situation where we must write logic like Order.Save(order). However, if you think about this command for a moment, it does not make sense. Should the Order model know how to save itself? Let's say we introduce a repository for that purpose. Can Order add order lines within the repository? That does not seem to make sense, because it does not. A User adds a Product to an Order, so can we write User.AddOrderLineToOrder()? That starts to look strange! What if we wrote OrderService.AddOrderLine()? Now it makes sense! What we draw from this little experiment is that, in OOP, encapsulation consists in putting business logic on a model only where that logic needs to access the model's internal state. If we need to access the Order.OrderLines collection, we put AddOrderLine() on Order. This way, the class's internal state does not get exposed.
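A possible reading of this division of labor, sketched in Python: the model keeps only the logic that touches its own internal collection, while a service coordinates persistence. OrderLine, InMemoryOrderRepository, and the quantity check are assumptions made for the example, and Order.AddOrderLine is written as add_order_line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    product: str
    quantity: int


class Order:
    def __init__(self) -> None:
        self._order_lines: list = []  # internal state, not exposed for mutation

    def add_order_line(self, line: OrderLine) -> None:
        # Logic that needs the internal collection lives on the model.
        if line.quantity <= 0:
            raise ValueError("quantity must be positive")
        self._order_lines.append(line)

    @property
    def order_lines(self) -> tuple:
        return tuple(self._order_lines)  # read-only copy


class InMemoryOrderRepository:
    """Hypothetical stand-in for a real persistence layer."""

    def __init__(self) -> None:
        self.saved: list = []

    def save(self, order: Order) -> None:
        if order not in self.saved:
            self.saved.append(order)


class OrderService:
    """Coordinates operations that span objects, such as persistence."""

    def __init__(self, repository: InMemoryOrderRepository) -> None:
        self.repository = repository

    def add_order_line(self, order: Order, product: str, quantity: int) -> None:
        order.add_order_line(OrderLine(product, quantity))
        self.repository.save(order)
```

In this sketch, Order never learns how it is saved, and callers never receive the mutable list behind order_lines.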

This is exactly how most software engineers understand OOP. If we need to perform some operations on a set of objects, or to persist those objects, it seems more logical to have a separate service perform those operations than to have the objects perform them on themselves.

One of the arguments mentioned by the detractors of ADM is that “ADM is a failed attempt at object-oriented programming that ends up as procedural programming”. Well, that is not false. However, what those detractors usually fail to grasp is the complexity of pure OOP as implemented in an RDM approach. New software engineers tend to learn coding by copying code provided to them in internet resources. In those resources, nearly all examples are presented in a procedural manner. Models are constructed separately, and most of the time they store only data. The logic is presented in a separate service layer. If RDM is so good, why are most examples on the internet (and sometimes in books) written in a procedural manner? The answer is simple: for readability and understandability. RDM by itself is complex to grasp, especially for a new developer.

In this essay, we took a quick look at the notion of design patterns and then critically analyzed the so-called Anemic Domain Model (ADM). We assessed why some label it an “anti-pattern”. After that, we considered the good aspects of ADM and why people still use it. Despite its multiple cons, I personally consider ADM to be a great approach to software design. It helps build applications with domain models that follow the SOLID principles. I see three scenarios in which ADM is preferable:

  • When working with an ORM library to exchange data with the persistence layer.
  • When the system’s business logic implies strong coupling between models.
  • When testability is important, but we want to spend minimal time modifying or refactoring tests each time a model changes.