In a recent issue (“Am I a Mad Scientist?” CoDe Magazine, May/June 2005), I wrote about the obvious practical benefits of creating strongly-typed classes within a custom software application. Since then, I’ve been thinking more about the not-so-obvious benefits of object-oriented design (OOD), which seem to be worth discussing as well because they appear to be rooted in a better way to develop software. The first article centered on the benefits mainly of strong typing; this one focuses more on the benefits of, if you’ll allow, true OOD.
By “true object-oriented design” I mean object-driven design; that is, basing your software solution domain on the actual problem domain as opposed to, say, data-driven design, which is by far the more common design approach within Microsoft circles. I think this point doesn’t bear much proving in this article, as it is patently obvious by the artifacts existing today in the form of marketing, presentations, tutorials, the majority of Microsoft-related articles, developer tools, and indeed in the many millions of applications existing today that are clearly the result of data-driven (i.e., starting with the data and how it will be stored and reported upon) design.
But for those who want some evidence, consider two things, one of which I’ll discuss in more depth in this article. First, consider the popular code generation tool, CodeSmith, and other generation tools like it. These tools require you to have developed your database first and will happily create code for you to access and persist the data within it. Sure, you can generate custom classes using these tools, but you have to start with the database and design it in such a way as to be usable by the tools to generate their code.
The second evidence is found in the DataSet itself. One need only consider the vast numbers of articles, presentations, and tutorials written about it and the unequaled support that the tools such as Visual Studio and the Microsoft GUI frameworks (WinForms and ASP.NET) have for it. I’ll get more into this later.
Object-driven design, on the other hand, starts with the problems that you are trying to solve. Not only does it start with the real world, but it asks you to consider the objects in the real world, specifically the behaviors that define them and make them unique and pertinent to the problem at hand. It then takes that knowledge and applies it, creating a solution of objects based on these, a solution that better addresses the problems because it acts in ways that naturally make sense. There isn’t time to fully develop this concept here, so I direct you to Dr. David West’s recent book, Object Thinking (MS Press, 2004), to further explore object-driven design (object thinking).
Most objections to creating custom classes and using OOD revolve around ease of use or, rather, ease of design and implementation. There seems to be, strangely enough, almost unanimous agreement that custom classes ala OOD are better if you have time. The argument is that data-driven design ala the DataSet (and its predecessors like the RecordSet) is easier because it has better built-in tool support and is generic enough to represent most data. Even the argument from developer experience (i.e., that inexperienced developers can’t do object-driven custom class design) boils down to the fact that Microsoft has made it much easier and more common to use the DataSet than custom classes. But does that make it right?
Microsoft has gone to considerable lengths to create a fully-featured, strongly-typed, and object-based platform for developers (called the .NET Framework, in case that’s not clear). While it is still maturing, it is an impressive and fun platform to develop on (from a professional developer’s perspective) that challenges any of the alternatives as the best platform for general-purpose software development and is almost certainly the best platform for business-oriented solutions.
Microsoft also has a superlative RDBMS, SQL Server, and they, obviously, realize the importance of giving their application development platform access to their (and others’) data persistence platforms. They’ve got an entire System namespace for dealing with this aspect of development called System.Data, and that is precisely where the DataSet and its comrades live. (Henceforth, “DataSet” refers to DataSet and related System.Data classes.)
In other words, DataSet is a tool to deal specifically with data persistence, which includes persistent data access. The problem is that Microsoft and friends have drastically expanded the purpose of DataSet, evolving it from being part of the data persistence framework into a one-size fits all solution to application problem domains, or at least that is the impression one gets based on the aforementioned artifacts.
They do this by providing all these fancy GUI tools that make editing data (note, I don’t say “building an application”) easy. These tools work great for demonstrations to sell the platform; hence the marketing artifacts. These tools do not, however, build great applications-they simply build data editors with minimal constraint-like validation on the data.
I’m talking about Access++, InfoPath++, Excel++, or just plain Office++. This is not software development; this is not even application building; it is simply a souped up way to edit and store data. If that’s all you need, just go with one of the these commercial products and don’t waste a ton of money investing in SQL Server, application servers, developers, development, project management, quality assurance, etc., to build a custom application.
If, however, you need a piece of software that does something quasi-intelligent, such as automate business processes and solve (or help to solve) business problems, you’ll want to consider investing in .NET, SQL Server, Java, Oracle, app servers, developers, and so on. And if you are investing in that, you also want to make sure that your investment will have the highest return possible.
In order to do that, you need to think about long-term things like scalability, extensibility, and maintainability. These things cannot be offered by the drag-n-drop experience advocated in Microsoft demos and fly-by-night consultants. They cannot, indeed, be offered by abusing the DataSet as a solution domain. Not only will such abuse negate key long-term returns on investment, it will also unnecessarily complicate the solution domain because you are trying to fit a square peg into a round whole, using a hammer as your only tool, and other such nifty and overused aphorisms.
Yes, a DataSet can make persistence easier, but then you have to create other hoops and make it jump through them (e.g., “strongly-typed DataSets”) in order to create a workable solution to your problems. A DataSet forces you to think in terms of the persistence platform instead of in terms of the problem domain and doesn’t make it easy to address that problems that you are attempting to solve.
On the other hand, you could start with your problem domain-the real-world issues that you are attempting to solve-and model your solution using objects that correlate to that domain and behaviors that occur in that domain. Things will naturally fall into place then-you won’t be wrenching the problems into your (or Microsoft’s) predefined solution but rather your solutions will flow naturally and easily from your problems. Your solution objects will look like your problem objects, and they will behave alike.
Note this implies that your design is not data-centric but behavior-centric. When solving problems, making decisions, and taking action, we only need data inasmuch as it assists in executing the behaviors that need executing. This is so central to OOD, and I know that most developers don’t see it. (Trust me, I’ve argued with lots of them.)
We must stop looking at computers as just a handy way of storing data and start seeing them as facilities that make our lives easier, more efficient, and generally better. Instead of focusing on making data easier to store and read, developers should be focusing on making the business processes that the data supports more streamlined. Data only matters because it helps us make informed decisions, that is, to behave in a way that is most beneficial to us.
It is almost absurd to be having this discussion now given that object-oriented design made its debut so many years ago. Nonetheless, it seems that many people have missed the mark as they continue to think of OOD and OOP as just a way to group data intelligently. That’s just plain silly-that’s what relational databases do! Let the database worry about how best to store data. Let the new-fangled business intelligence architects worry about how best to mine that data to produce reports that help drive key business decisions. And let us, the application architects, worry about how best to automate business processes.
You may be thinking I’ve digressed, but I haven’t. This all speaks to the central topic here, which is, simply put, that you should not be using DataSets to solve your business problems except inasmuch as you need help persisting data. It can happily play a part somewhere in your solution, but it should not serve as the solution domain for custom software applications.
Your solution domain should be composed of classes of objects that address the problems you are trying to solve, classes that are defined by their behaviors and that use data (perhaps even from a DataSet in the background somewhere) to inform their behaviors.
You should be building and buying tools that make this kind of object-oriented design easier to implement. This argument for DataSets from ease of implementation is, at best, a compromise and, at worse, a plain and simple lie. And the argument from developer experience will fall away as we make OOD better and more widely understood and as we make tools that require less and less experience to do good OOD.
Don’t get me wrong, I’m not saying that the architects at Microsoft are ignorami who are unaware of the things I’m talking about, nor are, necessarily, the folks who advocate DataSets as a good approach to custom software development. I think anyone who has been working with Microsoft technologies prior to .NET has, necessarily, been coerced into using data-driven design by previous earlier development platforms that did not support OOD.
DataSet thinking is a natural outgrowth of computing history up to this point, particularly in the Microsoft world. We do need ways to simply access, modify, and store data, but the industry has evolved tools that are specific to address those kinds of needs and will continue to evolve such tools (like Office, InfoPath, OneNote, WinFX, SQL Server, etc.). These things have become so easy as to, most of the time, not require a developer’s time.
Surely, custom development should not be (and often is not) doing just those things any longer; if it is, you need to re-evaluate whether custom development is truly needed. We need to move on to the next great thing, building on the shoulders of giants; that is to say now that we can do more for ourselves and in better ways, let’s do it. Let’s focus on what custom software truly brings to the table, make it easier to do that, and just do it.
Objects have been (and, ironically, still are becoming) the next big thing in software development. I don’t see how they will ever be totally superseded precisely because they allow us to address real-world problems through software in natural ways. Surely we can continue to create tools that make OOD easier to design and implement, more efficient for computers to execute, and address limitations of technology. OOD is not at odds with or obsoleted by these goals or even by trends like SOA; rather these are (or should be) supplemental to OOD.
Thus far I have not spent much time speaking to the benefits of OOD but have rather focused my efforts on dispelling the notion that the DataSet is a viable solution domain. This is partly because of the already-mentioned acquiescence that OOD is best if you have time. And I wager the reason for that general agreement is that it doesn’t take a rocket scientist to know that a solution that is tailored to a problem will be of higher quality than one that is a general solution to be amended to fit.
This is true in nearly all human endeavors; just think of clothing, food, cars, houses, landscaping, etc. Custom is better. Having a solution that naturally solves a problem because it is tailored to fit is better by all accounts save economy. And it is no different with software. So, rather than attempt to make a one-size-fits-all solution, shouldn’t we rather focus our energies on making custom solutions easier to implement? We still need to solve the economy problem, but we needn’t sacrifice good design to do it.
So since we’ve agreed that custom solutions that naturally fit a problem domain are better, why continue to have discord about this subject? Let’s leave the object versus DataSet diatribe behind us and join our forces to make better and truly reusable architectures (reusable across similar problem domains) that are easier to understand and more accessible to all. But we have to commit to taking the first, painful steps of learning and, consequently, teaching others before we can achieve that goal. Who is with me?