All but the most trivial of programs accomplish tasks that require the coordination and combination of multiple concepts. How those disparate concepts are arranged can have a huge impact on the comprehensibility of the program. Complexity in the software not only impedes others' understanding of that software, but also makes the code harder to modify. On the flip side, time taken to simplify a body of code leads not only to a clearer understanding of the problem being solved and of its solution, but often presents opportunities for optimization in a number of different ways.
The complexity of a given body of code can often be determined by casual observation. In the case of a function, how long is it? How many branches does it have? How much state does it track? Of that state, how much is external to the function; how broad is the scope of that state (is it truly global data, or is it more restricted as members of a class instance)? The more that these attributes are represented in any item of code, be it a function, a file, a class, a library, the more complex that item is. The more complex a body of code is, the more resistant to change it is. The more resistant to change a thing is, the harder it is to add or change functionality and fix bugs.
I don't think it's hard to argue that complex code is difficult to work with. Software developers have come up with a number of catchy phrases to help guide a reduction in these sorts of complexity. "Do Simple Things" encourages us to, well, make the code simple to start with; while it doesn't directly indicate a method for reducing complexity, the "mandate," if you will, can expand to the task of reducing complexity by taking small, aka "simple," steps. The "One Responsibility Rule" tells us that any unit of code (a function, a file, a class, a library, and possibly even an entire executable program) should focus on doing one thing and doing it well. The Law of Demeter encourages us in a similar way; it tells us very specifically what sorts of data interactions any unit of code is allowed to have. "Tell Don't Ask" is a corollary to the Law of Demeter that suggests we author our code to give the necessary state to the things it calls, rather than requiring a given piece of code to query near and far for that state. Finally, a personal favorite of mine, "Feature Envy" names the smell of a body of code that seems more interested in the state of things outside of itself than in its own local state. Perhaps I may also take the opportunity to add a humorous statement I read somewhere once upon a time, which suggested that every branch in your program is a bug waiting to manifest.
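To make "Tell Don't Ask" and "Feature Envy" a little more concrete, here is a minimal sketch in Python. The `Account` class and both `pay_*` functions are hypothetical names invented for this illustration; they don't come from any particular codebase.

```python
class Account:
    """Hypothetical example class; it owns its balance and the rule about it."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        # The account enforces its own rule and updates its own state.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def pay_asking(account, amount):
    # "Ask" style (feature envy): the caller inspects the account's state
    # and makes the account's decision for it.
    if account.balance >= amount:
        account.balance -= amount
        return True
    return False


def pay_telling(account, amount):
    # "Tell" style: the caller says what it wants and lets the account decide.
    try:
        account.withdraw(amount)
    except ValueError:
        return False
    return True
```

Both functions behave the same today, but only `pay_asking` has to change if the rule for what counts as a valid withdrawal ever changes, and so does every other caller written in that style.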
The challenge then becomes simplifying the complex thing. Something can only practically be complex if it is made up of a number of less complex things. So the first step is to separate those less complex things so that they become individual items, each of lower complexity. As a trivial example, consider an application like 'sort' which reads a file into memory, sorts the individual lines, then writes those lines back out. A straightforward implementation of this could readily be done in one function: open the file, read the file, close the file, split the file into individual lines, sort those lines, and then open another file, write the lines, close the file, done. At this point it's not terribly complex, but it can be made simpler if the mainline consists of three calls: one that reads the file into the internal data structure, one that sorts the data structure, and a third that writes the data structure. At a high level, this program has become less complex to the observer exactly because there's less to comprehend and less state to manage. An additional benefit comes to light when the sort routine itself needs to become more complex, for example if you want to reverse the sort or sort the lines based on something other than the string of characters starting in the first column. When we start talking about this, we realize that we need to pass switches on the command line, so we add a fourth function to our "main", one to parse the arguments and set up some state based on what was passed. The actual sorting function then examines the state that represents those parameters, and anyone looking at the program code doesn't need to deal with the mechanism of parsing the command line arguments unless specifically looking at the function that does that handling.
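The decomposition described above might look something like the following sketch. The function names (`parse_args`, `read_lines`, `sort_lines`, `write_lines`) and the `-r` and `-c` switches are my own assumptions for illustration, not the interface of the real 'sort' utility.

```python
import argparse
import sys


def parse_args(argv):
    # Hypothetical switches: -r reverses the sort, -c picks the start column.
    parser = argparse.ArgumentParser(description="Sort the lines of a file.")
    parser.add_argument("infile")
    parser.add_argument("outfile")
    parser.add_argument("-r", "--reverse", action="store_true")
    parser.add_argument("-c", "--column", type=int, default=0)
    return parser.parse_args(argv)


def read_lines(path):
    # Read the whole file and split it into lines without their newlines.
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def sort_lines(lines, reverse=False, column=0):
    # Compare each line starting at `column` (0 means the first column).
    return sorted(lines, key=lambda line: line[column:], reverse=reverse)


def write_lines(path, lines):
    # Write each line back out, restoring the trailing newline.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)


def main(argv=None):
    # The mainline is just the four steps; each one hides its own mechanism.
    options = parse_args(sys.argv[1:] if argv is None else argv)
    lines = read_lines(options.infile)
    lines = sort_lines(lines, reverse=options.reverse, column=options.column)
    write_lines(options.outfile, lines)
```

Calling `main(["-r", "in.txt", "out.txt"])` would run the whole pipeline in reverse order. Note that `sort_lines` receives its options as plain parameters, so it can be read and tested without knowing anything about the command line.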
I made a big deal about state above, so I'll come back to it. The broader the scope of any element of non-constant data, the more likely it is that it will not be configured in the way you want it to be when it comes time to use it. This makes it more difficult to depend on. Worse, even though everything may work fine at the moment you take a dependency on state with a broad scope, if someone else comes in and changes that state without completely knowing the effects of that change, then the previously working code will stop working. Many would chalk such a change up to a bad or otherwise lazy programmer on the team, but if we're honest with ourselves we realize that we're under all sorts of pressures to ship our software, and our software is so big that no one person can truly grok the entire body of code. In this case, the onus is on each of us to make our software as robust as possible in the face of external changes. Finally, if you have a piece of code that only operates on state local to that code, that code is pretty much immune to changes in the outside world, assuming that the parameters it is handed remain valid. This is a thing of beauty and we should endeavor to keep it that way.
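Here is a small sketch of that difference, again in Python. The `settings` dictionary and both function names are hypothetical, made up purely to show the effect of scope.

```python
# Module-level, mutable state: any code in the program can change it.
settings = {"reverse": False}


def sort_with_global(lines):
    # The result depends on whatever `settings` happens to hold at call time.
    return sorted(lines, reverse=settings["reverse"])


def sort_with_params(lines, reverse=False):
    # The result depends only on the arguments handed in.
    return sorted(lines, reverse=reverse)


lines = ["b", "a", "c"]
before = sort_with_global(lines)
settings["reverse"] = True  # a seemingly unrelated change made elsewhere
after = sort_with_global(lines)
```

The same call to `sort_with_global` yields different results before and after the change to `settings`, even though nothing near the call site changed. `sort_with_params` keeps behaving the same way for the same arguments no matter what happens to `settings`.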
As developers on a software team, we should all keep in mind that other people will be reading our code, and we should write our code accordingly. If we author code to be the simplest code that it can possibly be, then anyone, even the most junior team member, should feel comfortable reading, understanding, and editing that code. The easier it is to edit a bit of code, the more readily it will grow to meet the user's needs. Finally, who knows, with simpler code, it may just be that new insights into the problem being solved will be gained!