


by Matt Slot

Collaborative Projects

When writing a small program by yourself, it's easy enough to keep track of the project and all the source code. For some things, it's just a single source file and perhaps a header; for others, you carefully divide the functionality across several files to make things more manageable.

Soon enough, you learn that it's important to keep a backup of the work -- data loss happens, or you rip apart some code and just can't seem to put it back together. The easiest way to implement this safety net is simply to copy the entire directory. Eventually the hard disk (or hopefully a remote server volume) becomes a haven for various snapshots.

Okay, that's fine for small, single-person projects, but it's not a practical solution for real-world software development. Adding even one developer to the mix means that every assumption has to be reexamined, every design decision debated, and even the formatting of the source code clearly spelled out.

What a pain!

Well, it's not really that bad. Enough people have already charted this territory that there are many solutions and techniques for reducing conflicts. This month we'll discuss ways to improve the collaborative work of a closely knit, well-organized team of programmers (on a typical commercial software product). Internet and "open software" development strategies will be deferred to an upcoming article.

Software Design

Even for single-person projects, harvesting the input of other developers, and even *gasp* users, is the easiest way to massage the target software into a usable application and a feasible implementation. Identify as many goals as possible, even those that seem impractical at the start, so that the decision to pursue each one can be weighed against its cost in time and effort when the proper time comes.

Once the interface and feature set have been sketched out, it's time to plot out how large portions of the application will actually be written. For some tasks, the specifics are obvious and need little discussion; for others, it's prudent to seek the input of others before committing to a plan.

There are several key ingredients to this brainstorming process: several of your peers, a whiteboard, and a couple of hours. Most of all, don't be afraid to go back to the so-called drawing board when something sticky interferes with your progress. Sometimes the best path is to throw away a portion of already-working code, because a fresh start can improve it by an order of magnitude.

Writing Code

One of the most common conflicts between programmers on the same project is just what constitutes "readable" source code. Whitespace, type and variable names, commenting styles, and even file name conventions lead to circular debates. Here's my advice: Get over it! Apart from the need for plenty of comments, formatting of source code is an arbitrary and personal preference.

There are many other substantive issues that really need to be addressed early in development so that each contributor works from the same set of assumptions. (Note that I simply ignored the language issue, as in C vs C++. If you can't agree on that, you probably shouldn't be working together.)

Division of Labor

A single developer should lay out the software framework: implementing a core engine, organizing the basic user interface, and setting up the internal structure of the code. Once the software becomes "functional" at some simple level, it's divided into modules which are then assigned to additional team members (who then focus on features and optimization within a well-defined subset of code).

As each piece is improved, the "lead programmer" oversees integration of the changes. He adds to and optimizes the existing code at the architectural level, and may even reassign the other programmers as necessary to keep the whole project on track.

Memory Management

Any sufficiently advanced software project needs a well-defined strategy for managing memory allocations and deallocations. Within a given module, it's easy to track and release memory; however, when internal data structures are exchanged between modules, which one is responsible for releasing the memory? Now add in a few libraries, mixed C and C++ allocations, and copy constructors -- and things start getting messy.

The best approach is to use specific allocator and deallocator functions for each kind of data (especially data with internal structure). The module which requests a block is responsible for passing it to the appropriate cleanup routine. And always, ALWAYS check for memory leaks.
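As a minimal sketch of this idea in C (the FileInfo type and routine names here are hypothetical), pair every allocator with a matching deallocator, and route all cleanup through it:

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical record with internal structure: it owns its name buffer. */
    typedef struct {
        char *name;
        long  size;
    } FileInfo;

    /* Allocator: the only way to obtain a FileInfo. */
    FileInfo *FileInfo_Create(const char *name, long size)
    {
        FileInfo *info = (FileInfo *) malloc(sizeof(FileInfo));
        if (info == NULL)
            return NULL;

        info->name = (char *) malloc(strlen(name) + 1);
        if (info->name == NULL) {
            free(info);
            return NULL;
        }
        strcpy(info->name, name);
        info->size = size;
        return info;
    }

    /* Deallocator: frees the internal buffer, then the record itself. The
       module that requested the block is responsible for calling this. */
    void FileInfo_Dispose(FileInfo *info)
    {
        if (info == NULL)
            return;
        free(info->name);
        free(info);
    }

Because FileInfo_Dispose is the only routine that knows about the internal name buffer, other modules never have to guess how deep the cleanup goes.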

Utility Libraries

It's impossible to write software without relying on some system-level or third-party services -- they simplify some tasks, provide a common look and feel, and just plain add functionality. Unfortunately, there are a number of competing foundation libraries (memory, threads, data storage), and standardizing on one is often difficult or impossible -- especially for cross-platform software.

First decide exactly which services the application will need, then try to find libraries which provide those services conveniently on as many target platforms as possible. If a library does not work on a certain system, then you will probably need to find something similar or write your own. In fact, it's often best to "abstract" how the service is provided by writing wrapper routines that present a consistent interface. On each platform, the wrapper module then consists of either thin glue code over a native library or a full implementation of your own.
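As a rough sketch (the MyTime_Milliseconds name and the platform tests are my own invention, assuming a classic Mac OS build and a Unix build), the wrapper's header declares one interface, and the source file hides the per-platform glue:

    /* mytime.h -- one consistent interface, everywhere. */
    unsigned long MyTime_Milliseconds(void);

    /* mytime.c -- per-platform glue hidden behind the interface. */
    #if defined(macintosh)

    #include <Events.h>

    unsigned long MyTime_Milliseconds(void)
    {
        return TickCount() * 1000UL / 60;   /* Mac ticks are 1/60 second */
    }

    #elif defined(unix)

    #include <stddef.h>
    #include <sys/time.h>

    unsigned long MyTime_Milliseconds(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec * 1000UL + tv.tv_usec / 1000;
    }

    #else
    #error "MyTime_Milliseconds: no implementation for this platform yet"
    #endif

Callers include only mytime.h; porting to a new platform means touching one file, not every module that needs a clock.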

Error Handling

As mentioned in previous columns, error handling is often neglected until late in the design process. Just like memory management, each internal error must be tracked and passed up so that it can be resolved or the user can be notified. A consistent system of error propagation should be established early.

Decide which modules can fail, which are "fault tolerant", and when a propagated error should be displayed to the user. I prefer to pass errors up from the implementation functions to those of the user interface, so that an error while saving a document is intercepted and displayed at the document level (rather than at the file system level). This makes it easier to explain which action failed and why ("Unable to save 'Term Paper' because the disk is full" instead of just "Disk is full").
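As a minimal sketch in C (the error codes and function names are hypothetical), low-level routines return errors without comment, and the document level intercepts them to name the failed action:

    #include <stdio.h>

    /* Hypothetical shared error codes; a real project defines one set early. */
    typedef int Error;
    enum { kNoError = 0, kDiskFullError };

    /* Low-level routine: reports the failure, says nothing to the user. */
    static Error WriteBlock(const void *data, long length)
    {
        (void) data; (void) length;
        return kDiskFullError;      /* pretend the volume just filled up */
    }

    /* Document level: intercepts the error and explains which action failed. */
    Error SaveDocument(const char *title)
    {
        Error err = WriteBlock(NULL, 0);
        if (err == kDiskFullError)
            printf("Unable to save \"%s\" because the disk is full.\n", title);
        return err;
    }

Each layer either resolves the error or passes it upward, so the message shown to the user carries the context of the highest level that understood it.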

Revision Control

While formalizing the development process is important to generating good code, developers know that software projects require more than one pass. When you're the only programmer on a project, it's quite simple to make specific changes to the code in carefully organized steps. With several people touching the same code base, however, things are just not that simple.

The biggest problems are tracking down bugs across revisions, keeping pace with changing interfaces, and simply keeping the various modules synchronized and up to date. In fact, performing these tasks manually eats up significant programmer time that could be spent elsewhere -- and the overhead multiplies with each new project member.

The general solution to these problems is called "source code control" or "revision control." Basically, it's a formal and automated process for storing source code edits in a common repository and accessing them again. Everybody's work is stored in a single location, making it easy to search and synchronize "local" working copies with those on the "server."

By formalizing the process, incompatibilities tend to arise only at specific times instead of behind your back. One programmer can spend a month fleshing out a library, massaging the function declarations as necessary; another coder who depends on the library isn't plagued by problems with interim versions. Once the edits to the module are complete, the source can be "checked in" to the source database. Only when the second programmer is ready to "merge" his edits with the new version does he actually "check out" the new library.

Each developer is responsible for making sure his edits work with the rest of the source on the server before checking them in. Other developers can proceed with older, "working" code in a controlled manner until it's convenient to update. (Without such controls, debugging would suck because different modules might break while people edit them concurrently.)
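Most such systems follow the same checkout, edit, update, commit cycle. As a rough sketch using CVS, one common tool of this kind (the module and file names below are made up):

    % cvs checkout myproject        (grab a local working copy)
    % cd myproject
      ... edit terminal.c, fix the bug, test ...
    % cvs update                    (merge in everyone else's checked-in work)
    % cvs diff terminal.c           (review exactly what changed locally)
    % cvs commit -m "Plug leak in terminal buffer"

The update step is where incompatibilities surface -- at a time of your choosing, with the differences laid out in front of you.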

There are other benefits to source code control as well: it provides a revision-by-revision history of each source file (tremendously useful for tracking down new bugs), it lets you maintain several versions of the same code concurrently (say, a new release alongside a maintenance release), and it encourages developers to make changes in smaller and more manageable chunks.

Careful use of revision control and file comparison tools lets each developer control not only what changes between edits, but also when those changes are made. There is still an element of cooperation required to work on a project with others, but enforcing reasonable guidelines helps smooth the overall process. In fact, developers who've used revision control on large projects often start using it on small or solo projects as well.

Overall, these techniques reduce the complexity of large-scale projects to a manageable chaos. A little overhead spent streamlining the collaboration process is greatly outweighed by the costs it avoids: poorly integrated modules and disagreements between project members. Take the proactive approach and establish guidelines for each project, or even company-wide.

Matt Slot, Bitwise Operator




Copyright ©1995-8 by Ambrosia Software, Inc. - All rights reserved