Parnas (1972): On the Criteria for Decomposing Systems into Modules

This six-page paper changed how software engineers think about modules. Before Parnas, the accepted practice was to decompose a system by drawing a flowchart and making each major processing step a module. Parnas demonstrated, with a single concrete example, that this produces systems where every module depends on every other module’s internal data representation — making the system difficult to change, difficult to develop in parallel, and difficult to understand.

The alternative he proposed was deceptively simple: each module should hide a design decision. Not a processing step, not a subroutine — a decision about data representation or algorithm that is likely to change. The module’s interface exposes operations, not structure. If the hidden decision changes, only that module changes.

Four years later, Margaret Hamilton would formalize this insight into mathematics. Her Access axiom — a module can only read or write data that its position in the hierarchy grants it access to — is Parnas’s “secret” restated as a provable structural constraint. Parnas argued by example that information hiding works. Hamilton proved by axiom that it must.

The KWIC Index Example

Parnas chose a problem small enough to fit in a few paragraphs yet large enough to admit genuinely different decompositions: a KWIC (Key Word In Context) index production system.

Specification:

Input is a set of lines; each line is an ordered set of words; each word is an ordered set of characters
Each line can be “circularly shifted” — remove the first word and append it at the end
Output is all circular shifts of all lines, sorted alphabetically

For the input line “abcd efgh ijkl mnop”, the circular shifts are:

Shift
abcd efgh ijkl mnop
efgh ijkl mnop abcd
ijkl mnop abcd efgh
mnop abcd efgh ijkl

The problem is trivial to implement. The question is how to divide the implementation into modules.

Decomposition 1: Processing Steps

The conventional approach decomposes by flowchart — each module corresponds to a step in the processing pipeline:

Module	Responsibility
Input	Read data lines, store them in memory
Circular Shift	Produce an index array listing the first character position and original line for each circular shift
Alphabetizing	Sort the index array alphabetically
Output	Print the sorted circular shifts using the index array and stored lines
Master Control	Sequence the above

Every module knows how lines are stored in memory. Every module knows the format of the index array. The Input module decides the storage representation, and every subsequent module depends on that decision.

Decomposition 2: Design Decisions

Parnas’s alternative decomposes by information hiding — each module hides a design decision:

Module	Interface	Hidden Decision
Line Storage	`CHAR(l,w,c)` — get the $c$ th character of the $w$ th word of line $l$ ; `SETCHAR` — store a character; `WORDS(l)` — count words in line $l$	How lines are stored (packed array, linked list, disk-backed, etc.)
Input	Reads input, calls Line Storage	Input format
Circular Shifter	`CSCHAR(l,w,c)` — get the $c$ th character of the $w$ th word of the $l$ th circular shift; `CSSetup` — initialize	How circular shifts are represented (virtual index computation vs. physical copies)
Alphabetizer	`ITH(i)` — return the index of the $i$ th shift in alphabetical order; `ALSetup` — sort	Sorting algorithm
Output	Print the sorted shifts	Output format
Master Control	Sequence modules	Control flow

The critical difference: no module other than Line Storage knows how lines are represented in memory. Circular Shifter does not copy shifted lines — it computes shift indices on the fly through its CSCHAR function. Alphabetizer does not rearrange data — it produces an ordering. Each module’s interface is a set of operations, not a data structure.

Why Decomposition 2 Wins

Parnas evaluates on three dimensions:

Changeability

Consider changing how lines are stored in memory — say, from storing all lines in core to packing four characters per word, or reading from disk on demand.

Decomposition 1: Every module must change. Input stores differently, Circular Shift reads differently, Alphabetizing compares differently, Output formats differently.
Decomposition 2: Only Line Storage changes. Every other module calls CHAR(l,w,c) and gets the same character regardless of the underlying representation.

Consider changing the input format:

Both decompositions: only the Input module changes. This is a wash — both decompositions hide this decision in the same place.

The asymmetry is the point. Decomposition 1 localizes some changes but not others. Decomposition 2 localizes every change to the module that owns the corresponding decision.

Independent Development

In Decomposition 1, the interface between modules is the shared data representation itself — array layouts, index formats, character packing conventions. Every team must agree on these representations before any module can be implemented.

In Decomposition 2, the interface is a set of function signatures. Teams can implement modules in parallel, substituting stubs for modules that are not yet complete. The Line Storage module could be rewritten from a linked list to a B-tree without any other module knowing.

Comprehensibility

In Decomposition 1, understanding any module requires understanding the data representation shared across all modules. In Decomposition 2, understanding a module requires understanding only its interface and the interfaces of the modules it calls. The internal representation is invisible.

The Criterion

Parnas’s conclusion is a single sentence that became one of the most quoted in software engineering:

“We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others.”

This is the secret principle: every module has a secret — a design decision that only that module knows. The module’s interface is chosen so that the secret can change without affecting any other module. The designer’s job is to identify which decisions are likely to change and to draw module boundaries around them.

Hierarchical Structure

Parnas observes that Decomposition 2 naturally forms a hierarchy:

Level 2: Line Storage (lowest, used by everyone)
Level 3: Input and Circular Shifter (use Line Storage)
Level 4: Output and Alphabetizer (use Circular Shifter)
Master Control at the top

Higher levels use services of lower levels. Lower levels can be extracted and reused independently — levels 1 and 2 alone could support a different application entirely.

But Parnas is careful to separate two ideas. Hierarchical structure and information-hiding decomposition are independent properties. A conventional flowchart decomposition could also produce a hierarchy. The information-hiding criterion says what each module should contain (a design decision); the hierarchical structure says how modules relate (via levels of abstraction). The two often coincide, but they are not the same thing.

Hamilton’s Higher Order Software axioms (1976) formalize the insight that Parnas demonstrates by example. The connection is most direct through three axioms:

Axiom 2 (Input/Output): Every module’s interface is completely specified — no hidden data channels. This is Parnas’s functional interface requirement, stated as a structural rule.
Axiom 3 (Access): A module can only read or write data that its position in the hierarchy grants it access to. This is Parnas’s “secret” — no module reaches into another module’s private state.
Axiom 6 (State Change): If a module modifies data, the modification is visible only through its declared output interface. No side effects escape the module boundary.

Parnas argues empirically: here are two decompositions, one is better for these reasons. Hamilton proves axiomatically: if your system satisfies these six rules, interface errors are structurally impossible. Same insight, different rigor. The 75% interface error reduction that Hamilton and Hackler report in the 2008 USL paper is the quantitative validation of what Parnas proposed qualitatively in 1972.

The Five General Criteria

The paper closes with five specific rules for decomposition, each derived from the KWIC analysis:

Data structures with their access and modification procedures belong in a single module. Do not let multiple modules manipulate the same data structure directly.
The calling sequence for a routine and the routine itself belong in the same module. If calling a function requires format conversion (e.g., integer to string), that conversion code belongs with the function, not with the caller.
Control block formats should be hidden within a control block module. Queue headers, buffer descriptors, and similar bookkeeping structures change frequently and should not be part of inter-module interfaces.
Character codes, alphabetic orderings, and similar encoding decisions should be isolated in their own module. These are among the most variable design decisions in a system.
Processing sequence should be isolated in a single module. The order in which items are processed is highly variable and should be changeable without affecting the modules that process individual items.

These five criteria are specific instances of the general principle: hide what changes behind an interface that does not.

What This Paper Teaches

The paper’s argument is carried entirely by one example and three evaluation dimensions. There are no theorems, no proofs, no formal notation. The KWIC problem is small enough that a single programmer could implement either decomposition in an afternoon. And yet the paper fundamentally altered how software systems are designed, because the argument generalizes: every system of any size faces the same choice between decomposing by processing steps and decomposing by design decisions, and the second choice produces better systems on exactly the dimensions that matter for long-lived, team-developed software.

The paper also demonstrates something about the relationship between empirical argument and formal proof. Parnas convinced the field by showing two implementations of the same problem and comparing them concretely. Hamilton convinced the formal methods community by proving the same insight as theorems. Neither approach supersedes the other. Parnas’s example is accessible to any programmer. Hamilton’s axioms are necessary for automated verification. The insight — that module boundaries should follow information boundaries, not processing boundaries — belongs to both.

Hamilton & Zeldin (1976): Higher Order Software The axioms that formalize information hiding into provable structural properties

Hamilton & Hackler (2008): Universal Systems Language The capstone -- tracing the 75% interface error finding from Apollo through HOS to USL

Sha, Rajkumar & Lehoczky (1990): Priority Inheritance Protocols Companion theory paper -- formal analysis of real-time scheduling with shared resources

Space Software Engineering Collection overview and document inventory