A Formal Approach to Development of Network Protocols: Theory and Application to a Wireless Standard *

Mariusz A. Fecko  
Morristown, New Jersey, USA

M. Ümit Uyar  
Electrical Engineering Department  
The City College of the City University of New York, NY

Ali Y. Duale  
Engineering Systems Test, IBM  
Poughkeepsie, New York, USA

Paul D. Amer and Adarshpal S. Sethi  
Computer and Information Sciences Department  
University of Delaware, Newark, DE

Abstract

This paper presents the research effort to formally specify, develop, and test a complex real-life protocol for mobile network radios (MIL-STD 188-220). As a result, the team of researchers from the University of Delaware and the City College of the City University of New York, collaborating with scientists from CECOM (an R&D facility of the U.S. Army) and the U.S. Army Research Labs, has helped advance the state of the art in the design, development, and testing of wireless communications protocols. Estelle is used both as the formal specification language for MIL-STD 188-220 and as the source from which conformance test sequences are automatically generated.

The formal test generation effort identified several theoretical problems for wireless communication protocols (possibly applicable to network protocols in general): (1) the timing constraint problem, (2) the controllability problem, (3) the inconsistency detection and elimination problem, and (4) the conflicting timers problem. Based on the collaborative research results, two software packages were written to generate conformance test sequences for MIL-STD 188-220. These packages helped generate tests for MIL-STD 188-220’s Data Link Types 1 and 4 services that were realizable without timer interruptions while providing a 200% increase in test coverage. The test cases have been delivered and are being used by a CECOM conformance testing facility.

Key words: conformance testing, Estelle, formal description technique, formal specification, MIL-STD 188-220, protocol specification, test case generation

1 Introduction

The complexity of the wireless protocols used in MIL-STD 188-220, which is being developed for mobile combat network radios [20], necessitated a formal approach to protocol specification, development, and testing. Estelle was chosen as the formal specification language to define the protocols in MIL-STD 188-220, and the conformance tests were automatically generated from the resulting specifications.

The following figures convey the size and complexity of the wireless protocols used in 188-220. The Datalink and Network layer specifications consist of 69 and 19 documents, respectively, describing the architecture, interfaces, EFSM, and state table of each module. The Datalink layer specification is accompanied by three Estelle source code files (for Datalink classes A, B, and C) with approximately 1,600, 8,700, and 2,400 lines of code, respectively. The Estelle source code for the Network layer has 7,150 lines of code, defining 34 states and 370 transitions in 7 EFSMs (for details, consult www.cis.udel.edu/~amer/CECOM/).

Automatic test generation from Estelle specifications presented various theoretical problems defined as follows:

- **Timing constraint problem:** During testing, if active timers were not taken into account when the tests were generated, these timers can disrupt the test sequences, thereby failing correct implementations or worse, passing incorrect ones. For accurate testing, timers must be incorporated as constraints into the extended FSM (EFSM) model of an Estelle specification.
- **Controllability problem:** Test sequence generation is limited by the controllability of an Implementation Under Test (IUT) [8]. Testers may not have direct access to all interfaces at which the IUT accepts inputs. Typically, the interfaces with upper layers or with timers are difficult or impossible to access during real testing conditions. In this case, some inputs cannot be directly applied; the interactions involving such interfaces may render some portions of the protocol untestable, and may introduce non-determinism and/or race conditions during testing.
- **Inconsistency detection and elimination problem:** Infeasible test sequences may be generated unless possible conflicts among the protocol’s variables used in the actions and the conditions are avoided.
- **Conflicting timers problem:** Infeasible test sequences may result from a protocol’s variables modeling multiple timers that may be running simultaneously.

* This work was supported by the US ARO (DAAH04-94-G-0093), and prepared through collaborative participation in the Advanced Telecommunications/Info Dist’n Research Program (ATIRP) Consortium sponsored by the US Army Research Lab under Fed Lab Program, Cooperative Agreement DAAL01-96-2-0002.

The researchers and scientists who participated in this research and development effort are from the University of Delaware (UD), the City College of the City University of New York (CCNY), the Army Research Laboratory (ARL), the US Army Communications-Electronics Command (CECOM), and the Joint Combat Net Radio Working Group (CNR-WG). As a result of this collaboration, the synergistic framework to develop C^4I (Command, Control, Communications, Computers, and Intelligence) systems with the help of formal methods serves as a model for future U.S. Department of Defense networking standards development [22].

Based on the solutions to these theoretical problems, two software packages, (1) efsm2fsm-rcpt and (2) INDEEL, have been developed to automatically generate test cases from the EFSM models of Estelle specifications. The sizes of the resulting FSMs derived from the Estelle specifications range from 48 to 303 states, and from 119 to 925 transitions. The corresponding test sequences range from 145 to 2,803 test steps. These tests are free of interruptions due to unexpected timeouts, and the number of testable transitions they cover increased from approximately 200 to over 700 by utilizing multiple interfaces without controllability conflicts.

Section 2 of this paper presents a part of the Estelle specification of 188-220. Section 3 describes the general approach adopted at UD and CCNY to test generation from an Estelle formal specification, and summarizes research results in test generation based on formal specifications. Section 4 presents the efsm2fsm-rcpt and INDEEL software systems. Section 5 summarizes our practical test generation results. Finally, Section 6 presents the authors’ personal perspective on how formal methods improve the protocol development process in general.
2 Estelle Specification of MIL-STD 188-220

The Protocol Engineering Lab researchers at UD used Estelle to specify parts of the 188-220 protocol suite [3, 20, 14, 51]. 188-220, originally developed in 1993, evolved to 188-220A with substantial new functionality, including support for new radio technology and integration with Internet protocols (commercial IP, TCP, and UDP at the network and transport layers). Version 188-220B, whose architecture is depicted in Figure 1, describes the protocols needed to exchange messages using Combat Network Radio (CNR) as the transmission media. These protocols include the physical, data link and part of the network layer of the OSI model. The protocols apply to the interface between host systems and radio systems. Hosts usually include communications processors or modems that implement these lower layer protocols. The unshaded portions of Figure 1 indicate those protocols and extensions that were developed specifically for use with CNR.

MIL-STD-188-220 Datalink layer specifies several service types, each intended to handle different types of traffic with different quality of service (QoS) demands. A 188-220 station can actually process several different types of traffic simultaneously (and almost orthogonally). MIL-STD-188-220 Network Layer consists of Internet (IP) Layer, Subnetwork Dependent Convergence Function (SNDCF), and Intranet Layer. The Intranet Layer has been dedicated to routing intranet packets between a source and possibly multiple destinations within the same radio network. The Intranet Layer also accommodates the rapid exchange of topology and connectivity information—each node on the radio network needs to determine which nodes are on the network and how many hops away they are currently located.
2.1 Intranet Layer Architecture

Figure 2 shows the interface and general architecture of the Network layer. The architecture represents the protocol stack at a single station, as well as an interface with “operator module” which can interact with several different layers in the stack. The operator module abstracts the link layer’s interactions with both a human operator and a system management process.¹

¹ Note that the numbers in Figures 2 through 3 refer to interactions, and are consistent throughout the figures (e.g., number 12 refers to OP-min-update-per in...
Figure 3 shows the internal structure of the Intranet Layer. The two main Intranet Layer functionalities, Source Directed Relay (SDR) and Topology Update exchange (TU), were encapsulated in separate component modules of the Intranet Layer module. This simplifies the design of the FSMs that model the entire layer, and also allows for generating test cases for each functionality separately.

The SDR module receives IL_Unitdata_Req messages through SNDCF SAP interaction point. It starts/stops a varying number of END_END_ACK timers, one for each IP packet that has been sent but not yet acknowledged. The TU module interacts with the SDR module by notifying it of any topology changes that take place dynamically. The TU module communicates with two timers: Topology_Update_Timer and Topology_Update_Request_Timer. The former is started after a topology update message is sent by the station. According to 188-220A, a station is not allowed to send another topology update message until the timer expires. The latter performs the same role for topology update request messages.

Both the SDR and TU modules can send messages to and receive messages from the datalink layer through their lower_mux interaction points; the messages from the two modules are multiplexed by the parent Intranet Layer module. A peer operator or management component is connected directly to the Topology Update module and can set parameters that are relevant to the topology update mechanism. The part of the diagram inside the dashed rectangle contains modules that handle XNP procedures: joining and leaving the net with either centralized or distributed control, and parameter update requests.

3 Test Case Generation

Formal methods in communications protocol specification and conformance testing have been widely used in the design and testing of real-life protocols [7,17,18,30,41,42,86]. In particular, the Estelle formal description technique (FDT) [12,34,63,66] has been used on several occasions to resolve ambiguities within international protocols [9,15,40,55,64,79].

A number of techniques have been proposed to generate test sequences from Estelle specifications [48,49,67,68,84]. However, full Estelle specifications of large systems may prove to be too complex for direct test case generation. As shown in Figure 4, there are several ways of generating test sequences from Estelle specifications. One approach would be to expand Estelle’s EFSMs, thereby converting them to pure FSMs. This expansion would be useful since methods exist for generating tests directly from pure FSMs (e.g., [2]). Unfortunately, completely converting even a simple EFSM can result in the state explosion problem, that is, the converted FSM may have so many states and/or transitions that either it takes too long to generate tests, or the number of tests generated is too large for practical use.

As an alternative, the UD and CCNY ATIRP research group used an intermediate approach, where an Estelle EFSM is partially expanded (hence resulting in some more states and transitions), but not expanded completely to a pure FSM. The EFSM is expanded partially just enough to generate a set of tests that is feasible and practical in size. Determining which features to expand in the general case is the difficult aspect of this research.

**Test Case Generation Research:**

Conformance test generation techniques reported in literature [2, 8, 45, 52, 61, 68], using a deterministic finite-state machine (FSM) model of a protocol specification, focus on the optimization of the test sequence length. However, an IUT may have timing constraints imposed by active timers. If these constraints are not considered during test sequence generation, the sequence may not be realizable in a test laboratory. As a result, valid implementations may incorrectly fail the conformance tests, or nonconformant IUTs may incorrectly pass the tests.

Another problem in test sequence generation is due to the limited controllability of an IUT. Typically, the inputs defined for the interfaces with upper layers or with timers cannot be directly applied by the tester. In this case, the testability of an IUT may be severely reduced; in addition, non-determinism and/or race conditions may occur during testing.

When a test sequence is to be generated from an EFSM model, one must take into account that the variables used in the actions and conditions may require conflicting values for a given sequence. A test sequence becomes infeasible if one or more of its variables have conflicting values. Therefore, possible conflicts among the protocol’s variables used in the actions and the conditions must be avoided during test sequence generation.

Another concern in test sequence generation is the status of the different protocol timers at each state (e.g., running, stopped, started, etc.) and the relationship between timers and the actions that affect them (e.g., start, stop, re-start, or expiry of a timer, etc.). The so-called conflicting timers problem is that infeasible test sequences may be generated unless conflicting conditions based on timers are resolved.

The remainder of this section presents detailed definitions of these problems and outlines the research progress and the current results.

3.1 The Timing Constraint Problem

During testing, traversing each state transition of an IUT requires a certain amount of time. A test sequence that traverses too many self-loops (a self-loop is a state transition that starts and ends at the same state) in a given state will not be realizable in a test laboratory if the time to traverse the self-loops exceeds a timer limit as defined by another transition originating in this state. In this case, a timeout will inadvertently trigger forcing the IUT into a different state, and thereby disrupting the test sequence before all of the self-loops are traversed. If this unrealizable test sequence is not avoided during test generation, most IUTs will fail the test even when they meet the specification. Clearly, this is not the goal of testing. Therefore, a properly generated test sequence must take timer constraints into account.

Our research results optimize the test sequence length and cost, under the constraint that an IUT can remain only a limited amount of time in some states during testing, before a timer’s expiration forces a state change [74,75]. The solution first augments an original graph representation of the protocol FSM model. Then it formulates a Rural Chinese Postman Problem solution [50] to generate a minimum-length tour. In the final test sequence generated, the number of consecutive self-loops never exceeds any state’s specified limit. In most cases, this test sequence will be longer than one without the constraint since limiting the number of self-loop traversals likely requires additional visits to a state which otherwise would have been unnecessary.
The methodology uses UIO sequences for state verification. However, the results presented also are applicable to test generation that uses distinguishing or characterizing sequences. Earlier results of this study, limited to verification sequences that are self-loops, are presented in [74]. The later paper [75] generalizes these earlier results to both self-loop and non-self-loop verification sequences.

3.1.1 Practical Motivation

Examples of protocols that contain many self-loop transitions in their FSM models include ISDN Q.931 for supplementary voice services, MIL-STD 188-220 [20] for Combat Net Radio communication, and LAPD [76], the data link protocol for the ISDN's D channel. For example, in ISDN Q.931 protocol (Basic voice services, for the user side), each state has an average of 9 inopportune transitions, which requires the traversal of 18 self-loop transitions during testing. A Q.931 implementation has several active timers that are running in certain states, e.g., timer T304 running in state Overlap sending, and timer T310 in state Outgoing call proceeding. An EFSM modeling the Topology Update (TU) functionality of 188-220's Intranet Layer has three active states in which one or two timers are running [74].

It is not always possible to delay the timeout at a tester's convenience. In real protocols, there may be timers whose timeouts are difficult to set by the tester, e.g., acknowledgment timers' timeout values often are computed by the implementation. Moreover, a tester may want to test an IUT’s behavior for different settings of the IUT’s internal timers, to be able to test the IUT’s correctness for various configurations of the timers.

In addition to the original self-loops of a specification model, additional self-loops are typically created when generated test sequences use state verification techniques such as unique input/output (UIO) sequences [60], distinguishing sequences [6,44], or characterizing sequences [6,44].

3.1.2 Optimizing Tests under Timing Constraints

Let $E_{self}$ and $E_{nself}$ be the sets of self-loop and non-self-loop edges to be tested, respectively. Let $d_{self}(v_i)$, the number of self-loops of vertex $v_i$, be defined as the number of edges in $E_{self}$ incident on $v_i$. Let $d_{min\_self}(v_i)$ be the minimum number of times any tour covering all edges of $E_{nself} \cup E_{self}$ must include vertex $v_i \in V$.

Let $d_{state\_ver}(v_i)$ be the number of self-loop transitions used to verify whether an IUT is in state $v_i$. Suppose that during testing, a given vertex $v_i \in V$ can tolerate at most $max\_self(v_i)$ self-loops executed at one visit to vertex $v_i$.
Fig. 5. Conversion of $v_i$ in $G$ (part (a)), to $v'_i$ in $G'$ (part (b)) and to $v_i^{*(1)}, v_i^{*(2)}$ in $G^*$ (part (c)).

Attempting to remain in state $v_i$ to execute $1 + max\_self(v_i)$ self-loops would result in disruption of a test sequence. Testing a self-loop transition involves traversing the self-loop transition followed by applying the state verification self-loop sequence, which contains $d_{state\_ver}(v_i)$ transitions.

Due to space limitations, we are unable to include the detailed derivation of $d_{min\_self}(v_i)$. In [74], we prove that the minimum number of times vertex $v_i$ must be visited in a test sequence is as follows:

$$d_{min\_self}(v_i) = \begin{cases} d_{in}(v_i) & \text{if } d_{self}(v_i) \leq d_{in}(v_i) \times \Delta_1(v_i) \\ \Gamma(v_i) & \text{if } d_{self}(v_i) > d_{in}(v_i) \times \Delta_1(v_i) \end{cases}$$

(1)

where $d_{out}(v_i)$ and $d_{in}(v_i)$ are respectively the out-degree and the in-degree of vertex $v_i$ in $E_{cond}$, and where

$$\Gamma(v_i) = d_{in}(v_i) + \left\lceil \frac{d_{self}(v_i) - d_{in}(v_i) \times \Delta_1(v_i)}{\Delta_2(v_i)} \right\rceil$$

(2)

$$\Delta_1(v_i) = \left\lfloor \frac{max\_self(v_i) - d_{state\_ver}(v_i)}{1 + d_{state\_ver}(v_i)} \right\rfloor$$

(3)

$$\Delta_2(v_i) = \left\lfloor \frac{max\_self(v_i)}{1 + d_{state\_ver}(v_i)} \right\rfloor$$

(4)

$G'(V', E')$ ($G'$ is obtained from $G$ by removing self-loop edges) is converted to $G^*(V^*, E^*)$ by splitting each vertex $v'_i \in V'$ satisfying

$$d_{min\_self}(v_i) > \max(d_{in}(v_i), d_{out}(v_i))$$

(5)

into the two vertices $v_i^{*(1)}, v_i^{*(2)} \in V^*$ (Figure 5).
Fig. 6. Minimum-cost test sequence without self-loop repetition constraint.

Then, \( v_i^{*(1)} \) is connected to \( v_i^{*(2)} \) with a set of edges of cardinality \( d_{min\_self}(v_i) \):
\[
E_i = \bigcup_{v_j \in V'} g((v_i^{*(1)}, v_i^{*(2)}), d_{min\_self}(v_i)).
\]
Each edge in \( E_i \) is assigned infinite capacity \( \beta \) and a zero cost \( \psi \). These fake edges will force additional visits to \( v_i \) in a minimum-cost tour of \( G \).

We then use network flow techniques (similar to Aho et al. [2]) to maximize the flow on graph \( G' \) with minimum cost. This flow defines a minimum-cost tour of \( G \) under timing constraints.
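The visit count that drives the vertex splitting can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the brackets in Equations (3) and (4) denote the floor, and that $\Gamma(v_i)$ adds to $d_{in}(v_i)$ the ceiling of the leftover self-loops $d_{self}(v_i) - d_{in}(v_i) \times \Delta_1(v_i)$ divided by $\Delta_2(v_i)$. The function name `min_visits` and the degree values in the example are hypothetical; the degrees were chosen only so that the results agree with the $d_{min\_self}$ column of the example table that follows.

```python
def min_visits(d_in, d_self, max_self, d_state_ver):
    """Minimum number of visits to a vertex under the self-loop limit (sketch)."""
    if max_self < d_state_ver:
        raise ValueError("state verification alone exceeds max_self")
    # Delta_1: self-loops that fit into a visit after the incoming edge
    # has been verified (assumed reading of Equation (3)).
    delta1 = (max_self - d_state_ver) // (1 + d_state_ver)
    # Delta_2: self-loops that fit into a visit with the full budget
    # (assumed reading of Equation (4)).
    delta2 = max_self // (1 + d_state_ver)
    if d_self <= d_in * delta1:
        return d_in
    if delta2 == 0:
        raise ValueError("no self-loop can be tested within max_self")
    leftover = d_self - d_in * delta1
    return d_in + -(-leftover // delta2)  # ceiling division


# Hypothetical degrees (d_in, d_self); max_self and d_state_ver as in the example.
EXAMPLE = {
    "v0": (2, 1, 3, 1),
    "v1": (1, 2, 2, 1),
    "v2": (2, 2, 3, 2),
    "v3": (2, 1, 3, 1),
}

if __name__ == "__main__":
    for vertex, args in EXAMPLE.items():
        print(vertex, min_visits(*args))
```

A vertex is then a candidate for splitting whenever the returned value exceeds both its in-degree and its out-degree, as in condition (5).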

**Example**: Consider the FSM (represented by the graph \( G(V, E) \)) with self-loop transitions shown in Figure 6. Suppose that vertices \( v_0, v_2, \) and \( v_3 \) of the FSM can tolerate at most three, and \( v_1 \) at most two self-loop transitions during each visit. Let transitions \( e_{10} \) and \( e_{11} \) correspond to timeouts. After either \( e_{10} \) or \( e_{11} \) is triggered, the FSM is brought into state \( v_3 \).

UIO sequences and the values of \( max\_self, d_{state\_ver} \) and \( d_{min\_self} \) for vertices \( v_0, v_1, v_2, \) and \( v_3 \) are as follows:

<table>
<thead>
<tr>
<th>Vertex</th>
<th>UIO</th>
<th>\( max\_self \)</th>
<th>\( d_{state\_ver} \)</th>
<th>\( d_{min\_self} \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>\( v_0 \)</td>
<td>\( e_0 \)</td>
<td>3</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>\( v_1 \)</td>
<td>\( e_2 \)</td>
<td>2</td>
<td>1</td>
<td>3</td>
</tr>
<tr>
<td>\( v_2 \)</td>
<td>\( e_6, e_7 \)</td>
<td>3</td>
<td>2</td>
<td>4</td>
</tr>
<tr>
<td>\( v_3 \)</td>
<td>\( e_9 \)</td>
<td>3</td>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>

The Chinese postman method [70] when applied to the graph without any self-loop repetition constraint results in the test sequence

\[
\begin{align*}
e_0, e_0, e_1, e_2, e_2, e_2, e_{10}, e_9, e_9, e_9, e_{12}, e_0, e_1, e_3, e_2, e_4, e_6, e_7, \\
e_6, e_6, e_7, e_{11}, e_9, e_{12}, e_1, e_4, e_7, e_6, e_7, e_6, e_7, e_5, e_0
\end{align*}
\]
containing 34 edges. Edges used for the purpose of state verification appear in bold.

Fig. 7. Minimum-cost test sequence with self-loop repetition constraint.

As can be seen from the underlined part of the above test sequence, after $e_1$ is traversed, the IUT should stay in state $v_1$ for a time that allows at least three self-loop traversals. However, this part of the test sequence is not realizable in a test laboratory because the timeout edge $e_{10}$ will be triggered after the second consecutive self-loop traversal (i.e., $max\_self(v_1) = 2$). The IUT will prematurely move into $v_3$ and the test sequence will be disrupted.
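The disruption described above can be detected mechanically by counting consecutive self-loops per visit. The following sketch is illustrative only: the function name and the edge endpoints are assumptions (the figure is not reproduced here), with $e_0$ taken as a self-loop at $v_0$, $e_1$ leading from $v_0$ to $v_1$, $e_2$ as a self-loop at $v_1$, and $e_{10}$ leading from $v_1$ to $v_3$.

```python
def first_timeout_violation(sequence, edges, max_self):
    """Return (index, state) of the first step that exceeds max_self, or None.

    sequence: list of edge names in test order
    edges:    edge name -> (source state, destination state)
    max_self: state -> maximum consecutive self-loops tolerated per visit
    """
    run = 0  # consecutive self-loops in the current visit
    for index, name in enumerate(sequence):
        src, dst = edges[name]
        if src == dst:
            run += 1
            if run > max_self[src]:
                return index, src
        else:
            run = 0  # leaving the state ends the visit
    return None


# Assumed endpoints for the opening part of the unconstrained sequence.
EDGES = {
    "e0": ("v0", "v0"),
    "e1": ("v0", "v1"),
    "e2": ("v1", "v1"),
    "e10": ("v1", "v3"),
}
MAX_SELF = {"v0": 3, "v1": 2, "v3": 3}

if __name__ == "__main__":
    prefix = ["e0", "e0", "e1", "e2", "e2", "e2", "e10"]
    print(first_timeout_violation(prefix, EDGES, MAX_SELF))
```

Under these assumptions the third consecutive traversal of $e_2$ is flagged, which corresponds to the point where the timeout would fire in the laboratory.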

To address the problem of test sequence disruption due to timeouts, the graph of Figure 6 is converted to the graph shown in Figure 7. Since in this example all UIO sequences are self-loops, the simplified conversion presented in [74] is sufficient. The vertices for which a premature timeout may disrupt a test sequence, which are $v_1$ and $v_2$, are split and then connected by $d_{min\_self}(v_1) = 3$ and $d_{min\_self}(v_2) = 4$ edges, respectively.

Considering the constrained self-loop problem, the test sequence for the graph of Figure 7 is obtained as

$$e_0, e_0, e_1, e_2, e_{10}, e_9, e_9, e_9, e_{12}, e_0, e_1, e_2, e_4, e_6, e_7, e_{11}, e_9, e_{12}, e_1, e_3,$$

(7)

containing 40 edges.

Although longer than that of Figure 6, the test sequence in Figure 7 is minimum-length with the introduced self-loop constraint. During each visit to vertices $v_0, v_1, v_2$ and $v_3$, the number of consecutive self-loop edges traversed is less than or equal to the maximum allowed number of self-loop traversals. Therefore, this test sequence is realizable in the test laboratory.
3.2 The Controllability Problem

Consider a testing framework where the interface \( I_1 \) between the IUT and the (N+1)-layer in the System Under Test (SUT) [8] is not externally accessible (Figure 8). In other words, the inputs from the (N+1)-layer cannot be directly applied to the IUT, nor can the outputs generated by the IUT be observed at the (N+1)-layer. Such an interface \( I_1 \) is called *semicontrollable* if \( FSM_1 \) can be utilized to supply inputs to the IUT. On the other hand, the tester can apply inputs to the IUT directly by using a lower tester, which exchanges N-PDUs with the IUT by using the (N-1)-Service Provider. The interface \( I_0 \) between the lower tester and the IUT is therefore *directly controllable*.

Our approach addresses the problem of generating optimal realizable test sequences in an environment with multiple semicontrollable interfaces [25]. The methodology fully utilizes semicontrollable interfaces in an IUT while avoiding race conditions. An algorithm is introduced in [25] to modify the directed graph representation of the IUT such that its semicontrollable portions become directly controllable, where possible. In the most general case, such a graph conversion may produce an exponentially large number of nodes. However, it is shown in [25] that practical considerations, such as the small number of interfaces interacting with an IUT and diagnostic requirements, keep the problem size feasible in most practical cases.

3.2.1 Practical Motivation

As motivation for solving the controllability problem, a real protocol is considered where an SUT’s (N+1)-layer must be utilized indirectly to test certain transitions within the (N)-layer IUT.

188-220 focuses on 3 layers: Physical, Datalink, and Network. The Network layer contains an Intranet sublayer. An SUT contains the (N)-layer IUT implemented in the Datalink layer, and the Intranet sublayer, which is part of the (N+1)-layer, as shown in Figure 9.

In CECOM’s environment used for testing 188-220 implementations, the upper layers cannot be directly controlled. Therefore, the IUT’s transitions that are triggered by the inputs coming from the Network layer are not directly testable. An example SUT transition that causes a controllability problem is the transition \( t1 \) from the Class A-Type 1 Service Datalink module [20,23], shown in Figure 9. The *input/event* field for this transition requires a `DL-Unitdata.Req` from the (N+1)-layer. Unfortunately, the interface between the IUT and the (N+1)-layer is not directly accessible for generating this input. Initially, it appears that transition \( t1 \) is untestable.

To trigger this transition, which requires the (N+1)-layer to pass a `DL-Unitdata.Req` down to the (N)-layer, feedback from the (N+1)-layer must be used. To force a `DL-Unitdata.Req` from the (N+1)-layer, the tester sends a `PL-Unitdata.Ind` to the IUT (similar to the message \( a \) in Figure 8) that contains an intranet layer message telling the (N+1)-layer to relay the frame to a different network node. The IUT outputs this message to the (N+1)-layer (see message \( b \) in Figure 8), and the (N+1)-layer FSM responds by outputting the desired `DL-Unitdata.Req` (message \( c \) in Figure 8). Finally, the datalink layer generates the desired output `PL-Unitdata.Req` (corresponding to message \( d \) in Figure 8), which can be observed by the lower tester.

In fact, 70% of the transitions of the Class A-Type 1 Datalink Service module are triggered by inputs that are not directly controllable. Without indirect testing, test coverage would be seriously limited; only approximately 200 transitions out of 750 would be testable. However, by applying the technique outlined in this paper, over 700 of the defined transitions (>95%) can be tested. The application of the presented technique to 188-220 is described in more detail in [24].

Similar controllability problems can also be pointed out in testing the IEEE 802.2 LLC Connection Component [25,38].

3.2.2 Optimizing Tests with Multiple Semicontrollable Interfaces

To optimize tests with multiple semicontrollable interfaces, modeling SUT as a single FSM was proposed [25,26]. A semicontrollable interface \( I_i \) is implemented as a separate FIFO buffer. During testing, a buffer may be empty or store an arbitrary sequence of inputs to the IUT generated indirectly through \( I_i \). For each \( I_i \), we define variable \( \omega_i \) that has a distinct value for each permutation of inputs that the \( i \)-th buffer can hold. The proposed model consists of graph \( G \) (which represents the IUT’s FSM) and the variables \( \omega_1, \omega_2, \ldots, \omega_F \).

An FSM modeling the SUT can be obtained by expanding \( G \) and \( \omega_1, \omega_2, \ldots, \omega_F \) into \( G'(V', E') \). An algorithm for converting \( G(V, E) \) to \( G'(V', E') \) proceeds as follows (a detailed description of the algorithm along with its pseudocode is available in [25,26]):

Step 0—Definitions:
Let \( B_i \) denote a sequence of inputs buffered at the \( i \)-th semicontrollable interface. Each state \( v' \in V' \) has two components: the original state \( v \in V \), and the current configuration of \( F \) buffers, i.e., \( v' = (v, B_1, \ldots, B_F) \). The algorithm constructs all possible buffer configurations with up to \( b_i \) inputs buffered at \( I_i \).

Step 1—Initialize:
\( r' \), root of \( G' \), as \( (r, \emptyset, \ldots, \emptyset) \) (root of \( G \) and configuration of empty buffers); \( E' \) as empty set; \( V' \) as \( \{r'\} \); \( Q \), queue of vertices, as \( V' \).

Step 2—Repeat until \( Q \) is empty:
(1) extract \( v' = (v_{start}, B_1, \ldots, B_F) \) as the first element of \( Q \), where \( (B_1, \ldots, B_F) \) is the current configuration;
(2) given the current vertex \( v' = (v_{start}, B_1, \ldots, B_F) \), determine the class of each original outgoing edge \( e = (v_{start}, v_{end}) \in E \):
- **Class 1:** \( e \) is triggered by an input from and generates output(s) to an LT;
- **Class 2:** \( e \) is triggered by an input from an LT and generates an output \( o_{q,l} \) (buffered in \( B_q \) to create a new configuration) at \( I_q \);
- **Class 3:** \( e \) is triggered by \( a_{p,k} \) (extracted from \( B_p \) to create a new configuration) from \( I_p \) and generates output(s) to an LT;
- **Class 4:** \( e \) is triggered by an input \( a_{p,k} \) from \( I_p \) and generates an output \( o_{q,l} \) at \( I_q \). Apply the rules for Class 3 and Class 2 to create a new configuration.

Then, for each such edge and its new configuration \( (B_1, \ldots, B_F) \):
- create a new vertex \( v'_{new} = (v_{end}, B_1, \ldots, B_F) \in V' \) and a new edge \( e'_{new} = (v', v'_{new}) \in E' \);
- include the new edge in \( E' \) iff the inputs in \( (B_1, \ldots, B_F) \) cannot trigger other edges outgoing from \( v_{start} \);
- append to \( Q \) the end vertices \( v'_{new} \) of the new edges included in \( E' \).
Fig. 10. Classes of edges in $G'$ (dashed-lined outputs are optional).

Fig. 11. IUT interacting with two semicontrollable interfaces.

Step 3—Retain only strongly connected states:
remove from $V'$ all vertices from which $r'$ cannot be reached, and remove from $E'$ all edges incident to such vertices.

Based on the practical considerations discussed in [25], the algorithm can be refined to meet the following objective: "generate a test sequence that, at any point in time, avoids storing more than one input in only one of the buffers (where possible)." Satisfying this objective yields a linear running time in the number of semicontrollable interfaces and the number of edges in $G$. If this objective cannot be satisfied, the running time grows and nondeterminism may not be avoided during testing.
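The expansion steps above can be sketched as a breadth-first construction over (state, buffer configuration) pairs. This is a simplified illustration under stated assumptions, not the algorithm of [25,26]: the `Edge` record, the `expand` function, and the tiny four-edge FSM are hypothetical; each semicontrollable FSM is assumed to always return the expected response, which is stored directly in the edge's output field; and `bound` plays the role of \( b_i \), set to one input per buffer.

```python
from collections import deque, namedtuple

# trigger: ("LT", x) or ("IF", p, a); output: ("LT", y) or ("IF", q, a_response)
Edge = namedtuple("Edge", "name src dst trigger output")


def expand(edges, root, num_ifaces, bound=1):
    """Expand G into G' whose vertices are (state, buffer configuration)."""
    r = (root, tuple(() for _ in range(num_ifaces)))      # Step 1
    vertices, new_edges, queue = {r}, [], deque([r])
    while queue:                                           # Step 2
        v = queue.popleft()
        state, bufs = v

        def fireable(e):
            # e is triggered by the input at the head of its FIFO buffer
            return e.trigger[0] == "IF" and bufs[e.trigger[1]][:1] == (e.trigger[2],)

        outgoing = [e for e in edges if e.src == state]
        for e in outgoing:
            if e.trigger[0] == "IF" and not fireable(e):
                continue                                   # required input not buffered
            if any(fireable(o) for o in outgoing if o is not e):
                continue                                   # a buffered input could fire another edge
            new = list(bufs)
            if e.trigger[0] == "IF":                       # Classes 3 and 4: consume input
                p = e.trigger[1]
                new[p] = new[p][1:]
            if e.output[0] == "IF":                        # Classes 2 and 4: buffer response
                q = e.output[1]
                if len(new[q]) >= bound:
                    continue
                new[q] = new[q] + (e.output[2],)
            w = (e.dst, tuple(new))
            new_edges.append((v, e.name, w))
            if w not in vertices:
                vertices.add(w)
                queue.append(w)
    keep, frontier = {r}, deque([r])                       # Step 3: can still reach r'
    while frontier:
        t = frontier.popleft()
        for a, _, b in new_edges:
            if b == t and a not in keep:
                keep.add(a)
                frontier.append(a)
    return keep, [(a, n, b) for a, n, b in new_edges if a in keep and b in keep]


# Hypothetical FSM: e1 produces an output at interface 0 whose response triggers e3.
EDGES = [
    Edge("e1", "s0", "s1", ("LT", "x1"), ("IF", 0, "a")),
    Edge("e3", "s1", "s0", ("IF", 0, "a"), ("LT", "y3")),
    Edge("e6", "s1", "s1", ("LT", "x6"), ("LT", "y6")),
    Edge("e7", "s0", "s0", ("LT", "x7"), ("LT", "y7")),
]

if __name__ == "__main__":
    V, E = expand(EDGES, "s0", 1)
    print(sorted(name for _, name, _ in E))
```

In this toy model, `e6` never enters \( G' \): state `s1` is only reached with a response pending in the buffer, so applying `x6` there would race with the buffered input that triggers `e3`.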

Example: Consider the IUT of Figure 11 which is interacting with semicontrollable $FSM_1$ and $FSM_2$ through the semicontrollable interfaces $I_1$ and $I_2$, respectively. The IUT’s FSM (represented by graph $G$) is described in Table 1. Transition $e_1$, triggered by input $x_1$ from the lower tester, generates output $o_{1,1}$ to $FSM_1$. In response, $FSM_1$ sends input $a_{1,1}$ which triggers transition $e_3$. (In general, $a_{i,j}$ is the expected response to $o_{i,j}$.) Transition $e_2$, which is triggered by a lower tester's input
Table 1
Inputs and outputs for the edges of Figure 11. $A?x$ denotes receiving input $x$ from $A$. $B!y$ denotes sending output $y$ to $B$.

<table>
<thead>
<tr>
<th>Edge</th>
<th>Input</th>
<th>Output</th>
<th>Edge</th>
<th>Input</th>
<th>Output</th>
</tr>
</thead>
<tbody>
<tr>
<td>e1</td>
<td>$LT?x_1$</td>
<td>$FSM_1!o_{1,1}$</td>
<td>e6</td>
<td>$LT?x_6$</td>
<td>$LT!y_6$</td>
</tr>
<tr>
<td>e2</td>
<td>$LT?x_2$</td>
<td>$FSM_2!o_{2,1}$</td>
<td>e7</td>
<td>$LT?x_7$</td>
<td>$LT!y_7$</td>
</tr>
<tr>
<td>e3</td>
<td>$FSM_1 ?a_{1,1}$</td>
<td>$LT!y_3$</td>
<td>e8</td>
<td>$FSM_1 ?a_{1,2}$</td>
<td>$LT!y_8$</td>
</tr>
<tr>
<td>e4</td>
<td>$FSM_2 ?a_{2,1}$</td>
<td>$FSM_1!o_{1,2}$</td>
<td>e9</td>
<td>$LT?x_9$</td>
<td>$LT!y_9$</td>
</tr>
<tr>
<td>e5</td>
<td>$LT?x_5$</td>
<td>$FSM_2!o_{2,2}$</td>
<td>e10</td>
<td>$LT?x_{10}$</td>
<td>$LT!y_{10}$</td>
</tr>
</tbody>
</table>

![Graph diagram]

Fig. 12. Graph transformation applied to the graph of Fig. 11. Mandatory and optional edges appear in solid and dashed lines, respectively.

Table 2
Minimum-length test sequence for the IUT of Figure 11.

<table>
<thead>
<tr>
<th>Step</th>
<th>Edge</th>
<th>Input</th>
<th>Output</th>
<th>Step</th>
<th>Edge</th>
<th>Input</th>
<th>Output</th>
</tr>
</thead>
<tbody>
<tr>
<td>→ 1</td>
<td>e1.0</td>
<td>$LT?x_1$</td>
<td>$FSM_1!o_{1,1}$</td>
<td>8</td>
<td>e7.2</td>
<td>$LT?x_7$</td>
<td>$LT!y_7$</td>
</tr>
<tr>
<td>2</td>
<td>e5.1</td>
<td>$LT?x_5$</td>
<td>$FSM_2!o_{2,2}$</td>
<td>→ 9</td>
<td>e8.2</td>
<td>$FSM_1 ?a_{1,2}$</td>
<td>$LT!y_8$</td>
</tr>
<tr>
<td>→ 3</td>
<td>e3.1</td>
<td>$FSM_1 ?a_{1,1}$</td>
<td>$LT!y_3$</td>
<td>10</td>
<td>e7.0</td>
<td>$LT?x_7$</td>
<td>$LT!y_7$</td>
</tr>
<tr>
<td>→ 4</td>
<td>e6.0</td>
<td>$LT?x_6$</td>
<td>$LT!y_6$</td>
<td>→ 11</td>
<td>e5.0</td>
<td>$LT?x_5$</td>
<td>$FSM_2!o_{2,2}$</td>
</tr>
<tr>
<td>→ 5</td>
<td>e7.0</td>
<td>$LT?x_7$</td>
<td>$LT!y_7$</td>
<td>→ 12</td>
<td>e9.0</td>
<td>$LT?x_9$</td>
<td>$LT!y_9$</td>
</tr>
<tr>
<td>→ 6</td>
<td>e2.0</td>
<td>$LT?x_2$</td>
<td>$FSM_2!o_{2,1}$</td>
<td>13</td>
<td>e10.0</td>
<td>$LT?x_{10}$</td>
<td>$LT!y_{10}$</td>
</tr>
<tr>
<td>→ 7</td>
<td>e4.3</td>
<td>$FSM_2 ?a_{2,1}$</td>
<td>$FSM_1!o_{1,2}$</td>
<td>14</td>
<td>e6.0</td>
<td>$LT?x_6$</td>
<td>$LT!y_6$</td>
</tr>
</tbody>
</table>
$x_2$, outputs $o_{2,1}$ to $FSM_2$, which responds with input $a_{2,1}$ triggering $e_4$. Then $e_4$ outputs $o_{1,2}$ to $FSM_1$, which responds with $a_{1,2}$ triggering $e_8$. On the other hand, transitions $e_5$, $e_6$, $e_7$, $e_9$, and $e_{10}$ can be triggered directly by the lower tester. Transitions $e_6$, $e_7$, $e_9$, and $e_{10}$ do not generate outputs to the semicontrollable interfaces, whereas $e_5$ generates output $o_{2,2}$ to $FSM_2$, which does not send any input to the IUT.

After conversion (Figure 12), each state of $G$ is replaced with at most four related states in $G'$ corresponding to the buffer configurations at a semicontrollable interface. Each edge $e$ is annotated as $e.x$, where $x = 0, 1, 2, 3$, depending on the input buffered in the $e.x$'s start state, as shown in Figure 12. The solid edges in Figure 12 are the mandatory edges that are incident to nodes that correspond to the case where both buffers are empty; the dashed-line edges are the ones that can be traversed only when either buffer contains an input. Due to the practical diagnostic considerations [25], we prefer testing edges when no inputs are buffered in semicontrollable interfaces. The Aho et al. [2] optimization technique gives the minimum-length test sequence for $G'$ shown in Table 2. Steps with $(\rightarrow)$ indicate that an edge is tested in this step. Note that, for simplicity, the UIO sequences [60] are not included in this sequence.

### 3.3 Inconsistency Detection and Elimination Problem

Feasible test sequence generation is essential for assuring the proper operation and interoperability of different components in computer and communication systems. The use of formal description languages such as VHDL and Estelle enables the precise description of such systems and helps minimize implementation errors due to misinterpretations. However, specifications written in VHDL and Estelle are often extended FSMs (EFSMs), which makes automated test generation more complex due to possible inconsistencies among the action and condition variables [21].

We studied the problem of generating feasible test sequences for the EFSM by analyzing the interdependencies among the action and condition variables of the EFSM models. In the earlier phases of this research, action and condition inconsistencies in the EFSM models were defined [77,78]. It has been shown that once the inconsistencies are eliminated, the existing finite-state machine (FSM)-based test generation methods can be used to generate feasible test sequences from the resulting consistent EFSM graphs.

The algorithms for the detection and elimination of conflicts in EFSM models utilize symbolic execution, linear programming, and graph splitting methods. After all conflicts are eliminated, all paths of the resulting EFSM graph are feasible and can be used as input to FSM-based test generation methods. The basic concepts of the inconsistency elimination algorithms were outlined in [78] and later generalized to include graphs with loops [73]. The formal descriptions of the inconsistency detection and elimination algorithms are given in [21].

### 3.3.1 Action Conflicts

If there is no solution for the set of equations formed by the actions of an edge $e_i$ and the condition of another edge $e_j$, where $\text{head}(e_j)$ can be reached from $\text{tail}(e_i)$ or $\text{head}(e_j) = \text{tail}(e_i)$, then the edges $e_i$ and $e_j$ are said to have an action conflict.

For example, in Figure 13, there is an action conflict between the action of $e_1$ and the condition of $e_8$ due to variable $b$.

In general, the effects of the edge actions on variables (i.e., variable modifications) can be represented as matrices. For an EFSM graph with $m$ variables, $\text{var}_1, \text{var}_2, \ldots, \text{var}_m$, a pair of matrices $A(m \times m)$ and $\bar{B}(m \times 1)$ called the modification matrix and the modification vector, respectively, are defined.

The accumulated effects of the actions in the paths leading to a node $v_i$ can be represented by a set of Action Update Matrix pairs defined as:

$$
\text{AUM}(v_i, J) = \{A_{v_i,0}, \bar{B}_{v_i,0}, A_{v_i,1}, \bar{B}_{v_i,1}, \ldots, A_{v_i,J-1}, \bar{B}_{v_i,J-1}\} \quad (8)
$$

where $A_{v_i,k}$, $\bar{B}_{v_i,k}$, and $J$ are the $k^{th}$ modification matrix, $k^{th}$ modification vector ($0 \leq k < J$), and the number of AUM pairs associated with $v_i$, respectively. The symbolic values of a variable $\text{var}_r$ are represented in the $r^{th}$ rows in $\text{AUM}(v_i, J)$. Only one AUM pair, where $A$ and $\bar{B}$ are initialized to the identity matrix and to a zero vector, respectively, is created for the initial node.

The number of AUM pairs associated with $v_i$ solely depends on the number of different ways in which the actions of the edges leading to $v_i$ modify variables. If the overall variable modifications of the actions of any two paths leading to $v_i$ are the same, only one AUM pair is sufficient to account for the effects of the actions in the two paths. Therefore, only unique AUM pairs are associated with $v_i$. See Section 3.3.3.
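
To illustrate how AUM pairs accumulate along a path and why duplicates can be discarded, the following Python sketch composes linear actions of the form $\bar{V} \leftarrow A_e \bar{V} + \bar{B}_e$. The composition rule (new $A = A_e A$, new $\bar{B} = A_e \bar{B} + \bar{B}_e$) is the standard composition of affine maps and is an assumption about how the pairs are updated; the helper names and the two-variable example are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_vec(X, v):
    """Product of a matrix and a vector."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

def identity_pair(m):
    """AUM pair of the initial node: identity matrix and zero vector."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)], [0] * m

def apply_action(pair, action):
    """Compose an accumulated pair (A, B) with an edge action (A_e, B_e)."""
    A, B = pair
    A_e, B_e = action
    return mat_mul(A_e, A), [x + y for x, y in zip(mat_vec(A_e, B), B_e)]

def unique_pairs(pairs):
    """Drop AUM pairs that describe identical overall modifications."""
    seen, out = set(), []
    for A, B in pairs:
        key = (tuple(map(tuple, A)), tuple(B))
        if key not in seen:
            seen.add(key)
            out.append((A, B))
    return out

# Variables (a, b). Hypothetical actions: "a := a + 1" and "b := 0".
inc_a = ([[1, 0], [0, 1]], [1, 0])
zero_b = ([[1, 0], [0, 0]], [0, 0])

path1 = apply_action(apply_action(identity_pair(2), inc_a), zero_b)
path2 = apply_action(apply_action(identity_pair(2), zero_b), inc_a)
print(path1)                              # ([[1, 0], [0, 0]], [1, 0])
print(len(unique_pairs([path1, path2])))  # 1
```

Both paths leave the variables as $(a + 1, 0)$, so a single AUM pair represents them.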

Symbolic execution is utilized in the construction of AUM pairs. When an action conflict is detected, the EFSM graph is split from the node where the conflict occurs. The analysis continues until all action conflicts are eliminated. Applying the algorithms for action conflict detection and elimination presented in [73,21] to the graph of Figure 13 yields the EFSM graph shown in Figure 14.

---

2 For an edge directed from node $v_a$ to node $v_b$, the head and tail nodes are defined as $v_a$ and $v_b$, respectively.

![EFSM Graph with Conflicts](image)

**Fig. 13.** An EFSM graph with conflicts

### 3.3.2 Condition Conflicts

After all action conflicts are eliminated from the EFSM graph, the next step involves the detection and elimination of condition conflicts. The edges \( e_i \) and \( e_j \) are said to have a condition conflict if there is no solution for the set of equations formed by the accumulated conditions of the edges of a sub-path \( e_1 \cdot e_2 \cdot \ldots \cdot e_i \) and an edge \( e_j \), where \( head(e_j) \) can be reached from \( tail(e_i) \) or \( head(e_j) = tail(e_i) \).

In Figure 14, for example, there is a condition conflict between edges \( e_{5(0)} \) and \( e_{4(1)} \), since each edge requires a conflicting value of variable \( c \). Figure 15 shows the final conflict-free EFSM graph.

Fig. 14. The resulting EFSM graph after splitting the graph of Figure 13 due to the action of $e_1$ and the condition of $e_8$.

Since the conditions of the edges of a test sequence constitute a system of constraints, a simplified version of linear programming algorithms can be used to decide whether a given path predicate is feasible [1]. The edge conditions in a path from the starting node $v_0$ to a node $v_i$ can be represented in matrices. A triplet of matrices is defined as $C$ $(m \times p)$, $OP$ $(p \times 1)$, and $D$ $(p \times 1)$, where $m$ is the number of variables, $p$ is the number of conditions in the path from $v_0$ to $v_i$, $C$ is the coefficient matrix, $OP$ is the operator vector containing relations such as $=$, $<$, $>$, and $\neq$, and $D$ is the scalar vector containing the scalar values of the conditions in the path.

The AUM pairs discussed in Section 3.3.1 are applied to the edge conditions of the EFSM graph as follows. A single condition of an edge $e_r = (v_i, v_j)$ is of the form $\bar{C}\,\bar{V}\;(OP)\;\bar{D}$. The condition of $e_r$ is modified based on the symbolic values of the variables $var_0$ through $var_{m-1}$, which are represented by $\text{AUM}(v_i, J)$. As described in Section 3.3.1, the current values of the variables, including all the modifications represented by an AUM pair of \( v_i \), are of the form \( \bar{V} = A_{v_i,k} \bar{V} + \bar{B}_{v_i,k} \). Substituting these values in an edge condition results in \( \bar{C}(A_{v_i,k} \bar{V} + \bar{B}_{v_i,k})\;(OP)\;\bar{D} \), which simplifies to \( \bar{E}\,\bar{V}\;(OP)\;f \), where \( \bar{E} = \bar{C} A_{v_i,k} \) is an \( m \)-element vector and \( f = \bar{D} - \bar{C}\,\bar{B}_{v_i,k} \) is a scalar. An edge \( e_r = (v_i, v_j) \) whose condition is infeasible based on the AUM pairs of \( v_i \) is deleted from the graph. The values assumed by the variables used in the condition of \( e_r \) can be determined from:

\[
C \bar{V} = C (A_{v_i,k} \bar{V} + \bar{B}_{v_i,k}) \quad (9)
\]

where \( C \) is the coefficient matrix for the condition of \( e_r \) and \( 0 \leq k < J \), where \( J \) is the number of AUM pairs associated with \( v_i \).
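
The substitution of an AUM pair into a single edge condition can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: a condition is a coefficient row `C`, a relational operator, and a scalar `D`; the rewritten condition has coefficients $C A$ and right-hand side $D - C B$ (obtained by moving the constant term to the right); and the function names and the example values are hypothetical. Only the trivial case in which all coefficients vanish is decided here; the general case would require the linear programming step.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt, "!=": operator.ne}

def substitute(C, op, D, A, B):
    """Rewrite 'C.V op D' after the variables were updated to A.V + B."""
    m = len(C)
    E = [sum(C[k] * A[k][j] for k in range(m)) for j in range(m)]  # E = C A
    f = D - sum(C[k] * B[k] for k in range(m))                     # f = D - C.B
    return E, op, f

def trivially_infeasible(E, op, f):
    """True if the condition reduced to the false constant test '0 op f'."""
    return all(e == 0 for e in E) and not OPS[op](0, f)

# Variables (a, b); the accumulated actions set b := 0 and leave a unchanged.
A = [[1, 0], [0, 0]]
B = [0, 0]

cond_b = substitute([0, 1], ">", 1, A, B)  # condition "b > 1"
cond_a = substitute([1, 0], ">", 1, A, B)  # condition "a > 1"
print(cond_b, trivially_infeasible(*cond_b))  # ([0, 0], '>', 1) True
print(cond_a, trivially_infeasible(*cond_a))  # ([1, 0], '>', 1) False
```

In this toy case the condition on `b` can never hold after `b := 0`, which is the kind of situation in which the corresponding edge would be deleted.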

The accumulated different conditions of the paths leading to \( v_i \) can be represented by a set of Accumulated Condition Matrix (ACM) triplets: ACM\((v_i, J) = (C_{v_i,0}, OP_{v_i,0}, D_{v_i,0}, C_{v_i,1}, OP_{v_i,1}, D_{v_i,1}, \ldots, C_{v_i,J-1}, OP_{v_i,J-1}, D_{v_i,J-1}) \), where \( C_{v_i,k}, OP_{v_i,k}, D_{v_i,k} \), and \( J \) are the \( k \)-th coefficient matrix, \( k \)-th operator matrix, \( k \)-th scalar value matrix \((0 \leq k < J)\), and the number of the ACM triplets associated with \( v_i \), respectively.

### 3.3.3 Complexity of the Algorithms

The action inconsistency detection and elimination algorithms use a two-phase modified breadth-first graph traversal, called P1-MBF and P2-MBF. P1-MBF is the main graph traversal from which P2-MBF may be invoked multiple times. During the condition inconsistency detection phase, the graph is traversed in a regular depth-first manner.

The complexity of action conflict detection and elimination is determined by two factors: the two-phase MBF graph traversal, and the construction of the AUM pairs, which is carried out for each node, for each incoming edge, and for each AUM pair.

The complexity for the two-phase MBF graph traversal is \( O(E^2) \) [19]. For each node \( v_i \), the number of AUM pairs is \( \sum_{j=1}^{V_i-1} |E_{ij}^{v_j \rightarrow v_i}| \times |\text{AUM}(v_j, J)| \) (where \( |E_{ij}^{v_j \rightarrow v_i}| \) is the number of edges from \( v_j \) to \( v_i \)).

The complexity for the condition conflict detection and elimination is bounded by the number of AUM pairs of each node and executing the linear programming for each edge. A simplified version of linear programming, which is used to eliminate infeasible conditions, takes \( \min(m^2, S^2) \) steps where \( m \) is the number of variables and \( S \) is the number of constraints [1].

Therefore, in the general case, the complexity of the algorithms for handling action conflicts is exponential with respect to the number of simple paths (i.e., the number of AUM pairs). Similarly, condition conflict elimination can be exponential with respect to the number of simple paths. However, based on our experience with several protocols (including nested and/or concatenated loops), the complexity of both algorithms and, hence, the size of the final conflict-free graph are bounded by the number of different values each condition variable assumes as it is used in the conditions.

### 3.4 The Conflicting Timers Problem

To ensure feasibility of tests in a laboratory, automated test generation for network protocols with timer requirements must consider conflicting conditions based on a protocol’s timers. Our ATIRP research developed a new model for testing real-time protocols with multiple timers, which captures complex timing dependencies by using simple linear expressions involving timer-related variables. Similar dependencies, but based on arbitrary linear variables, are present in EFSM models of VHDL specifications [71]. Uyar and Duale present algorithms for detecting [71] and removing [21,78] such inconsistencies in VHDL specifications. The new modeling technique, combined with the inconsistency removal algorithms, is expected to significantly shorten test sequences without compromising their fault coverage.

The model, specifically designed for testing purposes, avoids performing a full reachability analysis and significantly limits the explosive growth of the number of test scenarios. These goals are achieved by incorporating certain rules for the graph traversal without reducing the set of testable transitions. The technique also models a realistic testing framework in which each I/O exchange takes a certain time to realize, and a tester has an ability to turn timers on and off in arbitrary transitions and to algorithmically find proper timeout settings.

The methodology presented in this paper is expected to detect transfer and output faults [47], where an IUT moves into a wrong state (a state other than the one specified) or generates a wrong output (an output other than the one specified) to a given input, respectively. The detection of transfer faults can significantly be improved by using the well-known state verification methods such as UIO sequences, characterization sets, or distinguishing sequences. These techniques should be applied while generating a minimum-cost test sequence from the final conflict-free graph.

The proposed solution is likely to have a broader application due to a proliferation of protocols with real-time requirements. The functional errors in such protocols are usually caused by the unsatisfiability of time constraints and (possibly conflicting) conditions involving timers; therefore, significant research is required to develop efficient algorithms for test generation for such protocols. Our methodology is expected to contribute towards achieving this goal. The preliminary results are reported in [28].

In the test cases delivered to CECOM (see Section 5), conflicting conditions based on 188-220's timers are resolved by manually expanding EFSMs based on the set of conflicting timers. This procedure results in test sequences that are far from minimum-length. The technique presented here allows us to automatically generate conflict-free test sequences for 188-220.

Suppose that a protocol specification defines a set of timers $K = \{tm_1, \ldots, tm_{|K|}\}$, such that a timer $tm_j$ may be started and stopped by arbitrary transitions defined in the specification. Each timer $tm_j$ can be associated with a boolean variable $T_j$ whose value is true if $tm_j$ is running, and false if $tm_j$ is not running. Let $\varphi$ be a time formula obtained from variables $T_1, \ldots, T_{|K|}$ by using the logical operators $\land$, $\lor$, and $\neg$. Suppose that a specification contains transitions with time conditions of the form “if $\varphi$” for some time formula $\varphi$. Infeasible paths may then exist in an FSM modeling a protocol if two or more edges in a path have inconsistent conditions. For example, for transitions $e_1$: if $(T_j)$ then $\{\varphi_1\}$ and $e_2$: if $(\neg T_j)$ then $\{\varphi_2\}$, a path $(e_1, e_2)$ is inconsistent unless the action $\varphi_1$ in $e_1$ sets $T_j$ to false (which happens when timer $tm_j$ expires in transition $e_1$). The solution to the above problem is expected to allow generating low-cost tests free of such conflicts.
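
The $(e_1, e_2)$ example can be checked mechanically. The Python sketch below is a toy illustration, not the algorithm used in this work: each edge is modeled as a hypothetical (condition, action) pair over a dictionary of boolean timer variables, and a path is called feasible if some initial assignment of the timer variables lets every condition hold in turn.

```python
from itertools import product

def path_is_feasible(path, timer_names):
    """Try every initial timer assignment; simulate conditions and actions."""
    for values in product([True, False], repeat=len(timer_names)):
        state = dict(zip(timer_names, values))
        ok = True
        for condition, action in path:
            if not condition(state):
                ok = False
                break
            action(state)
        if ok:
            return True
    return False

def no_action(state):
    """Action that leaves all timer variables unchanged."""

def expire_tj(state):
    """Action modeling the expiry of timer tm_j."""
    state["T_j"] = False

e1_keeps_timer = (lambda s: s["T_j"], no_action)
e1_expires_timer = (lambda s: s["T_j"], expire_tj)
e2 = (lambda s: not s["T_j"], no_action)

print(path_is_feasible([e1_keeps_timer, e2], ["T_j"]))    # False
print(path_is_feasible([e1_expires_timer, e2], ["T_j"]))  # True
```

Exhaustive enumeration is exponential in the number of timers and is used here only to keep the example short.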

188-220’s Datalink Layer Estelle specification defines several timers that can run concurrently and affect the protocol’s behavior. For example, $BUSY$ and $ACK$ timers may be running independently in $FRAME\_BUFFERED$ state. If either timer is running, a buffered frame cannot be transmitted. If $ACK$ timer expires while $BUSY$ timer is not running, a buffered frame is retransmitted. If, however, $ACK$ timer expires while $BUSY$ timer is running, no output is generated. Besides Estelle specifications, feasibility constraints related to multiple concurrent timers are also of special concern for specifications in SDL.
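
The interplay of the two timers in the $FRAME\_BUFFERED$ state can be summarized by two small predicates. This Python sketch only restates the three rules given above; the function names and the returned string are hypothetical and do not come from the Estelle specification.

```python
def can_transmit(busy_running, ack_running):
    """A buffered frame may be sent only if neither timer is running."""
    return not (busy_running or ack_running)

def on_ack_timeout(busy_running):
    """Reaction to ACK timer expiry; None means no output is generated."""
    return None if busy_running else "retransmit buffered frame"

print(can_transmit(False, False))  # True
print(can_transmit(True, False))   # False
print(on_ack_timeout(False))       # retransmit buffered frame
print(on_ack_timeout(True))        # None
```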

The conflicting timers problem is a special case of the feasibility problem of test sequences, which is an open research problem for the general case [29,69]. However, there are two simplifying features of the conflicting timers problem: (1) timer-related variables are linear, and (2) the values of time-keeping variables implicitly increase with time. Considering these features makes it possible to find an efficient solution to this special case.

### 3.4.1 General approach

The goal of the presented technique is to achieve the following fault coverage:

*Cover every feasible state transition defined in the specification at least once.*

During the testing of a system with multiple timers, when a node $v_p$ is visited, an efficient test sequence should either (1) traverse as many self-loops (i.e., transitions that start and end in the same state) as possible before a timeout or (2) leave $v_p$ immediately through a non-timeout transition. Once the maximum allowable number of self-loops is traversed, a test sequence may leave $v_p$ through any outgoing transition. Such an approach does not perform a full reachability analysis; however, it can be shown that considering only the above two cases is sufficient to include at least one feasible path for each transition, provided such a feasible path is not prohibited by the original specification.

Suppose that there are 15 untested self-loops (each requiring 1 sec to test) in state $v_{57}$, and that, when the test sequence visits $v_{57}$, the earliest timer to expire is $tm_{4}$, with 10.5 sec remaining until its timeout. In this example, the test sequence will either leave $v_{57}$ immediately or traverse 10 of the untested self-loops. Suppose that the latter option is chosen and, later during the test sequence traversal, $v_{57}$ is visited again with $tm_2$ leaving 3.1 sec until the earliest timeout. In this case, 3 more untested self-loops of $v_{57}$ can be covered by the test sequence. Traversal will continue until all of $v_{57}$’s self-loops are tested.

In more complicated cases, in addition to the aforementioned timing constraints, traversal of a self-loop requires that its associated time condition be satisfied, i.e., certain timers be active (or, similarly, other timers be inactive). These time conditions will also be taken into account while selecting which self-loops to traverse. In the above example, if 6 or more self-loops of $v_{57}$ have ‘$tm_4$ not running’ as their time condition, the test sequence, which tries to execute 10 of the untested self-loops, will cause a timer conflict due to the unsatisfiability of the time condition.
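
The arithmetic of the $v_{57}$ example, including the effect of time conditions, can be reproduced with a short Python sketch. It assumes that every self-loop takes the same time to test and that a time condition only requires certain timers to be stopped; the function name `plan_self_loops` and the data layout are hypothetical.

```python
import math

def plan_self_loops(loops, running, time_left, cost=1.0):
    """Pick untested self-loops to traverse before the earliest timeout.

    loops     -- list of (name, timers_that_must_be_stopped) pairs
    running   -- set of timers currently running
    time_left -- seconds until the earliest timeout
    cost      -- seconds needed to test one self-loop
    """
    budget = math.floor(time_left / cost)
    eligible = [name for name, must_be_stopped in loops
                if not (must_be_stopped & running)]
    return eligible[:budget]

# 15 self-loops without time conditions: 10 fit before tm4 expires.
unconditional = [(f"loop{i}", set()) for i in range(15)]
first_visit = plan_self_loops(unconditional, {"tm4"}, 10.5)
print(len(first_visit))   # 10

# On the later visit, 3.1 sec remain and 5 self-loops are still untested.
remaining = unconditional[len(first_visit):]
second_visit = plan_self_loops(remaining, {"tm2"}, 3.1)
print(len(second_visit))  # 3

# If 6 self-loops require 'tm4 not running', only 9 are eligible.
conditional = [(f"loop{i}", {"tm4"} if i < 6 else set()) for i in range(15)]
print(len(plan_self_loops(conditional, {"tm4"}, 10.5)))  # 9
```

In the last case only 9 self-loops are eligible while $tm_4$ runs, so a plan that insists on 10 cannot satisfy all time conditions.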

In general, the goal of an optimization is to generate a low-cost test sequence that follows the above guidelines, satisfies time conditions of all composite edges and is not disrupted by timeout events during traversal (i.e., contains only feasible transitions).

Similar inconsistencies, but based on arbitrary linear variables, are present in EFSMs modeling VHDL specifications. ATIRP researchers Uyar and Duale presented algorithms for detecting [71] and removing [72] inconsistencies in VHDL specifications. Recent research at UD and CCNY focused on adapting these algorithms to detecting and removing inconsistencies caused by a protocol’s conflicting timers. The software implementation of these algorithms developed within ATIRP is described in the next section.

## 4 Software for Automated Test Generation

The process of generating tests involved the development of two systems of software: (1) efsm2fsm-rcpt, and (2) INDEEL. These two systems are now described in turn.

### 4.1 efsm2fsm-rcpt

Figure 16 depicts the major software components that were developed to generate test sequences from an EFSM [27]. The software contains two packages: (1) efsm2fsm, and (2) rcpt. The former was designed and implemented at UD. The latter was based on the software written at CCNY, which originally was able to handle graphs of at most 100 transitions in a plain input/output format, without any of the additional parameters specifically required for 188-220B tests. This component was enhanced to generate tests for 188-220B in CECOM’s proprietary format. Also, the software was significantly redesigned to process large graphs (thousands of transitions), which enabled its application to more complex real-life protocols.

![Diagram of software for automated test generation](image)

**Fig. 16.** Software for automated test generation.

### 4.1.1 *efsm2fsm*

*efsm2fsm* takes a protocol’s EFSM representation as input and performs its expansion to an FSM. Each EFSM’s transition is associated with the following parameters: transition name in the Estelle specification, transition description, start and end states, input and output names, numerical values specifying the corresponding fields in 188-220B’s PDUs, and changes in the variables’ values (i.e., start and end configurations). To express the start and end configurations, a simple notation was defined. In the potential future work on this package, it is essential that this notation be replaced with a different one, which should be more expressive and flexible.

To facilitate creating the input to *efsm2fsm*, spontaneous transitions are allowed to be specified in the input EFSM. These transitions are then concatenated with regular transitions (i.e., triggered by an external input) to eliminate spontaneous transitions from the resulting FSM. This procedure can be briefly described as follows. Suppose that in a path

\[
v_0 \xrightarrow{t_1} v_1 \xrightarrow{t_2} v_2 \ldots v_{i-1} \xrightarrow{t_i} v_i \ldots v_{n-1} \xrightarrow{t_n} v_n \quad (10)
\]
where \( v_i \) and \( t_i \) denote a state and a transition, respectively, \( t_1 \) is regular and \( t_2, \ldots, t_n \) are spontaneous. Then transitions \( t_1, \ldots, t_n \) are concatenated into a single transition \( t_{1,n} \) from state \( v_0 \) to state \( v_n \). Their inputs, outputs, and other parameters are combined and associated with transition \( t_{1,n} \). States \( v_2, \ldots, v_{n-1} \) are marked as temporary, and subsequently removed from the FSM along with their outgoing transitions.
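
The concatenation of a regular transition with the spontaneous transitions that follow it can be sketched in Python as below. This is a simplified illustration, not the *efsm2fsm* code: it assumes that each state has at most one outgoing spontaneous transition, that spontaneous transitions form no cycle, and that a transition is a dictionary with the hypothetical keys `name`, `src`, `dst`, `input` (`None` if spontaneous), and `outputs`.

```python
def concatenate_spontaneous(transitions):
    """Merge each regular transition with the spontaneous chain after it."""
    # Index spontaneous transitions by their start state.
    spont = {t["src"]: t for t in transitions if t["input"] is None}
    merged = []
    for t in transitions:
        if t["input"] is None:
            continue  # spontaneous transitions are absorbed, not emitted
        names, outputs, dst = [t["name"]], list(t["outputs"]), t["dst"]
        seen = {dst}
        while dst in spont:
            s = spont[dst]
            names.append(s["name"])
            outputs += s["outputs"]
            dst = s["dst"]
            if dst in seen:
                raise ValueError("cycle of spontaneous transitions")
            seen.add(dst)
        merged.append({"name": "+".join(names), "src": t["src"], "dst": dst,
                       "input": t["input"], "outputs": outputs})
    return merged

efsm = [
    {"name": "t1", "src": "v0", "dst": "v1", "input": "x", "outputs": ["y1"]},
    {"name": "t2", "src": "v1", "dst": "v2", "input": None, "outputs": ["y2"]},
    {"name": "t3", "src": "v2", "dst": "v3", "input": None, "outputs": []},
    {"name": "t4", "src": "v3", "dst": "v0", "input": "z", "outputs": ["w"]},
]
fsm = concatenate_spontaneous(efsm)
for t in fsm:
    print(t["name"], t["src"], "->", t["dst"], t["outputs"])
```

In this example `t1`, `t2`, and `t3` collapse into one transition from `v0` to `v3` that carries the outputs of all three.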

After the expansion to an FSM, transitions that are equivalent from a testing point of view could be identified, leading to a minimum-cost test sequence covering at least one transition from each equivalence class. However, building such a test sequence is NP-hard [27]. Therefore, simple heuristics bringing about 20%-30% reduction in the number of transitions were implemented.

It is possible to manually prepare the input file for the package such that an EFSM’s states are divided into two groups: (1) states with no inputs buffered, and (2) states with one input buffered at a semicontrollable interface. Then semicontrollable interfaces can be utilized for certain simplified cases such as using the 188-220B Intranet layer for indirect testing of 188-220B Datalink layer (in these tests, only one semicontrollable interface is used with a small number of semicontrollable inputs). A self-loop repetition constraint can be taken into account for the case of self-loop state verification sequences.

To run the package for a protocol’s EFSM specified in file protocol.efsm, the following command must be used:

```
  efsm2fsm protocol.efsm [-options]
```

producing two files protocol.fsm and protocol.stat. The former contains the output FSM. All information associated with transitions in the input EFSM is preserved. This enables the rcpt package to populate the fields defined in the CECOM’s proprietary format for test sequences. The latter file contains statistics such as the number of states and transitions in the EFSM/FSM, and the percentage effectiveness of the reduction heuristics.

Note that the original EFSM to FSM conversion technique implemented should be replaced by the application of the inconsistency elimination algorithms implemented in INDEEL (see Section 4.2). Using INDEEL to eliminate inconsistencies results in a conflict-free EFSM that is significantly smaller than the FSM.

### 4.1.2 *rcpt*

The FSM produced by efsm2fsm is then fed to rcpt, which builds a corresponding directed graph representation \( G \). Then, network flow techniques are applied to find a rural symmetric augmentation \( G'' \) of \( G \). Finally, rcpt finds an Euler tour of $G''$ and writes the resulting test sequence to a file in CECOM’s proprietary format.
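
The last step, finding an Euler tour, can be illustrated with Hierholzer's algorithm. The Python sketch below is not the rcpt implementation: it assumes the graph is already symmetric (every vertex has equal in-degree and out-degree) and connected, it omits the network flow augmentation, and the edge format and toy graph are hypothetical.

```python
def euler_tour(edges, start):
    """Return edge labels of an Euler tour of a symmetric, connected digraph.

    edges -- list of (source, destination, label) triples
    """
    adj = {}
    for src, dst, label in edges:
        adj.setdefault(src, []).append((dst, label))
    # Each stack entry holds a vertex and the label of the edge used to reach it.
    stack, tour = [(start, None)], []
    while stack:
        vertex, label = stack[-1]
        if adj.get(vertex):
            dst, next_label = adj[vertex].pop()
            stack.append((dst, next_label))
        else:
            stack.pop()
            if label is not None:
                tour.append(label)
    tour.reverse()
    return tour

# Hypothetical symmetric graph with a self-loop on s1.
graph = [("s0", "s1", "e1"), ("s1", "s1", "e2"), ("s1", "s0", "e3")]
tour = euler_tour(graph, "s0")
print(tour)  # ['e1', 'e2', 'e3']
```

Because every edge is pushed and popped exactly once, the tour is found in time linear in the number of edges.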

Suppose that protocol.fsm is an input file containing a protocol’s FSM. Then the following command runs the package:

    rcpt [-cecom/-plain] protocol.fsm output_file

where the plain option refers to a plain input/output file format. A protocol.fsm file in plain format can be prepared manually. The cecom option selects test generation in the CECOM format. In this case, the input file protocol.fsm should be generated by the efsm2fsm package. The tests are stored in a number of files named protocol.i, where i is the index of a test group.

### 4.2 INDEEL: Software for Inconsistency Detection and Elimination

A software package, called INDEEL (INconsistencies DEtection and ELimination), has been implemented at CCNY based on the inconsistency elimination algorithms given in [73,21]. As part of the ongoing collaboration between CCNY and UD, the application of these algorithms has been extended to generate test sequences for protocols with conflicting timers such as 188-220.

INDEEL contains 13,000+ lines of C code. As its input, the software reads a user specified file containing the description of an EFSM graph with the following properties:

- The specification consists of a single process and thus there are no communicating EFSMs.
- If the specification contains function calls, they can be described within the process with a simple transformation.
- Pointers, recursive functions, and syntactically endless loops are assumed not to be present in the specification.
- All conditions and actions are linear.

The overall complexity of the algorithms used in INDEEL is discussed in Section 3.3.3. INDEEL uses an iterative approach: every time an action or condition inconsistency is detected and eliminated, an intermediate output graph is written to a file in the same format as the input file. This intermediate output file then becomes the new input file to INDEEL for continued analysis. This iterative procedure is repeated until the graph becomes free of inconsistencies. The intermediate and the final output graphs are provided as files.

INDEEL starts its analysis by considering the action inconsistencies; it then proceeds to the detection and elimination of the condition inconsistencies (if any). During the analysis of the action inconsistencies, INDEEL constructs a set of Action Update Matrix (AUM) pairs for each node. The AUM pairs represent the effects of the actions of the traversed edges leading to a given node \(v_i\). Similarly, the accumulated different conditions of the paths leading to \(v_i\) can be represented as a set of Accumulated Condition Matrix (ACM) triplets containing the coefficients, operators, and constants of the edge conditions.

To reduce the space complexity, during the AUM and ACM constructions, the software uses a single matrix, called the path matrix, in which the numbers of the edges in the paths from the initial node to \(v_i\) are stored.

## 5 Technology Transfer Results

Using research results from Section 3, and software as described in Section 4.2, UD and CCNY collaborated with CECOM to generate tests for the SAP components of 188-220’s Data Link Layer Classes A and C. Table 3 shows the sizes of the expanded EFSMs and the tests that were generated from them. For example, the precedence tests set for Class A-Type 1 Service was based on an expanded EFSM of 303 states and 401 transitions. The minimum-length test sequence generated for this machine consists of 1,316 input/output pairs covering every transition in the expanded EFSM at least once.

Figure 17 shows a sample of the delivered test scripts. The figure depicts the test group #92 from Datalink Class A-Type 1 service tests. Each test group is a subsequence of a full test sequence that starts and ends in the initial state. In the first step, the technique of utilizing semicontrollable interfaces presented in Section 3.2 is used. The lower tester sends a packet with three destination addresses: \(IUT\_addr\), \(des\_addr\_1\), and \(des\_addr\_2\). The setting \(Relay=Yes\) in the \(INTRANET\) clause tells the first addressee, i.e., the IUT, to relay the packet to the two remaining addressees. As a result, the IUT sends a packet with its address as a source, and \(des\_addr\_1\) and \(des\_addr\_2\) as destinations, as if it were originated by the IUT’s Intranet Layer. In the second and third steps, the IUT’s packet sent in the first step is acknowledged by \(des\_addr\_2\) and \(des\_addr\_1\), respectively. Each test step is further annotated with the test description, the number of the corresponding Estelle transition(s), and the appropriate section(s) from the 188-220 official document.
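
The notion of a test group as a subsequence that starts and ends in the initial state can be illustrated with a small Python sketch. It assumes, as a simplification, that a new group begins at every return to the initial state; the actual grouping rule used for the delivered tests is not specified here, and the function name and sample sequence are hypothetical.

```python
def split_into_test_groups(sequence, initial):
    """Cut a test sequence into groups that end on each return to `initial`.

    sequence -- list of (source, destination, label) steps starting at `initial`
    """
    groups, current = [], []
    for _src, dst, label in sequence:
        current.append(label)
        if dst == initial:
            groups.append(current)
            current = []
    if current:  # trailing steps that never return to the initial state
        groups.append(current)
    return groups

steps = [
    ("init", "s1", "t1"), ("s1", "init", "t2"),
    ("init", "s2", "t3"), ("s2", "s2", "t4"), ("s2", "init", "t5"),
]
groups = split_into_test_groups(steps, "init")
print(groups)  # [['t1', 't2'], ['t3', 't4', 't5']]
```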

The implementations of 188-220 from several manufacturers are being tested at CECOM. The tests generated by the UD and CCNY team have uncovered several implementation errors, including lack of mandatory capabilities in Datalink layer, and problems with multi-hop Intranet Relaying.
// Test Group #92
// ----------------------------------- - -------------

TESTGROUP=92;
LAYER=DataLink;

// Test 1
STIMULUS=send; // PL-Unitdata.ind
TIME=long;
// DL1
INTRANET={
    Type=IP;
    LowDelay=Yes;
    HighThroughput=No;
    HighReliability=No;
    Precedence=1; // PRIORITY
    OrgAddr=des_addr_17;
    DestRelay=
        Addr=IUAddr;
        Distance=1;
        Des=No;
        Relay=Yes;
        Ack=No;
    },
    DestRelay=
        Addr=des_addr_1;
        Distance=2;
        Des=Yes;
        Relay=No;
        Ack=No;
    },
    DestRelay=
        Addr=des_addr_2;
        Distance=2;
        Des=Yes;
        Relay=No;
        Ack=No;
    },
    DATALINK=
        CtrlField=
            SendSeq=1;
            RecSeq=1;
            ControlSpare=1;
            DLPrec=1; // PRIORITY
            IDNum=1;
            PDU=ui_0;
    },
    Command=Yes;
    SrcAddr=des_addr_17;
    DestAddr=IUAddr;
};

RESULTS=receive; // PL-Unitdata.Req
TIME=normal;
// DL1
DATALINK=
    CtrlField=
        SendSeq=1;
        RecSeq=1;
        ControlSpare=1;
        DLPrec=1; // PRIORITY
        IDNum=1;
        PDU=ui_1;
    },
    Command=Yes;
    SrcAddr=IUAddr;
    DestAddr=des_addr_1,des_addr_2;
};

TESTDESCRIPTION=
Intranet layer passes down a multidestination packet which is queued by datalink layer. Packet requires a coupled ack. There are no outstanding frames. No outstanding frame. Queued frame transmitted to multiple destinations. Frame requires a coupled ack. Ack timer started.
});
// ESTELLE TYPE1SAP_3,4,TYPE1SAP_18
// SECTION(S) 5.3.16,5.3.6.1.1,C4.3,5.3.4.2.2.2.1,5.3.6.1.1

// Test 2
STIMULUS=send; // PL-Unitdata.ind
TIME=normal;
// DL1
DATALINK=
    CtrlField=
        SendSeq=1;
        RecSeq=1;
        ControlSpare=1;
        DLPrec=2; // ROUTINE
        IDNum=1;
        PDU=ur_0;
    },
    Command=No;
    SrcAddr=des_addr_2;
    DestAddr=IUAddr;
};

RESULTS=noop; // none

TESTDESCRIPTION=
Second destination acks a multidestination packet. First has not acked yet.
});
// ESTELLE TYPE1SAP_12
// SECTION(S) 5.3.7.15.5,5.3.6.1.6 C4.3

// Test 3
STIMULUS=send; // PL-Unitdata.ind
TIME=normal;
// DL1
DATALINK=
    CtrlField=
        SendSeq=1;
        RecSeq=1;
        ControlSpare=1;
        DLPrec=2; // ROUTINE
        IDNum=1;
        PDU=ur_0;
    },
    Command=No;
    SrcAddr=des_addr_1;
    DestAddr=IUAddr;
};

RESULTS=noop; // none

TESTDESCRIPTION=
First destination acks a packet. Ack timer is stopped. No frame queued for transmission.
});
// ESTELLE TYPE1SAP_12
// SECTION(S) 5.3.7.15.5,5.3.6.1.6 C4.3

Fig. 17. A sample of test scripts delivered to CECOM.

Table 3
188-220 Datalink tests. A single step corresponds to one input/output exchange.

<table>
<thead>
<tr>
<th>Test set</th>
<th># of states</th>
<th># of transitions</th>
<th># of test steps</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="4">Class A Type 1 service</td>
</tr>
<tr>
<td>general behavior</td>
<td>298</td>
<td>799</td>
<td>1732</td>
</tr>
<tr>
<td>precedence</td>
<td>303</td>
<td>401</td>
<td>1316</td>
</tr>
<tr>
<td>multidestination</td>
<td>112</td>
<td>119</td>
<td>145</td>
</tr>
<tr>
<td colspan="4">Class C Type 1 service</td>
</tr>
<tr>
<td>general behavior</td>
<td>298</td>
<td>799</td>
<td>1732</td>
</tr>
<tr>
<td>precedence</td>
<td>193</td>
<td>357</td>
<td>1314</td>
</tr>
<tr>
<td>multidestination</td>
<td>112</td>
<td>119</td>
<td>145</td>
</tr>
<tr>
<td colspan="4">Class C Type 4 service</td>
</tr>
<tr>
<td>general behavior</td>
<td>235</td>
<td>925</td>
<td>2803</td>
</tr>
<tr>
<td>outstanding frames</td>
<td>48</td>
<td>172</td>
<td>264</td>
</tr>
<tr>
<td>multidestination</td>
<td>112</td>
<td>119</td>
<td>145</td>
</tr>
</tbody>
</table>

6 Conclusions: Improvements to Protocol Development Process

6.1 Integration of Estelle into System Development

The traditional sequential process of system development is known to be inefficient, since it allows unnecessary duplication and does not facilitate tracking of rapidly changing technology. With 188-220 as a critical component, a synergistic framework for $C^4I$ (Command, Control, Communications, Computers, and Intelligence) systems development has been established [22] (Figure 18). It combines several parallel activities: developing protocol standards and specifications, formally specifying protocols in Estelle, building conformance tester hardware and software, “field testing”, modeling and simulation, and resolving and documenting the solutions to standards-related technical issues by the Joint CNR Working Group. (WG participants include representatives from DoD services/agencies, industry, and academia.)

Using formal methods as part of this process helped create a high-quality protocol standard, which is robust and efficient. Due to the structured nature of Estelle, the specification process progressed at an accelerated pace compared to other standards. 188-220 was completed on time, setting a rare example in the protocol standards arena.

Since it is relatively easy to extract modeling information from a formal specification, the researchers at UD and CCNY were able to solve a number of theoretical problems, which resulted in the development of new testing methodologies. By applying these new results, the conformance tests for 188-220 were generated while the protocol was still evolving. Performing initial conformance tests on prototypes uncovered several interoperability errors early in the development process. Following this success of the 188-220 development, the synergistic efforts to develop $C^4I$ systems with the help of formal methods serve as a model for the future DoD standards process and development [22].

6.2 Advantages of Formal Methods in Eliminating Protocol Errors

The difficulties of describing protocol operations with clarity, precision, and consistency by using a natural language are illustrated by the examples in Section ???. In addition to the vagueness introduced by a natural language description, ambiguities and contradictions are difficult to detect when related protocol functionalities are defined in different document sections separated by several pages of unrelated text. Such problems are eliminated in a formal Estelle specification. All actions in a particular context are defined in one place within the Estelle specification. The specification makes the conditions for state transitions explicit through Estelle constructs. Indeed, the very process of creating these constructs enables formal specifiers to detect some of these ambiguities, which are difficult to see in a normal reading of a document written in English.

As concluding remarks for this paper, we report the following observations based on our experience during the formal specification and test generation for 188-220.

To develop an Estelle formal specification of a protocol, we must not only define its architecture and interface components (e.g., as in Figures 2 and 3 for 188-220), but also carefully specify the behavior of each module of these components. This definition, achieved through the creation of EFSMs, is the most difficult and time-consuming step of creating a formal specification. A syntax-directed editor improves readability for testers who are not FDT-trained; it is also useful in writing non-trivial specifications. Moreover, modeling and specification languages such as SDL [31,32] and UML [56] enjoy widespread industrial popularity, partly due to their standard graphical representation. Therefore, it would be a natural extension for Estelle to include a graphical editor [62]. Once all states and transitions of a protocol (including inputs and outputs) are finalized, writing the Estelle code itself is fast and straightforward.

Since 188-220 is a multilayer, multifunction protocol of considerable size and complexity, manual generation of conformance test sequences would be both inefficient and ineffective. As seen from Table 3, the tests already delivered to CECOM contain approximately 10,000 test steps. Manually generating test sets of this size from the protocol's textual description is clearly not a trivial task.
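
The figure of approximately 10,000 test steps can be checked directly against Table 3. The short Python sketch below sums the "# of test steps" column; the numbers are copied from the table, while the dictionary layout and variable names are only illustrative.

```python
# "# of test steps" column of Table 3, grouped by service class.
TABLE_3_STEPS = {
    "Class A Type 1": {"general behavior": 1732, "precedence": 1316, "multidestination": 145},
    "Class C Type 1": {"general behavior": 1732, "precedence": 1314, "multidestination": 145},
    "Class C Type 4": {"general behavior": 2803, "outstanding frames": 264, "multidestination": 145},
}

# Subtotal per service class, then the grand total over all nine test sets.
per_class = {name: sum(sets.values()) for name, sets in TABLE_3_STEPS.items()}
total = sum(per_class.values())

print(per_class)  # {'Class A Type 1': 3193, 'Class C Type 1': 3191, 'Class C Type 4': 3212}
print(total)      # 9596
```

The nine test sets add up to 9,596 steps, which is consistent with the rounded figure quoted above.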

A number of conformance test generation techniques have been proposed [2,8,10,52,59,61,65,68], each of which is expected to give better results for a certain class of protocol specifications depending on the nature and size of the protocol. The experience obtained in generating tests for 188-220 suggests that, to successfully test today’s complex protocols by using formal methods, an ideal test generation tool should support multiple test generation techniques [47]. These can range from Postman tours [2] or fault-oriented tests [80,82] for mid-size protocols, where the number of states is on the order of thousands, to guided random walk approaches [45,83] for larger protocols, where the number of states is in the tens of thousands.

The state explosion problem has been a major issue for generating FSM models out of EFSM representations of protocols [16,58,81,82]. One common procedure for converting EFSMs into FSMs simultaneously performs reachability analysis and online minimization [16,46]; this conversion is based on combining equivalent states [60] using bisimulation equivalence [53]. Another approach proposes the elimination of inconsistencies in EFSM models [71,72]. Efficient algorithms such as these should be implemented in any test generation tool using FSM models. If the final FSM model is not confined to a manageable size, the test sequences generated from it will be infeasibly long regardless of the test generation method.
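
To show why expansion inflates the model, the following Python sketch unfolds a small EFSM into an FSM by plain reachability analysis. The EFSM (a sender with a bounded retry counter), the bound `MAX_RETRIES`, and all names are hypothetical. The sketch deliberately omits the online minimization and bisimulation-based state merging mentioned above; it performs only the reachability part.

```python
from collections import deque

MAX_RETRIES = 2  # hypothetical bound on the retry counter

# Each EFSM transition: (source, input, guard, update, output, destination).
# Guards and updates operate on a single integer variable n (retry count).
EFSM = [
    ("idle", "send",    lambda n: True,             lambda n: 0,     "frame", "wait"),
    ("wait", "timeout", lambda n: n < MAX_RETRIES,  lambda n: n + 1, "frame", "wait"),
    ("wait", "timeout", lambda n: n >= MAX_RETRIES, lambda n: 0,     "fail",  "idle"),
    ("wait", "ack",     lambda n: True,             lambda n: 0,     "done",  "idle"),
]

def expand(efsm, initial=("idle", 0)):
    """Unfold the EFSM into an FSM whose states are (control state, n) pairs."""
    states = {initial}
    transitions = []
    queue = deque([initial])
    while queue:
        ctrl, n = queue.popleft()
        for src, inp, guard, update, out, dst in efsm:
            if src != ctrl or not guard(n):
                continue  # transition not enabled in this expanded state
            target = (dst, update(n))
            transitions.append(((ctrl, n), inp, out, target))
            if target not in states:
                states.add(target)
                queue.append(target)
    return states, transitions

states, transitions = expand(EFSM)
print(len(states), len(transitions))  # 4 7
```

Here an EFSM with 2 control states and 4 transitions unfolds into 4 states and 7 transitions. Each additional variable, or a larger variable range, multiplies the number of reachable configurations, which is why minimization during expansion matters in practice.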

Finally, a test house may require its own proprietary format for the executable tests. Although TTCN is accepted as input by many test tools, a proprietary test format may be preferable for a given protocol if this format is more readable by testers, or is simpler to parse by software tools. The output of a test generation tool should be easily custom-tailored for a particular format, possibly by using simple application generators.
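
One way to keep the output format easy to tailor is to separate the abstract test steps from the code that renders them. The Python sketch below is a minimal illustration under that assumption: the step fields, the `KEY=value;` layout (loosely modeled on Figure 17 but greatly simplified), and the renderer names are all hypothetical and do not reproduce the format actually delivered to CECOM.

```python
import json

def render_keyvalue(step):
    """Render one step in a simplified KEY=value; script style."""
    return "\n".join([
        f"// Test {step['id']}",
        f"STIMULUS={step['stimulus']};",
        f"RESULTS={step['results']};",
    ])

def render_json(step):
    """Render one step as a single JSON object with sorted keys."""
    return json.dumps(step, sort_keys=True)

# Registry of output formats; supporting a new test house means adding one entry.
RENDERERS = {"keyvalue": render_keyvalue, "json": render_json}

def export(steps, fmt):
    """Render all steps in the requested format, separated by blank lines."""
    render = RENDERERS[fmt]
    return "\n\n".join(render(step) for step in steps)

STEPS = [{"id": 2, "stimulus": "send", "results": "noop"}]
print(export(STEPS, "keyvalue"))
print(export(STEPS, "json"))
```

With this split, the test generation algorithms never need to know which concrete syntax a given test facility prefers.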

7 Acknowledgments

The authors thank Samuel Chamberlain of ARL; Ted Dzik and Ray Menell of CECOM; and Mike McMahon and Brian Kind of ARINC, Inc. for their collaboration in this research.

References


