Modeling TCP/IP Software Implementation Performance And Its Application For Software Routers

By Oscar Iván Lepe Aldama
<table>
<thead>
<tr>
<th>Dr.</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>President</td>
<td></td>
</tr>
<tr>
<td>Dr.</td>
<td></td>
</tr>
<tr>
<td>Secretari</td>
<td></td>
</tr>
<tr>
<td>Dr.</td>
<td></td>
</tr>
<tr>
<td>Vocal</td>
<td></td>
</tr>
<tr>
<td>Dr.</td>
<td></td>
</tr>
<tr>
<td>Vocal</td>
<td></td>
</tr>
<tr>
<td>Dr.</td>
<td></td>
</tr>
<tr>
<td>Vocal</td>
<td></td>
</tr>
<tr>
<td>Data de la defensa pública</td>
<td></td>
</tr>
<tr>
<td>Qualificació</td>
<td></td>
</tr>
</tbody>
</table>
Contents

List of figures

List of tables

Preface

Chapter 1 Introduction ................................................................. 17
  1.1 Motivation .............................................................................. 17
  1.2 Scope ................................................................................... 18
  1.3 Dissertation’s thesis ............................................................. 19
  1.4 Synopsis ................................................................................. 19
  1.5 Outline .................................................................................. 20
  1.6 Related work .......................................................................... 20

Chapter 2 Internet protocols’ BSD software implementation .......... 23
  2.1 Introduction ........................................................................... 23
  2.2 Interprocess communication in the BSD operating system .......... 24
    2.2.1 BSD’s interprocess communication model ............................. 24
    2.2.2 Typical use of sockets ......................................................... 24
  2.3 BSD’s networking architecture .............................................. 26
    2.3.1 Memory management plane ............................................... 26
    2.3.2 User plane ....................................................................... 27
2.4 The software interrupt mechanism and networking processing ........................... 28
   2.4.1 Message reception ................................................................................. 28
   2.4.2 Message transmission ........................................................................... 29
   2.4.3 Interrupt priority levels ......................................................................... 30
2.5 BSD implementation of the Internet protocols suite .............................................. 31
2.6 Run-time environment: the host’s hardware ......................................................... 33
   2.6.1 The central processing unit and the memory hierarchy ......................... 34
   2.6.2 The busses organization ........................................................................ 34
   2.6.3 The input/output bus’ arbitration scheme .............................................. 35
   2.6.4 PCI hidden bus arbitration’s influence on latency .................................. 36
   2.6.5 Network interface card’s system interface ............................................. 37
   2.6.6 Main memory allocation for direct memory access network interface cards ........................................................................................................ 37
2.7 Other system’s networking architectures .............................................................. 40
   2.7.1 Active Messages [Eicken et al. 1992] ...................................................... 40
   2.7.2 Integrated Layer Processing [Abbot and Peterson 1993] ....................... 40
   2.7.3 Application Device Channels [Druschel 1996] ......................................... 42
   2.7.4 Microkernel operating systems’ extensions for improved networking [Coulson et al. 1994; Coulson and Blair 1995] ............................................. 43
   2.7.5 Communications oriented operating systems [Mosberger and Peterson 1996] ........................................... 44
   2.7.6 Network processors ............................................................................... 45
2.8 Summary ........................................................................................................... 46

Chapter 3 Characterizing and modeling a personal computer-based software router ........................................ 47
3.1 Introduction ..................................................................................................... 47
3.2 System modeling ............................................................................................. 48
3.3 Personal computer-based software routers ....................................................... 49
   3.3.1 Routers’ rudiments ............................................................................... 49
   3.3.2 The case for software router ................................................................. 49
   3.3.3 Personal computer-based software routers ......................................... 50
3.4 A queuing network model of a personal computer-based software router ...... 51
   3.4.1 The forwarding engine, the network interface cards and the packet flows ........................................................................................................ 53
   3.4.2 The service stations’ scheduling politics and the mapping between networking stages and model elements .................................................................. 53
3.5 System characterization ..................................................................................... 55
   3.5.1 Tools and techniques for profiling in-kernel software ............................ 55
   3.5.2 Software profiling .................................................................................. 55
   3.5.3 Probe implementation ......................................................................... 56
3.5.4 Extracting information from the kernel........................................59
3.5.5 Experimental setup......................................................................59
3.5.6 Traffic pattern .............................................................................60
3.5.7 Experimental design....................................................................62
3.5.8 Data presentation ........................................................................62
3.5.9 Data analysis ................................................................................62
3.5.10 “Noise” process characterization................................................64

3.6 Model validation.............................................................................65
3.6.1 Service time correlations...............................................................66
3.6.2 Qualitative validation ....................................................................67
3.6.3 Quantitative validation ..................................................................68

3.7 Model parameterization ................................................................71
3.7.1 Central processing unit speed.......................................................71
3.7.2 Memory technology ......................................................................73
3.7.3 Packet’s size................................................................................74
3.7.4 Routing table’s size ......................................................................75
3.7.5 Input/output bus’s speed.................................................................75

3.8 Model's applications .....................................................................76
3.8.1 Capacity planning .........................................................................76
3.8.2 Uniform experimental test-bed ......................................................78

3.9 Summary.........................................................................................88

Chapter 4 Input/output bus usage control in personal computer-based software routers ........................................89

4.1 Introduction ..................................................................................89
4.2 The problem ..................................................................................90

4.3 Our solution..................................................................................90
4.3.1 BUG’s specifications and network interface card’s requirements .........92
4.3.2 Low overhead and intrusion .........................................................93
4.3.3 Algorithm ....................................................................................94
4.3.4 Algorithm’s details.......................................................................95
4.3.5 Algorithm’s a priori estimated costs .............................................96
4.3.6 An example scenario ...................................................................98

4.4 BUG performance study ...............................................................100
4.4.1 Experimental setup ....................................................................100
4.4.2 Response to unbalanced constant packet rate traffic ..................102
4.4.3 Study on the influence of the activation period ..........................103
4.4.4 Response to on-off traffic ............................................................104
4.4.5 Response to self-similar traffic .....................................................105
<table>
<thead>
<tr>
<th>Section</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>4.5 A performance study of a software router incorporating the BUG</td>
<td>107</td>
</tr>
<tr>
<td>4.5.1 Results</td>
<td>107</td>
</tr>
<tr>
<td>4.6 An implementation</td>
<td>109</td>
</tr>
<tr>
<td>4.7 Summary</td>
<td>109</td>
</tr>
<tr>
<td><strong>Chapter 5 Conclusions and future work</strong></td>
<td>111</td>
</tr>
<tr>
<td>5.1 Conclusions</td>
<td>111</td>
</tr>
<tr>
<td>5.2 Future work</td>
<td>112</td>
</tr>
</tbody>
</table>

**Appendix A**

**Appendix B**

**Bibliography**
List of figures

Figure 2.1—OMT object model for BSD IPC................................................................. 25

Figure 2.2—BSD’s two-plane networking architecture. The *user plane* is depicted with its layered structure, which is described in following sections. Bold circles in the figure represent defined interfaces between planes and layers: A) Socket-to-Protocol, B) Protocol-to-Protocol, C) Protocol-to-Network Interface, and D) User Layer-to-Memory Management. Observe that this architecture implies that layers share the responsibility of taking care of the storage associated with transmitted data ........................................................................ 26

Figure 2.3—BSD networking user plane’s software organization ....................... 27

Figure 2.4—Example of priority levels and kernel processing ............................. 31

Figure 2.5—BSD implementation of the Internet protocol suite. Only chief tasks, message queues and associations are shown. Please note that some control flow arrows are sourced at the bounds of the squares delimiting the implementation layers. This is for denoting that a task is executed after an external event, such as an interrupt or a CPU scheduler event.......................... 32

Figure 2.6—Chief components in a general purpose computing hardware. ............ 33

Figure 2.7—Example PCI arbitration algorithm..................................................... 36

Figure 2.8—Main memory allocation for direct memory access network interface cards.................................................................................................................. 39

Figure 2.9—Integrated layering processing [Abbot and Peterson 1993]................. 41

Figure 2.10—Application Device Channels [Druschel 1996]................................. 42
Figure 2.11—SUMO extensions to a microkernel operating system [Coulson et al. 1994; Coulson and Blair 1995] .......................................................... 43

Figure 2.12—Making paths explicit in the Scout operating system [Mosberger and Peterson 1996] .............................................................................. 44

Figure 3.1—A spectrum of performance modeling techniques ........................................ 48

Figure 3.2—IP router architecture ................................................................................. 49

Figure 3.3—A queuing network model of a personal computer-based software router that has two network interface cards and that is traversed by a single packet flow. The number and meaning of the shown queues is a result of the characterization process presented in the next section ........................................ 51

Figure 3.4—A queuing network model of a personal computer-based software router that shows a one-to-one mapping between C language functions implementing the BSD networking code and the model’s queues. In order to simplify the figure, this model does not include the models for input/output bus and the noise. Moreover, it uses a simplified version of the network interface card model ........................................................................ 54

Figure 3.5—Probe implementation for FreeBSD .............................................................. 58

Figure 3.6—Experimental setup ..................................................................................... 59

Figure 3.7—Characterization charts for a security gateway’s protocols layer ............... 63

Figure 3.8—Characterization charts for a router’s network interfaces layer ............... 63

Figure 3.9—Comparison of the CCPFs computed after measured data from a software router and predicted data from a software router’s queuing network model, which used a one-to-one mapping between C language networking functions and model’s queue ........................................................................ 66

Figure 3.10—Example chart from the service time correlation analysis. It shows the plot of ip_input cycle counts versus ip_output cycle counts. A correlation is clearly shown ........................................................................ 66

Figure 3.11—Model’s validation charts. The two leftmost columns’ charts depict per-packet latency traces. The right column’s chart depicts latency traces’ CCPFs ........................................................................ 69

Figure 3.12—Relationship between measured execution times and central processing unit operation speed. Observe that some measures have proportional behavior while others have linear behavior. The main text explains the reasons to these behaviors and why the circled measures do not all agree with the regression lines ........................................................................ 71
Figure 3.13—Outliers related to the CPU's instruction cache. The left chart was
drawn after data taken from the ipintrq probe. The right chart
corresponds to the ESP (DES) probe at a security gateway. Referenced
outlier points are highlighted............................................. 73

Figure 3.14—Relationship between measured execution times and message size........... 74

Figure 3.15—Capacity planning charts........................................................................ 77

Figure 3.16—Queuing network model for a BSD based software router with two
network interface cards attached to it and three packet flows traversing it............... 80

Figure 3.17—Basic system's performance analysis: a) system's overall
throughput; b) per flow throughput share for system one; c) per flow
throughput share for system two........................................................................... 81

Figure 3.18—Queuing network model for a Mogul and Ramakrishnan [1997]
based software router with two network interface cards attached to it and
three packet flows traversing it............................................................................. 83

Figure 3.19—Mogul and Ramakrishnan [1997] based software router's
performance analysis: a) system's overall throughput; b) per flow
throughput share for system one; c) per flow throughput share for system
two....................................................................................................................... 84

Figure 3.20—Queuing network model for a software router including the receiver
live-lock avoidance mechanism and a QoS aware CPU scheduler, similar to
the one proposed by Qie et al. [2001]. The router has two network interface
cards and three packet flows traverse it................................................................. 86

Figure 3.21—Qie et al. [2001] based software router's performance analysis: a)
per flow throughput share for system one; b) per flow throughput share for
system two............................................................................................................. 87

Figure 4.1—The BUG’s system architecture. The BUG is a piece of software
embedded in the operating system’s kernel that shares information with the
network interface card’s device drivers and manipulates the vacancy of each
DMA receive channel.............................................................................................. 91

Figure 4.2—The BUG’s periodic and bistate operation for reducing overhead and
intrusion.................................................................................................................... 93

Figure 4.3—The BUG’s packet-by-packet GPS server emulation with batch
arrivals..................................................................................................................... 94

Figure 4.4—The BUG is work conservative................................................................. 97

Figure 4.5—The BUG’s unfairness counterbalancing mechanism............................. 97
Figure 4.6—The BUG’s bus utilization grant packetization policy. In the considered scenario, three packets flows with different packet sizes traverse the router and the BUG has granted each an equal number of bus utilization bytes. Packet sizes are small, medium and large respectively for the orange, green and blue packets flows. After packetization, some idle time gets induced................................................................. 97

Figure 4.7—We show an example of the behavior of the BUG mechanism. Vectors A, D, N, G and g are defined as: A = (A₁, A₂, A₃), etc. We assume that the system serves three packets flows with the same shares and with the same packet lengths. In a period T up to six packets can be transferred through the bus.............................................................................. 99

Figure 4.8—Queuing network models for: a) PCI bus, b) WFQ bus, and c) BUG protected PCI bus; all with three network interface cards attached to it and three packets flows traversing them................................................................. 101

Figure 4.9—BUG performance study: response comparison to unbalanced constant packet rate traffic between a WFQ bus, a PCI bus and a BUG protected PCI bus; first, middle and left columns respectively. At row (a) flow3 is the misbehaving flow while flow2 and flow1 are for (b) and (c), respectively............................................................... 102

Figure 4.10—BUG performance study: on the influence of the activation period...... 104

Figure 4.11—BUG performance study: response comparison to on-off traffic between an ideal WFQ bus, a PCI bus, and a BUG protected PCI bus ............... 105

Figure 4.12—QoS aware system’s performance analysis: a) system’s overall throughput; b) per packets flow throughput share for system one; c) per packets flow throughput share for system two ........................................... 108
List of tables

Table 3-I ......................................................................................................................... 75
Table 3-II ......................................................................................................................... 75
Table 3-III ......................................................................................................................... 78
Table 4-I ......................................................................................................................... 92
Table 4-II ......................................................................................................................... 98
Table 4-III ....................................................................................................................... 100
Table 4-IV ....................................................................................................................... 106
Preface

Dedico esta obra a mi esposa, Tania Patricia, y a nuestros hijos, Pedro Dario y Sebastián.

Al mismo tiempo, quiero agradecer a Tania y a Pedro su apoyo y paciencia durante el tiempo que estuvimos fuera de casa, viviendo no en las mejores condiciones. Sólo espero que esta precariedad haya sido mitigada por lo enriquecedor de las experiencias que hemos vivido, que siento nos harán mejores personas al habernos recalorado la importancia de la unidad familiar, y el valor de nuestras costumbres, nuestra gente y nuestra tierra.

Agradezco a mi director de tesis, Jorge García-Vidal, por su invaluables tutela y dedicación. Difícilmente olvidaré las incontables y largas reuniones de discusión que le dieron vida a esta obra, ni las numerosas noches en vela que pasamos ultimando los detalles de los artículos que enviamos a revisión. Pero también difícilmente olvidaré las tardes de café o las tardes en el parque con nuestros hijos, donde hablábamos de todo menos del trabajo.

Agradezco al pueblo de México, que con su esfuerzo permite la existencia de programas de becas para que mexicanos hagan estudios de postgrado en el extranjero, como el programa gestionado por el Consejo Nacional de Ciencia y Tecnología, o el gestionado por el Centro de Investigación Científica y de Educación Superior de Ensenada, que me dieron cobijo.

Y en general agradezco a todas las personas que de forma directa o indirecta me ayudaron a la consecución de esta obra, por ejemplo a los profesores del DAC/UPC y a los revisores anónimos en los congresos. En particular quiero mencionar, en orden alfabético, a Alejandro Limón Padilla, David Hilario Covarrubias Rosales, Delia White, Eva Angelina Robles Sánchez, Francisco Javier Mendieta Jiménez, Jorge Enrique Preciado Velasco, José María Barceló Ordinas, Llorenç Cerdà Alabern, Olga María Casals Torres, Oscar Alberto Lepe Peralta, Victoriano Pagoaga. A todos ellos muchas gracias.
Chapter 1

Introduction

1.1 Motivation

Evidently, computer networks and the computer applications than run over them are fundamental to today’s human society. Indeed, some kind of these telematic systems is fundamental for the proper operation of the New York Stock Exchange, for instance, but also some other kind is fundamental for providing telephony service to little villages in hard-to-reach places, like the numerous villages at the mountains of Chiapas, México. As it is well known, telematic technology’s application for better human society is only limited by humans’ imagination.

Today’s human’s necessities for information processing and transportation present complex problems. For solving these problems, scientists and engineers are required to produced ideas and products with an ever-increasing number of components and inter-components relationships. To cope with this complexity, developers invented the concept of layered architectures, which allow a structured “divide to conquer” approach for designing complex systems by providing a step-by-step enhancement of system services.

Unfortunately, layered structures can result in relatively low performance telematic systems—and often they are—if implemented carelessly [Clark 1982; Tennenhouse 1989; Abbott and Peterson 1993; Mogul and Ramakrishnan 1999]. To understand this, let us consider the following:
- Telematic systems are mostly implemented with software.
- Each software layer is designed as an independent entity that concurrently and asynchronously communicates with its neighbors through a message-passing interface. This allows for better interoperability, manageability and extensibility.
- In order to allow software layers to work concurrently and asynchronously, the host computer system has to provide a secure and reliable multiprogramming environment through an operating system.
- Ideally, the operating system should perform its role without consuming a significant share of processing resources. Unfortunately, as reported elsewhere [Druschel 1996], current operating systems are threatening to become bottlenecks when processing input/output data streams. Moreover, they are the source of statistical delays—incurred as each data unit is marshaled through the layered software—that hamper the correct deployment of important services.

Others have recognized this problem and have conducted studies analyzing some aspects of some operating systems’ computer networks software, networking software for short. For us it is striking that although these studies are numerous, (a search through the ACM’s Digital Library on the term “c.2 and d.4 and e.5)<IN>ccs” returns 54 references, and a search through IEEE's Xplore on the term “‘protocol processing’<IN>de” returns 81 references) only one of them pursued to build a general model of the networking software—see this chapter’s section on “related work”. Indeed, although different systems’ networking software has more similarities than differences, as we will later discuss, most of these studies have only focused in identifying and solving particular problems of particular systems. When saying this we do not denied the importance of this work, however, we believe that modeling is an important part of doing research.

The Internet protocol suite (TCP/IP) [Stevens 1994] is, nowadays, the preferred technology for networking. Of all possible implementations, the one done at the University of California at Berkeley for the Berkeley Software Distribution operating system, or BSD, has been used as the starting point for most available systems. The BSD operating system [McKusick et al. 1996], which is a flavor of the UNIX system [Ritchie and Thompson 1978], was selected as the foundation for implementing the first TCP/IP suite back in the 80's.

### 1.2 Scope

Naturally, for modeling a system high degrees of observability and controllability are required. For us this means free access to both networking software’s source code and host computer’s technical specifications. (When we speak of free, we are referring to freedom, not price.) Today’s personal computer (PC) technology provides this. Indeed, not only there is tons of freely available detailed technical documentation on Intel’s IA32 PC hardware but also there are several PC operating systems with open source policy. Of those we decide to work with FreeBSD, a 4.4BSD-Lite derived operating system optimized to be run on Intel’s IA32 PCs.

When searching for a networking application for which the appliance of a performance model could be of importance we found software routers. A software router
can be defined as a general-purpose computer that executes a computer program capable of forwarding IP datagrams among network interface cards attached to its input/output bus (I/O bus). Evidently, software routers have performance limitations because they use a single central processing unit (CPU) and a single shared bus to process all packets. However due to the ease with which they can be programmed for supporting new functionality—securing communications, shaping traffic, supporting mobility, translating network addresses, supporting applications’ proxies, and performing n-level routing—software routers are important at the edge of the Internet.

1.3 Dissertation’s thesis

From all the above we came up with the following dissertation thesis:

*Is it possible to build a queuing network model of the Internet protocols’ BSD implementation that can be used for predicting with reasonable accuracy not only the mean values of the operational parameters studied but also their cumulative probability function? And, can this model be applied for studying the performance of PC-based software routers supporting communication quality assurance mechanisms, or Quality-of-Service (QoS) mechanisms?*

1.4 Synopsis

Three are the main contributions of this work. In no particular order:

- A detailed performance study of the software implementation of the TCP/IP protocols suite, when executed as part of the kernel of a BSD operating system over generic PC hardware
- A validated queuing network model of the studied system, solved by computer simulation
- An I/O bus utilization guard mechanism for improving the performance of software routers supporting QoS mechanisms and built upon PC hardware and software

This document presents our experiences building a performance model of a PC-based software router. The resulting model is an open multiclass priority network of queues that we solved by simulation. While the model is not particularly novel from the system modeling point of view, in our opinion, it is an interesting result to show that such a model can estimate, with high accuracy, not just average performance-numbers but the complete probability distribution function of packet latency, allowing performance analysis at several levels of detail. The validity and accuracy of the multiclass model has been established by contrasting its packet latency predictions in both, time and probability spaces. Moreover, we introduced into the validation analysis the predictions of a router’s single queue model. We did this for quantitatively assessing the advantages of the more complex multiclass model with respect to the simpler and widely used but not so accurate, as here shown, single queue model, under the considered sce-
scenario that the router’s CPU is the system bottleneck and not the communications links. The single queue model was also solved by simulation.

Besides, this document addresses the problem of resource sharing in PC-based software routers supporting QoS mechanisms. Others have put forward solutions that are focused on suitably distributing the workload of the CPU—see this chapter’s section on “related work”. However, the increase in CPU speed in relation to that of the I/O bus—as here shown—means attention must be paid to the effect the limitations imposed by this bus on the system’s overall performance. We propose a mechanism that jointly controls both I/O bus and CPU operation. This mechanism involves changes to the operating system kernel code and assumes the existence of certain network interface card’s functions, although it does not require changes to the PC hardware. A performance study is shown that provides insight into the problem and helps to evaluate both the effectiveness of our approach, and several software router design trade-offs.

1.5 Outline

The rest of this chapter is devoted to discuss about related work. Chapter 2’s objective is to understand the influence that operating system design and implementation technique have over the performance of the Internet protocol’s BSD software implementation. Chapter 3 presents our experiences building, validating and conducting the parameterization of a performance model of a PC-based software router. Moreover, it presents some results from applying the model for capacity planning. Chapter 4 addresses the problem of resource sharing in PC-based software routers supporting communication quality assurance mechanisms. Furthermore, it presents our mechanism for jointly controlling the router’s CPU and I/O bus, indispensable for a software router to support communication quality assurance mechanisms.

1.6 Related work

Cabrera et al. [1988] (in reality, they presented their study’s results on July, 1985) is the earliest work we found across the publicly accessible literature on TCP/IP’s implementation experimental evaluation. Their study was an ambitious one, which objective was to assess the impact that different processors, network hardware interfaces, and Ethetnets have on the communication across machines, under various hosts’ and communication medias’ load conditions. Their measurements highlighted the ultimate bounds on communication performance perceived by application programs. Moreover, they presented a detailed timing analysis of the dynamic behavior of the networking software. They studied TCP/IP’s implementation within 4.2BSD when ran by then state of the art minicomputers, attached to legacy Ethetnets. Consequently, their study’s results are no longer valid. Worst yet, they used the gprof(1) and kgmon(8) tools for profiling. These tools are only supported by software and, consequently, produce results that have limited accuracy when compare to results produce by performance monitoring hardware counters, as we do. However complete their study was, they did not pursue to build a system model, like we do.
Sanghi et al. [1990] is the earliest work we found across the publicly accessible literature on TCP/IP’s implementation experimental evaluation that uses software profiling. Their study is very narrow when compare to ours in the sense that their study’s objective was only to determine the suitability of roundtrip time estimators for TCP implementations.

Papadopoulos and Gurudatta [1993] presented results of a more general study on TCP/IP’s implementation performance, obtained using software profiling. They studied TCP/IP’s implementation within a 4.3BSD-derived operating system—SUN OS. Like our work, their study’s objective was to characterize the performance of the networking software. Conversely, their study was not aimed to produce a system model.

Kay and Pasquale [1996] (in reality, they presented their study’s results on September, 1994) conducted another TCP/IP’s implementation performance study. Differently from previous work, their study was carried out at a different granularity level; that is, they went inside the networking functions and analyzed how these functions used the source code—touching data, protocol-specific processing, memory buffers manipulation, error checking, data structure manipulation, and operating system overhead. Moreover, their study’s objective was to guide a search for bottlenecks in archiving high throughput and low latency. Once again, they did not pursue building a system model.

Ramamurthy [1988] modeled the UNIX system using a queuing network model. However, his model characterizes the system’s multitasking properties and therefore cannot be applied to study the networking software, which is governed by the software interrupt mechanism. Moreover, Ramamurthy’s model was only solved for predicting mean values, something that is not enough when conducting performance analyses of today’s networking software. What is required is a model capable of producing the complete probability function of operational parameters so analysis at several levels of detail may be performed.

Björkman and Gunningberg [1998] modeled an Internet protocols’ implementation using a queuing network model, however, their model characterizes a user-space implementation (base on a parallelized version of University of Arizona’s x-kernel [Hutchinson and Peterson 1991]) and therefore disregards the software interrupt mechanism. Moreover, their model was aimed to predict only system throughput (measured in packets per second) when the protocols are hosted by shared-memory multiprocessor systems. Besides, their study was aimed only for high-performance distributed computing, where it is considered that connections are always open with a steady stream of messages—that is, no retransmissions or other unusual events occur. All this impedes the utilization of their model for our intended applications.

Packet service disciplines and their associated performance issues have been widely studied in the context of bandwidth scheduling in packet-switched networks [Zhang 1995]. Arguably, several such disciplines now exist that are both fair—in terms of assuring access to link bandwidth in the presence of competing packet flows—and easy to implement efficiently—with both hardware and software. Recently, an interest has arisen to map these packet service disciplines in the context of CPU scheduling for programmable and software routers. Qie et al. [2001], and Chen and Morris [2001], for a software router, and Pappu and Wolf [2001], for a programmable router, have put forward solutions that are focused on suitably distributing the workload of a router’s
processor. However, neither the formers nor the latters consider the problem of I/O bus bandwidth scheduling. And this problem is important in the context of software routers, as we demonstrate.

Chiuheh and Pradhan [2000] recognized the suitability and inherent limitations of using PC technology for implementing software routers. In order to overcome the performance limitations of PCs’ I/O buses, and to construct high-speed routers, they have proposed using clusters of PCs. Upon this architecture, several PCs having at most one network interface card each are tightly couple by means of a Myrinet system area network to conform a software router with as many subnetwork attachments as computing nodes. After solving all internodes communication, memory coherency and routing table distribution problems—arguably not an easy task—a “clustered router” may be able to overcome the limitations of current PCs’ I/O buses and not only provide high performance—in term of packet per second—but also support QoS mechanisms. Our work, however, is aimed at supporting QoS mechanisms in clearly simpler and less expensive PC based software routers.

Scottis, Krunz and Liu [1999] recognized the performance limitations of the omnipresent Peripheral Component Interconnect (PCI) PC I/O bus to support QoS mechanisms, and proposed an enhancement. Conversely to our software enhancement, theirs introduced a new bus arbitration protocol that has to be implemented with hardware. Moreover, their bus arbitration protocol is aimed to improve bus support for periodic and predictable real-time streams only, and clearly this is not suitable for Internet routers.

There is general agreement in the PC industry that the demands of current user applications are quickly overwhelming the shared parallel architecture and limited bandwidth of the various types of PCI buses. (Erlanger, L. Editor “High-performance busses and interconnects,” http://www.pcmag.com/article2/0,4149,293663,00.asp current as 8 July 2002.) With this increasing demand and its lack of QoS, PCI has been due for a replacement in different system and application scenarios for more than a few years now. 3GIO, InfiniBand, RapidIO and HyperTransport are new technologies designed to improve I/O and device-to-device performance in a variety of system and application categories, however, not all are direct replacements for PCI. In this sense, 3GIO and InfiniBand may be considered as direct candidates to succeed PCI. All these technologies have in common that they define packet-based, point-to-point serial connections with layered communications, which readily support QoS mechanisms. However, due to the large installation base of PCI equipment, it is expected that the PCI bus will be used for still some years. Consequently, we think our work is important.