A Practical Verification Framework
for Preemptive OS Kernels

Fengwei Xu\textsuperscript{1,2}, Ming Fu\textsuperscript{1,2(✉)}, Xinyu Feng\textsuperscript{1,2}, Xiaoran Zhang\textsuperscript{1,2}, Hui Zhang\textsuperscript{1,2},
and Zhaohui Li\textsuperscript{1,2}

\textsuperscript{1} School of Computer Science and Technology,
University of Science and Technology of China,
Hefei, China
fuming@ustc.edu.cn
\textsuperscript{2} Suzhou Institute for Advanced Study,
University of Science and Technology of China,
Suzhou, China

Abstract. We propose a practical verification framework for preemptive
OS kernels. The framework models the correctness of API implementations
in OS kernels as contextual refinement of their abstract specifications.
It provides a specification language for defining the high-level
abstract model of OS kernels, a program logic for refinement verification
of concurrent kernel code with multi-level hardware interrupts, and
automated tactics for developing mechanized proofs. The whole framework
is developed for a practical subset of the C language. We have
successfully applied it to verify key modules of a commercial preemptive
OS $\mu$C/OS-II [2], including the scheduler, interrupt handlers, message
queues, and mutexes. We also verify priority-inversion freedom
(PIF) in $\mu$C/OS-II. All the proofs are mechanized in Coq. To our knowledge,
our work is the first to verify the functional correctness of a practical
preemptive OS kernel with machine-checkable proofs.

1 Introduction

Verifying OS kernels has long been recognized as an important but also extremely
challenging task. There have been exciting efforts for OS kernel verification
[4,13,16,27] in recent years, but most of them have no or limited support of
kernel-level preemption, which allows tasks to be preempted even in kernel mode.
This limitation restricts their applicability to real-time systems, where preemptive
multitasking is indispensable for achieving real-time guarantees.

Preemptive kernels require explicit invocation of schedulers inside interrupt
handlers and careful interrupt management in the kernel code, which make the
kernel highly concurrent and complex. In this paper we propose a verification
framework for preemptive OS kernels, and show its application in verifying key
modules of $\mu$C/OS-II [2], a commercial preemptive real-time multitasking kernel
for microprocessors and microcontrollers. The verification is fully mechanized in Coq [1]. To our knowledge, it is the first verification of (key modules of) a preemptive OS kernel with machine-checkable proofs. The key contribution of the work is to adapt existing theories on interrupt verification [11] and contextual refinement of concurrent programs [17,19,24,25], and integrate them into a framework for real-world preemptive OS kernel verification. Specifically, our work makes the following new contributions:

\textsuperscript{✉} This work is supported in part by grants from National Natural Science Foundation
of China (NSFC) under Grant Nos. 61103023, 61229201, 61379039 and 91318301.

© Springer International Publishing Switzerland 2016
DOI: 10.1007/978-3-319-41540-6_4

**First**, we formulate and verify the correctness of the APIs of OS kernels as contextual refinement between their implementations and specifications. Although refinement approaches have been applied in earlier work on OS kernel verification [4,13,16], we believe our work is the first to explicitly specify and prove contextual refinement for APIs of a preemptive OS kernel, following recent progress on refinement verification of concurrent programs [17,19,24,25]. As we explain in Sect. 2.2, contextual refinement not only serves as a very strong notion of functional correctness of system APIs, but also allows us to prove properties based on the more abstract API specifications and then carry them down to the level of concrete implementations, which makes the verification much simpler than doing proofs directly at the concrete level.

**Second**, we provide a simple modeling language for specifying kernel primitives. The language strives for balance between abstraction and expressiveness for scheduling. On the one hand, we want the specification to abstract away implementation details. On the other hand, it should provide enough details so that many important properties can be specified at the abstract specification level. Our modeling language provides an abstract sched command, allowing us to specify explicitly when the scheduler is invoked in synchronization primitives or interrupt handlers. The semantics of sched is parameterized over abstract scheduling policies (e.g., priority-based or round-robin). Expressiveness about these details is necessary to specify system-wide scheduling properties.

**Third**, we propose a program logic for refinement verification of concurrent kernel programs. The logic supports multi-level nested hardware interrupts and configurable schedulers. It extends concurrent separation logic [21] (CSL) with relational assertions that relate program states at the implementation and the specification levels, as in Liang et al. [17,19]. It also assigns ownership-transfer semantics to interrupt management operations and verifies multi-level hardware interrupts in a realistic setting. Unlike traditional Hoare-style program logics, whose soundness ensures the semantic interpretation of Hoare triples, our logic explicitly establishes contextual refinement, which is more useful for establishing abstractions for system APIs, as explained above.

**Fourth**, our framework is developed for a practical subset of C. It has been successfully applied to verify key APIs of μC/OS-II [2], including the timer interrupt handler (and a pseudo interrupt handler to demonstrate the support of multi-level interrupts), the scheduler, the time management, and four synchronization mechanisms: message queues, mailboxes, semaphores, and mutexes. It is worth noting that, unlike existing works [4,13,16,27] that are focused on kernels newly developed with verification in mind, we take a commercial system developed by an independent third party and verify the code with minimal modification, which demonstrates the generality and applicability of our framework.

**Fifth**, we also specify and verify priority inversion freedom (PIF) of μC/OS-II. PIF is a crucial property for real-time systems and is worth verifying in its own right. Moreover, since the specification and verification are done at the level of the abstract model (i.e., specifications) of the kernel, they also help validate our model of system APIs. As explained above, many important properties cannot be specified if the model is too weak or overly abstract.

Coq proofs and a companion technical report are available at http://staff.ustc.edu.cn/~fuming/research/certiucos.

2 Background and Overview of Our Work

2.1 Preemptive OS Kernels and Interrupts

In a preemptive OS kernel, execution of a task inside the kernel can be interrupted at any program point (unless interrupts are disabled). Then the control is switched to the interrupt handler. When the handler finishes, it may invoke the scheduler and switch the execution context to a different task, instead of returning to the original interrupted task. For instance, with priority-based scheduling, the interrupt handler always switches to the highest priority task at its end.

The x86 Interrupt Mechanism. Interrupt handling and management are indispensable in preemptive OS kernels. We give an overview of the interrupt mechanism in x86 systems (based on the Intel 8259A interrupt controller).

The CPU has a flag bit IF indicating whether interrupts are enabled or not. The cli/sti instruction clears/sets the bit to disable/enable interrupts. In 8259A there is a register isr, each bit of which corresponds to a hardware interrupt and records if the interrupt is being served or not. Different priority levels are assigned to different sources of interrupts, with level-0 being the highest. When an interrupt request comes, we check IF and isr. If the interrupts are enabled and there is currently no interrupt with higher or the same priority being served, the request will be served. The corresponding bit in isr is set to 1 and the control jumps to the corresponding interrupt handler.
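
The acceptance rule just described can be sketched as a small executable model. This is only an illustration under stated assumptions, not code from the verified kernel or from the 8259A: the names `can_serve` and `accept` are hypothetical, `isr` is modeled as a Python list of bits, and index 0 is taken to be the highest priority level.

```python
from typing import List

def can_serve(if_flag: int, isr: List[int], k: int) -> bool:
    """Decide whether a level-k interrupt request is served.

    isr[j] == 1 means interrupt j is in service; index 0 is the highest priority.
    """
    if if_flag != 1:
        return False  # interrupts are disabled (after cli)
    # Blocked if any interrupt of higher or equal priority is being served.
    return all(isr[j] == 0 for j in range(k + 1))

def accept(isr: List[int], k: int) -> List[int]:
    """Return a copy of isr with bit k set, as done when request k is served."""
    updated = list(isr)
    updated[k] = 1
    return updated
```

With `isr = [0, 0, 1, 0]` (level 2 in service), a level-1 request is accepted while level-2 and level-3 requests are held back, which matches the rule that only strictly higher priorities may preempt.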

On invocation of an interrupt handler, the CPU flags (including IF) are saved on the stack, and interrupts are disabled automatically. If interrupts are enabled again inside the handler, the handler can be further interrupted by requests with higher priorities, causing nested interrupts.

The handler returns to the program being interrupted using the iret instruction, which also restores the flags (including IF). Before the handler returns, it needs to execute eoi to send an “end of interrupt” signal to the interrupt controller, which clears the corresponding bit in isr. Note that after eoi but before iret, if interrupts are enabled (IF = 1), the handler could be interrupted by interrupts at a lower or the same level.

Overview of μC/OS-II. μC/OS-II is a commercial preemptive real-time multitasking OS kernel developed by Micrium [2]. The kernel has 6000+ lines of C code and 300+ lines of assembly. It allows a fixed number of tasks, multi-level interrupts, and preemptive priority-based scheduling. The system APIs include “semaphores; event flags; mutual-exclusion semaphores that eliminate unbounded priority inversions; mailboxes; message queues; task, time and timer management; and fixed sized memory block management” [2]. μC/OS-II is developed for microprocessors and microcontrollers, and it does not support virtual memory. It has been deployed in many real-world safety critical applications, including avionics (e.g., the Mars Curiosity Rover) and medical equipment.

2.2 Overview of the Verification Framework

An OS kernel hides details of the underlying hardware and provides an abstract programming model for application-level programmers. The implementation of the kernel must ensure that behaviors of user applications in the real machine are consistent with their behaviors under the abstract model [14]. Thus the OS verification can be reduced to verifying refinement between the concrete and abstract programming models.

Contextual Refinement as Correctness. We consider three entities: the application $A$, the abstract specifications of the system APIs and interrupt handlers $\mathcal{O}$, and their concrete implementations $O$. When system calls are made or interrupts are handled, routines in $O$ are invoked in the real execution, while in the programmers’ mind those in $\mathcal{O}$ are invoked instead at the abstract level. The correctness of OS kernels then requires that $O$ refine $\mathcal{O}$ under all contexts $A$:

$$\forall A. [A[O]] \subseteq [A[\mathcal{O}]]$$

where $[\cdot]$ maps a program $P$ to the set of its observable behaviors. It says that, for all applications, executing the concrete code $O$ does not have more observable behaviors than executing the abstract version $\mathcal{O}$. In this paper, observable behaviors are defined as finite prefixes of execution traces consisting of observable events, following Liang et al. [17].

Contextual refinement is a very strong notion of functional correctness of system APIs since it quantifies over all applications. Moreover, it makes verification of system-wide properties simpler. For instance, if we want to verify a certain property $\Phi$ about a whole system $A[O]$, i.e., $\Phi$ holds over every trace in $[A[O]]$, we could prove that it holds over every trace in the superset $[A[\mathcal{O}]]$ instead. Proofs at the abstract level could be much simpler than at the concrete level.
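
The argument that a trace property proved on the abstract behaviors carries over to the concrete ones can be illustrated on finite toy data. The sketch below is an assumption-laden illustration: behaviors are represented as finite sets of event tuples, and the events `req`, `ack`, `nack` and the helper names are hypothetical, not taken from the paper.

```python
from typing import Callable, Iterable, Set, Tuple

Trace = Tuple[str, ...]

def refines(concrete: Set[Trace], abstract: Set[Trace]) -> bool:
    """Trace-set inclusion: the concrete system has no extra observable behavior."""
    return concrete <= abstract

def holds_on_all(prop: Callable[[Trace], bool], behaviors: Iterable[Trace]) -> bool:
    """Check that a trace property holds for every behavior in the set."""
    return all(prop(t) for t in behaviors)

def reply_follows_request(trace: Trace) -> bool:
    """Toy property: no 'ack' or 'nack' occurs before the first 'req'."""
    seen_req = False
    for event in trace:
        if event == "req":
            seen_req = True
        elif event in ("ack", "nack") and not seen_req:
            return False
    return True

# Finite prefixes of traces for a toy abstract model and a toy implementation.
abstract_behaviors = {(), ("req",), ("req", "ack"), ("req", "nack")}
concrete_behaviors = {(), ("req",), ("req", "ack")}
```

Because `concrete_behaviors` is a subset of `abstract_behaviors`, checking `reply_follows_request` on the abstract set is enough to conclude it for the concrete set; the converse direction does not hold in general.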

The Whole Verification Framework. Figure 1 shows the structure of our verification framework. To model OS kernels and applications, we introduce two languages (in block A), the low-level language for the concrete code implementation and the high-level language for the abstract specification. Above them we have a program logic (in block B) that allows us to prove the low-level kernel implementation contextually refines the high-level specifications. The framework also provides a set of Coq tactics (in block C) to automatically generate and prove verification conditions. The $\mu$C/OS-II modules certified in this framework are shown in block D. Below we give details of some of the building blocks.

3 Modeling of OS Kernels

As explained above, the correctness of OS kernels is formalized based on three entities: user applications $A$, the concrete implementation $O$, and the abstract specification $\mathcal{O}$. In this section we introduce the programming (or modeling) languages for the three entities (see block A in Fig. 1). Due to space limits, we show only the main language features, simplified for clarity of presentation. The details are available in the technical report and the Coq code [26].

3.1 The Low-Level Language

The low-level language consists of two parts for implementations of user applications and OS kernels, respectively.

*Application Language.* The application language is shown at the top of Fig. 2. It is a subset of the C language consisting of function calls, pointer operations (except pointer arithmetic), arrays, structs, bit operations, *etc.* The application code $A$ maps function names to their function bodies. The command $f(\overline{e})$ calls the function $f$, which could be either an application function in $A$ or an OS API (in $O$ at the low level or in $\mathcal{O}$ at the high level, as we explain below).

(AExpr) \[ e ::= n | x | *e | \&e | e.id | e[e] | \ldots \]
(AppStmts) \[ d ::= e=e | f(\overline{e}) | d;d | \textbf{while} \ (e) \ d | \textbf{if} \ (e) \ d \ \textbf{else} \ d | \textbf{return} \ e | \ldots \]
(AppCode) \[ A ::= \{ f_1 \leadsto d_1, \ldots, f_n \leadsto d_n \} \]

(LPrim) \[ \iota ::= \text{switch} \ x | \text{encrt} | \text{excrt} | \text{eoi} \ k | \text{iext} | \ldots \]
(LStmts) \[ s ::= d | \iota | s; s | \textbf{while} \ (e) \ s | \ldots \]
(ItripCode) \[ \theta ::= [s_0, \ldots, s_{N-1}] \]
(ProgUnit) \[ \eta ::= \{ f_1 \leadsto s_1, \ldots, f_n \leadsto s_n \} \]
(LProg) \[ P ::= (A, O) \]

(BitVal) \[ b, ie \in \{0, 1\} \]
(ISRReg) \[ isr ::= [b_0, \ldots, b_{N-1}] \]
(CrtStk) \[ cs ::= \text{nil} | ie::cs \]
(ItripStk) \[ is ::= \text{nil} | k::is \]
(ItripTaskSt) \[ \delta ::= (ie, is, cs) \]
(ItripSt) \[ \pi ::= \{ t_1 \leadsto \delta_1, \ldots, t_n \leadsto \delta_n \} \]

Fig. 2. The language for applications and kernel implementation

Note that the correctness of OS kernels is independent of the implementation language of \( A \). Here we pick the C language for \( A \) to simplify the formalization: since the applications and the kernel are implemented in the same language, we do not have to consider the interaction between different languages when defining the behaviors of the whole system \( A[O] \).

Low-Level Language for OS Kernels. The middle of Fig. 2 shows the low-level language for the concrete implementation of OS kernels. Usually the kernels are implemented in C with inline assembly. However, giving semantics directly to C with inline assembly requires us to expose stacks and registers, which makes the semantics overly complex. To avoid this problem, we extend the C statements with assembly primitives \( \iota \) to encapsulate the assembly code. Semantics of these primitives will be given below.

\textbf{switch} \( x \) switches to the target task \( x \). \textbf{encrt} enters a critical region by disabling interrupts. It also saves the old IF onto the stack to allow nested critical regions. Note that we use \textit{ie} to model the IF flag and abstract away other bits in the hardware EFLAGS register. \textbf{excrt} exits the current critical region by popping the stack to recover \textit{ie}. Since we hide stacks in our state model, we use an abstract stack \textit{cs} to save the historical \textit{ie} bits (see Fig. 2, which is explained below). \textbf{eoi} \( k \) clears the \( k \)-th bit in \textit{isr}, indicating that the \( k \)-th interrupt is no longer in service. \textbf{iext} enables interrupts and returns to the interrupted program.

The kernel implementation \( O \) consists of the system API implementation \( \eta_a \), the internal functions \( \eta_i \) and the interrupt handlers \( \theta \). The internal functions are called only by code in \( \eta_a \) or \( \theta \). \( \theta \) is a sequence of \( N \) interrupt handlers, where \( N \) is the maximum number of interrupts we support. The handler with the lower identifier has the higher priority. Then a complete low-level program \( P \) is defined as a pair of the application code \( A \) and the kernel code \( O \).

Operational Semantics. The language is concurrent, with multiple continuations (\textit{i.e.}, control stacks) in the state, each corresponding to a task. All tasks share memory, but each has its own local variables and local interrupt states (see \( \delta \) in Fig. 2, which is explained below). We also separate the program state (including memory and variables) into two disjoint parts, one for the application code \( A \) and the other for the kernel code \( O \). The only way for \( A \) to access kernel states is to call system APIs in \( O \), and \( O \) cannot access application states.

We give small-step operational semantics to the language. For each step, the processor picks the continuation of the current task and executes its current command or expression. To model concurrency and interrupts, both commands and expressions could be executed in multiple steps, where each step corresponds to the granularity of a single machine instruction (as in CompCertTSO [22], but we use the sequentially consistent model instead of the x86-TSO memory model).

The assembly implementation of the context switch routine is abstracted into the primitive \texttt{switch} \( x \). It switches the execution from the current task to the target task \( x \), where \( x \) stores the task identifier.

The other assembly primitives \( \iota \) are all related to interrupt management and handling. To model their semantics, we introduce interrupt states in the state model, as shown at the bottom of Fig. 2. The global register \texttt{isr} is shared by all tasks. It models the \texttt{isr} register in the 8259A interrupt controller, as explained in Sect. 2.1. In addition, each task has a local interrupt state \( \delta \). It contains a local copy \( ie \) of the \texttt{IF} flag in the EFLAGS register (see Sect. 2.1) recording whether interrupts are enabled, a stack \texttt{cs} consisting of the historical values of \( ie \) to support nested critical regions, and another stack \texttt{is} recording the sequence of interrupts that interrupt the execution of the task. The stack \texttt{is} is auxiliary data introduced mainly for verification purposes. \( \pi \) records the \( \delta \) of each task.

\texttt{encrt} enters a critical region by disabling interrupts (i.e., clearing the \( ie \) bit using \texttt{cli}). It also saves the old \( ie \) onto the \texttt{cs} stack. \texttt{excrt} exits the critical region by popping off the top value on \texttt{cs} and using it to restore \texttt{ie} (executing \texttt{sti} if the value is 1).

\texttt{eoi} \( k \) clears the \( k \)-th bit in \texttt{isr}, indicating that the \( k \)-th interrupt is no longer in service. \texttt{iext} is an abstraction of the \texttt{iret} instruction. It resets the \( ie \) bit to 1 to enable interrupts, pops the topmost interrupt number off the \texttt{is} stack, and returns to the interrupted program.
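
The effect of these primitives on the interrupt state can be sketched as follows. This is a minimal illustration, assuming Python lists with the stack top at index 0; the class name `TaskIntState` and the field name `is_` (renamed because `is` is a Python keyword) are hypothetical, and the control transfer performed by \texttt{iext} is not modeled.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskIntState:
    """Task-local interrupt state delta = (ie, is, cs)."""
    ie: int = 1                                    # local copy of the IF flag
    is_: List[int] = field(default_factory=list)   # interrupts that preempted the task
    cs: List[int] = field(default_factory=list)    # saved ie values for nested regions

def encrt(d: TaskIntState) -> None:
    """Enter a critical region: save the old ie on cs, then disable interrupts."""
    d.cs.insert(0, d.ie)
    d.ie = 0

def excrt(d: TaskIntState) -> None:
    """Exit a critical region: restore ie from the top of cs."""
    d.ie = d.cs.pop(0)

def eoi(isr: List[int], k: int) -> None:
    """Clear the k-th bit of the global isr register."""
    isr[k] = 0

def iext(d: TaskIntState) -> None:
    """Model of iext on the state only: enable interrupts and pop the is stack."""
    d.ie = 1
    d.is_.pop(0)
```

Two nested `encrt` calls followed by two `excrt` calls leave `ie` at its original value, which is the reason for keeping a stack rather than a single saved bit.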

3.2 The High-Level Specification Language

From the perspective of application programmers, we model the OS kernel as an extended C language with multi-tasking and system calls. As explained above, the C language is used to implement user applications \( A \), and the system calls invoke an abstract version of system routines in \( O \), which are implemented using a simple specification language. Correspondingly, the low-level concrete representation of kernel states is modeled as algebraic abstract states at the high level. This section presents the high-level language and its semantics.

As shown in Fig. 3, the whole high-level program \( P \) consists of the application code \( A \) and the abstract specification of the kernel \( \mathcal{O} \). The application code \( A \) is the same as in the low-level language (see Fig. 2). \( \mathcal{O} \) contains the specifications \( \varphi \) for kernel APIs, \( \varepsilon \) for interrupt handlers, and \( \chi \) for the scheduler.

Programmers at this level have no control over interrupts (e.g., enabling or disabling interrupts). Always enabled, interrupts are modeled implicitly as abstract external events that may occur non-deterministically at any program point. At the high level an incoming level-\( k \) event is always handled by executing \( \varepsilon(k) \), i.e., the \( k \)-th handler specified in \( \varepsilon \).

The system APIs and interrupt handlers are specified as an abstract statement \( s \), which forms a simple but expressive specification language. \( \text{sched} \) does scheduling. Its semantics is determined by the abstract scheduler specification \( \chi \). As defined in Fig. 3, \( \chi \) is a binary relation between abstract states and task identifiers. That is, given an abstract state \( \Sigma \) (defined at the bottom of Fig. 3), \( \chi \) finds a related task identifier as the next task to execute. Note that \( \chi \) is a relation instead of a function, therefore the abstract scheduler could be non-deterministic. Since \( \chi \) is provided as part of the kernel specification, the semantics of \( \text{sched} \) in our language is configurable. Specifying details of the scheduling policies (instead of using a more abstract non-deterministic scheduler that may pick any task) allows us to specify and verify scheduling properties such as PIF at the high level.

\( \gamma(\vec{v}) \) is a meta-level relation (defined in Coq) that takes \( \vec{v} \) as arguments and maps an abstract state to another. It can be instantiated to specify any atomic transitions over abstract states. \( \text{assert} \ \varphi \) asserts that the predicate \( \varphi \) holds over the current abstract state. \( \text{end} \) represents the end of abstract APIs or interrupt handlers. \( s_1; s_2 \) and \( s_1 + s_2 \) are statements for sequential composition and non-deterministic choice, respectively.

**Fig. 3.** High-level spec. language and abstract states

(a natural number), the task status (ready, waiting, etc.) and so on, depending on the low-level implementations. \( \text{ctid} \) is the name for the current task identifier \( t \).

**Example of High-Level Specifications.** We use \( s_{\text{dly}} \overset{\text{def}}{=} (\gamma_{\text{err}}(\text{ticks}) + (\gamma_{\text{dly}}(\text{ticks}); \text{sched})) \) to specify the system API “void OSTimeDly(Int16u ticks)”, which delays the current task for the specified number of system ticks. The atomic operation \( \gamma_{\text{err}}(\text{ticks}) \) specifies the error case when \( \text{ticks} = 0 \). \( \gamma_{\text{dly}}(\text{ticks}) \) defines the atomic behavior of updating the status of the current task from “ready” to “waiting” with the duration set to \( \text{ticks} \) when \( \text{ticks} > 0 \), and the following \( \text{sched} \) switches to another ready task, following the scheduling policy specified by the abstract scheduler \( \chi \). Note that the exclusive conditions over \( \text{ticks} \) in \( \gamma_{\text{err}}(\text{ticks}) \) and \( \gamma_{\text{dly}}(\text{ticks}) \) make the non-deterministic choice statement deterministic. We omit the definitions of \( \gamma_{\text{err}}(\text{ticks}) \) and \( \gamma_{\text{dly}}(\text{ticks}) \) here.
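
The shape of this specification can be sketched executably. Since the definitions of \( \gamma_{\text{err}} \) and \( \gamma_{\text{dly}} \) are omitted here, the bodies below are hypothetical stand-ins chosen only to match the prose: each atomic operation returns the list of possible successor states (empty when its guard fails), the abstract state is a dictionary with keys `ctid` and `tcbls`, and the scheduler policy `chi` is passed in as a predicate.

```python
def gamma_err(ticks, st):
    """Hypothetical error case: enabled only when ticks == 0, state unchanged."""
    return [st] if ticks == 0 else []

def gamma_dly(ticks, st):
    """Hypothetical delay case: mark the current task as waiting for `ticks`."""
    if ticks <= 0:
        return []
    tcbls = dict(st["tcbls"])          # copy so the input state is not mutated
    prio, _status = tcbls[st["ctid"]]
    tcbls[st["ctid"]] = (prio, ("wait", ticks))
    return [{"ctid": st["ctid"], "tcbls": tcbls}]

def sched(chi, st):
    """Switch to any task that the scheduler relation chi relates to the state."""
    return [{"ctid": t, "tcbls": st["tcbls"]} for t in st["tcbls"] if chi(st, t)]

def s_dly(ticks, chi, st):
    """s_dly = gamma_err(ticks) + (gamma_dly(ticks); sched)."""
    left = gamma_err(ticks, st)
    right = [s2 for s1 in gamma_dly(ticks, st) for s2 in sched(chi, s1)]
    return left + right                # non-deterministic choice as union of outcomes

def any_ready(st, t):
    """A deliberately simple placeholder policy: any ready task may be chosen."""
    return st["tcbls"][t][1] == "rdy"
```

Because the guards `ticks == 0` and `ticks > 0` exclude each other, at most one branch of the choice contributes outcomes for a given argument, which mirrors the remark that the choice is in fact deterministic.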

As another example, below we show the abstract scheduler specification \( \chi_{\mu C/OS-II} \) for \( \mu C/OS-II \). It requires that the selected task be ready and have the highest priority among all the ready tasks.

\[
\chi_{\mu C/OS-II} \overset{\text{def}}{=} \lambda \Sigma, t. \exists \alpha, \text{pr}. \Sigma(\text{tcbls}) = \alpha \land \alpha(t) = (\text{pr}, \text{rdy}) \land \forall t', \text{pr}'. (t \neq t' \land \alpha(t') = (\text{pr}', \text{rdy})) \rightarrow \text{pr}' < \text{pr}
\]
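
Read as a predicate, the scheduler specification can be sketched as below. This is a literal transcription under assumptions: `tcbls` is a dictionary from task identifiers to `(priority, status)` pairs, and the comparison between priorities is passed in as the parameter `below`, defaulting to numeric `<` as written in the formula. Whether a numerically smaller value denotes a lower or a higher priority is not settled by the formula alone, so the parameter is left open rather than fixed.

```python
import operator

def chi_ucos(tcbls, t, below=operator.lt):
    """Return True if t is ready and every other ready task has priority below t's.

    tcbls maps task ids to (priority, status); status "rdy" means ready.
    `below(p_other, p)` decides whether p_other ranks below p.
    """
    if t not in tcbls:
        return False
    prio, status = tcbls[t]
    if status != "rdy":
        return False
    return all(
        below(other_prio, prio)
        for other, (other_prio, other_status) in tcbls.items()
        if other != t and other_status == "rdy"
    )
```

Calling the function with `below=operator.gt` gives the reading in which smaller numbers rank higher; the structure of the predicate is unchanged.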

3.3 OS Correctness

As we explain in Sect. 2.2, the correctness of OS kernels can be defined in terms of contextual refinement. Below we give its formal definition.

**Definition 3.1 (OS Correctness).** \( O \sqsubseteq_{\psi} \mathcal{O} \) iff

\[
\forall A, W, \mathbb{W}. \text{Match}(\psi, W, \mathbb{W}) \implies ((A, O), W) \preceq ((A, \mathcal{O}), \mathbb{W})
\]

where \( \psi \in \text{LOSFullSt} \rightarrow \text{HAbsSt} \rightarrow \text{Prop} \) and

\[
\text{Match}(\psi, (T, \Delta, \Lambda, t), (T, \Delta, \Sigma)) \overset{\text{def}}{=} (t \in \text{dom}(T)) \land (\psi \Lambda \Sigma) \land (t = \Sigma(\text{ctid})) \land (\text{dom}(T) = \text{dom}(\Sigma(\text{tcbls})))
\]

The low-level kernel code \( O \) refines its high-level abstract specification \( \mathcal{O} \) with constraints \( \psi \) over initial kernel states, denoted as \( O \sqsubseteq_{\psi} \mathcal{O} \), if and only if for any client code \( A \), low-level state \( W \) and high-level state \( \mathbb{W} \), if \( W \) and \( \mathbb{W} \) satisfy a certain consistency constraint (w.r.t. \( \psi \)), then the set of observable behaviors of the low-level configuration \( ((A, O), W) \) is a subset of that of \( ((A, \mathcal{O}), \mathbb{W}) \), i.e., \( ((A, O), W) \preceq ((A, \mathcal{O}), \mathbb{W}) \), following the event trace refinement in [17].

Due to space limits, we elided the definitions of \( W \) and \( \mathbb{W} \) in Sects. 3.1 and 3.2. The low-level whole program state \( W \) is in the form of \( (T, \Delta, \Lambda, t) \), where the task pool \( T \) maps task identifiers to their continuations, \( \Delta \) is the client state, \( \Lambda \) is the low-level kernel state, and \( t \) is the identifier of the current task. The high-level program state \( \mathbb{W} \) is in the form of \( (T, \Delta, \Sigma) \), where \( \Sigma \) is an abstraction of the low-level kernel state \( \Lambda \) and the current task id \( t \).

Fig. 4. Specification of concurrent programs

The constraint \( \text{Match} \) requires that: (1) initially \( W \) and \( \mathbb{W} \) have the same task pool \( T \) and client state \( \Delta \); (2) the current task \( t \) is in \( T \); (3) the low-level kernel state $\Lambda$ and the high-level abstract state satisfy $\psi$; (4) the current task at the low level and the high level are the same; and (5) the set of tasks in the abstract TCB list should be the same as those in the low-level task pool.
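
The five conditions can be written down as a single predicate. The sketch below is only illustrative: states are modeled as Python tuples and dictionaries, the function name `match` and the encoding of \( \Sigma \) with keys `ctid` and `tcbls` are assumptions, and `psi` is any user-supplied relation between the low-level kernel state and the abstract state.

```python
def match(psi, low, high):
    """Check the Match constraint between W = (T, Delta, Lambda, t) and (T, Delta, Sigma)."""
    task_pool_lo, client_lo, kernel_lo, current = low
    task_pool_hi, client_hi, sigma = high
    return (
        task_pool_lo == task_pool_hi                      # (1) same task pool ...
        and client_lo == client_hi                        #     ... and same client state
        and current in task_pool_lo                       # (2) current task exists
        and psi(kernel_lo, sigma)                         # (3) kernel states related by psi
        and current == sigma["ctid"]                      # (4) same current task
        and set(task_pool_lo) == set(sigma["tcbls"])      # (5) same set of tasks
    )
```

Condition (1) is implicit in the formal definition, where both states are written with the same \( T \) and \( \Delta \); the sketch makes it an explicit equality test.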

4 Relational Program Logic for Refinement Verification

In this section, we present a CSL-style relational program logic for refinement verification. The logic uses relational assertions to prove refinement between an implementation and its specification. It also follows the ownership-transfer semantics in CSL to reason about multi-level hardware interrupts.

Refinement of Concurrent Programs, and Relational Reasoning. For concurrent programs, refinement establishes stronger functional correctness than traditional Hoare triples. As an example, the function inc shown in Fig. 4(a) increments the counter cnt. It may be called simultaneously by concurrent tasks. Figure 4(b) gives pre-/post-conditions to specify inc, which would be valid in a sequential setting and are sufficient to describe the functionality. However, they cannot be used in a concurrent setting because they are not stable with respect to concurrent behaviors of other tasks. To make them stable, we may need the specification in Fig. 4(c), which is too weak to capture the functionality.

Figure 4(d) gives a relational specification to show that inc refines an abstract operation $\langle \text{CNT}++ \rangle$ [19], where $\langle C \rangle$ represents an atomic operation $C$. The relational assertions specify three important entities: the concrete state (cnt), the abstract state (CNT), and the abstract operation ($\langle \text{CNT}++ \rangle$) that the program refines (which could be non-atomic in general [19]). The precondition requires that initially cnt has a value consistent with its abstract counterpart CNT, and that the abstract operation that inc needs to refine is $\langle \text{CNT}++ \rangle$. The post-condition ensures that cnt and CNT remain consistent and that the remaining abstract operation to be refined is end (i.e., $\langle \text{CNT}++ \rangle$ has been accomplished).

Our refinement proofs for OS kernels follow the same kind of relational reasoning, where the assertions now relate the concrete kernel state, the abstract kernel state ($\Sigma$) and the abstract statement ($s$).

Assertions. Below is the assertion language, and its semantics is given in Fig. 5.

\[
\begin{array}{lrcl}
(\text{Asrt}) & p, q, r & ::= & \mathsf{emp} \mid \mathsf{empE} \mid x \rightarrow v \mid \text{ISR}(isr) \mid \text{IE}(ie) \mid \text{IS}(is) \mid \text{CS}(cs) \mid \llbracket k \rrbracket \mid \chi \triangleleft t \\
 & & \mid & a \rightarrow \Omega \mid \|s\| \mid p \ast p \mid p \wedge p \mid \ldots \\
(\text{Inv}) & I & ::= & [p_0, \ldots, p_N]
\end{array}
\]

\( (\text{RelState}) \Theta ::= (\sigma, \Sigma, s) \quad (\text{LTaskCfg}) \sigma ::= (m, isr, \delta) \quad (\text{LTaskSt}) m ::= (G, E, M) \)

\[
\begin{array}{lcl}
(\sigma, \Sigma, s) \models \text{emp} & \text{iff} & \sigma.m.M = \emptyset \land \Sigma = \emptyset \\
(\sigma, \Sigma, s) \models \text{empE} & \text{iff} & \sigma.m.E = \emptyset \land (\sigma, \Sigma, s) \models \text{emp} \\
(\sigma, \Sigma, s) \models x \rightarrow v & \text{iff} & \exists a. (\sigma.m.G)(x) = a \land \sigma.m.M = \{a \rightarrow v\} \land \Sigma = \emptyset \\
(\sigma, \Sigma, s) \models \text{ISR}(isr') & \text{iff} & \sigma.isr = isr' \land (\sigma, \Sigma, s) \models \text{emp} \\
(\sigma, \Sigma, s) \models \chi \triangleleft t & \text{iff} & \chi\ \Sigma\ t \\
(\sigma, \Sigma, s) \models \|s'\| & \text{iff} & s = s' \land (\sigma, \Sigma, s) \models \text{emp} \\
(\sigma, \Sigma, s) \models a \rightarrow \Omega & \text{iff} & \Sigma = \{a \rightarrow \Omega\} \land \sigma.m.M = \emptyset
\end{array}
\]

\[
\begin{array}{l}
f \perp g \overset{\text{def}}{=} \text{dom}(f) \cap \text{dom}(g) = \emptyset \\[4pt]
\sigma_1 \uplus \sigma_2 \overset{\text{def}}{=} \begin{cases}
((G, E, M_1 \cup M_2), \text{isr}, \delta) & \text{if } M_1 \perp M_2 \land \sigma_1 = ((G, E, M_1), \text{isr}, \delta) \land \sigma_2 = ((G, E, M_2), \text{isr}, \delta) \\
\text{undefined} & \text{otherwise}
\end{cases}
\end{array}
\]

\[
\Theta \models p_1 \ast p_2 \quad \text{iff} \quad \exists \Theta_1, \Theta_2. \Theta = \Theta_1 \uplus \Theta_2 \land \Theta_1 \models p_1 \land \Theta_2 \models p_2
\]

**Fig. 5.** Semantics of relational assertions

As explained above, the assertions are interpreted over relational states \( \Theta \), which consist of the low-level task-local states \( \sigma \), the high-level abstract states \( \Sigma \), and the abstract statements \( s \) that the low-level code needs to refine. \( \Sigma \) and \( s \) are defined in Fig. 3. \( \sigma \), as shown in Fig. 5, consists of a task-local view \( m \) of program variables and memory, together with the global isr register and the task-local interrupt states \( \delta \) (see Fig. 2). Here \( m \) contains the global and local variable environments (\( G \) and \( E \), respectively) and the memory \( M \), whose definitions are omitted.

Assertion \( \text{emp} \) says the low-level memory and the high-level abstract state are both empty. \( \text{empE} \) further requires that the local variable environment be empty too. \( x \rightarrow v \) specifies a singleton memory cell with \( v \) stored in the global program variable \( x \). ISR(isr), IS(is), IE(ie) and CS(cs) specify the values of the corresponding interrupt states (see Fig. 2). A further assertion on the interrupt stack \( is \) says that the currently running interrupt handler is at level \( k \) (or \( k = N \), meaning no running handlers).

\( \chi \triangleleft t \) says that, based on the high-level abstract state, the abstract scheduler \( \chi \) picks \( t \) as the target task. \( a \rightarrow \Omega \) specifies a singleton high-level abstract state mapping the data name \( a \) to the abstract data \( \Omega \). \( ||s|| \) means the current abstract statement remaining to be refined is \( s \). The separating conjunction \( p_1 \ast p_2 \) means \( p_1 \) and \( p_2 \) hold over disjoint parts of a relational state.
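The disjoint-union reading of \( p_1 \ast p_2 \) can be illustrated with a small executable model. The following is only a sketch under simplifying assumptions: it keeps just the memory component \( M \) of a relational state (ignoring \( G \), \( E \), isr, \( \delta \), \( \Sigma \) and \( s \)), and the names `Heap`, `union`, `splits`, `points_to` and `star` are ours, not part of the framework.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

Heap = Dict[int, int]              # address -> value; stands in for the memory M
Assertion = Callable[[Heap], bool]


def disjoint(f: Heap, g: Heap) -> bool:
    """f and g share no address."""
    return not (f.keys() & g.keys())


def union(f: Heap, g: Heap) -> Optional[Heap]:
    """Disjoint union of two heaps; None models the undefined case."""
    return {**f, **g} if disjoint(f, g) else None


def splits(h: Heap) -> Iterator[Tuple[Heap, Heap]]:
    """Enumerate every way of cutting h into two disjoint parts."""
    cells = list(h.items())
    for mask in range(1 << len(cells)):
        left = {a: v for i, (a, v) in enumerate(cells) if ((mask >> i) & 1) == 1}
        right = {a: v for i, (a, v) in enumerate(cells) if ((mask >> i) & 1) == 0}
        yield left, right


def emp(h: Heap) -> bool:
    """The empty-memory assertion."""
    return h == {}


def points_to(addr: int, val: int) -> Assertion:
    """A singleton cell: the heap is exactly {addr: val}."""
    return lambda h: h == {addr: val}


def star(p: Assertion, q: Assertion) -> Assertion:
    """Separating conjunction: some split satisfies p on one part and q on the other."""
    return lambda h: any(p(h1) and q(h2) for h1, h2 in splits(h))


two_cells = star(points_to(1, 10), points_to(2, 20))
print(two_cells({1: 10, 2: 20}))                        # both cells, disjointly owned
print(star(points_to(1, 10), points_to(1, 10))({1: 10}))  # the same cell cannot be owned twice
```

The brute-force enumeration in `splits` is exponential and is meant only to mirror the existential quantifier in the definition; a proof assistant establishes the split symbolically instead.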

**Ownership-Transfer Semantics for Multi-level Interrupts.** CSL [21] prevents data races by enforcing disjoint ownership of resources among tasks. Synchronization is modeled in terms of ownership transfer. Feng et al. [11] extend CSL and assign ownership-transfer semantics to interrupt operations. The idea is demonstrated in Fig. 6, which shows the logical memory model when there is only one task and a single interrupt level. Since the interrupt handler can preempt the task, we let the handler reserve its required memory first (represented as block $B$). $B$ must remain publicly available if the interrupt is enabled. Then the task can only access the remaining part (block $T$). We use grey boxes to represent local resources of the task. Disabling interrupts (cli) by the task essentially transfers the ownership of $B$ from public to task-local. Correspondingly, sti converts the block from task-local to public, so the task can no longer access it. Similarly, invocation of the interrupt handler (not shown in the figure) automatically transfers $B$ from public to the local resource of the handler, while iret transfers it back to public.

**Fig. 6.** Memory partition for handler and non-handler (Figure taken from [11])

Since block $B$ is shared between the interrupt handler and the task, it must be well-formed when it is public. We use the resource invariant $I_0$ to specify the well-formedness. Then the above ownership transfer semantics of cli and sti can be formalized in the following (simplified) program logic rules:

$$I_0 \vdash \{pt \} \text{cli} \{pt \ast I_0\} \quad \quad I_0 \vdash \{pt \ast I_0\} \text{sti} \{pt\}$$

Note that the partition between $B$ and $T$ is enforced logically using the separating conjunction in separation logic (see Fig. 5). It does not require physical separation in the program state model.
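To make the two rules concrete, the single-level case can be mimicked by a toy ownership tracker. This is a hedged sketch rather than the logic itself: the class `SingleLevelTask`, the block names `"T"` and `"B"`, and the example invariant are hypothetical, and ownership is tracked dynamically here whereas the program logic enforces it statically.

```python
class SingleLevelTask:
    """One task, one interrupt level: block B moves between public and task-local."""

    def __init__(self, t_block: dict, b_block: dict, invariant):
        self.local = {"T": t_block}    # resources the task currently owns
        self.public = {"B": b_block}   # resources the handler may grab at any time
        self.invariant = invariant     # stands in for the resource invariant I_0
        self.ie = 1                    # interrupts are enabled initially

    def cli(self) -> None:
        """{p} cli {p * I_0}: the task gains B and may assume I_0 holds on it."""
        if self.ie == 1:
            self.local["B"] = self.public.pop("B")
            self.ie = 0

    def sti(self) -> None:
        """{p * I_0} sti {p}: the task must re-establish I_0 before giving B back."""
        if self.ie == 0:
            if not self.invariant(self.local["B"]):
                raise ValueError("block B violates the resource invariant")
            self.public["B"] = self.local.pop("B")
            self.ie = 1

    def access(self, name: str) -> dict:
        """Only locally owned blocks may be read or written."""
        if name not in self.local:
            raise PermissionError(f"block {name} is not owned by the task")
        return self.local[name]


task = SingleLevelTask({"x": 0}, {"count": 0}, lambda b: b["count"] >= 0)
task.cli()
task.access("B")["count"] += 1   # safe: the handler cannot run while ie == 0
task.sti()                       # B is well-formed again, so it can be released
```

Accessing `"B"` before `cli` or after `sti` raises `PermissionError`, which corresponds to a proof obligation that cannot be discharged in the logic.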

In this paper we extend this idea to support multi-level nested interrupts, where the ownership transfer of interrupt primitives is determined not only by the ie flag, but also by the isr register. Figure 7 shows the memory model (where the number $N$ of interrupts is set to 6). Interrupt handlers at levels 0 to $N-1$ are assigned resource blocks $B_0, \ldots, B_{N-1}$ respectively. $B_N$ represents the resource shared only among tasks, i.e., the non-handler code. We omit task-local resources, so there is no counterpart to block $T$ in Fig. 6. Handlers’ priorities to reserve their required resources are consistent with their interrupt priority levels. That is, $B_0$ satisfies all the needs of the level-0 (highest priority) handler, while the level-$k$ handler may need to access $B_0, \ldots, B_{k-1}$, in addition to $B_k$. The non-handler code has the lowest priority. Each block $B_k$ is specified by the resource invariant $I(k)$, where $I$ is defined as a sequence of $N+1$ assertions (see the assertion syntax defined above).

**Fig. 7.** Ownership-transfer for multi-level interrupts

Figure 7 demonstrates the ownership transfer of resources caused by interrupt operations under different conditions. The grey or dotted blocks represent resources exclusively owned by interrupt handlers, with different textures for different interrupts. The white ones represent resources available for sharing. Suppose initially we are at state (1), where the level-3 handler is being executed, as the value of \( \text{isr} \) indicates. Since interrupts are disabled, the handler owns \( B_0 - B_3 \), knowing that no requests of levels 0 to 3 can be served. Enabling interrupts (\( \text{sti} \)) loses \( B_0 - B_2 \), as shown by state (2), but \( B_3 \) is retained because \( \text{isr}(3) = 1 \) and requests of the same (or lower) level are not handled. However, if \( \text{isr}(3) = 0 \) instead (as in state (5)), executing \( \text{sti} \) loses \( B_3 \) as well. Ownership transfer by \( \text{cli} \) is the dual of \( \text{sti} \).

Executing \( \text{eoi} \) at state (1) leads to state (5), but it causes no ownership transfer because interrupts are disabled anyway. If interrupts are enabled instead, as in state (2), \( \text{eoi} \) loses the ownership of \( B_3 \) because another level-3 request may be handled in state (4). \( \text{iret} \) can be executed only after \( \text{eoi} \). If interrupts are disabled (as in state (5)), it transfers \( B_0 - B_3 \) from local resources to shared resources. Otherwise (as in state (4)) there is no ownership transfer because the handler has lost the ownership of \( B_0 - B_3 \) already.

At state (2), interrupts with higher priority can be served. The "irq 1" step sets the bit \( \text{isr}(1) \), disables interrupts, and transfers \( B_0 \) and \( B_1 \) from shared resources to local resources of the level-1 handler, as in state (3).
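The transitions between states (1) to (5) can be replayed with a small simulator. The sketch below is illustrative only: the class `InterruptModel` and its fields are our own names, ownership is tracked as a set of block indices, and we assume that `iret` re-enables interrupts for the preempted code, which the description above does not state.

```python
N = 6  # levels 0..N-1 are handlers; level N stands for non-handler code


class InterruptModel:
    def __init__(self):
        self.isr = [0] * N   # in-service bits
        self.ie = 1          # interrupt-enable flag
        self.stack = []      # levels of the nested handlers, innermost last
        self.owned = set()   # indices of the blocks owned by the running code
        self.saved = []      # ownership of the preempted code

    def level(self) -> int:
        return self.stack[-1] if self.stack else N

    def _gated_blocks(self) -> set:
        """Blocks whose ownership follows the ie flag at the current level k."""
        k = self.level()
        blocks = set(range(k))             # B_0 .. B_{k-1} always follow ie
        if k == N or self.isr[k] == 0:     # B_k follows ie only if isr(k) = 0 (or k = N)
            blocks.add(k)
        return blocks

    def cli(self) -> None:
        if self.ie == 1:
            self.owned |= self._gated_blocks()
            self.ie = 0

    def sti(self) -> None:
        if self.ie == 0:
            self.owned -= self._gated_blocks()
            self.ie = 1

    def eoi(self) -> None:
        k = self.level()
        if k == N:
            raise RuntimeError("eoi outside of a handler")
        self.isr[k] = 0
        if self.ie == 1:                   # with interrupts enabled, B_k becomes shared
            self.owned.discard(k)

    def irq(self, j: int) -> None:
        if self.ie == 0 or any(self.isr[: j + 1]):
            raise RuntimeError("request is masked")
        self.saved.append(self.owned)
        self.isr[j] = 1
        self.ie = 0
        self.stack.append(j)
        self.owned = set(range(j + 1))     # B_0 .. B_j move from shared to the handler

    def iret(self) -> None:
        k = self.level()
        if k == N or self.isr[k] == 1:
            raise RuntimeError("iret is only allowed after eoi")
        self.stack.pop()
        self.owned = self.saved.pop()      # anything the handler still held becomes shared
        self.ie = 1                        # assumption: the preempted code had ie = 1


m = InterruptModel()
m.irq(3)             # state (1): level-3 handler owns B0-B3
m.sti()              # state (2): only B3 is kept
m.irq(1)             # state (3): level-1 handler owns B0 and B1
print(sorted(m.owned))
```

The helper `_gated_blocks` is where the dependence on both the ie flag and the isr register shows up: the same `sti` releases \( B_3 \) or keeps it depending on whether `eoi` has already cleared \( \text{isr}(3) \).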

**The Top Rule.** We show some selected program logic rules in Fig. 8. The \( \text{TopRule} \) establishes the judgment \( \psi \vdash O : \mathbb{O} \), which ensures that the low-level kernel \( O \) refines the abstract kernel \( \mathbb{O} \), i.e., \( O \sqsubseteq_\psi \mathbb{O} \), if the initial concrete and abstract kernel states satisfy \( \psi \) (explained in Sect. 3.3).

To verify the kernel, we need to come up with a specification \( \Gamma \) for the internal functions \( \eta_i \) in the low-level code, and a sequence of invariants \( I \) for kernel states. \( \Gamma \) assigns a pair of pre-/post-conditions to each internal function. We omit the formal definition here.

Then we prove that the internal functions, the API implementations and the interrupt handlers in the low-level kernel satisfy their specifications, respectively (the last three premises in the first line of the \( \text{TopRule} \) rule). The proof of each component carries the abstract scheduler specification \( \chi \) and the invariant \( I \).

The rule also requires that \( \psi \) ensures the initial states satisfy the invariant \( I[0, N] \), the interrupt-related states are properly initialized, and the initial local variable environment is empty. \( I[n, m] \) defined in Fig. 8 is the separating conjunction of invariants from level \( n \) to \( m \). \( \text{OS}[\text{isr}, \text{ie}, \text{is}, \text{cs}] \) specifies the status of interrupts, and requires that the currently executing handler (on top of \( \text{is} \)) have the highest priority among those in service (as recorded in \( \text{isr} \)). \( [\psi] \) lifts \( \psi \) to relational assertions (definition omitted). We also omit some more detailed side conditions about the initial states in the rule.

**Verifying Interrupt Handlers.** We omit the rules for proving \( \chi ; I \vdash \eta_i : \Gamma \) and \( \Gamma ; \chi ; I \vdash \eta_a : \varphi \) for internal functions and APIs respectively, which are similar to the rules for interrupt handlers. The \( \text{ITRP} \) rule proves the correctness of interrupt handlers. It requires that each individual interrupt handler is correct with respect to its specification. The judgment for statements is in the form of \( \Gamma; \chi; I; r; p_i \vdash \{p\}\ s\ \{q\} \).

\[
\frac{
\begin{array}{c}
\chi; I \vdash \eta_i : \Gamma \qquad \Gamma; \chi; I \vdash \eta_a : \varphi \qquad \Gamma; \chi; I \vdash \theta : \varepsilon \\
[\psi] \Rightarrow I[0, N] * \text{OS}[\ldots] * \text{empE} \qquad \cdots
\end{array}
}{\psi \vdash (\eta_a, \eta_i, \theta) : (\varphi, \varepsilon, \chi)} \quad (\text{TopRule})
\]

\[
\frac{
\begin{array}{c}
p = \text{BldIttrpPre}(k, \varepsilon, isr, is, I) \qquad p_i = \text{BldIttrpRet}(k, isr, is, I) \\
\text{dom}(\theta) = \text{dom}(\varepsilon) \qquad \Gamma; \chi; I; \text{false}; p_i \vdash \{p\}\ \theta(k)\ \{\text{false}\} \quad \text{for all } k \in \{0, \ldots, N-1\}
\end{array}
}{\Gamma; \chi; I \vdash \theta : \varepsilon} \quad (\text{ITRP})
\]

\[
\frac{p \Rightarrow p_i}{\Gamma; \chi; I; r; p_i \vdash \{p\}\ \text{iret}\ \{q\}} \quad (\text{IRET})
\]

\[
\Gamma; \chi; I; r; p_i \vdash \{\text{OS}[isr, 1, is, cs] * (\text{current level is } k) * [[s]]\}\ \text{encrt}\ \{\text{OS}[isr, 0, is, cs] * \text{INV}(I, k) * I[0, k-1] * [[s]]\} \quad (\text{ENCRT})
\]

\[
\Gamma; \chi; I; r; p_i \vdash \{\text{OS}[isr, 0, is, cs] * [[s]]\}\ \text{encrt}\ \{\text{OS}[isr, 0, is, cs] * [[s]]\} \quad (\text{ENCRT-0})
\]

\[
\Gamma; \chi; I; r; p_i \vdash \{\text{OS}[isr, 1, k :: is, cs] * I(k) * [[s]]\}\ \text{eoi}\ k\ \{\text{OS}[isr\{k \rightarrow 0\}, 1, k :: is, cs] * [[s]]\} \quad (\text{EOI})
\]

\[
\frac{p \Rightarrow \text{SWINV}(I) * \text{IS}(is) * \text{CS}(cs)}{\Gamma; \chi; I; r; p_i \vdash \{p * [[\text{sched}; s]] * \chi \triangleleft x\}\ \text{switch}\ x\ \{p * [[s]]\}} \quad (\text{SWITCH})
\]

\[
\frac{\Gamma; \chi; I; r; p_i \vdash \{p\}\ s\ \{q'\} \qquad q' \Rightarrow q}{\Gamma; \chi; I; r; p_i \vdash \{p\}\ s\ \{q\}} \quad (\text{ABSCSQ})
\]

\[
I[n, m] \overset{\text{def}}{=} \begin{cases}
I(n) * I(n+1) * \cdots * I(m) & \text{if } 0 \leq n \leq m \leq N \\
\text{emp} & \text{otherwise}
\end{cases}
\]

\[
\text{OS}[isr, ie, is, cs] \overset{\text{def}}{=} \exists k.\ \text{ISR}(isr) * \text{IE}(ie) * \text{IS}(is) * \text{CS}(cs) * \bigl((\text{current level is } k) \land \forall k'.\ 0 \leq k' < k \rightarrow isr(k') = 0\bigr)
\]

\[
\text{INV}(I, k) \overset{\text{def}}{=} \exists isr.\ \text{ISR}(isr) * \bigl((isr(k) = 1 \land \text{emp}) \lor ((isr(k) = 0 \lor k = N) \land I(k))\bigr)
\]

\[
\text{SWINV}(I) \overset{\text{def}}{=} \exists isr.\ \text{ISR}(isr) * \text{IE}(0) * \bigl(\exists k.\ (\text{current level is } k) * I[0, k]\bigr) \land \forall k'.\ isr(k') = 0
\]

\[
\text{BldIttrpPre}(k, \varepsilon, isr, is, I) \overset{\text{def}}{=} \text{OS}[isr\{k \rightarrow 1\}, 0, k :: is, \text{nil}] * I[0, k] * [[\varepsilon(k)]] * \text{empE}
\]

\[
\text{BldIttrpRet}(k, isr, is, I) \overset{\text{def}}{=} \exists ie.\ \text{OS}[isr\{k \rightarrow 0\}, ie, k :: is, \text{nil}] * \bigl((ie = 1 \land \text{emp}) \lor (ie = 0 \land I[0, k])\bigr) * [[\text{end}]]
\]

**Fig. 8.** Selected inference rules

**Rules for Commands.** The $\text{iret}$ rule simply requires that the post-condition $p_i$ holds when we reach the end of the interrupt handler. The $\text{encrt}$ rule shows the ownership transfer that takes place when $\text{encrt}$ disables interrupts. Suppose we are at the level-$k$ handler ($k = N$ means we are executing the non-handler code). Disabling interrupts prevents interrupt requests from level 0 to $k - 1$, therefore the current task gains the ownership of $I[0, k - 1]$. The transfer of the $k$-th block is specified by $\text{INV}(I, k)$ in Fig. 8. If the bit $\text{isr}(k)$ is 0 (or $k = N$), the task also gains the ownership of $I(k)$; otherwise it already owns the $k$-th block and there is no extra ownership transfer. The two scenarios are also demonstrated by the two $\text{cli}$ steps in Fig. 7. If interrupts are already disabled when $\text{encrt}$ is executed, there is no ownership transfer, as shown by the $\text{encrt}$-$0$ rule.

The $\text{excrt}$ rule is the dual of the $\text{encrt}$ rule (see the two $\text{sti}$ steps in Fig. 7). Correspondingly there is an $\text{excrt}$-$0$ rule, which is omitted here. The $\text{eoi}$ rule says that, if interrupts are enabled, the task loses the ownership of $I(k)$ after $\text{eoi}$ $k$. Otherwise there is no ownership transfer and the corresponding rule is omitted (see the two $\text{eoi}$ steps in Fig. 7).

The $\text{switch}$ rule requires that the invariant $\text{SWINV}(I)$ holds before switching away and that it is preserved after switching back. $\text{SWINV}(I)$, defined in Fig. 8, says that interrupts must be disabled, and all the bits of $\text{isr}$ are 0 (i.e., either we are running non-handler code or we are in the outermost layer of nested invocation of interrupt handlers and have already executed $\text{eoi}$). Also if we are running level-$k$ code (either handler or non-handler if $k = N$), the resource blocks 0 to $k$ acquired before should satisfy $I[0, k]$, so that the target task could access them. The rule also says that the task-local states $\text{is}$ and $\text{cs}$ are not changed by $\text{switch}$.

To establish refinement, the precondition also requires that the high-level abstract scheduler $\chi$ picks the same task as the one in $x$, and that $\text{switch}$ $x$ at the low level corresponds to the $\text{sched}$ step at the high level. Therefore in the post-condition $\text{sched}$ is no longer in the remaining abstract operations.

Following [19], the $\text{abscsq}$ rule looks like a regular consequence rule but allows us to execute the abstract code. The implication $p \Rightarrow p'$ is defined below.

$$\forall \sigma, \Sigma, s.\ (\sigma, \Sigma, s) \models p \implies \exists \Sigma', s'.\ (\Sigma, s) \longrightarrow^{*} (\Sigma', s') \wedge (\sigma, \Sigma', s') \models p'$$

That is, given a related state $(\sigma, \Sigma, s)$ satisfying $p$, the abstract code $s$ could execute zero or multiple steps starting from $\Sigma$ and reach $(\Sigma', s')$, so that the resulting related state $(\sigma, \Sigma', s')$ satisfies $p'$. This rule allows us to establish simulation between the concrete and the abstract code, which then ensures refinement.

We can look at Fig. 4 to see the use of this rule. Suppose we want to verify $\text{inc()}$ using the specification in Fig. 4(d). When we reach the $\text{cas}$ command (see Fig. 4(a)), we have the precondition ($\text{tmp} = \text{cnt} \land \text{cnt} = \text{CNT} \land [\text{CNT++}] \lor \ldots$) (the case for $\text{tmp} \neq \text{cnt}$ omitted). Right after $\text{cas}$, we have ($\text{done} \land \text{cnt} = \text{CNT+1} \land [\text{CNT++}] \lor \neg\text{done} \land \ldots$). We have $\text{cnt} = \text{CNT+1}$ because $\text{cnt}$ increments if $\text{cas}$ succeeds. To establish the simulation, we apply the $\text{abscsq}$ rule to execute the abstract code, because ($\text{cnt} = \text{CNT+1} \land [\text{CNT++}]$) $\Rightarrow$ ($\text{cnt} = \text{CNT} \land [\text{end}]$), following the above definition of $p \Rightarrow p'$. 


Theorem 4.1 gives the soundness of the framework. The proofs are based on a compositional simulation following [18], and have been formalized in Coq. More details about the logic can be seen in TR [26].

**Theorem 4.1 (Soundness).** If $\psi \vdash O : \mathbb{O}$, then $O \sqsubseteq_\psi \mathbb{O}$.

## 5 Proving Priority-Inversion-Freedom

**Formalization of PIF.** Earlier work [6] defines priority inversions in terms of whether there is a higher priority task waiting directly or indirectly for a lower priority task. Since the definition refers to the *current* priority of tasks, its meaning is affected by algorithms that dynamically change the priority of tasks, such as the classic priority ceiling and priority inheritance algorithms [23]. We give a new formalization of PIF, which is based on the *original* priorities assigned by the programmers, reflecting the actual degree of urgency.

**Definition 5.1 (Priority Inversion Freedom).** PIF($\Sigma$) holds, iff for any $t, t_c, pr$ and $pr_c$, if $t \neq t_c, t_c = \text{CurTask}(\Sigma)$, $pr = \text{OrgPr}(t, \Sigma)$, $pr_c = \text{OrgPr}(t_c, \Sigma)$, $\text{IsWait}(t, \Sigma)$ and $\neg \text{IsOwner}(t_c, \Sigma)$, then $pr \preceq pr_c$.

It says that, if the current task $t_c$ does not own any shared resources, then its original priority should be higher than (or equal to) that of any other waiting task $t$. Here $\text{OrgPr}(t, \Sigma)$ represents $t$’s original priority assigned by programmers. $\text{IsWait}(t, \Sigma)$ means that $t$ is blocked, waiting for a certain shared resource, and $\neg \text{IsOwner}(t_c, \Sigma)$ means that the task $t_c$ does not own any shared resource (*e.g.*, mutexes).

If each task eventually releases its shared resource (*i.e.*, there is no deadlock), the definition ensures that the waiting task with higher priority will be eventually released and executed. Therefore it prevents unbounded priority inversion [23].
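Definition 5.1 can be read as a simple check over an abstract state. The following sketch assumes a hypothetical representation of \( \Sigma \) as a dictionary of tasks plus the identifier of the current task, and it encodes priorities as integers where a smaller number means a more urgent task (the μC/OS-II convention), so that \( pr \preceq pr_c \) becomes `pr >= pr_c`. The names `TaskInfo` and `pif` are ours.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class TaskInfo:
    orig_prio: int                        # OrgPr: priority assigned by the programmer
    waiting: bool = False                 # IsWait: blocked on some shared resource
    owned: FrozenSet[str] = frozenset()   # shared resources held (IsOwner iff non-empty)


def pif(tasks: Dict[str, TaskInfo], cur: str) -> bool:
    """Priority-inversion-freedom for one abstract state, following Definition 5.1."""
    tc = tasks[cur]
    if tc.owned:          # the definition only constrains non-owners
        return True
    return all(
        t.orig_prio >= tc.orig_prio   # no waiting task is more urgent than the current one
        for tid, t in tasks.items()
        if tid != cur and t.waiting
    )


tasks = {
    "high": TaskInfo(orig_prio=1, waiting=True),
    "mid": TaskInfo(orig_prio=5),
    "low": TaskInfo(orig_prio=9, owned=frozenset({"mutex"})),
}
print(pif(tasks, "low"))   # the owner may run while "high" waits for its mutex
print(pif(tasks, "mid"))   # "mid" running here is a priority inversion
```

The second call exhibits the classic scenario: a medium-priority task that holds nothing runs while a high-priority task is blocked, so the state is rejected.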

**PIF of $\mu$C/OS-II.** The mutex of $\mu$C/OS-II is implemented with a simplified priority ceiling protocol [23]. When proving it satisfies PIF, we find a counterexample (given in TR [26]) showing that PIF cannot be guaranteed unless there is no nested use of mutexes. By adding the assumption of no nested mutexes, we prove that the mutex in $\mu$C/OS-II ensures our PIF definition.

**Theorem 5.2 (PIF without Nested Use of Mutexes).**

If $\text{Init}(\Sigma)$, $(A, \mathbb{O}_{\mu\text{C/OS-II}}) \vdash (T, \Delta, \Sigma) \longmapsto^{*} (T', \Delta', \Sigma')$, $\text{NoNCR}(A, \Sigma, T, \Delta)$, and $\text{SchedProp}(\Sigma')$, then $\text{PIF}(\Sigma')$.

It says that, for any application code $A$, task pool $T$, client state $\Delta$ and abstract kernel state $\Sigma$, if initially there are no tasks waiting for mutexes ($\text{Init}(\Sigma)$), and there is no nested use of mutexes ($\text{NoNCR}(A, \Sigma, T, \Delta)$), then for any $T', \Delta'$ and $\Sigma'$ generated during the execution, if $\Sigma'$ is consistent with the priority-based scheduling (*i.e.*, the currently running task always has the highest priority among all the ready tasks, represented as $\text{SchedProp}(\Sigma')$), then it must satisfy PIF. Here we use a simplified $\mathbb{O}_{\mu\text{C/OS-II}}$ whose only APIs are those of the PIF mutex. The proof is formalized in Coq.

## 6 Verifying μC/OS-II

We have applied our framework to verify key modules (around 1300 lines of C code without counting comments and empty lines) of μC/OS-II V2.52, including the scheduler, the timer interrupt handler, mutexes, message queues, mail boxes, semaphores, and the time management. These 1300 lines of C code verified in our framework correspond to around 3250 lines of code in their original format (with comments and empty lines) in the source files of μC/OS-II, including “ucos_ii.h”, “os_q.c”, “os_sem.c”, “os_mbox.c”, “os_mutex.c”, “os_time.c”, “os_core.c” and “os_cpu_a.c”. The verified modules cover 63% of the frequently used APIs and internal functions [2]. We ignore some synchronization APIs which have similar functionality as the verified ones. Verification of task creation/deletion is still ongoing work based on the presented framework.

**Modifications to the Original Code.** Our verification is based on the original code with some minor modifications. For instance, the API `OSQPend(S)` is used to receive a message from a queue, and its original code does not check if the input pointer `S` points to a valid event control block, because it assumes that the client code always gets `S` by calling `OSQCreate()` (thus `S` should already be valid). We drop this assumption about the client code. Correspondingly we insert code that checks whether `S` is a valid pointer. If `S` is invalid, a new error code is returned. Similar modifications are made to some other modules too. The reason for these modifications is that the contextual refinement proved in our verification framework assumes arbitrary client code, while kernels are usually implemented with assumptions about client code for efficiency.

<table>
<thead>
<tr>
<th>Framework</th>
<th>Coq lines</th>
<th>Verified Modules</th>
<th>lines of C</th>
<th>Coq lines</th>
</tr>
</thead>
<tbody>
<tr>
<td>Basic Libraries</td>
<td>32061</td>
<td>Global Declarations</td>
<td>187</td>
<td>-</td>
</tr>
<tr>
<td>Machine &amp; Logic</td>
<td>23095</td>
<td>Message Queue</td>
<td>240</td>
<td>4537</td>
</tr>
<tr>
<td>Automated Tactics</td>
<td>21050</td>
<td>Semaphore</td>
<td>166</td>
<td>2441</td>
</tr>
<tr>
<td>Total</td>
<td>76206</td>
<td>Mailbox</td>
<td>171</td>
<td>3326</td>
</tr>
<tr>
<td><strong>Certified μC/OS-II</strong></td>
<td><strong>104848</strong></td>
<td>Mutex</td>
<td>301</td>
<td>17331</td>
</tr>
<tr>
<td>C Code Definitions</td>
<td>1824</td>
<td>Time Management</td>
<td>39</td>
<td>861</td>
</tr>
<tr>
<td>Specifications</td>
<td>6012</td>
<td>Timer Interrupt</td>
<td>17</td>
<td>443</td>
</tr>
<tr>
<td>Priority Inversion Freedom</td>
<td>9570</td>
<td>Internal Functions</td>
<td>195</td>
<td>5447</td>
</tr>
<tr>
<td>Libraries for μC/OS-II</td>
<td>62085</td>
<td><strong>Final Theorems</strong></td>
<td>-</td>
<td>501</td>
</tr>
<tr>
<td>Auto. Generated Code</td>
<td>25357</td>
<td><strong>Total</strong></td>
<td>1316</td>
<td>34887</td>
</tr>
</tbody>
</table>

**Proof Efforts.** The Coq implementation consists of around 216,000 lines of code and proofs in Coq8.4pl6. Table 1 gives a breakdown of the number of lines for various components. Compiling the entire Coq package takes around 16 h on a machine with a 3.6 GHz CPU and 32 GB of memory. The work takes us around 5.5 person-years in total, including 4 person-years for the framework and 1 person-year for verifying the first μC/OS-II module (Message Queue). With the facilities (tactics, libraries, invariants, etc.) being stabilized, verifying the remaining modules (around 900 lines of C code) only takes us around 6 person-months.

The most challenging part is to verify the timer interrupt handler, which traverses the entire TCB list and updates task status in each TCB block. It needs to access all the shared data structures in μC/OS-II. Several different updates to shared data structures make the loop invariant quite complicated.

Also, verifying an existing OS kernel is more difficult than verifying a new one written for verification purposes. When verifying μC/OS-II the major difficulty comes from the gap between the low-level concrete data structure and the high-level abstract representation. For instance, μC/OS-II uses a smart bitmap algorithm to record whether a task is in the waiting queue. The implementation requires us to establish a subtle consistency relation between the low-level bitmap and the high-level abstract waiting queue. The verification would have been much simpler if the waiting queue were simply implemented as a linked list.
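To give a flavour of the consistency relation, the sketch below models a two-level priority bitmap in the style used by μC/OS-II (one group byte plus an eight-entry table for 64 priorities) and relates it to an abstract set of waiting priorities. It is an illustration only: the class and function names are hypothetical, the abstract waiting queue is simplified to a set, and the relation actually proved in Coq also covers the surrounding data structures.

```python
def lowest_bit(v: int) -> int:
    """Index of the least significant set bit of a non-zero integer."""
    return (v & -v).bit_length() - 1


class WaitBitmap:
    """Two-level bitmap: bit y of grp is set iff tbl[y] has some bit set."""

    def __init__(self):
        self.grp = 0
        self.tbl = [0] * 8

    def insert(self, prio: int) -> None:
        y, x = prio >> 3, prio & 7
        self.grp |= 1 << y
        self.tbl[y] |= 1 << x

    def remove(self, prio: int) -> None:
        y, x = prio >> 3, prio & 7
        self.tbl[y] &= ~(1 << x)
        if self.tbl[y] == 0:          # last waiter of the group: clear the group bit
            self.grp &= ~(1 << y)

    def highest(self):
        """Most urgent waiting priority (smallest number), or None if nobody waits."""
        if self.grp == 0:
            return None
        y = lowest_bit(self.grp)
        return (y << 3) | lowest_bit(self.tbl[y])


def consistent(bm: WaitBitmap, waiting: set) -> bool:
    """The bitmap represents exactly the abstract set of waiting priorities."""
    cells_ok = all(
        bool((bm.tbl[p >> 3] >> (p & 7)) & 1) == (p in waiting) for p in range(64)
    )
    groups_ok = all(bool((bm.grp >> y) & 1) == (bm.tbl[y] != 0) for y in range(8))
    return cells_ok and groups_ok


bm, waiting = WaitBitmap(), set()
for prio in (12, 3, 40):
    bm.insert(prio)
    waiting.add(prio)
print(consistent(bm, waiting), bm.highest())
```

Every operation on the bitmap has to preserve `consistent` with respect to the matching update of the abstract set, and the second conjunct (`groups_ok`) is the kind of side condition that a linked-list representation would not need.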

**Coq Tactics.** Proof automation is essential to improve the productivity. We develop tactics for automatically proving relational separation logic assertions and generating verification conditions based on existing techniques [5,7,20]. They do forward reasoning for statements, including function calls and primitives entering and exiting critical regions, etc. Also some domain-specific tactics are implemented for individual data structures used in μC/OS-II, including ones for the arithmetic properties of Int32 and bitmaps. Thanks to these tactics, the ratio of Coq proof scripts to the verified C code is around 26:1. Another advantage of the tactics is that they can extract lemmas independent of program contexts for verifying functionality of code. Users can verify code using the tactics without knowing much about the underlying framework.
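The figures quoted in this section can be cross-checked against Table 1 with a few lines of arithmetic. The script below simply transcribes the table; the dictionary names are ours, and a dash in the table is treated as zero.

```python
framework = {
    "Basic Libraries": 32061,
    "Machine & Logic": 23095,
    "Automated Tactics": 21050,
}
certified = {
    "C Code Definitions": 1824,
    "Specifications": 6012,
    "Priority Inversion Freedom": 9570,
    "Libraries for uC/OS-II": 62085,
    "Auto. Generated Code": 25357,
}
# module -> (lines of C, lines of Coq)
modules = {
    "Global Declarations": (187, 0),
    "Message Queue": (240, 4537),
    "Semaphore": (166, 2441),
    "Mailbox": (171, 3326),
    "Mutex": (301, 17331),
    "Time Management": (39, 861),
    "Timer Interrupt": (17, 443),
    "Internal Functions": (195, 5447),
    "Final Theorems": (0, 501),
}

framework_total = sum(framework.values())
certified_total = sum(certified.values())
c_total = sum(c for c, _ in modules.values())
coq_total = sum(q for _, q in modules.values())

grand_total = framework_total + certified_total + coq_total
ratio = coq_total / c_total

print(framework_total, certified_total, c_total, coq_total)
print(grand_total, round(ratio, 1))
```

The column sums match the totals printed in the table (76206, 104848, 1316 and 34887), the three Coq totals add up to 215941 lines, i.e. the "around 216,000" quoted above, and the proof-to-code ratio for the verified modules is about 26.5.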

## 7 Related Work and Conclusion

There have been a number of OS verification projects, including seL4 [15,16], Verisoft [4], VCC/VeriSoftXT [3,9], Verve [27], and CertiKOS [8,13]. Most of them have no or limited support of preemption and multi-level interrupts.

seL4 [15,16] is one of the milestone OS kernel verification projects. The verification is fully mechanized in Isabelle/HOL. The kernel of seL4 does not support general preemption. Instead, tasks are preemptible only at specific points. Therefore the code verified is mostly sequential. On the other hand, the seL4 project has verified rich features and properties such as virtual memory, real-time properties and security properties, which are not done in our work.

The Verisoft project also verifies OS microkernels [4] in Isabelle/HOL, but the CVM model used there does not permit interrupts inside the kernel. Its successor project, Verisoft XT [3], uses VCC [9] to verify the commercial Hyper-V hypervisor. VCC supports verification of concurrent C code by inserting auxiliary code and ghost states. The proofs have a refinement flavor, but VCC does not establish contextual refinement as we do. Also it is unclear how VCC is applied to verify multi-level nested interrupts in hypervisors.

Verve [27] combines a type-safe kernel with a minimal hardware abstraction layer. The kernel is concurrent, but the properties verified are mostly about type safety, much weaker than our contextual refinement property. Also Verve simply squashes multiple interrupt levels into a single level and does not really handle multi-level interrupts. VCC/VerisoftXT and Verve use the Z3 SMT solver [10] for better automation, while we use Coq which generates machine-checkable proofs. Also the soundness of our program logic is proved in Coq. Therefore the trusted computing base (TCB) of our approach is smaller.

Gu et al. [13] verify the mCertiKOS hypervisor. Their kernel is sequential. Recently, Chen et al. [8] propose a framework for building certified interruptible OS kernels (based on mCertiKOS) with device drivers. Their framework does not support preemptive concurrency as ours does, and it requires that interrupt handlers for device drivers and non-handler kernel code do not share any state.

Gotsman and Yang [12] developed a program logic based on CSL, which decomposes the verification of preemptive kernels into verifying the scheduler and the tasks. Their proofs are on paper only and not mechanized. Their machine model does not support multi-level interrupts, and their program logic is used to prove partial correctness, not contextual refinement as we do.

**Conclusion.** We have developed a practical, general-purpose verification framework for preemptive OS kernels with multi-level interrupts. Correctness of the OS kernel is formalized as a contextual refinement between the low-level concrete implementations and the high-level specifications. As far as we know, our work is the first to establish contextual refinement for system APIs of a preemptive OS kernel. We have applied the framework to verify key modules and PIF of μC/OS-II, a commercial embedded real-time OS.

It is worth noting that although our verification framework is developed to verify μC/OS-II, it is a general verification framework and most of its building blocks can be reused to verify other OS kernels. As shown in Fig. 1, the small-step semantics for the C subset, the program logic and the tactics are all general and mostly independent of the μC/OS-II verification project. A potential limitation is that the interrupt mechanism in our operational semantics is modeled specifically on the Intel 8259A interrupt controller, and the program logic rules for interrupts are designed accordingly. However, the logic rules follow the general ownership-transfer idea from CSL. With a different processor and interrupt mechanism, even though we may need to change the current inference rules for interrupt primitives, we can apply the same ownership-transfer idea, and the required change should be superficial. Another limitation is that our C subset is chosen based on the μC/OS-II code. In particular, it does not allow function pointers, which would require support for higher-order functions in the logic.

## References

2. The real-time kernel: μC/OS-II. http://micrium.com/rtos/ucosii/overview


