Keywords:
Operating Systems
State Management
State Spill
Programming Languages
Systems Software
Language Safety
Abstract:
State management has become an intractable problem in modern operating systems due to their sheer size and complexity. Despite efforts to cleanly modularize OSes, the propagation and mismanagement of states remain significant obstacles to many computing goals, e.g., system evolution and fault tolerance. We identify the root cause of such obstacles to be state spill, the phenomenon in which a software entity’s state undergoes a lasting change as a result of handling an interaction with another entity. We systematically study the existence and manifestation of state spill in existing OSes and find that it is deeply ingrained in both low-level OS kernels and framework-level components like Android system services. To address this, we introduce Theseus, an experimental OS written from scratch in Rust that rethinks overall OS structure and treats state management as a first-class design concern. Theseus makes two primary contributions. First, its OS structure consists of many tiny cell-like entities with clear, runtime-persistent bounds that are all loaded and linked dynamically and that interact without holding states for one another. Second, its intralingual design and implementation realize OS functionality using existing language-level mechanisms, empowering the compiler to enforce invariants about OS semantics and enabling us to shift the responsibility of resource bookkeeping from the OS into the compiler, vastly reducing the set of states the OS must necessarily maintain. Together, Theseus’s structure, intralingual design, and state management principles facilitate desirable computing goals, allowing us to realize easy and arbitrary live evolution, system flexibility, and availability through fault recovery, even for core OS components.
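As a rough illustration of the state spill definition above, the following Rust sketch contrasts a service that records state on behalf of its clients with one that hands ownership of that state back to the caller. It is not taken from Theseus: the names `SpillingService`, `StatelessService`, and `Registration` are hypothetical and chosen only for this example. It shows just the general idea of letting the language's ownership rules, rather than a service-side table, track who holds a resource.

```rust
use std::collections::HashMap;

/// Hypothetical service that exhibits state spill: every `register` call
/// leaves a lasting entry inside the service on behalf of a client.
struct SpillingService {
    next_id: u32,
    callbacks: HashMap<u32, String>, // state held for other entities
}

impl SpillingService {
    fn new() -> Self {
        Self { next_id: 0, callbacks: HashMap::new() }
    }

    /// Handling this interaction permanently changes the service's state.
    fn register(&mut self, name: &str) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.callbacks.insert(id, name.to_string());
        id
    }

    fn spilled_entries(&self) -> usize {
        self.callbacks.len()
    }
}

/// Hypothetical spill-free alternative: the service keeps nothing and
/// returns an owned value that the client is responsible for holding.
struct StatelessService;

/// Owned by the client; released automatically when it goes out of scope.
struct Registration {
    name: String,
}

impl StatelessService {
    fn register(&self, name: &str) -> Registration {
        Registration { name: name.to_string() }
    }
}

fn main() {
    let mut spilling = SpillingService::new();
    let _id = spilling.register("client_a");
    // The service now carries state caused by the interaction.
    println!("spilled entries: {}", spilling.spilled_entries());

    let stateless = StatelessService;
    {
        let reg = stateless.register("client_a");
        println!("client owns registration for {}", reg.name);
    } // `reg` is dropped here; the service has nothing to clean up.
}
```

In the second design the service could be replaced or restarted without losing client registrations, since it never held them; the compiler's ownership tracking takes over the bookkeeping the first design performs in its table.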