Persistent Programming

The aim of persistent programming is to support the design, construction, maintenance and operation of long-lived, concurrently accessed and potentially large bodies of data and programs. When research into persistent programming began, persistent application systems were supported by disparate mechanisms, each based upon different philosophical assumptions and implementation technologies. The mix of technologies typically included naming, type and binding schemes combined with different database systems, storage architectures and query languages.

Atkinson postulated that, in many cases, the inconsistency was not fundamental but accidental. The various subsystems were built at different times when the engineering trade-offs were different. In consequence, they provided virtually the same services, but inconsistently since they were designed and developed independently. By contrast, Orthogonal Persistence provided the total composition of services within one coherent design, thereby eliminating these accidental disharmonies.

Orthogonally persistent object systems support a uniform treatment of objects irrespective of their types by allowing values of all types to have whatever longevity is required. The benefits of orthogonal persistence have been described extensively in the literature. They can be summarised as:

  • improving programming productivity from simpler semantics;
  • avoiding ad hoc arrangements for data translation and long-term data storage;
  • providing protection mechanisms over the whole environment;
  • supporting incremental evolution; and
  • automatically preserving referential integrity over the entire computational environment for the whole life-time of an application.

With orthogonal persistence, no distinction between data formats is visible to the programmer, irrespective of the data’s longevity. Atkinson and Morrison identified three Principles of Orthogonal Persistence:

  • The Principle of Persistence Independence
    The persistence of data is independent of how the program manipulates the data. That is, the programmer does not have to, indeed cannot, program to control the movement of data between long term and short term store. This is performed automatically by the system.

  • The Principle of Data Type Orthogonality
    All data objects should be allowed the full range of persistence irrespective of their type. That is, there are no special cases where objects of a specific type are not allowed to be persistent.

  • The Principle of Persistence Identification
    The choice of how to identify and provide persistent objects is orthogonal to the universe of discourse of the system.

The application of the three principles yields orthogonal persistence. Violation of any of these principles increases the complexity that persistent systems seek to avoid.

The first language to provide orthogonal persistence was PS-algol, which provided persistence by reachability for all data types supported by the language. PS-algol adds a small number of functions to S-algol, from which it was derived. These are open_database, close_database, commit and abort. A number of functions are also provided to manage associative stores (hash maps), called tables in PS-algol. These functions are s_lookup, which retrieves a value associated with a key in a table, and s_enter, which creates an association between a key and a value in a table. By convention, a database always contains a pointer to a table at its root. Databases serve as roots of persistence and can be created dynamically.
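The interface described above can be modelled in a few lines. The following is a minimal sketch in Python, not PS-algol syntax: the function names come from the text, but the argument order, the Database class, the in-memory dictionary standing in for the stable store, and the behaviour of close_database are all assumptions made for illustration. Persistence by reachability is approximated by deep-copying everything reachable from the root table at commit.

```python
import copy


class Database:
    """Hypothetical stand-in for a PS-algol database with a table at its root."""

    def __init__(self):
        self.stable_root = {}  # state as of the last commit
        self.root = {}         # working root table seen by the program


# Assumption: an in-memory dict plays the role of the stable store.
_stable_store = {}


def open_database(name):
    """Open (or dynamically create) a database and load its root table."""
    db = _stable_store.setdefault(name, Database())
    db.root = copy.deepcopy(db.stable_root)
    return db


def close_database(db):
    """Assumed semantics: drop the working copy; committed state is kept."""
    db.root = None


def s_enter(key, table, value):
    """Create an association between a key and a value in a table."""
    table[key] = value


def s_lookup(key, table):
    """Retrieve the value associated with a key, or None if absent."""
    return table.get(key)


def commit(db):
    """Make everything reachable from the root table persistent."""
    db.stable_root = copy.deepcopy(db.root)


def abort(db):
    """Discard changes made since the last commit."""
    db.root = copy.deepcopy(db.stable_root)


# Usage: only the root entry is entered explicitly; the nested friend
# record persists because it is reachable from it.
db = open_database("address_book")
ada = {"name": "Ada", "friends": [{"name": "Grace"}]}
s_enter("ada", db.root, ada)
commit(db)
ada["friends"][0]["name"] = "Mallory"  # uncommitted change
abort(db)
restored = s_lookup("ada", db.root)
```

Note that the program never says which objects to save: the nested record is captured by commit solely because it can be reached from the root table, and abort rolls the uncommitted change back.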

A second version of PS-algol incorporated procedures as data objects thereby allowing code and data to be stored in the persistent store.

Napier88 attempted to explore the limits of orthogonal persistence by incorporating the entire language support environment within a strongly typed persistent store. The research produced the first integrated, self-contained, type-safe persistent environment.

The Napier88 system provides orthogonal persistence, a pre-populated strongly typed stable store, higher-order procedures, parametric polymorphism, abstract (existential) data types, collections of name-value bindings, graphical data types, concurrent execution, two infinite union types for partial specification, and support for reflective programming. Notable additions over PS-algol include the following:

  • the infinite union type any, which facilitates partial and incremental specification of the structure of the data;
  • the infinite union type environment, which, in addition to the above, provides dynamically extensible collections of name/L-value bindings, and thereby the dynamic construction of independent name spaces over common data;
  • parametric polymorphism in a style similar to that later popularised by Java generics, but with computation over truly persistent polymorphic values;
  • existentially quantified abstract data types for data abstraction;
  • a programming environment, including graphical windowing library, object browser, program editor and compiler, implemented entirely as persistent objects within the store;
  • support for hyper-code, in which program source code may contain embedded direct references to extant objects; and
  • support for structural reflection, where a running program may generate new program fragments and integrate these into its own execution.
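The roles of the two infinite union types can be illustrated with a small model. This is a rough sketch in Python rather than Napier88: the class names Any and Environment and the methods project, bind, use and assign are hypothetical, and Python's run-time type checks stand in for Napier88's static and dynamic type checking.

```python
class Any:
    """A value packaged with its type; it can only be used after projection."""

    def __init__(self, value):
        self._value = value
        self._type = type(value)

    def project(self, expected_type):
        """Return the value if the dynamic type check succeeds."""
        if self._type is not expected_type:
            raise TypeError(
                f"expected {expected_type.__name__}, found {self._type.__name__}"
            )
        return self._value


class Environment:
    """Dynamically extensible collection of typed name/location bindings."""

    def __init__(self):
        self._bindings = {}  # name -> [declared type, current value]

    def bind(self, name, declared_type, value):
        """Add a new binding; existing names cannot be rebound."""
        if name in self._bindings:
            raise KeyError(f"{name!r} is already bound")
        if not isinstance(value, declared_type):
            raise TypeError(f"{name!r} must hold a {declared_type.__name__}")
        self._bindings[name] = [declared_type, value]

    def use(self, name, expected_type):
        """Fetch a binding, checking that it has the type the caller expects."""
        declared_type, value = self._bindings[name]
        if declared_type is not expected_type:
            raise TypeError(f"{name!r} is not a {expected_type.__name__}")
        return value

    def assign(self, name, value):
        """Update the location behind an existing binding."""
        slot = self._bindings[name]
        if not isinstance(value, slot[0]):
            raise TypeError(f"{name!r} must hold a {slot[0].__name__}")
        slot[1] = value


# Two independent name spaces constructed over the same data.
readings = [1, 2]
lab = Environment()
archive = Environment()
lab.bind("samples", list, readings)
archive.bind("history", list, readings)
lab.use("samples", list).append(3)
```

After the final line, the list seen through archive under the name history also contains the new element, since both environments bind the same object under different names.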

The integrated persistent environment of Napier88, with its support for higher-order procedures, yielded a new programming paradigm, possible only in such an environment, in which source programs could include direct links to values that already exist in the persistent environment. The programming technique was termed hyper-programming and the underlying representation hyper-code. Hyper-code is a representation of an executing system modelled as an active graph linking source code, existing values and meta-data. It unifies the concepts of source code, executable code and data by providing a single representation (as a combination of text and hyperlinks) of software throughout its lifecycle. Sharing is represented by multiple links to the same value. Hyper-code also allows state and shared data, and thereby closure, to be preserved during evolution.
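The idea of source text that links directly to existing values can be approximated as follows. This is an illustrative sketch in Python, not the Napier88 hyper-code system: the Link class, compile_hyper_program and the generated placeholder names are all hypothetical. A hyper-program is modelled as a sequence of text fragments and links; each link is bound to its target object itself rather than to a name that is looked up later.

```python
class Link:
    """A direct reference to an extant value, embedded in program source."""

    def __init__(self, target):
        self.target = target


def compile_hyper_program(fragments):
    """Turn text fragments and links into executable definitions.

    Each link is replaced by a generated name that is bound, in the
    program's namespace, to the linked object itself.
    """
    namespace = {}
    parts = []
    for index, fragment in enumerate(fragments):
        if isinstance(fragment, Link):
            name = f"__link_{index}"
            namespace[name] = fragment.target
            parts.append(name)
        else:
            parts.append(fragment)
    exec("".join(parts), namespace)
    return namespace


# Values that already exist before the program is written.
accounts = {"alice": 10}


def deposit(table, who, amount):
    table[who] = table.get(who, 0) + amount


# The program text links to the existing procedure and table directly.
hyper_program = [
    "def pay_alice(amount):\n    ",
    Link(deposit),
    "(",
    Link(accounts),
    ", 'alice', amount)\n",
]
compiled = compile_hyper_program(hyper_program)
compiled["pay_alice"](5)
```

Because the link holds the table itself, the call updates the original accounts object; a second hyper-program linking to the same table would share that state, which mirrors how sharing is represented by multiple links to the same value.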

Persistent Java was implemented on the Grasshopper operating system. Unlike the other persistent Java systems, no modifications were made to the abstract machine or to the bytecode generated for a particular application. Instead, orthogonal persistence was achieved by instantiating the entire Java machine within a persistent address space. In this system, like the later ANU-OPJ system, static fields were implicitly roots of persistence.

Alan Dearle
Professor of Computer Science

My research interests include similarity search, data linkage, operating systems, databases and programming languages.