Direction des Relations Internationales (DRI)

Programme INRIA "Equipes Associées"

AFMES: Advanced Formal Methods for Embedded Systems


Abstract

This collaboration involves two teams, the POP ART team from INRIA (led by Alain Girault, directeur de recherche at INRIA) and the ACEI team from the University of Auckland, New Zealand (led by Zoran Salcic, professor at the University of Auckland).

We work on some of the most important challenges for the design of embedded systems. Let us recall that embedded systems are characterized by several constraints, such as enormous complexity and heterogeneity, need for determinism or bounded reaction time. Accordingly, design methods for embedded systems should, wherever possible, be automated and guarantee these properties by construction, therefore shifting the burden of checking these constraints from the programmer/system designer to the design method. In order to achieve this, our goal is to improve the existing design methods in several key directions: (1) Incremental converter synthesis. (2) Programming language for adaptive computing (SystemJ and beyond). (3) Time predictable programming language and execution architectures. Together, these advanced formal methods will provide foundations for automated design and higher level of safety of the designed embedded systems.


I. OBJECTIVES --- II. PARTNERS --- III. WORK PROGRAMME --- IV. RESULTS --- V. VISITS


I. SCIENTIFIC OBJECTIVES

Embedded systems are characterized by several constraints, such as determinism and bounded reaction time. Accordingly, design methods for embedded systems should, when possible, guarantee these properties by construction. This allows the shifting of the burden of checking these constraints from the programmer to the design method and the associated compilers and code generation tools. In order to achieve this, our goal is to improve the existing design methods in several key directions: (1) Incremental converter synthesis. (2) Programming language for adaptive computing (SystemJ and beyond). (3) Time predictable programming language and execution architectures. Together, these advanced methods will provide a higher level of safety in the design of embedded systems. We now present in more details those three research axes.

1. Incremental converter synthesis

System-on-Chip (SoC) design involves the interconnection of many pre-designed components, called IP (Intellectual Property) blocks. While, a set of selected IPs may meet the functional requirements, their interaction protocols may not be consistent, leading to several kinds of mismatches. The most common types of such mismatches are control, data, and clock mismatches. Control mismatches happen when the sequencing of control signals between protocols is inconsistent. Data mismatches happen when the data-widths of the two protocols differ and additional buffers are needed to manage loss-less data communication. Clock mismatches are common between IPs having different clock frequencies. The goal of converter synthesis is to start from a pair of components A and B and possibly a global specification S, and then synthesize automatically a converter I (also often called an adaptor or an interface in the literature) that solves the mismatches between A and B's interaction protocols, and is such that the composition of the two components and their adaptor, (A || I || B), satisfies the global specification S, if S was required by the user (where ''||'' is the synchronous parallel product).

The first solutions proposed to solve the converter synthesis problem failed to address several questions. These include how to formally model protocols and their interaction, and when such models are available, how to determine if a converter exists for a given set of protocols? Moreover, once the existence of a converter is determined, how to synthesize it? More recently, a set of formal techniques have been proposed [Kumar & Nelvagal, ISCACSD 1996] [Passerone, de Alfaro, Henzinger & Sangiovanni-Vincentelli, ICCAD 2002] [D'Silva, Ramesh & Sowmya, IEE CDT 2005] [Sinha, Roop & Basu, SLA++P 2007] [Tivoli, Fradet, Girault & Goessler, TACAS 2007] [Sinha, Roop, Basu & Salcic, DATE 2009] [Roop, Girault, Sinha & Goessler, ACSD 2009] to address these questions.

In the AFMES project, we propose to extend the most advanced of these techniques [Roop, Girault, Sinha & Goessler, ACSD 2009] in two directions: firstly to handle in a uniform framework both control and data mismatches, and secondly to achieve incrementality. This last property is actually the conjunction of two sub-properties. The first sub-property denotes the ability to synthesize a first converter I1 between A and B, and then synthesize a second converter I2 between the assembly (A || I1 || B) and the third component C, such that the resulting assembly ((A || I1 || B) || I2 || C) be equivalent to the alternative assembly (A || I1' || (B || I2' || C)). In other words, the converter synthesis operation should be associative (in saying this, we are, somewhat abusively, seeing the converter synthesis binary operation ". || I || ." as a binary operator). The second sub-property denotes the ability to synthesize a converter I directly for the three components A, B, and C, such that the resulting assembly (I || A || B || C) be equivalent to ((A || I1 || B) || I2 || C). Proposing such a sound converter synthesis framework is a difficult and an important challenge that we wish to tackle within the AFMES project.

2. Programming language for adaptive computing (SystemJ and beyond)

System level design languages like Esterel [Berry & Gonthier, SCP 1992], Signal [Le Guernic, Gautier, Le Borgne & Lemaire, Proc. IEEE 1991], Lustre [Halbwachs, Caspi, Raymond & Pilaud, Proc. IEEE 1991], and SystemJ [Malik et al., 2009] can reduce the design and development times in the software development life cycle as these languages are based on a formal model of computation. The current system level languages are sufficient for programming static systems, i.e., systems that do not evolve in their life times. Such static system form a significant but still, a restricted portion of the complete set of heterogeneous models. The future of computing is ingrained in dynamic systems. Dynamic systems morph during their life time. Examples of such systems include, wireless mobile computing platforms, secure surveillance systems, sensor networks, and so on.

The domain of system level languages that target such dynamic systems includes RML (Reactive ML) [Mandel and Pouzet, PPDP 2005], SugarCubes [Boussinot and Susini, SPE 1998], and Junior [Hazard et al., INRIA RR 1999]. All three execution platforms are based on the SL model of computation [Boussinot and de Simone, IEEE TSE 1996], and allow dynamic forking of synchronous reactions at runtime. The synchronous nature of these execution platforms restricts targeting only closed knit models, i.e., systems with large dependencies and running on a tightly coupled platform like a single processor. There is currently no system level design language that can be used to design and implement dynamic and distributed loosely knit systems.

SystemJ combines the Esterel model of computation with the decoupling of CSP [Hoare, CACM 1985] to implement the Globally Asynchronous Locally Synchronous (GALS) model of computation. The asynchronous subset of the SystemJ language allows the targeting of distributed and multi-core systems. Currently, SystemJ only allows building and implementing static GALS models and systems. Thus, the next step in the evolution of the language will be to target dynamic systems, which is precisely our goal inside the AFMES project.

We seek to define extensions to the SystemJ language for adaptive computing systems, in which the system functionality evolves over time. The AFMES project will result in a suite of tools in the form of an integrated development environment including a compiler, debugger, and flexible code generators for a range of execution platforms to be readily used by target applications. Examples of such execution platforms include, but are not limited to, mobile computers, sensor networks, distributed embedded and control systems and platforms for ad-hoc collaborative computing. Examples of target applications include smart power grid, flexible manufacturing systems, and intelligent environments in health and elderly care.

3. Time predictable programming language and execution architectures

Speculative execution on modern processors maximizes throughput at the expense of predictability. Hence, the design of time-predictable embedded systems is a very time consuming and manual process to ensure correctness. The use of real-time operating systems to achieve predictable timing has associated flaws due to the inherent non-determinism. A move in the right direction to achieve predictable timing is a proposal for the design of Precision Timed (PRET) architectures [Edwards & Lee, DAC 2007], where time is a property at all levels of abstractions of a design. It should be emphasized that achieving a predictable execution (time predictable, deterministic, ...) is of paramount importance when designing safety critical systems.

Within the AFMES project, we seek to develop a new language called Precision Timed C (PRET-C) that guarantees precise timing by providing a set of simple extensions to C for logical time and thread-safe programming [Andalam, Roop, Girault & Traulsen, RePP 2009]. Logical time will be mapped to physical time through a static analysis [Roop et al., CASES 2009], which will provide very accurate estimates of the execution time of a program by relying on the underlying PRET architecture. The current execution architecture is the so called ARPRET (Auckland PRET Processor), a general soft-core processor (Microblaze) augmented with a dedicated Predictable Functional Unit and implemented on FPGA [Andalam, Roop, Girault & Traulsen, RePP 2009]. The combination of a logically timed language, static analyses, and a customized PRET processor shall guarantee timing while not sacrificing throughput. Our approach will be the first to offer all these in a single package, so that the complexities of designing time-critical embedded systems will be greatly simplified.

We seek to develop an integrated development environment (IDE) that supports the PRET-C compiler, debugger, and timing analyzer. The proposed compiler will generate code for both pure software execution of PRET-C on any general purpose processor, and for our ARPRET dedicated processor. Compiler switches shall be able to enable code generation for both the worst case and average case execution (when the user wants to generate code so as to maximize throughput at the expense of predictability). The debugger will extend a standard debugger such as the GNU debugger (GDB) to support debugging of PRET-C and C code. A timing analyzer will be developed combining model checking and abstract interpretation to determine very close over-approximation of the worst case tick length (also called the Worst-Case Reaction Time -- WCRT).

We will support a range of architectural extensions for efficient PRET-C execution. Possible extensions include thread interleaved pipelining, multi-threading support in hardware, and possibly predicated execution. The proposed compiler should be capable of generating code on all these architectures. We will compare the software execution of PRET-C with the execution on a range of hardware platforms that support predictable execution of PRET-C.

Multicore processors can provide better power-performance trade-offs than single cores. The availability of affordable multicores has resulted in their penetration into market segments with safety-critical concerns, such as automotive engine control units (ECUs). However, the programming and timing analysis of multicores is non-trivial due to inteferences that cores suffer from when accessing shared resources. Without a disciplined approach to the parallel programming of multicores, the understanding and debugging of multicore systems becomes difficult. Thus, we aim to develop a new language called FORE-C to enable the deterministic parallel programming of multicores. Initially, we will target the execution of FORE-C programs on general purpose embedded multicores. To guarantee the timing behaviour of FORE-C programs, we will develop an efficient static analysis method to compute precise execution times. For experimentation purposes, an extensible multicore simulator will be created to execute benchmark programs and report runtime statistics. We will demonstrate that programs written in FORE-C, while being deterministic, will still benefit from multicore execution.

The proposed PRET architecture [Edwards & Lee, DAC 2007] is of interest to FORE-C due to the interleaved (parallel) execution of hardware threads within the same core. The current method of programming the PRET architecture is to create and compile a C program for each hardware thread. We will investigate the use FORE-C to offer a centralised programming model where a parallel program is mapped to the available hardware threads. We also expect to benefit from the design considerations of the proposed PRET architecture for time predictability and execution performance.


II. PARTNERS

This collaboration will involve two teams, the POP ART team from INRIA (led by Alain Girault, directeur de recherche at INRIA) and the ACEI team from the University of Auckland, New Zealand (led by Zoran Salcic, professor at the University of Auckland).

1. INRIA partner

The POP ART team at INRIA focuses on formal methods for the design of safety-critical embedded systems. In particular, they have a worldwide recognized expertise is the areas of formal design methods, synchronous programming languages, semantics, and compilers. The POP ART team is currently involved in the European Network of Excellence Artist-Design on embedded systems as well as several international and national projects (Combest, Cesar, Asopt, Apron, OpenTLM, Vedecy, ...). In the past, the POP ART team was also involved in two other NoEs: AOSD-Europe NoE on aspect-oriented software design and Artist-II on embedded systems.

With respect to the AFMES project, the POP ART team has recently made important advances in the areas of compositional model-checking [Goessler & Sifakis, SCP 2005], adaptor synthesis [Tivoli, Fradet, Girault & Goessler, TACAS 2007], and GALS systems [Caspi, Girault & Pilaud, IEEE TSE 1999] [Girault, Nicollin & Pouzet, ACM TECS 2006]. Concerning component-based approaches, the BIP model (Behavior, Interaction, Priorities) was jointly designed by the Pop Art team and by Verimag [Goessler & Sifakis, SCP 2005].

Alain Girault is "directeur de recherche" at INRIA. He received his PhD from Grenoble INP in 1994 after doing his research in the Verimag laboratory. He spent two years and a half as a postdoc researcher, in the ESTEREL team in Sophia-Antipolis, France, in the PTOLEMY group at UC Berkeley, and in the PATH project at UC Berkeley. His research interests include the design of reactive systems, with a special concern for distributed implementation, fault-tolerance, reliability, formal verification, and discrete controller synthesis. Since January 2005, he has been the scientific leader of the POP ART team at INRIA Grenoble Rhône-Alpes, which focuses on formal methods for embedded systems. In 2008, he was on sabbatical leave in the ACEI team at the University of Auckland. He has published 50 papers in international journals and conferences. He is associate editor of the Journal of Embedded Systems and has been member of several conference programme committees.

Gregor Goessler is "chargé de recherche" at INRIA. He studied computer science and graduated from both the University of Karlsruhe and the Computer Science Engineering School at Grenoble (Ensimag) in 1998. He worked as a graduate researcher at Verimag and received his PhD at the University of Grenoble in 2001. In 2001/02 he was working as a postdoctoral researcher at the University of California at Berkeley on the Metropolis project, before joining INRIA in 2002. His research interests include component-based design of embedded systems, and in particular the development of techniques for correctness by construction of heterogeneous systems. He is a co-author of the BIP component framework with Joseph Sifakis.

Bertrand Jeannet is "chargé de recherche" at INRIA. He studied computer science and graduated from Ecole Polytechnique and obtained his PhD from Grenoble INP in the Verimag lab in 2000. He was then a postdoctoral researcher in Aalborg University, before joining INRIA in Rennes, in the Vertecs team. He has been in the POP ART team since 2006. He is a specialist of formal methods, in particular static analysis, abstract interpretation, and model checking. He is the main architect and designer of the several widely used tools and software libraries: Apron (a set of abstract domain libraries sharing the same interface) with its Interproc demonstration interprocedural analyzer, NBac (Numerical and Boolean Automaton Checker: a verification tool based on dynamic partitioning to verify safety properties of programs or systems, connected to Lustre and hybrid automata), and RAPTURE (Reachability Analysis of Probabilistic Transition systems by sUccsessive Refinements: a verification tool for finite-state probabilistic systems).

Avinash Malik got his PhD in 2009 from UoA in the ACEI team, under the supervision of Zoran Salcic. His PhD topic is the SystemJ programming language. He joined the POP ART team for a postdoc of 12 months in October 2009. Within the AFMES project, he focused on the issues of adaptive computing, more specifically for the SystemJ and DSystemJ programming languages.

Roopak Sinha holds a PhD from the University of Auckland, obtained in November 2008. He joined the POP ART team for a postdoc of 12 months in February 2011. Within the AFMES project, he focused on the issues of incremental converter synthesis.

2. New Zealand partner

The ACEI team (Auckland Centre for Embedded Intelligence) at the University of Auckland focuses on formal methods for embedded systems, tools for design and computer execution platforms (architectures) that support the systems with formal foundations.

With respect to the AFMES project, the ACEI team invented reactive processors, architectures that directly interact with the environment without needing interrupts, and that are time-triggered in nature. The most recent one, called ReCOP, is a part of tandem processor proposed for efficient execution of SystemJ language [Malik, Salcic, Girault, Walker & Lee, JTRES 2009]. Also recent in this family is the StarPro [Yuan, Andalam, Yoong, Roop & Salcic, JES 2009], which is one of the most efficient platforms for the execution Esterel, by providing a specific instruction set architecture (ISA) to execute directly Esterel program. An earlier multiprocessor called Emperor [Roop, Salcic & Dayaratne, EMSOFT 2004] supported multi-core thread-level distribution of Esterel, a concept which is extremely significant. One key property of reactive processors is that they are inherently non-speculative and hence time-predictable. One major limitation, on the other hand, is that they only support the execution of Esterel, which is less widely used compared to C. In this project we seek to develop reactive execution platforms for PRET-C, thus bringing time-predictability to the widely used C language.

The ACEI team has significant contributions in real-time programming languages and in formal methods for embedded systems. In particular, they designed the SystemJ programming language for GALS embedded systems [Gruian, Roop, Salcic & Radojevic, MEMOCODE 2006] [Malik, Salcic & Roop, ACM TODAES]. The SystemJ language is accompanied with a compiler that implements the formal semantics of the language and guarantees equivalence between the design and its implementation. To our knowledge, SystemJ is the first system level language that a designer can use to implement real-life GALS systems. The ACEI team also developed a host of different execution platforms for executing SystemJ which include: Java based execution, multiprocessor support for SystemJ execution [Malik, Salcic, Girault, Walker & Lee, JTRES 2009] and the newest addition is the real-time operating system based support for GALS model execution [Sun, Salcic & Malik, ASP-DAC 2010].

Zoran Salcic is a professor of computer systems engineering at the University of Auckland. He made original contributions and published more than 250 journal and conference papers, books and book chapters, as well as major technical reports in the areas of complex digital systems design, custom-computing machines, reconfigurable systems, FPGAs, processor and computer systems architectures, embedded systems and their implementation, design automation tools for embedded systems, hardware-software co-design, new computing architectures, and models of computation for heterogeneous embedded systems. He is editor in chief of the EURASIP Journal for Embedded Systems and associate editor of Elsevier Journal on Microprocessors and Microsystems and Australian Journal on Research and Practice in Information Technology. He is a fellow of the Royal Society (Academy of Science) New Zealand.

Partha Roop is senior lecturer at the University of Auckland in the ACEI team. He has a PhD in computer science and engineering from the University of New South Wales, Sydney, Australia, M. Tech from Indian Institute of Technology in Kharagpur, India and B.E. from Anna University (College of Engineering), Madras, India. His main research interests are in the areas of real-time embedded systems, formal verification and validation of hardware-software systems and computer architecture. He has published over 60 papers in refereed journals and conferences. He is a memeber of the editorial board of Elsevier Journal of Microprocessors and Microsystems and EURASIP Journal on Embedded Systems.

Sidharta Andalam was a PhD student in the ACEI team, co-advised by Roop (main advisor) and Girault (co-advisor). Within the AFMES project, he focused on the issues of time predictable programming language and execution architectures. During his PhD, he designed the PRET-C programming language and the ARPRET architecture. He is now postdoc at TUM Create in Singapore.

Wei-Tsun Sun obtained his PhD from UoA in the ACEI team, under the supervision of Zoran Salcic. His PhD topic is defining and implementing run-time support as extension of the operating systems for languages that have formal foundation. He has designed a library that supports processes that create GALS systems above operating system abstraction. He will join the POP ART team for a 12 months postdoc in December 2012.

Eugene Yip Eugene Yip is currently a PhD student in the ACEI team, under the supervision of Partha Roop. His PhD research involves the design of FORE-C, a deterministic (synchronous) parallel programming language, and its implementation for multicore systems with attention to timing analysis using reachability.

Li Hsien Yoong obtained his PhD from UoA in the ACEI team, under the supervision of Partha Roop. His PhD work included the formal modelling and automatic code synthesis of IEC 61499 function blocks. Within the AFMES project, he will focus on the synthesis of multi-rate distributed systems within the IEC 61499 standard.

2. Impact for the partners

The areas of expertise of the two teams are complementary: ACEI has expertise in reactive processors, optimized implementation, prototyping on FPGAs, formal models of computation for embedded systems, and programming languages for GALS systems, while POP ART has expertise on model checking, component-based design, abstract interpretation, and semantics of programming languages. We believe that this complementarity shall allow us to leverage on both sides' expertise fields to achieve the goals of the AFMES project. This is the key of a successful and fruitful collaboration as "équipe associées".

One striking example of this complementarity is our PRET-C proposal. We believe that the two keys to the success of this proposal will be, on the one hand using a tailored reactive processor (the concept of which was invented by the ACEI team), and on the other hand using abstract interpretation for the WCRT analysis (for which the expertise of the POP ART team will be required).

Finally, both teams POP ART and ACEI have roughly the same size, so we expect both of them to have the same impact on the scientific results obtained in the AFMES project. In particular, the research effort shall be evenly distributed between both teams. Finally, the three research axes presented in the AFMES project are a priority for both teams.


III. WORK PROGRAMME

1. Incremental converter synthesis

The overall aims of our first research axis will be broken down into the following specific goals. The main participants shall be Sinha, Goessler and Girault (POP ART), and Roop and Salcic (ACEI).

  1. Extend our existing converter synthesis formal method to handle in a uniform framework both control and data mismatches.
  2. Study incremental converter synthesis in this unified framework. Providing this for pure control converter synthesis seems easy, while for mixed control/data converter synthesis it constitutes a major challenge.
  3. Study converter synthesis for Kahn process networks, such that control and data exchanges shall be supervised in a uniform way (both taking place through FIFOs).

2. Programming language for adaptive computing (SystemJ and beyond)

The overall aims of our second research axis will be broken down into the following specific goals. The main participants shall be Malik, Girault, and Jeannet (POP ART), and Salcic and Sun (ACEI).

  1. Demonstrate the power of the SystemJ language on a selected complex distributed and adaptive application (e.g., intelligent sensor network for surveillance applications).
  2. Define the semantics of the proposed language, capable of describing adaptive and dynamic computing systems.
  3. Implement the complete tool chain for program development that includes a compiler, a debugger, and a code generator for SystemJ.
  4. Implement the runtime support for the language (in particular the dynamic features).
  5. Explore the features of new execution platforms (architectures) for languages for adaptive computing (dynamic creation and destruction).

3. Time predictable programming language and execution architectures

The overall aims of our third research axis will be broken down into the following specific goals. The main participants shall be Girault and Jeannet (POP ART), and Andalam, Roop, and Yip (ACEI).

  1. The development of the syntax and semantics of PRET-C and FORE-C.
  2. The development of compiler support for both languages.
  3. The design and implementation of a static timing analysis tool.
  4. The design of a debugger.
  5. The development of an integrated IDE and the associated predictable architectures.
  6. The development of a multicore simulator (for FORE-C).

IV. RESULTS

1. Results on incremental converter synthesis

Significant progress has been made on this aspect of the project in 2011. Compared to our previous work [Roopak, Girault, Sinha & Goessler, ACSD 2009], a completely new formulation to the problem of incremental converter synthesis for interacting components has been proposed. The most significant contributions of this work are summarized below:

The proposed approach uses a simple but expressive representation based on labeled transition systems (LTS) to describe protocols, specifications and converters. The representation for protocols captures both the control and data exchange aspects of IP protocols precisely, and is timing accurate. The LTS-based models can be readily extracted from given intellectual property components (IPs), and can be translated into HDL automatically. Moreover, many formal methods require LTS-like representation.

We allow the user to model the desired behavior (correctness) of the converted system using two types of specifications -- control and data:

These specifications are simple and intuitive to create but at the same time enable more flexible and expressive specifications than the existing techniques found in the literature.

The proposed algorithm can handle multiple protocols sharing different types of control signals such as uncontrollable, buffered and converter-generated signals. Uncontrollable signals are also used to model and handle non-determinism in the environment as well as within protocols themselves.

We propose specification-enforcing refinement (SER) as a relation between protocols and specifications describing the maximal non-deterministic converter under which the protocols satisfy the specification. We prove that the presence of an SER is the necessary as well as a sufficient condition for converter synthesis. The proposed approach has polynomial complexity in the sizes of the protocols and specifications.

We employ a game theoretic algorithm to compute the SER. Using the algorithm for Buchi games [Chatterjee, Henzinger & Piterman, GDV 2006], we construct the maximally-permissive non-deterministic converter, as well as a strategy to extract a deterministic converter in linear time (to the size of the graph).

We have shown that using numeric annotations on protocol transitions describing quantitative criteria such as energy usage, and a more complex game-theoretic algorithm, such as energy-parity games [Chatterjee & Doyen, ICALP 2010], an optimal converter can be generated.

Our SER technique allows the incremental design of SoCs, where protocols and specifications are added progressively. We prove that incremental design is mathematically sound, and that the converted system obtained from a previous stage of conversion can be reused in a subsequent stage.

We have demonstrated the generality of our approach by showing that some current approaches [Avnit et al, 2008; Passerone et al 2002] are restricted cases of our formulation.

We have presented a protocol conversion tool, SoCConvert, based on the proposed technique. An analysis over a number of SoC benchmarks show that our SER approach can handle complex IP protocol mismatches successfully. In addition, the previous conversion framework developed at ACEI employing CTL specifications [Sinha et al, 2009], has been extended to include the concepts of maximality and incremental design that were studied during the formulation of the above framework. Other extensions to this work include handling full-CTL specifications (before that, only negative-normal form CTL was accepted), and proof of incremental synthesis. This extended work has resulted in a publication [Sinha, Roop, Salcic & Basu, 2012] to be presented at DATE Conference 2012. The following table summarizes the related approaches found in the litterature and compares with our own SER-based approach:

 Method   Specification   Data channels   Uncontrollable signals   Complexity   Compatibility 
 [Passerone et al. 1998]   Implicit (DTR, latency)   1-way buffers   No   Polynomial   No 
 [Androutsopoulos et al, 2004]   Implicit (DTR)   1-way buffers   No   Polynomial   No 
 [Avnit et al, 2009]   Implicit (DTR)   1-way buffers   No   Polynomial   Yes 
 [Tivoli et al, 2007]   Implicit (no deadlocks)   1-way buffers   No   Exponential   Yes 
 [Cao & Nymeyer, 2009]   Explicit (temporal logic)   1-way buffers   No   Exponential   Yes 
 [Passerone et al, 2002]   Implicit   No   No   Polynomial   No 
 [Sinha et al, 2004]   Explicit (CTL, bounds)   Multi-directional buffers   Yes   Exponential   No 
 [Roop et al, 2009]   Explicit (LTS)   No   Yes   Polynomial   No 
 Our SER approach   Explicit (LTS, bounds)   multi-directional buffers   Yes   Polynomial   Yes 

Two submissions are planned from the work that has been done on incremental converter synthesis: A conference and a journal paper are in preparation.

2. Results on programming language for adaptive computing (SystemJ and beyond)

SystemJ[Malik, Salcic, Roop & Girault, Elsevier 2010] is a new system level design language, based on formal semantics, targeted at developing Globally Asynchronous Locally Synchronous (GALS) systems. SystemJ extends the imperative synchronous Esterel language [Berry, Ecole des Mines & INRIA, 2000] with asynchrony. SystemJ allows multiple synchronous islands called clock-domains to run together asynchronously, thereby resulting in a GALS execution model. SystemJ is accompanied with a compiler based on the Asynchronous GRaph Code (AGRC) format, targeting multiple different execution platforms, ranging from desktops and servers to purpose built reactive processor architectures. We have developed substantial applications with SystemJ, e.g., a smart distributed surveillance system [Malik, Salcic, Chong & Javed, TECS, 2011], to validate the compiler, the language, and the GALS model of computation itself.

Our work on reactive processors, supporting asynchronous and in particular the GALS execution model is also the very first. Our design is a multi-core architecture including; (1) a processor capable of directly executing the AGRC intermediate format, and (2) any data-processor capable of executing the data-driven computations. The resultant processor design, called the GALS-Hybrid Multi-Processor (GALS-HMP) [Malik, Salcic, Lee & Walker, TECS, 2011] is the first to handle real-world examples, unlike the previous reactive processors, which were an academic exercise due to lack of support for data-driven computations and asynchrony. Furthermore, our approach is shown to be scalable to a large number of cores, a criteria, which was explored unsuccessfully in previous reactive processor design attempts. Finally, our approach of supporting direct execution of the intermediate format rather than the language constructs makes the processor available to other system level language such as Esterel, which is a subset of SystemJ.

We have successfully designed and implemented a new programming language called DSystemJ, which is a conservative extension of SystemJ. It is primarily aimed at dynamic distributed systems, which are connected via networks. DSystemJ adopts the GALS model of computation (Globally Asynchronous Locally Synchronous), and is equipped with a rigorous mathematical semantics allowing for potential formal verification of dynamic distributed systems. DSystemJ allows runtime forking and destruction of processes and of their communication channels, reconfiguration of the system topology, and mobility of processes across networks and physical machines. Although currently targeted at desktop and server based systems, the concepts introduced in DSystemJ are equally applicable to dynamic Network on Chip (NoC) systems.

DSystemJ introduces five new syntactic constructs to allow the design of dynamic systems with ease. Syntactic constructs like -> and run create asynchronous processes called clock-domains, and execute them at runtime. Constructs such as send and receive are used to communicate via channels between these asynchronous processes. Mobility is inherent in DSystemJ as it allows passing clock-domains via such communication channels.

The main innovation in DSystemJ is the new rendezvous algorithm and its formal semantics introduced to communicate between clock-domains in the presence of mobility [Malik, Girault & Salcic, IEEE TPDS 2012]. There is no current language that supports, in the presence of mobility, a robust rendezvous algorithm like in DSystemJ. The language uses a combination of broadcast and point-to-point protocols to establish communication between clock-domains running on different physical machines.

DSystemJ is more expressive than other classical languages targeted at dynamic systems like JoCaml [Mandel and Maranget, INRIA No RR-6261 2008], and Occam-π [Welch and Barnes, LNCS 2005], because DSystemJ provides reactivity, mobility, and support for complex data computations for design of real systems, unlike in the aforementioned approaches. The only other solution that matches the expressive power of DSystemJ is the Java based library called JADE, designed at the Telecom Italia Labs (not to be mistaken with the JADE autonomic agent framework of the SARDES INRIA team). Even in this case, DSystemJ is a better solution as it provides a language based design environment thereby reducing the design and maintenance effort. Also, our experimental results are clearly in favor of DSystemJ, showing significantly faster performance and reduced memory footprints.

We have designed a compiler, and a runtime system for the DSystemJ language. The compiler is written in Java, while the runtime system is written in the SystemJ language itself. The use of SystemJ makes the runtime system easily extensible by the designer as there is no need to learn a new programming language. We have tested our implementation by designing and implementing a substantially large object tracking system using multiple physical machines within a large sub-net with positive results.

This work on DSystemJ has resulted in a large body of software that is available under a liberal license here. The startup company has been created to market the SystemJ compiler and programming environment. Besides the software, our work has resulted in a conference paper [Malik, Girault, Salcic, ACSD 2011] and a journal paper [Malik, Girault, Salcic, IEEE TPDS 2012].

In conclusion, we have met all but two of the objectives that we set out to achieve with the design and research of dynamic extensions to the SystemJ language (see Section III.2). Substantial further research effort is needed to design a debugger for DSystemJ, especially with mobility exacerbating this problem. Similarly, we need to think more carefully about verification of DSystemJ programs, because of their intractability under general conditions.

3. Results on time predictable programming language and execution architectures

Our results on PRET programming and implementation are reported in the PRETZEL web page. We have developed the syntax and semantics of our PRET-C language and have proposed a new PRET processor execution [Andalam, Roop & Girault, MEMOCODE'10]. This article also compares the efficiency of PRET-C with similar light-weight, multithreaded C such as Protothreads [Dunkels et al., SenSys'06] and Syncharts in C [Plotkin, 1981]. These benchmarking experiments demonstrate the superiority of PRET-C over these languages, both in terms of worst case and average case execution time, while being competitive in terms of code size.

We have also developed the formal semantics of PRET-C using structural operational style semantics [Roop et al., CASES 2009]. The proposed semantics ensures the determinism of PRET-C programs, and PRET-C threads are causal by construction. A journal version of PRET-C syntax, semantics, compilation, and execution is currently (fall 2012) under minor revision at IEEE Transaction on Computers.

Concerning the timing analysis (item 3 of Section III.3), we have continued our initial model checking based formulation [Roop et al., CASES 2009]. We have now developed an automated tool to convert PRET-C programs into the input format for the UPPAAL model checker. This has enabled us to perform timing analysis over a range of large benchmark programs. We have also recently compared our approach with ILP-based approaches (Integer Linear Programming). This comparison reveals that our own approach yields both more scalable and tighter analysis compared to the ILP-based approaches. These results have appeared in a conference paper [Andalam, Roop & Girault, DATE 2011].

We have already developed most of the core parts of the tool chain. These are constantly updated, as we are exploring techniques for tighter and smarter timing analysis. We are currently lacking a complete an integrated development environment (IDE) combined with a debugger tailored for PRET-C.

In 2011, Sidharta worked on cache analysis with Jan Reineke (at that time postdoc at UC Berkeley, in the group of Prof. Edward Lee). We discussed and compared the "binary approach" (developed during Sidharta's visit in INRIA) with the absint approach. We concluded that, although the "binary approach" will be more precise, it is unsure how much more time it will take to analyze the program. We also developed a new abstraction to enhance the binary approach, based on a ternary representation of the cache states as guaranteed hit (h), guaranteed miss (m), and either miss or hit (nc). This "ternary approach" will be less precise than the "binary approach", but will result in a faster analysis time.

Secondly, we got access to the commercial tool aiT from absint. The tool is very impressive, and we used it to perform a simple cache analysis over the ARM architecture.

Thirdly, in collaboration with the PRET UC Berkeley research team, we started looking at the timing analysis problem from a new perspective. Instead of making the timing analyzer tightly coupled to an architecture (like the current approaches), we started designing a new architecture-parametric timing analyzer. The parameters describing the architecture will therefore be an input to the timing analyzer. This will allow the portability of hard real-time applications over different architectures.

Another key aspect is the balance between the precision of the WCET estimate and the analysis time. The analysis time is crucial for scalability. We have implemented the absint approach and the NUS approach (from National University of Singapore). Our initial benchmarking concluded that the NUS approach gives tighter estimates than the absint approach. However, the NUS approach takes a much longer time to analyze a program. For example, a program that takes NUS about 5 hours to analyze will only take absint a few seconds. Also, for larger programs the NUS approach either runs out of memory (3Gb) or takes too long (>12hrs) to analyze. We have concluded that the NUS approach will not scale for larger applications.

We have also implemented our "binary approach". It achieves the same precision as the NUS approach but consumes significantly less memory and takes a shorter time to analyze. However, during the benchmarking of large programs, we were unable to finish the analysis within a reasonable amount of time, so the "binary approach" is not very scalable.

We are therefore developing a new approach. Instead of operating at either ends of the spectrum (high precision or smaller analysis time), we have formulated a new "parameterized approach", in order to control the abstraction, which in turn affects the analysis time. This new approach allows us to explore intermediate points within the spectrum, making this approach very attractive. Moreover, the abstraction captured by the previously proposed ternary approach is equivalent to a special point in this new parameterized approach. Currently, we operate at the maximum abstraction during the analysis. Due to the implementation overheads, we have found our new parameterized approach to be more effective than the previously proposed ternary approach. We are currently benchmarking this parameterized approach against the absint and the NUS approaches, and we will submit a journal paper at the end of 2012.

Finally, concerning the FORE-C language, we have developed its syntax and the (informal) semantics. To the best of our knowledge, FORE-C is the first attempt at the design of a synchronous language for the deterministic parallel programming of embedded multicores. A FORE-C compiler has been developed that targets a general purpose multicore architecture with cores based on the Xilinx MicroBlaze softcore processor. The traditional approach of using shared variables for thread communication requires the programmer to manage parallel access from multiple threads with mutual exclusion protocols. These protocols impose a sequential order on thread execution, thus sacrificing parallelism. In contrast, FORE-C enables threads to access shared variables in parallel without needing to sequentialize.

We have also developed a reachability-based static timing analysis approach for FORE-C programs executed on multicores. Preliminary results have shown that very precise WCRTs are computed for a range of large programs. WCRT speedups were also computed for the same programs when distributed over additional cores. To decrease the analysis time, we developed optimisation techniques that decrease the size of the program's control-flow graph used as input into the analysis. Using the optimisations, we are able to reduce analysis time by two orders of magnitude, without affecting the precision.

To validate the FORE-C timing analysis work, we have developed a cycle-accurate multicore simulator with cores based on the Xilinx MicroBlaze softcore processor. An arbitrary number of cores connected to a shared TDMA bus can be simulated. The simulator is extensible in that custom peripherials can be created and attached to any processor or to the shared TDMA bus. The simulator is able to output to the user the executed instructions on each core.

4. Scientific production (published papers and software releases)

5. Work programme for 2012

On incremental converter synthesis

The following items will be conducted jointly by the POP ART and ACEI partners.

  1. Extention of the current formulation to allow uncontrollable signals between components (uncontrollable in the sense of discrete controller synthesis [Ramadge and Wonham, 1987]). This is a madatory step towards incrementality.

  2. Implementation of the converter synthesis algorithm, and benchmarking of this implementation.

  3. Submit a journal article on our new formulation and solution.

  4. In the current formulation, control signals and data signals are distinguished. Control signals are handled as transition events in our LTSs, and they are directly managed by our discrete controller synthesis based formulation. In contrast, data signals are handled as buffers for which we only control the number of bits that they contain. We seek to unify these two formulation under a common formalism based on Kahn process networks.

On programming language for adaptive computing

The following items will be conducted jointly by the POP ART and ACEI partners.

  1. The first concept and implementation of DSystemJ is based on the assumptions of unbounded computing resources and especially memories. Also, DSystemJ depends on Java as it compiles at the end to Java byte-codes and executes on the JVM. Our first task for 2011 shall be to look beyond Java as an underlying language and work on specific run-time support that will enable dynamic GALS systems on arbitrary standard platforms that do not require Java. This work will exploit the GALS library of Wei-Tsun Sun [Sun, Salcic & Malik 2010] and extend it for the dynamic aspects of DSystemJ.

  2. We shall also investigate how existing operating systems can be used to provide the elements of run-time support for dynamic GALS languages based on standard programming languages (C and C++).

  3. Finally, we shall investigate optimized code generation of DSsytemJ programs for high-performance platform, and in particular MPSoCs. This differs from the current distributed code generation of DSystemJ because is raises issues of mapping efficiently the DSystemJ clock-domains to the cores, of optimizing the channel based communications on the NoC, and of memory hierarchy. This last task is really exploratory and, with the recent move of Avinash Malik to IBM in Dublin, progress will depend on what manpower will be available to tackle it in 2011 and 2012.

On time predictable programming language

Items (5) below will be mostly done by the ACEI partner. All the other items will be conducted jointly by the POP ART and ACEI partners.

  1. Work on the timing analysis of PRET-C programs running on processors with caches. This is a crucial point to allow the design and implementation of large application with PRET-C. This task may look daunting, but we believe that the global synchronization barrier between parallel threads of PRET-C (the so-called EOT statement) makes it achievable. Indeed, it will be possible to refresh all the cache entries upon each synchronization barrier, therefore removing the cache non-determinism due to the control flow branches taken by the code above the barrier. We have been discussing this issue with Reinhard Wilhelm (from the University of Saarland).

  2. Develop a scratchpad memory allocation algorithm. In the context of PRET programming, scratchpad memories present a nice alternative to standard caches because they are easier to control at a software level. This is also the approach followed by the UC Berkeley team of Edward Lee.

  3. Evaluate and compare the predictability and throughput of PRET-C programs between processors equipped with caches and with scratchpad memories. We will then submit a journal and a conference paper on our timing analysis of our memory hierarchy work.

  4. Depending on the available manpower, we will complete the Integrated Design Environment (IDE). This requires the following items: (i) A PRET-C Editor. (ii) A TCCFG viewer. (iii) A timing analysis report. (iv) Based on the Worst Case Reaction Time path, the IDE will highlight both the TCCFG and the PRET-C source file with timing information and will highlight the path leading to WCRT.

  5. A final research direction is the use of multicore processors for the execution of PRET programs. This will be a significant step that will be started in 2011 and will be continued all the way through to the end of the project. This will imply the design of a new programming language FORE-C to facilitate distribution of threads over cores, so as to achieve both predictable execution and good throughput. FORE-C will be derived from PRET-C but with significant differences, the main one being that PRET-C threads are executed sequentially on a single core, FORE-C threads are executed in real concurrency; this has a major impact on how shared variables are declared, handled, and compiled. This action will involve Eugene Yip who is doing his with Partha Roop in the ACEI team.

  6. We shall work on the formalization of the FORE-C semantics and on extending the FORE-C timing analysis to include cache hierarchy in the multicore system. Although this work can leverage the cache analysis work developed for PRET-C, the sharing of resources in the multicore setting complicates the analysis. In tandem, the multicore simulator will be extended to include a cache hierarchy for validation purposes.


V. VISITS

1. List of the visits for 2010

2. List of the visits for 2011

3. Planned visits for 2012 (POP ART researchers)

4. Planned visits for 2012 (ACEI researchers)



© INRIA - last updated on 16/11/2012