Direction des Relations Internationales (DRI)
This collaboration involves two teams, the POP ART team from INRIA (led by Alain Girault, directeur de recherche at INRIA) and the ACEI team from the University of Auckland, New Zealand (led by Zoran Salcic, professor at the University of Auckland).
We work on some of the most important challenges for the design of embedded systems. Let us recall that embedded systems are characterized by several constraints, such as enormous complexity and heterogeneity, need for determinism or bounded reaction time. Accordingly, design methods for embedded systems should, wherever possible, be automated and guarantee these properties by construction, therefore shifting the burden of checking these constraints from the programmer/system designer to the design method. In order to achieve this, our goal is to improve the existing design methods in several key directions: (1) Incremental converter synthesis. (2) Programming language for adaptive computing (SystemJ and beyond). (3) Time predictable programming language and execution architectures. Together, these advanced formal methods will provide foundations for automated design and higher level of safety of the designed embedded systems.
I. OBJECTIVES | --- | II. PARTNERS | --- | III. WORK PROGRAMME | --- | IV. RESULTS | --- | V. VISITS |
Embedded systems are characterized by several constraints, such as determinism and bounded reaction time. Accordingly, design methods for embedded systems should, when possible, guarantee these properties by construction. This allows the shifting of the burden of checking these constraints from the programmer to the design method and the associated compilers and code generation tools. In order to achieve this, our goal is to improve the existing design methods in several key directions: (1) Incremental converter synthesis. (2) Programming language for adaptive computing (SystemJ and beyond). (3) Time predictable programming language and execution architectures. Together, these advanced methods will provide a higher level of safety in the design of embedded systems. We now present in more details those three research axes.
System-on-Chip (SoC) design involves the
interconnection of many pre-designed components, called IP
(Intellectual Property) blocks. While, a set of selected IPs may meet
the functional requirements, their interaction protocols may not be
consistent, leading to several kinds of mismatches. The most common
types of such mismatches are control, data,
and clock mismatches. Control mismatches happen when the
sequencing of control signals between protocols is inconsistent. Data
mismatches happen when the data-widths of the two protocols differ and
additional buffers are needed to manage loss-less data
communication. Clock mismatches are common between IPs having
different clock frequencies. The goal of converter synthesis is
to start from a pair of components A
and B
and possibly a global specification S
, and then
synthesize automatically a converter I
(also often
called an adaptor or an interface in the literature)
that solves the mismatches between A
and B
's
interaction protocols, and is such that the composition of the two
components and their adaptor, (A || I || B)
, satisfies
the global specification S
, if S
was
required by the user (where ''||
'' is the synchronous
parallel product).
The first solutions proposed to solve the converter synthesis problem failed to address several questions. These include how to formally model protocols and their interaction, and when such models are available, how to determine if a converter exists for a given set of protocols? Moreover, once the existence of a converter is determined, how to synthesize it? More recently, a set of formal techniques have been proposed [Kumar & Nelvagal, ISCACSD 1996] [Passerone, de Alfaro, Henzinger & Sangiovanni-Vincentelli, ICCAD 2002] [D'Silva, Ramesh & Sowmya, IEE CDT 2005] [Sinha, Roop & Basu, SLA++P 2007] [Tivoli, Fradet, Girault & Goessler, TACAS 2007] [Sinha, Roop, Basu & Salcic, DATE 2009] [Roop, Girault, Sinha & Goessler, ACSD 2009] to address these questions.
In the AFMES project, we propose to extend the most
advanced of these techniques [Roop, Girault, Sinha & Goessler, ACSD
2009] in two directions: firstly to handle in a uniform
framework both control and data mismatches, and secondly to
achieve incrementality. This last property is actually the
conjunction of two sub-properties. The first sub-property denotes the
ability to synthesize a first converter I1
between A
and B
, and then synthesize a
second converter I2
between the assembly (A || I1
|| B)
and the third component C
, such that the
resulting assembly ((A || I1 || B) || I2 || C)
be
equivalent to the alternative assembly (A || I1' || (B || I2' ||
C))
. In other words, the converter synthesis operation should
be associative (in saying this, we are, somewhat abusively,
seeing the converter synthesis binary operation ". || I ||
.
" as a binary operator). The second sub-property denotes the
ability to synthesize a converter I
directly for the
three components A
, B
, and C
,
such that the resulting assembly (I || A || B || C)
be
equivalent to ((A || I1 || B) || I2 || C)
. Proposing such
a sound converter synthesis framework is a difficult and an important
challenge that we wish to tackle within the AFMES project.
System level design languages like Esterel [Berry & Gonthier, SCP 1992], Signal [Le Guernic, Gautier, Le Borgne & Lemaire, Proc. IEEE 1991], Lustre [Halbwachs, Caspi, Raymond & Pilaud, Proc. IEEE 1991], and SystemJ [Malik et al., 2009] can reduce the design and development times in the software development life cycle as these languages are based on a formal model of computation. The current system level languages are sufficient for programming static systems, i.e., systems that do not evolve in their life times. Such static system form a significant but still, a restricted portion of the complete set of heterogeneous models. The future of computing is ingrained in dynamic systems. Dynamic systems morph during their life time. Examples of such systems include, wireless mobile computing platforms, secure surveillance systems, sensor networks, and so on.
The domain of system level languages that target such dynamic systems includes RML (Reactive ML) [Mandel and Pouzet, PPDP 2005], SugarCubes [Boussinot and Susini, SPE 1998], and Junior [Hazard et al., INRIA RR 1999]. All three execution platforms are based on the SL model of computation [Boussinot and de Simone, IEEE TSE 1996], and allow dynamic forking of synchronous reactions at runtime. The synchronous nature of these execution platforms restricts targeting only closed knit models, i.e., systems with large dependencies and running on a tightly coupled platform like a single processor. There is currently no system level design language that can be used to design and implement dynamic and distributed loosely knit systems.
SystemJ combines the Esterel model of computation with the decoupling of CSP [Hoare, CACM 1985] to implement the Globally Asynchronous Locally Synchronous (GALS) model of computation. The asynchronous subset of the SystemJ language allows the targeting of distributed and multi-core systems. Currently, SystemJ only allows building and implementing static GALS models and systems. Thus, the next step in the evolution of the language will be to target dynamic systems, which is precisely our goal inside the AFMES project.
We seek to define extensions to the SystemJ language for adaptive computing systems, in which the system functionality evolves over time. The AFMES project will result in a suite of tools in the form of an integrated development environment including a compiler, debugger, and flexible code generators for a range of execution platforms to be readily used by target applications. Examples of such execution platforms include, but are not limited to, mobile computers, sensor networks, distributed embedded and control systems and platforms for ad-hoc collaborative computing. Examples of target applications include smart power grid, flexible manufacturing systems, and intelligent environments in health and elderly care.
Speculative execution on modern processors maximizes throughput at the expense of predictability. Hence, the design of time-predictable embedded systems is a very time consuming and manual process to ensure correctness. The use of real-time operating systems to achieve predictable timing has associated flaws due to the inherent non-determinism. A move in the right direction to achieve predictable timing is a proposal for the design of Precision Timed (PRET) architectures [Edwards & Lee, DAC 2007], where time is a property at all levels of abstractions of a design. It should be emphasized that achieving a predictable execution (time predictable, deterministic, ...) is of paramount importance when designing safety critical systems.
Within the AFMES project, we seek to develop a new language called Precision Timed C (PRET-C) that guarantees precise timing by providing a set of simple extensions to C for logical time and thread-safe programming [Andalam, Roop, Girault & Traulsen, RePP 2009]. Logical time will be mapped to physical time through a static analysis [Roop et al., CASES 2009], which will provide very accurate estimates of the execution time of a program by relying on the underlying PRET architecture. The current execution architecture is the so called ARPRET (Auckland PRET Processor), a general soft-core processor (Microblaze) augmented with a dedicated Predictable Functional Unit and implemented on FPGA [Andalam, Roop, Girault & Traulsen, RePP 2009]. The combination of a logically timed language, static analyses, and a customized PRET processor shall guarantee timing while not sacrificing throughput. Our approach will be the first to offer all these in a single package, so that the complexities of designing time-critical embedded systems will be greatly simplified.
We seek to develop an integrated development environment (IDE) that supports the PRET-C compiler, debugger, and timing analyzer. The proposed compiler will generate code for both pure software execution of PRET-C on any general purpose processor, and for our ARPRET dedicated processor. Compiler switches shall be able to enable code generation for both the worst case and average case execution (when the user wants to generate code so as to maximize throughput at the expense of predictability). The debugger will extend a standard debugger such as the GNU debugger (GDB) to support debugging of PRET-C and C code. A timing analyzer will be developed combining model checking and abstract interpretation to determine very close over-approximation of the worst case tick length (also called the Worst-Case Reaction Time -- WCRT).
We will support a range of architectural extensions for efficient PRET-C execution. Possible extensions include thread interleaved pipelining, multi-threading support in hardware, and possibly predicated execution. The proposed compiler should be capable of generating code on all these architectures. We will compare the software execution of PRET-C with the execution on a range of hardware platforms that support predictable execution of PRET-C.
Multicore processors can provide better power-performance trade-offs than single cores. The availability of affordable multicores has resulted in their penetration into market segments with safety-critical concerns, such as automotive engine control units (ECUs). However, the programming and timing analysis of multicores is non-trivial due to inteferences that cores suffer from when accessing shared resources. Without a disciplined approach to the parallel programming of multicores, the understanding and debugging of multicore systems becomes difficult. Thus, we aim to develop a new language called FORE-C to enable the deterministic parallel programming of multicores. Initially, we will target the execution of FORE-C programs on general purpose embedded multicores. To guarantee the timing behaviour of FORE-C programs, we will develop an efficient static analysis method to compute precise execution times. For experimentation purposes, an extensible multicore simulator will be created to execute benchmark programs and report runtime statistics. We will demonstrate that programs written in FORE-C, while being deterministic, will still benefit from multicore execution.
The proposed PRET architecture [Edwards & Lee, DAC 2007] is of interest to FORE-C due to the interleaved (parallel) execution of hardware threads within the same core. The current method of programming the PRET architecture is to create and compile a C program for each hardware thread. We will investigate the use FORE-C to offer a centralised programming model where a parallel program is mapped to the available hardware threads. We also expect to benefit from the design considerations of the proposed PRET architecture for time predictability and execution performance.
This collaboration will involve two teams, the POP ART team from INRIA (led by Alain Girault, directeur de recherche at INRIA) and the ACEI team from the University of Auckland, New Zealand (led by Zoran Salcic, professor at the University of Auckland).
The POP ART team at INRIA focuses on formal methods for the design of safety-critical embedded systems. In particular, they have a worldwide recognized expertise is the areas of formal design methods, synchronous programming languages, semantics, and compilers. The POP ART team is currently involved in the European Network of Excellence Artist-Design on embedded systems as well as several international and national projects (Combest, Cesar, Asopt, Apron, OpenTLM, Vedecy, ...). In the past, the POP ART team was also involved in two other NoEs: AOSD-Europe NoE on aspect-oriented software design and Artist-II on embedded systems.
With respect to the AFMES project, the POP ART team has recently made important advances in the areas of compositional model-checking [Goessler & Sifakis, SCP 2005], adaptor synthesis [Tivoli, Fradet, Girault & Goessler, TACAS 2007], and GALS systems [Caspi, Girault & Pilaud, IEEE TSE 1999] [Girault, Nicollin & Pouzet, ACM TECS 2006]. Concerning component-based approaches, the BIP model (Behavior, Interaction, Priorities) was jointly designed by the Pop Art team and by Verimag [Goessler & Sifakis, SCP 2005].
Alain Girault is "directeur de recherche" at INRIA. He received his PhD from Grenoble INP in 1994 after doing his research in the Verimag laboratory. He spent two years and a half as a postdoc researcher, in the ESTEREL team in Sophia-Antipolis, France, in the PTOLEMY group at UC Berkeley, and in the PATH project at UC Berkeley. His research interests include the design of reactive systems, with a special concern for distributed implementation, fault-tolerance, reliability, formal verification, and discrete controller synthesis. Since January 2005, he has been the scientific leader of the POP ART team at INRIA Grenoble Rhône-Alpes, which focuses on formal methods for embedded systems. In 2008, he was on sabbatical leave in the ACEI team at the University of Auckland. He has published 50 papers in international journals and conferences. He is associate editor of the Journal of Embedded Systems and has been member of several conference programme committees.
Gregor Goessler is "chargé de recherche" at INRIA. He studied computer science and graduated from both the University of Karlsruhe and the Computer Science Engineering School at Grenoble (Ensimag) in 1998. He worked as a graduate researcher at Verimag and received his PhD at the University of Grenoble in 2001. In 2001/02 he was working as a postdoctoral researcher at the University of California at Berkeley on the Metropolis project, before joining INRIA in 2002. His research interests include component-based design of embedded systems, and in particular the development of techniques for correctness by construction of heterogeneous systems. He is a co-author of the BIP component framework with Joseph Sifakis.
Bertrand Jeannet is "chargé de recherche" at INRIA. He studied computer science and graduated from Ecole Polytechnique and obtained his PhD from Grenoble INP in the Verimag lab in 2000. He was then a postdoctoral researcher in Aalborg University, before joining INRIA in Rennes, in the Vertecs team. He has been in the POP ART team since 2006. He is a specialist of formal methods, in particular static analysis, abstract interpretation, and model checking. He is the main architect and designer of the several widely used tools and software libraries: Apron (a set of abstract domain libraries sharing the same interface) with its Interproc demonstration interprocedural analyzer, NBac (Numerical and Boolean Automaton Checker: a verification tool based on dynamic partitioning to verify safety properties of programs or systems, connected to Lustre and hybrid automata), and RAPTURE (Reachability Analysis of Probabilistic Transition systems by sUccsessive Refinements: a verification tool for finite-state probabilistic systems).
Avinash Malik got his PhD in 2009 from UoA in the ACEI team, under the supervision of Zoran Salcic. His PhD topic is the SystemJ programming language. He joined the POP ART team for a postdoc of 12 months in October 2009. Within the AFMES project, he focused on the issues of adaptive computing, more specifically for the SystemJ and DSystemJ programming languages.
Roopak Sinha holds a PhD from the University of Auckland, obtained in November 2008. He joined the POP ART team for a postdoc of 12 months in February 2011. Within the AFMES project, he focused on the issues of incremental converter synthesis.
The ACEI team (Auckland Centre for Embedded Intelligence) at the University of Auckland focuses on formal methods for embedded systems, tools for design and computer execution platforms (architectures) that support the systems with formal foundations.
With respect to the AFMES project, the ACEI team invented reactive processors, architectures that directly interact with the environment without needing interrupts, and that are time-triggered in nature. The most recent one, called ReCOP, is a part of tandem processor proposed for efficient execution of SystemJ language [Malik, Salcic, Girault, Walker & Lee, JTRES 2009]. Also recent in this family is the StarPro [Yuan, Andalam, Yoong, Roop & Salcic, JES 2009], which is one of the most efficient platforms for the execution Esterel, by providing a specific instruction set architecture (ISA) to execute directly Esterel program. An earlier multiprocessor called Emperor [Roop, Salcic & Dayaratne, EMSOFT 2004] supported multi-core thread-level distribution of Esterel, a concept which is extremely significant. One key property of reactive processors is that they are inherently non-speculative and hence time-predictable. One major limitation, on the other hand, is that they only support the execution of Esterel, which is less widely used compared to C. In this project we seek to develop reactive execution platforms for PRET-C, thus bringing time-predictability to the widely used C language.
The ACEI team has significant contributions in real-time programming languages and in formal methods for embedded systems. In particular, they designed the SystemJ programming language for GALS embedded systems [Gruian, Roop, Salcic & Radojevic, MEMOCODE 2006] [Malik, Salcic & Roop, ACM TODAES]. The SystemJ language is accompanied with a compiler that implements the formal semantics of the language and guarantees equivalence between the design and its implementation. To our knowledge, SystemJ is the first system level language that a designer can use to implement real-life GALS systems. The ACEI team also developed a host of different execution platforms for executing SystemJ which include: Java based execution, multiprocessor support for SystemJ execution [Malik, Salcic, Girault, Walker & Lee, JTRES 2009] and the newest addition is the real-time operating system based support for GALS model execution [Sun, Salcic & Malik, ASP-DAC 2010].
Zoran Salcic is a professor of computer systems engineering at the University of Auckland. He made original contributions and published more than 250 journal and conference papers, books and book chapters, as well as major technical reports in the areas of complex digital systems design, custom-computing machines, reconfigurable systems, FPGAs, processor and computer systems architectures, embedded systems and their implementation, design automation tools for embedded systems, hardware-software co-design, new computing architectures, and models of computation for heterogeneous embedded systems. He is editor in chief of the EURASIP Journal for Embedded Systems and associate editor of Elsevier Journal on Microprocessors and Microsystems and Australian Journal on Research and Practice in Information Technology. He is a fellow of the Royal Society (Academy of Science) New Zealand.
Partha Roop is senior lecturer at the University of Auckland in the ACEI team. He has a PhD in computer science and engineering from the University of New South Wales, Sydney, Australia, M. Tech from Indian Institute of Technology in Kharagpur, India and B.E. from Anna University (College of Engineering), Madras, India. His main research interests are in the areas of real-time embedded systems, formal verification and validation of hardware-software systems and computer architecture. He has published over 60 papers in refereed journals and conferences. He is a memeber of the editorial board of Elsevier Journal of Microprocessors and Microsystems and EURASIP Journal on Embedded Systems.
Sidharta Andalam was a PhD student in the ACEI team, co-advised by Roop (main advisor) and Girault (co-advisor). Within the AFMES project, he focused on the issues of time predictable programming language and execution architectures. During his PhD, he designed the PRET-C programming language and the ARPRET architecture. He is now postdoc at TUM Create in Singapore.
Wei-Tsun Sun obtained his PhD from UoA in the ACEI team, under the supervision of Zoran Salcic. His PhD topic is defining and implementing run-time support as extension of the operating systems for languages that have formal foundation. He has designed a library that supports processes that create GALS systems above operating system abstraction. He will join the POP ART team for a 12 months postdoc in December 2012.
Eugene Yip Eugene Yip is currently a PhD student in the ACEI team, under the supervision of Partha Roop. His PhD research involves the design of FORE-C, a deterministic (synchronous) parallel programming language, and its implementation for multicore systems with attention to timing analysis using reachability.
Li Hsien Yoong obtained his PhD from UoA in the ACEI team, under the supervision of Partha Roop. His PhD work included the formal modelling and automatic code synthesis of IEC 61499 function blocks. Within the AFMES project, he will focus on the synthesis of multi-rate distributed systems within the IEC 61499 standard.
The areas of expertise of the two teams are complementary: ACEI has expertise in reactive processors, optimized implementation, prototyping on FPGAs, formal models of computation for embedded systems, and programming languages for GALS systems, while POP ART has expertise on model checking, component-based design, abstract interpretation, and semantics of programming languages. We believe that this complementarity shall allow us to leverage on both sides' expertise fields to achieve the goals of the AFMES project. This is the key of a successful and fruitful collaboration as "équipe associées".
One striking example of this complementarity is our PRET-C proposal. We believe that the two keys to the success of this proposal will be, on the one hand using a tailored reactive processor (the concept of which was invented by the ACEI team), and on the other hand using abstract interpretation for the WCRT analysis (for which the expertise of the POP ART team will be required).
Finally, both teams POP ART and ACEI have roughly the same size, so we expect both of them to have the same impact on the scientific results obtained in the AFMES project. In particular, the research effort shall be evenly distributed between both teams. Finally, the three research axes presented in the AFMES project are a priority for both teams.
The overall aims of our first research axis will be broken down into the following specific goals. The main participants shall be Sinha, Goessler and Girault (POP ART), and Roop and Salcic (ACEI).
The overall aims of our second research axis will be broken down into the following specific goals. The main participants shall be Malik, Girault, and Jeannet (POP ART), and Salcic and Sun (ACEI).
The overall aims of our third research axis will be broken down into the following specific goals. The main participants shall be Girault and Jeannet (POP ART), and Andalam, Roop, and Yip (ACEI).
Significant progress has been made on this aspect of the project in 2011. Compared to our previous work [Roopak, Girault, Sinha & Goessler, ACSD 2009], a completely new formulation to the problem of incremental converter synthesis for interacting components has been proposed. The most significant contributions of this work are summarized below:
The proposed approach uses a simple but expressive representation based on labeled transition systems (LTS) to describe protocols, specifications and converters. The representation for protocols captures both the control and data exchange aspects of IP protocols precisely, and is timing accurate. The LTS-based models can be readily extracted from given intellectual property components (IPs), and can be translated into HDL automatically. Moreover, many formal methods require LTS-like representation.
We allow the user to model the desired behavior (correctness) of the converted system using two types of specifications -- control and data:
Control specifications are based on LTS with marked states that describe liveness and/or fairness criteria. A key feature of the model for control specifications is that we allow specifications to describe only relevant behavior(s) of the protocols. A control specification is satisfied by a collection of IPs if their projected behavior(s) are consistent with some behavior(s) of the specification. This projection-based approach allows the user to write succinct specifications, independent of detailed knowledge of the protocols (unlike previous techniques).
Data specifications bound the data channels that the protocols communicate through. Our approach can handle multiple data channels that can have multiple reader and writer protocols connected to them.
These specifications are simple and intuitive to create but at the same time enable more flexible and expressive specifications than the existing techniques found in the literature.
The proposed algorithm can handle multiple protocols sharing different types of control signals such as uncontrollable, buffered and converter-generated signals. Uncontrollable signals are also used to model and handle non-determinism in the environment as well as within protocols themselves.
We propose specification-enforcing refinement (SER) as a relation between protocols and specifications describing the maximal non-deterministic converter under which the protocols satisfy the specification. We prove that the presence of an SER is the necessary as well as a sufficient condition for converter synthesis. The proposed approach has polynomial complexity in the sizes of the protocols and specifications.
We employ a game theoretic algorithm to compute the SER. Using the algorithm for Buchi games [Chatterjee, Henzinger & Piterman, GDV 2006], we construct the maximally-permissive non-deterministic converter, as well as a strategy to extract a deterministic converter in linear time (to the size of the graph).
We have shown that using numeric annotations on protocol transitions describing quantitative criteria such as energy usage, and a more complex game-theoretic algorithm, such as energy-parity games [Chatterjee & Doyen, ICALP 2010], an optimal converter can be generated.
Our SER technique allows the incremental design of SoCs, where protocols and specifications are added progressively. We prove that incremental design is mathematically sound, and that the converted system obtained from a previous stage of conversion can be reused in a subsequent stage.
We have demonstrated the generality of our approach by showing that some current approaches [Avnit et al, 2008; Passerone et al 2002] are restricted cases of our formulation.
We have presented a protocol conversion tool, SoCConvert, based on the proposed technique. An analysis over a number of SoC benchmarks show that our SER approach can handle complex IP protocol mismatches successfully. In addition, the previous conversion framework developed at ACEI employing CTL specifications [Sinha et al, 2009], has been extended to include the concepts of maximality and incremental design that were studied during the formulation of the above framework. Other extensions to this work include handling full-CTL specifications (before that, only negative-normal form CTL was accepted), and proof of incremental synthesis. This extended work has resulted in a publication [Sinha, Roop, Salcic & Basu, 2012] to be presented at DATE Conference 2012. The following table summarizes the related approaches found in the litterature and compares with our own SER-based approach:
Method | Specification | Data channels | Uncontrollable signals | Complexity | Compatibility |
[Passerone et al. 1998] | Implicit (DTR, latency) | 1-way buffers | No | Polynomial | No |
[Androutsopoulos et al, 2004] | Implicit (DTR) | 1-way buffers | No | Polynomial | No |
[Avnit et al, 2009] | Implicit (DTR) | 1-way buffers | No | Polynomial | Yes |
[Tivoli et al, 2007] | Implicit (no deadlocks) | 1-way buffers | No | Exponential | Yes |
[Cao & Nymeyer, 2009] | Explicit (temporal logic) | 1-way buffers | No | Exponential | Yes |
[Passerone et al, 2002] | Implicit | No | No | Polynomial | No |
[Sinha et al, 2004] | Explicit (CTL, bounds) | Multi-directional buffers | Yes | Exponential | No |
[Roop et al, 2009] | Explicit (LTS) | No | Yes | Polynomial | No |
Our SER approach | Explicit (LTS, bounds) | multi-directional buffers | Yes | Polynomial | Yes |
Two submissions are planned from the work that has been done on incremental converter synthesis: A conference and a journal paper are in preparation.
SystemJ[Malik, Salcic, Roop & Girault, Elsevier 2010] is a new system level design language, based on formal semantics, targeted at developing Globally Asynchronous Locally Synchronous (GALS) systems. SystemJ extends the imperative synchronous Esterel language [Berry, Ecole des Mines & INRIA, 2000] with asynchrony. SystemJ allows multiple synchronous islands called clock-domains to run together asynchronously, thereby resulting in a GALS execution model. SystemJ is accompanied with a compiler based on the Asynchronous GRaph Code (AGRC) format, targeting multiple different execution platforms, ranging from desktops and servers to purpose built reactive processor architectures. We have developed substantial applications with SystemJ, e.g., a smart distributed surveillance system [Malik, Salcic, Chong & Javed, TECS, 2011], to validate the compiler, the language, and the GALS model of computation itself.
Our work on reactive processors, supporting asynchronous and in particular the GALS execution model is also the very first. Our design is a multi-core architecture including; (1) a processor capable of directly executing the AGRC intermediate format, and (2) any data-processor capable of executing the data-driven computations. The resultant processor design, called the GALS-Hybrid Multi-Processor (GALS-HMP) [Malik, Salcic, Lee & Walker, TECS, 2011] is the first to handle real-world examples, unlike the previous reactive processors, which were an academic exercise due to lack of support for data-driven computations and asynchrony. Furthermore, our approach is shown to be scalable to a large number of cores, a criteria, which was explored unsuccessfully in previous reactive processor design attempts. Finally, our approach of supporting direct execution of the intermediate format rather than the language constructs makes the processor available to other system level language such as Esterel, which is a subset of SystemJ.
We have successfully designed and implemented a new programming language called DSystemJ, which is a conservative extension of SystemJ. It is primarily aimed at dynamic distributed systems, which are connected via networks. DSystemJ adopts the GALS model of computation (Globally Asynchronous Locally Synchronous), and is equipped with a rigorous mathematical semantics allowing for potential formal verification of dynamic distributed systems. DSystemJ allows runtime forking and destruction of processes and of their communication channels, reconfiguration of the system topology, and mobility of processes across networks and physical machines. Although currently targeted at desktop and server based systems, the concepts introduced in DSystemJ are equally applicable to dynamic Network on Chip (NoC) systems.
DSystemJ introduces five new syntactic
constructs to allow the design of dynamic systems with ease. Syntactic
constructs like ->
and run
create
asynchronous processes called clock-domains, and execute them at
runtime. Constructs such as send
and
receive
are used to communicate via channels between
these asynchronous processes. Mobility is inherent in DSystemJ
as it allows passing clock-domains via such communication
channels.
The main innovation in DSystemJ is the new rendezvous algorithm and its formal semantics introduced to communicate between clock-domains in the presence of mobility [Malik, Girault & Salcic, IEEE TPDS 2012]. There is no current language that supports, in the presence of mobility, a robust rendezvous algorithm like in DSystemJ. The language uses a combination of broadcast and point-to-point protocols to establish communication between clock-domains running on different physical machines.
DSystemJ is more expressive than other classical languages targeted at dynamic systems like JoCaml [Mandel and Maranget, INRIA No RR-6261 2008], and Occam-π [Welch and Barnes, LNCS 2005], because DSystemJ provides reactivity, mobility, and support for complex data computations for design of real systems, unlike in the aforementioned approaches. The only other solution that matches the expressive power of DSystemJ is the Java based library called JADE, designed at the Telecom Italia Labs (not to be mistaken with the JADE autonomic agent framework of the SARDES INRIA team). Even in this case, DSystemJ is a better solution as it provides a language based design environment thereby reducing the design and maintenance effort. Also, our experimental results are clearly in favor of DSystemJ, showing significantly faster performance and reduced memory footprints.
We have designed a compiler, and a runtime system for the DSystemJ language. The compiler is written in Java, while the runtime system is written in the SystemJ language itself. The use of SystemJ makes the runtime system easily extensible by the designer as there is no need to learn a new programming language. We have tested our implementation by designing and implementing a substantially large object tracking system using multiple physical machines within a large sub-net with positive results.
This work on DSystemJ has resulted in a large body of software that is available under a liberal license here. The startup company has been created to market the SystemJ compiler and programming environment. Besides the software, our work has resulted in a conference paper [Malik, Girault, Salcic, ACSD 2011] and a journal paper [Malik, Girault, Salcic, IEEE TPDS 2012].
In conclusion, we have met all but two of the objectives that we set out to achieve with the design and research of dynamic extensions to the SystemJ language (see Section III.2). Substantial further research effort is needed to design a debugger for DSystemJ, especially with mobility exacerbating this problem. Similarly, we need to think more carefully about verification of DSystemJ programs, because of their intractability under general conditions.
Our results on PRET programming and implementation are reported in the PRETZEL web page. We have developed the syntax and semantics of our PRET-C language and have proposed a new PRET processor execution [Andalam, Roop & Girault, MEMOCODE'10]. This article also compares the efficiency of PRET-C with similar light-weight, multithreaded C such as Protothreads [Dunkels et al., SenSys'06] and Syncharts in C [Plotkin, 1981]. These benchmarking experiments demonstrate the superiority of PRET-C over these languages, both in terms of worst case and average case execution time, while being competitive in terms of code size.
We have also developed the formal semantics of PRET-C using structural operational style semantics [Roop et al., CASES 2009]. The proposed semantics ensures the determinism of PRET-C programs, and PRET-C threads are causal by construction. A journal version of PRET-C syntax, semantics, compilation, and execution is currently (fall 2012) under minor revision at IEEE Transaction on Computers.
Concerning the timing analysis (item 3 of Section III.3), we have continued our initial model checking based formulation [Roop et al., CASES 2009]. We have now developed an automated tool to convert PRET-C programs into the input format for the UPPAAL model checker. This has enabled us to perform timing analysis over a range of large benchmark programs. We have also recently compared our approach with ILP-based approaches (Integer Linear Programming). This comparison reveals that our own approach yields both more scalable and tighter analysis compared to the ILP-based approaches. These results have appeared in a conference paper [Andalam, Roop & Girault, DATE 2011].
We have already developed most of the core parts of the tool chain. These are constantly updated, as we are exploring techniques for tighter and smarter timing analysis. We are currently lacking a complete an integrated development environment (IDE) combined with a debugger tailored for PRET-C.
In 2011, Sidharta worked on cache analysis with Jan Reineke (at that time postdoc at UC Berkeley, in the group of Prof. Edward Lee). We discussed and compared the "binary approach" (developed during Sidharta's visit in INRIA) with the absint approach. We concluded that, although the "binary approach" will be more precise, it is unsure how much more time it will take to analyze the program. We also developed a new abstraction to enhance the binary approach, based on a ternary representation of the cache states as guaranteed hit (h), guaranteed miss (m), and either miss or hit (nc). This "ternary approach" will be less precise than the "binary approach", but will result in a faster analysis time.
Secondly, we got access to the commercial tool aiT from absint. The tool is very impressive, and we used it to perform a simple cache analysis over the ARM architecture.
Thirdly, in collaboration with the PRET UC Berkeley research team, we started looking at the timing analysis problem from a new perspective. Instead of making the timing analyzer tightly coupled to an architecture (like the current approaches), we started designing a new architecture-parametric timing analyzer. The parameters describing the architecture will therefore be an input to the timing analyzer. This will allow the portability of hard real-time applications over different architectures.
Another key aspect is the balance between the precision of the WCET estimate and the analysis time. The analysis time is crucial for scalability. We have implemented the absint approach and the NUS approach (from National University of Singapore). Our initial benchmarking concluded that the NUS approach gives tighter estimates than the absint approach. However, the NUS approach takes a much longer time to analyze a program. For example, a program that takes NUS about 5 hours to analyze will only take absint a few seconds. Also, for larger programs the NUS approach either runs out of memory (3Gb) or takes too long (>12hrs) to analyze. We have concluded that the NUS approach will not scale for larger applications.
We have also implemented our "binary approach". It achieves the same precision as the NUS approach but consumes significantly less memory and takes a shorter time to analyze. However, during the benchmarking of large programs, we were unable to finish the analysis within a reasonable amount of time, so the "binary approach" is not very scalable.
We are therefore developing a new approach. Instead of operating at either ends of the spectrum (high precision or smaller analysis time), we have formulated a new "parameterized approach", in order to control the abstraction, which in turn affects the analysis time. This new approach allows us to explore intermediate points within the spectrum, making this approach very attractive. Moreover, the abstraction captured by the previously proposed ternary approach is equivalent to a special point in this new parameterized approach. Currently, we operate at the maximum abstraction during the analysis. Due to the implementation overheads, we have found our new parameterized approach to be more effective than the previously proposed ternary approach. We are currently benchmarking this parameterized approach against the absint and the NUS approaches, and we will submit a journal paper at the end of 2012.
Finally, concerning the FORE-C language, we have developed its syntax and the (informal) semantics. To the best of our knowledge, FORE-C is the first attempt at the design of a synchronous language for the deterministic parallel programming of embedded multicores. A FORE-C compiler has been developed that targets a general purpose multicore architecture with cores based on the Xilinx MicroBlaze softcore processor. The traditional approach of using shared variables for thread communication requires the programmer to manage parallel access from multiple threads with mutual exclusion protocols. These protocols impose a sequential order on thread execution, thus sacrificing parallelism. In contrast, FORE-C enables threads to access shared variables in parallel without needing to sequentialize.
We have also developed a reachability-based static timing analysis approach for FORE-C programs executed on multicores. Preliminary results have shown that very precise WCRTs are computed for a range of large programs. WCRT speedups were also computed for the same programs when distributed over additional cores. To decrease the analysis time, we developed optimisation techniques that decrease the size of the program's control-flow graph used as input into the analysis. Using the optimisations, we are able to reduce analysis time by two orders of magnitude, without affecting the precision.
To validate the FORE-C timing analysis work, we have developed a cycle-accurate multicore simulator with cores based on the Xilinx MicroBlaze softcore processor. An arbitrary number of cores connected to a shared TDMA bus can be simulated. The simulator is extensible in that custom peripherials can be created and attached to any processor or to the shared TDMA bus. The simulator is able to output to the user the executed instructions on each core.
[Sun, Salcic & Malik, 2010]: "LibGALS: a library for GALS systems design and modeling", in ASP DAC Conference, Taipei, Taiwan, January 2010.
[Andalam, Roop, Girault & Traulsen, 2010]: "Deterministic, predictable and light-weight multithreading using PRET-C -- Extended abstract", interactive presentation at DATE Conference, Dresden, Germany, March 2010.
[Malik, Salcic, Roop & Girault, 2010]: "SystemJ: A GALS Language for System Level Design", in Elsevier Journal on Computer Languages, Systems and Structures, vol 36(4):317-344, December 2010.
[Andalam, Roop & Girault, 2010]: "Predictable multithreading of embedded applications using PRET-C", in MEMOCODE Conference, Grenoble, France, July 2010.
[Malik, 2010]: First release of the DSystemJ compiler, April 2010.
[Malik, Girault & Salcic, 2010]: "The DSystemJ programming language for dynamic GALS systems: it's semantics, compilation, implementation, and run-time system", INRIA Research Report, No RR-7346, July 22, 2010, http://hal.inria.fr/inria-00505085/fr/.
[Andalam, Roop & Girault, 2011]: "Pruning Infeasible Paths for Tight WCRT Analysis of Synchronous Programs", in DATE Conference, Grenoble, France, March 2011.
[Kuo, Sinha & Roop, 2011]: "Efficient WCRT analysis of synchronous programs using reachability", in DAC conference, San Diego, USA, June 2011.
[Malik, Girault & Salcic, 2011]: "A GALS language for dynamic distributed and reactive programs", in ACSD conference, Newcastle, UK, 2011.
[Malik, Girault & Salcic, 2012]: "Formal semantics, compilation and execution of the GALS programming language DSystemJ", in IEEE Transactions on Parallel and Distributed Systems, 2012.
[Sinha, Roop, Salcic & Basu, 2012]: "Correct-by-Construction Multi-Component SoC Design", in DATE Conference, Dresden, Germany, March 2012.
[Yuan, Yoong & Roop, 2012]: "Efficient Compilation of Esterel for Multi-core Execution", INRIA Research report 8056, September 2012.
[Andalam, 2012]: "Predictable Platforms for Safety-Critical Embedded Systems", PhD thesis, University of Auckland, September 2012.
[Sinha, Girault, Goessler & Roop]: SoCConvert - a Java-based interactive tool for incremental converter synthesis using SER [to be released in 2012].
[Malik, Salcic, Chong & Javed]: "System-level approach to design of a smart distributed surveillance system using SystemJ", to appear in ACM Trans. on Embedded Computer Systems.
[Salcic, Malik]: "GALS-HMP: A Heterogeneous Multiprocessor for Embedded Applications", to appear in ACM Trans. on Embedded Computer Systems.
The following items will be conducted jointly by the POP ART and ACEI partners.
Extention of the current formulation to allow uncontrollable signals between components (uncontrollable in the sense of discrete controller synthesis [Ramadge and Wonham, 1987]). This is a madatory step towards incrementality.
Implementation of the converter synthesis algorithm, and benchmarking of this implementation.
Submit a journal article on our new formulation and solution.
In the current formulation, control signals and data signals are distinguished. Control signals are handled as transition events in our LTSs, and they are directly managed by our discrete controller synthesis based formulation. In contrast, data signals are handled as buffers for which we only control the number of bits that they contain. We seek to unify these two formulation under a common formalism based on Kahn process networks.
The following items will be conducted jointly by the POP ART and ACEI partners.
The first concept and implementation of DSystemJ is based on the assumptions of unbounded computing resources and especially memories. Also, DSystemJ depends on Java as it compiles at the end to Java byte-codes and executes on the JVM. Our first task for 2011 shall be to look beyond Java as an underlying language and work on specific run-time support that will enable dynamic GALS systems on arbitrary standard platforms that do not require Java. This work will exploit the GALS library of Wei-Tsun Sun [Sun, Salcic & Malik 2010] and extend it for the dynamic aspects of DSystemJ.
We shall also investigate how existing operating systems can be used to provide the elements of run-time support for dynamic GALS languages based on standard programming languages (C and C++).
Finally, we shall investigate optimized code generation of DSsytemJ programs for high-performance platform, and in particular MPSoCs. This differs from the current distributed code generation of DSystemJ because is raises issues of mapping efficiently the DSystemJ clock-domains to the cores, of optimizing the channel based communications on the NoC, and of memory hierarchy. This last task is really exploratory and, with the recent move of Avinash Malik to IBM in Dublin, progress will depend on what manpower will be available to tackle it in 2011 and 2012.
Items (5) below will be mostly done by the ACEI partner. All the other items will be conducted jointly by the POP ART and ACEI partners.
Work on the timing analysis of PRET-C programs running on processors with caches. This is a crucial point to allow the design and implementation of large application with PRET-C. This task may look daunting, but we believe that the global synchronization barrier between parallel threads of PRET-C (the so-called EOT statement) makes it achievable. Indeed, it will be possible to refresh all the cache entries upon each synchronization barrier, therefore removing the cache non-determinism due to the control flow branches taken by the code above the barrier. We have been discussing this issue with Reinhard Wilhelm (from the University of Saarland).
Develop a scratchpad memory allocation algorithm. In the context of PRET programming, scratchpad memories present a nice alternative to standard caches because they are easier to control at a software level. This is also the approach followed by the UC Berkeley team of Edward Lee.
Evaluate and compare the predictability and throughput of PRET-C programs between processors equipped with caches and with scratchpad memories. We will then submit a journal and a conference paper on our timing analysis of our memory hierarchy work.
Depending on the available manpower, we will complete the Integrated Design Environment (IDE). This requires the following items: (i) A PRET-C Editor. (ii) A TCCFG viewer. (iii) A timing analysis report. (iv) Based on the Worst Case Reaction Time path, the IDE will highlight both the TCCFG and the PRET-C source file with timing information and will highlight the path leading to WCRT.
A final research direction is the use of multicore processors for the execution of PRET programs. This will be a significant step that will be started in 2011 and will be continued all the way through to the end of the project. This will imply the design of a new programming language FORE-C to facilitate distribution of threads over cores, so as to achieve both predictable execution and good throughput. FORE-C will be derived from PRET-C but with significant differences, the main one being that PRET-C threads are executed sequentially on a single core, FORE-C threads are executed in real concurrency; this has a major impact on how shared variables are declared, handled, and compiled. This action will involve Eugene Yip who is doing his with Partha Roop in the ACEI team.
We shall work on the formalization of the FORE-C semantics and on extending the FORE-C timing analysis to include cache hierarchy in the multicore system. Although this work can leverage the cache analysis work developed for PRET-C, the sharing of resources in the multicore setting complicates the analysis. In tandem, the multicore simulator will be extended to include a cache hierarchy for validation purposes.
March 2010: Sidharta Andalam and Alain Girault attended DATE'10 in Dresden to present a joint paper on PRET-C. Presenting PRET-C at the DATE conference has given us an opportunity to discuss our work with the leading researchers in the field. We received a positive response: firstly they liked the new paradigm we offer for real-time systems designers, and they were also eager to play with our tools. Secondly they were intrigued by our methods of using model checker for timing analysis. However, some were skeptical on how our work will scale for larger applications.
March 2010: Sidharta Andalam visited INRIA / POP ART to discuss the current status of the tools and designed an experiment to compare our work with software verifiers such as CBMC. Sidharta developed a tool that converts PRET-C into annotated assembly code, allowing CBMC to perform timing analysis. This experimented shows that CBMC takes significantly longer time to analysis the program compared to our approach. Sidharta also discussed with Bertrand Jeannet about model-checking and the use of abstract interpretation for timing analysis of PRET-C. We developed a new heuristic to increase the tightness of our timing analysis.
May 2010: Zoran Salcic visited INRIA / POP ART to work with Alain Girault and Avinash Malik on the DSystemJ language for programming dynamic GALS systems. DSystemJ extends the GALS language SystemJ with dynamic creation of clock-domains. So far, we have completed the initial design of the language and its first implementation (i.e., the compiler and the run-time system) for desktop-based distributed execution platforms. A conference article has been submitted and a journal article is under construction.
July 2010: Roopak Sinha visited INRIA / POP ART for 2 weeks. The main aim of this visit was to develop a formal protocol conversion technique between protocols with control and data-width mismatches in collaboration with Alain Girault and Gregor Goessler. Significant progress towards this aim was made, and a sound formulation has been developed.
July 2010: Sidharta Andalam visited INRIA / POP ART for 2 weeks, to work with Alain Girault and to present our paper on the PRET-C programming language at the MEMOCODE'10 conference [Andalam, Roop, Girault, 2010]. We received a very positive feedback on PRET-C from the MEMOCODE attendees.
July 2010: Avinash Malik visited University of Erlangen-Nuremberg in Germany, for 1 week, to present our work and discuss further collaboration on development of tools for DSystemJ. This visit was hosted by Zoran Salcic who was on a sabbatical leave at the University of Erlangen-Nuremberg at that time. During the visit, it was decided that DSystemJ would be a suitable language for programming innovative object tracking systems, and the group at the University is now in the process of further researching integration of their existing tools with DSystemJ.
November 2010: Alain Girault visited U
of Auckland / ACEI to work on PRET-C and incremental converter
synthesis. Concerning the PRET-C language, extensions have been
planned to relax the constrain that each loop should contain an
EOT. We now distinguish two kinds of loops: temporal loops that
contain at least an EOT, and combinatorial loops that contain no EOT
but for which the programmer must provide a bound on the loop
counter. For instance, the following loop while (1) {...}
#100
has a bound equal to 100
.
December 2010: Partha Roop and Li-Hsien Yoong visited INRIA / POP ART to attend the SYNCHRON Workshop in Fréjus and to work on AFMES topics, namely: (1) on converter synthesis with Alain Girault and Gregor Goessler, (2) and on WCRT analysis with Bertrand Jeannet and Alain Girault. Regarding WCRT analysis, new plans have been made to combine our current technique with abstract interpretation and loop acceleration. This will allow us to implement a new static analysis that will first compute invariants on the program's variables, then will analyze the cache replacement policy (LRU) to determine the cache hits and misses, and will finally use the results of those two analyses to prune much more aggressively the infeasible paths in the control flow graph of the program to compute a tighter WCRT value.
March 2011: Sidharta Andalam visited INRIA for a period of 2 weeks, to work with the POP ART team and to present our paper on static timing analysis of PRET-C at the DATE'11 conference [Andalam, Roop, Girault, 2011]. Regarding cache analysis, we have discussed different cache replacement policies for the instruction cache and examined current literature based on Abstract Interpretation. For a simple direct mapped cache (no replacement policy), we (Sidharta Andalam, Pascal Sotin, Bertrand Jeannet, Alain Girault) have formulated a binary abstraction to capture the cache behavior: a given memory access will either be a hit or a miss. Then, we have developed a tool to generate structured code which captures cache behavior; this code is then analyzed by Bertrand Jeannet's ConcurInterproc tool to calculate all possible cache states at all control points. This allows us to calculate the number of cache hits and cache misses. We will now continue the evaluation (testing for scalability) of this new approach, and see how we can extend to N-way associative caches, with the LRU cache replacement policy ("Least Recently Used").
March 2011: Sidharta Andalam visited the group of Edward Lee at UC Berkeley for a period of two months. Although this visit may seem out of the scope of AFMES (because it took place neither in Grenoble nor in Auckland), it is in fact perfectly relevant scientifically within our third research axis, "Time predictable programming language and execution architectures". Besides, this visit was made possible thanks to Alain Girault's contact with Edward Lee at UC Berkeley.
Sidharta worked with Jan Reineke and Edward Lee on the comparison of several approaches for cache analysis: the so called absint approach, the NUS approach, and our two onw proposals, namely the ternary and the parameterized approaches. This is detailed in the RESULTS section.
November 2011: Partha Roop attended the SYNCHRON'11 workshop (Damarie-les-Lys, France). He presented his work on the design of the TickPAD memory allocation scheme for synchronous programs. TickPAD is a special type of scratchpad memory tailored for efficient and predictable execution of synchronous programs.
Partha also had meetings with other researchers such as Gregor Goessler (INRIA), Jacky Potop (INRIA), Michael Mendler (U. Bamberg) and Reinhard von Hanxleden (U. Kiel). These meetings were in the direction of collaborative research efforts between INRIA, Auckland, and the German groups towards the design of Precision Timed Computers.
November 2011: Zoran Salcic attended the SYNCHRON'11 workshop (Damarie-les-Lys, France). He prepared a joint presentation on the DSystemJ programming language and the use of the dynamic GALS MoC with Avinash Malik and Alain Girault, which Avinash delivered. Zoran also had talks with Alain and Avinash about the preparation of publications on dynamic GALS in standard C/C++ language setting (one conference and one journal publication are in preparation) and new research directions as the continuation of the current work. Zoran also met a number of colleagues with whom he discussed the real-time aspects of the SystemJ programming language and its efficient execution in embedded settings.
November 2011: Eugene Yip visited INRIA / POP ART and attended the SYNCHRON'11 workshop (Damarie-les-Lys, France). He discussed his PhD which includes work on synchronous languages. At the SYNCHRON workshop, Eugene presented his current work on the timing analysis of synchronous programs executed on multicores and conversed with top researchers in synchronous languages. At INRIA, Alain worked with Eugene on a variant of the PRET-C programming language, called FORE-C, which is amenable for execution on multicores. First, they worked on alternate approaches to implementing parallelism and discussed the tradeoffs between each approach. Second, the semantics of FORE-C was discussed, where Eugene gained a deeper understanding of formal semantics from Alain. Overall, SYNCHRON and INRIA were ideal venues for feedback on Eugene's PhD work and has been invaluable in shaping his PhD further.
February 2012: Alain Girault visited U of Auckland / ACEI to work on PRET-C, on FORE-C, incremental protocol conversion, and LibDGALS. Many discussions with colleagues Salcic, Roop, Andalam, Sinha, Yip, and Sun took place. Good progresses were made on: (i) PRET-C (a journal paper was submitted), in particular on instruction cache analysis; (ii) incremental converted conversion, in particular with quantitative analysis; (iii) requirement engineering, in particular on impact analysis; (iv) and LibDGALS, in particular on the implementation of fault-tolerant reconfigurable systems.
March 2012: Partha Roop visited INRIA / POP ART to attend the DATE'12 conference. He worked with Alain Girault and Gregor Goessler on PRET-C and on incremental protocol conversion with quantitative analysis. (The trip was funded by UoA, while the DATE registration and the stay were funded by INRIA-AFMES for 2000 €).
Fall 2012: Eugene Yip is currently visiting INRIA / POP ART (October 1st to December 15th, 2012). He will attend the SYNCHRON'12 workshop. We shall work on the FORE-C time-predictable programming language. (The trip will be funded by UoA, while the stay will be funded by INRIA-AFMES).
Fall 2012: Sidharta Andalam will visit INRIA / POP ART to work on PRET-C and attend the SYNCHRON'12 workshop.
Fall 2012: Starting in December 2012, Wei-Tsun Sun will be postdoc at INRIA / POP ART to work on runtime support for dynamic creation and process mobility.