Most of the West Team's recent publications are available here online. You can hide the abstracts.
Previous publications are also available.
The bibliography was generated with a custom version of the bib2xhtml
package.
With the increasing number of transistors that can be integrated on a chip and the constant expectation for more sophisticated applications, the design of Systems-on-Chip (SoC) is more and more complex. In this paper, we present the use of model transformations in the context of SoC co-design. Both the hardware part and the software part of a SoC can be represented as a model using the MARTE standard from the OMG. We introduce the use of Model-Driven Engineering to generate executable code from a self-contained model of the SoC. First, we detail the restrictions and extensions we have brought to the MARTE profile in order to permit the complete description of the SoC as a model. The compilation is a sequence of small, maintainable transformations that gradually refines a high-level description into models closer in abstraction to the final model, which is then converted into code. An in-depth view of one of the several transformation chains composing our tool is given. The implementation relies on our experimental Java-based transformation engine, which uses a hybrid declarative-imperative language. We then discuss why model transformations fit the compilation of SoCs better than traditional compilers do. In particular, the reuse of transformations can greatly help cope with the fast evolution of SoC design, reducing development time. Additionally, as each rule is small and relatively self-contained, its correctness is easier to ensure, which leads to more reliable compilation and, indirectly, more reliable SoCs.
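For readers unfamiliar with rule-based compilation chains, the sketch below illustrates the general idea of composing small, self-contained transformation rules over an in-memory model. It is only an illustration in C++: the model structure, the rule names and the "Task to Thread" lowering are hypothetical and do not reproduce the paper's Java-based engine or its hybrid declarative-imperative language.

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// A toy in-memory "model": a tree of typed elements.
struct Element {
    std::string kind;                              // e.g. "Application", "Task"
    std::vector<std::shared_ptr<Element>> children;
};

// One small, self-contained transformation rule: a match test and a rewrite.
struct Rule {
    virtual ~Rule() = default;
    virtual bool matches(const Element& e) const = 0;
    virtual void apply(Element& e) const = 0;
};

// Hypothetical lowering rule: every "Task" element becomes a "Thread" element.
struct TaskToThread : Rule {
    bool matches(const Element& e) const override { return e.kind == "Task"; }
    void apply(Element& e) const override { e.kind = "Thread"; }
};

// The chain traverses the model and applies each rule where it matches.
void run_chain(Element& root, const std::vector<std::shared_ptr<Rule>>& chain) {
    for (const auto& rule : chain)
        if (rule->matches(root)) rule->apply(root);
    for (auto& child : root.children)
        run_chain(*child, chain);
}

int main() {
    Element app{"Application", {std::make_shared<Element>(Element{"Task", {}})}};
    std::vector<std::shared_ptr<Rule>> chain{std::make_shared<TaskToThread>()};
    run_chain(app, chain);
    std::cout << app.children[0]->kind << "\n";    // prints "Thread"
}
```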
As System-on-Chip (SoC) architectures become pivotal for designing embedded systems, SoC design complexity continues to increase exponentially, necessitating new design methodologies. In this paper we present a novel SoC co-design methodology based on Model-Driven Engineering using the MARTE (Modeling and Analysis of Real-time and Embedded Systems) standard. This methodology is used to model fine-grain reconfigurable architectures such as FPGAs and extends the standard to integrate new features such as the Partial Dynamic Reconfiguration supported by modern FPGAs. The goal is to carry out modeling at a high abstraction level expressed in UML (Unified Modeling Language) and, following transformations of these models, to automatically generate the code necessary for FPGA implementation.
Digital television (DTV) is an advanced broadcasting technology that is spreading fast today. It gives broadcasters the capability to send programs with better picture and sound quality. Moreover, broadcasters can send several programming choices, called multicasting. DTV consists of a high-performance system combining both control and intensive data processing. In this paper, we first show how the OMG MARTE profile can serve to model such a system. Then, we use the synchronous approach to formally check some temporal properties of the expected system implementation for validation purposes.
This paper presents an approach for the modeling and formal validation of high-performance systems. The approach relies on the repetitive model of computation used to express the parallelism of such systems within the Gaspard framework, which is dedicated to the co-design of high-performance systems-on-chip. The system descriptions obtained with this model are then projected onto the synchronous model of computation. The result of this projection is an equational model that allows one to formally analyze clock synchronizability issues so as to guarantee the reliable deployment of systems on platforms.
The Array-OL specification model is a mixed graphical-textual language designed to model multidimensional intensive signal processing applications. Data and task parallelism are specified directly in the model. High-level transformations are defined on this model, allowing the refactoring of an application and furthermore providing directions for optimization. The resemblance to the widely known and used loop transformations leads us to borrow concepts and results from that domain and see how they fit in the Array-OL context.
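For context, the tiling relation at the core of Array-OL is usually presented in the literature roughly as below; this rendering is a standard textbook form, not a formula quoted from this particular paper, and the symbol names follow the common convention.

```latex
% Array-OL tiler: array element accessed by repetition index r and pattern index i
% (o: origin vector, F: fitting matrix, P: paving matrix, s: array shape)
\mathbf{e}_{\mathbf{r},\mathbf{i}} \;=\;
  \bigl(\mathbf{o} + F\,\mathbf{i} + P\,\mathbf{r}\bigr) \bmod \mathbf{s}_{\mathrm{array}},
\qquad \mathbf{0} \le \mathbf{i} < \mathbf{s}_{\mathrm{pattern}},\quad
       \mathbf{0} \le \mathbf{r} < \mathbf{s}_{\mathrm{repetition}}
```

Loop transformations such as fusion or tiling then correspond to rewritings of the paving and fitting matrices, which is what motivates the comparison made in the abstract.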
System-on-Chip (SoC) architectures are becoming the preferred solution for implementing modern embedded systems. However, their design complexity continues to grow due to the increase in integrated hardware resources, requiring new design methodologies and tools. In this paper we present a novel SoC co-design methodology based on a Model-Driven Engineering framework utilizing the MARTE (Modeling and Analysis of Real-time and Embedded Systems) standard. This methodology permits us to model fine-grain reconfigurable architectures such as FPGAs and allows us to extend the standard to integrate new features such as the Partial Dynamic Reconfiguration supported by modern FPGAs. The overall objective is to carry out modeling at a high abstraction level expressed in a graphical language like UML (Unified Modeling Language) and, through subsequent transformations of these models, to automatically generate the specifications required for FPGA implementation.
In this paper, we present a methodology which allows OpenMP code generation and makes the design of parallel applications easier. The methodology is based on the Model-Driven Engineering (MDE) approach. Starting from UML models at a high abstraction level, OpenMP code is generated through several metamodels which have been defined. Results show that the produced code is competitive with optimized code.
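To make the target concrete, the generated code is of the kind sketched below: a data-parallel repetition mapped onto an OpenMP parallel loop. This hand-written fragment (array sizes and the elementary operation are invented) only illustrates the output form; it is not produced by the tool described in the paper. Compile with OpenMP enabled, e.g. -fopenmp.

```cpp
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
    const int N = 1 << 20;
    std::vector<float> in(N, 1.0f), out(N);

    // Hypothetical elementary task applied to every element of the repetition space.
    #pragma omp parallel for
    for (int i = 0; i < N; ++i)
        out[i] = 2.0f * in[i] + 1.0f;

    std::printf("out[0] = %f, threads available = %d\n",
                out[0], omp_get_max_threads());
    return 0;
}
```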
As System-on-Chip (SoC) design complexity continues to increase rapidly, new design methodologies are required to manage it. In this paper we present a novel SoC co-design methodology based on Model-Driven Engineering using the MARTE (Modeling and Analysis of Real-time and Embedded Systems) standard. We use this methodology to model fine-grain reconfigurable architectures such as FPGAs and extend the standard to integrate new features such as the Partial Dynamic Reconfiguration supported by modern FPGAs. The goal is to carry out modeling at a high abstraction level expressed in UML and, following transformations of these models, to automatically generate the code necessary for FPGA implementation.
The increasing amount of hardware resources in next-generation MultiProcessor Systems-on-Chip (MPSoC) calls for efficient design methodologies and tools to reduce their development complexity. This paper presents a candidate MPSoC design environment, Gaspard2, which uses the MARTE (Modeling and Analysis of Real-Time and Embedded systems) standard profile for high-level system specification. Gaspard2 adopts a methodology based on Model-Driven Engineering. It promotes separation of concerns, reusability and automatic model refinement from higher abstraction levels down to executable descriptions.
Modern Systems-on-Chip (SoCs) are becoming more complex with the integration of heterogeneous components. Therefore, a high-performance interconnection medium is required to handle this complexity. Networks-on-Chip (NoCs) thus come into play, enabling the integration of more Intellectual Properties (IPs) into the SoC with increased performance. NoCs are based on the concept of interconnection networks for connecting parallel machines. In the recent MARTE (Modeling and Analysis of Real-time and Embedded Systems) profile, a notion of multidimensional multiplicity has been proposed to model repetitive structures and topologies. This paper presents a modeling methodology based on that notion that can be used to model the Delta Network family of interconnection networks for NoC construction.
Memory units play a role of prime importance in embedded multiprocessor architectures (Multi-Processor Systems-on-Chip, MPSoC). In this article, we focus in particular on shared-memory MPSoCs, where the management of data coherence is a crucial point of the architecture. Our study shows that caching shared data yields gains in performance and power consumption; embedded multiprocessor architectures without cache coherence are therefore not efficient. This study also shows that a judicious choice of the implementation of this coherence management is necessary to offer applications gains in performance and energy consumption. Two methods are usually proposed to solve this problem: those that invalidate the data and those that update the data. Both methods have several drawbacks. On one hand, their implementation in MPSoCs using complex NoCs can incur a significant cost in energy consumption; on the other hand, they do not take into account the memory access patterns of the applications. The method we propose in this article attempts to remedy these two weaknesses. It takes advantage of the two previous approaches and relies on an original architecture that eases its implementation. Preliminary results have shown that interesting gains in performance and energy consumption can be obtained with this method.
In this paper, we present a framework for shared-memory architectures that makes the design of parallel applications easier. We use the Model-Driven Engineering (MDE) approach and integrate new metamodels in Gaspard for each step of the design flow. The target model is an OpenMP metamodel, from which we immediately derive source code in OpenMP Fortran or OpenMP C. This model-based approach allows better reuse and also gives a better, more hierarchical view of the application so that it can better fit the architecture.
Field-programmable gate arrays (FPGAs) provide an interesting solution when custom logic is needed for products with a short time to market. Products embedding FPGA system-on-chip solutions can be updated once deployed. Recent FPGA architectures, such as the Xilinx Virtex series, allow partial and dynamic run-time reconfiguration (PDR): the FPGA fabric can modify its configuration data at run-time, enabling the substitution of specific portions of an implemented hardware design and allowing the system to adapt to the needs of the application. PDR thus brings reliability, failure redundancy and run-time adaptivity, which are critical aspects for embedded systems. In this paper, we present work in progress to provide high-level modeling of PDR and a design flow to automatically generate VHDL code from these high-level models according to QoS criteria such as reconfiguration time and area, and power consumption.
Software engineering technologies are not reserved for the development of Web applications. We show in this paper that the model-driven approach, combined with models that allow the parallelism of an application to be expressed at a high level of abstraction, makes it possible to automatically produce VHDL code from models expressed in UML. The kinds of applications concerned cover systematic signal processing with a control part. The transportation sector, in particular the automotive industry, is thereby a privileged application domain of our design environment.
The increasing complexity of embedded system designs calls for high-level specification formalisms and for automated transformations towards lower-level descriptions. In this report, a metamodel and a transformation chain are defined from a high-level modeling framework for data-parallel systems, Gaspard, towards a formalism of synchronous equations. These equations are translated into synchronous dataflow languages, such as Lustre, Lucid Synchrone and Signal, which provide designers with formal techniques and tools for validation. In order to benefit from the methodological advantages of reusability and platform independence, a Model-Driven Engineering approach is applied.
In this paper, we first present an efficient Multi-Processor Systems-on-Chip design methodology based on Model-Driven Engineering. Then, a deployment profile is introduced to allow IP reuse and to carry multilevel implementation details. With this methodology, simulations at different levels are automatically generated, reducing the cost of targeting several levels. A compilation chain has been developed to transform the high-abstraction-level models into both CABA and PVT simulation levels. The effectiveness of the methodology is illustrated by the development of an H.263 encoder.
In this paper, we present an efficient Multi-Processor Systems-on-Chip (MPSoC) design flow. It is based on a Model-Driven Engineering (MDE) approach. A compilation chain has been developed to transform the high abstraction level into both Cycle Accurate Bit Accurate (CABA) and Timed Programmer's View (PVT) SystemC simulations. We use the standard MARTE profile to represent MPSoC systems. This representation separates the application, the hardware architecture and the corresponding allocation. Then, through several model-to-model transformations, we generate the SystemC code of the modeled MPSoC system.
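For readers unfamiliar with the target language, the chain emits structural SystemC of roughly the shape below. This hand-written fragment is only illustrative (the module, its ports and the signal names are invented); it is not output of the Gaspard tooling and shows neither the CABA nor the PVT component libraries.

```cpp
#include <systemc.h>
#include <iostream>

// Illustrative hardware component: a combinational adder.
SC_MODULE(Adder) {
    sc_in<int>  a, b;
    sc_out<int> sum;
    void compute() { sum.write(a.read() + b.read()); }
    SC_CTOR(Adder) { SC_METHOD(compute); sensitive << a << b; }
};

int sc_main(int, char*[]) {
    sc_signal<int> sa, sb, ssum;
    Adder adder("adder");
    adder.a(sa); adder.b(sb); adder.sum(ssum);   // structural binding

    sa.write(2); sb.write(3);
    sc_start();                                   // run until no more events
    std::cout << "sum = " << ssum.read() << std::endl;  // prints 5
    return 0;
}
```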
With the advent of Multi-Processor Systems-on-Chip (MPSoC), the need for modeling the distribution of a parallel application onto a parallel hardware architecture is increasing. The recent standard profile for the modeling and analysis of real-time and embedded systems (MARTE) provides a notation for the modeling of regular distributions. This notation allows computations to be distributed to processing elements, data to shared or distributed memories, and so on. In this paper we highlight the expressivity of this notation and clarify its usage through examples and comparisons with other distribution notations such as those of High Performance Fortran.
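As a concrete point of comparison, the regular distributions mentioned here are of the same family as HPF's CYCLIC(k): element i of a one-dimensional array is owned by processor (i / k) mod P. The short sketch below (processor count and block size are arbitrary) just evaluates that mapping; it is not MARTE notation.

```cpp
#include <cstdio>

// Owner of element i under a CYCLIC(k) distribution over P processors,
// as in High Performance Fortran.
int owner(int i, int k, int P) { return (i / k) % P; }

int main() {
    const int P = 4, k = 2;          // hypothetical: 4 processors, block size 2
    for (int i = 0; i < 12; ++i)
        std::printf("element %2d -> processor %d\n", i, owner(i, k, P));
    return 0;
}
```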
The increasing complexity of embedded system designs calls for high-level specification formalisms and for automated transformations towards lower-level descriptions. In this paper, a metamodel and a transformation chain are defined from a high-level modeling framework for data-parallel systems, Gaspard, towards a formalism of synchronous equations. These equations are translated into synchronous dataflow languages, such as Lustre, which provide designers with formal techniques and tools for validation. In order to benefit from the methodological advantages of reusability and platform independence, a Model-Driven Engineering approach is applied.
This document describes the current UML profile of Gaspard2. This profile extends the UML semantics to allow the user to describe a SoC (System-on-Chip) in three steps: the application (the behavior of the SoC), the hardware architecture, and the association of the application with the hardware architecture. The application is represented following a data-flow model, but additional mechanisms permit the usage of control flow on those applications. In addition to those notions, the profile contains a package introducing factorization mechanisms to enable the compact description of massively parallel and repetitive systems.
This document describes the current Gaspard2 UML profile. This profile extends the UML semantics to allow the user to describe a SoC (System-on-Chip) in three steps: the application (the behavior of the SoC), the hardware architecture, and the association of the application with the architecture. The application is represented according to a data-flow model, but additional mechanisms allow the use of control flow on these applications. In addition to these notions, the profile contains a package introducing factorization mechanisms that make possible the compact description of massively parallel, repetitive systems.
Manipulating configurable resources like FPGAs in a co-design framework has become essential: in particular, FPGAs can efficiently implement parallel systematic signal processing tasks. Nevertheless, such implementations are usually hand-written at a low level. Our proposition is to provide high-level modeling of an application and tools to automatically generate tuned VHDL code from these high-level models. This paper introduces a new flow able to fit a parallel application onto an FPGA according to the FPGA characteristics, such as computing power and I/Os. The flow is based on iterative refactoring and transformations of the application. VHDL code generation is then launched from the resulting application, and the generated code is finally used to simulate or synthesize the application. Significant experiments have validated the approach.
To use the tremendous hardware resources available in next-generation MultiProcessor Systems-on-Chip (MPSoC) efficiently, rapid and accurate design space exploration (DSE) methods are needed to evaluate the different design alternatives. In this paper, we present a framework that makes fast simulation and performance evaluation of MPSoC possible early in the design flow, thus reducing the time to market. In this framework, and within the Transaction Level Modeling (TLM) approach, we present a new definition of the timed Programmer's View (PVT) level by introducing two complementary modeling sublevels. The first one, PVT Transaction Accurate (PVT-TA), offers a high simulation speedup factor over Cycle Accurate Bit Accurate (CABA) level modeling. The second one, PVT Event Accurate (PVT-EA), provides better accuracy with a still acceptable speedup factor. An MPSoC platform has been developed using these two sublevels, including performance estimation models. Simulation results show that the combination of these two sublevels gives a high simulation speedup factor of up to 18 with a negligible performance estimation error margin.
Manipulating configurable resources like FPGAs in a co-design framework has become essential: in particular, FPGAs can efficiently implement parallel systematic signal processing tasks. Nevertheless, such implementations are usually hand-written at a low level. Our proposition is to provide high-level modeling of an application and tools to automatically generate tuned VHDL code from these high-level models. This paper introduces a flow able to fit a parallel application onto an FPGA according to the FPGA characteristics, and to map this application in a structural way onto the FPGA. Each step of the flow requires specific details of the FPGA that realizes the implementation. To obtain these details, several views of a same FPGA are introduced: black box, quantitative and physical. The flow automatically generates the VHDL code of the initial application and a constraint file that guides the synthesis tools.
Modern SoCs are becoming more complex with the integration of heterogeneous components (IPs). A high-performance interconnection medium is therefore required to handle this complexity. Hence NoCs come into play, enabling the integration of more IPs into the SoC with increased performance. These NoCs are based on the concept of interconnection networks used to connect parallel machines. In response to the MARTE RFP of the OMG, a notation of multidimensional multiplicity has been proposed that permits the modeling of repetitive structures and topologies. This report presents a modeling methodology based on this notation that can be used to model a family of interconnection networks called Delta Networks, which in turn can be used for the construction of NoCs.
In order to unify our internal exchange and communication about transformations, we propose TrML (Transformation Modeling Language), a unified UML notation to design model transformations. This proposal aims to reify the synthesis of existing notations dedicated to transformation modeling. TrML is independent from implementation details and can be adapted to several transformation engines. To let TrML run on top of an existing engine, we transform the TrML model into a model accepted by that engine. But which language should we use for this first transformation: TrML itself, the language of the targeted engine, or another one? In this article we describe how we bootstrap our new language on top of existing transformation engines.
MppSoC is a SIMD architecture composed of a grid of processors and memories connected by an X-Net neighbourhood network and a general-purpose global router. MppSoC is an evolution of the famous massively parallel systems proposed at the end of the eighties. We claim that today such a machine may be integrated in a single chip. On one side, new design methodologies such as IP reuse and, on the other side, the possible high level of integration on a chip let us envisage such a revival. Some improvements of the system architecture are possible because of the high degree of integration: the mppSoC processing elements share most of their design with the control processor, and the integrated network allows data exchange between PEs, but also between the control processor and the PE memories, and even connects the external devices to the system. This paper presents the mppSoC architecture, a cycle-accurate bit-accurate SystemC simulator of this architecture, and a prototype implementation on FPGA. A complete tool chain and the execution of some applications on the simulator and the FPGA implementation validate the modeling choices and show the effectiveness of this design.
MppSoC is a SIMD architecture composed of a grid of extended MIPS R3000 processors, called Processing Elements (PEs). This embedded system gives interesting performance in several modern applications based on parallel algorithms. Communication is clearly a key issue in such a system. In fact, regular communications between the PEs are handled by an X-Net network, while point-to-point connections are very tedious to realize using such a network. We present in this paper a model and an implementation of a communication network called mpNoC. This IP permits non-regular communications between PEs in an efficient way. MpNoC is integrated in the mppSoC platform.
A SoC («System on Chip») is an integrated circuit made of a set of hardware components (microprocessors, DSPs, input/output devices, etc.) interconnected through communication buses and a software layer (application and real-time operating system). The design of such systems tends to be based mostly on IP (Intellectual Property) reuse. The designer uses IPs from different sources with heterogeneous models (different abstraction levels). This approach reduces the time to market, but its application requires new design methodologies. Gaspard proposes a methodology based on Model-Driven Engineering (MDE) for SoC design. It targets the use of many simulation and execution platforms (Java, OpenMP, SystemC, VHDL, etc.) at different levels of abstraction (TLM, RTL, etc.). Models of the different platforms and abstraction levels are generated in Gaspard by model transformations. The heterogeneity of the targeted platforms leads to an interoperability problem. In this thesis, we propose an MDE-based solution to this interoperability problem. This solution is carried out in three steps. First, we introduce traceability in model transformations; a trace model is therefore generated along with the model transformations. This trace model is then used as the input model of a transformation which generates an interoperability bridge model. Finally, the code for interoperability is generated from the bridge model of the previous step. In order to automate the process, metamodels for traceability and for the interoperability bridge have been designed. We also provide the description of the different transformations involved in the process.
A System-on-Chip (SoC) is an integrated circuit comprising a set of hardware components (microprocessors, DSPs, input/output devices, etc.) interconnected by communication buses, together with a software layer (real-time operating system and application). The design of such systems relies more and more on the reuse of virtual components (IPs, for Intellectual Property). The designer very often uses IPs of diverse origins with heterogeneous models (different abstraction levels: behavioral, RTL, etc.). This approach improves the time to market, but it demands new design methods from the designer. Gaspard proposes a methodology based on Model-Driven Engineering (MDE) for SoC design. It targets the use of several simulation platforms (Java, OpenMP, SystemC, VHDL, etc.) and different abstraction levels (TLM, RTL, etc.). The models of the different platforms and abstraction levels are generated in Gaspard by model transformations. The heterogeneity of the targeted platforms introduces an interoperability problem. In this thesis, we propose an MDE-based approach to answer this need for interoperability. This solution is elaborated in three steps. First, we introduce traceability into model transformations; a trace model is then generated during the model transformation phases. This trace model is next used as the input of a transformation that generates an interoperability bridge model. Finally, the code of the interoperability bridge is generated from the bridge model. To automate this process, we have defined a traceability metamodel and an interoperability bridge metamodel. The various model transformation operations required have also been described.
The work presented in this thesis addresses the modeling of high-performance systems-on-chip. These systems are based on systematic signal processing applications, which process multidimensional data. It is therefore important to use well-adapted models that are able to take this multidimensional aspect into account. We present the existing models of computation used for the specification of multidimensional applications. Then, we focus on the Array-OL model, which relies only on the specification of data dependencies. However, this model does not support the modeling of control behaviors, which are generally useful in the description of some signal processing applications. The goal of our work is then to propose a specification model that introduces the concept of control into the Array-OL model. To do this, we study existing work in the field of synchronous reactive systems, and in particular work enabling the description of hybrid systems. This study allows us to define a design methodology which clearly separates control and data processing. We discuss the advantages of this methodology for formal verification, and we illustrate its application to the design of an automotive system. After that, we propose an approach based on a concept of degree of granularity to specify control in Array-OL. We also study the possibility of extending this concept to multiple degrees of granularity to enable the modeling of more complex applications with different control parts. Finally, our study follows an MDE approach and contributes to the definition of a UML profile for the Gaspard2 development environment. In this context, we detail the description of the profile and we illustrate its use in the design of a video processing application.
The work presented in this thesis is part of the research on the modeling and design of high-performance systems-on-chip. These systems are based on massively parallel systematic processing applications operating on multidimensional data. It is therefore important to have models capable of taking this multidimensional aspect into account. We present the different existing models of computation for the specification of these multidimensional applications. We then focus on the Array-OL model, which is based solely on the expression of data dependences. However, this model does not take into account the modeling of control behaviors, which are generally indispensable in the description of certain signal processing applications. The objective of our work is therefore to propose a specification model that introduces the notion of control into the Array-OL model. To this end, we study the work carried out on synchronous reactive systems, and in particular the work allowing the description of hybrid systems. This study allowed us to define a design methodology that clearly separates control and computation. We discuss the advantages of this methodology, notably in terms of formal verification, and we illustrate its application in the design of an automotive system. We then propose an approach based on a concept of degree of granularity to associate the description of control with Array-OL models. We also study the possibility of extending this concept to multiple degrees of granularity in order to allow the modeling of more complex applications containing different control parts. Finally, our work is based on an MDE approach and contributes to the definition of a UML profile for the Gaspard2 development environment. In this context, we detail the description of the profile and we illustrate its use to design a video processing application.
In this paper, we present a multilevel framework for MultiProcessor Systems-on-Chip (MPSoC) that makes fast simulation and performance evaluation possible in the design flow. In this framework, we use the Model-Driven Engineering (MDE) approach within the GASPARD design flow. Two target simulation models, at the Cycle Accurate Bit Accurate (CABA) and the timed Programmer's View (PVT) abstraction levels, are defined. In addition, a set of metamodels corresponding to the simulation models and the deployment phase is detailed. The latter metamodel allows hardware component refinement with the specification of performance parameters. Experimental results show the usefulness of our framework in decreasing the design complexity of MPSoC architectures and achieving high simulation speedup with a negligible estimation error margin.
This paper presents the first results of a study on the transformation of data-parallel applications into synchronous equations. The applications considered are expressed with the GASPARD metamodel, which extends the ARRAY-OL language dedicated to data-intensive processing applications. The general principle of the envisaged transformations is presented, together with implementation ideas. The resulting synchronous models make it possible to address several questions related to formal validation, for example the verification of synchronizability or latency properties, using the formal tools and techniques offered by the synchronous technology. They thus give access to functionalities complementary to those of the environment associated with GASPARD, which proposes a hardware/software co-design methodology for systems-on-chip. The transformations will follow a Model-Driven Engineering (MDE) approach. Perspectives are mentioned concerning the introduction of control automata within the obtained models.
In this paper we study the application of a safe design methodology to the case of an automotive system. This methodology is based on a clear separation between control and data parts. It facilitates the specification and provides better readability. We present the advantages of this methodology on a GPS cruise control system.
Tolerating the value failures of sensors is an important problem in automated control processes and plants. In this paper, we address this problem in a theoretical framework in order to demonstrate the feasibility of an automatic method based on discrete controller synthesis. We consider a fault-intolerant program whose job is to control an automated process, here a liquid tank equipped with level sensors that can be subject to value faults. This fault-intolerant program is modeled as a finite labeled transition system. We then formally specify a fault hypothesis, i.e., how many sensors can fail simultaneously. We use discrete controller synthesis to obtain automatically a program having the same behavior as the initial fault-intolerant one and satisfying the fault tolerance requirements under the fault hypothesis. We advocate that, thanks to the use of discrete controller synthesis, our method offers flexibility, reliability and separation of concerns, and is automatic.
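The paper's tank model is only summarized above; the sketch below shows, with an invented two-sensor variant, how such a fault-intolerant controller can be encoded as a finite labeled transition system, which is the kind of object discrete controller synthesis operates on. The states, labels and transitions are hypothetical, and the synthesis algorithm itself is not shown.

```cpp
#include <iostream>
#include <string>
#include <vector>

// A finite labeled transition system: states, an initial state,
// and transitions labeled by events (sensor readings or fault events).
struct Transition { int from; std::string label; int to; };

struct LTS {
    std::vector<std::string> states;
    int initial;
    std::vector<Transition> transitions;
};

int main() {
    // Hypothetical tank controller: 0 = filling, 1 = full, 2 = faulty reading.
    LTS tank{
        {"Filling", "Full", "SensorFault"},
        0,
        {{0, "high_level", 1},    // high-level sensor fires: stop filling
         {1, "low_level", 0},     // low-level sensor fires: resume filling
         {0, "sensor_stuck", 2},  // fault event allowed by the fault hypothesis
         {2, "repair", 0}}
    };
    for (const auto& t : tank.transitions)
        std::cout << tank.states[t.from] << " --" << t.label << "--> "
                  << tank.states[t.to] << "\n";
    return 0;
}
```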
Anti-collision radars help prevent car accidents by detecting obstacles in front of vehicles equipped with such systems. This task traditionally relies on a correlator, which searches for similarities between an emitted and a received wave. Other modules can then use the information produced by the correlator to compute the distance and the relative speed between the moving vehicle and the obstacle. We implemented such a system using FPGAs. We used hardware blocks to implement the computationally demanding correlator and a soft-core processor to compute the distances and speeds. In order to improve the maximum detection distance reached by the correlation algorithm, we developed and tested a modified version of an algorithm based on Higher Order Statistics. This work results in a detailed description of this new algorithm, its possible implementations and the corresponding FPGA synthesis results.
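The correlation step itself is conventional; a minimal software version is sketched below with invented sampling parameters. It slides the emitted waveform over the received one, keeps the lag with the strongest correlation, and converts the round-trip delay into a distance (d = c·τ/2). The paper implements this as FPGA hardware blocks, and its Higher Order Statistics variant is not reproduced here.

```cpp
#include <cstdio>
#include <vector>

// Lag (in samples) at which the emitted wave best matches the received one.
int best_lag(const std::vector<double>& emitted, const std::vector<double>& received) {
    int best = 0;
    double best_corr = -1e300;
    for (int lag = 0; lag + (int)emitted.size() <= (int)received.size(); ++lag) {
        double corr = 0.0;
        for (size_t i = 0; i < emitted.size(); ++i)
            corr += emitted[i] * received[lag + i];
        if (corr > best_corr) { best_corr = corr; best = lag; }
    }
    return best;
}

int main() {
    const double fs = 100e6;                 // hypothetical 100 MHz sampling rate
    const double c  = 3e8;                   // speed of light, m/s
    std::vector<double> emitted = {1, -1, 1, 1, -1};          // toy code sequence
    std::vector<double> received(200, 0.0);
    for (size_t i = 0; i < emitted.size(); ++i) received[40 + i] = emitted[i];

    int lag = best_lag(emitted, received);
    double distance = c * (lag / fs) / 2.0;  // round trip: divide by two
    std::printf("lag = %d samples, estimated distance = %.1f m\n", lag, distance);
    return 0;
}
```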
The ModEasy project seeks to develop techniques and software tools to aid in the development of reliable microprocessor-based electronic (embedded) systems using advanced development and verification systems. The tools are to be evaluated in practical domains such as the automotive sector, for reactive cruise control and anti-collision radar. We chose to define specific IPs using FPGA techniques to cover this application domain. This paper presents the implementation of such a complex, safety-critical application on a single FPGA. The target system is composed of a reactive cruise control, a detection radar and the associated processing.
SoCs, or systems-on-chip, are becoming the main execution architecture for embedded applications. These systems are composed, on a single chip, of several processors, coprocessors and memories connected by a communication network (a bus in simple cases). To guarantee a high data exchange rate, on-chip networks are becoming more complex; the term NoC (Network-on-Chip) designates systems-on-chip whose components are connected by a complex network. As part of the answer to the MARTE RFP of the OMG, the WEST team proposed a notation based on multidimensional multiplicities to model repetitive structures. The expressive power of this notation is significant: it easily expresses all grid-based topologies. Whether it can be used to represent all the topologies used in interconnection networks remains an open question. This report presents a modeling method that we can use for a family of multistage interconnection networks called delta networks. We extended the auxiliary constructs of the UML 2 standard because the existing standard lacks the semantics to express the desired modeling.
Metamodel design raises a number of recurring problems: with each development of new metamodels, identical questions arise, and the chosen solutions are very similar, when they are not simple copy/paste. This article presents the "relation" pattern and its variants "directed relation" and "association", which we identified while designing metamodels for the modeling of embedded systems and for the realization of transformation engines. The relation patterns make it possible to model relations between concepts and to attach information to these relations.
In this report, we present the first results of a study on the modeling of data-intensive parallel applications following the synchronous approach. More precisely, we consider the Gaspard extension of Array-OL, which is dedicated to System-on-Chip co-design. We define an associated synchronous dataflow equational model that makes it possible to address several design correctness issues (e.g. verification of frequency/latency constraints) using the formal tools and techniques provided by the synchronous technology. We particularly illustrate a synchronizability analysis using affine clock systems. From these bases, directions are drawn towards modeling hierarchical applications and adding control automata involving verification.
In this report, we present the first results of a study on the modeling of data-intensive parallel applications following the synchronous approach. More precisely, we consider the Gaspard extension of Array-OL, which is dedicated to the co-design of systems-on-chip. We define an associated synchronous dataflow equational model that makes it possible to address several design correctness issues (for example, the verification of latency or frequency constraints) using the formal tools and techniques offered by the synchronous technology. We particularly illustrate a synchronizability analysis using affine clock systems. Perspectives are then mentioned concerning the modeling of hierarchical applications and the addition of control automata involving verification.
The ARTiS system is a real-time extension of the GNU/Linux scheduler dedicated to SMP (Symmetric Multi-Processor) systems. It allows High-Performance Computing and real-time processing to be mixed. ARTiS exploits the SMP architecture to guarantee the preemption of a processor when the system has to schedule a real-time task. The implementation is available as a modification of the Linux kernel. The basic idea of ARTiS is to assign a selected set of processors to real-time operations. A migration mechanism for non-preemptible tasks ensures a given latency level on these real-time processors. Furthermore, specific load-balancing strategies permit ARTiS to benefit from the full power of SMP systems: the real-time reservation, while guaranteed, is not exclusive and does not imply a waste of resources.
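ARTiS works inside the kernel scheduler and cannot be reproduced in a few lines; the user-space sketch below only illustrates the underlying idea of dedicating a processor to a real-time task on Linux, using the standard affinity and scheduling-class system calls. The CPU number and priority are arbitrary, and SCHED_FIFO requires appropriate privileges.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      // needed for sched_setaffinity / cpu_set_t on glibc
#endif
#include <sched.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

int main() {
    // Pin the calling task to CPU 3, treated here as a "real-time" processor.
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(3, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        std::fprintf(stderr, "sched_setaffinity: %s\n", std::strerror(errno));

    // Give the task a fixed real-time priority (typically requires root).
    sched_param param{};
    param.sched_priority = 50;
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
        std::fprintf(stderr, "sched_setscheduler: %s\n", std::strerror(errno));

    std::puts("task pinned and (if permitted) scheduled as SCHED_FIFO");
    return 0;
}
```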
Early energy estimation is increasingly important in MultiProcessor System-on-Chip (MPSoC) design. Applying traditional approaches, which consist in delaying the estimation until the architectural layout has been produced, is inefficient and prevents the rapid exploration of alternative architectures. In this paper, we present a framework for architectural exploration as part of MPSoC design. Our framework allows configurations that offer a good performance/energy tradeoff to be found early in the design flow. The hardware components, described at the Cycle-Accurate Bit-Accurate (CABA) level of SystemC, were taken from the SoCLib library. For each component in the library, we developed an energy model using both physical measurements and analytical models of energy consumption. These models offer a good accuracy/speed tradeoff. Plugging the energy models into the SoCLib architectural simulator makes it easy to estimate the application's performance and energy consumption. The effectiveness of our method is illustrated through design space exploration (DSE) for a parallel signal processing application.
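The abstract does not give the per-component models themselves; a generic activity-counter formulation in the same spirit (counters and coefficients hypothetical, calibrated from measurements) would look roughly as follows.

```latex
% Total energy as a sum, over components, of activity counters times calibrated
% per-event energy costs, plus an idle (static) contribution per component.
E_{\mathrm{total}} \;\approx\; \sum_{c \in \mathrm{components}}
  \Bigl( n^{\mathrm{active}}_{c}\, e^{\mathrm{active}}_{c}
       + t^{\mathrm{idle}}_{c}\, p^{\mathrm{idle}}_{c} \Bigr)
```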
Anti-collision radars help prevent car accidents by detecting obstacles in front of vehicles equipped with such systems. This task traditionally relies on a correlator, which searches for similarities between an emitted and a received wave. Other modules can then use the information produced by the correlator to compute the distance and the relative speed between the moving vehicle and the obstacle. We implemented such a system using an FPGA. We used hardware blocks to implement the computationally demanding correlator and a soft-core processor to compute the distances and speeds. In order to improve the maximum detection distance reached by the correlation algorithm, we developed and tested a modified version of an algorithm based on Higher Order Statistics. This work results in a detailed description of this new algorithm, its possible implementations and their FPGA synthesis results.
Anti-collision radars help prevent accidents by detecting obstacles in front of a vehicle equipped with such a device. This obstacle detection task is generally performed with a correlator, which highlights the similarities between an emitted wave and a received wave. Other modules can then compute the distance and the relative speed between the moving car and the detected object. We implemented such a system on an FPGA. We used hardware blocks to implement the correlator, which is demanding in computing resources, and a soft-core processor to handle the distance and speed computations. In order to increase the maximum detection distance currently allowed by the correlation algorithm, we developed and tested a modified version of an algorithm based on Higher Order Statistics. This work results in a detailed description of this new algorithm, its possible implementations and the results of its synthesis on FPGA.
Recent research has demonstrated interest in co-design frameworks that allow description refinement at different abstraction levels. We have proposed such a framework, which allows SoC resource allocation for the regular, repetitive tasks found in intensive multimedia applications. Nevertheless, the framework does not directly target reconfigurable architectures; the difficult job of placing and routing an application on an FPGA is postponed to a dedicated tool. In order to limit the number of synthesis runs on this external tool, we propose an algorithm that, from a high-level description of an intensive multimedia application, estimates the resource usage on a given FPGA architecture. This algorithm makes use of a simple mathematical formalism derived from case-study implementations.
Recent research has demonstrated the interest of using a co-design chain allowing the refinement of a description at different abstraction levels. We have proposed a platform answering these expectations. It handles the allocation of the resources contained in a SoC, thereby allowing the implementation of the repetitive tasks found in multimedia applications. However, this platform does not directly target reconfigurable architectures; the difficult work of placement and routing on an FPGA is carried out later by a dedicated tool. In order to limit the number of synthesis runs performed by this dedicated tool, we propose an algorithm that, from a high-level description of the application, estimates the resources used for a given target architecture. This algorithm uses simple mathematical formulations, themselves obtained from actual implementations.
This paper presents a micro-network that is a generic, scalable, multi-stage interconnect architecture for systems-on-chip (SoC). The network architecture relies on packet switching and point-to-point bi-directional links between the routers implementing the micro-network. The NoC provides a configurable number of OCP-compliant communication interfaces for both initiators (masters) and targets (slaves). This network has been used in a multiprocessor SoC with 16 initiators and 16 targets, and compared with an AMBA bus in terms of latency and saturation threshold.
Many signal processing applications have to manipulate multidimensional data; however, only a few specification models are able to handle such data. We therefore study this problem, limiting our approach to systematic signal processing, which is characterised by the application of very regular treatments independent of the data values. We analyse the modelling of such applications. First, we compare different models based on the synchronous dataflow paradigm, and then we introduce ARRAY-OL. ARRAY-OL is a description model able to express data dependences. However, ARRAY-OL does not explain how to execute the applications and does not provide any methodology for optimisation. Therefore, when using ARRAY-OL, it is necessary to propose optimisations and to analyse its projection onto a model of computation. Hence, we analyse the possibilities offered by loop transformations and we present the ODT formalism. We show that the ODTs are perfectly suitable for expressing data dependences. Using them, we propose several transformations constituting a toolbox able to perform simple modifications or more complex optimisations on ARRAY-OL applications. Lastly, we examine the projection of ARRAY-OL onto different models of computation.
Many signal processing applications have to process data with several dimensions, yet very few models are able to take this multidimensional aspect into account. We have therefore studied this problem, restricting ourselves to systematic signal processing (SSP), which consists in the application of very regular treatments independent of the data values. We first address the problem of application modelling. We compare different models based on synchronous dataflow, and then we focus on ARRAY-OL, a description model able to express data dependences. However, the latter provides no methodology for executing applications and no means of optimisation. ARRAY-OL must therefore be projected onto a model of computation in order to execute applications, after first proposing an optimisation phase. To this end we study different methods: we analyse the possibilities offered by loop transformations, and then we present the ODT formalism. We show that the ODTs can perfectly express data dependences. Using the ODTs, we propose a set of transformations constituting a "toolbox" capable of performing simple modifications or more complex optimisations on ARRAY-OL applications. Finally, we analyse the projection of ARRAY-OL onto models of computation and study the impact of our transformations.
In the last few years, the design of systems-on-chip (SoC) has seen several major changes. On one hand, the new applications are more and more complex, which makes their dedicated SoCs very heterogeneous. On the other hand, new technologies allow the integration of more and more components on the same silicon area. Current design methods, based mainly on the engineers' experience for coding and exploring architectures, no longer make it possible to follow the technology evolution, especially as the lifespan of the systems is increasingly short, while the time to market and the design cost are dramatically increasing. The approach adopted in this thesis is part of a global co-modeling and co-design project called Gaspard. It aims to meet, at least partially, these new design requirements. The methodology is based on Model Driven Architecture (MDA), particularly for simulation and performance analysis at high abstraction levels. In this thesis, a metamodel allowing the modelling of systems is described. Then, a methodology of automatic code generation, using transformation engines, is applied to generate the code necessary to simulate the modelled system. Lastly, high-level performance estimation criteria are presented.
In the last few years, the world of system-on-chip design has undergone a major upheaval. On one hand, the power of new applications means that the new systems must incorporate numerous heterogeneous resources. On the other hand, new technologies make it possible to integrate more and more components on the same silicon area. Current design methods, based on the designers' experience to choose between architectures, no longer make it possible to keep up with the evolution of technology, especially as the lifespan of systems is becoming ever shorter while the time to market and the design cost keep increasing. The approach adopted in this thesis is part of a global co-modeling and co-design project called Gaspard. It aims to address, at least partially, these new requirements. The methodology is based on the principles of Model Driven Architecture (MDA), and more particularly on the part dealing with simulation and performance analysis at a high abstraction level. In this thesis, a metamodel allowing the modeling of systems is described. Then, a code generation methodology using transformation engines is employed to generate the code necessary to simulate the modeled system. Finally, the different criteria retained for the estimation of the system are presented.
The ARTiS system is a real-time extension of the GNU/Linux scheduler dedicated to SMP (Symmetric Multi-Processor) systems. It allows High-Performance Computing and real-time processing to be mixed. ARTiS exploits the SMP architecture to guarantee the preemption of a processor when the system has to schedule a real-time task. The implementation is available as a modification of the Linux kernel, focusing especially (but not restricted) on the IA-64 architecture. The basic idea of ARTiS is to assign a selected set of processors to real-time operations. A migration mechanism for non-preemptible tasks ensures a given latency level on these real-time processors. Furthermore, specific load-balancing strategies permit ARTiS to benefit from the full power of SMP systems: the real-time reservation, while guaranteed, is not exclusive and does not imply a waste of resources. This document describes the theoretical approach of ARTiS as well as the details of the Linux implementation. Several kinds of measurements are also presented in order to validate the results.
The ARTiS system is a real-time extension of GNU/Linux dedicated to SMP (symmetric multi-processor) architectures. It allows high-performance computing and real-time processing to be mixed. ARTiS exploits the SMP nature of the architecture to guarantee that a processor can be preempted when the system has to schedule a real-time task. The implementation is available as a modification of the Linux kernel, targeting in particular (but not restricted to) the IA-64 architecture. The principle of ARTiS is to identify a set of processors dedicated to real-time operations. An automatic migration mechanism for non-preemptible activities guarantees the latency on these real-time processors. Moreover, a specific load-balancing strategy allows ARTiS to exploit the full power of an SMP machine: the real-time reservations, although guaranteed, are not exclusive and do not lead to under-utilization of the resources. We present here the theoretical approach of ARTiS as well as the details of the Linux implementation. Different kinds of measurements are also presented in order to validate the results.
The Array-OL specification model has been introduced to model systematic signal processing applications. This model is multidimensional and makes it possible to express the full potential parallelism of an application: both task and data parallelism. The Array-OL language is an expression of data dependences and thus allows many execution orders. In order to execute Array-OL applications on distributed architectures, we show here how to project such specifications onto the Kahn process network model of computation. We show how Array-OL code transformations allow a projection adapted to the target architecture to be chosen.
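A Kahn process network is a set of deterministic processes communicating through unbounded FIFO channels with blocking reads; the self-contained sketch below (the producer and consumer bodies are invented) shows only this execution model, not the projection from Array-OL described in the paper. Compile with a thread library enabled, e.g. -pthread.

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

// Unbounded FIFO channel: non-blocking write, blocking read (KPN semantics).
template <typename T>
class Channel {
    std::queue<T> q;
    std::mutex m;
    std::condition_variable cv;
public:
    void write(T v) {
        std::lock_guard<std::mutex> lock(m);
        q.push(std::move(v));
        cv.notify_one();
    }
    T read() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !q.empty(); });
        T v = q.front();
        q.pop();
        return v;
    }
};

int main() {
    Channel<int> ch;

    // Producer process: emits a finite stream of tokens.
    std::thread producer([&] {
        for (int i = 0; i < 5; ++i) ch.write(i * i);
    });

    // Consumer process: blocking reads give a deterministic output order.
    std::thread consumer([&] {
        for (int i = 0; i < 5; ++i) std::cout << ch.read() << " ";
        std::cout << "\n";
    });

    producer.join();
    consumer.join();
    return 0;
}
```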
The growing number of gates on a chip makes SoC design more difficult, so we have to work on SoC design tools to ease the designer's work and manage all the available gates. We propose a co-simulation of embedded Linux with hardware simulation at a high level of abstraction (TLM) to verify the system very early in the design flow. This helps avoid backtracking in the design flow.
From the early days of embedded controllers to modern multiprocessor systems-on-chip, there has been a growing complexity gap that current electronic design automation tools cannot fully bridge. Designers lack tools that enable them to exploit the potentially available transistors at a reasonable cost. The Gaspard design flow offers original solutions to this problem: a model-oriented approach to handle the complexity of the flow, aimed at regular multiprocessor systems. The main goal of this thesis is to define common mechanisms to express the regularity and parallelism of such systems, for the application as well as for the hardware. Our contribution is twofold: the definition of an abstract syntax for the modelling of these systems via metamodels expressed with MOF (the infrastructure to put the flow into action), and the definition of a corresponding concrete syntax via a UML profile.
From the embedded controllers of the past to today's multiprocessor systems-on-chip, there is a complexity gap that design-aid tools fail to bridge. Designers have no tool allowing them to exploit, at a reasonable cost, the transistors potentially at their disposal. To try to solve this problem, the Gaspard design flow proposes original solutions: a model-oriented approach to manage the complexity of the flow, and an orientation towards regular multiprocessor systems. Within this flow, this thesis makes a contribution at two levels: the definition of an abstract syntax in the form of metamodels expressed in MOF (the infrastructure for putting the flow into action), and the definition of a concrete syntax in the form of a UML profile. The main objective is to define common mechanisms to express the regularity and the parallelism of the systems, at the application level as well as at the hardware level.
The MARTE (Modeling and Analysis of Real-Time and Embedded systems) RFP was voted by the OMG in February 2005. This request for proposals solicits submissions for a UML profile that adds capabilities for modeling Real-Time and Embedded Systems (RTES), and for analyzing schedulability and performance properties of UML specifications. One particular request of this RFP concerns the definition of common high-level modeling constructs for factorizing repetitive structures, for the software, hardware and allocation modeling of RTES. We propose an answer to this particular requirement, based on the introduction of multidimensional multiplicities and mechanisms for the description of regular connection patterns between model elements. This proposition is domain independent. We illustrate the use of these mechanisms in a co-design methodology for computation-intensive embedded systems. We focus on what these factorization mechanisms bring to each aspect of the co-design: application, hardware architecture, and allocation.
In this paper, we study the introduction of control into the Gaspard2 application UML metamodel using the principles of synchronous reactive systems. This makes it possible to take changes of running mode into account in data-parallel applications, and to study more general ways of mixing control and data-parallel processing. Our study is applied to a particular context using two different models, exclusively dedicated to computation or to control. The computation part corresponds to the Gaspard2 application metamodels based on the Array-OL language, which is often used to specify the data dependencies and the potential parallelism in intensive applications treating multidimensional data. The control part is represented by an automaton structure based on the mode-automata concept, which makes it possible to clearly identify the different modes of a task and the switching conditions between modes. The proposed UML metamodel makes it possible to describe the control automata, the different running modes and the link between the control and computation parts. It also allows the control and data parts to be clearly separated, and respects the concurrency, parallelism, determinism and compositionality of the Gaspard2 models.
The evolution of technologies is enabling the integration of complex platforms in a single chip, called a System-on-Chip (SoC). Modern SoCs may include several CPU subsystems to execute software and sophisticated interconnects in addition to specific hardware subsystems. Designing such mixed hardware and software systems requires new methodologies and tools, or the enhancement of existing ones. These design tools must be able to satisfy many competing trade-offs (real-time, performance, low power consumption, time to market, re-usability, cost, area, etc.). It is recognized that the scheduling and mapping decisions taken at a high level of abstraction have a major impact on the global design flow. They can help in satisfying different trade-offs before proceeding to lower-level refinements. To give scheduling and mapping decisions good potential, we propose in this paper a static scheduling framework for MpSoC design. We show why it is necessary, and how, to integrate different scheduling techniques in such a framework in order to compare and combine them. This framework is integrated in a model-driven approach in order to keep it open and extensible.
The ARTiS system, a real-time extension of the GNU/Linux scheduler dedicated to SMP (Symmetric Multi-Processor) systems, is proposed. ARTiS exploits the SMP architecture to guarantee the preemption of a processor when the system has to schedule a real-time task. The basic idea of ARTiS is to assign a selected set of processors to real-time operations. A migration mechanism for non-preemptible tasks ensures a latency guarantee on these real-time processors. Furthermore, specific load-balancing strategies allow ARTiS to benefit from the full power of SMP systems: the real-time reservation, while guaranteed, is not exclusive and does not imply a waste of resources. ARTiS has been implemented as a modification of the Linux scheduler. This paper details the performance evaluation we conducted on this implementation. The observed latency levels show significant improvements when compared to the standard Linux scheduler.
Embedded system designs and simulations become tedious and time consuming due to the complexity of modern applications. Thus, languages allowing high-level description, such as SystemC, are increasingly used. We present in this paper a new methodology allowing scripting inside SystemC. We integrate both SystemC and Python within a single framework for system design and simulation called SystemPy. Communication is performed using SWIG interfaces. SystemPy allows dynamic IP changes during the simulation, which enables designers to perform quick architecture exploration without stopping the simulation process. The steps and performance of our framework are illustrated on a mixed SystemC/Python system.
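The claim that an IP model can be exchanged without stopping the simulation can be pictured, independently of SystemPy's actual API (which is not shown here), by the kind of dynamic dispatch that a scripting language makes easy. Everything below (class names, the registry, the loop) is a made-up toy, not the framework described in the abstract.

# Toy illustration of swapping an IP model during a running simulation loop.
# The registry-based mechanism is illustrative; it is not SystemPy's API.

class FirV1:
    def process(self, sample):
        return sample * 0.5          # coarse placeholder behaviour

class FirV2:
    def process(self, sample):
        return sample * 0.5 + 0.1    # refined placeholder behaviour

ip_registry = {"fir": FirV1()}       # IPs are looked up by name at each step

def simulate(n_steps):
    for t in range(n_steps):
        out = ip_registry["fir"].process(t)
        if t == 2:
            # swap the implementation without restarting the loop
            ip_registry["fir"] = FirV2()
        print(t, out)

simulate(5)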
In this paper we present a distributed simulation environment for System-on-Chip (SoC) design. Our approach enables the automatic generation of geographically distributed SystemC simulation models for IP-based SoC design and eases communication between heterogeneous parts of the simulation. The SystemC simulation model follows a client/server architecture, where each client/server contains some of the IPs composing the system. The number of distributed SystemC simulators used in the simulation platform is theoretically unlimited. The communication between them uses the Simple Object Access Protocol (SOAP) through standards such as XML and HTML. The feasibility of our method is shown on an example composed of four SystemC modules simulated on three hosts.
MDE (Model Driven Engineering) is a new approach to software design where the whole process of design and implementation is organized around models. With MDE, a system is built by designing a set of models at different levels of abstraction. At the first level, only the main functionalities of the system are modeled. In the MDA (Model Driven Architecture) terminology, this first model is called the PIM (Platform Independent Model). This PIM can be projected by transformations into one or more other models, at lower levels of abstraction. When a model at a given level of abstraction integrates some platform (technology) information, it is called a PSM (Platform Specific Model). Model transformation is therefore a key issue of the MDE approach. However, many questions arise about transformations, among them: when a model is transformed into different other models on different platforms, how do we ensure the interoperability between these models? This paper aims to provide an answer to that question. Our approach is based on a traceability model. This model keeps links between the source and target model elements and also records the different operations that were performed in the transformation. We present a methodology for the automatic generation of the traceability model and the exploitation of this model to ensure interoperability. An example based on OCP is provided to illustrate our proposal.
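A traceability model of the kind described above can be pictured as a set of links relating source and target model elements together with the transformation operation that produced them. The record layout below is a hypothetical sketch under that reading, not the metamodel of the paper.

# Hypothetical sketch of trace links recorded during a model transformation.
from dataclasses import dataclass, field

@dataclass
class TraceLink:
    sources: list      # identifiers of source model elements
    targets: list      # identifiers of generated target elements
    operation: str     # name of the transformation rule that was applied

@dataclass
class TraceModel:
    links: list = field(default_factory=list)

    def record(self, sources, targets, operation):
        self.links.append(TraceLink(sources, targets, operation))

    def targets_of(self, source_id):
        # exploit the trace to find all elements derived from one source
        return [t for link in self.links if source_id in link.sources
                  for t in link.targets]

trace = TraceModel()
trace.record(["pim.Task1"], ["psm.Thread1", "psm.Buffer1"], "Task2Thread")
print(trace.targets_of("pim.Task1"))   # ['psm.Thread1', 'psm.Buffer1']

Keeping such links for every rule application is what makes it possible to relate elements of two PSMs generated from the same PIM, which is the interoperability question the paper addresses.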
While the use of embedded systems keeps growing, embedded system design methods fail to keep up with this evolution. There is thus a gap between the number of transistors that can be put on a chip and the number that can be managed within a reasonable time. It is therefore necessary to improve development tools, and in particular the simulation methods that allow these systems to be verified quickly. One suggested solution to reduce this gap is to raise the level of abstraction in system design. This report presents a method for simulating embedded systems at a high level of abstraction in an attempt to reduce this productivity gap. The simulation integrates embedded-Linux-type operating systems to allow dynamic scheduling. The simulation is performed in a split manner, that is, the hardware (in SystemC) is simulated on one side and the software (application + operating system) on the other. This makes it possible to execute what is needed on the hardware while keeping the simulation reasonably fast. Measurements are taken during the simulation (number of context switches, execution time) to check the validity of the system.
The evolution of technologies is enabling the integration of complex platforms in a single chip, called a System-on-Chip (SoC). Modern SoCs may include several CPU subsystems to execute software and sophisticated interconnects in addition to specific hardware subsystems. To manage and exploit this high degree of parallelism provided in hardware and software, we need regular constructors for both hardware and software. SoC co-design requires mastering many different abstraction levels, simulation techniques and synthesis tools; due to technology evolution, the best one is always the one to come. The evolution of embedded systems, both hardware and software, is not simple: the business logic has to be kept while the technical aspects have to be discarded. To improve the permanence of Systems-on-Chip, we have to abstract away from the technical concerns. Model Driven Engineering proposes a separation of concerns between application and technical concerns, and the use of modeling standards can capitalize on system descriptions and improve system evolution and integration. We propose the use of UML2 as a modeling language for MPSoC system design. To model regular hardware and software, we propose to introduce multidimensional multiplicities and mechanisms for the description of regular connection patterns between model elements. This proposition is domain independent. We illustrate the use of these mechanisms in a co-design methodology for computation-intensive embedded systems. We focus on what these factorization mechanisms bring to each aspect of the co-design: application, hardware architecture, and allocation.
SystemC is a quasi open source, C++, event-driven HDL (Hardware Description Language) reference simulator which was introduced in September 1999 by the OSCI. At first, it was meant to be a replacement for VHDL, and although SystemC can be used for RTL modeling, it is now envisioned by the community as a high-level system simulator. SystemC inherits all the properties, methodologies and mechanics of its bases (C and C++), which can be seen as macro-assemblers. This has the positive effect of allowing a lot of freedom in the way things are done. This freedom can be beneficial because a given methodology can be chosen according to, and appropriately for, the situation. On the other hand, freedom has a cost, and the designer or IP (Intellectual Property) provider can lose a lot of time trying to figure out which methodology best fits their needs. In this paper, we establish a comprehensive list of the different mechanisms for configuring an IP in SystemC. We then compare the different methods and highlight the ones which, in our opinion, best suit the IP development process and publishing use cases.
The Model-Driven Architecture is an initiative by the Object Management Group (OMG) to define an approach to software development based on modeling and the automated mapping of models to implementations. The basic MDA pattern involves the definition of a platform-independent model (PIM) and its automated mapping to one or more platform-specific models (PSMs). By defining different PIMs and PSMs dedicated to embedded systems, we show the benefits of using the MDA approach in System-on-Chip co-design. From UML 2.0 profiles to SystemC or VHDL code, the same model transformation engine is used with different rules expressed in XML.
The ARTiS system is a real-time extension of GNU/Linux dedicated to symmetric multiprocessor (SMP) architectures. ARTiS exploits the SMP nature of the architecture to guarantee that a processor can be preempted when the system has to schedule a real-time task. The principle of ARTiS is to identify a set of processors dedicated to real-time operations. An automatic migration mechanism for non-preemptible activities ensures a latency guarantee on these real-time processors. Moreover, a specific load-balancing strategy allows ARTiS to exploit the full power of an SMP machine: the real-time reservations, although guaranteed, are not exclusive and do not lead to under-utilization of resources. Simulations of the behavior of ARTiS have made it possible to verify the viability of the proposed model. We present here a performance evaluation, in particular in terms of interrupt processing latency, of different versions of the Linux kernel and of our ARTiS solution. Our first implementation on Intel x86 and IA-64, although incomplete, confirms the superiority of the ARTiS solution over the standard Linux kernel.
In this paper, we present a new design methodology for synchronous reactive systems, based on a clear separation between the control and dataflow parts. This methodology facilitates the specification of different kinds of systems and improves readability. It also makes it possible to study the different parts separately, using the most appropriate existing tools for each of them. Following this idea, we are particularly interested in the notion of running modes and in the Scade tool. Scade is a graphical development environment coupling data processing and state machines (modeled by the synchronous languages Lustre and Esterel). It can be used to specify, simulate, verify and generate C code. However, this tool does not follow any design methodology, which often makes existing applications difficult to understand and re-use. We show that it is also difficult to separate the control and dataflow parts using Scade. Regulation systems are better specified using mode-automata, which allow an automaton structure to be added to dataflow specifications written in Lustre. When we observe the mode structure of a mode-automaton, we clearly see where the modes differ and the conditions for changing modes, which makes it possible to better understand the behavior of the system. In this work, we try to combine the advantages of Scade and running modes in order to develop a new design methodology which facilitates the study of several systems by respecting the separation between control and data flows. This schema is illustrated through the Climate case study suggested by Esterel Technologies, in order to exhibit the benefits of our approach compared to the one advocated in Scade.
The Array-OL specification model has been introduced to model systematic signal processing applications. This model is multidimensional and makes it possible to express the full potential parallelism of an application: both task and data parallelism. The Array-OL language is an expression of data dependences and thus allows many execution orders. In order to execute Array-OL applications on distributed architectures, we show here how to project such a specification onto the Kahn process network model of computation. We show how Array-OL code transformations make it possible to choose a projection adapted to the target architecture. An experiment on a distributed process network implementation based on CORBA concludes this article.
The Array-OL specification model was created to describe systematic signal processing applications. It is a multidimensional model that makes it possible to express the parallelism of an application, whether data parallelism or task parallelism. Moreover, since Array-OL is a language expressing dependences, several execution orders are possible. In order to execute Array-OL on distributed architectures, we propose here a projection of Array-OL onto Kahn process networks, using the latter as the model of computation. We also introduce transformations that make it possible to optimize this projection according to the target architecture. We conclude with an example based on a CORBA implementation of process networks.
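The Kahn process network target mentioned in the two abstracts above can be pictured with a few processes connected only by blocking FIFO channels, which is what makes the result independent of scheduling. The sketch below uses plain Python threads and queues; the three process bodies are invented for illustration and are not taken from the papers.

# Minimal Kahn-process-network sketch: processes communicate only through
# blocking FIFO channels, so the computed result does not depend on the
# interleaving chosen by the scheduler.
import threading, queue

def producer(out_fifo):
    for i in range(5):
        out_fifo.put(i)          # write a token
    out_fifo.put(None)           # end-of-stream marker

def doubler(in_fifo, out_fifo):
    while True:
        token = in_fifo.get()    # blocking read, as in the KPN semantics
        out_fifo.put(None if token is None else 2 * token)
        if token is None:
            break

def consumer(in_fifo, results):
    while True:
        token = in_fifo.get()
        if token is None:
            break
        results.append(token)

a, b, results = queue.Queue(), queue.Queue(), []
threads = [threading.Thread(target=producer, args=(a,)),
           threading.Thread(target=doubler, args=(a, b)),
           threading.Thread(target=consumer, args=(b, results))]
for t in threads: t.start()
for t in threads: t.join()
print(results)   # [0, 2, 4, 6, 8] regardless of the interleaving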
Computation-intensive multidimensional applications appear in many application domains such as video processing or detection systems. We present here the Array-OL specification model to handle such multidimensional applications. This model is compared to the Multidimensional Synchronous Dataflow proposition by Lee et al. We also detail a new domain in the Ptolemy simulation environment dedicated to the simulation of Array-OL specifications.
We introduce in this article the Array-OL specification model, which makes it possible to handle multidimensional dataflow applications for signal processing. We also compare Array-OL to Multidimensional Synchronous Dataflow, the only equivalent model in this domain. In addition, we propose a new Ptolemy "domain" dedicated to the simulation of applications described in Array-OL.
Taking into account the specificities of the hardware architecture is a crucial step in the development of an efficient application. This is particularly the case for embedded systems, where constraints are strong (real-time) and resources limited (computing, power). This approach is called co-design, and it is found more or less explicitly in ADLs. Much work has been done around co-design and ADLs, but no standard notation or semantics has emerged. In software engineering, UML has become a recognized standard language for modeling, proving the need of users for a common syntax and vocabulary to specify their applications. We believe that it would be useful to use the well-established syntax and vocabulary of UML for both applications and hardware architectures, that is to say, to use UML as an ADL. Our approach consists in a clear specialization of a UML subset via the proposition of a generic profile that allows the definition of precise semantic and syntactic rules. The generic profile can then be extended to suit the needs of the user. To illustrate our subject, we give a refinement example of the profile to obtain the information relevant for a simulation at the TLM (Transaction Level Modeling) level. The modeling of the Texas Instruments OMAP2410 and OMAP2420 is provided as an example.
In modern embedded systems, parallelism is a good way to reduce power consumption while respecting the real-time constraints. To achieve this, one needs to efficiently exploit the potential parallelism of the application and of the architecture. We propose in this paper a hybrid optimization method to improve the handling of repetitions in both the algorithm and the architecture. The approach is called Globally Irregular Locally Regular and consists in combining irregular heuristics and regular ones to take advantage of the strong points of both.
In this paper, we present an extension of the SystemC simulator in order to allow its execution on an IA-64 platform. Our approach relies on adding a new user thread package to SystemC in a simple way. The proposed user thread mechanism is based on the ucontext primitives and can be integrated into or updated in SystemC easily. The effectiveness of this approach is shown on a concrete, realistically sized example composed of two masters and two slaves connected to an AMBA bus.
ARTiS is a real-time extension of GNU/Linux dedicated to SMP (Symmetric Multi-Processor) systems. ARTiS divides the CPUs of an SMP system into two sets: real-time CPUs and non-real-time CPUs. Real-time CPUs execute preemptible code only, so tasks running on these processors behave predictably. If a task wants to enter a non-preemptible section of code on a real-time processor, ARTiS automatically migrates this task to a non-real-time processor. Furthermore, dedicated load-balancing strategies allow all the system's CPUs to be fully exploited. The purpose of this paper is to describe the basic API that has been specified to deploy real-time applications, and to present the current implementation of the ARTiS model, achieved through modifications of the 2.6 Linux kernel. The implementation is built around the automatic migration of tasks between real-time and non-real-time processors and the use of a load balancer. The basic function of those mechanisms is to move a task structure from one processor to another. A strong constraint of the implementation is that code running on an RT processor may neither share a lock with nor wait for another processor.
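The placement rule at the heart of ARTiS (a task about to enter a non-preemptible section on a real-time CPU is first moved to a non-real-time CPU) can be sketched outside the kernel as follows. The CPU sets, task structure and function below are simplified stand-ins for illustration only, not the Linux implementation described in the paper.

# Simplified sketch of the ARTiS placement rule; not kernel code.
RT_CPUS  = {0, 1}        # processors reserved for predictable execution
NRT_CPUS = {2, 3}        # processors allowed to run non-preemptible code

class Task:
    def __init__(self, name, cpu):
        self.name, self.cpu = name, cpu

def enter_non_preemptible(task):
    """Called before the task enters a section that would disable preemption."""
    if task.cpu in RT_CPUS:
        # migrate to some non-real-time CPU so RT latency stays bounded
        task.cpu = min(NRT_CPUS)
        print(f"{task.name}: migrated to CPU {task.cpu} before locking")
    # ... the task may now take locks / disable preemption safely ...

t = Task("logger", cpu=0)
enter_non_preemptible(t)     # logger: migrated to CPU 2 before locking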
Applications that require a combination of high-performance computing capabilities and real-time behavior, although pervasive (simulation, medicine, training, multimedia communications), often rely on specific hardware and software components that make them high-performance but expensive, and quite difficult to develop, validate and, moreover, upgrade. The increasing performance of COTS components and the volume of software developed for these applications lead to the consideration of incremental development schemes in addition to sole performance. In the ITEA Hyades project, industrial companies, research centres and academic departments propose a complete set of software technologies aimed at adding real-time capabilities to multiprocessor systems, with a strong commitment to standards. In this paper we present the application requirements with respect to real-time behavior, the proposed architectural model, and the reasons for using the Linux operating system. Then, we introduce the software components that have been selected to provide the real-time capabilities, among which are Adeos and ARTiS, and their expected contribution to global performance. Finally we provide performance measurements for these elements.
Embedded system design needs to model the application and the hardware architecture together. A huge number of models are available for that, each one proposing its own abstraction level associated with its own software platform for simulation or synthesis. To produce a co-design framework, we are obviously obliged to support several of the possible models and to produce automatic transformations between them. Each time a new model is included in the framework, a new transformation must be developed. To improve transformation engine development, Model Driven Architecture (MDA) techniques are useful. This approach makes it possible to define the transformations at the metamodel level. It guarantees the reuse of models within the framework and unifies the definition of the transformation rules. We present the application of MDA in the context of Intensive Signal Processing (ISP) applications deployed on System-on-Chip (SoC) platforms. For that purpose, we have developed a new MDA transformation engine: ModTransf. We apply this engine to UML profiles to generate SystemC Transaction Level Models dedicated to ISP. A particular rule is presented to illustrate the interest of this approach in a multi-model embedded system design environment.
In this thesis, we are interested in the performance evaluation of multistage interconnection networks. The presented work covers two essential aspects. The first is the definition of a multi-criteria methodology for the evaluation and comparison of interconnection networks. This methodology is based on the definition of a distance function in a multidimensional space, where each dimension represents a performance metric. The function can be used in a Pareto optimisation context or in the context of a classification. The second aspect is the proposition of a novel family of multistage interconnection networks called over-sized Delta interconnection networks. This family of networks provides better performance than Delta networks but has a higher complexity. The methodology is used to compare the performance of the two families while taking this complexity difference into account.
In this thesis, we are interested in the performance evaluation of multistage interconnection networks. The presented work covers two essential aspects. The first is the definition of a multi-criteria methodology for the evaluation and comparison of interconnection networks. This methodology is based on the definition of a distance function in a multidimensional space in which each dimension represents a performance factor. The function can be used in a Pareto optimization context or in the context of a classification. The second aspect is the proposition of a new family of multistage interconnection networks, called over-sized Delta interconnection networks. This family of networks provides better performance than Delta networks at the price of a higher complexity. The methodology is used to compare the performance of the two families while taking this higher complexity into account.
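The multi-criteria methodology summarised in the two abstracts above rests on two simple ingredients: a distance in the space of performance metrics and a Pareto-dominance test. The generic sketch below illustrates only the idea; the metric values, weights and the "lower is better" convention are invented for the example and are not results from the thesis.

# Generic sketch of a multi-criteria comparison: distance to an ideal point
# plus a Pareto-dominance test. Metric values and weights are illustrative.
import math

def distance(point, ideal, weights):
    # weighted Euclidean distance in the performance space
    return math.sqrt(sum(w * (p - i) ** 2
                         for p, i, w in zip(point, ideal, weights)))

def dominates(a, b):
    # a dominates b if it is no worse on every metric and better on at least
    # one (metrics are assumed to be "lower is better")
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# (latency, cost) for two hypothetical network families
delta     = (4.0, 1.0)
oversized = (3.0, 1.5)
ideal     = (0.0, 0.0)
weights   = (1.0, 0.5)

print(dominates(oversized, delta))          # False: a trade-off, no dominance
print(distance(delta, ideal, weights),
      distance(oversized, ideal, weights))  # the distance then breaks the tie

When neither design dominates the other, the distance function is what allows a single ranking that accounts for the extra complexity of the over-sized networks.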
ARTiS is a project that aims at enhancing the Linux kernel with better real-time properties. It allows real-time applications to retain the flexibility and ease of development of a normal application while keeping the whole power of SMP (Symmetric Multi-Processor) systems for their execution. Based on the introduction of an asymmetry between the processors, distinguishing real-time and non-real-time processors, the system can ensure low interrupt latencies for real-time tasks. Furthermore, every processor can execute all the tasks, except when they call functions that endanger real-time behavior; in this case the task is moved before continuing its execution. A first version of ARTiS has demonstrated that this is technically possible. Unfortunately, the original load-balancing mechanism of Linux is not aware of this enhanced design. We have studied all the types of migration possible between the combinations of a real-time specialized processor and a general one. From the deduced requirements, we have specified dedicated mechanisms and policies taking into account both performance and real-time specificities. We are currently working on implementing those particular load-balancing functions within the ARTiS system.
This work is part of the ITEA European project HYADES, which promotes SMP computers as platforms for HPC real-time applications. An asymmetric real-time scheduling, called ARTiS, has been proposed, and first evaluations have proven the viability of the solution. The principle of ARTiS is to distinguish two types of processors: processors which can execute every kind of task, and processors on which the execution of functions endangering real-time behavior is prohibited. When a task attempts to execute such a function, it is automatically migrated. The strength of the ARTiS model is to allow simultaneously resource reservation for real-time applications and load balancing between real-time and non-real-time processors. The original Linux load-balancing mechanism is not aware of this asymmetry between the processors. We have studied and listed all the possible migrations between the processors and, from this study, specified modifications to the original mechanism. More specifically, we propose: the use of lock-free queues associated with a "push" trigger policy; a local designation policy which estimates the probability of future task migrations; and an evaluation of processor load which distinguishes the real-time tasks from the others. Finally, the implementation in the ARTiS kernel is in progress, and specific measurement tests have been designed in order to verify and estimate the enhancements provided by this implementation.
This work is part of the ITEA European project HYADES, which aims to promote SMP machines as platforms for compute-intensive real-time applications. An asymmetric real-time scheduling, named ARTiS, has been proposed, and first evaluations have shown the feasibility of the solution. The principle of ARTiS is to distinguish two types of processors: processors that can execute every task, and processors on which the execution of functions endangering the real-time quality of the processor is forbidden. When a task attempts to execute such a function, it is automatically migrated. The strength of the ARTiS model is to allow, simultaneously, resource reservation for real-time applications and load balancing between real-time and non-real-time processors. The original Linux load-balancing mechanism is not aware of this asymmetry between the processors. We have studied and made explicit the set of possible migrations between processors. From this study, it has been possible to specify modifications to the original mechanism. In particular, we propose: the use of lock-free queues coupled with an active "push" trigger policy; a local designation policy estimating the future migrations of tasks; and a computation of processor load that distinguishes the load of real-time tasks from that of the other tasks. Finally, the implementation in the ARTiS kernel is in progress, and measurement protocols have been written in order to verify and estimate the improvements brought by this implementation.
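One of the modifications listed above, a per-processor load estimate that separates real-time tasks from the others, can be illustrated with a small sketch. The data layout and weights below are invented for the example; they are not the kernel structures used by ARTiS.

# Illustrative sketch of a load estimate that keeps real-time (RT) and
# non-real-time load separate, so a balancer can steer ordinary tasks
# away from RT processors. Structures are invented for the example.
tasks = [
    {"name": "dsp",    "cpu": 0, "rt": True,  "weight": 3},
    {"name": "logger", "cpu": 0, "rt": False, "weight": 1},
    {"name": "gui",    "cpu": 2, "rt": False, "weight": 2},
]

def per_cpu_load(tasks):
    load = {}
    for t in tasks:
        rt_load, other = load.get(t["cpu"], (0, 0))
        if t["rt"]:
            rt_load += t["weight"]
        else:
            other += t["weight"]
        load[t["cpu"]] = (rt_load, other)
    return load

# A balancer would only compare the non-RT component when deciding whether
# to pull ordinary tasks toward a real-time processor.
print(per_cpu_load(tasks))   # {0: (3, 1), 2: (0, 2)}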
High-throughput real-time systems require non-standard and costly hardware and software solutions. Modern, and especially multiprocessor, workstations can represent a credible alternative for developing real-time intensive signal processing applications. Furthermore, the programming model of Kahn Process Networks (KPN) corresponds completely to this kind of application and fits multiprocessor systems perfectly. However, the current scheduling of KPNs suffers from drawbacks with respect to real-time behavior and efficiency. A new activation strategy for the processes of a KPN is presented, which considerably improves the existing techniques. This new activation order takes into account both the usual artificial deadlock detection problem and its resolution under time constraints. With this algorithm, it is not necessary to wait until the entire execution has deadlocked to remove the bottleneck. Moreover, an optimized memory allocation mechanism and a bound on the number of process context switches are described.
We propose the ARTiS system, a real-time extension of GNU/Linux dedicated to SMP (Symmetric Multi-Processor) systems. ARTiS exploits the SMP architecture to guarantee the possible preemption of a processor when the system has to schedule a real-time task. The basic idea of ARTiS is to assign a selected set of processors to real-time operations. A migration mechanism for non-preemptible tasks ensures a latency guarantee on these real-time processors. Furthermore, specific load-balancing strategies allow ARTiS to benefit from the full power of the SMP system: the real-time reservation, while guaranteed, is not exclusive and does not imply a waste of resources. Simulations of ARTiS performance have been conducted, and the observed latency levels support the proposed model. A first implementation of ARTiS, while incomplete, also shows significant improvements compared to the standard Linux kernel.
Taking into account the specificities of the hardware architecture is a crucial step in the development of an efficient application. This is particularly the case for embedded systems, where constraints are strong (real-time) and resources limited (computing, power). This approach is called co-design, and it is found more or less explicitly in ADLs (Architecture Description Languages). Much work has been done around co-design and ADLs, but no standard notation or semantics has emerged. In software engineering, UML has become a recognized standard language for modeling, proving the need of users for a common syntax and vocabulary to specify their applications. We believe that it would be useful to use the well-established syntax and vocabulary of UML for both applications and hardware architectures, that is to say, to use UML as an ADL. Our approach consists in a clear specialization of a UML subset via the proposition of a generic profile that allows the definition of precise semantic and syntactic rules. The generic profile can then be extended to suit the needs of the user. To illustrate our subject, we give a refinement example of the profile to obtain the information relevant for a simulation at the TLM (Transaction Level Modeling) level. The modeling of the TI OMAP2410 and OMAP2420 is provided as an example.
Taking into account the specificities of the hardware architecture is an essential element in the development of efficient applications, particularly in the case of highly constrained embedded systems (timing constraints, power consumption constraints, ...). This co-design approach is found in ADLs (Architecture Description Languages), even though no standardized notation or semantics has imposed itself today. In the field of software engineering, UML is recognized as a standard modeling language. We promote the use of the syntax and vocabulary of UML for the design not only of applications but also of hardware architectures, that is, considering UML as an ADL. Our proposal is a specialization of a subset of UML through a generic profile allowing the definition of syntactic rules and of a semantics. This generic profile can then be extended according to the particular needs of the user. We illustrate this proposal by providing a refinement of the profile to include the information necessary for a simulation at the transaction level (TLM, Transaction Level Modeling). The modeling of the TI OMAP 2410 and 2420 is provided as an example.
Interconnection network performance is a key factor when constructing parallel computers. Today's technological progress makes it possible to build and use crossbars of sizes up to 128. Crossbars can be used as switching elements (SEs) in the intercommunication systems of parallel architectures, such as multistage interconnection networks (MINs). A MIN is usually defined, among other things, by its topology, and one of the factors defining the topology of a MIN is its degree, that is, the size of the SEs of which it is composed. In this paper we study the influence of the degree of two classes of MINs on their performance. The MIN classes tested are the well-known Delta networks and a subclass of this family called the over-sized Delta networks. This study is to be used in future work in order to evaluate the use of MINs as an intercommunication medium in Symmetric Multiprocessors.
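The complexity side of such a degree study is easy to tabulate, since for a standard N x N Delta network built from k x k switching elements there are log_k N stages of N/k SEs each. The short sketch below only evaluates those textbook formulas; the size 64 is an arbitrary example, not a configuration from the paper.

# Stage and switching-element counts for an N x N Delta network built
# from k x k crossbar switching elements (standard Delta-network formulas).
import math

def delta_complexity(n_ports, degree):
    stages = int(round(math.log(n_ports, degree)))
    assert degree ** stages == n_ports, "N must be a power of the degree"
    ses_per_stage = n_ports // degree
    return stages, stages * ses_per_stage   # (stages, total number of SEs)

for k in (2, 4, 8):
    stages, total = delta_complexity(64, k)
    print(f"64x64 Delta, degree {k}: {stages} stages, {total} SEs")
# degree 2: 6 stages, 192 SEs; degree 4: 3 stages, 48 SEs; degree 8: 2 stages, 16 SEs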
In this thesis, we are interested in the design of an execution environment for dynamic distributed applications. We have defined and used the distributed Kahn Process Networks model as the basic execution model. The distribution of the model makes it possible to establish the link between distributed systems and Kahn process network applications (embedded system simulation, signal processing applications, video and audio processing...), thus opening the way to the construction of simulation applications in a distributed environment. Our work primarily covers three topics: 1. Distributed simulation: our approach is component-based, with interactive deployment and communication transparency. 2. Dynamic distributed systems: several dynamicity aspects have been added to our environment for load balancing and application evolution. 3. Multidimensional signal processing: we propose and implement a dataflow execution model for Array-OL applications (a signal processing language).
In this thesis, we are interested in the design of an execution environment for dynamic distributed applications. We have defined and used the distributed Kahn process network model as the basic model of our execution environment. The extension of the Kahn model to support distribution has made it possible to establish the link between distributed systems and Kahn process network applications (embedded system simulation, signal processing applications, video processing, ...), thus opening the way to the construction of simulation applications in a distributed environment. Our work essentially covers three facets: 1. Distributed simulation: our approach is based on the use of a component-based methodology, communication transparency and interactive deployment. 2. The dynamicity of distributed systems: the integration of several dynamicity aspects into the support, so that the application can evolve and adapt to changes in the execution environment. 3. Multidimensional signal processing: the suitability of the distributed execution support for Array-OL applications (a language dedicated to signal processing), through the construction of a specific execution model and its implementation.
Complexity in the digital systems integration rises from the heterogeneity of the components integrated in a chip. The aim of the Sophocles project is to validate methodologies, platforms and technologies to support integration, verification and programming, over a distributed environment, of complex systems composed of heterogeneous virtual components. Several formalisms are gathered, according to their applicability, in order to immediately propose a framework of formal specification and validation of applications for SoCs. The unification of these formalisms in a modeling language facilitates the work of the users while guaranteeing a strong semantics on all the levels of the specification.
High-throughput real-time systems require non-standard and costly hardware and software solutions. Modern workstations can represent a credible alternative for developing real-time intensive signal processing applications. Furthermore, the programming model of Kahn Process Networks (KPN) corresponds completely to this kind of application and fits multiprocessor systems perfectly. We present a new activation strategy for the processes of a KPN which considerably improves the existing techniques. This new activation order takes into account both deadlock detection and its resolution under time constraints. With this algorithm, we do not need to wait until the entire execution has deadlocked to remove the bottleneck. Moreover, we have an optimized memory allocation mechanism and we can bound the number of process context switches.
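For context, the "artificial deadlock" problem mentioned above arises when FIFOs are bounded to keep memory finite: a classic remedy, usually attributed to Parks and which the strategy above improves upon by acting before the whole network stalls, is to wait for a global deadlock and then enlarge the smallest full channel whose producer is blocked. The sketch below only illustrates that classic detection-and-resize step on invented channel data; it is not the algorithm of the paper.

# Toy sketch of the classic bounded-FIFO remedy for artificial deadlock:
# when every process is blocked and at least one is blocked on a *write*,
# enlarge the smallest full channel. The structures are illustrative only.

channels = {            # name -> (current fill, capacity)
    "a->b": (4, 4),     # full: its writer is blocked
    "b->c": (0, 8),     # empty: its reader is blocked
}
blocked_on_write = {"a->b"}   # channels whose producer cannot proceed

def resolve_artificial_deadlock(channels, blocked_on_write):
    full = [(cap, name) for name, (fill, cap) in channels.items()
            if name in blocked_on_write and fill == cap]
    if not full:
        return None            # a real deadlock: nothing can be done
    _, smallest = min(full)    # grow the smallest full channel
    fill, cap = channels[smallest]
    channels[smallest] = (fill, cap * 2)
    return smallest

print(resolve_artificial_deadlock(channels, blocked_on_write))  # 'a->b'
print(channels["a->b"])                                         # (4, 8)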
The development of embedded applications is very difficult. Several different languages are usually used to specify different parts of the application or of the hardware, and dealing with so many languages can be daunting. A separation of concerns (application, hardware architecture, the association between them, and the simulation or execution technologies) is key to the efficient co-design of embedded applications. The Model Driven Architecture can be used to better deal with the reuse of parts of the design and the interoperability between both the implementation technologies and the various simulation levels. We propose a construction of metamodels to support a co-design methodology. This construction is experimented with on the co-design of intensive signal processing applications to justify the adequacy of this methodology to usual industrial development techniques.
ARTiS is an asymmetric scheduling of real-time processes. ARTiS is a possible basis for the production of a real-time system within the framework of the ITEA HYADES project. This technical report accompanies the first modifications of the Linux 2.5.69 SMP kernel made in order to implement ARTiS. A model of ARTiS real-time processes compatible with the objectives of the HYADES project is presented. The first implementation elements are commented on and implementation perspectives are given.
Interconnection network performance is a key factor when constructing parallel computers. The choice of the interconnection network used in a parallel computer depends on a large number of performance factors which are very often application dependent. We propose a performance evaluation and comparison methodology. This methodology is applied to a new class of interconnection networks (the Over-Sized Delta networks) and to the Omega network, and will be used in future work in order to evaluate the use of multistage interconnection networks as an intercommunication medium in today's Symmetric Multiprocessors.
Interconnection network performance is a key factor when constructing parallel computers. The choice of the interconnection network used in a parallel computer depends on a large number of performance factors which are very often application dependent. We try in this paper to give the outlines of a performance evaluation and comparison methodology using what we consider to be the most important parameters when solving such a problem. This methodology is applied to a new interconnection network called the MCRB network and to the Omega network.
Complexity in digital systems integration arises from the heterogeneity of the components integrated in a chip. The simulation or code generation of such systems requires validating methodologies, platforms and technologies to support the integration, verification and programming of complex systems composed of heterogeneous virtual components. Several formalisms are needed, according to their applicability, in order to propose a formal specification framework. The unification of these formalisms makes it possible to visually model intensive signal processing applications for embedded systems. Part of this methodology comes from the Array-OL language. An application is represented by a graph of dependences between tasks and arrays. Thanks to the data-parallel paradigm, a task may iterate the same code on different patterns which tile the arrays it depends on. The visual notation we propose uses a UML 2.0 standard proposal, which allows existing UML 2.0 tools to be used to model an application. A UML profile dedicated to Intensive Signal Processing with a strong semantics allows automatic code generation and automatic mapping on SoC architectures for early validation at the highest level of specification.
The integration complexity of digital systems comes from the heterogeneity of the components integrated on a chip. Simulation or code generation for such systems requires the validation of methodologies, platforms and technologies to support the integration, verification and programming of complex systems composed of heterogeneous virtual components. Depending on their application domain, several formalisms are needed to propose a formal specification framework. The unification of these formalisms leads to the visual modeling of intensive signal processing applications for embedded systems. Part of this methodology comes from the Array-OL language. An application is represented there as a graph of dependences between tasks and arrays. Using the data-parallel paradigm, one can describe the repetition of the same task over different patterns tiling the arrays on which it depends. The visual notation we propose uses a UML 2.0 standard proposal. We can thus reuse UML 2.0 tools to model an application. We propose here a UML profile dedicated to intensive signal processing with a strong semantics, allowing automatic code generation or mapping onto SoC-type architectures for an early validation of the specifications.
Process networks are networks of sequential processes connected by channels behaving like FIFO queues. These are used in signal and image processing applications that need to run in bounded memory for infinitely long periods of time dealing with possibly infinite streams of data. This paper is about a distributed implementation of this computation model. We present the implementation of a distributed process network by using distributed FIFOs to build the distributed application. The platform used to support this is the CORBA middleware.
Process networks are sequential processes communicating only through channels behaving like FIFO queues. They are used to model signal or image processing applications that must run in bounded memory for potentially infinite periods of time and process data streams that are themselves potentially infinite. This article deals with the distributed implementation of this model of computation. We present a distributed implementation of process networks based on the use of distributed FIFO queues. The software platform used is the CORBA middleware.
We present a methodology to visually model intensive signal processing applications for embedded systems. This methodology is based on the Array-OL language. The idea is to represent an application as a graph of dependencies between tasks and arrays; it differs from the classical reactive programming or message-passing paradigms. A task may iterate the same code on different patterns tiling the arrays it depends on. In this case, visual specifications of the dependencies between the pattern elements are enough to define an application. The visual notation we propose uses the UML standard, which allows existing UML tools to be used to model an application. Moreover, the application model can be exported and saved in the standardized XMI format. The resulting application model can then be imported by other tools for automatic exploitation, such as validation, transformation or code generation.
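The tiling relation between a repeated task and its arrays is usually captured, in Array-OL-style specifications, by an origin vector, a paving matrix (which places the reference point of each repetition) and a fitting matrix (which places the pattern elements around that reference). The sketch below enumerates the array indices accessed by one repetition under those standard assumptions; the matrices and shapes are made up for the example and the exact notation may differ from the papers.

# Sketch of Array-OL-style tiling arithmetic: for repetition index q and
# pattern index p, the array element accessed is
#     (origin + Paving @ q + Fitting @ p) mod array_shape.
import numpy as np

origin        = np.array([0, 0])
paving        = np.array([[2, 0],           # repetition index q moves the tile
                          [0, 2]])
fitting       = np.array([[1, 0],           # pattern index p moves inside the tile
                          [0, 1]])
array_shape   = np.array([8, 8])
pattern_shape = (2, 2)

def tile_indices(q):
    ref = origin + paving @ np.array(q)
    return [tuple(int(v) for v in (ref + fitting @ np.array(p)) % array_shape)
            for p in np.ndindex(pattern_shape)]

print(tile_indices((0, 0)))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(tile_indices((1, 2)))   # [(2, 4), (2, 5), (3, 4), (3, 5)]

Specifying only these few matrices is what makes the visual notation compact: the full set of element-wise dependencies is derived from them.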
Complexity in the digital systems integration rises from the heterogeneity of the components integrated in a chip. The aim of the Sophocles project is to validate methodologies, platforms and technologies to support integration, verification and programming, over a distributed environment, of complex systems composed of heterogeneous VCs. Several formalisms are gathered, according to their applicability, in order to immediately propose a framework of formal specification and validation of applications for SoCs. The unification of these formalisms in a modeling language facilitates the work of the users while guaranteeing a strong semantics on all the levels of the specification.
The integration complexity of digital systems comes from the heterogeneity of the components integrated on a chip. The goal of the Sophocles project is to validate methodologies, platforms and technologies supporting the integration, verification and programming, in a distributed environment, of complex systems composed of heterogeneous virtual components. Several formalisms are grouped according to their application domain in order to immediately propose a framework for the formal specification and validation of applications on systems-on-chip. The unification of these formalisms within a modeling language facilitates the work of users while guaranteeing a strong semantics at all levels of the specification.
Process networks are a widely used model to describe highly concurrent applications. We present here a distributed implementation of a slightly restricted process network model realized using the CORBA middleware. This implementation allows non-specialists in computer science to easily program heterogeneous meta-applications based on an assembly of components communicating through FIFO queues.
Process networks are often used to describe highly concurrent applications. We present here a distributed implementation of a slight restriction of this model using the CORBA middleware. This implementation allows non-specialists in computer science to easily program meta-applications built as an assembly of components communicating through FIFO queues.
CacheFS is a VFS-compatible and fully distributed file system, with no notion of a server, in which the storage capacity of a node is managed as a cache of the global system. A first implementation of these principles has been realized under GNU/Linux.
Real-time intensive signal processing applications have traditionally been deployed on various custom platforms. Meanwhile, the enterprise computing market has spurred the advent of inexpensive and powerful systems based on widely available processors. Today, SMP systems associating a potentially large number of recent processors are deemed able to cope with the needs of the most demanding real-time applications. On the operating system side, GNU/Linux is gaining wider acceptance, and extending GNU/Linux to tackle real-time application scheduling is a common approach. We propose ARTiS, an asymmetric real-time scheduler for SMP systems. ARTiS ensures the possible preemption of a processor when the system has to schedule a real-time process. We have modified the GNU/Linux SMP scheduler to implement ARTiS. The evaluation of our approach shows significant improvements.
Signal processing (SP) applications, such as sonar processing chains, have quite particular algorithmic characteristics. In order to standardize the specification of these applications, TMS (Thomson Marconi Sonar) has developed an SP-oriented language: Array-OL (Array Oriented Language). It makes it possible to specify the computation algorithm and the data dependences without worrying about mapping or scheduling. We have focused on the compilation of applications specified in Array-OL, targeting traditional workstations (for simulation) as well as systems dedicated to Array-OL. Owing to the pre-existence of an execution support for Array-OL (software and hardware), we have preferred a compilation method that transforms applications at the language level (by introducing hierarchical levels) rather than a direct implementation strategy. In order to set up these transformations, we have used a formalism suited to the description of Array-OL: the ODT (Opérateurs de Distribution de Tableaux, Array Distribution Operators in English). ODTs let us formally describe the transformations, which consist in producing one or more hierarchies from a sequence of tasks and in controlling their granularity. Given the number of different schemas these transformations can generate, we have also defined measures to evaluate the effect of these transformations in order to guide their use. Finally, the Gaspard graphical environment makes these tools available to everyone, and in particular to the developers of SP applications, by allowing users to create, transform and compile Array-OL applications towards multiple platforms (sequential, SMP, distributed...) in a completely graphical and interactive way.
Signal processing (SP) applications, found notably in sonar processing chains, have quite particular algorithmic characteristics. In order to meet the specification and standardization needs of these applications, TMS (Thomson Marconi Sonar) has developed an SP-oriented language: Array-OL (Array Oriented Language). It makes it possible to specify the computation algorithm and the data dependences without worrying about mapping or scheduling issues. Our work is situated at the level of the compilation of applications specified in Array-OL, targeting traditional workstations (for simulation) as well as machines dedicated to Array-OL. The pre-existence of an Array-OL execution support (software and hardware) led us to prefer a compilation method based on transforming applications at the language level (introduction of hierarchical levels) rather than direct implementation strategies. To set up these transformations, we used a formalism suited to the description of the Array-OL language: the ODT (Opérateurs de Distribution de Tableaux, Array Distribution Operators). They allowed us to formally describe the transformations, which consist in producing one or more hierarchies from a sequence of tasks and in controlling their granularity. Given the number of different schemas these transformations can generate, we also defined measures to evaluate the effect of these transformations in order to guide their use. Finally, the Gaspard graphical environment makes these tools available to everyone, in particular to SP application developers, by allowing the creation, transformation and multi-platform compilation (sequential, SMP, distributed...) of Array-OL applications in a completely graphical and interactive way.
Array-OL, developed by Thomson Marconi Sonar, is a programming language dedicated to signal processing. An Array-OL program specifies the dependencies between the array elements produced and consumed by tasks. In particular, temporal dependencies may be specified by referencing elements that belong to an infinite dimension of an array. A basic compilation strategy of Array-OL on a workstation has been defined. This basic compilation does not allow the generation of efficient code for every Array-OL application, specifically those defining infinite arrays. We propose to transform such applications into hierarchical Array-OL applications that may be compiled with the basic Array-OL strategy. We introduce a formal representation of Array-OL applications as relations between points of Z^n spaces; code transformations are applied at this level. In this paper we show how the transformation process is used during the compilation phase of a representative application.
Gaspard is a visual programming environment devoted to the development and control of scientific parallel applications. The two paradigms of parallel programming (task and data parallelism) are mixed in Gaspard: a hierarchy of task graphs operates on array flows. These two levels are mixed in a common metaphor: an application is designed as a printed circuit, where the programmer specifies tasks as boards or chips and instantiates tasks by plugging them into slots. The number-crunching applications developed using Gaspard are deployed on metacomputing platforms. The visual specification of the application mapping may be dynamically modified at runtime according to the information provided by Gaspard.
Matrix manipulation programs are easily developed using a visual language. For signal processing, a graph of tasks operates on arrays. Each task iterates the same code on different patterns tiling these arrays. In this case, visual specifications of the dependencies between the pattern elements are enough to define an application. Building on the Array-OL language developed by Thomson Marconi Sonar, we propose a graphical environment, Gaspard, dedicated to the data-parallel paradigm. Only the elementary SPMD tasks are textual. A full environment has been implemented, including the graphical editor, a code transformer and a code generator for SMP computers.
Numerical modeling in electromagnetics makes it possible to reduce the development costs of a device by predicting its behavior on a computer. The quality of the prediction depends on the fineness of the model used. Nowadays, the complexity of devices often requires 3D models, which are large consumers of computing power and memory. One solution consists in using parallel architectures. We are interested here in the parallelization of a 3D research application, written in Fortran 77 and based on Whitney finite elements. It is highly evolutive and its maintenance is carried out by electrical engineers. We propose a high-level approach based on the idea of a compromise between efficiency requirements and software engineering quality criteria. The goal is to create a parallel application that is efficient and easily maintainable by non-specialists. The software engineering aspect is handled by the new Fortran 90 standard and by the data-parallel model embodied in its standard language, High Performance Fortran (HPF). In the case of irregular applications, the latter does not allow efficient management of communications. To make communications efficient, we use Halos, a high-level message-passing library specially designed for HPF applications. Two parallel versions have been developed. The first uses a parallelization that does not call into question the structure and fundamental algorithms of the code. The results obtained are quite acceptable for a small number of processors, but cannot be generalized to a larger number. The second is based on the Schur complement method, which corrects the weaknesses of the first at the price of a larger software investment. The results obtained reveal an application that is both efficient and easily maintainable by non-specialists.
The polyhedral model is quite popular in the field of parallel computing. Research prototypes therefore tend to use tools like PIP (a parametric integer programming solver), the PolyLib (a library for polyhedra manipulation) or Omega (a library and calculator for Presburger formulas). The two main drawbacks of these tools are a poor human-computer interface and a lack of aggressive simplification. The latter deficiency means that sequences of computations yield results that are too complex, or that cannot even be completed because of memory exhaustion or time constraints. The SPPoC calculator addresses these problems thanks to its interactive, fully symbolic interface and its advanced simplification modules. It also unifies the different tools. We present two applications that use SPPoC: a code generator and a communication volume estimator.
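As a generic illustration of the kind of symbolic, parametric result such tools are expected to return (our example, not taken from the paper): counting the integer points of a parametric triangle gives

\[
\#\{(i,j) \in \mathbb{Z}^2 \mid 0 \le i \le j < N\} = \frac{N(N+1)}{2},
\]

a closed-form polynomial in the parameter N; an aggressive simplifier should deliver exactly this form rather than nested case analyses or unevaluated unions.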
In the field of parallel computing, the polyhedral model is very often used. Research prototypes in this area therefore often use tools such as PIP (parametric resolution of linear programs), the PolyLib (a library for manipulating polyhedra) or Omega (a library and interface for manipulating Presburger formulas). The two main problems with these tools are their lack of user-friendliness and their overly primitive simplification modules. Because of this lack of simplification, chaining computations leads to results that are incomprehensible, or that cannot be obtained at all because of memory or time problems. The SPPoC calculator solves these problems thanks to its fully symbolic, interactive interface and to more advanced modules for simplifying results. It also makes it possible to unify the different tools. The presentation of SPPoC is illustrated by two applications: a code generator and a communication volume estimator.
Array-OL is a programming language dedicated to signal processing, developed by Thomson Marconi Sonar. An Array-OL program specifies the dependencies between array elements produced and consumed by tasks. In particular, temporal dependencies are specified by referencing elements that belong to an infinite dimension of an array. A basic compilation strategy of Array-OL on a workstation has been defined. This basic compilation does not allow the generation of efficient code for every Array-OL application, especially those defining infinite arrays. We propose to transform such Array-OL applications into hierarchical Array-OL applications that may be compiled with the basic strategy. We introduce a representation of Array-OL applications as relations between points of Z^n spaces; code transformations are applied at this level. The paper reports on the use of the transformation process in the compilation of a representative application.
The Gaspard (Graphical Array Specification for PARallel and Distributed computing) project is a visual specification environment for data parallelism. We describe here the specification model used in Gaspard, which inherits from the Array-OL model. We then define an SQL-inspired approach to intensive data processing that proposes a language for describing irregular components.
The Gaspard (Graphical Array Specification for PARallel and Distributed computing) project is a visual specification environment for data parallelism. We describe the specification model used in Gaspard, defined as an extension of the Array-OL model. We define here an SQL-inspired approach to intensive data processing that proposes a language for describing irregular components.
Parallel computers are difficult to program efficiently. We believe that a good way to help programmers write efficient programs is to provide them with tools that show them how their programs behave on a parallel computer. Data distribution is the major performance factor of data-parallel programs, and so automatic data layout for HPF programs has recently been studied by many researchers. The communication volume induced by a data distribution is a good estimator of the efficiency of that distribution. We present here a symbolic method to compute the communication volume generated by a given data distribution during the program-writing phase (before compilation). We stay machine-independent to ensure portability. Our goal is to help programmers understand the data movements their programs generate and thus find a good data distribution. Our method is based on parametric polyhedral computations and can be applied to a large class of regular codes.
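As a generic illustration of such a symbolic estimate (our example, assuming owner-computes and alignment of the left-hand side, not a result reproduced from the paper): for a shift reference A(i-1) over an array A(1:N) distributed BLOCK over P processors, only the P-1 block boundaries cause communication, so

\[
V_{\text{shift}}(N, P) = P - 1,
\]

independent of N, whereas a transpose-like reference on an N x N array distributed (BLOCK, *) moves

\[
V_{\text{transpose}}(N, P) = N^2 \, \frac{P-1}{P}
\]

elements. Both expressions are parametric in N and P, which is precisely the kind of machine-independent, before-compilation information a parametric polyhedral counting method can provide.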