What is an embedded domain-specific language?

In recent years there has been growing interest in programming languages that are specialized for particular problem domains, so-called domain-specific languages (DSLs). A DSL provides syntax and semantics tailored to a specific class of problems. Because this syntax is closer to the problem domain than that of a general-purpose language (GPL), programs written in a DSL allow higher productivity for the end user. However, implementing a new programming language is costly, because language developers have to design and implement both its syntax and its semantics. Because of these high initial costs, many developers are reluctant to implement and use a new DSL.

One approach to reducing the initial costs of a DSL is to embed it in an existing language. The syntax and semantics of the embedded language are defined as a library in that language. The existing language acts as the host language, and the DSL is embedded in it as a guest language. Embedding allows guest languages to reuse the general-purpose constructs of the host language, which saves development costs. It also allows reuse of the existing language infrastructure, such as development tools, the parser, the compiler, and the virtual runtime environment. If necessary, new language constructs can be added by extending the corresponding libraries. A comparative study shows that embedding can significantly reduce the cost of implementing new languages compared to traditional non-embedded approaches.
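As an illustration of what "a DSL defined as a library" can look like, the following minimal sketch embeds a tiny filter language in Python. It is not taken from the thesis; the names Field, Predicate, and select are hypothetical and chosen only for this example.

```python
class Predicate:
    """A guest-language expression: a test over one record (hypothetical)."""

    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        # The host operator '&' is reused as the DSL's conjunction.
        return Predicate(lambda record: self.test(record) and other.test(record))

    def __call__(self, record):
        return self.test(record)


class Field:
    """A guest-language term that names a field of a record (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        # The host operator '>' builds a DSL predicate instead of a bool.
        return Predicate(lambda record: record[self.name] > value)

    def __lt__(self, value):
        return Predicate(lambda record: record[self.name] < value)


def select(records, predicate):
    """Evaluate a DSL predicate against a list of records."""
    return [record for record in records if predicate(record)]


age = Field("age")
people = [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 12}]

# A DSL "program": parsed and evaluated entirely by the host language.
adults_under_40 = select(people, (age > 18) & (age < 40))
```

The library supplies the DSL's vocabulary and meaning, while parsing, tooling, and execution come from Python itself. The DSL expression still has to be a valid Python expression, which is why the parentheses around each comparison are required.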

Despite these advantages, existing approaches to language embedding have important limitations in their support for language properties that non-embedded approaches normally provide. First and foremost, in most host languages in use, existing embedding approaches do not allow the guest language's own syntax to be designed freely: when a guest language is embedded in a host language, guest-language expressions must conform to the host language's syntax. Second, the semantics that a guest language can have is constrained by the given semantics of the host language. Third, there is no support for compositions of multiple DSLs whose syntax and semantics interact. These restrictions are mainly due to the limited adaptability of constructs in embedded languages and host languages.

To address these problems, this thesis proposes the Reflective Embedding Architecture, which enables guest languages to be embedded in reflective host languages. Reflective languages are programming languages that provide special constructs for changing the structure and behavior of programs while they are being evaluated. Using a reflective host language overcomes the limitations above, because embedded languages can be customized by analyzing and transforming their language libraries.
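To make the idea of changing program behavior during evaluation concrete, here is a small sketch using Python's built-in reflection (getattr and setattr). The Interpreter class and the add_logging helper are hypothetical; they only stand in for a language library that is adapted while the program runs and do not reproduce the mechanisms of the thesis.

```python
class Interpreter:
    """Hypothetical language library that evaluates the DSL keyword 'greet'."""

    def greet(self, name):
        return f"Hello, {name}"


def add_logging(cls, method_name, log):
    """Reflectively replace a method of cls with a version that records calls."""
    original = getattr(cls, method_name)  # inspect the library at run time

    def wrapper(self, *args):
        log.append((method_name, args))
        return original(self, *args)

    setattr(cls, method_name, wrapper)  # change the library at run time


log = []
interp = Interpreter()
before = interp.greet("Ada")            # original behavior, nothing logged

add_logging(Interpreter, "greet", log)  # adapt the library during evaluation
after = interp.greet("Ada")             # same result, but the call is now logged
```

The existing interp object picks up the new behavior without being recreated, because the change is applied to the class it looks its methods up on.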

The Reflective Embedding Architecture defines a new procedure for embedding DSLs and describes how the syntax and semantics of a new DSL can be embedded as a set of artifacts. Every language is an encapsulated component that defines a set of well-defined interfaces. The classes of the embedding implement these interfaces in order to define the execution semantics of the language component. The architecture also supports the embedding of language compositions, scoping strategies, and domain-specific analyses and transformations. A distinctive feature of the architecture is that every language automatically has a meta-level that developers can use to adapt the embedding. Adaptations at the meta-level have the following advantages:
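The separation between a language interface and the classes that implement its execution semantics can be sketched as follows. This is an assumption-laden illustration rather than the architecture's actual API: ShapeLanguage, AreaSemantics, and PrintSemantics are invented names, and the meta-level is not modeled here.

```python
from abc import ABC, abstractmethod
import math


class ShapeLanguage(ABC):
    """Language interface: declares the keywords of a hypothetical shape DSL."""

    @abstractmethod
    def circle(self, radius): ...

    @abstractmethod
    def square(self, side): ...

    @abstractmethod
    def both(self, left, right): ...


class AreaSemantics(ShapeLanguage):
    """One execution semantics: every expression evaluates to an area."""

    def circle(self, radius):
        return math.pi * radius ** 2

    def square(self, side):
        return side ** 2

    def both(self, left, right):
        return left + right


class PrintSemantics(ShapeLanguage):
    """Another execution semantics: every expression evaluates to text."""

    def circle(self, radius):
        return f"circle({radius})"

    def square(self, side):
        return f"square({side})"

    def both(self, left, right):
        return f"{left} + {right}"


def program(lang):
    """A DSL program written only against the language interface."""
    return lang.both(lang.circle(1), lang.square(2))


area = program(AreaSemantics())   # numeric result
text = program(PrintSemantics())  # textual result of the same program
```

Because the program depends only on the interface, the same DSL program can be evaluated under different semantics by swapping the implementing class.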

First, the Reflective Embedding Architecture supports concrete syntax for embedded languages. For each embedded language, developers can annotate the associated library with metadata that describes how DSL programs written in concrete syntax are converted into a corresponding executable form. By using a special form of agile grammars called island grammars, a grammar only has to specify those parts of the embedded language that actually differ from the grammar of the host language. Island grammars significantly reduce the initial costs compared to other embedding approaches, while context-free languages remain fully supported.
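The island idea can be approximated in a few lines: only the DSL-specific fragments ("islands") are described, and all surrounding host code ("water") is copied through unchanged. The sketch below uses a regular expression in place of a real island grammar and omits the library annotations described above, so it is a deliberate simplification; the quantity syntax and the to_meters function are hypothetical.

```python
import re

# Island: a quantity such as "3 km". Everything else is water and is not described.
ISLAND = re.compile(r"\b(\d+)\s+(km|m)\b")


def to_meters(value, unit):
    """Hypothetical library function that gives the island its meaning."""
    return value * (1000 if unit == "km" else 1)


def translate(source):
    """Rewrite each island into a host-language call; copy the water verbatim."""
    return ISLAND.sub(
        lambda match: f"to_meters({match.group(1)}, {match.group(2)!r})",
        source,
    )


dsl_program = "distance = 3 km + 250 m"
host_program = translate(dsl_program)  # now plain Python source

namespace = {"to_meters": to_meters}
exec(host_program, namespace)          # evaluated by the host language
```

Only the quantity notation had to be specified; the assignment and the addition are handled by the host language's own grammar.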

Second, the meta-level supports language compositions whose sub-languages interact in their syntax and semantics. On the one hand, the syntax of several sub-languages can be analyzed and composed into a composite language syntax. On the other hand, the embedded sub-languages can be adapted so that they interact with each other at well-defined points within their evaluation logic. By supporting language compositions with interactions, the architecture enables the embedding of special language abstractions that make it easier for end users to develop modular programs.

Third, the meta-level makes it possible to embed languages that interact with their host language, by using reflective mechanisms to influence the evaluation of the host language's constructs. DSLs with special modularization concepts can only be embedded if the evaluation of host-language constructs can be changed.

To evaluate the Reflective Embedding Architecture, the concepts proposed in this thesis are assessed both qualitatively and quantitatively. For the qualitative evaluation, the architecture's support for desirable language properties is assessed and compared with related work. For the quantitative evaluation, a language implementation based on the Reflective Embedding Architecture is compared with implementations of the same language that are based on embedded and non-embedded approaches from related work. The evaluation shows that the Reflective Embedding Architecture is the first to provide comprehensive support for the desirable properties of embedded languages, and that it incurs only moderately increased runtime costs.