Posts tagged ‘DSL’

Let's continue the agile development of a DSL for music notation with Groovy. If you remember, our fundamental concepts are Scores, Parts (for instruments), Phrases and Notes.

At any given time we are typically working on only one Score, Part and Phrase (indeed we might work on only a single Score during a session). So, we would like to have a concept of a default Score, Part and Phrase, and avoid referring to them explicitly (unless, of course, we want to change the default). For instance, instead of writing:

...
myScore = score(name:"Row Your Boat")
myPart = part(title: "Flute", instrument: FLUTE, channel: 0)
myPhrase = phrase(startTime: 0.0)
myPhrase.addNoteList pitchArray, rhythmArray

(pitchArray and rhythmArray are defined beforehand)
We want to write the much simpler:

...
score(name:"Row Your Boat")
part(title: "Flute", instrument: FLUTE, channel: 0)
phrase(startTime: 0.0)
addNoteList pitchArray, rhythmArray

All Score, Part and Phrase methods will implicitly refer to myScore, myPart and myPhrase. Note that you can still refer to them explicitly. Indeed this will be necessary, as most scores have more than one Part or Phrase.

In this first instalment (of 2) we will not deal with line 5 above. Part 1 is actually the bulk of the work. Breathe deeply, as this will be the tough part.

We will use Groovy ASTTransformations for this. The Groovy compiler allows us to attach code to it while it is working. We can manipulate the AST (Abstract Syntax Tree) of our code during most of the compilation stages. This means that we will need a separate program to attach to the compiler. So, if we step back we now have 3 artifacts:

  1. The code to do the AST transformation (called during compilation)
  2. The core DSL implementation (with all the other stuff except AST transforms)
  3. Your music scripts with your score

So we need kind of a sub-project to handle this as Groovy requires a separate jar with the AST transformation code. This separate jar will have to have a descriptor file in the META-INF/services directory called

org.codehaus.groovy.transform.ASTTransformation

That is the name of the file (big one eh?). Inside it should have only one line: the fully qualified name for the class implementing the transformation (SimpleTransformation in our case).

OK, now we need to develop SimpleTransformation. This is not a trivial bit of code, so I will splash it here and then go through it line by line (only dealing with Scores – Parts and Phrases are similar):

@GroovyASTTransformation(phase=CompilePhase.CONVERSION)
public class SimpleTransformation implements ASTTransformation {
 
  public void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
    BlockStatement sblock = sourceUnit.getAST()?.getStatementBlock()
    List stmts = sblock.getStatements()
    int numStmts = stmts.size()
    for (int i=0;i < numStmts ;i ++) {
      Class cls = stmts.get(i).getClass()
      if (cls == ExpressionStatement) { 
        Expression es = stmts.get(i).expression
 
        if (es.getClass() == MethodCallExpression) {
          String method = es.method.text
          if (method.equals("score")) {
              Expression e = transformBinary("myScore", es)
              stmts[i].setExpression(e)
          }
        }
      }
    }
  }
...

So

  • Lines 1-4 – Boilerplate of our class so that the Groovy compiler uses it. There is one important part here: the phase where the code will attach. For now I am attaching to the conversion phase, but this might change in the future (I would like to do some type analysis, but I do not even think that is possible with Groovy. If it is possible, then the analysis would have to be done at a later phase).
  • 5-6 – We get the statements of the script that we are compiling.
  • 7-8 – Here we iterate through all statements. Note the for and not an each/closure. I do this because I might want to change the statement list (like adding stuff at the end, e.g. prints). That is not so easy with each/closures.
  • 10 – We only look at expression statements. This means we ignore fors, ifs, switches, function definitions, … We are not going deep, just changing method calls at the top level of the code.
  • 13-17 – If it is a method call and its method name is score, then we apply our transformation (line 16) and replace the expression (line 17).

Our transformation is:

BinaryExpression transformBinary(String var, Expression expression) {
    BinaryExpression newExp = new BinaryExpression(
      new VariableExpression(var),new Token (100, "=", 1, 1), expression)
    return newExp
  }

OK, here the bulk of the work is done: We create a new BinaryExpression composed of a Variable (called myScore in our case - as per the code above), a Token and then we attach the old expression, as is. So score(name:"Row Your Boat") becomes myScore=score(name:"Row Your Boat").

Now, a confession. The 100 in the Token was a reverse engineering of an expression. I do not know where the table of options for token types is (If you know, please tell).
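One possible answer to the question above, offered as a sketch rather than as the author's method: the integer constants for token types appear to live in org.codehaus.groovy.syntax.Types, where ASSIGN has the value 100, and Token offers a newSymbol factory that takes such a constant. The helper name assignTo and the hand-built call expression are made up for this illustration, and the -1 positions are just placeholders for "no source position".

```groovy
import org.codehaus.groovy.ast.expr.*
import org.codehaus.groovy.syntax.Token
import org.codehaus.groovy.syntax.Types

// Same idea as transformBinary, but with a named token type
// instead of the literal 100 (assignTo is a hypothetical name).
BinaryExpression assignTo(String var, Expression expression) {
    new BinaryExpression(
        new VariableExpression(var),
        Token.newSymbol(Types.ASSIGN, -1, -1),   // -1, -1: no line/column info
        expression)
}

// Hand-built stand-in for the parsed call score(...), with no arguments
def call = new MethodCallExpression(
    new VariableExpression('this'),
    'score',
    new ArgumentListExpression())

def assignment = assignTo('myScore', call)
println assignment.operation.type
```

Using the named constant keeps the intent readable and avoids depending on a number found by reverse engineering.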

You will need a few imports to do the above, by the way

import org.codehaus.groovy.ast.*
import org.codehaus.groovy.ast.expr.*
import org.codehaus.groovy.ast.stmt.*
import org.codehaus.groovy.control.*
import org.codehaus.groovy.syntax.Token
import org.codehaus.groovy.transform.*

With all this you now create a jar that will have to be on the classpath of the Groovy compiler. So, this code will be used by the Groovy compiler to manipulate the AST.

Note that this code is pretty basic: It will not recurse through for/switch statements, will not go in closures, functions, etc. It will also only look at the first token in a method call expression. I will deal with this in time (not in the second part of this article). For now it is good for illustrative purposes and good for my personal needs.

Some final notes...

You can inspect an AST from groovyConsole (helps a lot), here is an example for sc=score(name:"Row Your Boat"):

AST viewing with groovyConsole
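The same inspection can also be sketched in code. The snippet below assumes that AstBuilder (org.codehaus.groovy.ast.builder) is on the classpath, which may require the groovy-astbuilder module in recent Groovy versions; it parses the example statement up to the conversion phase and looks at the pieces our transformation cares about.

```groovy
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.MethodCallExpression
import org.codehaus.groovy.control.CompilePhase

// Parse a one-line script and stop at the CONVERSION phase
def nodes = new AstBuilder().buildFromString(
    CompilePhase.CONVERSION, 'sc = score(name: "Row Your Boat")')

def block = nodes[0]                              // the script's statement block
def assignment = block.statements[0].expression   // the binary expression sc = ...

println assignment.leftExpression.name            // variable on the left
println assignment.operation.text                 // the operator token
println assignment.rightExpression.methodAsString // name of the called method
```

This is only a convenience for experimenting; the graphical AST browser in groovyConsole shows the same tree with less typing.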

Another point is that these kinds of transformations are a bit heavy, both in theory and in approach. For instance, it was difficult, in NetBeans, to set up a project architecture that would allow an easy build (an agile cycle of develop/build/test). This is of course because part of the code has to be hooked to the compiler, and IDEs are not normally used to doing that. It is a bit like compiling part of the compiler before going to the actual code. Well, I finally switched to emacs+gradle. Any excuse to stop using Oracle software (which NetBeans nowadays is) is fair game for me.

In the second instalment we will trap method calls like addNoteList so that addNoteList listOfNotes becomes myPhrase.addNoteList listOfNotes (like line 5 in the initial example above). In this case we will use some introspection to determine the method names of Score, Part and Phrase. The second part will be cooler, as the bulk of the boilerplate work was done here.

You can find the code in launchpad. Note that this is still in early stages.

Comments and improvements will be most appreciated!

I am starting to play (pun intended) with jMusic. I am just learning the basics of music composition. jMusic is quite cool, but having the usual Java overhead makes things oh so boring! Therefore I am starting to develop a DSL in Groovy to write some scores. Score is the first class that was DSLed. Something of a trivial nature. Just replacing this

score = new Score("My new Score")

with:

score = score(name: "My new Score")

and the same for Note and Phrase. Furthermore I would like to avoid all the usual “import everything”.

Solution? Create a class with a static method that accepts an environment to which I add the necessary functions. So, in an external file I have class Music to do just this:

class Music implements JMC {
  static init(env){
 
    env.score = { Map args ->
      Score score = new Score(args["name"])
      return score
    }
 
    ProgramChanges.fields.each {env."$it.name"=ProgramChanges."$it.name"}
    Durations.fields.each {env."$it.name"=Durations."$it.name"}
    Pitches.fields.each {env."$it.name"=Pitches."$it.name"}
 
    ...

So, in lines 4-7 I am creating a new property holding a closure. The closure accepts a map of parameters and creates a new Score object, which is returned. Similar things exist for Note and Phrase.

Now look at lines 9-11. I am importing all fields of those classes into the environment namespace. “That is namespace pollution”, I hear you say. Well, maybe, but it happens to be a jMusic design philosophy (you will find it in many classes of jMusic), and, at least for now, it is pretty manageable. If it becomes problematic, this can always be changed. Writing CLARINET sounds better than writing ProgramChanges.CLARINET or even creating an INSTRUMENT property/class in the environment to hold all instruments. This is particularly useful with Pitches and Durations (because we tend to write A LOT of these). A simple script looks like this for now:

Music.init(this)
score = score(name:"Row Your Boat")
flute = part(title: "Flute", instrument: FLUTE, channel: 0)
trumpet = part(title: "Trumpet", instrument: TRUMPET, channel: 1)
clarinet = part(title: "Clarinet", instrument: CLARINET, channel: 2)
int[] pitchArray = [C4,C4,C4,D4,E4,E4,D4,E4,F4,G4,
		    C5,C5,C5,G4,G4,G4,E4,E4,E4,
		    C4,C4,C4,G4,F4,E4,D4,C4]
double[] rhythmArray = [ C, C,CT,QT, C,CT,QT,CT,QT, M,
			QT,QT,QT,QT,QT,QT,QT,QT,QT,QT,
			QT,QT,CT,QT,CT,QT, M]
 
phrase1 = phrase(startTime: 0.0)
phrase1.addNoteList pitchArray, rhythmArray
...

Nothing particularly fantastic, but somewhat less clutter.
This is version 0.0.0.0.0.1 pre-pre-pre-alpha. ;)
Watch this space for newer versions (more useful).
For now this serves to show two very basic DSL techniques with Groovy: adding methods to the environment and inspecting classes to copy fields.
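The two techniques can be tried without jMusic. The sketch below is self-contained and uses made-up stand-ins: Instruments replaces a constants holder such as ProgramChanges (the numbers are illustrative), MiniMusic replaces Music, and the "score" closure returns a plain string instead of a real Score. A Binding plays the role of the script environment.

```groovy
import java.lang.reflect.Modifier

// Hypothetical stand-in for a jMusic constants holder such as ProgramChanges
class Instruments {
    public static final int FLUTE = 73
    public static final int CLARINET = 71
}

class MiniMusic {
    // env may be a Script or a Binding: both accept property assignment
    static void init(env) {
        // Technique 1: add a "method" by storing a closure under a name
        env.score = { Map args -> "Score(${args.name})".toString() }

        // Technique 2: inspect the class and copy its constants into env
        def constants = Instruments.fields.findAll {
            Modifier.isStatic(it.modifiers) && Modifier.isFinal(it.modifiers)
        }
        constants.each { env."${it.name}" = it.get(null) }
    }
}

def env = new Binding()
MiniMusic.init(env)

// A script evaluated against env sees score(...) and the constants directly
def result = new GroovyShell(env).evaluate(
    '[score(name: "Row Your Boat"), FLUTE, CLARINET]')
println result
```

Filtering on static final fields is a choice made for this sketch, so that only constants end up in the environment; the original simply copies every public field.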

When you read programming language comparisons, the main narrative is normally about the paradigm(s) supported. Lisp, Haskell, Scala, Clojure fall mainly in the functional realm. Prolog is logic. Smalltalk OO. C and Fortran, imperative. Most of them are not “pure” in paradigm (e.g. you can write nicely designed OO programs in C – just check GTK’s GLib library if you disagree – or do imperative coding in Prolog, and so on…), but that is beside the point.

The point is that, when comparing programming languages, the main issue of discussion is the bloody paradigm thing.

Paradigm is not really that important! In fact, as said above, you can normally tweak a language to write in your favorite paradigm. Sure, the ability to do that varies from case to case, but in most cases that I can think of it is really not difficult to cross paradigm boundaries. In fact, I would go as far as to argue that it is easier to do proper OO design with C using GLib than with the highly complex and convoluted C++.

Before going into the fundamental point that I want to make, I would also note that ecology matters: Are there good libraries? Good documentation? Does it run on a virtual machine? Portability? Nice community? User base? That is, when comparing programming languages, all that is around the language is more important than the language itself. Just ask all of us poor Prolog/Lisp/Haskell fans why we are doing Java/C++ during most of our day. It puts bread on the table and, for most of us, that is the most important criterion (I prefer not to starve!).

But, going to the main point here, I would like to propose that one of the fundamental points in comparing programming languages from a technical standpoint is homoiconicity.

As a reminder, a homoiconic language is a language where the program itself is represented in one of the core data types of the language. Code is a data type.

If you classify languages according to homoiconicity, then they split in completely different ways:

  1. The homoiconic bunch: Lisp, Prolog, Ioke, Clojure, …
  2. The non-homoiconic bunch: Cobol, Fortran, C, Java, Groovy, Scala, Haskell, OCaml, [A very long list follows]…

From this point of view, the comparison of say, Clojure to Scala as sister-languages makes little sense, as they fall in different groups.

Homoiconic languages lend themselves to – by construction – metaprogramming and extensibility (think very easy embedded DSLs). And some of these features are difficult (with varying levels of difficulty) to implement in non-homoiconic languages. At best (as “best” I am thinking of some scripting languages like Python), they are awkward to do in a non homoiconic language.

As a side jab, last time I checked, Scala was very, very poor on metaprogramming (has that changed?), making it the only “modern” language which seems to be scorning metaprogramming. Scala can still be DSL-extensible (I offer my own example, in both Scala and Groovy: Ronald: A Domain-Specific Language to study the interactions between malaria infections and drug treatments).

One could argue of the value of doing programs that reason about themselves (and that idea has very bad karma coming from assembler – an idea so old and so disconnected from current reality that I am not even going to discuss it). I am surely on the side that proper metaprogramming is one of the core features of any elegant, productive and declarative solution.

Also, a very nice side effect of having code as data is that the syntax of homoiconic languages is normally very, very simple (as in trivial to learn). This is just a side effect, but compare it with the learning curve of, say, C++ syntax. There is also a philosophical issue here: you get a simple, highly flexible environment, where complexity is tackled not by having a complex mammoth that tries to address all possible cases, but by a set of plastic, bendable building blocks.

Homoiconicity is not a black-and-white feature. For instance, Lisp macros are not first-class objects (I am a Clojure newbie, so feel free to correct me), so you cannot metaprogram with them. Prolog seems to come close. In fact, to a Prolog programmer, Lisp macros seem especially inelegant as they are “out of the system”.

I am doing some development in Clojure (a Lisp type language for the JVM). Lisp as in a clone tailored for the JVM, not Lisp as only “functional programming”. I note, by the way, that more than functional programming, Lisp is an homoiconic language.

I developed a simple system to specify Swing menus in Clojure. Here is an example:

Simple Menu


The following “micro-language” was developed to specify this:

 (getMenuBar actionManager '(
    (menu {
      :text "Project" :key "P"
      :content (
        (item {:text "New" :key "N"})
        (item {:text "Open" :key "O"  })
        (item {:text "Close" :key "O" :id "Close" :enabled false})
        (item {:text "Recent" :key "R"})
        (separator)
        (item {:text "Exit" :key "E"})
      )
    })
    (menu {
      :text "Options" :key "O"
      :content (
        (item {:text "Rendering" :key "R"})
      )
    })
))

The code is very easy to read, I hope: two menus, each with a few menu entries that have a text, the ability to be enabled/disabled and an accelerator key, plus a separator.

Notice the actionManager on top: it is the (very simple) event processing function, which receives only a text as parameter (to identify the selection). The text is simply the menu text or, if specified, an id. Not the most general solution, but enough for simple menu structures.

The code? Below is the _complete_ implementation.

 
(ns org.tiago.swing
  ;(:require clojure.contrib.def)
  (:use
    [clojure.contrib.seq-utils :only (flatten)]
    [clojure.contrib.def :only (defnk)]
  )
  (:import
    (java.awt.event ActionListener KeyEvent)
    (javax.swing JFrame JMenu JMenuBar JMenuItem)
  )
)
 
(defnk createFrame [title :menuBar nil]
  (def frame (new JFrame title))
  (. frame setDefaultCloseOperation (. JFrame EXIT_ON_CLOSE))
  (if menuBar (. frame setJMenuBar menuBar))
  (. frame pack)
  (. frame setVisible true)
  frame
)
 
(defmulti addMItem (fn [manager x & rst] (first x)))
(defmethod addMItem 'item [manager content menu]
  (let [params (second content)]
    (def mItem (new JMenuItem (:text params)))
    (if (contains? params :id) (. mItem putClientProperty "id" (:id params)))
    (if (contains? params :key) (. mItem setMnemonic (. (:key params) charAt 0)))
    (. menu add mItem)
    (. mItem addActionListener manager)
 
  )
)
(defmethod addMItem 'separator [manager sep menu]
  (. menu addSeparator)
)
 
(defmulti getMBItem first)
(defmethod getMBItem 'menu [desc]
  (let [params (second desc) manager (last desc)]
    (def menu (new JMenu (:text params)))
    ;Assuming mnemonic is ASCII CODE.
    ;java7 has . KeyEvent getExtendedKeyCodeForChar
    (if (contains? params :key) (. menu setMnemonic (. (:key params) charAt 0)))
    (if (contains? params :id) (. menu putClientProperty "id" (:id params)))
    (dorun (map #(addMItem manager % menu) (:content params)))
    menu
  )
)
(defmethod getMBItem :default [arg] (new JMenu "UNK"))
 
(defn getMenuBar [actionManager menuItems]
  (let [manager (
      proxy [ActionListener]
      []
      (actionPerformed [e] (let [obj (.getSource e)
                                 id (.getClientProperty obj "id")]
        (actionManager (if (nil? id) (. obj getText) id))
      ))
   )]
   (def menuBar (new JMenuBar))
   (dorun (map #(. menuBar add %)
            (map #(getMBItem (concat % (cons manager ()))) menuItems)))
    menuBar
  )
)

OK, comments have to be added ;) .
From a declarative point of view, not bad at all.

My first Lisp program. It completely baffles me that, after 25 years of programming with all the languages imaginable (including some functional ones like Caml and highly declarative ones like Prolog), I had never tried Lisp.

If you search the web you can find some discussions on whether IDEs for dynamic languages can be as helpful as IDEs for static languages. The issue is that static languages like Java have compile-time (thus easy to get at IDE-time) information with which to provide that fundamental code-completion functionality (among many others). If the IDE knows that a certain parameter is a String, then it is simple: it will present to you all the String methods when you type in the dot. For dynamic languages things get more complex, as there is formally (by definition) no compile-time information. Some people would argue that there are ways around it (which you can already find in existing IDEs; I remember having some sort of code completion, years ago, in SPE, for Python). I will not add anything to that discussion here; this preamble was mainly to put the reader in context. I am more interested in discussing good IDEs for DSLs.

With DSLs you get, most of the time, added syntax. Worse than that, you might fall into situations where you have changed (not only added to) the initial language syntax; furthermore, those syntax changes might even become valid only at runtime (imagine that a method is added to a class that is supplying DSL methods).

One example comes from Ioke and Prolog, whose operator precedence and associativity rules are changeable (see the previous post). It is not trivial to know if something like 1+2 is even syntactically valid (*). Even if it is syntactically valid, things like associativity rules might change. In languages like Groovy you can add methods (e.g., through categories) to code blocks (from classes that can be dynamically changed). Then there is dynamic dispatching and macros. What is valid in a certain piece of code can be different from what is valid a few lines below. In fact, complete information about what is valid in a certain code block might require code execution. Or, to put it another way, it might be very difficult to have a completely helpful IDE! In this scenario there are 3 considerations that I think are worth making:

1. One should not be discouraged by not having perfect solutions. Maybe it is not possible to determine all that can be expressed in a certain code block, but sometimes good approximations are enough.
2. On this issue, one good example comes from Prolog: in Prolog, syntax can be changed mainly through the use of the :- op directive (and through asserts and retracts). The :- op directive changes operators but is very easy to analyze before compilation/interpretation. So, the way DSLs are normally constructed there lends itself very easily to code analysis, which can be used by IDEs. This is unfortunately not the case in most real-world languages.
3. It would be cool to have a language where DSL specifications could be automatically used to construct IDEs. The current real-world DSL-able languages (Ruby, Groovy, …) are DSL-enabled through indirect techniques which can be used to build DSLs (dynamic reception, operator overloading, whatever); in fact, many of these techniques exist for purposes other than creating DSLs. If there were a declarative and explicit way to create DSLs, that information could be used to inform IDEs on parsing and other issues. An embedded, core way to explicitly specify DSLs.

(*) I suppose some will see this as an argument for the fact that you can do pretty stupid (or at least unintuitive) things with DSLs. Well, you can do stupid things with everything. The question is not if you can or not, but the extent of bad use cases and how bad uses can creep in easily. Another (interesting) discussion, but not for now.

I was reading Ola Bini’s post about operators in Ioke (Ioke being the new language that Ola is developing).

It is a common saying around LISPers that everything that is being done in “modern” languages is a return to LISP. And the argument holds some ground. The truth is, among the 4 most conceptually influential programming languages that I can think of (Lisp/Functional, Fortran/Imperative, Smalltalk/OO, Prolog/Logic), the bad option (Fortran) won as it is the major philosophical contributor to current programming languages (much more than Smalltalk).

Take the reinvention of operators in Ioke as per the post above. This concept has been available in Prolog for decades. It is all there: precedence (i.e. 2*3+4 means (2*3)+4 and not 2*(3+4)) and associativity (left or right, i.e. 3-2-1 is 0, from (3-2)-1, and not 2, from 3-(2-1)). And even more, as new operators can be defined and can be made of alphanumeric characters (want to create a new operator called, say, “in”? Go ahead). In fact people were doing DSLs a long time ago (in the small Prolog community at least) using techniques such as these.
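To make the idea of a changeable operator table concrete, here is a minimal Groovy sketch. It is not how Prolog or Ioke implement it; it is a small precedence-climbing parser over a flat token list, with a table (the names ops, prec and assoc are made up for this example) that can be extended at run time. Note one assumption that differs from Prolog: here a higher number binds tighter, whereas in Prolog a lower priority number binds tighter.

```groovy
// Operator table: name -> precedence and associativity (higher binds tighter)
def ops = [
    '+': [prec: 1, assoc: 'left'],
    '-': [prec: 1, assoc: 'left'],
    '*': [prec: 2, assoc: 'left'],
]

// Precedence climbing; returns a nested list [op, left, right].
// Nothing is evaluated: the result is just a tree.
def parse
parse = { List tokens, int minPrec = 0 ->
    def left = tokens.removeAt(0)
    while (tokens && ops[tokens[0]] && ops[tokens[0]].prec >= minPrec) {
        def op = tokens.removeAt(0)
        def info = ops[op]
        // a left-associative operator must not absorb another one of equal precedence on its right
        int next = info.assoc == 'left' ? info.prec + 1 : info.prec
        def right = parse(tokens, next)
        left = [op, left, right]
    }
    left
}

def precedenceTree = parse([2, '*', 3, '+', 4])
def associativityTree = parse([3, '-', 2, '-', 1])

// Define a new alphanumeric operator at run time, then use it
ops['in'] = [prec: 0, assoc: 'right']
def customTree = parse(['x', 'in', 'a', '+', 'b'])

println precedenceTree
println associativityTree
println customTree
```

Because the table is plain data, a tool could read it before running the program, which is the property that makes this style friendly to analysis.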

The next thing that you will need (and we are getting there with macros and AST access) is no default interpretation. This is especially important with arithmetic, let me give an example:

Imagine the expression 1+x. Most languages will evaluate this expression and return the sum of 1 and x. If x is defined and is, say, 4, then 1+x is 5. If x is not defined then a compile-time or run-time error will be raised. This is an absolute disgrace for DSLs, which are essentially declarative (i.e., detached from semantics). “1+x” might be something that you want to evaluate now (and get the result) or might be something that you want to specify in order to evaluate later (say, I want to do a chart of all values of x between 1 and 5, or I want to differentiate). Look at this pseudo-code:

Var x
Exp expression = 1 + x**2
 
chart(expression, [[x,[1, 5]]]) //do a chart, x between 1 and 5
evaluate(expression, [[x,3]]) //Evaluate expression where x is 3 (i.e.  10)
diffe = differentiate(expression, x) //returns the expression 2*x
prettyprint(expression) //Pretty prints the expression.

Most people automatically associate the operation evaluate with 1+x**2. That might be so in an imperative world (can I call it shitty world?). But in a declarative/DSL world 1+x**2 is just that, an expression; it has no meaning attached per se. What you do with it depends on the context. Pretty print it, differentiate it, integrate it, or even evaluate it by instantiating x to 3 and getting the “precious” 10.
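A rough Groovy approximation of the pseudo-code is possible with operator overloading, as sketched below. All class names (Exp, Num, Var, Add, Mul, Pow) are invented for this illustration, the derivative is not simplified, and the expression is written as x**2 + 1 rather than 1 + x**2 because a plain Groovy number on the left-hand side does not know how to add one of these objects.

```groovy
// Expression tree: building it evaluates nothing
abstract class Exp {
    abstract double evaluate(Map env)
    abstract Exp differentiate(String name)
    Exp plus(Exp other)   { new Add(left: this, right: other) }
    Exp plus(Number n)    { new Add(left: this, right: new Num(value: n)) }
    Exp power(Integer n)  { new Pow(base: this, exponent: n) }
}
class Num extends Exp {
    Number value
    double evaluate(Map env) { value.doubleValue() }
    Exp differentiate(String name) { new Num(value: 0) }
    String toString() { "${value}" }
}
class Var extends Exp {
    String name
    double evaluate(Map env) { env[name] as double }
    Exp differentiate(String other) { new Num(value: other == name ? 1 : 0) }
    String toString() { name }
}
class Add extends Exp {
    Exp left, right
    double evaluate(Map env) { left.evaluate(env) + right.evaluate(env) }
    Exp differentiate(String name) {
        new Add(left: left.differentiate(name), right: right.differentiate(name))
    }
    String toString() { "(${left} + ${right})" }
}
class Mul extends Exp {
    Exp left, right
    double evaluate(Map env) { left.evaluate(env) * right.evaluate(env) }
    Exp differentiate(String name) {   // product rule
        new Add(left: new Mul(left: left.differentiate(name), right: right),
                right: new Mul(left: left, right: right.differentiate(name)))
    }
    String toString() { "(${left} * ${right})" }
}
class Pow extends Exp {
    Exp base
    int exponent
    double evaluate(Map env) { Math.pow(base.evaluate(env), exponent) }
    Exp differentiate(String name) {   // power rule combined with the chain rule
        new Mul(left: new Mul(left: new Num(value: exponent),
                              right: new Pow(base: base, exponent: exponent - 1)),
                right: base.differentiate(name))
    }
    String toString() { "${base}**${exponent}" }
}

def x = new Var(name: 'x')
def expression = x**2 + 1                     // just a tree, no arithmetic yet
def derivative = expression.differentiate('x')

println expression                            // pretty print
println expression.evaluate(x: 3)             // evaluate with x bound to 3
println derivative.evaluate(x: 3)             // evaluate the derivative at x = 3
```

The same tree serves printing, evaluation and differentiation, which is the point of the paragraph above: the meaning is chosen by whoever consumes the expression.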

Update: I was rereading the post and noticed that it might be read as seeing Ola’s work as less interesting. Not at all: I actually think the way forward is precisely improving the current “imperative” setting in the way Ola is doing.

Preamble: In order to understand this post you should know a little bit (a little is enough, that is how much I know) about ExpandoMetaClass and Categories in Groovy.

DSLs that involve existing classes might be a source of long term sorrow. Let me give an example: Imagine that you want to make a small DSL to handle equations, like

x = new Symbol("x")
(2 * x).differentiate(x) //Result is 2

The problem is that the * operator of Numbers doesn’t know how to handle Symbols, therefore an exception would be raised. The obvious solutions as discussed before on mailing lists and blog posts are:

Categories

Categories would solve the problem, but at the expense of polluting the source with things like

use (Something.Category) {
  //code here
}

Not a disaster, but not pretty either…
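For concreteness, a category for the 2 * x case might look like the sketch below. Symbol here takes a named argument rather than the constructor used above, and Product and SymbolCategory are hypothetical names introduced only for this example.

```groovy
class Symbol {
    String name
    String toString() { name }
}
class Product {
    Number factor
    Symbol symbol
    String toString() { "${factor}*${symbol}" }
}
class SymbolCategory {
    // Teaches Number * Symbol, but only inside a use(SymbolCategory) block
    static Product multiply(Number self, Symbol s) {
        new Product(factor: self, symbol: s)
    }
}

def x = new Symbol(name: 'x')

// Inside the block the extra multiply method is visible
def inside = use(SymbolCategory) { (2 * x).toString() }

// Outside the block Number is untouched, so the same expression fails
def failedOutside = false
try {
    2 * x
} catch (MissingMethodException e) {
    failedOutside = true
}
println inside
println failedOutside
```

The change is scoped to the block, which is what keeps Numbers safe for everybody else, and also why the use wrapper cannot be hidden from the DSL user.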

Talking about disasters…

Expando over Numbers

The idea here would be to change the behavior of Numbers to be able to handle Symbols. Code would be very clean, no need for uses…

As somebody said on the Groovy mailing list: this is a disaster in the making. The problem is that I change Numbers, then, for another valid reason, you change Numbers, somebody else also changes Numbers… This is chaos. Or at least it would make code from different sources potentially not interoperable, or exhibiting very strange, buggy behavior. This is clearly akin to the “global variable” problem. I believe that in the long term and with big software projects, this approach is a dead end.

Enter Python

Python actually has a workaround (I will not call it a clear, beautiful solution) that might be somewhat useful here. Imagine that you do

1 + x

The default 1 (an instance of the default number class) is not able to handle the symbol. For Python that is OK: it will then try to call a “right add” method of x (search for __radd__ in this page). So, the default behavior is not to raise an exception as soon as the left object cannot handle the operator, but to try to call the “right” version on the right object (and raise only if that fails too).

Not perfect, but might be just enough to avoid Expando in anger.

I do believe that people still don’t appreciate the consequences of Expanding core classes and the interop disaster that that can entail.