Sunday 13 April 2008

The trouble with DSLs (Domain Specific Languages)

Why do we have domain specific languages? They are supposed to make our lives easier for a specific task. SQL is the classic example of this, but XPath and other DSLs exist too. SQL does an excellent job of what it was designed to do: support database access. Or does it?


SQL began life without an IF statement, assignment statements (:=) or variables, but PL/SQL and T-SQL added them to make it useful. The SQL spec has changed 6 times since 1986 to include new features that people thought it should have. Still this is not enough.


I think the concept of a DSL has a fundamental flaw. DSLs can't move with business requirements, and as such they need access to a broader language. This almost defeats their entire purpose.



The solution going on around the database world is to replace or integrate the DSL (SQL) with broad languages (Oracle and SQL Server now support .NET assemblies). The other solution is to add the useful features of the DSL into the non-DSL (LINQ kind of adds SQL to .NET). The point is that SQL, as the most well-known DSL, has failed to accommodate the business requirements of the developer's world. And yet it is hugely popular and very useful.
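To make the second approach concrete, here is a minimal sketch in Python rather than .NET (the language is chosen purely for illustration, and the `Order` class and sample data are hypothetical). It expresses a SQL-like filter and sort directly in a general-purpose language, which is roughly the idea behind LINQ, not LINQ itself.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record type standing in for a database row.
    customer: str
    amount: float


orders = [
    Order("alice", 120.0),
    Order("bob", 35.5),
    Order("carol", 60.0),
]

# Roughly: SELECT customer FROM orders WHERE amount > 50 ORDER BY amount
large_orders = sorted(
    (o for o in orders if o.amount > 50),
    key=lambda o: o.amount,
)
large_order_customers = [o.customer for o in large_orders]

print(large_order_customers)
```

Because the query is ordinary code, anything else the host language offers (functions, classes, libraries) can be used inside the filter or the sort key.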


The Wikipedia entry for domain specific languages makes a good comment. "A domain-specific language is somewhere between a tiny programming language and a scripting language, and is often used in a way analogous to a programming library." And herein lie some of the problems with DSLs.

  • DSLs are not libraries. They are not easily extensible. You can add to a library pretty easily. It's nigh on impossible for me to extend Oracle PL/SQL. The only people who can extend PL/SQL are Oracle, and they move like a glacier. (Does this mean they are speeding up of late with global warming? ;) )


  • DSLs by definition are limited in scope and usually don't have access to the entire operating system. This means they usually don't cover all of the areas that people want to use them for.


This is a big problem for the developer. Initially you get a problem and, let's say, you solve it with a DSL like SQL. It seems a good solution: the language is designed for the job and it's all a neat fit. But inevitably business requirements change and code needs to be refactored, and the change is one that's not really in the fundamental design of the DSL (i.e. you want to do some maths on data retrieved from a database). You ought to turf the DSL and go with a more flexible solution, but you can't. Time and effort have already been invested, there is no time, and the impact would not be limited to this one area of the code but would affect other areas of the system. You are forced politically to go with the existing solution and do some kind of workaround. The code ends up as a bit of spaghetti as a result and maintainability suffers. This goes on for a while and eventually the entire system becomes rigid and impossible to morph, prone to bugs and errors due to the enormous complexity.


The problem here was not the change to the system; that happens all the time. The problem was the choice of a DSL.


It is hard to refactor SQL and, as far as I know, you can't unit test SQL. While functions are supported, classes aren't, and using maths libraries in SQL is way out there.


Part of the problem is that SQL does not have access to a broad language like C#. LINQ is an attempt to solve this. The traditional boundary of DSLs not having access to the full operating system (which is by design) is a fatal flaw.
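As a rough sketch of what "access to a broad language" can look like in practice, the example below uses Python's standard `sqlite3` module, not the Oracle or SQL Server setups discussed above. It registers a host-language maths function so that it can be called from inside a SQL query. The table, the data and the function name `py_hypot` are all hypothetical and exist only for illustration.

```python
import math
import sqlite3

# In-memory database with a small, made-up table of points.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?, ?)",
    [("a", 3.0, 4.0), ("b", 6.0, 8.0)],
)

# Expose a function from the host language's maths library to SQL.
# "py_hypot" is an illustrative name, not a built-in SQL function.
conn.create_function("py_hypot", 2, math.hypot)

rows = conn.execute(
    "SELECT name, py_hypot(x, y) FROM points ORDER BY name"
).fetchall()
conn.close()

print(rows)
```

The query stays in SQL, but the maths comes from the broader language. This is the kind of escape hatch worth checking for before committing to a DSL.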


So my thought is to be wary when you choose a DSL that will not have access to a broad language. Ask yourself: will it work over the longer haul? Can you break out of it to do non-DSL-specific things? And do this before you are wedded to the code base it will create. DSLs have good bits but they are all flawed. In the end I now try to only use DSLs that have access to a broad language if I reasonably can (note the use of reasonably). Still, I like the fundamental structure of something like SQL.

7 comments:

Anonymous said...

I see that the requirement for language evolution and extensibility is true for both general-purpose and domain-specific languages. To me you seem to advocate embedded DSLs rather than external DSLs, although the line between them is not always so clear (e.g. when combining two or more (DSL) languages, or having different levels of access between languages).

I see that the “trouble with DSLs” is partly solved if you have the control. In that case SQL is not the best choice for an example. Instead, I noticed that you considered earlier building your own simple modeling language. In doing so you will also have all the control and power to extend and integrate the language with other languages. In that respect, I would suggest defining languages for a narrow area of interest - those occurring within your company only.

Tim Yen said...

You have a good point. If you do have control over the DSL then you can extend it and go on your merry way. This I find to be a rare case.

One of the other points raised last night in a discussion at work was that as soon as you go into a DSL you largely lose your tools: unit tests, refactoring, even IDE tools, unless the DSL is very common like SQL. I figure it's largely because if someone is going to write a tool, they aim it at the broadest market.

Thanks for your comment, you are the first serious feedback from someone I have never met. Much appreciated.

Pai Rico said...

SQL was well-suited for querying and manipulating relational data, but databases today are used for more than this - vendors want you to do everything in your database.

The vendor-driven and design-by-committee nature of the language is responsible for its changes.

If you take an example like Regular Expressions or HTML you see that a language can evolve without losing its DSL-ness.

cheers
http://fragmental.tw

Pai Rico said...

Oops, didn't check the 'Follow up comments' box.

Tim Yen said...

Wow, I've got a second intelligent comment. Thanks guys, it's obviously a hot topic. :)

Phillip, I agree HTML can evolve and does evolve without losing its DSL-ness. And it is a great language in the history of languages. It made the web what it is today.

The issue in this case is that HTML can't do everything on a web page, so JavaScript was invented. This gave coders a way out to a broader language. If HTML had been a broader language to begin with then we wouldn't have had JavaScript.
But the mere fact that we have plugins like Flash and Silverlight, as well as JavaScript, tells me other people have run into its limitations.

I can't really comment on regex as I've generally avoided it, as it's very hard to use when things get complex.

Pai Rico said...

Tim,

Although you use JavaScript and, say, XML to compose a modern web page, you still use HTML for the hypermedia markup - even when it is dynamically generated by JavaScript or is the result of an XSLT transformation - and that's what it is meant to do.

You use a GPL like JavaScript to extend the DSL into something that it can't handle and that's perfectly fine. A DSL is not supposed to handle everything.

I think that your point is that a DSL will not be enough to develop an application and that is completely true. When your DSL is not enough to handle the new needs you generally have two options:

1) Evolve the language, like the new standard is trying to do with HTML and all the new tags

2) Use an extension language, in this case JavaScript.

When you use an embedded/internal DSL neither is hard; when you have an external DSL the language runtime has to support those options.

cheers

http://fragmental.tw

Tim Yen said...

Phillip,

I agree wholeheartedly with your points.

I would add a third point: if option 2 is not available to you, then either don't use the DSL or consider very carefully before you do. Try to use a library instead, so you can extend it.