
Productivity gains are permanent, Performance losses are temporary

This is the first in a short series of notes about language design. The goal is to identify what is truly important about programming language design.

It is all about productivity


There are 'too many moving parts' in a typical Java program.

Here is a simple straw-man example: iterating over a collection. Before Java 5, the paradigmatic way of iterating over a collection was to use a fragment like:

for (Iterator it = coll.iterator(); it.hasNext();) {
    SomeType el = (SomeType) it.next();
    ...
}

There is a lot of 'clutter' here; worse, several of the concepts on display (the iterator object, the hasNext test, the cast) are not really important to the problem at hand.

Compare that to the Java 5 version:

for (SomeType el : coll) {
    ...
}

Here we can focus on exactly what is needed: the collection and the iteration.
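
To make the comparison concrete, here is a minimal runnable sketch of both styles side by side. The class name, the method names and the choice of String for SomeType are illustrative assumptions, not part of the original fragments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; every name here is an illustrative assumption.
final class IterationDemo {

    // Pre-Java 5 style: a raw Iterator plus an explicit cast on every element.
    @SuppressWarnings("rawtypes")
    static int totalLengthOldStyle(List coll) {
        int total = 0;
        for (Iterator it = coll.iterator(); it.hasNext();) {
            String el = (String) it.next();
            total += el.length();
        }
        return total;
    }

    // Java 5 style: only the collection and the current element are visible.
    static int totalLengthNewStyle(List<String> coll) {
        int total = 0;
        for (String el : coll) {
            total += el.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("alpha", "beta", "gamma"));
        System.out.println(totalLengthOldStyle(words)); // prints 14
        System.out.println(totalLengthNewStyle(words)); // prints 14
    }
}
```

Both methods compute the same result; the second simply has fewer moving parts to read, write and get wrong.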

This kind of example repeats fractally up the entire stack of abstractions: not just at this micro level, but also at the level of the standard JDK class libraries, and on through to frameworks such as Spring, Hibernate or Wicket.

The elements of productivity


I believe that the list of requirements is not that long:

Digestibility


A programming language is an artifact, and it needs to be understood in order to be used effectively. So a language with a well-crafted structure is going to be easier to learn and use than one with many ill-fitting pieces.

Note that this does not mean a small number of pieces (or features). The classic example of this is natural languages (such as English or Spanish). There are at least 500,000 words in the English language, which makes for a daunting task for someone wishing to learn it. But the grammar of English is relatively simple compared to that of other languages (such as German or modern Greek). This has helped English to become the dominant second language globally.

Semantic Lifting


Structure in software is evident at many levels: from the microstructure of individual fragments of code, to the organization of libraries, to the ecosystem of multiple applications.

A powerful tool for managing this complexity is abstraction. Abstraction is commonly thought of as a technique that enables one to ignore inessential details.

But a better concept might be semantic lifting. Semantic lifting is a common technique in mathematics. For example, vector geometry is layered over Cartesian geometry, but it permits powerful statements to be expressed in a simple way.

A programming language that supports similar techniques for lifting the language itself, as in the iteration example, enables higher productivity.
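
As a rough illustration of the vector-over-Cartesian analogy, the sketch below states the midpoint of two points twice: once coordinate by coordinate, and once in terms of whole vectors. The Vec2 class and its methods are hypothetical names invented for this example.

```java
// Hypothetical 2-D vector type; all names are illustrative assumptions.
final class Vec2 {
    final double x;
    final double y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    Vec2 plus(Vec2 other) {
        return new Vec2(x + other.x, y + other.y);
    }

    Vec2 scale(double k) {
        return new Vec2(k * x, k * y);
    }

    // Cartesian level: the midpoint spelled out one coordinate at a time.
    static double[] midpointCartesian(double ax, double ay, double bx, double by) {
        return new double[] { (ax + bx) / 2.0, (ay + by) / 2.0 };
    }

    // Lifted level: the same statement expressed over whole vectors.
    static Vec2 midpoint(Vec2 a, Vec2 b) {
        return a.plus(b).scale(0.5);
    }

    public static void main(String[] args) {
        Vec2 m = midpoint(new Vec2(0, 0), new Vec2(4, 2));
        System.out.println(m.x + ", " + m.y); // prints 2.0, 1.0
    }
}
```

The lifted version would read the same in three dimensions, whereas the coordinate version would need another pair of parameters and another expression.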

Late Binding


Programming languages are often quite fussy in character: requiring all kinds of details to be established quite early in the design process. For example, Java (like most languages) requires that all types have names.

This emphasis on detail creates a significant barrier to developing a program: the programmer is forced to focus on issues that she may not yet be ready or willing to address.

A language that supports late binding allows programmers to delay such choices until they are needed.

For example, one reason that type inference is so powerful is that it permits a programmer to program in a 'type-free' way while knowing that the compiler will verify the type safety of the program. Establishing types of functions is a detail that can often be left to later. However, ultimately, type inference still ensures that the program is 'correct'.
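
A small taste of this is available in Java itself, assuming Java 10 or later, through local variable type inference with var. The sketch below is only an approximation of the idea: Java's inference is limited to local variables and lambdas, and method signatures must still be written out. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; requires Java 10+ for 'var'.
final class InferenceDemo {

    static Map<String, List<Integer>> groupByParity(List<Integer> numbers) {
        // The compiler infers HashMap<String, List<Integer>> for 'groups'.
        var groups = new HashMap<String, List<Integer>>();
        for (var n : numbers) {                      // inferred as Integer
            var key = (n % 2 == 0) ? "even" : "odd"; // inferred as String
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(n);
        }
        // groups.put("even", "oops"); // would be rejected at compile time
        return groups;
    }

    public static void main(String[] args) {
        var groups = groupByParity(List.of(1, 2, 3, 4));
        System.out.println(groups.get("even")); // prints [2, 4]
    }
}
```

No local type is spelled out in the loop body, yet the commented-out line would still fail to compile: the checking has not gone away, only the bookkeeping.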

Declarative Semantics


This one is a hard one!

But the fundamental benefits of a declarative semantics arise from the tractability that follows. Having a declarative semantics makes program manipulation of all kinds more straightforward. That, in turn, means that some high-powered transformations, such as those required for scaling on parallel hardware, become much easier to accomplish without requiring enormous input from the programmer.
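
Java's stream library gives a modest hint of this effect, although it is a library feature rather than a declarative semantics for the language. In the sketch below (hypothetical class and method names), the loop fixes an evaluation order and a mutable accumulator, while the stream version only states what is to be computed, so asking for parallel evaluation is a one-word change.

```java
import java.util.stream.LongStream;

// Hypothetical demo class; names are illustrative assumptions.
final class DeclarativeDemo {

    // Imperative: commits to a fixed iteration order and a mutable accumulator.
    static long sumOfSquaresLoop(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i * i;
        }
        return total;
    }

    // Declarative in spirit: describes the result, so the library is free
    // to split the range across cores when parallel() is requested.
    static long sumOfSquaresStream(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresLoop(1000));   // prints 333833500
        System.out.println(sumOfSquaresStream(1000)); // prints 333833500
    }
}
```

Whether the parallel version is actually faster depends on the workload and the hardware; the point here is only that the transformation needed no restructuring by the programmer.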

What does the future language look like?


According to this thesis, a language based on these principles would have a sound semantic foundation, would be easy to understand and would not require more from the programmer than was required by the problem. And it would be easy to deploy on systems consisting of many cores.

What's not to like? In the subsequent posts I will look at each of these principles in turn in a little greater depth.
