Posts

In Praise of Crappy Code

Not all code needs to be perfect! This is pretty heretical thinking for a software engineer. The issue is simple: how do you go about developing software for a small fixed budget? Imagine that you have $500 to implement a solution to a problem. If you spend more than that you will never recoup the extra that you spent. This comes up a lot in systems integration scenarios and also in customization efforts. Someone wants you to 'tweak' an application that they are using; you know that no one else would want that feature and that if you spend more than what the customer will pay you will end up losing money. From the customer's perspective, the common 'time and materials' approach to quoting for software development is a nightmare. Being able to offer a fixed price contract for a task is a big benefit for the customer. But how much do you quote for? Too much and you scare the customer away. Too little and you lose money. This is where 'crappy code' com...

Minimum Viable Product

When was the last time you complained about the food in a restaurant? I thought so. Most people will complain if they are offended by the quality or service; but if the food and/or service is just underwhelming then they won't complain, they will simply not return to the restaurant. The same applies to software products, or to products of any kind. You will only get negative feedback from customers if they care enough to make the effort. In the meantime you are both losing out on opportunities and failing your core professional obligation. Minimum Viable Product speaks to a desire to make your customers design your product for you. But, to me, it represents a combination of an implicit insult and negligence. The insult is implicit in the term minimum. The image is one of laziness and contempt: just throw some mud on the wall and see if it sticks. Who cares about whether it meets a real need, or whether the customer is actually served. The negligence is more subtle but, in the end, ...

Hook, Line and Sinker

It is well documented that people’s #1 fear is speaking in public! Effective and efficient public speaking is a whole topic in its own right; but a few simple tips might help to both improve your effectiveness and reduce the anxiety. You may be called on to talk about your work at very short notice; or you may have a week’s notice; and you may be required to give a formal slide show or just a brief verbal update. Many, if not most, of the issues are the same. The Hook: Newspaper editors call the first paragraph of an article ‘the hook’. It’s meant to hook you into reading the rest of the piece. On the other hand, the classical ‘say what you are going to say, say it, and say what you said’ approach gives people plenty of time to switch off. The hook may be playful, it may be controversial, but it must communicate why the listener should pay attention. The Line: It’s a conversation! Even if no one says anything they are listening and thinking; and maybe replying to you in their head...

Existential Types are the flip side of generics

Generic types, as can now be seen in all the major programming languages, have a flip side that has yet to be widely appreciated: existential types. Variables whose types are generic may not be modified within a generic function (or class): they can be kept in variables, they can be passed to other functions (provided they too have been supplied to the generic function), but other than that they are opaque. Again, when a generic function (or class) is used, the actual type binding for the generic must be provided – although that type may also be generic, in which case the enclosing entity must also be generic. Existential types are often motivated by modules. A module can be seen to be equivalent to a record with its included functions, except that modules also typically encapsulate types. Abstract data types are a closely related topic that also naturally connects to existential types (there is an old but still very relevant and readable article on the topic: Abstract types have...

Concept Oriented Markup

I have long been frustrated with all the different text markup languages and word processors that I have used. There are many reasons for this; but the biggest issue is that markups (including very powerful ones like TeX) are not targeted at the kind of stuff I write. Nowadays, it seems archaic to still be thinking in terms of sections and chapters. The world is linked and that applies to the kind of technical writing that I do. I believe that the issue is fundamental. A concept like "section" is inherently about the structure of a document. But what I want to focus on are concepts like "example", "definition", and "function type". A second problem is that, in a complex environment, the range of documentation that is available to an individual reader is actually composed of multiple sources. Javadoc exemplifies this: an individual library may be documented using Javadoc into a single HTML tree. However, most programmers require access to multip...

Comments Should be Meaningless

This is something of a counterintuitive idea: comments should be meaningless. What, I hear you ask, are you talking about? Comments should communicate to the reader! At least that is the received conventional wisdom handed down over the last few centuries (decades at least). Well, certainly, if you are programming in Assembler, or C, then yes, comments should convey meaning because the programming language cannot. So, conversely, as a comment on the programming language itself, anytime the programmer feels the imperative to write a meaningful comment it is because the language is not able to convey the intent of the programmer. I have already noticed that I write far fewer comments in my Java programs than in my C programs. That is because Java is able to capture more of my meaning and comments would be superfluous. So, if a language were able to capture all of my intentions, I would never need to write a comment. Hence the title of this blog.

Robotic Wisdom

It seems to me that one of the basic questions that haunt AI researchers is 'what have we missed?' Assuming that the goal of AI is to create intelligence with similar performance to natural intelligence, what are the key ingredients of such a capability? There is an old saw: it takes 10,000 hours to master a skill. There is a lot of truth to that; it effectively amounts to 10 years of more-or-less full-time focus. This has been demonstrated for many fields of activity, from learning an instrument to learning a language or learning to program. But it does not take 10,000 hours to figure out if it is raining outside, and to decide to carry an umbrella. What is the difference? One informal way of distinguishing the two forms of learning is to categorize one as 'muscle memory' and the other as 'declarative memory'. Typically, skills take a lot of practice to acquire, whereas declarative learning is instant. Skills are more permanent too: you tend not to forget a skill; but i...

A Tale of Three Loops

This one has been cooking for a very long time. Like many professional programmers I have often wondered what it is about programming that is just hard. Too hard, in fact. My intuition has led me in the direction of Turing completeness: as soon as a language becomes Turing complete it also gathers to itself a level of complexity and difficulty that results in crossed eyes. Still, it has been difficult to pinpoint exactly what is going on. A Simple Loop: Imagine that your task is to add up a list of numbers. Simple enough. If you are a hard-boiled programmer, then you will write a loop that looks a bit like: int total = 0; for(Integer ix:table) total += ix; Simple, but full of pitfalls. For one thing we have a lot of extra detail in this code that represents additional commitment: We have had to fix on the type of the number being totaled. We have had to know about Java's boxed vs. unboxed types. We have had to sequentialize the process of adding up the numbers. While one loo...
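The three commitments listed in this excerpt can be made visible by putting the explicit loop next to a version that only says "sum these". This is a minimal sketch, not necessarily the direction the full post takes: the class name SumSketch and both method names are made up for illustration, and the stream version still fixes the element type to int.

```java
import java.util.List;

// Hypothetical class name; not from the original post.
class SumSketch {
    // The explicit loop from the excerpt: it fixes the element type,
    // relies on auto-unboxing, and spells out a sequential order.
    static int loopTotal(List<Integer> table) {
        int total = 0;
        for (Integer ix : table) {
            total += ix; // Integer is unboxed to int here
        }
        return total;
    }

    // A stream version: the unboxing is explicit, and the order of
    // addition is left to the library rather than written by hand.
    static int streamTotal(List<Integer> table) {
        return table.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> table = List.of(1, 2, 3, 4);
        System.out.println(loopTotal(table));
        System.out.println(streamTotal(table));
    }
}
```

Both methods return the same total; the difference is only in how much of the "how" the programmer has had to write down.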

The true role of domain specific languages

It is easy to be confused by the term domain specific language. It sounds like a fancy term for jargon. It is often interpreted to mean some form of specialized language. I would like to explore another role for them: as vehicles for policy statements. In mathematics there are many instances where it is easier to attack a problem by solving a more general, more uniform, problem and then specializing the result to get the desired answer. It is very similar in programming: most programs take the form of a general mechanism paired with a policy that controls the machine. Taken seriously, you can see this effect down to the smallest example: fact(n) where n>0 is n*fact(n-1); fact(0) is 1 is a general machine for computing factorial; and the expression: fact(10) is a policy 'assertion' that specifies which particular use of the factorial machine is intended. One important aspect of policies is that they need to be intelligible to the owner of the machine: unlike the...
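The factorial example in this excerpt can be restated in Java to show where the mechanism ends and the policy begins. This is a sketch only: the class name FactorialPolicy and the guard against negative input are assumptions added here, not part of the original definition.

```java
// Hypothetical class name; not from the original post.
class FactorialPolicy {
    // The general mechanism: computes n! for any n >= 0.
    // The negative-input check is an assumption added for this sketch.
    static long fact(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return n == 0 ? 1 : n * fact(n - 1);
    }

    public static void main(String[] args) {
        // The policy: which particular use of the mechanism is intended.
        System.out.println(fact(10)); // prints 3628800
    }
}
```

The point of the split is that the call site is the only part that the owner of the machine needs to read or change.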

Single Inheritance and Other Modeling Conundrums

Sometimes a restriction in a programming language makes sense and no sense at all, all at the same time. Modeling the real world: Think about the Java restrictions on the modeling of classes: a given class can only have one superclass and a given object's class is fixed for its lifetime. From a programming language perspective these restrictions make a good deal of sense: all kinds of ambiguities are possible with multiple inheritance and the very idea of allowing an object to be 'rebased' fills the compiler writer with horror. (Though Smalltalk allows it.) The problem is that, in real life, these things do happen. A 'natural' domain model is quite likely to come up with situations involving multiple inheritance and dynamic rebasing. For example, a person can go from being a customer, to an employee, to a manager, to being retired. A given person might be both an employee and a customer simultaneously (someone else may not be). Given a domain that is as flexible as th...

Too Many Moving Parts

A common, if somewhat informal, observation about a large code base is that there are "too many moving parts" in it. In my experience, this is especially true for large Java systems but is probably universally true. What do we mean by ‘too many moving parts’? Simply put, there is always a significant semantic gap between a programming language and the program. The larger this gap, the more that has to be expressed in the language, as opposed to simply using it. For example, consider the problem of traversing a recursive tree structure. In Java, we can iterate over an Iterable structure using a loop, for example, to count elements: int count = 0; for(E el:tree) { count++; } If the Tree class did not implement Iterable we would be forced to construct an explicit iterator (or worse, write a recursive one-off function): int count = 0; for(Iterator<E> it=tree.iterator(); it.hasNext();) { E el = it.next(); count++; } This version illustrates what ...
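For the short loop in this excerpt to work, something has to supply the Iterable implementation. The sketch below shows one way a recursive tree might do that; the excerpt does not show the author's Tree class, so the field names, the add helper, and the choice of pre-order traversal are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical Tree class; the original definition is not shown in the excerpt.
class Tree<E> implements Iterable<E> {
    final E value;
    final List<Tree<E>> children = new ArrayList<>();

    Tree(E value) {
        this.value = value;
    }

    // Adds a child and returns this node so that trees can be built inline.
    Tree<E> add(Tree<E> child) {
        children.add(child);
        return this;
    }

    // Pre-order traversal with an explicit stack, so callers never see the recursion.
    @Override
    public Iterator<E> iterator() {
        Deque<Tree<E>> stack = new ArrayDeque<>();
        stack.push(this);
        return new Iterator<E>() {
            @Override
            public boolean hasNext() {
                return !stack.isEmpty();
            }

            @Override
            public E next() {
                if (stack.isEmpty()) {
                    throw new NoSuchElementException();
                }
                Tree<E> node = stack.pop();
                // Push children in reverse so the leftmost child is visited first.
                for (int i = node.children.size() - 1; i >= 0; i--) {
                    stack.push(node.children.get(i));
                }
                return node.value;
            }
        };
    }

    // The counting loop from the excerpt, written against the Iterable.
    static <E> int count(Tree<E> tree) {
        int count = 0;
        for (E el : tree) {
            count++;
        }
        return count;
    }
}
```

All of the traversal machinery lives in one place; every loop over a tree stays as short as the first version in the excerpt.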

Late Binding in Programming Languages

Late binding is key to enhanced productivity in programming languages. I believe that this is the single most important reason why so-called dynamically typed languages are so popular. This note is part of an ongoing ‘language design’ series which aims to look at some key aspects of programming language design. What do we mean by late binding? Simply put, a programmer should not have to say more than they mean at any particular time. To see what I mean, consider a function that computes a person's name from a first and last name. In Star, I can write this: fullName(P) is P.firstName()++P.lastName() This constitutes a complete definition of the function: there is no need to declare types; furthermore this function will work with any type that has a first and last name. Contrast this with a typical well-crafted Java solution: String fullName(Person P){ return P.firstName()+P.lastName(); } Not so different, one might argue. Except that we have had to define a type Person; at bes...
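To see how much has to be declared up front in Java before a function like this will work for "any type that has a first and last name", here is a rough sketch. The interface HasName and the class FullNameSketch are hypothetical names invented for this illustration; the excerpt itself only mentions a type Person.

```java
// Hypothetical interface; every participating type must opt in by name.
interface HasName {
    String firstName();
    String lastName();
}

// Hypothetical class name; not from the original post.
class FullNameSketch {
    // Works only for types that explicitly declare they implement HasName.
    // Like the excerpt's version, it concatenates without a separator.
    static String fullName(HasName p) {
        return p.firstName() + p.lastName();
    }

    public static void main(String[] args) {
        HasName p = new HasName() {
            @Override
            public String firstName() {
                return "Ada";
            }

            @Override
            public String lastName() {
                return "Lovelace";
            }
        };
        System.out.println(fullName(p));
    }
}
```

The function body is the same size as the Star one-liner; the extra text is all declaration that exists to satisfy the type system ahead of time.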

Productivity gains are permanent, Performance losses are temporary

This is the first in a short series of notes about language design. The goal is to identify what is truly important about programming language design. It is all about productivity There are 'too many moving parts' in a typical Java program. Here is a simple straw-man example: iterating over a collection. Before Java 5, the paradigmatic way of iterating over a collection was to use a fragment like: for(Iterator it=coll.iterator();it.hasNext();){ SomeType el = (SomeType)it.next(); ... } There is a lot of 'clutter' here; but worse, there are a lot of concepts that are not really that important to the problem at hand. Compare that to the Java 5 version: for(SomeType el:coll){ ... } Here we can focus on exactly what is needed: the collection and the iteration. This kind of example can be repeated fractally up the entire stack of abstractions: not just at this micro level but also at the level of using standard JDK class libraries on through to using features such as Spr...

(Software) Architecture = Policy + Mechanism

In 1979 Bob Kowalski made famous an interesting equation: Algorithm = Logic + Control. The idea of that equation was that an algorithm is essentially a combination of logic -- i.e., what you wanted done -- with control -- how you wanted it done. It is a fairly commonplace fact that any non-trivial program has a similar flavor to it: there is often a substantial amount of machinery that is used to deliver the value in the program; together with some form of policy statement/expression that governs the precise requirements for a particular execution of the program. The larger the program, the more obvious it is that there is this layering into mechanisms and policies. For example, one could argue that a word processor's mechanisms are all the pieces needed to implement text editing, formatting and so on. If the word processor supports styles, especially named styles, then these styles are a simple form of policy. At larger scales, when considering networked applications...

Sub-Turing complete programming languages

Here is an interesting intuition: the key to liberating software development is to use programming languages that are not, by themselves, Turing-complete. That means no loops, no recursion 'in-language'. Why? Two reasons: any program that is subject to the halting problem is inherently unknowable: in general, the only way to know what a Turing-complete program means is to run it. This puts very strong limitations on the combinatorics of Turing-complete programs and also on the kinds of support tooling that can be provided: effectively, a debugger is about the best that you can do with any reasonable effort. On the other hand, a sub-Turing language is also 'decidable'. That means it is possible to predict what it means; and, paradoxically, it is a lot easier to provide a rich environment for it, etc. An interesting example of two languages on either side of the Turing fence is TeX and CSS. Both are designed for specifying the layout of text: TeX is Turing complete and CSS ...

On programming languages and the Mac

Every so often I dig out my Xcode stuff and have a go at developing an idea for Mac OS X. Every time the same thing happens to me: Objective-C is such an offensive language to my sensibilities that I get diverted into doing something else. All the lessons that we have learned the hard way over the years -- the importance of strong static typing, the importance of tools for large scale programming -- seem to have fallen on deaf ears in the Objective-C community. How long did it take to get garbage collection into the language? I also feel that some features of Objective-C (in particular categories) represent an inherent security risk that would make me very nervous about developing a serious application in it. As it happens, I am currently developing a programming language for Complex Event Processing. Almost every choice that I am making in that language is the opposite of the choice made for Objective-C -- my language is strongly, statically typed; it is designed for parallel exe...

About the right tools for the job

Some time ago I was involved in a running debate about whether we should be using Ruby on Rails rather than the Java stack (junkyard?) that we were using. At the time, I did not really participate in the discussion except to note that everything seemed to be at least 5 times too difficult. I had this strong intuition that there were too many moving parts, and that that was the problem. The application itself was not really that hard. My assertions really ticked some of my colleagues off; for which I apologize; sort of. I guess that I come from a tradition of high-level programming languages; by high level, I mean that I would consider LISP to be a medium level language, and Prolog is slightly better. I would say that it is a pretty common theme of my career that I end up having to defend the position of using high-level tools. I have gotten a number of arguments, ranging from "it will not be efficient enough" to "how do you expect to find enough XX programmers?". I u...

Another thought about Turing and Brooks

Rodney Brooks once wrote that robots would be human when treating them as though they were human was the most efficient way of interacting with them. (Not a precise quote.) This is an interesting variation on the Turing test. It assumes that we decide the smartness of machines in the context of frequent interactions with them. It also builds on an interesting idea: that in order to deal with another entity, be it human, animal or mineral, we naturally build an internal model of the entity: how it behaves, what it can do, how it is likely to react to stimuli etc. That model exists for all entities that we interact with; a rock is not likely to kick you back, your word processor will likely crash before you can save the document etc. When the most effective way to predict the behavior of a machine is to assume that it has similar internal structure to ourselves, then it will, for all intents and purposes, be human. So, here is another thought: how do we know that another human is human?...

Turning Turing upside down

I am probably not alone in visualizing Turing's Universal Machine as a little animalcule walking over a linear landscape of ones and zeros. The great innovation of thinkers such as Turing and others was to reduce the complex world of algorithms and functions into something simple and elemental: all computable functions can be thought of as state machines operating over a large collection of ones and zeros, presence and absence. There are arguably many differences between a Turing Universal Machine and a modern browser (quite apart from the fact that being a JavaScript interpreter makes a browser a TUM). But for me, one of the most striking differences is that where a TUM is an animalcule in a universe of ones and zeros, the browser is an animalcule in a universe of HTML, CSS, HTTP and so on. The browser understands a different world than Turing's computer. Were we to draw a browser as an animalcule, it should look like: There are similarities, and if you were to look at it fro...

Ontologies for matching

I have previously wondered out loud what ontologies are good for. I now believe that one of the most powerful use cases for semantic technology lies in social networking applications; and matching in general. By social networking I mean "putting people in touch with each other"; especially in situations that are inherently asymmetric. For example, putting potential volunteers in touch with people who could use their services; putting buyers in touch with sellers, and so on. The reason is simple: the language spoken by the two sides is inherently different: a seller or volunteer knows a lot (or maybe not) about what he or she can do or would like to do. But a consumer often does not know how to translate his or her problem into a solution that the provider can offer. Put more graphically, providers speak features, and consumers speak problems. This is true even if they can find each other. In the middle, there is an opportunity for someone to put the two together. A match maker has to...