Industrial Programming Languages and Pragmatics

Applying the pragmatics and principles of industrial programming language design to formal methods, verification, and dependently-typed languages is something I care a great deal about.  It’s the focus of my programming language project Salt, and it was the subject of my recent lightning talk at IEEE SecDev and of many of the discussions that followed.

Those who interact with me know I tend to talk about this subject quite a bit, and that I have a particular concern for the pragmatic aspects of language design.  The origin of this concern no doubt comes from my time with the Java Platform Group at Oracle; there I saw firsthand the issues that a successful industrial language must confront and the demands that arise from real-world industrial use.  I’ve also seen these sorts of issues elsewhere, such as in my work on the FreeBSD and Linux kernels.

A common question I get is “what do you mean by ‘industrial language’?”  I hear it most often from the academic programming language and Haskell communities, and frequently enough that it has prompted me to write an article on the topic.

Academic and Industrial PL

In my experience, there’s something of a gulf between academic knowledge about programming languages and industrial knowledge.  Academic PL is primarily concerned with things like semantics, idealized language features, and type systems (particularly type systems).  Academic PL knowledge is available through the usual academic conferences: POPL, PLDI, OOPSLA, ICFP, and the like, as well as various books and journals.

Industrial PL, by contrast, is generally less structured and accessible in its knowledge, existing primarily as institutional knowledge within organizations like OpenJDK and others dedicated to maintaining major industrial languages.  Some of this makes it out into industrial conferences (Brian Goetz, the Java language architect, has given a number of talks on these subjects, such as this one).

In recent history, there has been a concerted effort by the industrial PL world to tap into the academic knowledge base.  This manifests in recent and upcoming Java language features, as well as in languages like Rust that incorporate a number of advanced type system features.  I saw this firsthand in my time on the Java Platform Group, particularly in the work that went into the Lambda project and is now going into the Valhalla project.

Academic PL, on the other hand, has tended to be dismissive of the knowledge base of industrial PL, as it lacks the kind of formalism that modern academic PL work demands.  The source of this, I believe, is rooted in the two fields’ differing goals and scientific foundations.  The concerns of academic PL are well-addressed by higher mathematics, whereas the concerns of industrial PL are better answered through disciplines such as organizational psychology.  In truth, both are important.

It is my belief that both worlds need to make an effort to get out of their silos.  The industrial PL world needs to do a better job at describing the problems it solves and reporting its solutions and findings.  The academic PL world needs to acknowledge the intellectual merit of creating real-world realizations of the idealized features presented in academic PL conferences.

Principles of “Industrial Programming Languages”

A first step towards getting out of the silo is to clearly express what the notion of an “industrial programming language” means.  This phrase is an example of what might be called a “common-sense” term.  Any of my colleagues from OpenJDK could certainly expound at length on its meaning, industrial programmers could probably list a number of examples, and even academics have some notion of what it means (often some variation on “icky details we don’t want to deal with”).

The problem with “common sense” notions is that everyone has their own ideas of what they mean.  To effectively study an issue, we must refine such notions into more concrete and unambiguous ideas.  A truly principled effort in this direction would, in my opinion, be worthy of peer-reviewed publication.  My goal here is merely to begin to explore the ideas.

As a first draft, we can list a number of issues that “industrial” software development must confront, and which “industrial” programming languages must provide tools to address:

  • Very large codebases, running from hundreds of thousands to tens of millions of lines of code
  • Long-running development, spanning many development cycles and lasting years to decades
  • Very large staff headcounts, distributed teams, staff turnover, varying skill-levels
  • Withstanding refactoring, architectural changes, and resisting “bit-rot”
  • Large and complex APIs, protocols, interfaces, etc., possibly consisting of hundreds of methods/variables
  • Interaction with other complex systems, implementation of complex standards and protocols
  • Need for highly-configurable systems, large numbers of possible configurations
  • Large third-party dependency sets and management thereof

The common theme here is unavoidable complexity and inelegance.  It is a well-known fact in many disciplines that different factors emerge and become dominant at different scales.  In algorithms and data structures, for example, code simplicity is often the dominant factor in execution time at small scales, asymptotic complexity dominates at mid-scale, and I/O cost dominates all other concerns at large scales.  Similarly, in software development, certain stresses (such as the ones listed above) emerge and become increasingly important as the scope of a development effort increases.

Industrial practice and industrial languages aim to withstand these factors and the stresses they produce as development scope scales up to modern levels.  An important corollary is that the failure to provide mechanisms for dealing with these factors effectively limits the scope of development, as has been the case throughout the development of industrial practice.

Dealing with Development at Scale

I routinely use the term “industrial pragmatics” to refer to the various methods that have been developed to cope with the stresses that emerge in large-scale software development.  In particular, I use the word “pragmatics” because the efficacy of these techniques generally can’t be evaluated using theory alone.  They involve both complex interactions at scale and human behavior: two phenomena that defy theoretical abstraction.

It is worth exploring some of these techniques and why they have been so successful.

The Path of Least Eventual Cost/Pain

At very large scales, the only successful strategy for managing things quickly becomes the path of least eventual cost (or for the more blunt, pain).  A key word in this is eventual, as it is critically important to realize that some decisions may minimize short-term cost or maximize short-term gain in some way but eventually lead to a much higher cost in the long run.  At large scales and in long-running projects, short-sighted decision-making can dramatically increase the eventual cost, to the point of compromising the entire effort.

This is, in my opinion, one of the critically-important principles of industrial pragmatics.  Any tool, technique, language, or practice must be evaluated in terms of its effect on eventual cost.

Modularity and Interfaces

More than any other technique, modularity has proven invaluable in managing the complexity of industrial-scale software development.  The most widely successful adoption of this technique came in the form of object-oriented programming, which removed many of the barriers that had prevented software development from scaling up.  Since the advent of OO, other adaptations have emerged: Haskell’s typeclasses, various module systems, and the like, all of which share common features.

A key feature of these systems is that they manage complexity by encapsulation: walling off part of a huge and complex system, limiting the concerns of both the author of a component and its users.  This significantly reduces the set of concerns developers must address, bounding the size and complexity of any “local” view of the system.
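
To make this concrete, here is a minimal Java sketch (all names are hypothetical, invented purely for illustration) of complexity managed by encapsulation: callers see only a small interface, while the implementation walls off its internal details.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // Users of the component see only this small surface...
    interface UserDirectory {
        Optional<String> lookupEmail(String username);
    }

    // ...while the implementation walls off its internal complexity
    // (storage, lookup strategy) from every caller.
    final class InMemoryUserDirectory implements UserDirectory {
        private final Map<String, String> emailsByUser = new HashMap<>();

        void addUser(String username, String email) {
            emailsByUser.put(username, email);
        }

        @Override
        public Optional<String> lookupEmail(String username) {
            return Optional.ofNullable(emailsByUser.get(username));
        }
    }

The storage here could later be replaced by a database or a cache without any caller knowing or caring; that freedom is exactly what the walled-off “local” view buys.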

Constraining Use

Large systems quickly build up a huge number of interfaces and configurations, and the fraction of possible uses that are “valid” or “good” shrinks until the vast majority of uses are incorrect in some way.  Well-designed systems provide mechanisms to check uses of interfaces and configurations and flag misuse, or else to restrict use to the “good” cases outright.  Badly-designed systems adopt an “anything goes” mentality.

A prime example of this comes in the form of cryptographic APIs.  Older crypto APIs (such as OpenSSL’s) provide a dizzying array of functions where the slightest misuse can result in an insecure system.  This problem has been identified in recent academic work, and a series of new crypto APIs that restrict use to correct patterns has been created.  Type systems themselves are also a technology that restricts usage, greatly constraining the space of possible cases and greatly improving the degree to which a program can be analyzed and reasoned about.
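
As a rough illustration of use-constraining design, consider the following Java sketch.  It is not modeled on any real crypto library’s API; the names and choices are hypothetical.  The point is that the types only admit the “good” cases: a key cannot be conjured from arbitrary bytes, and the cipher choices form a closed set.

    import java.security.SecureRandom;

    // A closed set of vetted algorithm choices: there is no "NONE" option.
    enum Cipher { AES_GCM, CHACHA20_POLY1305 }

    final class SecretKey {
        private final byte[] material;

        // No public constructor taking arbitrary bytes: the only way to
        // obtain a key is to generate one properly.
        private SecretKey(byte[] material) { this.material = material; }

        static SecretKey generate() {
            byte[] material = new byte[32];
            new SecureRandom().nextBytes(material);
            return new SecretKey(material);
        }
    }

    interface Encryptor {
        // The signature itself rules out unkeyed or algorithm-less use.
        byte[] encrypt(Cipher cipher, SecretKey key, byte[] plaintext);
    }

The same idea generalizes well beyond cryptography: any API shrinks its misuse surface by making invalid states unrepresentable in its types.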

Managing Implicit Constraints

In a sufficiently large system, significant effort must be dedicated to maintaining knowledge about the system itself and to preventing the introduction of flaws that violate the system’s implicit logic.  Well-managed systems minimize externally-facing implicit constraints, expose APIs that further minimize the degree to which those constraints can be violated, and ensure that the unavoidable ones are well-known and easily detected.  Badly-designed systems are rife with such implicit constraints and require expert knowledge to avoid violating them.

This is one area where I believe formal methods can make a huge impact if adapted correctly.  There are many ways of approaching this problem (documentation, assertions, unit tests, and so on), but all of them are deficient for various reasons.  Even the ability to explicitly codify invariants and preconditions would be a considerable boon.  More modern languages and tools such as Rust and JML approach this, but only formal methods can provide a truly universal solution.
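
As a small taste of what explicit codification buys, here is a hypothetical Java sketch that moves an implicit constraint (“start must not follow end”) into a checked precondition at the component boundary; absent the check, that rule would live only in developers’ heads.

    final class DateRange {
        private final long startMillis;
        private final long endMillis;

        DateRange(long startMillis, long endMillis) {
            // Codify the invariant at the boundary, so violations surface
            // immediately instead of deep inside unrelated code.
            if (startMillis > endMillis) {
                throw new IllegalArgumentException(
                    "start (" + startMillis + ") must not follow end ("
                        + endMillis + ")");
            }
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }

        long durationMillis() {
            // The constructor's check guarantees this is non-negative.
            return endMillis - startMillis;
        }
    }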

The Art of API (and Protocol/Standard) Design

The design of APIs is, in my opinion, an underappreciated art.  It takes considerable skill to design a good API.  It requires the ability to generalize, and more importantly, to know when to generalize and when to nail down details.  It requires the ability to foresee where people will want to change things, and to make that process easy.  It requires a good sense of how people will build upon the API, and of necessary versus unnecessary complexity.  Lastly, it requires a knack for human factors: for designing things so that doing the right thing is easy and doing the wrong thing is hard or impossible.
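
Here is a tiny, hypothetical Java sketch of one such judgment call: replacing an inscrutable boolean parameter with named entry points, so that call sites become self-describing and misuse becomes harder.

    import java.util.Arrays;

    final class Sorting {
        private Sorting() {}

        // Before: sort(data, true) -- what does "true" mean at the call site?
        // After: named entry points make the right reading the only reading.
        static void sortAscending(int[] data) {
            Arrays.sort(data);
        }

        static void sortDescending(int[] data) {
            Arrays.sort(data);
            // Reverse in place, since Arrays.sort on primitives is
            // ascending-only.
            for (int i = 0, j = data.length - 1; i < j; i++, j--) {
                int tmp = data[i];
                data[i] = data[j];
                data[j] = tmp;
            }
        }
    }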

Managing Failure

Failure of some kind becomes unavoidable at large scales.  Software systems subject to real use will eventually be used incorrectly, and it will become necessary to diagnose failures quickly and efficiently.  This holds even in a perfect world where all of our own developers are infallible: users and third-party developers can and will do things wrong, and they need to be able to figure out why.  In the real world, our own developers must also diagnose failures, if only in internal development and testing.

This, I think, is one of the most undervalued principles in both academic and even industrial circles.  Far too many people want to answer the problem of managing failure with “don’t fail”.  But failure can’t be avoided or wished away, and failing to provide tools to quickly and effectively diagnose failure translates to increased cost in terms of development times, testing times, and diagnosis of failure in the field (in other words, much more eventual cost and pain).

In my own career, Java and other JVM languages have proven to have the best track record for easy diagnosis of failure of any language I’ve seen in common use, both in my hands and in others’.  By contrast, languages that don’t provide the kind of diagnostic functionality JVM languages do, whether because they can’t (like C) or because they choose not to (like Haskell), tend to slow things down, as it takes longer to diagnose issues.  Lastly, I’ve also dealt with some truly nightmarish platforms, such as embedded systems and boot-loaders, where diagnosis is a task for experts and takes considerable effort and time.
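
A minimal Java sketch (hypothetical names) of the kind of diagnosability I mean: wrap a low-level failure with context while preserving the cause, so that a single stack trace says what failed, on which input, and why.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    final class ConfigLoader {
        static String load(Path path) {
            try {
                return Files.readString(path);
            } catch (IOException e) {
                // Wrap with context and preserve the cause: the resulting
                // stack trace names the failing file and carries the
                // underlying error up to whoever must diagnose it.
                throw new IllegalStateException(
                    "failed to load config from " + path, e);
            }
        }
    }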

Human Factors

In addition to the principles I’ve discussed thus far, human factors, particularly the difficulty of modifying human behavior, play a key role in the design of industrial languages.  A considerable amount of thought must go into how to influence user behavior and how to get users to adopt new techniques and methodologies, often with little chance of recovering from mistakes.  Unlike the other principles, these concerns are rooted in psychology and therefore cannot be the subject of the kinds of mathematical theories that underlie many aspects of PL.  They are nonetheless critical to success.

Making it Easy to be Good/Hard to be Bad

One of the most successful principles I’ve learned is that of making it easy to be good and hard to be bad.  This means that “good” use of the tool or language should be easy and intuitive, and it should be difficult and non-obvious how to do “bad” things (this of course presupposes an accurate notion of good vs. bad, but that’s another discussion).

An excellent example of this comes from Standard ML.  Functional programming with immutable data is the notion of “good” in SML, and imperative programming with mutable data is strongly discouraged.  It’s tolerated, but the syntax for declaring and using mutable state is gnarly and awkward, encouraging programmers to avoid it in most cases and to encapsulate it when they do use it.  The design of Java and other languages likewise reflects a notion of object-orientation as “good”, with global state strongly discouraged and made deliberately difficult to use.
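
A small Java illustration of that last point, under my reading of the language’s design: there are no free-standing globals, so the closest approximation is a static field reached through a class name; that is just enough friction to nudge programmers toward encapsulated state.

    final class Globals {
        // The closest thing Java has to a global: a static field that must
        // be reached through a class name. Possible, but awkward and easy
        // to spot in review.
        static int counter = 0;
    }

    final class Counter {
        // The encouraged form: state owned and encapsulated by an object.
        private int count = 0;

        int increment() {
            return ++count;
        }
    }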

Regimentation

Coordinating human activity at scale is an extremely difficult problem.  Regimentation, that is, creating common idioms that make behavior predictable and facilitate implicit communication, is a very common technique for dealing with this problem at scale.  In the context of software development, this takes the form of things like design patterns and anti-patterns, well-defined interfaces and standards, style documents, and programming paradigms.

Languages that succeed at large-scale development tend to provide facilities for this kind of regimentation in one way or another.  This, incidentally, is one major argument in favor of types: they provide a considerable amount of implicit communication between the developers and the users of an interface.  Similarly, languages with well-known idioms and built-in usage patterns tend to produce more coherent codebases; Java is one example.  Haskell is quite good in some ways (its typeclasses, tightly-controlled use of side-effects, and very powerful type system) and deficient in others (five string types).  Python achieves good regimentation despite its lack of static types, and I believe this is a crucial factor in its popularity.
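
To illustrate the implicit communication types provide, here is a hypothetical Java sketch (using records, so it assumes a reasonably recent Java): the typed signature answers, by itself, questions that an untyped one would defer to documentation.

    import java.time.Duration;
    import java.time.Instant;

    // Typed "nouns" for the domain; each carries meaning a bare String cannot.
    record UserId(String value) {}
    record Session(UserId owner, Instant expiresAt) {}

    interface SessionService {
        // An untyped-style contract would read createSession(Object, Object),
        // leaving every question to the documentation. This signature states
        // what goes in, what comes out, and in which units.
        Session createSession(UserId user, Duration timeout);
    }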

Making Complex Ideas Accessible

A key to success in industrial development is the ability to make complex ideas accessible.  In my own experience, this was one of the core design principles of the Lambda project in Java.  We had the challenge of introducing higher-order functional programming, and all the notions that come with it, in a way that was accessible to programmers used to the stateful OO style of thinking.  Conversely, many efforts have gotten this wrong, as exemplified by the infamous quote “a monad is just a monoid in the category of endofunctors”.
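
To show what this looked like in practice, here is a small, self-contained Java example (my own illustration, not taken from the project’s materials) of the idiom bridge lambdas provided: the new notation reads as shorthand for an anonymous class programmers already knew.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    class LambdaAccessibility {
        public static void main(String[] args) {
            List<String> words = new ArrayList<>(List.of("pear", "fig", "apple"));

            // The familiar pre-Java-8 idiom: an anonymous class implementing
            // a single-method interface.
            words.sort(new Comparator<String>() {
                @Override
                public int compare(String a, String b) {
                    return Integer.compare(a.length(), b.length());
                }
            });

            // The same behavior as a lambda: new notation, familiar semantics.
            words.sort((a, b) -> Integer.compare(a.length(), b.length()));

            System.out.println(words);  // [fig, pear, apple]
        }
    }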

This translation between idioms is quite difficult; it requires one to deeply understand both sides of the fence, as well as which use cases and concerns are most common on each side.  But it’s critical: people don’t use what they can’t understand.  The best way to facilitate understanding is to present an idea or principle in a context the audience already understands.

Harm-Reduction

Just as failure must be dealt with in the context of software systems, changing behavior and practices must be dealt with in the context of human factors.  For example, programming practices based on the functional paradigm have become increasingly recommended in response to a number of changing realities, and many languages face the challenge of adapting their user bases to these new practices.

The mentality of harm-reduction often proves the most effective attitude when it comes to changing human behavior, both here and elsewhere.  Harm-reduction accepts that “bad” behavior takes place and focuses its efforts on minimizing the negative impacts on oneself and others and encouraging a transition towards better behavior in the long-run.  This reflects the realities of industrial programming: software can’t be rewritten into a new paradigm all at once, and some legacy systems can’t be eliminated or rewritten at all.

In the design of industrial languages and tools, this takes the form of several adages I’ll list here: provide options instead of demands; minimize upfront costs and facilitate gradual transitions; work with the old, but encourage the new.

On Seeming Contradictions

It might seem, particularly to the more academically and philosophically inclined, that many of these principles I’ve elaborated contradict each other.  How can we provide regimentation while working with the mentality of harm-reduction?  How can we restrict use to the “good” behaviors while gracefully handling failure?

The solution is itself another critically important principle: avoiding absolutist conflicts of ideas.  In the practical world, ideas tend to be useful in certain contexts, and a given principle needs to be evaluated in its context of use.  Put another way, “it’s important to use the right tools for the job”.

On a deeper level, ideological wars tend to end with both sides being “wrong”, and often some synthesis of the two being right.  In physics, the debate over whether light was a wave or a particle raged for centuries; in truth, it’s both.  There are many other such examples throughout history.

Conclusion

If we reduce everything I’ve discussed down to its essence, we arrive at the following statement: as the scale of industrial programming increases, the dominant concerns become irreducible complexity and inelegance, along with the human factors associated with coordinating effort.  “Industrial pragmatics”, then, refers to the set of techniques and principles for managing these concerns, and industrial programming languages are those languages that take these techniques into account in their design.

InfoSec and IoT: A Sustainability Analogy

Yesterday saw a major distributed denial-of-service (DDoS) attack against DNS infrastructure that disabled internet access for much of the Northeastern US, as well as other areas.  These sorts of attacks are nothing new; in fact, this one came on the anniversary of a similar attack fourteen years ago.  Yesterday’s attack is nonetheless significant, both in its scope and in the role the growing internet of things (IoT) played in it.

The attack was facilitated by the Mirai malware suite, which specifically targets insecure IoT devices, applying a brute-force password attack to gain access to machines and deploy its malware.  Such an attack would almost certainly fail against machines with appropriate security measures in place and passwords correctly set.  IoT devices, however, often lack such protections, are often left with their default login credentials, and often go unpatched (after all, who among even the most eager adopters of IoT can say they routinely log in to every lightbulb in their house to change the passwords and download patches?).  Yesterday, we saw the negative consequences of the proliferation of these kinds of devices.

Public Health and Pollution Analogies

Industry regulation, whether self-imposed or imposed by the state, is a widely-accepted practice among modern societies.  The case for this practice lies in the reality that some actions are not limited in their effects to oneself and one’s customers; rather, they have a tangible effect on the entire world.  Bad practices in these areas lead to systemic risks that threaten even those who have nothing to do with the underlying culprits.  In such a situation, industry faces a choice between two options, one of which will eventually come to pass: self-regulate, or have regulation imposed from without.

Two classic examples of such a situation come in the form of public health concerns and environmental pollution.  Both of these have direct analogs to the situation we now face with insecure IoT devices and software (in)security in the broader context.

IoT and Pollution

After the third attack yesterday, I posted a series of remarks on Twitter that gave rise to this article, beginning with “IoT is the carbon emissions of infosec. Today’s incident is the climate change analog. It won’t be the last”.  I went on to criticize the current trend of gratuitously deploying huge numbers of “smart” devices without concern for the information security implications.

The ultimate point I sought to advance is that releasing huge numbers of insecure, connected devices into the world is effectively a form of pollution, one with serious negative impacts on information security for the entire internet.  We saw one such result yesterday, in the form of one of the largest DDoS attacks and the loss of internet usability for significant portions of the US.  As serious as this attack was, however, it could have been far worse.  Such a botnet could easily be used in far more serious attacks, possibly to the point of causing real damage.  And of course, we’ve already seen cases of “smart” devices equipped with cameras being used to surreptitiously capture video of unsuspecting people, which is then used for blackmail.

These negative effects, like pollution, affect the world as a whole, not just those who decide they need smart lightbulbs and smart brooms.  They create a swarm of devices ripe for the plucking by malware, which in turn compromises basic infrastructure and harms everyone.  It is not hard to see the analogy between this and a dirty coal-burning furnace contaminating the air, leading to maladies like acid rain and brown lung.

Platforms, Methodologies, and Public Health

Anyone who follows me on Twitter or interacts with me in person knows I am harshly critical of the current state of software methodologies, Scrum in particular, and of platforms based on untyped languages, NodeJS in particular.  Make no mistake: Scrum is snake-oil as far as I’m concerned, and NodeJS is a huge step backward as a programming language and development platform.  The popularity of both has an obvious-enough root cause: the extreme bias towards developing minimally-functional prototypes, or minimum-viable products (MVPs) in Silicon Valley VC lingo.  Scrum is essentially a process for managing “war-room” emergencies, and languages like JavaScript do allow one to throw together a barely-working prototype faster than a language like Java, Haskell, or Rust.  This expedience has a cost, of course: such languages are far harder to secure, to test, and to maintain.

Of course, few consumers really care what sort of language or development methodology is used, so long as they get their product; or at least, so the current conventional wisdom goes.  When we consider the widespread information security implications, however, the picture begins to look altogether different.  Put another way, Zuckerberg’s adage “move fast and break things” becomes irresponsible and unacceptable when the potential exists to break the entire internet.

Since the early 1900s, the US has had laws governing healthcare-related products, as well as food, drugs, and others.  The reasons for this are twofold: first, to protect consumers, who lack insight into the manufacturing process; and second, to protect the public from health crises, such as epidemics arising from contaminated products.  In the case of the Pure Food and Drug Act, the call for regulation was driven in large part by the extremely poor quality standards of large-scale industrial food processing, as documented in Upton Sinclair’s The Jungle.

The conditions that led to the regulation of the food industries and the conditions that have led to the popularization of insecure platforms and unsound development methodologies share, I believe, the same root cause: a competition-induced drive to lower costs and production times, combined with a pathological lack of accountability for the quality of products and for the negative effects of quality defects.  Together, these factors consistently lead nowhere good.

Better Development Practices and Sustainability

These trends are simply not sustainable.  They exacerbate an already severe information security crisis, and on a long enough timeline they stand to cause significant economic damage as a result of attacks like yesterday’s, if not more severe attacks posing real material risk.

I do not believe government-imposed regulations are the solution to this problem.  In fact, in the current political climate, I suspect such a regulatory effort would end up imposing regulations such as back-doors and other measures that would do more damage to the state of information security than they would help.

The answer, I believe, must come from industry itself and must be led by infosec professionals.  The key is realizing that, as with sustainable manufacturing, better development practices are actually more viable and lead to lower eventual costs.  Sloppy practices and bad platforms may cut costs and development times in the near term, but in the long run they end up costing much more.  This sort of paradigm shift is neither implausible nor unprecedented.  Driving it is a matter of educating industry colleagues about these issues and about the benefits of sounder platforms and development processes.

Summary

Yesterday’s attack brought to the forefront the potential for the proliferation of insecure devices and software to have a profound negative effect on the entire world.  A key root cause is an outdated software development paradigm that ignores these factors in favor of the short-term view.  It falls to the infosec community to bring about the necessary shift toward a more accurate view and more sound, sustainable practices.