05 Jun 2011Babel
Babel is a proposal for a multi-paradigm, low level type-safe virtual machine capable of executing several different programming languages. Different programming languages will inevitably be optimized to solve certain classes of problem in certain domains. No single programming language or programming paradigm is ideal in all scenarios.
The aim is to be able to execute virtual instruction sets from several different virtual machines such as Ruby/rubinius, Java/Java bytecode, Erlang/Beam, Scheme, Haskell, Mercury, R etc. Different types of languages that are expected to be supported should represent functional, logic or constraint-based, message-based and imperative paradigms.
In an ideal environment it would be possible to transparently combine elements written using different paradigms in the same application. The impedance mismatch at the language barriers often makes this a difficult proposition.
One possible remedy is to host each different programming language or paradigm in a software isolated process (SIP). Each SIP communicates with other SIPs through message passing. Communication between SIPs would still need to translate values from one paradigm to another and may even need to serialize, deserialize or copy values between SIPs. However translation could be skipped when processes share a representation and in some scenarios copying may be avoided if a SIP supports copy-on-write or can only transfer immutable values.
SIPs have many of the advantages of Erlangs processes; fault isolation, low overhead, easy to parallelize. If Babel is structured correctly, the exection engine for each SIP could be shared between SIPs and maybe composed from elements such that a logic and functional SIP share many of the same VM components.
Babel is unlikely to be started until well after I complete my PhD and while I have experimented with several different components at times, no head way has been made. For Babel to have any chance of success it must be optimized for fun.
Instruction Set Composition
It is likely that the various programming languages that are built on the Babble VM (BVM) can share subsets of their instruction sets. Candidate instructions that immediately come to mind include arithmetic operations for 32-bit integer values and 32-bit IEEE 754 floating point values. It is also likely that there will be dependency relationships be instruction groups. i.e. The 32-bit integer vector operations rely on the presence of 32-bit integer scalar operations.
Each instruction group is likely to result in different sets of optimization passes being incorporated in the runtime compilers. These could be a new set of BURS rules or specific optimization passes (i.e. to vectorize scalar operations in loops). Thus each instruction group should be able to be bundled separately and identify associated optimization passes etc.
Layered Programming Language Features
The BVM is likely to have a family of “native” languages. The languages should be layered such that each successive language incorporates features from lower layer. The “kernel” language is most likely going to be a stack based language such as Joy/Forth with linear types and procedures/functions that are guaranteed not to be recursive. The next higher language may support recursive functions and immutable variables (i.e. those that can be written once and read many times). A higher layer still may support mutation of variables or polymorphic invocations.
Language features such as generic types, tail calls, lazy vs strict modes, query matching vs normal execution, object manipulation etc are gradually added depending on where the language is positioned in the kernel-system-application-scripting language spectrum. Ideas should definitely be incorporated from Forth, Scheme, Haskell, Smalltalk and Mercury when developing the composable language features.