Great little demonstration. I am extremely fond of array-based languages and believe that one day their features will become standard, due mainly to the fact that parallelism is implicit in everything written in an array style, meaning that sophisticated interpreters will be able to automatically portion out parallel work to multiple cores or multiple machines in a cluster. And once you learn an array-based language, you get the feeling that "this is how computers think."
In practice, this implicit parallelism shows up as implicit iteration: you stop writing loops. As the "J for C Programmers" book quips (J is a modernized descendant of APL):
I promise, if you code in J for 6 months, you will no longer think in loops, and if you stay with it for 2 years, you will see that looping code was an artifact of early programming languages, ready to be displayed in museums along with vacuum tubes, delay lines, and punched cards. Remember, in the 1960s programmers laughed at the idea of programming without goto's!
APL has its roots in mainframe computing, where vector (SIMD) processors were the norm. One way to get a feel for the style today is to find an opportunity to program in MATLAB (or GNU Octave) or Python+numpy: MATLAB's array paradigms were inspired by APL, and numpy was inspired by MATLAB.
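To make the contrast concrete, here is a minimal numpy sketch (the data and function are made up for illustration) of the same computation written loop-style, as in C, and array-style, as in APL/J/MATLAB/numpy:

```python
import numpy as np

# Loop style: explicit iteration over elements, one at a time.
xs = [0.0, 0.5, 1.0, 1.5]
loop_result = []
for x in xs:
    loop_result.append(x * x + 1.0)

# Array style: one expression, no loop. The iteration is implicit
# and happens inside the interpreter's low-level routines.
a = np.array(xs)
array_result = a * a + 1.0

print(np.allclose(loop_result, array_result))
```

The two produce the same numbers, but the array-style expression states *what* to compute over the whole array, leaving *how* to iterate (and potentially parallelize) to the implementation.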
Here is an example function from numpy where implicit looping over multidimensional arrays is offered as a feature [https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.matmul.html]. The result is that the code runs faster, because the iteration is pushed down into low-level routines in the interpreter rather than for-loops in the high-level language itself, and in principle it could be parallelized:
numpy.matmul(a, b, out=None)
Matrix product of two arrays.
The behavior depends on the arguments in the following way.
If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
The part about a "stack of matrices" means it will automatically iterate across all the dimensions other than the last two, so you don't have to write any for-loops.
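A small sketch of that behavior (the shapes here are arbitrary, chosen just for illustration): give matmul two stacks of matrices and it iterates over the leading axis for you, producing the same result as an explicit loop.

```python
import numpy as np

# A "stack" of 5 matrices of shape 3x4, and a stack of 5 matrices of shape 4x2.
a = np.random.rand(5, 3, 4)
b = np.random.rand(5, 4, 2)

# matmul treats the last two axes as the matrices and implicitly
# iterates over the leading (stack) axis -- no for-loop needed.
c = np.matmul(a, b)          # shape (5, 3, 2)

# The equivalent explicit loop, for comparison:
c_loop = np.stack([a[i] @ b[i] for i in range(5)])

print(np.allclose(c, c_loop))
```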
What is really amazing about languages like J or APL is that this paradigm is directly codified in the language, so it gives you this extensibility for free. While MATLAB and numpy have borrowed some of these paradigms, they are limited as languages, so these implicit-looping features are implemented manually, as special cases: only some of the time, and often with limited support (for example, only working for 2-D arrays).
The numpy.matmul function is a perfect case in point: they only introduced the "stack of matrices" functionality in a recent release. But in J it would have been there from the start, with the built-in notion of "verb rank." This is what the matmul function would look like in J:
matmul =: +/ . * "2
I haven't tested it, but it is in all likelihood just as fast as the numpy version. The J interpreter will recognize +/ . * as matrix multiplication and use BLAS just like numpy does.
The "2 is all that is needed to add the "stack of matrices" functionality, and in a J implementation it would have been there from the start: it is easy, it is good practice, and it is obvious that matrix multiplication should be a "rank 2" operation (i.e., it operates on 2-D arrays).
The J for C Programmers book puts it this way [http://www.jsoftware.com/help/jforc/loopless_code_i_verbs_have_r.htm#_Toc191734340] (in the parlance of J, which is a bit foreign to the uninitiated):
We have been focusing so closely on shapes, cells, and frames that we haven't paid attention to one of the great benefits of assigning a rank to a verb: extensibility. When you give a verb the rank r, you only have to write the verb to perform correctly on cells of rank ≤r. Your verb will also automatically work on operands with rank >r: J will apply the verb on all the r-cells and collect the results using the frame.
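The mechanism the quote describes (apply the verb to each r-cell, then collect the results using the frame) can be hand-rolled in numpy. The sketch below is my own illustration, not numpy API: a hypothetical `apply_rank2` helper that does manually, for one fixed rank, what J's rank conjunction gives every verb for free.

```python
import numpy as np

def apply_rank2(f, x):
    """Apply f to each 2-D cell of x, iterating over all leading (frame) axes.

    A hand-rolled sketch of what J's rank conjunction ("2) provides built-in.
    """
    frame = x.shape[:-2]                       # the leading axes: the "frame"
    cells = x.reshape((-1,) + x.shape[-2:])    # flatten the frame into one axis
    results = np.stack([f(c) for c in cells])  # apply f to each rank-2 cell
    return results.reshape(frame + results.shape[1:])  # restore the frame

# Example: transpose every 2-D cell of a 2x3 frame of 4x5 matrices.
x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
y = apply_rank2(lambda m: m.T, x)
print(y.shape)   # (2, 3, 5, 4)
```

In J you would write none of this: `f"2` assigns rank 2 to `f`, and the interpreter supplies the frame iteration and result collection for every verb uniformly.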
J has its problems, and while I would say absolutely learn it to become a better programmer, don't expect to use it on the job. But I think this language is "ahead of its time" and this type of thinking will become the norm as computing refocuses on parallelism.
u/onegin, 06 Mar 2016 09:50