OJL (Java) and NumPy (Python) performance benchmark on massive matrix operations
OJL, the Open Jazari Library (https://github.com/hakmesyo/open-jazari-library), is a Java library for matrix computation, data analysis, data visualization, image processing and deep learning. It was designed from the start as a high-performance library for Java programmers and has been used successfully in many real-world projects. It is not limited to R&D or prototyping, as MATLAB and Python often are; it also runs in production in real-world applications such as real-time agricultural robots and autonomous vehicles. Because data analysis, image processing and deep learning algorithms rely on highly complex matrix and mathematical operations, the performance of the underlying math library is extremely important. The previous version, the Open Cezeri Library (https://github.com/hakmesyo/open-cezeri-library), implemented matrix operations with traditional Java arrays and generic for loops. Although the JVM optimizes such code after the warm-up phase, for example by keeping frequently used data in cache, it could not match the speed of NumPy, the library most commonly used for data science on the Python side. nd4j (https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j) is a comparable high-performance math and matrix library for the Java ecosystem. Both NumPy and nd4j speed up matrix operations by delegating them to native BLAS (Basic Linear Algebra Subprograms) libraries. BLAS is also used extensively in MATLAB's backend.
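To make the gap between hand-written loops and a BLAS-backed call concrete, here is a minimal sketch in Python. It is not OJL or Open Cezeri code: `matmul_loops` and `time_call` are hypothetical helpers written for this illustration, and the matrix size is kept tiny so the pure-Python loop finishes quickly. `np.matmul` typically dispatches to whichever BLAS library NumPy was built against.

```python
import time
import numpy as np


def matmul_loops(a, b):
    """Multiply two matrices given as lists of lists with plain nested loops."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(m):
                out[i][j] += a_ip * b[p][j]
    return out


def time_call(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


n = 60  # assumption: small size chosen only to keep the loop version fast
rng = np.random.default_rng(0)
a = rng.random((n, n))
b = rng.random((n, n))

loop_result, loop_seconds = time_call(matmul_loops, a.tolist(), b.tolist())
blas_result, blas_seconds = time_call(np.matmul, a, b)

print(f"nested loops: {loop_seconds:.4f} s")
print(f"np.matmul:    {blas_seconds:.4f} s")
print("results agree:", np.allclose(loop_result, blas_result))
```

The absolute timings depend on the machine and say nothing about the JVM, where compiled loops are far faster than interpreted Python. The point is only that both paths compute the same product while the library call hands the work to optimized native code.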
Unlike its predecessor, OJL now uses the high-performance nd4j library for matrix and mathematical computation. In this article, we compare the running performance of OJL and NumPy on a program that includes several massive matrix operations. To be fair to both sides, CPU and memory usage were recorded in addition to completion times. All tests ran on a 64-bit Windows 10 laptop with an Intel Core i7 processor and 16 GB of RAM. The OJL (Java) code ran in the NetBeans 8.2 development environment with JDK 8, and the Python code ran in the Spyder IDE with NumPy.
The benchmark consists of three exercises. The first involves relatively simple matrix operations. We test both libraries under a gradually increasing load, from a matrix size of 100 x 100 up to 10000 x 10000. The second exercise has moderately complex matrix and mathematical operations. The third has highly complex matrix and mathematical operations. In the first two exercises NumPy is generally faster than OJL, especially on low-dimensional matrices, because the operations involved are simple. Let's skip those two and go straight to the third exercise.
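The measurement strategy described above (increasing matrix sizes, with completion time, CPU time and memory recorded) can be sketched in Python roughly as follows. This is not the harness used for the article's numbers: `measure` and `workload` are hypothetical names, the sizes are reduced so the sketch runs in moments, and `tracemalloc` only reports memory that Python and NumPy allocate through the traced allocators.

```python
import time
import tracemalloc
import numpy as np


def measure(fn, size, warmup=1, repeats=3):
    """Run fn(size) and report best wall time, best CPU time and peak traced memory."""
    for _ in range(warmup):
        fn(size)  # untimed run so one-off start-up costs are excluded
    best_wall = float("inf")
    best_cpu = float("inf")
    for _ in range(repeats):
        wall0, cpu0 = time.perf_counter(), time.process_time()
        fn(size)
        best_wall = min(best_wall, time.perf_counter() - wall0)
        best_cpu = min(best_cpu, time.process_time() - cpu0)
    # Memory is traced in a separate run so tracing overhead does not skew timings.
    tracemalloc.start()
    fn(size)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"size": size, "wall_s": best_wall, "cpu_s": best_cpu, "peak_bytes": peak}


def workload(n):
    """Hypothetical stand-in operation: scale an n x n matrix of ones and sum it."""
    m = np.ones((n, n), dtype=np.float32)
    return float((m * 1.3).sum())


sizes = [100, 200, 400]  # assumption: the article goes up to 10000
results = [measure(workload, n) for n in sizes]
for r in results:
    print(f"{r['size']:>6} x {r['size']:<6} wall={r['wall_s']:.5f}s "
          f"cpu={r['cpu_s']:.5f}s peak={r['peak_bytes'] / 1e6:.2f} MB")
```

The untimed warm-up call matters more on the Java side, where the JIT compiler needs a few iterations before code reaches full speed, but including it on both sides keeps the comparison symmetric.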
Exercise 3: For a 10000 x 10000 matrix in which every element has the value one, multiply each element by the scalar 1.3f and then round up. Then square the matrix (raise it to the power of 2).
Java side
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
As the output shows, the whole run takes about 5.6 seconds with Java OJL.
Python side
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
In this case, Python NumPy finished the process in almost 66 seconds.
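For readers who want to try the Python side, here is a minimal NumPy sketch of Exercise 3 as it is described in the text. It is not the script that produced the timing above, and it rests on several assumptions: the matrix is filled with ones, "round up" is taken to mean the ceiling function, and because "square the matrix" can mean either an element-wise square or a matrix product, the hypothetical `exercise_3` function offers both through a `matrix_product` flag.

```python
import time
import numpy as np


def exercise_3(n, matrix_product=False):
    """Scale an n x n matrix of ones by 1.3, round up, then square it."""
    m = np.ones((n, n), dtype=np.float32)
    m = np.ceil(m * np.float32(1.3))  # assumption: "round up" means ceiling
    if matrix_product:
        return m @ m          # matrix product of m with itself
    return np.square(m)       # element-wise square


n = 500  # assumption: reduced from the article's 10000 so the sketch runs quickly
for flag in (False, True):
    start = time.perf_counter()
    result = exercise_3(n, matrix_product=flag)
    elapsed = time.perf_counter() - start
    label = "matrix product" if flag else "element-wise"
    print(f"{label:>14}: {elapsed:.4f} s, result[0, 0] = {result[0, 0]}")
```

At the full size of 10000 x 10000, a single float32 matrix holds 10^8 elements and occupies about 400 MB, so the intermediate arrays add up quickly on a 16 GB machine. The two interpretations also differ greatly in cost: the element-wise square touches each element once, while the matrix product needs on the order of n^3 multiplications.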
Conclusion
As the benchmark chart above shows, NumPy has a speed advantage over OJL on small matrices, especially up to 300 x 300. OJL's overwhelming superiority begins at 1000 x 1000 and above. The chart plots the y-axis values on a logarithmic scale to make the results easier to read. You can also evaluate the performance of the OJL code yourself: download OJL from https://github.com/hakmesyo/open-jazari-library, open it in NetBeans, and run the exercise at https://github.com/hakmesyo/open-jazari-library/blob/main/src/jazari/nd4j/Exercise_3.java. Comments are appreciated. See you later.
Reference
1. Ataş, Musa. “Open Cezeri Library: A novel Java based matrix and computer vision framework.” Computer Applications in Engineering Education 24.5 (2016): 736–743.