Bytedeco - Refining native integration

Refining native integration

December 07, 2016

Over the past few months, I have been busy integrating JavaCPP to Deeplearning4j, mainly as part of ND4J and DataVec. Many bugs had to be fixed, we also needed to enhance native memory management, a few presets were created, but I actually spent most of my time trying to run Skymind in Japan. In any case, thanks to all the team, Deeplearning4j has come a long way since last year, so if you have not given it a try recently, make sure you do!

However, a release at Bytedeco was long overdue, so I am proud to announce the availability of version 1.3! The source code and the binaries can be obtained as usual from GitHub and the Maven Central Repository for JavaCPP, JavaCPP Presets, JavaCV, ProCamCalib, and ProCamTracker. This release comes with binaries for linux-armhf (for devices such as Raspberry Pi), thanks to Vince Baines for his continuous effort, as well as linux-ppc64le, built on SuperVessel Cloud, which features virtual machines that IBM offers for free to the community. Lloyd Chan has also been maintaining sbt-javacpp and sbt-javacv, while Andreas Eberle has generously provided Android builds for TensorFlow.

To manage native memory more automatically, JavaCPP now monitors physical memory usage, also known as “resident set size” on Linux, Mac OS X, etc or “working set size” on Windows, as reported by the kernel. The maximum value defaults to 2 times Runtime.maxMemory(), which can be specified with the usual -Xmx option on the command line, but we can also set it independently via the “org.bytedeco.javacpp.maxphysicalbytes” system property. When the whole process uses more physical memory than that amount, System.gc() followed by Thread.sleep(100) are called a few times in a row, an amount adjustable via the “org.bytedeco.javacpp.maxretries” system property, in an attempt to free memory. With this strategy, we are able to tame memory usage more accurately no matter how it is being allocated.

The presets that were created for this release are librealsense (a great contribution from Jeremy Laviole), HDF5 (that Deeplearning4j uses to import models from Keras, Theano, and TensorFlow), and OpenBLAS (which also dynamically binds to MKL if found on the system or in the class path). Additions planned for the near future include MAGMA to supplement functionality missing from cuSOLVER. To simplify further the user experience, we also plan to offer bundles containing all the binaries for CUDA and MKL, if it is determined that we are allowed to do so, which appears likely according to the EULAs that accompany their free downloads (CUDA and MKL). Traditionally, we have to spend time either installing them manually or figuring out a way to automate the process on a case-by-case basis, using when possible platform-specific package managers or containers such as Docker, probably along with some scripts. Having such bundles available on the Maven Central Repository would relieve developers and operators from this burden.

Other important changes include a HalfIndexer to process in Java 16-bit half-precision floating-point data from CUDA or other libraries, the adoption of a user defined directory (defaults to ~/.javacpp/cache/) where JavaCPP now caches native library files, instead of extracting them into a temporary directory, and the introduction of “platform artifacts” for the JavaCPP Presets and JavaCV. Each entry in the cache is a directory with the same name as the JAR file from which the files are extracted, including the subdirectories, for example, opencv-3.1.0-1.3-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64/libopencv_core.so.3.1. This way, the files are given a (most of the time) unique, predetermined, but easy to remember path, preventing not only the build up of messy temporary files, but also allowing for faster startup times as well as easier integration with native tools, outside the scope of JavaCPP. The technique also works for files other than libraries. Right now, the implementation does not support the extraction of whole directories, but when that becomes possible, one will be able to bundle header files, among other native resources, and have them available for immediate consumption with such a simple call as Loader.cacheResource(opencv_core.class, "include"). As a matter of course, the cache functions just as smoothly with uber JARs. “Platform artifacts” can also come in handy. Users are invited to add dependencies on those artifacts suffixed with “-platform”, for example, javacv-platform, opencv-platform, ffmpeg-platform, etc, which in turn depend on binaries for all supported platforms. This new strategy was designed to work well with build systems other than Maven (sbt, Gradle, M2Eclipse, etc).

On a final note, before long, we hope to have build servers running allowing us to make releases available in a more timely fashion. Stay tuned for updates, but in the meantime, do not hesitate to contact us through the mailing list from Google Groups, issues on GitHub, or the chat room at Gitter, for any questions that you may have. Contributions are also very welcome!

Comments

To add a comment, please edit the comments file and send a pull request!