I was selected as an intern to work on the SciPy build system. In this blog post, I describe my journey through this 10-month internship at SciPy. I worked on a variety of topics: migrating the SciPy build system to Meson, cleaning up the public API namespaces, and adding uarray support to SciPy submodules.
The main reasons for switching to Meson include (in addition to distutils being deprecated):
For more details on the initial proposal to switch to Meson, see scipy-13615.
I was initially selected to work on migrating the SciPy build system to Meson. I started by adding Meson build support for scipy.misc and scipy.signal. While working on this, we came across many build warnings that we wanted to fix, since they needlessly lengthened the build log and might point to hidden bugs. I fixed these warnings, the majority of which came from deprecated NumPy C API calls.
runtests.py, but using Meson for building SciPy.
Meson build support, including all the above work, was merged into SciPy’s main branch around Christmas 2021. Meson will now become the default build in the upcoming 1.9.0 release.
“A basic API design principle is: a public object should only be available from one namespace. Having any function in two or more places is just extra technical debt, and with things like dispatching on an API or another library implementing a mirror API, the cost goes up.”
>>> from scipy import ndimage
>>> ndimage.filters.gaussian_filter is ndimage.gaussian_filter # :(
True
The API reference docs of SciPy define the public API. However, SciPy still had some submodules that were accidentally somewhat public because their names lacked a leading underscore.
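One way to spot such modules is to list the direct submodules of a package whose names do not start with an underscore. The snippet below is only a rough sketch of that idea, not the tooling used in SciPy: the helper name `public_looking_submodules` is hypothetical, and it is demonstrated on the standard-library `json` package so that it runs without SciPy installed.

```python
import importlib
import pkgutil


def public_looking_submodules(package_name):
    """Return direct submodule names that lack a leading underscore."""
    package = importlib.import_module(package_name)
    return sorted(
        info.name
        for info in pkgutil.iter_modules(package.__path__)
        if not info.name.startswith("_")
    )


# Demonstrated on the standard-library `json` package; the same call could be
# pointed at a package such as "scipy.ndimage" where SciPy is installed.
names = public_looking_submodules("json")
print(names)
```

Every name in the result is importable as `package.name`, so it looks public to users whether or not it appears in the API reference.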
I worked on cleaning up the public namespaces for a couple of months, carefully adding underscores to the .py files that were not meant to be public and adding deprecation warnings for anyone who tries to access the old names.
>>> from scipy import ndimage
>>> ndimage.filters.gaussian_filter is ndimage.gaussian_filter
<stdin>:1: DeprecationWarning: Please use `gaussian_filter` from the `scipy.ndimage` namespace, the `scipy.ndimage.filters` namespace is deprecated.
True
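The deprecation behaviour in the session above can be sketched with a module-level `__getattr__` (PEP 562). This is a minimal illustration under stated assumptions, not SciPy's actual shim code: the names `mypkg`, `mypkg.filters`, `mypkg._filters`, and `make_deprecated_shim` are hypothetical, and the modules are built in memory so the snippet runs on its own.

```python
import types
import warnings


def gaussian_filter(values):
    """Stand-in for a real implementation living in the private module."""
    return list(values)


def make_deprecated_shim(old_name, new_namespace, public_names, private_module):
    """Build a module that warns whenever one of its old names is accessed."""
    shim = types.ModuleType(old_name)

    def __getattr__(name):
        # Called only when normal attribute lookup on the module fails.
        if name not in public_names:
            raise AttributeError(
                f"module {old_name!r} has no attribute {name!r}"
            )
        warnings.warn(
            f"Please use `{name}` from the `{new_namespace}` namespace, "
            f"the `{old_name}` namespace is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(private_module, name)

    shim.__getattr__ = __getattr__
    return shim


# Hypothetical layout: the implementation moves to `mypkg._filters`,
# while `mypkg.filters` stays behind as a thin, warning shim.
private = types.ModuleType("mypkg._filters")
private.gaussian_filter = gaussian_filter
shim = make_deprecated_shim(
    "mypkg.filters", "mypkg", {"gaussian_filter"}, private
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    func = shim.gaussian_filter  # still works, but emits a DeprecationWarning
```

Because the shim returns the same object as the private module, identity checks like the one in the session above still evaluate to `True`.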
“SciPy adopted uarray to support a multi-dispatch mechanism with the goal being: allow writing backends for public APIs that execute in parallel, distributed or on GPU.”
For about the last four months, I worked on adding uarray support to SciPy submodules. I recommend reading this blog post by Anirudh Dagar covering the motivation and actual usage of uarray. I picked up the following submodules for adding uarray compatibility:
At the same time, in order to show a working prototype, I also added uarray backends in CuPy to the following submodules:
The pull requests contain links to Colab notebooks which show these features in action.
import scipy
import cupy as cp
import numpy as np
from scipy.linalg import inv, set_backend
import cupyx.scipy.linalg as _cupy_backend
x_cu, x_nu = cp.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[1.0, 2.0], [3.0, 4.0]])
y_scipy = inv(x_nu)
with set_backend(_cupy_backend):
    y_cupy = inv(x_cu)
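For readers without a GPU, the same backend-switching pattern can be tried with `scipy.fft`, which exposes `set_backend` in released SciPy. The following is a rough sketch, assuming NumPy and SciPy are installed; the class `NumpyFFTBackend` is a hypothetical toy backend that simply forwards calls to `numpy.fft` and is not part of the work described in this post.

```python
import numpy as np
import scipy.fft


class NumpyFFTBackend:
    """Toy backend (hypothetical) that forwards scipy.fft calls to numpy.fft."""

    # Domain under which the scipy.fft multimethods are registered.
    __ua_domain__ = "numpy.scipy.fft"

    # Names of the functions that were dispatched here, for illustration.
    seen = []

    @staticmethod
    def __ua_function__(method, args, kwargs):
        fn = getattr(np.fft, method.__name__, None)
        if fn is None:
            # Signal that this backend cannot handle the call.
            return NotImplemented
        kwargs.pop("overwrite_x", None)  # numpy.fft has no such argument
        NumpyFFTBackend.seen.append(method.__name__)
        return fn(*args, **kwargs)


x = np.array([1.0, 2.0, 3.0, 4.0])

# Default SciPy implementation.
y_default = scipy.fft.fft(x)

# Same call, routed through the toy backend inside the context manager.
with scipy.fft.set_backend(NumpyFFTBackend):
    y_backend = scipy.fft.fft(x)
```

The calling code is identical in both cases; only the active backend decides which implementation runs.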
meson-python backend. uarray support are still under heavy discussion, and the main aim will be to get them merged as soon as possible once we have reached a concrete decision.
I am very grateful to Ralf Gommers for providing me with this opportunity and believing in me. His guidance, support and patience played a major role during the entire course of the internship. I am also thankful to the whole SciPy community for helping me with the PR reviews and providing essential feedback. Also, huge thanks to Gagandeep Singh for always being a part of this wonderful journey.
In a nutshell, I will remember this experience as: Ralf Gommers has boosted my career by millions!