Topic: Implement JAX based automatic differentiation to Stingray
The project involved the study of modern statistical modelling to augment the accuracy, speed, and robustness of the likelihood function, into a software package called Stingray. This report demonstrates the experiment done for a combination of different optimizers to fit the scipy.optimize function. Another emphasis is to investigate the gradient calculation using JAX and compare it with scipy.optimize.
The proposed milestone was to investigate the room for improvement to enhance the overall performance of modelling to Stingray, using JAX. However, the current stage of the model is still a sandbox model. Stingray is astrophysical spectral timing software, a library in python built to perform time series analysis and related tasks on astronomical light curves. JAX is a python library designed for high-performance numerical computing. Its API for numerical functions is based on NumPy, a collection of functions used in scientific computing. Both Python and NumPy are widely used and familiar, making JAX simple, flexible, and easy to adopt. It can differentiate through a large subset of python’s features, including loops, ifs, recursion, and closures, and it can even take derivatives of derivatives. Such modern differentiation packages deploy a broad range of computational techniques to improve applicability, run time, and memory management.
JAX utilizes the grad function transformation to convert a function into a function that returns the original function’s gradient, just like Autograd. Beyond that, JAX offers a function transformation jit for just-in-time compilation of existing functions and vmap and pmap for vectorization and parallelization, respectively.
Today is the last day of GSoC-21. The entire journey was a rollercoaster ride and I learnt a lot of new things along the way. I started out with hardly knowing any of the shortcomings of pandas and as we dug in, I was surprised to see so many loopholes it contains. This will be the final blogpost of my gsoc journey and I hope you like it.
Pandas and Vaex
This blog post is a consolidated report of my GSoC ’21 project. I’ve been contributing to ndcube - a sunpy affiliated package, which is itself a part of the umbrella organization called openastronomy. Phew, that’s quite some hierarchy.
Here’s a list of pull requests that I’ve opened during the coding period:
Initial implementation for validating two WCS: Merged
Implements a function to check if two given WCS objects are compatible with each other for reprojecting the NDCube.
Reproject implementation: Merged
Adds a method to reproject an NDCube using the astropy package called reproject.
Reproject NDCube Documentation: Merged
Documentation for the above PR.
Combine cubes from NDCubeSequence using reproject: Unmerged
Stacks the data of all cubes in an NDCubeSequence together into one cube. This PR is ready to merge but awaits testing from the community.
Reproject NDCubeSequence Documentation: Unmerged
Documentation for the above PR. This will be merged after the code.
Support adaptive and exact algorithms for reproject: Unmerged
This PR is completed and is ready to merge.
Make reproject more efficient by identifying invariant axes: Unmerged
This PR is a work in progress and might need some time until it’s ready. The last commit on this PR as of writing this post can be found here.
I’ve also been writing blog posts throughout the coding period. Here are links to the 4 parts I’ve written so far: Part 1, Part 2, Part 3, and Part 4. They contain a more technical description of the work along with some obstacles that we faced.
GSoC 2021 — Final Report
Summer of 2021 held quite a few surprises for me. I’d have never imagined working with SunPy as a GSoC student and here I am concluding it with the final report. Before I summarize all the 30+ pull requests I’ve made to Sunkit-Pyvista, I just want to take a moment to than the brilliant mentors I’ve gotten to work with. The were not only patient and understanding but also extremely helpful with making me understand how everything works.
Coming to Sunpy, or should I say Sunkit-Pyvista.
Sunkit-Pyvista was created with the intention of extending Sunpy’s extensive plotting capabilities to 3D with the help of a VTK wrapper for Python — Pyvista.
The last few days haven’t been as productive as earlier. We fixed some issues with the NDCubeSequence’s stacking PR and it looks like it’s ready to merge now. With some feedback from the community, I think it will happen soon. There have also been some minor updates to the PR that brings reproject’s other algorithms to NDCube.
A new task that I’ve taken up now is identifying invariant axes in a cube. Let’s say there’s a 3D data cube where one of the axes corresponds to a quantity like time, which you don’t want to reproject onto another grid. Identifying this axis would let us reproject at only one point along this axis and then apply it throughout. This will speed up the execution significantly and require a lesser amount of memory. It’s a tricky path though and the first implementation might not be very efficient. What we’re trying to do is convert pixel coordinates to world coordinates using the source WCS, and convert it back from world to pixel using the target WCS. If the original and final pixel coordinates match, we can conclude that the axis is invariant.
I shall update its progress soon, but this is all for now. GSoC is officially coming to an end, but as I said in the previous post, it doesn’t matter much for continuing my contributions to this community. I’ve been fascinated by this open-source environment and culture and learned so much along the way. I guess GSoC did serve its purpose for me.
With the end of the GSoC project, I will give this blog to summarise the JAX based optimization to analyze its applicability to enhance the loglikelihood calculation. The goal is to analyze, (i) the performance of different optimizers to evaluate the loglikelihood function, (ii) demonstrated the robustness of JAX to calculate gradients. And talk about the current code and corresponding improvement due to JAX.
The application of loglikelihood fitting to periodograms is discussed in . Let us start with analyzing best-fit power spectrum (i) with different sets of optimizers namely: minimize(method=’Nelder-Mead’, ’Powell’, ’CG’, ’BFGS’, ’Newton-CG’, ’L-BFGS-B’, ’TNC’, ’COBYLA’, ’SLSQP’, ’trust-constr’, ’dogleg’, ’trust-ncg’, ’trust-krylov’, ’trust-exact’). The problem setting shifts the start and test parameters to study the graph of best fit optimizer using different “methods” listed above. First, we will stick with the Powell optimizer and try to check what is the current sensitivity of the implementation.
Currently, we seek to find a solution to the problem when the optimization algorithm often gets stuck in local minima, terminate without meeting its formal success criteria, or fails due to any contributing factor. Possible ways are: (1) add more Lorentzian components, (2) reduce the amplitude, (3) start the optimization process with parameters very far away from the true parameters, (4) experiment with the different optimizers/ “methods” to investigate if there is more superior algorithm compared to Powell.
From when I’ve started writing these blog posts, I’ve taken quite a liking to writing these blog posts (except for the one fortnight I missed). OpenAstronomy has helped me improve the way I express myself in a satirical manner. I’ve spent quite a bit of time trying to figure out what’s the plan for my future but let’s save that story for another day.
Here’s a bit of how my last week of “officially” working on Sunkit-Pyvista went :-
- We put a pin on the animator as it didn’t make sense for us to work on something that isn’t of more value for the first release.
- Pfsspy field lines now allow for a custom color function to be passed while plotting.
- All main functionality is big-free (as far as I know) and can be used efficiently without having to worry about something failing.
- The final boss with our little project is trying to get the documentation to render 3D plots correctly and this is being handled by the mentors which is a huge load off my head.
That’s pretty much it I guess, these last few PRs mark the end of the so called “GSoC” period. What more have I left out? I’ve spoken about everything there is to be said, we’re nearing the end of this pretty interesting journey that I had embarked on. Other than a small bit of sadness, I’ve got nothing else to say. I’ve decided to continue working on Sunkit-Pyvista as being an open-sourced contributor is all about the community and I’m doing my part in working towards that (It’s not like I was planning on leaving either way though, this is almost a part of my daily routine).
Hi! In the last weeks we have finished the Astrometry linear part programming.
You know, 7's my lucky number.
And Happy Independence Day!