The start was amazing! The community bonding was really great. I got to meet the mentors and get to know the whole organization structure of OpenAstronomy and LINCC frameworks, the work they do, the people and facilities associated with it, the ways PyArrow and nested-pandas were being used in astronomy, and the expectations they had from the internship. They offered to help me through tasks if I could not do them on my own, along with some content to look through to get a deeper understanding of the project. I attended the Apache Arrow community meeting with my mentor to introduce the project to them and get their views on it. They were really helpful and even suggested certain things to do to improve the final PR.

Weeks 1 and 2 were spent on improving the parallel reading of Parquet files, which I successfully implemented. Benchmarking these performance changes proved quite hard, as it would require the following, starting with the main arrow branch:

GSoC 2026 - Weeks 1 & 2: Understanding Before Implementing

Kumar Amityush

2026-06-07 21:13

The Journey Begins 💪

The first two weeks of the coding period have been a mixture of excitement, confusion, debugging, learning and the occasional moment of staring at a terminal wondering why something that worked five minutes ago suddenly doesn't.

As someone once said:

"The expert in anything was once a beginner who refused to give up."

Beyond the Abstraction

Reem Hamraz

2026-05-31 20:14

The glow of the screen hits different when you're reading an email that dictates your entire summer. Getting accepted into the Google Summer of Code 2026 programme for Astropy, under the OpenAstronomy umbrella, was one of those really defining moments; all I could do was stare at the screen in disbelief (I'd done it). Then, the abstraction fades away, and suddenly, you need to get down to the real work.

This post is the first of many. I plan to keep these updates transparent and real, so they are not just going to be about the code that ships, but also about the (not-so)basic commands I need to google, the dumb mistakes, the long hours spent scouring the internet for things that developers ought to know (but I don't), and the brutal reality of open-source development, or at least the reality from the perspective of a first-timer (that would be me needing to google how to squash and rebase, haha fun times :| or not).

GSoC 2026 - My Journey to SunPy Under OpenAstronomy

Kumar Amityush

2026-05-17 18:53

My GSoC Journey

I’m incredibly excited to share that I’ve been selected for Google Summer of Code 2026 under OpenAstronomy for the SunPy project:

Improving radiospectra’s Functionality and Interoperability.

GSoC’25@OpenAstronomy — Final Report

Darshan Patil

2025-09-27 09:02

GSoC’25@OpenAstronomy — Final Submission

Organization: OpenAstronomy (RADIS)
Mentors: Nicolas Minesi, Erwan Pannier

Electronic spectra for RADIS

This project will extend RADIS to calculate electronic spectra, enabling analysis of high-temperature phenomena in plasmas, flames, and exoplanet atmospheres. Currently, RADIS excels at rovibrational calculations but lacks electronic capabilities. This project will implement electronic spectroscopy in RADIS by leveraging existing ExoMol integration, adding separate electronic temperature parameters, implementing population distributions for electronic states, and creating functions for manual adjustment of electronic band intensities. Deliverables include OH electronic spectra with manual band adjustment capability, implementation of electronic temperature handling, non-equilibrium OH spectrum calculation, and extension to support electronic spectra for all ExoMol molecules. This enhancement will make RADIS a comprehensive spectroscopic tool capable of handling the full range of molecular transitions within a unified framework.
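To make the idea of a separate electronic temperature concrete, here is a minimal sketch of Boltzmann populations over electronic states. It is not the RADIS API: the function name `electronic_populations`, the tuple layout, and the two-state level list are hypothetical, and the energies are placeholder values rather than OH reference data.

```python
import math

# Second radiation constant hc/k_B in cm*K, for term energies given in cm^-1.
C2_CM_K = 1.438776877


def electronic_populations(levels, t_electronic):
    """Return fractional Boltzmann populations for electronic states.

    levels: list of (name, degeneracy, term_energy_cm1) tuples (hypothetical layout).
    t_electronic: electronic temperature in kelvin.
    """
    weights = {
        name: g * math.exp(-C2_CM_K * energy / t_electronic)
        for name, g, energy in levels
    }
    partition = sum(weights.values())  # electronic partition function
    return {name: w / partition for name, w in weights.items()}


# Placeholder two-state system; the numbers are illustrative, not OH data.
levels = [("ground", 4, 0.0), ("excited", 2, 30000.0)]
cold = electronic_populations(levels, 1000.0)
hot = electronic_populations(levels, 10000.0)
print(f"excited fraction at 1000 K:  {cold['excited']:.3e}")
print(f"excited fraction at 10000 K: {hot['excited']:.3e}")
```

Raising the electronic temperature independently of the rovibrational ones shifts population into the excited state, which is the kind of non-equilibrium knob the project description refers to.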

The Solution

My work focused on extending RADIS to calculate electronic spectra by:

My GSoC journey @Open-Astronomy

mohyware

2025-09-09 14:11

Introduction

So for people who don’t know, Google Summer of Code is a global program by Google where students and developers get the chance to work on real open-source projects with mentors from around the world. It’s not just about coding; it’s about learning how to collaborate, contribute to big projects, and ship something useful. And yes, there’s also a stipend, which makes it even more exciting.

Well, the first time I heard about GSoC was in my first year of college, but I didn’t give it much attention. I thought it would be really hard to get in. Then, in my second year, two of my friends got accepted, and that really inspired me. It made me realize it was actually possible.

Starting Early

So I started searching deep, even before the orgs got announced. I really wanted to get in, since there aren’t many internships out there that give both good experience and a stipend. I navigated through different orgs' codebases, read a lot of issues and PRs from different orgs to understand how OSS contributions are done, but I didn’t pick a specific org at that point.

GSoC 2025 Final ReportFlyingRa...

Mohammad Adnan

2025-09-01 14:18

GSoC 2025 Final Report

Time Series Analysis in Stingray.jl

kashish shrivastav

2025-08-30 13:21

From Stargazing Dreams to Code Stargazing:)

Ever since childhood, I have been passionate about the mysteries of the universe. Over time, that passion grew into something more technical — not just looking at the stars, but wanting to understand their signals.

This summer, through Google Summer of Code 2025 (GSoC), I had the opportunity to contribute to Stingray.jl, a Julia library for time series analysis in astronomy. What follows is both a story of my personal journey and an exploration of the science and code I worked on.

The Beginning – Switching Languages and First Steps

When the organization was announced, I immediately started exploring Stingray (the Python version). I wanted to understand the structure, so I made my first PRs there.

Once I understood Stingray a bit, I started working on Stingray.jl, the Julia port of Stingray initiated by Aman Pandey in 2022. That meant learning Julia from scratch! Thankfully, Julia was similar to Python in many ways, so I picked it up quickly.

During my early testing, I noticed outdated dependencies: for example, some packages were only compatible with Julia 1.7, while the project was moving towards Julia 1.11. My first clumsy PR (on power_colors) wasn’t great, as I will tell you in a bit, but it gave me valuable lessons.

Later, I worked on updating fourier.jl. This was also my first time seriously using structs in Julia. My mentor @fjebaker guided me on abstract methods and struct design, while @matteobachetti helped me understand what kind of astronomical data the project needed.

At first, I was confused about telescopes and their instruments. To learn, I went to NASA HEASARC documentation, where I discovered FITS data (Flexible Image Transport System) — the standard format for astronomical data. That’s when things started making sense.

My First Big Task – Reading Event Data

@fjebaker opened a milestone and assigned me the readevent task. At the same time, I became good friends with fellow contributor Jawd Ahmad. We often helped each other understand the codebase.

I implemented the readevent function with a small struct called EventList, something like:

Google Summer of Code: Final Submission

Pratham

2025-08-29 21:45

I have completed my Google Summer of Code project, “Fast Parsing of Large Databases and Execution Bottlenecks,” on the Radis project under the OpenAstronomy umbrella. The project developed high-performance, line-by-line parsing for high-resolution infrared molecular spectra, and this is the final blog post documenting the results and lessons learned. Most important of all, I loved the work I was able to do; it was interesting and made possible by my helpful and amazing mentors. I am incredibly grateful to Dr. Nicolas Minesi, Dr. Dirk van den Bekerom and Tran Huu Nhat Huy.

Project Description?

The problem is that the HITEMP CO₂ spectroscopic database is extremely large and inefficient to work with: the distributed file is about 6 GB when compressed but expands to roughly 50 GB, and the existing workflow requires fully decompressing and then batch-parsing the entire file, an operation that takes on the order of 2.5 hours and uses a lot of disk I/O, memory, and network bandwidth. That full-decompress/parse approach creates several practical bottlenecks: it forces users to have large, fast storage and long processing windows, makes quick exploratory analysis or iterative development impractical, and wastes bandwidth and time when only small portions of the data are needed. In short, the dataset’s size and the parser’s all-or-nothing design prevent efficient, selective access and slow down every downstream analysis that depends on these spectral lines.

Project Walk Through

After discussing the project, we divided it into three parts. The first part was optimizing the existing code so that, at a minimum, we would have a better working infrastructure. The second part was enabling partial downloads so a user can retrieve only the necessary part of the file without downloading it entirely. The last part was building a C++ Single Instruction, Multiple Data (SIMD) parser using Intel intrinsics. Below I have described each of these in detail.

Building a super-fast SIMD parser for dataset - The final episode

Pratham

2025-08-24 16:36

Welcome to the last episode of my Google Summer of Code series. In the previous post I showed how I could seek inside a large .bz2 file and decompress a region to get about 500 megabytes of raw data. That worked, but it still required downloading the full 6 gigabyte compressed file up front. After talking with the maintainers, we switched to a partial-download approach: a user requests a region and the system downloads only the 45–65 megabytes of compressed bytes that decompress to the exact 500 megabyte window we need. That change took some extra work, but it makes the system feel immediate for new users: you can get hundreds of thousands of parsed rows in a couple of minutes without pulling the whole archive.
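The partial-download idea can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not the project's actual code: it assumes the archive is built from independently compressed bz2 streams with a byte-offset index, whereas a real single-stream .bz2 file has blocks that need more careful handling. The helpers `build_archive`, `fetch_range` and `read_chunk` are hypothetical names, and `fetch_range` stands in for an HTTP request carrying a `Range` header.

```python
import bz2


def build_archive(chunks):
    """Compress each chunk as its own bz2 stream and record its byte range."""
    blob = bytearray()
    index = []  # one (start, end) pair of compressed offsets per chunk
    for chunk in chunks:
        compressed = bz2.compress(chunk)
        index.append((len(blob), len(blob) + len(compressed)))
        blob.extend(compressed)
    return bytes(blob), index


def fetch_range(blob, start, end):
    """Stand-in for an HTTP request with a 'Range: bytes=start-(end-1)' header."""
    return blob[start:end]


def read_chunk(blob, index, i):
    """Decompress only chunk i, touching only its compressed bytes."""
    start, end = index[i]
    return bz2.decompress(fetch_range(blob, start, end))


chunks = [f"line {i}\n".encode() * 1000 for i in range(5)]
archive, index = build_archive(chunks)
window = read_chunk(archive, index, 3)
print(len(window), "bytes decompressed from",
      index[3][1] - index[3][0], "compressed bytes")
```

The design point is the index: once the mapping from a raw-data window to a compressed byte range is known, the client never has to hold the rest of the archive.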

When building a parser that aims to beat Pandas’ vectorized operations, single-threaded concurrency isn’t enough. Concurrency is about handling multiple tasks by rapidly switching between them on a single core. It gives the illusion of things happening in parallel, but at any given instant only one task is actually running. That’s why it feels like multitasking in everyday life where you’re switching back and forth, but you’re not truly doing two things at the same time.

True parallelism, on the other hand, is about dividing independent work across multiple cores so that tasks literally run simultaneously. Each task makes progress without waiting for others to finish, which is what makes multiprocessing so powerful for workloads like parsing large datasets. SIMD vectorization applies the same idea inside a single core, with one instruction operating on several data elements at once.
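As a rough Python illustration of splitting independent parsing work across cores (the post's own parser is written in C++ with SIMD, which this does not reproduce), the sketch below parses batches of lines in separate worker processes. The function names and the whitespace-separated numeric line format are assumptions made for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def parse_chunk(lines):
    """Parse whitespace-separated numeric fields from a batch of text lines."""
    return [[float(field) for field in line.split()] for line in lines]


def split_batches(lines, n_batches):
    """Split lines into at most n_batches contiguous, order-preserving batches."""
    size = max(1, -(-len(lines) // n_batches))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def parse_parallel(lines, workers=4, executor_cls=ProcessPoolExecutor):
    """Parse batches in separate workers and stitch the rows back in order."""
    batches = split_batches(lines, workers)
    with executor_cls(max_workers=workers) as pool:
        parsed = pool.map(parse_chunk, batches)  # map preserves input order
        return [row for batch in parsed for row in batch]
```

Calling `parse_parallel(rows)` from a script should happen under an `if __name__ == "__main__":` guard, since worker processes may re-import the module. Passing `executor_cls=ThreadPoolExecutor` gives the concurrent-but-not-parallel variant for comparison, because CPU-bound Python threads share one interpreter lock in standard CPython builds.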