# Ideas page for GSoC 2015

Browse ideas for the participating projects listed below.

For each participating project, the ideas are organized from easiest to hardest.

Astropy

If you are interested in one of the following Astropy Project ideas, please see the Astropy GSoC 2016 Guidelines for additional information that is specific to Astropy.

Implement Scheduling capabilities for Astroplan

Suggested Mentor(s): Erik Tollerud, Eric Jeschke, Josh Walawender

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: Basic understanding of how astronomy observations work, practical experience a plus

Programming skills: Python

Description

The astroplan package is an Astropy affiliated package that provides tools for planning observations. One valuable feature that astroplan could provide is basic scheduling capabilities for an observing run. Many large observatories have their own schedulers, but this package would be targeted at the needs of the typical individual or small-collaboration observing run. While some initial efforts have occurred, this project would involve expanding those efforts into a full-fledged API and implementing both the interface and the actual scheduler(s).
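To give a flavour of what a first, very naive scheduler could do, here is a minimal sketch of a greedy altitude-based scheduler written against astropy.coordinates only (it deliberately avoids guessing astroplan's own, still-evolving API); the site, targets and time slots are made up:

```python
import numpy as np
import astropy.units as u
from astropy.time import Time
from astropy.coordinates import SkyCoord, EarthLocation, AltAz

# Made-up observatory and target list; a real scheduler would take these as input.
site = EarthLocation(lat=19.82 * u.deg, lon=-155.47 * u.deg, height=4200 * u.m)
targets = {"Vega": SkyCoord(ra=279.23 * u.deg, dec=38.78 * u.deg),
           "Altair": SkyCoord(ra=297.70 * u.deg, dec=8.87 * u.deg)}

# One-hour observing slots over a single night (UTC).
slots = Time("2016-06-01 06:00:00") + np.arange(8) * u.hour

schedule = []
for t in slots:
    frame = AltAz(obstime=t, location=site)
    # Greedy rule: observe whichever target is currently highest in the sky.
    alts = {name: coord.transform_to(frame).alt for name, coord in targets.items()}
    best = max(alts, key=lambda name: alts[name])
    schedule.append((t.iso, best, alts[best].deg))

for row in schedule:
    print("%s  %-8s  alt=%.1f deg" % row)
```

A real implementation would add constraints (airmass limits, moon separation, exposure and slew times) and a pluggable scheduler interface, which is exactly the API-design work this project is about.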

Ephemerides for Solar System objects in Astropy

Suggested Mentor(s): Marten van Kerkwijk, Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: Some understanding of astronomical coordinate systems, basic knowledge of solar system dynamics (or ability to learn as-needed to implement the specific algorithms required)

Programming skills: Python, some knowledge of C might be helpful

Description

An often-requested missing feature in Astropy is the ability to compute ephemerides: the on-sky location of Solar System objects like the planets, asteroids, or artificial satellites. This project would involve implementing just this feature. This will likely start with implementing a get_moon function similar to the existing get_sun to familiarize the student with the important concepts in the astropy.coordinates subpackage. The larger part of the project will likely involve using the orbital elements that the JPL Solar System dynamics group has already compiled (there is already a package to read these files: JPLEphem), and translating those into the Astropy coordinates framework. The student will implement these algorithms and also collaborate with the mentors and Astropy community to develop an API to access this machinery.
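For orientation, this is the existing get_sun interface that a get_moon function would likely mirror, plus the low-level jplephem call for reading a JPL kernel; the get_moon call and the local kernel file name are assumptions about how the eventual pieces would fit together:

```python
from astropy.time import Time
from astropy.coordinates import get_sun

t = Time("2016-04-21 00:00:00")
sun = get_sun(t)        # existing: apparent GCRS position of the Sun
# moon = get_moon(t)    # proposed: would mirror get_sun's signature (not yet implemented)

# Low-level access to JPL ephemerides via the jplephem package:
from jplephem.spk import SPK
kernel = SPK.open("de430.bsp")           # assumes a local copy of the DE430 kernel
x, y, z = kernel[3, 301].compute(t.jd)   # Moon relative to the Earth-Moon barycenter, in km
```

The project's work would be turning positions like the last line into proper astropy.coordinates frame objects with a clean public API.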

Implement Public API for ERFA

Suggested Mentor(s): Erik Tollerud, Tom Aldcroft

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None required, but may be helpful for understanding ERFA functionality

Programming skills: Python, Cython, C

Description

Some of the major functionality for Astropy uses the ERFA C library (adapted from the IAU SOFA library) as the back-end for computational “heavy-lifting”. Members of the community have expressed a desire to use this lower-level python wrapper around ERFA for other purposes that may not be directly relevant for Astropy. So this project would involve making the necessary changes to make the ERFA python API public. This includes:

  • Getting the documentation up to the astropy standard (currently it is mostly auto-generated verbatim from the C comments).
  • Implementing a more complete test suite for the python side of the code.
  • Possibly moving it to a separate package as part of the liberfa GitHub organization. This would also include making the necessary changes to ensure everything continues to work in Astropy.
  • Any other steps necessary to ensure the resulting package (or sub-package of Astropy) is stable and relatively easy to use.
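For context, a quick illustration of the kind of auto-generated binding this project would promote to a public API; it assumes the wrapper is importable as the private module astropy._erfa and that, as in the auto-generated wrappers, the ERFA function eraJd2cal is exposed as jd2cal:

```python
from astropy import _erfa as erfa   # private today; the project would make this public

# ERFA's eraJd2cal: convert a two-part Julian Date to a Gregorian calendar date.
year, month, day, fraction = erfa.jd2cal(2457388.5, 0.0)
print(year, month, day, fraction)   # 2016 1 1 0.0
```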

Web development for Gammapy

Suggested Mentor(s): Christoph Deil, Johannes King

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None.

Programming skills: Scientific python (Numpy, Scipy, Astropy), Web development (Python backend, Javascript frontend)

Description

Gammapy is a Python package for professional gamma-ray astronomers. We are looking for a web developer with good Python, HTML and Javascript skills who is interested in building web pages and apps to display and browse gamma-ray data and maybe even launch Gammapy analyses. There are a few different projects we’d like to see realised, depending on your interests and skills. One option is to build a much-improved version of TeVCat (a TeV catalog browser web page) that includes more image and catalog data and interactivity (maps that pan and zoom, a search field for source names), targeting the general public as well as professional gamma-ray astronomers. This project would mostly be front-end development, plus Python scripts to prepare the images and catalogs in suitable formats. Another option is to write several small static-site-generator scripts or Python web apps that let us browse the gamma-ray data and analysis results, basically a web GUI for Gammapy. That project would mostly be Python web app development, and you would have to learn a bit more about Gammapy before GSoC starts.

Data analysis for Gammapy

Suggested Mentor(s): Christoph Deil, Johannes King

Difficulty: Intermediate to Expert

Astronomy knowledge needed: Some, e.g. sky coordinates and projections. Experience with X-ray or gamma-ray data analysis (e.g. Fermi-LAT) is a plus, but not a requirement.

Method knowledge needed: Some experience in data analysis (e.g. images, regions) and statistics (e.g. Poisson noise).

Programming skills: Python (including pytest and Sphinx) and scientific python (Numpy, Scipy, Astropy)

Description

Gammapy is a Python package for professional gamma-ray astronomers. We are looking for someone who is interested in working on a few distinct data analysis tasks, each taking a few weeks of the GSoC total time. Gammapy is a very young project, and there’s a lot to do. Examples of what needs to be done include implementing new algorithms (e.g. image reprojection, source detection, region-based analysis), bringing existing prototype algorithms to production (improving the API and implementation, adding tests and docs), as well as grunt work that’s needed to move towards production quality and a Gammapy 1.0 release this fall (e.g. setting up continuous integration for example IPython notebooks, or adding more tests). To get an idea of what is going on in Gammapy and what still needs to be done, please check out the project on Github (https://github.com/gammapy/gammapy) and browse the documentation a bit (or try out the examples). If this looks interesting to you, send us an email and let us know what your skills and interests are.

Implement PSF photometry for fitting several overlapping objects at once

Suggested Mentor(s): Moritz Guenther, Brigitta Sipocz

Difficulty: Intermediate to Expert

Astronomy knowledge needed: basic understanding of what photometry is

Programming skills: Python

Description

The photutils package is an Astropy affiliated package that provides tools for photometry (measuring how bright a source is).

There are several ways to do photometry, and the package currently implements aperture photometry (just add up all the flux in an image in some area) and single-source point-spread-function (PSF) fitting (fit a function such as a Gaussian to the image). In many situations sources may overlap in the image, e.g. when observing a dense star cluster, so that we need to fit many functions at once. However, the simple brute-force approach of just fitting a model with hundreds of parameters (if there are hundreds of stars) usually fails.

This project includes looking at other astronomy codes to see how they tackle the problem; selecting, modifying and improving an algorithm that fits into the astropy modelling framework; implementing this in Python; and, if it turns out that speed is a problem, moving speed-critical parts to Cython. To verify that the new code works, we will compare it to the solutions of established PSF photometry codes.
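A hedged sketch of the core idea using astropy.modeling compound models: two overlapping Gaussians fitted simultaneously to a small synthetic image. A real implementation would build and manage such compound models for many sources inside photutils; the numbers here are made up:

```python
import numpy as np
from astropy.modeling import models, fitting

# Synthetic image with two overlapping sources plus noise.
y, x = np.mgrid[0:25, 0:25]
truth = (models.Gaussian2D(5, 9, 12, 1.5, 1.5) +
         models.Gaussian2D(3, 13, 12, 1.5, 1.5))
image = truth(x, y) + np.random.normal(0, 0.05, x.shape)

# Simultaneous fit: the compound model exposes all parameters at once.
init = (models.Gaussian2D(4, 8, 12, 2, 2) +
        models.Gaussian2D(4, 14, 12, 2, 2))
fitter = fitting.LevMarLSQFitter()
fit = fitter(init, x, y, image)
print(fit.parameters)
```

The hard part of the project is making this scale to hundreds of sources (grouping, local fitting, good starting values), which the brute-force version above does not do.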

See https://github.com/OpenAstronomy/openastronomy.github.io/pull/27 for a discussion of some problems and possible solutions that will be addressed in this project.

Bridge sherpa and astropy fitting

Suggested Mentor(s): D. Burke, T. Aldcroft, H. M. Guenther

Difficulty: Expert or better

Astronomy knowledge needed: fitting functions and statistics

Programming skills: Python, C, Cython

Description

Both astropy and Sherpa (https://github.com/sherpa/sherpa/) provide modelling and fitting capabilities; however, Sherpa’s features are considerably more advanced. Sherpa provides far more built-in models, a larger choice of optimizers and a real variety of fit statistics. Unfortunately, Sherpa is less well known, and for historical reasons the object-oriented user interface is less polished than the functional state-based interface. The main goal is to bring Sherpa’s optimizers and fit statistic functions to astropy; the stretch goal is to develop a bridge between both packages such that a user can use astropy models completely interchangeably with Sherpa models and fitters. Sherpa models should look like astropy models to astropy, to enable situations where the model can be made out of three components (a user-defined model, an astropy model and a Sherpa model) and this is then fitted to astropy data using the Sherpa fitters.

This project requires the student to become proficient in two major packages (not an easy task!), but with code written in just a few weeks of GSoC it will give astropy users access to fitting capabilities that required many years of developer time and that are unfeasible to redevelop from scratch.
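The shape of the bridge can be sketched without touching Sherpa's own API: an external engine only needs to set an astropy model's flat parameter vector and evaluate a statistic. In this minimal sketch scipy stands in for a Sherpa optimizer/statistic pair; the bridge itself would plug Sherpa in at exactly these two points:

```python
import numpy as np
from scipy.optimize import minimize
from astropy.modeling import models

# Synthetic data from a Gaussian line.
x = np.linspace(-5, 5, 200)
truth = models.Gaussian1D(amplitude=3, mean=0.5, stddev=1.2)
y = truth(x) + np.random.normal(0, 0.1, x.size)

model = models.Gaussian1D(amplitude=1, mean=0, stddev=1)

def chi2(pars):
    # An external fitting engine (scipy here, standing in for a Sherpa
    # optimizer and statistic) only needs the flat parameter vector and
    # a way to evaluate the model.
    model.parameters = pars
    return np.sum((model(x) - y) ** 2 / 0.1 ** 2)

result = minimize(chi2, model.parameters, method="Nelder-Mead")
model.parameters = result.x
print(model)
```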

Enhancements to Ginga, a Toolkit for Building Scientific Image Viewers

Suggested Mentor(s): Eric Jeschke, Pey-Lian Lim, Nabil Freij

Difficulty: Beginning to Advanced, depending on project choices

Astronomy knowledge needed: Some, depending on project choices

Programming skills: Python and scientific python (Numpy, Scipy, Astropy), git version control

Desirable: OpenCL, Javascript/web sockets, C/C++ programming, experience in image or array processing, concurrent programming, experience in using GUI toolkits, github-based workflow

Description

Ginga is a toolkit for constructing scientific image viewers in Python, with an emphasis toward astronomy. Ginga is being used at a number of observatories and institutes for observation and instrument control, quick look, and custom data reduction and analysis tasks. The general aim is to build upon this toolkit, improving its current features, and to expand it so that scientists can easily accomplish preliminary data analysis.

We are looking for an individual to work on a few select project areas, depending on skill level and interest. Each project area itself would form a small part of the overall GSoC project. Essentially it would be a large pick-and-mix, but do not let this put you off: this approach allows a range of different contributions to be made to the Ginga toolkit, of your choosing.

Beginning-level:

  • Improve and expand Ginga’s unit test suite and coverage
  • Improve documentation and tutorials, including via Jupyter notebooks and video voice-overs
  • Improve our “native app” packaging for Mac, Unix and Windows
  • Improving LineProfile and Slit plugins
  • Enhance existing plugins by adding GUIs for some common tasks like configuring catalog sources, which are currently done by editing config files
  • Add support for loading broken FITS files by [“fingerprinting” them](https://github.com/ejeschke/ginga/issues/205)

Intermediate-level:

  • Improve Ginga backends for web browsers (native javascript/web sockets and/or Jupyter notebooks and/or Bokeh server)
  • Enhancements to “traditional” GUI backends (e.g. add support for gtk3, AGG support for python 3, improvements to Qt-based widgets)
  • Graft the astropy-helpers package into Ginga
  • Adding support for calculating approximate line-of-sight velocities
  • Enhance existing plugins for data analysis tasks, usually featuring astropy or affiliated packages

Advanced-level:

  • Implement an OpenCL module that leverages CPU and GPU resources for accelerating some common image processing operations (scaling, transformations, rotations) on numpy image arrays. Benchmark against current CPU based solutions.
  • Improving IO speeds by optimizing use of astropy.fits.io/cfitsio/numpy, lazy reads, file caching hints, optimizing concurrency, etc.
  • Adding support for a binary file format used by a very popular ground-based solar telescope and extending it to support Stokes data products

If you are interested in working on any of these aspects, or want to propose some other work on Ginga, please sign in to Github and comment on Assist the Ginga Project.

Astropy core package

Implement Distribution Support for Quantity

Suggested Mentor(s): Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: none, but statistics knowledge/background useful

Programming skills: Python

Description

The Quantity class is powerful but doesn’t have particularly useful support for uncertainties on quantities or other statistical approaches to thinking about numbers. A very straightforward way to make progress on this would be to create a subclass of Quantity called “Distribution” (or similar) that represents a probability density function of a quantity as Monte-Carlo-sampled arrays. This project would involve implementing this subclass, propagating operations while combining distributions, as well as tools for extracting useful information from such distributions. If there is time, this could also involve expanding this system to support common analytically-representable distributions such as Gaussian and Poisson distributions.
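The proposed Distribution class does not exist yet, but the underlying Monte-Carlo idea can be illustrated with plain Quantity arrays of samples; normal_samples below is a made-up helper standing in for the eventual class:

```python
import numpy as np
import astropy.units as u

def normal_samples(center, std, n=100000):
    """Monte-Carlo samples standing in for the proposed Distribution class."""
    return np.random.normal(center.value,
                            std.to(center.unit).value, n) * center.unit

# Two uncertain quantities: arithmetic on the sample arrays propagates the
# distributions automatically because Quantity already handles the units.
distance = normal_samples(780 * u.kpc, 25 * u.kpc)
velocity = normal_samples(110 * u.km / u.s, 5 * u.km / u.s)
travel_time = (distance / velocity).to(u.Gyr)

print(travel_time.mean(), travel_time.std())
```

The project would wrap this pattern in a proper Quantity subclass, so users get propagation and summary statistics without writing the sampling boilerplate themselves.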

Implement image rasterization methods for models

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate

Astronomy knowledge needed: Basic

Programming skills: Python, Cython

Description

When fitting models to binned data, evaluating the model at the bin centers leads to incorrect results if the model changes significantly within a bin. For example, think of an image where the point spread function (PSF) has a width only slightly above the pixel size and you want to distinguish small galaxies from stars.

Currently Astropy models have an evaluate method that can be used to evaluate them on a grid of pixel centers; there is also an oversampling function to get a better representation of the expected flux in pixels. It would be useful to add methods that allow fast and precise rasterization of models, similar to what graphics libraries do (sparse subsampling or resampling of models evaluated on grids that are appropriate for each model, or anti-aliasing).

There are different options for how to proceed with this project, e.g. possibly add optional extension, sampling grid and bounding box information to the Astropy model classes, or contribute rasterisation code to astropy.modeling or photutils, or expand the existing resampling code in the reproject package. The student should be interested in model fitting and image rasterisation as well as profiling and extensive testing of a given method to make it “just work” for the end user.
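One simple rasterization strategy, shown here only as a made-up baseline the project could benchmark against, is uniform oversampling followed by block averaging; the final design would likely be adaptive rather than this brute-force version:

```python
import numpy as np
from astropy.modeling.models import Gaussian2D

def rasterize(model, shape, oversample=5):
    """Evaluate `model` on a grid `oversample` times finer than the pixel
    grid and average back down, approximating the mean value per pixel."""
    ny, nx = shape
    step = 1.0 / oversample
    # Sub-pixel centres covering each pixel [i - 0.5, i + 0.5].
    y, x = np.mgrid[0:ny:step, 0:nx:step] + (step - 1.0) / 2.0
    fine = model(x, y)
    return fine.reshape(ny, oversample, nx, oversample).mean(axis=(1, 3))

# A PSF-sized Gaussian: centre-point evaluation vs. oversampled rasterization.
psf = Gaussian2D(amplitude=1, x_mean=3, y_mean=3, x_stddev=0.6, y_stddev=0.6)
yc, xc = np.mgrid[0:7, 0:7]
print(psf(xc, yc)[3, 3], rasterize(psf, (7, 7))[3, 3])
```

The printed pair shows the difference between evaluating at the pixel centre and averaging over the pixel, which is exactly the error this project is meant to control.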

Add indexing capability to Table object

Suggested Mentor(s): Tom Aldcroft (Astropy), Stuart Mumford (SunPy)

Difficulty: Intermediate

Astronomy knowledge needed: none

Programming skills: Python, Cython, familiarity with database algorithms

Description

The Table class is the core astropy class for storing and manipulating tabular data. Currently it supports a limited set of database-like capabilities, including table joins and grouping. A natural extension of this is to provide the ability to create and maintain an index on one or more columns, as well as a table primary key. With these indexed columns available, certain selection and query operations could be highly optimized. The challenge is to maintain the integrity of the indexes as column or table properties change, using state-of-the-art algorithms for high performance.

There are various uses of this functionality, such as supporting time series data, where the index column would allow you to sort the Table correctly as well as perform operations such as truncations and merges while maintaining the integrity of the time series. Other uses include catalogs of positions in the night sky, where an index column of astropy coordinate objects would maintain the uniqueness of every position.

To summarize:

  • Add method to create an index for a specified column
  • Add code to maintain these indexes when the table is modified
  • Add method to designate a column as a primary key (possibly maintaining table in sort order for that key)
  • Optimize existing table operations to use indexes where possible
  • Add new methods to select table rows based on column values
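From the user's side, the finished feature might look something like the sketch below; the add_index and loc names are hypothetical placeholders for the API the project would settle on:

```python
from astropy.table import Table

t = Table({"time": [3.1, 1.2, 2.7], "flux": [10.0, 12.0, 9.5]})

# Hypothetical API this project might provide:
# t.add_index("time")    # build and maintain an index on a column
# t.loc[1.2]             # fast row retrieval via the index
# t.loc[1.0:3.0]         # range queries on the indexed column

# Until then, selections fall back to full scans such as:
print(t[t["time"] < 3.0])
```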

Unify and improve file handling

Suggested Mentor(s): Michael Droettboom

Difficulty: Intermediate to Expert

Astronomy knowledge needed: none

Programming skills: Python, Unix features

Description

We have a number of packages that read and write data to files and file-like objects. While there was some initial effort to unify this code in get_readable_fileobj and others, in general each package is handling its own file I/O. This sort of code is notoriously difficult to get right across versions of Python and the different platforms we support, so it would be beneficial to remove this duplication. This also means that some features, such as gzip handling or URL handling, are not universally available or are inconsistent across packages. Once this is unified, we can move on to some more advanced features that don’t exist anywhere in astropy, such as HTTP Range fetching (see astropy/#3446) and OS-level file locking to make multiprocessing applications that write to files more robust.
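For reference, this is roughly how the existing helper is used today; the file name is a hypothetical example, and the point is that the same call transparently handles local paths, gzipped files and URLs:

```python
from astropy.utils.data import get_readable_fileobj

# The same call handles a plain local file, a gzipped file or a URL;
# "catalog.csv.gz" is just a hypothetical example file.
with get_readable_fileobj("catalog.csv.gz") as f:
    first_line = f.readline()
print(first_line)
```

The project would make this (or a successor to it) the single I/O entry point used consistently across the astropy sub-packages.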

Implement missing astropy.modeling functionality

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate to expert

Astronomy knowledge needed: Basic

Programming skills: Python

Description

Implement some basic features that are still missing in the astropy.modeling package:

  • Fit parameter errors (symmetric and profile likelihood)
  • Poisson fit statistic
  • PSF-convolved models
  • model parameter and fit result serialisation, e.g. to YAML or JSON or XML (e.g. some astronomers use XML)

For the parameter error and Poisson fit statistic parts, some statistics background is needed, as well as an interest in discussing and finding a good API for these things.
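For instance, a Poisson fit statistic usually means the Cash statistic, which is short enough to sketch against an astropy model; the data here are synthetic and the helper name is illustrative, not a proposed API:

```python
import numpy as np
from astropy.modeling.models import Gaussian1D

def cash(data, model_counts):
    """Cash (Poisson log-likelihood) statistic, C = 2 * sum(m - n * ln m)."""
    return 2.0 * np.sum(model_counts - data * np.log(model_counts))

x = np.arange(50)
model = Gaussian1D(amplitude=20, mean=25, stddev=4)
counts = np.random.poisson(model(x) + 1.0)    # synthetic Poisson data with a flat background
print(cash(counts, model(x) + 1.0))           # the value a Poisson fitter would minimize
```

The project's job is wiring a statistic like this into the Fitter machinery and returning sensible parameter errors alongside it.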

An optional fun application at the end of this project (if model and fit result serialisation is implemented) could be to develop an interactive image fitting GUI (e.g. with IPython widgets in the web browser) for common 2D Astropy models, showing data, model and residual images and letting the user adjust model parameters and display fit statistics and results interactively.

Implement framework for handling velocities and velocity transforms in astropy.coordinates

Suggested Mentor(s): Adrian Price-Whelan & Erik Tollerud

Difficulty: Intermediate to Expert

Astronomy knowledge needed: understanding of coordinate transformations, some knowledge of astronomical coordinate systems would be useful

Programming skills: Python

Description

The coordinates subpackage currently only supports transforming positional coordinates, but it would be useful to develop a consistent framework for also transforming velocities (e.g., proper motion to proper motion, or proper motion to cartesian) with full support for barycentric, galactocentric, and LSR motion. This project could be:

  1. working with us to develop a consistent API for handling velocities within coordinates,
  2. developing a trial implementation of an API,
  3. actually doing core development to implement the new features, or
  4. some combination of all of the above.
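To give a flavour of what the framework should make easy, here is the kind of conversion users currently do by hand with unit equivalencies (values are for a hypothetical star):

```python
import astropy.units as u

# Proper motion and distance of a hypothetical star.
pm = 35.0 * u.mas / u.yr
distance = 250.0 * u.pc

# Tangential velocity; dimensionless_angles() lets the angle unit drop out.
v_tan = (pm * distance).to(u.km / u.s, equivalencies=u.dimensionless_angles())
print(v_tan)    # roughly 41 km/s
```

A velocity framework inside astropy.coordinates would attach quantities like this to frames and transform them consistently, instead of leaving the bookkeeping to the user.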

Implement Public API for ERFA

Suggested Mentor(s): Erik Tollerud

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None required, but may be helpful for understanding ERFA functionality

Programming skills: Python, Cython, C

Description

Some of the major functionality for Astropy uses the ERFA C library (adapted from the IAU SOFA library) as the back-end for computational “heavy-lifting”. Members of the community have expressed a desire to use this lower-level python wrapper around ERFA for other purposes that may not be directly relevant for Astropy. So this project would involve making the necessary changes to make the ERFA python API public. This includes:

  • Getting the documentation up to the astropy standard (currently it is mostly auto-generated verbatim from the C comments).
  • Implementing a more complete test suite for the python side of the code.
  • Possibly moving it to a separate package as part of the liberfa GitHub organization. This would also include making the necessary changes to ensure everything continues to work in Astropy.
  • Any other steps necessary to ensure the resulting package (or sub-package of Astropy) is stable and relatively easy to use.

Packages affiliated with Astropy

Develop an affiliated package for observation planning / scheduling

Suggested Mentor(s): Christoph Deil

Difficulty: Beginner

Astronomy knowledge needed: Intermediate

Programming skills: Python

Description

Now that Astropy can transform from horizontal (altitude/azimuth) to sky coordinates, it’s possible to develop tools for observation planning / scheduling (see here for an example). It would be nice to start developing an affiliated package that can be used by observers and observatories to plan and schedule observations. This project could go in a few different directions, including:

  • creating typical tables and plots for observation planning
  • optimising scheduling of observations for given target lists and telescope slew speed / exposure lengths for a given night or even month / year
  • contribute sun / moon rise / set functionality to astropy coordinates
  • a desktop or web GUI

The project could start with a look at the functionality of existing tools and then gather some input on the astropy mailing list about what the community wants. The student should have an interest in coordinates, observation planning / scheduling and plotting / GUIs.

Contribute gamma-ray data analysis methods to Gammapy

Suggested Mentor(s): Christoph Deil, Axel Donath

Difficulty: Beginner to intermediate

Astronomy knowledge needed: Basic

Programming skills: Python

Description

Gammapy is an Astropy-affiliated package to simulate and analyse data from gamma-ray telescopes such as Fermi, H.E.S.S. and CTA. A lot of basic functionality is still missing; specifically, we think that contributing to one of the sub-packages gammapy.background (background modeling), gammapy.detect (source detection methods) or gammapy.spectrum (spectral analysis methods) would be a good GSoC project if you are interested in implementing specific established data analysis algorithms used in gamma-ray astronomy (e.g. adaptive-ring, reflected-region or template background estimation, or spectrum forward-folding or unfolding methods). No prior experience with gamma-ray data is needed.

Astropy Acknowledgement/Citation Generator

Suggested Mentor(s): Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: none, although some experience with astronomy citation practices might be useful

Programming skills: Python and LaTeX/BibTeX

Description

Some parts of Astropy and affiliated packages use algorithms or tools that have been published in the scientific literature (this includes Astropy itself). To encourage citing these works, it would be useful if Astropy had a feature to allow attaching citations to methods, functions, or packages. This would then allow a user to simply run a function along the lines of “write_citations” and have it print or write a file that tells them what papers to cite. Bonus points if this actually can show BibTeX or LaTeX bibliography entries that can be just dropped into papers with minimal effort on the part of the user.
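A purely hypothetical sketch of what such machinery could look like; the decorator, registry and write_citations names are placeholders, not an existing Astropy API:

```python
# Hypothetical sketch of the proposed machinery; all names are placeholders.
_CITATION_REGISTRY = {}

def cites(bibcode):
    """Decorator attaching a citation to a function and recording it globally."""
    def decorator(func):
        _CITATION_REGISTRY.setdefault(func.__module__, set()).add(bibcode)
        func.__citation__ = bibcode
        return func
    return decorator

@cites("2013A&A...558A..33A")   # the Astropy paper, as an example entry
def sigma_clip_lightcurve(data):
    ...

def write_citations():
    """Print the citations attached to everything the user has used."""
    for module, refs in sorted(_CITATION_REGISTRY.items()):
        print(module, "->", ", ".join(sorted(refs)))
```

Turning the recorded bibcodes into ready-to-paste BibTeX entries (e.g. via ADS) would be the "bonus points" part of the project.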

Adding further spectral standards to specutils

Suggested Mentor(s): Adam Ginsburg & Wolfgang Kerzendorf

Difficulty: Intermediate

Programming skills: Python

Description

Specutils is a package within the astropy collection that deals with operations on spectra. Apart from imaging, spectra are the second main data product in astronomy. While imaging data is collected by hooking a giant DSLR to the end of a telescope and sticking coloured glass (a filter) between the telescope and the DSLR, spectra are obtained by breaking light up into its components and then observing the resulting distribution. These data are saved in a variety of formats.

Currently, we are able to read and write a subset of the standards that are out there. As a project, we suggest implementing the remaining unsupported standards. All of the code is in Python, and a good understanding of classes is needed for this project.
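As an example of what "a standard" means in practice, here is a reader for one common (and comparatively simple) convention: flux in the primary HDU with a linear wavelength axis described by CRVAL1/CDELT1/CRPIX1. The file name is hypothetical, and many of the formats the project would add are considerably more involved:

```python
import numpy as np
from astropy.io import fits

# One common on-disk convention: flux in the primary HDU, linear wavelength
# axis described by CRVAL1/CDELT1/CRPIX1 (real files vary a lot).
with fits.open("spectrum.fits") as hdul:          # hypothetical file
    flux = hdul[0].data
    hdr = hdul[0].header
    pix = np.arange(flux.size)
    wave = hdr["CRVAL1"] + (pix + 1 - hdr.get("CRPIX1", 1)) * hdr["CDELT1"]
```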

Improve pyregion and pyds9

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate

Astronomy knowledge needed: Basic

Programming skills: Python

Description

The pyregion package is very useful for working with ds9 and CIAO region files. It is now at https://github.com/astropy/pyregion but it is unfinished … someone has to improve and polish it. In particular, the region file parser is very slow (see pyregion#48), and someone interested in parsing should find out why and make it fast. There are several other things to do, e.g. using astropy coordinates everywhere and implementing tests so that it is compatible with ds9 to a very high accuracy. The package could also be extended with Python functions to read / write / visualise MOC files, or to unify and improve the existing Python interfaces to ds9. The student should be interested in sky coordinates and regions, parsing, visualisation, and writing tests and docs; for the ds9 interfaces some Cython coding is probably needed.
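For reference, the basic workflow the project would speed up and extend looks roughly like this (file names are hypothetical):

```python
import pyregion
from astropy.io import fits

hdu = fits.open("image.fits")[0]            # hypothetical image
regions = pyregion.open("sources.reg")      # ds9 region file (parsed by the slow parser today)
mask = regions.get_mask(hdu=hdu)            # boolean mask of pixels inside the regions
print(mask.sum(), "pixels selected")
```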

Revamp astropython.org web site

Suggested Mentor(s): Tom Aldcroft

Difficulty: Intermediate

Astronomy knowledge needed: Basic / none

Programming skills: Python, web development (javascript etc)

Description

The http://www.astropython.org site is one of the top two generic informational / resource sites about Python in astronomy. This site uses Google App Engine and is basically all custom code built around the bloggart engine. Currently it is getting a bit stale for a few reasons:

  • There is no good mechanism for guest posting to expand the community of people contributing.
  • It is painful to add content because of the antiquated entry interface, which now seems to work only on Firefox.
  • The comment system is lacking (no feedback to comment authors etc).
  • The website code itself is convoluted and difficult to maintain / improve

The proposal is to start over with all modern tools to bring fresh energy and involvement into this project. All details of how to do this are to be determined, but one requirement is to migrate all the current content. Part of this would be re-evaluating current resources as well as digging around to freshen up the resource list.

CasaCore

Improve Python bindings to CasaCore measures

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Intermediate

Astronomy knowledge needed: Some understanding of astronomical coordinate systems and transformations

Programming skills: Python, some C++

Description

CasaCore contains many features to perform astronomical coordinate transformations, for example from B1950 to J2000, or from J2000 to azimuth-elevation. Moreover, it can compute ephemerides, which may make it useful for many other projects (see http://casacore.github.io/casacore-notes/233). The current Python binding, python-casacore, contains a binding to the measures library, but it is not a very programmer-friendly binding and is thus not much used. An interface to measures exists within CasaCore that makes converting coordinates much easier. This interface was written with TaQL in mind. This project concerns modifying the TaQL measures interface into a Python measures interface, thus making casacore measures easily accessible from Python.

Frequency conversions for TaQL / python-casacore

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Beginner / Intermediate

Astronomy knowledge needed: Some understanding of use of astronomical frequencies (regarding Doppler shifts etc.)

Programming skills: C++

Description

The casacore measures module contains code for converting frequencies between various reference frames (e.g. rest frequency, geocentric, topocentric, galactocentric). Having this module available in TaQL would make it much more convenient to perform these kinds of conversions. Example code exists for other conversions; see e.g. http://casacore.github.io/casacore/group__MeasUDF__module.html

This project concerns writing such a converter for the Doppler and Frequency conversions. It will require tweaking in boost-python, but since the example code is available for other measures, it should not be too hard.

General python-casacore cleanup

Suggested Mentor(s): Gijs Molenaar, Ger van Diepen

Difficulty: Intermediate

Astronomy knowledge needed: none

Programming skills: python

Description

The current python-casacore code is already much improved over the previous “pyrap” implementation. This python binding to casacore is now python 3 compatible, contains some unit tests, etc. But some work remains to be done:

  • Remove all compile warnings
  • Modernise the code, add missing features, and maybe make it more ‘pythonic’.
  • Improve test coverage (24% at the moment)

This is a typical project for learning to write good code.

Table plotting for python-casacore

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Beginner

Astronomy knowledge needed: Some idea about astronomical units

Programming skills: Python

Description

Radio interferometric data sets are almost always stored in casacore “Measurement Sets”. These can be queried through TaQL (see e.g. http://casacore.github.io/casacore-notes/199). It would be nice to have a plotting routine in python-casacore to easily plot two columns against each other, with nicely formatted axes etc. (possibly using wcsaxes).

This would, at the very least, make a nice extension to the taql jupyter kernel underneath http://taql.astron.nl
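The rough shape of such a helper is sketched below; the Measurement Set name is hypothetical and the chosen columns are deliberately trivial, since the real project is the general plotting routine and the nicely formatted axes:

```python
import matplotlib.pyplot as plt
from casacore.tables import taql

# Select two columns from a (hypothetical) Measurement Set with TaQL and plot them.
t = taql("SELECT TIME, ANTENNA1 FROM my.ms LIMIT 1000")
plt.plot(t.getcol("TIME"), t.getcol("ANTENNA1"), ".")
plt.xlabel("TIME")
plt.ylabel("ANTENNA1")
plt.show()
```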

ChiantiPy

GUI Spectral Explorer

Suggested Mentor(s): Ken Dere

Difficulty: Intermediate

Astronomy knowledge needed: A basic understanding of astrophysical spectroscopy

Programming skills: Python

Description

The goal of this project is to provide a graphical user interface that enables a user to explore observed spectra and compare them with theoretical spectra. The basis for the theoretical spectra is the CHIANTI atomic database for astrophysical spectroscopy, first released in 1997. Programmatic access to the database, which is freely available, is provided by the ChiantiPy package – a pure Python package. It is highly object oriented, with each ion, such as Fe XVII, being the basic object. Higher level objects are often assembled from a collection of ions, such as when calculating a spectrum. ChiantiPy uses the CHIANTI database to calculate line and continuum intensities as a function of temperature and electron density. This can be done for a set of elemental abundances in CHIANTI or for a user-provided set of elemental abundances. At present, if a user wants to compare observations with CHIANTI theoretical spectra, it must be done on a case-by-case basis. A GUI explorer, written in Python and preferably PyQt or Wx based, will provide an integrated tool to import observed spectra and plot them alongside theoretical spectra. It will further allow the user to understand which spectral lines contribute to various spectral line profiles, and how the predicted spectra vary as a function of temperature and density.

It will be necessary to develop techniques to import observed spectra from a variety of sources. Typical sources are FITS files, HDF5 files, or csv files. It will also be important to allow users to import their data through modules of their own.

IMS

Solar Storms forecasting server

Suggested Mentors: Antonio del Mastro, Olena Persianova

Difficulty: Intermediate to Hard

Astronomy knowledge needed: None beforehand, the student will be required to research relevant publications.

Programming skills: advanced Python; basic Theano or TensorFlow; basic Django or Flask; experience with some ANN library, such as Keras, theanets or Lasagne.

Description:

Solar storms are responsible for disruption of satellite communication and damage to electronic equipment in space. The storms also have to be taken into account for EVA and habitat maintenance activities, as the higher radiation levels they bring have a detrimental effect on crew members’ health.

Prediction of these storms is essential to prevent said damage. A lot of astronomical data is generated on a daily basis, and this could be used in conjunction with machine learning methods to predict solar storms.

In this project, the student will be required to:

  • Using a machine learning approach, predict the duration and intensity of solar storms:
    • The student should preferably use an artificial neural network approach (although alternatives, such as random forests, SVMs, Bayesian models or HMMs, can be considered).
    • The predictions should be given 24-48 hours in advance of a storm (depending on viability).
    • The student should evaluate training and test data provided by IMS, or find a suitable dataset if the data provided is unsuitable.
    • The student should evaluate an approach suggested by the IMS to test the model’s performance, or propose a testing procedure of his/her own.
  • Provide information on a dynamically updated web page, using preferably Django or Flask, which should at least include:
    • The real-time and historical sensor values, as plots where appropriate.
    • Useful statistics about the sensors (TBD).
    • The model’s predictions.
    • Useful statistics about the predictions (e.g. RMSE).
  • Incorporate the prediction model and the web page into the ERAS ecosystem, which means building Tango device servers (at least one for the predictor, more if necessary).

Currently, a few features are being used for the prediction of solar storms, among others:

  1. Radio flux
  2. Sunspot area
  3. Sunspot Number
  4. X-ray Background Flux

We recommend that the student research the viability of using additional features in the model.
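A minimal sketch of the prediction piece with Keras, using placeholder random arrays in place of the real feature and label data; the architecture is arbitrary and only meant to show the moving parts:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Placeholder arrays: rows = days, columns = the four features listed above
# (radio flux, sunspot area, sunspot number, X-ray background flux).
X = np.random.rand(1000, 4)
y = np.random.rand(1000, 1)     # e.g. storm intensity 24-48 h later

model = Sequential()
model.add(Dense(32, activation="relu", input_dim=4))
model.add(Dense(16, activation="relu"))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mse")
model.fit(X, y)                 # defaults only; the real work is data preparation and validation
print(model.predict(X[:5]))
```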

Some other resources to get started:

NASA’s Solar Storm and Space Weather FAQ

Space Weather Prediction Center’s Historical SWPC Products and Data Displays

Note: If you are interested in this project, please apply via the Python Software Foundation.

JuliaAstro

Image compression and efficient table reading in FITSIO.jl

Suggested Mentor(s): Kyle Barbary, Ryan Giordan

Difficulty: Intermediate to Expert

Astronomy knowledge needed: none

Programming skills: Julia, some C

Description

FITS (Flexible Image Transport System) format files are the standard containers for imaging and tabular data in astronomy. The FITSIO.jl package provides support for reading and writing these files in Julia. It is implemented as a high-level, yet efficient, wrapper for the C library cfitsio. This project would involve improving the available functionality and performance of FITSIO.jl. Some desired features, such as I/O of compressed images, reading and writing subsets of tables and appending to existing tables, already have implementations in cfitsio. Implementing these would involve understanding the cfitsio API and writing wrappers in Julia using ccall. A more advanced feature is fast I/O of large tables with multiple columns. No C API exists for this – it would involve understanding the memory layout of FITS tables, pointer arithmetic and byteswapping. It will be interesting to see whether this can be done efficiently in Julia or if a C shim is required. Finally, benchmarks should be implemented to ensure that FITSIO.jl performance is as good as possible.

For a more detailed list of features, see this issue.

SunPy

Lightcurve Refactor

Suggested Mentor(s): Stuart Mumford, Dan Ryan, Andrew Inglis, Jack Ireland

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python

Description

The Lightcurve class is one of the three core datatypes in SunPy, along with Map and Spectra. Lightcurve is designed to read in, process and store meta data related to solar physics time series data. Currently, Lightcurve uses the pandas library as its underlying data structure; however, this is subject to change in the future.

Much like the map submodule, lightcurve needs to be able to read in various supported data formats (such as FITS, ascii and others in the future), store their meta data and give users unified access to this metadata independently of the original source of the data.

As currently implemented (as of 0.6) the lightcurve module performs three core tasks:

  1. Download the raw data
  2. Read this data into a pandas dataframe
  3. Store the meta data obtained with the data.

As of the SunPy 0.7 release the first stage will be moved out of lightcurve and into the net subpackage as part of the UnifiedDownloader Pull Request. This leaves lightcurve in a similar position to map where the data acquisition is not part of the core data type and is managed separately.

The objective of this project is to re-implement the core of the lightcurve submodule so that it no longer contains the code to download data from the internet. The lightcurve module should be able to open files from disk that have been downloaded using the new UnifiedDownloader submodule. The lightcurve factory must be able to read files from multiple sources, some of which can be auto-detected and some of which cannot. The lightcurve module must also be able to combine multiple files into a single timeseries.
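An illustrative sketch of the Map-style factory/registration pattern the new Lightcurve would adopt; the names here are placeholders, and the actual API is specified in SEP 7:

```python
# Illustrative sketch of the Map-style factory pattern (the real API is in SEP 7).
_SOURCES = []

def register_source(matcher):
    """Register a lightcurve source class with a metadata-matching test."""
    def decorator(cls):
        _SOURCES.append((matcher, cls))
        return cls
    return decorator

@register_source(lambda meta: meta.get("TELESCOP", "").startswith("GOES"))
class GOESLightCurve:
    def __init__(self, data, meta):
        self.data, self.meta = data, meta

def Lightcurve(data, meta):
    """Factory: pick the registered source whose matcher accepts the metadata."""
    for matcher, cls in _SOURCES:
        if matcher(meta):
            return cls(data, meta)
    raise ValueError("no lightcurve source recognises this file")
```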

Expected Outcomes

Someone undertaking this project will complete the following tasks:

  1. Become familiar with the UnifiedDownloader code; if it has not been accepted into the SunPy codebase, complete the remaining tasks for this to be achieved.
  2. Write a factory class for lightcurve similar to the sunpy.map.Map class. This class will be a generic constructor for lightcurve allowing the user to instantiate any one of the many subclasses of GenericLightcurve present in sunpy.lightcurve.sources. The API design for the factory class is in SEP 7.
  3. Design and develop a robust method of dealing with lightcurve meta data, which can handle joining different parts of timeseries from different files, each with their own meta data. (See #1122)

A successful proposal for this project will demonstrate that the applicant has understood the mechanism behind the Map factory as already implemented in SunPy, and will present a timeline of what needs to change in Lightcurve to mirror the design of Map and follow the design for Lightcurve in SEP 7.

Implementing AIA response functions in SunPy

Suggested Mentor(s): Drew Leonard, Will Barnes

Difficulty: Beginner

Astronomy knowledge needed: Some knowledge of coronal emission processes would be beneficial.

Programming skills: Python.

Description

The CHIANTI atomic physics database is a valuable resource for solar physics. The CHIANTI database holds a large amount of information on the physical properties of different elements in different ionisation states and enables the calculation of various parameters from this information. Using CHIANTI it is possible to calculate the spectra of various types of solar plasma (e.g., flare, quiet sun, etc.) from the observed elemental abundances and ionisation states. These synthetic spectra are essential for calculating response functions of various instruments. An instrument’s wavelength response function describes how much light emitted at a given wavelength is measured by the instrument. Similarly, the temperature response function describes the instrument’s sensitivity to light emitted by plasma at a particular temperature. These response functions play a vital role in correctly interpreting observations, as does proper calculation of these functions.

Currently, SunPy has no implementation of instrument response functions. This project would develop the routines necessary to calculate response functions using the Python interface to the CHIANTI database, ChiantiPy. The primary implementation of this would be to produce default wavelength and temperature response functions for the Atmospheric Imaging Assembly (AIA) instrument. A detailed discussion of the AIA response functions can be found in Boerner et al 2012.
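Numerically, the temperature response reduces to a wavelength integral of the instrument's wavelength response times the CHIANTI contribution function. The arrays below are placeholders for what ChiantiPy and the AIA calibration would actually supply; only the final line is the calculation itself:

```python
import numpy as np

# Placeholders for what ChiantiPy / the AIA calibration would provide:
wavelength = np.linspace(90, 100, 500)                             # angstrom
wave_response = np.exp(-0.5 * ((wavelength - 94) / 1.0) ** 2)      # A_eff(lambda), e.g. a 94 A channel
temperature = np.logspace(5.5, 7.5, 41)                            # K
contribution = np.random.rand(temperature.size, wavelength.size)   # G(lambda, T) from CHIANTI

# Temperature response: R(T) = integral of A_eff(lambda) * G(lambda, T) dlambda
temp_response = np.trapz(wave_response * contribution, wavelength, axis=1)
```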

Other potential applications of ChiantiPy in SunPy include:

  1. Generalisation of the code to produce response functions using arbitrary values of physical parameters (elemental abundances, etc.).
  2. Calculation of response functions for other instruments.
  3. Conversion of ChiantiPy spectra objects to SunPy Spectra objects.

Expected Outcomes: This project would facilitate SunPy becoming independent from Solar SoftWare (SSW) for analysing AIA data, particularly with respect to inferring plasma properties such as temperature and density.

A successful proposal will outline a schedule for implementing at least a single set of temperature and wavelength response functions for AIA; response functions for arbitrary plasma conditions would be a bonus. Familiarity with CHIANTI, ChiantiPy and SSW’s implementation of the response functions will help to properly assess how long it will take to recreate them in SunPy.

Real time data access and visualisation tools

Suggested Mentor(s): David Perez-Suarez, Jack Ireland

Difficulty: Beginner-Intermediate

Astronomy knowledge needed: none

Programming skills: Python

Description

Real-time data is very useful for space weather operations. SunPy provides access to data from different virtual observatories or services (like sunpy.net.vso or sunpy.net.hek) or by directly accessing data archives. Fido (formerly called UnifiedDownloader) provides a single point of access to them all. However, this needs to be extended to other data archives, and logic needs to be implemented so that, depending on the time range requested, the data is downloaded from the real-time archives or from the full archive.

Additionally, this project should produce some visualisation tools to combine data from different sources. Some examples are overlays of active regions on top of solar images (like in SolarMonitor), GOES X-ray flux with active region numbers on the flares detected (like in Latest Events), and the latest features observed from HEK on top of a map (e.g. iSolSearch).

In summary, this project has two objectives:

  1. Implementation of real time archives and logic on Fido.
  2. Creation of visualisation tools to represent real-time data.

Familiarisation with the unidown branch and the matplotlib library will help you to create a proper timeline of how much time it will take to implement, test and document each part of the project.
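For orientation, a query through the Fido interface that this project would extend looks roughly like this (time range and instrument are arbitrary, and the method names follow the current development versions of SunPy):

```python
from sunpy.net import Fido, attrs as a

# One query through the single Fido interface; the registered clients
# (VSO, etc.) decide which of them can serve it.
result = Fido.search(a.Time("2016-02-03 00:00", "2016-02-03 00:10"),
                     a.Instrument("AIA"))
print(result)
files = Fido.fetch(result)   # download the matched files
```

The real-time work would add clients for real-time archives behind this same interface, plus the time-range logic that chooses between them and the full archive.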

Improvements to the SunPy Database

Suggested Mentor(s): Stuart Mumford, Simon Liedtke, Steven Christe

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python, some database design knowledge would be helpful.

Description

The database module provides functionality to users to manage collections of files on disk in a way not reliant upon folder structure and file name. The database allows users to find files on disk by either physical parameters, such as wavelength and time or properties of the instrument such as name and spacecraft. It also allows more complex queries by enabling searches of the raw meta data associated with the files.

The SunPy database will also act as a proxy for some web services supported by SunPy. When used like this, the database module takes a user query, downloads the data from the web service, stores it in the database, and then returns the query results to the user. SunPy contains clients for various web services; the first and primary web service SunPy supported was the Virtual Solar Observatory (VSO), and this is the web service the database was originally designed to support. Since the original development of the database module, the database has also been extended to support the HEK client.

The SunPy web clients use a system named attrs (an abbreviation for attributes) to compose queries; this attrs system is also used by the database to perform queries on the database, with some of the attrs shared between the VSO client and the database. Recently, a new downloader front end (originally named UnifiedDownloader, now affectionately known as Fido) has been developed; it provides a factory class with which various download clients (such as the VSO) can register, providing information about which attrs and attr values each client supports. Using this approach, the Fido downloader provides a single interface to the many different services SunPy supports. The first part of this project will be to update the database module to support the new Fido interface, specifically by using Fido inside the database to retrieve data.

The second part of the project will be to update the caching mechanism implemented in the database module. The current caching system serialises the user’s VSO query and stores it as JSON; upon the user requesting another query, the query will be compared to the cache of serialised queries and, if a match is found, the results from the cached query returned. This mechanism is limiting in that if the user requests 100 records in query A and 100 records in query B, but 50 of the records requested in both queries are the same (i.e. two overlapping time windows), then the 50 records will be re-downloaded, as the cache of query A will not match query B. The updated caching system will store the records a query returns (before the data is downloaded) and then link the results of a query to the records in the database (once the data has been downloaded). Then, when records are retrieved from a web service, any records that are stored in the cache table can be skipped for retrieval from the web service and returned from the records in the database. This will allow the caching of partial queries rather than whole queries as is currently implemented.

This project aims to achieve the following things:

  1. Update the current implementation of the database using the VSO attributes to use the slightly refactored Fido attributes and use Fido inside the database to download data from the VSO.
  2. Implement a new caching mechanism based on the results of queries with Fido, rather than the current caching which is based upon the VSO query.

A successful proposal will schedule updates to the database package in small sections, rather than in one large pull request. The work should be understood and broken down into individual sections.

There are various other maintenance tasks which need undertaking (https://github.com/sunpy/sunpy/labels/Database) which would be a good way for someone interested in this project to familiarise themselves with the codebase.

GUI to use LCT tools

Suggested Mentor(s): Jose Iván Campos Rozo (National Astronomical Observatory, National University of Colombia), Santiago Vargas Domínguez (National Astronomical Observatory, National University of Colombia), David Pérez Suárez.

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python, basic knowledge of qt4, pyqt4, qt designer

Description:

The Local Correlation Tracking (LCT, November & Simon, 1988) technique is a robust method used to study the dynamics of structures in a time series of images. By tracking pixel displacements, using a correlation window, LCT can determine proper motions and generate flow maps of horizontal velocities. This procedure is used to study the dynamics of plasma in the solar photosphere at different spatial scales, e.g. the analysis of granular and supergranular convective cells, meridional flows, etc. A widget implemented in Python was developed; it provides a user-friendly graphical user interface (GUI) to control various parameters for the process of calculating flow maps of proper motions for a series of filtergrams (a data cube). Our purpose is to implement this tool in SunPy using its structure and to improve it with some more options, i.e. masks, statistics, histograms, contours and multi-plots. Although an initial version is already developed, our proposal is to focus on the efficient integration of the code in the SunPy libraries. The code (without widget files yet) is at https://github.com/Hypnus1803/flow_maps
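The core of LCT can be sketched in a few lines: cross-correlate a small window taken from two consecutive frames and read the local displacement off the correlation peak. The frames below are synthetic, and a production code would add apodization, sub-pixel peak fitting and a sliding window over the full image:

```python
import numpy as np
from scipy.signal import correlate2d

def local_shift(win_a, win_b):
    """Displacement of win_b relative to win_a from the cross-correlation peak."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    corr = correlate2d(b, a, mode="same")
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    centre = np.array(corr.shape) // 2
    return dy - centre[0], dx - centre[1]

# Synthetic test: a blob shifted by (1, 2) pixels between two frames.
y, x = np.mgrid[0:32, 0:32]
frame1 = np.exp(-((x - 15) ** 2 + (y - 15) ** 2) / 20.0)
frame2 = np.exp(-((x - 17) ** 2 + (y - 16) ** 2) / 20.0)
print(local_shift(frame1, frame2))   # approximately (1, 2)
```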

Expected Outcomes: To integrate the code efficiently into the SunPy libraries.

SunPy

Improvements to the SunPy Database

Suggested Mentor(s): Stuart Mumford, Steven Christe

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python, some database knowledge would be helpful, but not required.

Description

The database module provides functionality to users to manage collections of files on disk in a way not reliant upon folder structure and file name. The database allows users to find files on disk by either physical parameters, such as wavelength and time or properties of the instrument such as name and spacecraft. It also allows more complex queries by enabling searches of the raw meta data associated with the files.

The improvements to the database functionality that would be implemented by this project include:

  1. Integration of the new UnifiedDownloader code into the database search, to replace the direct VSO integration currently present. (The VSO is a repository of solar physics data; SunPy’s VSO API has been wrapped by UnifiedDownloader.)
  2. Support for relative paths in the database module #783 to allow a centralised database with multiple users, all referencing a central file store mounted with different absolute paths on each client.
  3. Supporting all data supported by the sunpy.lightcurve module in the database. The major hurdle here is the lack of standardisation in the files used for this data.

There are various other maintenance tasks which need undertaking (https://github.com/sunpy/sunpy/labels/Database) which would be a good way for someone interested in this project to familiarise themselves with the codebase.

Integrating ChiantiPy and SunPy

Suggested Mentor(s): Dan Ryan, Ken Dere

Difficulty: Beginner

Astronomy knowledge needed: Some knowledge of spectra.

Programming skills: Python.

Description

The CHIANTI atomic physics database is a valuable resource for solar physics. The CHIANTI database holds a large amount of information on the physical properties of different elements in different ionisation states and enables the calculation of various parameters from this information. Using CHIANTI it is possible to calculate the spectra of various types of solar plasma (e.g., flare, quiet sun, etc.) from the observed elemental abundances and ionisation states. These synthetic spectra are essential for comparison with the data observed by various instruments, both to calculate the instruments’ response functions and to derive physical parameters of the observed plasma, such as temperature.

Currently, no SunPy code uses ChiantiPy, the Python interface to the CHIANTI database. This project would develop the routines to be included in SunPy to use ChiantiPy for the various physical calculations desired. The first potential use of ChiantiPy in SunPy is in the sunpy.instr.goes module, where data tables calculated using CHIANTI are currently downloaded from the Solar SoftWare (SSW) distribution; these data tables should instead be created using SunPy.

Other potential applications of ChiantiPy in SunPy include:

  1. Conversion of ChiantiPy spectra objects to SunPy Spectra objects.
  2. Calculation of AIA temperature response functions from ChiantiPy contribution functions.

Expected Outcomes: This project would facilitate SunPy becoming independent from Solar SoftWare (SSW) in producing and maintaining the files required by the sunpy.instr.goes module for determining the thermodynamic properties of the emitting plasma observed by GOES. It would also allow SunPy users to calculate spectra exclusively through Python without relying on SSW.

Support for analysis of Solar Energetic Particles

Suggested Mentor(s): David Pérez-Suárez

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python.

Description

SunPy is able to read a lightcurve from different sources (GOES X-ray, LYRA, NoRH, …), but these are not all of them. SoHO/ERNE (the Energetic and Relativistic Nuclei and Electron experiment on board SoHO) measures one of the important effects in space weather, Solar Energetic Particles (SEP). The data from this instrument (as for the GOES particle measurements) come as plain-text csv files with header information. This project should add ERNE to the SunPy-supported instruments by reading these files in as a lightcurve object and allowing the user to perform the basic operations used when such data are analysed, e.g. energy-range binning, visualisation, etc.
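Purely to illustrate the read-then-wrap step (the actual ERNE export layout is not specified here, so the file name, columns and separator below are made up), the ingestion could start from something like:

```python
import pandas as pd

# Made-up whitespace-separated layout with '#' header lines; the real ERNE
# export format would drive the actual parser.
data = pd.read_csv("erne_export.txt", comment="#", delim_whitespace=True,
                   names=["time", "flux_1_3MeV", "flux_3_10MeV"],
                   parse_dates=["time"], index_col="time")

# One basic operation an SEP analysis needs, e.g. coarser time binning:
hourly = data.resample("1H").mean()
```

The project would wrap the parsed data and its header metadata in a proper lightcurve source class rather than leaving it as a bare DataFrame.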

Lightcurve Refactor

Suggested Mentor(s): Stuart Mumford, Dan Ryan, Andrew Inglis

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python

Description

The Lightcurve class is one of the three core datatypes in SunPy, along with Map and Spectra. Lightcurve is designed to read in, process and store meta data related to solar physics time series data. Currently, Lightcurve uses the pandas library as its underlying data structure; however, this is subject to change in the future.

Much like the map submodule, lightcurve needs to be able to read in various supported data formats (such as FITS, ascii and others in the future), store their meta data and give users unified access to this metadata independently of the original source of the data.

As currently implemented (as of 0.5) the lightcurve module performs three core tasks:

  1. Download the raw data
  2. Read this data into a pandas dataframe
  3. Store the meta data obtained with the data.

As of the SunPy 0.6 release the first stage will be moved out of lightcurve and into the net subpackage as part of the UnifiedDownloader (name subject to change) Pull Request. This leaves lightcurve in a similar position to map, where the data acquisition is not part of the core data type and is managed separately, and therefore enables the implementation of a factory class like Map for the lightcurve module.

Expected Outcomes

Someone undertaking this project will complete the following tasks:

  1. Become familiar with the UnifiedDownloader code; if it has not been accepted into the SunPy codebase, complete the remaining tasks for this to be achieved.
  2. Re-write any new lightcurve sources that were not included in the UnifiedDownloader code as sources for UnifiedDownloader.
  3. Write a factory class for lightcurve similar to the sunpy.map.Map class. This class will be a generic constructor for lightcurve allowing the user to instantiate any one of the many subclasses of GenericLightcurve present in sunpy.lightcurve.sources. The API design for the factory class is here: https://github.com/sunpy/sunpy-SEP/pull/6
  4. Design and develop a robust method of dealing with lightcurve meta data, which can handle joining different parts of timeseries from different files, each with their own meta data. (See #1122)

IRIS, 4D Cubes and GUI

Suggested Mentors: Steven Christe (NASA GSFC, SunPy), Nabil Freij (Sheffield University)

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None

Programming skills: Python and basic knowledge of GUI design.

Description:

Recently, a new Sun-observing satellite called IRIS was launched. It performs high-resolution, multi-wavelength observations of the solar atmosphere. As a result, the data are saved out as 4D cubes. These cubes have the structure [Time, Wavelength, Spatial]. This format is also used by other ground- and space-based telescopes.
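In plain numpy terms, the kinds of extractions the plugin would expose look like simple slices of such a cube; the array below is just a placeholder with the layout described above:

```python
import numpy as np

# Placeholder cube with the [time, wavelength, y, x] layout described above.
cube = np.random.rand(40, 8, 256, 256)

spectrum = cube[10, :, 128, 64]     # spectrum at one time and one pixel
image = cube[10, 3, :, :]           # image at one time and one wavelength
time_slit = cube[:, 3, :, 100]      # a fixed slit (column) followed in time
```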

Traditionally (and tradition is a powerful thing in astronomy), data analysis is done using a programming language called IDL. Using this language, a GUI called CRISPEX was created and is used to do simple but effective analysis.

This project aims to create a smaller scale version that uses Ginga as a backend. Ginga is a file viewer that was created with astrophysics in mind. It allows basic manipulation of FITS files, which are the standard data container in astrophysics. A Python plugin will be created and integrated into Ginga, allowing the user to open 3D/4D datasets and perform basic analysis, such as slit extraction.

To achieve this, a previous ESA summer project created a cube class. While it was finished, it was never integrated into SunPy. The code was created to hold and manipulate complex datatypes. It is similar in style to the SunPy Map class and follows that convention. However, it has extra features enabling specific data products that the user requires, for example a spectrum, to be extracted. The student will need to become familiar with this code, as small tweaks need to occur before it is added to SunPy.

Finally, the plugin will be written in Python. A background in Qt would be helpful, but it is not required. Ginga supports multiple GUI backends, but we plan to use Qt (a rough plugin skeleton follows the feature list below).

Plugin Features:

  1. Open FITS file and call the correct SunPy Map or Cube class.
  2. Solar coordinate integration.
  3. Perform slit extraction with the ability to choose a time and/or wavelength range.
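As a very rough starting point, a Ginga local plugin is a class with `build_gui`/`start`/`stop` hooks. The skeleton below follows our reading of Ginga's local-plugin conventions and should be checked against the Ginga documentation; everything cube-specific in it (the class name, the loading logic) is hypothetical.

```python
from ginga import GingaPlugin

class IRISCubeViewer(GingaPlugin.LocalPlugin):
    """Hypothetical plugin skeleton for browsing 3D/4D solar cubes."""

    def __init__(self, fv, fitsimage):
        # fv is the main Ginga shell, fitsimage the channel viewer
        super(IRISCubeViewer, self).__init__(fv, fitsimage)
        self.cube = None

    def build_gui(self, container):
        # Qt widgets for time/wavelength sliders and slit selection would be
        # assembled here (omitted in this sketch).
        pass

    def start(self):
        # Called when the plugin is opened: load the current FITS file into
        # the appropriate SunPy Map or Cube class (plugin feature 1).
        pass

    def stop(self):
        # Clean up widgets and references when the plugin is closed.
        pass
```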

SunPy Feature:

  1. Full IRIS support.

X core package

Implement x support for y

Suggested Mentor(s): Name Surname

Difficulty: Beginner, Intermediate or Expert

Astronomy knowledge needed: none (or some background)

Programming skills: Python, C, Julia, Java, super powers

Description

The y class is powerful but …

Implement m support for n

Suggested Mentor(s): Nombre Apellido

Difficulty: Beginner, Intermediate or Expert

Astronomy knowledge needed: none (or some background)

Programming skills: Python, C, Julia, Java, super powers

Description

The n class is super powerful but …

yt

If you are interested in one of the yt ideas, please see the GSoC 2016 Guidelines on the yt bitbucket wiki.

All projects in this section are for yt, an analysis and visualization environment for particle and mesh-based volumetric data. It has readers for most astrophysical simulation codes, as well as a few nuclear engineering simulation codes. It can handle data produced by particle-based codes, as well as data produced by codes that use various types of mesh structures, including uniform and adaptively refined meshes as well as unstructured and semi-structured meshes. yt is able to analyze and visualize these datasets, despite their substantially different on-disk and in-memory formats, using a common language.

To learn more about how to use yt to interact with simulation data, take a look at the quickstart guide, as well as the rest of the yt documentation. We also provide a listing of sample test datasets that can be loaded by yt. We use a variety of public communication channels, including mailing lists, IRC, and a Slack channel that can be joined by anyone interested in yt development.

For more information about contributing to yt, take a look at our developer guide. To see discussions about past yt projects, take a look at the yt enhancement proposal (YTEP) listing.

Integrate yt plots with interactive matplotlib backends

Suggested Mentor(s): Nathan Goldbaum, Matthew Turk

Difficulty: Intermediate

Knowledge needed: Familiarity with matplotlib. Knowledge of matplotlib’s object oriented API a plus.

Programming skills: Python. GUI programming.

Description

Currently, all yt plotting objects have a show() method that displays a version of the plot in Jupyter notebooks. This works for the most part and is relatively simple due to Jupyter’s data model. However, this reliance on the notebook fails for users who work primarily from the command line, either in the vanilla Python interpreter or the IPython command-line application. We receive many requests from confused users who do not understand why show() errors out in the regular Python interpreter or appears to do nothing in IPython, when they expect a GUI window to pop up when they run show().

This project would have a student modify yt’s plotting objects to hook into matplotlib’s interactive backends so that plots can be optionally displayed using a GUI plotting window. Optimally, we would also enable callbacks so that zooming and selecting does the “right thing”, generating high resolution data when it is available.

This is constrained by maintaining backward compatibility: by default yt should not fail when generating plots on headless devices (e.g. when connecting over SSH to a supercomputer).
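One possible shape for this, offered only as a proof-of-concept sketch: inspect the active matplotlib backend and either hand an already-built figure to a pyplot figure manager or fall back to saving it to disk on headless backends. The `display` helper and its fallback behaviour are assumptions, not existing yt API.

```python
import matplotlib
import matplotlib.pyplot as plt

_NON_INTERACTIVE = ("agg", "pdf", "ps", "svg", "cairo", "template")

def display(figure, fallback_path="plot.png"):
    """Show an existing Figure in a GUI window when an interactive backend is
    active; otherwise save it to disk so headless sessions do not fail."""
    backend = matplotlib.get_backend().lower()
    if backend in _NON_INTERACTIVE:
        # e.g. an SSH session on a supercomputer with no display
        figure.savefig(fallback_path)
        return fallback_path
    # Attach the already-built figure to a pyplot-managed canvas so that
    # plt.show() can pop up a window for it (the pyplot fallback mentioned
    # in the deliverables below).
    manager = plt.figure().canvas.manager
    manager.canvas.figure = figure
    figure.set_canvas(manager.canvas)
    plt.show()
```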

Deliverables:

  • A proof of concept demonstrating how to hook into matplotlib’s interactive backends using the matplotlib object-oriented API, or, failing that, how to gracefully fall back to using pyplot instead of the object-oriented API.

  • A YTEP describing the proposed approach for modifying yt’s plotting infrastructure to support matplotlib’s interactive plotting backends.

  • The implementation for the YTEP submitted as a bitbucket pull request to the main yt repository.

Improve test coverage and test performance

Suggested Mentor(s): Kacper Kowalik, Nathan Goldbaum

Difficulty: Beginner to Advanced, depending on where the student takes the project

Knowledge needed: Familiarity with the nose testing package.

Programming skills: Python, Cython

Description

Currently yt’s test suite is split between unit tests (which take about 45 minutes to run) and answer tests, which are normally only run on a continuous integration server. Altogether the tests cover only about a third of the yt codebase, so much of the code in yt still needs test coverage. The tests also take a long time to run, and we would like to reduce the test runtime while simultaneously increasing code coverage.

This project could go in a number of directions:

  • Implement a way to retrofit the current tests for different geometries (e.g. cartesian, cylindrical, and spherical coordinates) and data styles (e.g. particle data, as well as various kinds of mesh data, including uniform resolution, octree, patch AMR, and unstructured meshes). Ideally this would allow us to test all functionality for all possible data styles. This will require learning and improving the “Stream” frontend, which allows the ingestion of in-memory data into yt (see the sketch after this list).

  • Identify areas of the code that are not well tested and devise tests for them. This will require measuring the test coverage of yt’s Python and Cython components. The student working on this will need to gain familiarity with untested or undertested parts of the codebase and add new tests. Optimally the new tests will make use of new reusable infrastructure that will be helpful for tests across the yt codebase.

  • Improve volume rendering and visualization unit tests. Right now visualization tests rely heavily on answer testing and image comparison. It would be more flexible and easier to understand when things go wrong if the tests instead compared with a predicted answer using some sort of simplified geometry or via introspection.
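For the first direction, a hedged sketch of what retrofitting might look like using the Stream frontend: build small in-memory datasets in several geometries and reuse one nose-style generator test across all of them. The helper and assertion names are made up for illustration; `yt.load_uniform_grid` and its `geometry`/`bbox` keywords exist in yt, but the details should be checked against the current documentation.

```python
import numpy as np
import yt

# Geometry name -> domain bounding box (radius/angle ranges chosen to be valid)
GEOMETRIES = {
    "cartesian":   np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]),
    "cylindrical": np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 2 * np.pi]]),
    "spherical":   np.array([[0.0, 1.0], [0.0, np.pi], [0.0, 2 * np.pi]]),
}

def _stream_datasets(shape=(16, 16, 16)):
    """Yield small in-memory datasets built through the Stream frontend."""
    data = {"density": np.random.random(shape)}
    for geometry, bbox in GEOMETRIES.items():
        yield yt.load_uniform_grid(data, shape, geometry=geometry, bbox=bbox)

def test_density_positive_all_geometries():
    # nose-style generator test: one check per geometry
    for ds in _stream_datasets():
        yield _check_density_positive, ds

def _check_density_positive(ds):
    ad = ds.all_data()
    assert float(ad["density"].sum()) > 0.0
```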

Deliverables:

  • Develop a framework for measuring test coverage in yt’s Python and Cython components. Triage the reports to look for areas that are user-facing and have poor test coverage.

  • Make a number of pull requests adding tests across the yt codebase.

  • Modify existing testing infrastructure or develop new test infrastructure to improve testing of yt functionality on different data types.

Domain contexts and domain-specific fields

Suggested Mentor(s): Britton Smith, Matthew Turk

Difficulty: Beginner to Intermediate

Knowledge needed: Undergraduate-level physics. More specific knowledge of astronomy, hydrodynamics, finite-element methods, GIS, meteorology, geophysics, or oceanography a plus

Programming skills: Python

Description

The original focus of yt was to analyze datasets from astrophysical simulations. However, use of yt has been expanding to other scientific domains, such as nuclear physics, meteorology, and geophysics. Still, much of the infrastructure within yt is built upon the assumption that the datasets being loaded are astrophysical and hydrodynamic in nature. This assumption informs the choice of derived fields made available to the user as well as the default unit system. For example, fields such as “Jeans mass” and “X-ray emissivity” in CGS units are of little use to an earthquake simulation.

The goal of this project is to develop a system for domain contexts, sets of fields and unit systems associated with specific scientific domains. Rather than having all fields be made available to all datasets, each dataset is given a domain context, which specifies the relevant fields and most meaningful unit system. Domain contexts could also be subclassed to provide further specificity, for example, cosmology as a subclass of astrophysics.

Deliverables:

  • For each of the existing frontends, identify the relevant field plugins. Create a data structure to associate with each frontend that lists only the relevant plugins. Take the field plugin loading machinery, which currently just loops over all plugins, and have it only load plugins relevant to the loaded frontend.

  • With the above as an example, identify and document all of the places in the code where the domain is assumed to be astronomy. Use this to come up with a set of attributes that minimally describe a scientific domain, i.e., list of field plugins, unit system, etc.

  • Write up a YTEP describing the proposed design and ideas for implementation. Should identify an initial set of domain contexts, sort fields into domain contexts, and sketch how frontends should declare needed domain contexts.

  • Create a domain context class with the identified attributes. Implement a base, an astronomy, and possibly a nuclear engineering domain context, and associate them with the existing frontends (a purely illustrative sketch follows this list).
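The domain context class itself could be quite small. A purely illustrative sketch follows; the class names, plugin names, and attributes are placeholders, and the real set would be fixed in the YTEP.

```python
class DomainContext:
    """Base domain context: which field plugins apply to a dataset and which
    unit system is most meaningful for it."""
    field_plugins = ()      # names of relevant field plugin modules
    unit_system = "mks"     # default unit system for derived fields

class AstrophysicsContext(DomainContext):
    # Plugin names here are placeholders, not yt's actual plugin registry keys
    field_plugins = ("fluid", "species", "cosmology", "x_ray")
    unit_system = "cgs"

class NuclearEngineeringContext(DomainContext):
    field_plugins = ("fluid",)
    unit_system = "mks"

# A frontend might then declare something like ``_domain_context =
# AstrophysicsContext``, and the field plugin loader would consult
# ``ds._domain_context.field_plugins`` instead of looping over every plugin.
```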


Enable volume rendering of octree datasets

Suggested Mentor(s): Matthew Turk, Sam Skillman

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with data structures such as octrees and B-trees.

Description

At present, volume rendering in yt works best with patch-based AMR datasets. Extending this to support octree datasets will enable a much greater diversity of data types and formats to be visualized in this way.

This would include several specific, concrete actions:

  1. Development of viewpoint-dependent traversal ordering for octree datasets
  2. Refactoring grid traversal methods to travel along the octree data structure without explicit parentage links (i.e., using built-in neighbor-finding functions)
  3. Optimizing the parallel decomposition of octrees traversed in this way

Implementation of deep image format

Suggested Mentor(s): Matthew Turk, Kacper Kowalik

Difficulty: Advanced

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with z-buffering.

Description

Deep image compositing can be used to create a notion of depth. This could be utilized for multi-level rendering, for example rendering semi-transparent streamlines inside volumes.

This would require:

  1. Developing a sparse image format data container
  2. Utilizing the aforementioned container for multi-level rendering

Volume Traversal

Suggested Mentor(s): Matthew Turk, Sam Skillman

Difficulty: Advanced

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with data structures such as octrees and B-trees.

Description

Currently yt uses several objects that utilize brick decomposition, i.e. a process by which overlapping grids are broken apart until a full tessellation of the domain (or data source) is created with no overlaps; this is done via a kD-tree decomposition. This project aims to enhance current capabilities by providing easy mechanisms for volume traversal. There are two components to this: handling tiles of data, and creating fast methods for passing through the data and moving between tiles.

This would require (a sketch of these interfaces follows the list):

  1. Creating a flexible (in terms of ordering) iterator over the “tiles” that compose a given data object
  2. Designing and implementing an object for storing values returned by the aforementioned iterator, which would:
    • Cache a slice of the grid or data object that it operates on
    • Filter particles from the data object it operates on
    • Provide a mechanism for identifying neighbor objects from a given face index
    • Provide mechanisms for quickly generating vertex-centered or cell-centered data
  3. Implementing a mechanism for integrating paths through tiles, which would:
    • Define a method for determining when a ray has left an object
    • Define a method for selecting the next brick to traverse or connect to
    • Update the value of a ray’s direction
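A purely illustrative Python sketch of the interfaces described above (in practice this would likely live in Cython for speed); every name here is hypothetical and does not correspond to existing yt API.

```python
class Tile:
    """One non-overlapping brick produced by the kD-tree decomposition."""

    def __init__(self, data_source, left_edge, right_edge):
        self.left_edge, self.right_edge = left_edge, right_edge
        # Cache the slice of the grid or data object this tile covers (item 2)
        self._data = data_source

    def neighbor(self, face_index):
        """Return the tile sharing the given face, or None at the domain edge
        (item 2); the lookup would walk the kD-tree, omitted here."""
        raise NotImplementedError

    def vertex_centered(self, field):
        """Generate (and cache) vertex-centered data for ``field`` (item 2)."""
        raise NotImplementedError

def traverse_tiles(data_object, ordering="kdtree"):
    """Flexible-ordering iterator over the tiles that compose ``data_object``
    (item 1); ``ordering`` might, for example, be viewpoint-dependent."""
    raise NotImplementedError

def integrate_ray(ray, start_tile):
    """Walk a ray tile-to-tile (item 3): step until the ray leaves the current
    tile, then use ``neighbor`` on the exit face to pick the next brick and
    update the ray's position and direction."""
    raise NotImplementedError
```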