# Ideas page for GSoC 2015

Browse ideas for the participating projects listed below.

For each participating project, the ideas are organized from easiest to hardest.

Astropy

If you are interested in one of the following Astropy Project ideas, please see the Astropy GSoC 2016 Guidelines for additional information that is specific to Astropy.

Implement Scheduling capabilities for Astroplan

Suggested Mentor(s): Erik Tollerud, Eric Jeschke, Josh Walawender

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: Basic understanding of how astronomy observations work, practical experience a plus

Programming skills: Python

Description

The astroplan package is an Astropy affiliated package that provides tools for planning observations. One valuable feature that astroplan could provide is basic scheduling capabilities for an observing run. Many large observatories have their own schedulers, but this package would be targeted at the needs of the typical individual or small-collaboration observing run. While some initial efforts have occurred, this project would involve expanding those efforts into a full-fledged API and implementing both the interface and the actual scheduler(s).
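To give a flavour of what a first, very naive scheduler could do, here is a minimal sketch of a greedy altitude-based scheduler written against astropy.coordinates only (it deliberately avoids guessing astroplan's own, still-evolving API); the site, targets and time slots are made up:

```python
import numpy as np
import astropy.units as u
from astropy.time import Time
from astropy.coordinates import SkyCoord, EarthLocation, AltAz

# Made-up observatory and target list; a real scheduler would take these as input.
site = EarthLocation(lat=19.82 * u.deg, lon=-155.47 * u.deg, height=4200 * u.m)
targets = {"Vega": SkyCoord(ra=279.23 * u.deg, dec=38.78 * u.deg),
           "Altair": SkyCoord(ra=297.70 * u.deg, dec=8.87 * u.deg)}

# One-hour observing slots over a single night (UTC).
slots = Time("2016-06-01 06:00:00") + np.arange(8) * u.hour

schedule = []
for t in slots:
    frame = AltAz(obstime=t, location=site)
    # Greedy rule: observe whichever target is currently highest in the sky.
    alts = {name: coord.transform_to(frame).alt for name, coord in targets.items()}
    best = max(alts, key=lambda name: alts[name])
    schedule.append((t.iso, best, alts[best].deg))

for row in schedule:
    print("%s  %-8s  alt=%.1f deg" % row)
```

A real implementation would add constraints (airmass limits, moon separation, exposure and slew times) and a pluggable scheduler interface, which is exactly the API-design work this project is about.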

Ephemerides for Solar System objects in Astropy

Suggested Mentor(s): Marten van Kerkwijk, Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: Some understanding of astronomical coordinate systems, basic knowledge of solar system dynamics (or ability to learn as-needed to implement the specific algorithms required)

Programming skills: Python, some knowledge of C might be helpful

Description

An often-requested missing feature in Astropy is the ability to compute ephemerides: the on-sky location of Solar System objects like the planets, asteroids, or artificial satellites. This project would involve implementing just this feature. This will likely start with implementing a get_moon function similar to the existing get_sun to familiarize the student with the important concepts in the astropy.coordinates subpackage. The larger part of the project will likely involve using the orbital elements that the JPL Solar System dynamics group has already compiled (there is already a package to read these files: JPLEphem), and translating those into the Astropy coordinates framework. The student will implement these algorithms and also collaborate with the mentors and Astropy community to develop an API to access this machinery.
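For orientation, this is the existing get_sun interface that a get_moon function would likely mirror, plus the low-level jplephem call for reading a JPL kernel; the get_moon call and the local kernel file name are assumptions about how the eventual pieces would fit together:

```python
from astropy.time import Time
from astropy.coordinates import get_sun

t = Time("2016-04-21 00:00:00")
sun = get_sun(t)        # existing: apparent GCRS position of the Sun
# moon = get_moon(t)    # proposed: would mirror get_sun's signature (not yet implemented)

# Low-level access to JPL ephemerides via the jplephem package:
from jplephem.spk import SPK
kernel = SPK.open("de430.bsp")           # assumes a local copy of the DE430 kernel
x, y, z = kernel[3, 301].compute(t.jd)   # Moon relative to the Earth-Moon barycenter, in km
```

The project's work would be turning positions like the last line into proper astropy.coordinates frame objects with a clean public API.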

Implement Public API for ERFA

Suggested Mentor(s): Erik Tollerud, Tom Aldcroft

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None required, but may be helpful for understanding ERFA functionality

Programming skills: Python, Cython, C

Description

Some of the major functionality for Astropy uses the ERFA C library (adapted from the IAU SOFA library) as the back-end for computational “heavy-lifting”. Members of the community have expressed a desire to use this lower-level python wrapper around ERFA for other purposes that may not be directly relevant for Astropy. So this project would involve making the necessary changes to make the ERFA python API public. This includes:

  • Getting the documentation up to the astropy standard (currently it is mostly auto-generated verbatim from the C comments).
  • Implementing a more complete test suite for the python side of the code.
  • Possibly moving it to a separate package as part of the liberfa GitHub organization. This would also include making the necessary changes to ensure everything continues to work in Astropy.
  • Any other steps necessary to ensure the resulting package (or sub-package of Astropy) is stable and relatively easy to use.
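For context, a quick illustration of the kind of auto-generated binding this project would promote to a public API; it assumes the wrapper is importable as the private module astropy._erfa and that, as in the auto-generated wrappers, the ERFA function eraJd2cal is exposed as jd2cal:

```python
from astropy import _erfa as erfa   # private today; the project would make this public

# ERFA's eraJd2cal: convert a two-part Julian Date to a Gregorian calendar date.
year, month, day, fraction = erfa.jd2cal(2457388.5, 0.0)
print(year, month, day, fraction)   # 2016 1 1 0.0
```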

Web development for Gammapy

Suggested Mentor(s): Christoph Deil, Johannes King

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None.

Programming skills: Scientific python (Numpy, Scipy, Astropy), Web development (Python backend, Javascript frontend)

Description

Gammapy is a Python package for professional gamma-ray astronomers. We are looking for a web developer with good Python, HTML and Javascript skills who is interested in building web pages and apps to display and browse gamma-ray data and maybe even launch Gammapy analyses. There are a few different projects we’d like to see realised, depending on your interests and skills. One option is to build a much-improved version of TeVCat (a TeV catalog browser web page) that includes more image and catalog data and interactivity (maps that pan and zoom, a search field for source names), targeting the general public as well as professional gamma-ray astronomers. This project would mostly be front-end development, plus Python scripts to prepare the images and catalogs in suitable formats. Another option is to write several small static-site-generator scripts or Python web apps that let us browse the gamma-ray data and analysis results, basically a web GUI for Gammapy. That project would mostly be Python web app development, and you would have to learn a bit more about Gammapy before GSoC starts.

Data analysis for Gammapy

Suggested Mentor(s): Christoph Deil, Johannes King

Difficulty: Intermediate to Expert

Astronomy knowledge needed: Some, e.g. sky coordinates and projections. Experience with X-ray or gamma-ray data analysis (e.g. Fermi-LAT) is a plus, but not a requirement.

Method knowledge needed: Some experience in data analysis (e.g. images, regions) and statistics (e.g. Poisson noise).

Programming skills: Python (including pytest and Sphinx) and scientific python (Numpy, Scipy, Astropy)

Description

Gammapy is a Python package for professional gamma-ray astronomers. We are looking for someone who is interested in working on a few distinct data analysis tasks, each taking a few weeks of the GSoC total time. Gammapy is a very young project, and there’s a lot to do. Examples of what needs to be done include implementing new algorithms (e.g. image reprojection, source detection, region-based analysis), bringing existing prototype algorithms to production (improving the API and implementation, adding tests and docs), as well as grunt work that’s needed to move towards production quality and a Gammapy 1.0 release this fall (e.g. setting up continuous integration for example IPython notebooks, or adding more tests). To get an idea of what is going on in Gammapy and what still needs to be done, please check out the project on Github (https://github.com/gammapy/gammapy) and browse the documentation a bit (or try out the examples). If this looks interesting to you, send us an email and let us know what your skills and interests are.

Implement PSF photometry for fitting several overlapping objects at once

Suggested Mentor(s): Moritz Guenther, Brigitta Sipocz

Difficulty: Intermediate to Expert

Astronomy knowledge needed: basic understanding of what photometry is

Programming skills: Python

Description

The photutils package is an Astropy affiliated package that provides tools for photometry (measuring how bright a source is).

There are several ways to do photometry, and the package currently implements aperture photometry (just add up all the flux in an image in some area) and single-source point-spread-function (PSF) fitting (fit a function such as a Gaussian to the image). In many situations sources may overlap in the image, e.g. when observing a dense star cluster, so that we need to fit many functions at once. However, the simple brute-force approach of just fitting a model with hundreds of parameters (if there are hundreds of stars) usually fails.

This project includes looking at other astronomy codes to see how they tackle the problem; selecting, modifying and improving an algorithm that fits into the astropy modelling framework; implementing this in Python; and, if it turns out that speed is a problem, moving speed-critical parts to Cython. To verify that the new code works, we will compare it to the solutions of established PSF photometry codes.
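A hedged sketch of the core idea using astropy.modeling compound models: two overlapping Gaussians fitted simultaneously to a small synthetic image. A real implementation would build and manage such compound models for many sources inside photutils; the numbers here are made up:

```python
import numpy as np
from astropy.modeling import models, fitting

# Synthetic image with two overlapping sources plus noise.
y, x = np.mgrid[0:25, 0:25]
truth = (models.Gaussian2D(5, 9, 12, 1.5, 1.5) +
         models.Gaussian2D(3, 13, 12, 1.5, 1.5))
image = truth(x, y) + np.random.normal(0, 0.05, x.shape)

# Simultaneous fit: the compound model exposes all parameters at once.
init = (models.Gaussian2D(4, 8, 12, 2, 2) +
        models.Gaussian2D(4, 14, 12, 2, 2))
fitter = fitting.LevMarLSQFitter()
fit = fitter(init, x, y, image)
print(fit.parameters)
```

The hard part of the project is making this scale to hundreds of sources (grouping, local fitting, good starting values), which the brute-force version above does not do.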

See https://github.com/OpenAstronomy/openastronomy.github.io/pull/27 for a discussion of some problems and possible solutions that will be addressed in this project.

Bridge sherpa and astropy fitting

Suggested Mentor(s): D. Burke, T. Aldcroft, H. M. Guenther

Difficulty: Expert or better

Astronomy knowledge needed: fitting functions and statistics

Programming skills: Python, C, Cython

Description

Both astropy and Sherpa (https://github.com/sherpa/sherpa/) provide modelling and fitting capabilities; however, Sherpa’s features are considerably more advanced. Sherpa provides far more built-in models, a larger choice of optimizers and a real variety of fit statistics. Unfortunately, Sherpa is less well known, and for historical reasons the object-oriented user interface is less polished than the functional state-based interface. The main goal is to bring Sherpa’s optimizers and fit statistic functions to astropy; the stretch goal is to develop a bridge between both packages such that a user can use astropy models completely interchangeably with Sherpa models and fitters. Sherpa models should look like astropy models to astropy, to enable situations where the model can be made out of three components (a user-defined model, an astropy model and a Sherpa model) and this is then fitted to astropy data using the Sherpa fitters.

This project requires the student to become proficient in two major packages (not an easy task!), but with code written in just a few weeks of GSoC it will give astropy users access to fitting capabilities that required many years of developer time and that are unfeasible to redevelop from scratch.
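The shape of the bridge can be sketched without touching Sherpa's own API: an external engine only needs to set an astropy model's flat parameter vector and evaluate a statistic. In this minimal sketch scipy stands in for a Sherpa optimizer/statistic pair; the bridge itself would plug Sherpa in at exactly these two points:

```python
import numpy as np
from scipy.optimize import minimize
from astropy.modeling import models

# Synthetic data from a Gaussian line.
x = np.linspace(-5, 5, 200)
truth = models.Gaussian1D(amplitude=3, mean=0.5, stddev=1.2)
y = truth(x) + np.random.normal(0, 0.1, x.size)

model = models.Gaussian1D(amplitude=1, mean=0, stddev=1)

def chi2(pars):
    # An external fitting engine (scipy here, standing in for a Sherpa
    # optimizer and statistic) only needs the flat parameter vector and
    # a way to evaluate the model.
    model.parameters = pars
    return np.sum((model(x) - y) ** 2 / 0.1 ** 2)

result = minimize(chi2, model.parameters, method="Nelder-Mead")
model.parameters = result.x
print(model)
```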

Enhancements to Ginga, a Toolkit for Building Scientific Image Viewers

Suggested Mentor(s): Eric Jeschke, Pey-Lian Lim, Nabil Freij

Difficulty: Beginning to Advanced, depending on project choices

Astronomy knowledge needed: Some, depending on project choices

Programming skills: Python and scientific python (Numpy, Scipy, Astropy), git version control

Desirable: OpenCL, Javascript/web sockets, C/C++ programming, experience in image or array processing, concurrent programming, experience in using GUI toolkits, github-based workflow

Description

Ginga is a toolkit for constructing scientific image viewers in Python, with an emphasis toward astronomy. Ginga is being used at a number of observatories and institutes for observation and instrument control, quick look, and custom data reduction and analysis tasks. The general aim is to build upon this toolkit, improving its current features, and to expand it so that scientists can easily accomplish preliminary data analysis.

We are looking for an individual to work on a few select project areas, depending on skill level and interest. Each project area itself would form a small part of the overall GSoC project. Essentially it would be a large pick-and-mix, but do not let this put you off: this approach allows a range of different contributions to be made to the Ginga toolkit, of your choosing.

Beginning-level:

  • Improve and expand Ginga’s unit test suite and coverage
  • Improve documentation and tutorials, including via Jupyter notebooks and video voice-overs
  • Improve our “native app” packaging for Mac, Unix and Windows
  • Improving LineProfile and Slit plugins
  • Enhance existing plugins by adding GUIs for some common tasks like configuring catalog sources, which are currently done by editing config files
  • Add support for loading broken FITS files by [“fingerprinting” them](https://github.com/ejeschke/ginga/issues/205)

Intermediate-level:

  • Improve Ginga backends for web browsers (native javascript/web sockets and/or Jupyter notebooks and/or Bokeh server)
  • Enhancements to “traditional” GUI backends (e.g. add support for gtk3, AGG support for python 3, improvements to Qt-based widgets)
  • Graft the astropy-helpers package into Ginga
  • Adding support for calculating approximate line-of-sight velocities
  • Enhance existing plugins for data analysis tasks, usually featuring astropy or affiliated packages

Advanced-level:

  • Implement an OpenCL module that leverages CPU and GPU resources for accelerating some common image processing operations (scaling, transformations, rotations) on numpy image arrays. Benchmark against current CPU based solutions.
  • Improving IO speeds by optimizing use of astropy.fits.io/cfitsio/numpy, lazy reads, file caching hints, optimizing concurrency, etc.
  • Adding support for a binary file format used by a very popular ground-based solar telescope and extending it to support Stokes data products

If you are interested in working on any of these aspects, or want to propose some other work on Ginga, please sign in to Github and comment on Assist the Ginga Project.

Astropy core package

Implement Distribution Support for Quantity

Suggested Mentor(s): Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: none, but statistics knowledge/background useful

Programming skills: Python

Description

The Quantity class is powerful but doesn’t have particularly useful support for uncertainties on quantities or other statistical approaches to thinking about numbers. A very straightforward way to make progress on this would be to create a subclass of Quantity called “Distribution” (or similar) that represents a probability density function of a quantity as Monte-Carlo-sampled arrays. This project would involve implementing this subclass, propagating operations while combining distributions, as well as tools for extracting useful information from such distributions. If there is time, this could also involve expanding this system to support common analytically-representable distributions such as Gaussian and Poisson distributions.
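The proposed Distribution class does not exist yet, but the underlying Monte-Carlo idea can be illustrated with plain Quantity arrays of samples; normal_samples below is a made-up helper standing in for the eventual class:

```python
import numpy as np
import astropy.units as u

def normal_samples(center, std, n=100000):
    """Monte-Carlo samples standing in for the proposed Distribution class."""
    return np.random.normal(center.value,
                            std.to(center.unit).value, n) * center.unit

# Two uncertain quantities: arithmetic on the sample arrays propagates the
# distributions automatically because Quantity already handles the units.
distance = normal_samples(780 * u.kpc, 25 * u.kpc)
velocity = normal_samples(110 * u.km / u.s, 5 * u.km / u.s)
travel_time = (distance / velocity).to(u.Gyr)

print(travel_time.mean(), travel_time.std())
```

The project would wrap this pattern in a proper Quantity subclass, so users get propagation and summary statistics without writing the sampling boilerplate themselves.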

Implement image rasterization methods for models

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate

Astronomy knowledge needed: Basic

Programming skills: Python, Cython

Description

When fitting models to binned data, evaluating the model at the bin centers leads to incorrect results if the model changes significantly within a bin. For example, think of an image where the point spread function (PSF) has a width only slightly above the pixel size and you want to distinguish small galaxies from stars.

Currently Astropy models have an evaluate method that can be used to evaluate them on a grid of pixel centers; there is also an oversampling function to get a better representation of the expected flux in pixels. It would be useful to add methods that allow fast and precise rasterization of models, similar to what graphics libraries do (sparse subsampling or resampling of models evaluated on grids that are appropriate for each model, or anti-aliasing).

There are different options for how to proceed with this project, e.g. possibly add optional extension, sampling grid and bounding box information to the Astropy model classes, or contribute rasterisation code to astropy.modeling or photutils, or expand the existing resampling code in the reproject package. The student should be interested in model fitting and image rasterisation as well as profiling and extensive testing of a given method to make it “just work” for the end user.
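One simple rasterization strategy, shown here only as a made-up baseline the project could benchmark against, is uniform oversampling followed by block averaging; the final design would likely be adaptive rather than this brute-force version:

```python
import numpy as np
from astropy.modeling.models import Gaussian2D

def rasterize(model, shape, oversample=5):
    """Evaluate `model` on a grid `oversample` times finer than the pixel
    grid and average back down, approximating the mean value per pixel."""
    ny, nx = shape
    step = 1.0 / oversample
    # Sub-pixel centres covering each pixel [i - 0.5, i + 0.5].
    y, x = np.mgrid[0:ny:step, 0:nx:step] + (step - 1.0) / 2.0
    fine = model(x, y)
    return fine.reshape(ny, oversample, nx, oversample).mean(axis=(1, 3))

# A PSF-sized Gaussian: centre-point evaluation vs. oversampled rasterization.
psf = Gaussian2D(amplitude=1, x_mean=3, y_mean=3, x_stddev=0.6, y_stddev=0.6)
yc, xc = np.mgrid[0:7, 0:7]
print(psf(xc, yc)[3, 3], rasterize(psf, (7, 7))[3, 3])
```

The printed pair shows the difference between evaluating at the pixel centre and averaging over the pixel, which is exactly the error this project is meant to control.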

Add indexing capability to Table object

Suggested Mentor(s): Tom Aldcroft (Astropy), Stuart Mumford (SunPy)

Difficulty: Intermediate

Astronomy knowledge needed: none

Programming skills: Python, Cython, familiarity with database algorithms

Description

The Table class is the core astropy class for storing and manipulating tabular data. Currently it supports a limited set of database-like capabilities, including table joins and grouping. A natural extension of this is to provide the ability to create and maintain an index on one or more columns, as well as a table primary key. With these indexed columns available, certain selection and query operations could be highly optimized. The challenge is to maintain the integrity of the indexes as column or table properties change, using state-of-the-art algorithms for high performance.

There are various uses of this functionality, such as supporting time series data, where the index column would allow you to sort the Table correctly as well as perform operations such as truncations and merges while maintaining the integrity of the time series. Other uses include catalogs of positions in the night sky, where an index column of astropy coordinate objects would maintain the uniqueness of every position.

To summarize:

  • Add method to create an index for a specified column
  • Add code to maintain these indexes when the table is modified
  • Add method to designate a column as a primary key (possibly maintaining table in sort order for that key)
  • Optimize existing table operations to use indexes where possible
  • Add new methods to select table rows based on column values
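From the user's side, the finished feature might look something like the sketch below; the add_index and loc names are hypothetical placeholders for the API the project would settle on:

```python
from astropy.table import Table

t = Table({"time": [3.1, 1.2, 2.7], "flux": [10.0, 12.0, 9.5]})

# Hypothetical API this project might provide:
# t.add_index("time")    # build and maintain an index on a column
# t.loc[1.2]             # fast row retrieval via the index
# t.loc[1.0:3.0]         # range queries on the indexed column

# Until then, selections fall back to full scans such as:
print(t[t["time"] < 3.0])
```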

Unify and improve file handling

Suggested Mentor(s): Michael Droettboom

Difficulty: Intermediate to Expert

Astronomy knowledge needed: none

Programming skills: Python, Unix features

Description

We have a number of packages that read and write data to files and file-like objects. While there was some initial effort to unify this code in get_readable_fileobj and others, in general each package is handling its own file I/O. This sort of code is notoriously difficult to get right across versions of Python and the different platforms we support, so it would be beneficial to remove this duplication. This also means that some features, such as gzip handling or URL handling, are not universally available or are inconsistent across packages. Once this is unified, we can move on to some more advanced features that don’t exist anywhere in astropy, such as HTTP Range fetching (see astropy/#3446) and OS-level file locking to make multiprocessing applications that write to files more robust.
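For reference, this is roughly how the existing helper is used today; the file name is a hypothetical example, and the point is that the same call transparently handles local paths, gzipped files and URLs:

```python
from astropy.utils.data import get_readable_fileobj

# The same call handles a plain local file, a gzipped file or a URL;
# "catalog.csv.gz" is just a hypothetical example file.
with get_readable_fileobj("catalog.csv.gz") as f:
    first_line = f.readline()
print(first_line)
```

The project would make this (or a successor to it) the single I/O entry point used consistently across the astropy sub-packages.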

Implement missing astropy.modeling functionality

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate to expert

Astronomy knowledge needed: Basic

Programming skills: Python

Description

Implement some basic features that are still missing in the astropy.modeling package:

  • Fit parameter errors (symmetric and profile likelihood)
  • Poisson fit statistic
  • PSF-convolved models
  • model parameter and fit result serialisation, e.g. to YAML or JSON or XML (e.g. some astronomers use XML)

For the parameter error and Poisson fit statistic parts, some statistics background is needed, as well as an interest in discussing and finding a good API for these things.
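For instance, a Poisson fit statistic usually means the Cash statistic, which is short enough to sketch against an astropy model; the data here are synthetic and the helper name is illustrative, not a proposed API:

```python
import numpy as np
from astropy.modeling.models import Gaussian1D

def cash(data, model_counts):
    """Cash (Poisson log-likelihood) statistic, C = 2 * sum(m - n * ln m)."""
    return 2.0 * np.sum(model_counts - data * np.log(model_counts))

x = np.arange(50)
model = Gaussian1D(amplitude=20, mean=25, stddev=4)
counts = np.random.poisson(model(x) + 1.0)    # synthetic Poisson data with a flat background
print(cash(counts, model(x) + 1.0))           # the value a Poisson fitter would minimize
```

The project's job is wiring a statistic like this into the Fitter machinery and returning sensible parameter errors alongside it.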

An optional fun application at the end of this project (if model and fit result serialisation is implemented) could be to develop an interactive image fitting GUI (e.g. with IPython widgets in the web browser) for common 2D Astropy models, showing data, model and residual images and letting the user adjust model parameters and display fit statistics and results interactively.

Implement framework for handling velocities and velocity transforms in astropy.coordinates

Suggested Mentor(s): Adrian Price-Whelan & Erik Tollerud

Difficulty: Intermediate to Expert

Astronomy knowledge needed: understanding of coordinate transformations, some knowledge of astronomical coordinate systems would be useful

Programming skills: Python

Description

The coordinates subpackage currently only supports transforming positional coordinates, but it would be useful to develop a consistent framework for also transforming velocities (e.g., proper motion to proper motion, or proper motion to cartesian) with full support for barycentric, galactocentric, and LSR motion. This project could be:

  1. working with us to develop a consistent API for handling velocities within coordinates,
  2. developing a trial implementation of an API,
  3. actually doing core development to implement the new features, or
  4. some combination of all of the above.
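To give a flavour of what the framework should make easy, here is the kind of conversion users currently do by hand with unit equivalencies (values are for a hypothetical star):

```python
import astropy.units as u

# Proper motion and distance of a hypothetical star.
pm = 35.0 * u.mas / u.yr
distance = 250.0 * u.pc

# Tangential velocity; dimensionless_angles() lets the angle unit drop out.
v_tan = (pm * distance).to(u.km / u.s, equivalencies=u.dimensionless_angles())
print(v_tan)    # roughly 41 km/s
```

A velocity framework inside astropy.coordinates would attach quantities like this to frames and transform them consistently, instead of leaving the bookkeeping to the user.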

Implement Public API for ERFA

Suggested Mentor(s): Erik Tollerud

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None required, but may be helpful for understanding ERFA functionality

Programming skills: Python, Cython, C

Description

Some of the major functionality for Astropy uses the ERFA C library (adapted from the IAU SOFA library) as the back-end for computational “heavy-lifting”. Members of the community have expressed a desire to use this lower-level python wrapper around ERFA for other purposes that may not be directly relevant for Astropy. So this project would involve making the necessary changes to make the ERFA python API public. This includes:

  • Getting the documentation up to the astropy standard (currently it is mostly auto-generated verbatim from the C comments).
  • Implementing a more complete test suite for the python side of the code.
  • Possibly moving it to a separate package as part of the liberfa GitHub organization. This would also include making the necessary changes to ensure everything continues to work in Astropy.
  • Any other steps necessary to ensure the resulting package (or sub-package of Astropy) is stable and relatively easy to use.

Packages affiliated with Astropy

Develop an affiliated package for observation planning / scheduling

Suggested Mentor(s): Christoph Deil

Difficulty: Beginner

Astronomy knowledge needed: Intermediate

Programming skills: Python

Description

Now that Astropy can transform from horizontal (altitude/azimuth) to sky coordinates, it’s possible to develop tools for observation planning / scheduling (see here for an example). It would be nice to start developing an affiliated package that can be used by observers and observatories to plan and schedule observations. This project could go in a few different directions, including:

  • creating typical tables and plots for observation planning
  • optimising scheduling of observations for given target lists and telescope slew speed / exposure lengths for a given night or even month / year
  • contribute sun / moon rise / set functionality to astropy coordinates
  • a desktop or web GUI

The project could start with a look at the functionality of existing tools and then gather some input on the astropy mailing list about what the community wants. The student should have an interest in coordinates, observation planning / scheduling and plotting / GUIs.

Contribute gamma-ray data analysis methods to Gammapy

Suggested Mentor(s): Christoph Deil, Axel Donath

Difficulty: Beginner to intermediate

Astronomy knowledge needed: Basic

Programming skills: Python

Description

Gammapy is an Astropy-affiliated package to simulate and analyse data from gamma-ray telescopes such as Fermi, H.E.S.S. and CTA. A lot of basic functionality is still missing; specifically, we think that contributing to one of the sub-packages gammapy.background (background modeling), gammapy.detect (source detection methods) or gammapy.spectrum (spectral analysis methods) would be a good GSoC project if you are interested in implementing specific established data analysis algorithms used in gamma-ray astronomy (e.g. adaptive-ring, reflected-region or template background estimation, or spectrum forward-folding or unfolding methods). No prior experience with gamma-ray data is needed.

Astropy Acknowledgement/Citation Generator

Suggested Mentor(s): Erik Tollerud

Difficulty: Beginner to Intermediate

Astronomy knowledge needed: none, although some experience with astronomy citation practices might be useful

Programming skills: Python and LaTeX/BibTeX

Description

Some parts of Astropy and affiliated packages use algorithms or tools that have been published in the scientific literature (this includes Astropy itself). To encourage citing these works, it would be useful if Astropy had a feature to allow attaching citations to methods, functions, or packages. This would then allow a user to simply run a function along the lines of “write_citations” and have it print or write a file that tells them what papers to cite. Bonus points if this actually can show BibTeX or LaTeX bibliography entries that can be just dropped into papers with minimal effort on the part of the user.
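A purely hypothetical sketch of what such machinery could look like; the decorator, registry and write_citations names are placeholders, not an existing Astropy API:

```python
# Hypothetical sketch of the proposed machinery; all names are placeholders.
_CITATION_REGISTRY = {}

def cites(bibcode):
    """Decorator attaching a citation to a function and recording it globally."""
    def decorator(func):
        _CITATION_REGISTRY.setdefault(func.__module__, set()).add(bibcode)
        func.__citation__ = bibcode
        return func
    return decorator

@cites("2013A&A...558A..33A")   # the Astropy paper, as an example entry
def sigma_clip_lightcurve(data):
    ...

def write_citations():
    """Print the citations attached to everything the user has used."""
    for module, refs in sorted(_CITATION_REGISTRY.items()):
        print(module, "->", ", ".join(sorted(refs)))
```

Turning the recorded bibcodes into ready-to-paste BibTeX entries (e.g. via ADS) would be the "bonus points" part of the project.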

Adding further spectral standards to specutils

Suggested Mentor(s): Adam Ginsburg & Wolfgang Kerzendorf

Difficulty: Intermediate

Programming skills: Python

Description

Specutils is a package within the astropy collection that deals with operations on spectra. Apart from imaging, spectra are the second main data product in astronomy. While imaging data is collected by hooking a giant DSLR to the end of a telescope and sticking coloured glass (a filter) between the telescope and the DSLR, spectra are obtained by breaking light up into its components and then observing the resulting distribution. These data are saved in a variety of formats.

Currently, we are able to read and write a subset of the standards that are out there. As a project, we suggest implementing the remaining unsupported standards. All of the code is in Python, and a good understanding of classes is needed for this project.
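As an example of what "a standard" means in practice, here is a reader for one common (and comparatively simple) convention: flux in the primary HDU with a linear wavelength axis described by CRVAL1/CDELT1/CRPIX1. The file name is hypothetical, and many of the formats the project would add are considerably more involved:

```python
import numpy as np
from astropy.io import fits

# One common on-disk convention: flux in the primary HDU, linear wavelength
# axis described by CRVAL1/CDELT1/CRPIX1 (real files vary a lot).
with fits.open("spectrum.fits") as hdul:          # hypothetical file
    flux = hdul[0].data
    hdr = hdul[0].header
    pix = np.arange(flux.size)
    wave = hdr["CRVAL1"] + (pix + 1 - hdr.get("CRPIX1", 1)) * hdr["CDELT1"]
```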

Improve pyregion and pyds9

Suggested Mentor(s): Christoph Deil

Difficulty: Intermediate

Astronomy knowledge needed: Basic

Programming skills: Python

Description

The pyregion package is very useful for working with ds9 and CIAO region files. It is now at https://github.com/astropy/pyregion but it is unfinished … someone has to improve and polish it. In particular, the region file parser is very slow (see pyregion#48), and someone interested in parsing should find out why and make it fast. There are several other things to do, e.g. using astropy coordinates everywhere and implementing tests so that it is compatible with ds9 to a very high accuracy. The package could also be extended with Python functions to read / write / visualise MOC files, or to unify and improve the existing Python interfaces to ds9. The student should be interested in sky coordinates and regions, parsing, visualisation, and writing tests and docs; for the ds9 interfaces some Cython coding is probably needed.
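For reference, the basic workflow the project would speed up and extend looks roughly like this (file names are hypothetical):

```python
import pyregion
from astropy.io import fits

hdu = fits.open("image.fits")[0]            # hypothetical image
regions = pyregion.open("sources.reg")      # ds9 region file (parsed by the slow parser today)
mask = regions.get_mask(hdu=hdu)            # boolean mask of pixels inside the regions
print(mask.sum(), "pixels selected")
```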

Revamp astropython.org web site

Suggested Mentor(s): Tom Aldcroft

Difficulty: Intermediate

Astronomy knowledge needed: Basic / none

Programming skills: Python, web development (javascript etc)

Description

The http://www.astropython.org site is one of the top two generic informational / resource sites about Python in astronomy. This site uses Google App Engine and is basically all custom code built around the bloggart engine. Currently it is getting a bit stale for a few reasons:

  • There is no good mechanism for guest posting to expand the community of people contributing.
  • It is painful to add content because of the antiquated entry interface, which now seems to work only on Firefox.
  • The comment system is lacking (no feedback to comment authors etc).
  • The website code itself is convoluted and difficult to maintain / improve

The proposal is to start over with all modern tools to bring fresh energy and involvement into this project. All details of how to do this are to be determined, but one requirement is to migrate all the current content. Part of this would be re-evaluating current resources as well as digging around to freshen up the resource list.

CasaCore

Improve Python bindings to CasaCore measures

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Intermediate

Astronomy knowledge needed: Some understanding of astronomical coordinate systems and transformations

Programming skills: Python, some C++

Description

CasaCore contains many features to perform astronomical coordinate transformations, for example from B1950 to J2000, or from J2000 to azimuth-elevation. Moreover, it can compute ephemerides, which may make it useful for many other projects (see http://casacore.github.io/casacore-notes/233). The current Python binding, python-casacore, contains a binding to the measures library, but it is not a very programmer-friendly binding and is thus not much used. An interface to measures exists within CasaCore that makes converting coordinates much easier. This interface was written with TaQL in mind. This project concerns modifying the TaQL measures interface into a Python measures interface, thus making casacore measures easily accessible from Python.

Frequency conversions for TaQL / python-casacore

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Beginner / Intermediate

Astronomy knowledge needed: Some understanding of use of astronomical frequencies (regarding Doppler shifts etc.)

Programming skills: C++

Description

The casacore measures module contains code for converting frequencies between various reference frames (e.g. rest frequency, geocentric, topocentric, galactocentric). Having this module available in TaQL would make it much more convenient to perform these kinds of conversions. Example code exists for other conversions; see e.g. http://casacore.github.io/casacore/group__MeasUDF__module.html

This project concerns writing such a converter for the Doppler and Frequency conversions. It will require tweaking in boost-python, but since the example code is available for other measures, it should not be too hard.

General python-casacore cleanup

Suggested Mentor(s): Gijs Molenaar, Ger van Diepen

Difficulty: Intermediate

Astronomy knowledge needed: none

Programming skills: python

Description

The current python-casacore code is already much improved over the previous “pyrap” implementation. This python binding to casacore is now python 3 compatible, contains some unit tests, etc. But some work remains to be done:

  • Remove all compile warnings
  • Modernise the code, add missing features, and maybe make it more ‘pythonic’.
  • Improve test coverage (24% at the moment)

This is a typical project for learning to write good code.

Table plotting for python-casacore

Suggested Mentor(s): Ger van Diepen, Tammo Jan Dijkema

Difficulty: Beginner

Astronomy knowledge needed: Some idea about astronomical units

Programming skills: Python

Description

Radio interferometric data sets are almost always stored in casacore “Measurement Sets”. These can be queried through TaQL (see e.g. http://casacore.github.io/casacore-notes/199). It would be nice to have a plotting routine in python-casacore to easily plot two columns against each other, with nicely formatted axes etc. (possibly using wcsaxes).

This would, at the very least, make a nice extension to the taql jupyter kernel underneath http://taql.astron.nl
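The rough shape of such a helper is sketched below; the Measurement Set name is hypothetical and the chosen columns are deliberately trivial, since the real project is the general plotting routine and the nicely formatted axes:

```python
import matplotlib.pyplot as plt
from casacore.tables import taql

# Select two columns from a (hypothetical) Measurement Set with TaQL and plot them.
t = taql("SELECT TIME, ANTENNA1 FROM my.ms LIMIT 1000")
plt.plot(t.getcol("TIME"), t.getcol("ANTENNA1"), ".")
plt.xlabel("TIME")
plt.ylabel("ANTENNA1")
plt.show()
```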

ChiantiPy

GUI Spectral Explorer

Suggested Mentor(s): Ken Dere

Difficulty: Intermediate

Astronomy knowledge needed: A basic understanding of astrophysical spectroscopy

Programming skills: Python

Description

The goal of this project is to provide a graphical user interface that enables a user to explore observed spectra and compare them with theoretical spectra. The basis for the theoretical spectra is the CHIANTI atomic database for astrophysical spectroscopy, first released in 1997. Programmatic access to the database, which is freely available, is provided by the ChiantiPy package – a pure Python package. It is highly object oriented, with each ion, such as Fe XVII, being the basic object. Higher level objects are often assembled from a collection of ions, such as when calculating a spectrum. ChiantiPy uses the CHIANTI database to calculate line and continuum intensities as a function of temperature and electron density. This can be done for a set of elemental abundances in CHIANTI or for a user-provided set of elemental abundances. At present, if a user wants to compare observations with CHIANTI theoretical spectra, it must be done on a case-by-case basis. A GUI explorer, written in Python and preferably PyQt or Wx based, will provide an integrated tool to import observed spectra and plot them alongside theoretical spectra. It will further allow the user to understand which spectral lines contribute to various spectral line profiles, and how the predicted spectra vary as a function of temperature and density.

It will be necessary to develop techniques to import observed spectra from a variety of sources. Typical sources are FITS files, HDF5 files, or csv files. It will also be important to allow users to import their data through modules of their own.

IMS

Solar Storms forecasting server

Suggested Mentors: Antonio del Mastro, Olena Persianova

Difficulty: Intermediate to Hard

Astronomy knowledge needed: None beforehand, the student will be required to research relevant publications.

Programming skills: advanced Python; basic Theano or TensorFlow; basic Django or Flask; experience with some ANN library, such as Keras, theanets or Lasagne.

Description:

Solar storms are responsible for disruption of satellite communication and damage to electronic equipment in space. The storms also have to be taken into account for EVA and habitat maintenance activities, as the higher radiation levels they bring have a detrimental effect on crew members’ health.

Prediction of these storms is essential to prevent said damage. A lot of astronomical data is generated on a daily basis, and this could be used in conjunction with machine learning methods to predict solar storms.

In this project, the student will be required to:

  • Using a machine learning approach, predict the duration and intensity of solar storms:
    • The student should preferably use an artificial neural network approach (although alternatives, such as random forests, SVMs, Bayesian models or HMMs, can be considered).
    • The predictions should be given 24-48 hours in advance of a storm (depending on viability).
    • The student should evaluate training and test data provided by IMS, or find a suitable dataset if the data provided is unsuitable.
    • The student should evaluate an approach suggested by the IMS to test the model’s performance, or propose a testing procedure of his/her own.
  • Provide information on a dynamically updated web page, using preferably Django or Flask, which should at least include:
    • The real-time and historical sensor values, as plots where appropriate.
    • Useful statistics about the sensors (TBD).
    • The model’s predictions.
    • Useful statistics about the predictions (e.g. RMSE).
  • Incorporate the prediction model and the web page into the ERAS ecosystem, which means building Tango device servers (at least one for the predictor, more if necessary).

Currently, a few features are being used for the prediction of solar storms, among others:

  1. Radio flux
  2. Sunspot area
  3. Sunspot Number
  4. X-ray Background Flux

We recommend that the student research the viability of using additional features in the model.
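A minimal sketch of the prediction piece with Keras, using placeholder random arrays in place of the real feature and label data; the architecture is arbitrary and only meant to show the moving parts:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Placeholder arrays: rows = days, columns = the four features listed above
# (radio flux, sunspot area, sunspot number, X-ray background flux).
X = np.random.rand(1000, 4)
y = np.random.rand(1000, 1)     # e.g. storm intensity 24-48 h later

model = Sequential()
model.add(Dense(32, activation="relu", input_dim=4))
model.add(Dense(16, activation="relu"))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mse")
model.fit(X, y)                 # defaults only; the real work is data preparation and validation
print(model.predict(X[:5]))
```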

Some other resources to get started:

NASA’s Solar Storm and Space Weather FAQ

Space Weather Prediction Center’s Historical SWPC Products and Data Displays

Note: If you are interested in this project, please apply via the Python Software Foundation.

JuliaAstro

Image compression and efficient table reading in FITSIO.jl

Suggested Mentor(s): Kyle Barbary, Ryan Giordan

Difficulty: Intermediate to Expert

Astronomy knowledge needed: none

Programming skills: Julia, some C

Description

FITS (Flexible Image Transport System) format files are the standard containers for imaging and tabular data in astronomy. The FITSIO.jl package provides support for reading and writing these files in Julia. It is implemented as a high-level, yet efficient, wrapper for the C library cfitsio. This project would involve improving the available functionality and performance of FITSIO.jl. Some desired features, such as I/O of compressed images, reading and writing subsets of tables and appending to existing tables, already have implementations in cfitsio. Implementing these would involve understanding the cfitsio API and writing wrappers in Julia using ccall. A more advanced feature is fast I/O of large tables with multiple columns. No C API exists for this – it would involve understanding the memory layout of FITS tables, pointer arithmetic and byteswapping. It will be interesting to see whether this can be done efficiently in Julia or if a C shim is required. Finally, benchmarks should be implemented to ensure that FITSIO.jl performance is as good as possible.

For a more detailed list of features, see this issue.

SunPy

Lightcurve Refactor

Suggested Mentor(s): Stuart Mumford, Dan Ryan, Andrew Inglis, Jack Ireland

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python

Description

The Lightcurve class is one of the three core datatypes in SunPy, along with Map and Spectra. Lightcurve is designed to read in, process and store meta data related to solar physics time series data. Currently, Lightcurve uses the pandas library as its underlying data structure; however, this is subject to change in the future.

Much like the map submodule, lightcurve needs to be able to read in various supported data formats (such as FITS, ascii and others in the future), store their meta data and give users unified access to this metadata independently of the original source of the data.

As currently implemented (as of 0.6) the lightcurve module performs three core tasks:

  1. Download the raw data
  2. Read this data into a pandas dataframe
  3. Store the meta data obtained with the data.

As of the SunPy 0.7 release the first stage will be moved out of lightcurve and into the net subpackage as part of the UnifiedDownloader Pull Request. This leaves lightcurve in a similar position to map where the data acquisition is not part of the core data type and is managed separately.

The objective of this project is to re-implement the core of the lightcurve submodule so that it no longer contains the code to download data from the internet. The lightcurve module should be able to open files from disk that have been downloaded using the new UnifiedDownloader submodule. The lightcurve factory must be able to read files from multiple sources, some of which can be auto-detected and some of which cannot. The lightcurve module must also be able to combine multiple files into a single timeseries.
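An illustrative sketch of the Map-style factory/registration pattern the new Lightcurve would adopt; the names here are placeholders, and the actual API is specified in SEP 7:

```python
# Illustrative sketch of the Map-style factory pattern (the real API is in SEP 7).
_SOURCES = []

def register_source(matcher):
    """Register a lightcurve source class with a metadata-matching test."""
    def decorator(cls):
        _SOURCES.append((matcher, cls))
        return cls
    return decorator

@register_source(lambda meta: meta.get("TELESCOP", "").startswith("GOES"))
class GOESLightCurve:
    def __init__(self, data, meta):
        self.data, self.meta = data, meta

def Lightcurve(data, meta):
    """Factory: pick the registered source whose matcher accepts the metadata."""
    for matcher, cls in _SOURCES:
        if matcher(meta):
            return cls(data, meta)
    raise ValueError("no lightcurve source recognises this file")
```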

Expected Outcomes

Someone undertaking this project will complete the following tasks:

  1. Become familiar with the UnifiedDownloader code; if it has not been accepted into the SunPy codebase, complete the remaining tasks for this to be achieved.
  2. Write a factory class for lightcurve similar to the sunpy.map.Map class. This class will be a generic constructor for lightcurve allowing the user to instantiate any one of the many subclasses of GenericLightcurve present in sunpy.lightcurve.sources. The API design for the factory class is in SEP 7.
  3. Design and develop a robust method of dealing with lightcurve meta data, which can handle joining different parts of timeseries from different files, each with their own meta data. (See #1122)

A successful proposal for this project will demonstrate that the applicant has understood the mechanism behind the Map factory as already implemented in SunPy, and will present a timeline of what needs to change in Lightcurve to mirror the design of Map and follow the design for Lightcurve in SEP 7.

Implementing AIA response functions in SunPy

Suggested Mentor(s): Drew Leonard, Will Barnes

Difficulty: Beginner

Astronomy knowledge needed: Some knowledge of coronal emission processes would be beneficial.

Programming skills: Python.

Description

The CHIANTI atomic physics database is a valuable resource for solar physics. The CHIANTI database holds a large amount of information on the physical properties of different elements in different ionisation states and enables the calculation of various parameters from this information. Using CHIANTI it is possible to calculate the spectra of various types of solar plasma (e.g., flare, quiet sun, etc.) from the observed elemental abundances and ionisation states. These synthetic spectra are essential for calculating response functions of various instruments. An instrument’s wavelength response function describes how much light emitted at a given wavelength is measured by the instrument. Similarly, the temperature response function describes the instrument’s sensitivity to light emitted by plasma at a particular temperature. These response functions play a vital role in correctly interpreting observations, as does proper calculation of these functions.

Currently, SunPy has no implementation of instrument response functions. This project would develop the routines necessary to calculate response functions using the Python interface to the CHIANTI database, ChiantiPy. The primary implementation of this would be to produce default wavelength and temperature response functions for the Atmospheric Imaging Assembly (AIA) instrument. A detailed discussion of the AIA response functions can be found in Boerner et al 2012.
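Numerically, the temperature response reduces to a wavelength integral of the instrument's wavelength response times the CHIANTI contribution function. The arrays below are placeholders for what ChiantiPy and the AIA calibration would actually supply; only the final line is the calculation itself:

```python
import numpy as np

# Placeholders for what ChiantiPy / the AIA calibration would provide:
wavelength = np.linspace(90, 100, 500)                             # angstrom
wave_response = np.exp(-0.5 * ((wavelength - 94) / 1.0) ** 2)      # A_eff(lambda), e.g. a 94 A channel
temperature = np.logspace(5.5, 7.5, 41)                            # K
contribution = np.random.rand(temperature.size, wavelength.size)   # G(lambda, T) from CHIANTI

# Temperature response: R(T) = integral of A_eff(lambda) * G(lambda, T) dlambda
temp_response = np.trapz(wave_response * contribution, wavelength, axis=1)
```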

Other potential applications of ChiantiPy in SunPy include:

  1. Generalisation of the code to produce response functions using arbitrary values of physical parameters (elemental abundances, etc.).
  2. Calculation of response functions for other instruments.
  3. Conversion of ChiantiPy spectra objects to SunPy Spectra objects.

Expected Outcomes: This project would facilitate SunPy becoming independent from Solar SoftWare (SSW) for analysing AIA data, particularly with respect to inferring plasma properties such as temperature and density.

A successful proposal will outline a schedule for implementing at least a single set of temperature and wavelength response functions for AIA; response functions for arbitrary plasma conditions would be a bonus. Familiarity with CHIANTI, ChiantiPy and SSW’s implementation of the response functions will help to properly assess how long it will take to recreate them in SunPy.

Real time data access and visualisation tools

Suggested Mentor(s): David Perez-Suarez, Jack Ireland

Difficulty: Beginner-Intermediate

Astronomy knowledge needed: none

Programming skills: Python

Description

Real-time data is very useful for space weather operations. SunPy provides access to data from different virtual observatories or services (like sunpy.net.vso or sunpy.net.hek) or by directly accessing data archives. Fido (formerly called UnifiedDownloader) provides a single point of access to them all. However, this needs to be extended to other data archives, and logic needs to be implemented so that, depending on the time range requested, the data is downloaded from the real-time archives or from the full archive.

Additionally, this project should produce some visualisation tools to combine data from different sources. Some examples are overlays of active regions on top of solar images (like in SolarMonitor), GOES X-ray flux with active region numbers on the flares detected (like in Latest Events), and the latest features observed from HEK on top of a map (e.g. iSolSearch).

In summary, this project has two objectives:

  1. Implementation of real time archives and logic on Fido.
  2. Creation of visualisation tools to represent real-time data.

Familiarisation with the unidown branch and the matplotlib library will help you to create a proper timeline of how much time it will take to implement, test and document each part of the project.
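For orientation, a query through the Fido interface that this project would extend looks roughly like this (time range and instrument are arbitrary, and the method names follow the current development versions of SunPy):

```python
from sunpy.net import Fido, attrs as a

# One query through the single Fido interface; the registered clients
# (VSO, etc.) decide which of them can serve it.
result = Fido.search(a.Time("2016-02-03 00:00", "2016-02-03 00:10"),
                     a.Instrument("AIA"))
print(result)
files = Fido.fetch(result)   # download the matched files
```

The real-time work would add clients for real-time archives behind this same interface, plus the time-range logic that chooses between them and the full archive.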

Improvements to the SunPy Database

Suggested Mentor(s): Stuart Mumford, Simon Liedtke, Steven Christe

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python, some database design knowledge would be helpful.

Description

The database module provides functionality to users to manage collections of files on disk in a way not reliant upon folder structure and file name. The database allows users to find files on disk by either physical parameters, such as wavelength and time or properties of the instrument such as name and spacecraft. It also allows more complex queries by enabling searches of the raw meta data associated with the files.

The SunPy database will also act as a proxy for some web services supported by SunPy. When used like this, the database module takes a user query, downloads the data from the web service, stores it in the database, and then returns the query results to the user. SunPy contains clients for various web services; the first and primary web service SunPy supported was the Virtual Solar Observatory (VSO), and this is the web service the database was originally designed to support. Since the original development of the database module, the database has also been extended to support the HEK client.

The SunPy web clients use a system named attrs (an abbreviation for attributes) to compose queries; this attrs system is also used by the database to perform queries on the database, with some of the attrs shared between the VSO client and the database. Recently, a new downloader front end (originally named UnifiedDownloader, now affectionately known as Fido) has been developed; it provides a factory class with which various download clients (such as the VSO) can register, providing information about which attrs and attr values each client supports. Using this approach, the Fido downloader provides a single interface to the many different services SunPy supports. The first part of this project will be to update the database module to support the new Fido interface, specifically by using Fido inside the database to retrieve data.

The second part of the project will be to update the caching mechanism implemented in the database module. The current caching system serialises the user’s VSO query and stores it as JSON; upon the user requesting another query, the query will be compared to the cache of serialised queries and, if a match is found, the results from the cached query returned. This mechanism is limiting in that if the user requests 100 records in query A and 100 records in query B, but 50 of the records requested in both queries are the same (i.e. two overlapping time windows), then the 50 records will be re-downloaded, as the cache of query A will not match query B. The updated caching system will store the records a query returns (before the data is downloaded) and then link the results of a query to the records in the database (once the data has been downloaded). Then, when records are retrieved from a web service, any records that are stored in the cache table can be skipped for retrieval from the web service and returned from the records in the database. This will allow the caching of partial queries rather than whole queries as is currently implemented.

This project aims to achieve the following things:

  1. Update the current implementation of the database using the VSO attributes to use the slightly refactored Fido attributes and use Fido inside the database to download data from the VSO.
  2. Implement a new caching mechanism based on the results of queries with Fido, rather than the current caching which is based upon the VSO query.

A successful proposal will schedule updates to the database package in small sections, rather than in one large pull request. The work should be understood and broken down into individual sections.

There are various other maintenance tasks which need undertaking (https://github.com/sunpy/sunpy/labels/Database) which would be a good way for someone interested in this project to familiarise themselves with the codebase.

GUI to use LCT tools

Suggested Mentor(s): Jose Iván Campos Rozo (National Astronomical Observatory, National University of Colombia), Santiago Vargas Domínguez (National Astronomical Observatory, National University of Colombia), David Pérez Suárez.

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python, basic knowledge of qt4, pyqt4, qt designer

Description:

The Local Correlation Tracking (LCT, November & Simon, 1988) technique is a robust method used to study the dynamics of structures in a time series of images. By tracking pixel displacements, using a correlation window, LCT can determine proper motions and generate flow maps of horizontal velocities. This procedure is used to study the dynamics of plasma in the solar photosphere at different spatial scales, e.g. the analysis of granular and supergranular convective cells, meridional flows, etc. A widget implemented in Python was developed; it provides a user-friendly graphical user interface (GUI) to control various parameters for the process of calculating flow maps of proper motions for a series of filtergrams (a data cube). Our purpose is to implement this tool in SunPy using its structure and to improve it with some more options, i.e. masks, statistics, histograms, contours and multi-plots. Although an initial version is already developed, our proposal is to focus on the efficient integration of the code in the SunPy libraries. The code (without widget files yet) is at https://github.com/Hypnus1803/flow_maps
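The core of LCT can be sketched in a few lines: cross-correlate a small window taken from two consecutive frames and read the local displacement off the correlation peak. The frames below are synthetic, and a production code would add apodization, sub-pixel peak fitting and a sliding window over the full image:

```python
import numpy as np
from scipy.signal import correlate2d

def local_shift(win_a, win_b):
    """Displacement of win_b relative to win_a from the cross-correlation peak."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    corr = correlate2d(b, a, mode="same")
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    centre = np.array(corr.shape) // 2
    return dy - centre[0], dx - centre[1]

# Synthetic test: a blob shifted by (1, 2) pixels between two frames.
y, x = np.mgrid[0:32, 0:32]
frame1 = np.exp(-((x - 15) ** 2 + (y - 15) ** 2) / 20.0)
frame2 = np.exp(-((x - 17) ** 2 + (y - 16) ** 2) / 20.0)
print(local_shift(frame1, frame2))   # approximately (1, 2)
```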

Expected Outcomes: To integrate the code efficiently into the SunPy libraries.

SunPy

Improvements to the SunPy Database

Suggested Mentor(s): Stuart Mumford, Steven Christe

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python, some database knowledge would be helpful, but not required.

Description

The database module provides functionality to users to manage collections of files on disk in a way not reliant upon folder structure and file name. The database allows users to find files on disk by either physical parameters, such as wavelength and time or properties of the instrument such as name and spacecraft. It also allows more complex queries by enabling searches of the raw meta data associated with the files.

The improvements to the database functionality that would be implemented by this project include:

  1. Integration of the new UnifiedDownloader code into the database search, to replace the direct VSO integration currently present. (The VSO is a repository of solar physics data; SunPy’s VSO API has been wrapped by UnifiedDownloader.)
  2. Support for relative paths in the database module #783 to allow a centralised database with multiple users, all referencing a central file store mounted with different absolute paths on each client.
  3. Supporting all data supported by the sunpy.lightcurve module in the database. The major hurdle here is the lack of standardisation in the files used for this data.

There are various other maintenance tasks which need undertaking (https://github.com/sunpy/sunpy/labels/Database) which would be a good way for someone interested in this project to familiarise themselves with the codebase.

Integrating ChiantiPy and SunPy

Suggested Mentor(s): Dan Ryan, Ken Dere

Difficulty: Beginner

Astronomy knowledge needed: Some knowledge of spectra.

Programming skills: Python.

Description

The CHIANTI atomic physics database is a valuable resource for solar physics. The CHIANTI database holds a large amount of information on the physical properties of different elements in different ionisation states and enables the calculation of various parameters from this information. Using CHIANTI it is possible to calculate the spectra of various types of solar plasma (e.g., flare, quiet sun, etc.) from the observed elemental abundances and ionisation states. These synthetic spectra are essential for comparison with the data observed by various instruments, both to calculate the instruments’ response functions and to derive physical parameters of the observed plasma, such as temperature.

Currently, no SunPy code uses ChiantiPy, the Python interface to the CHIANTI database. This project would develop the routines to be included in SunPy to use ChiantiPy for the various physical calculations desired. The first potential use of ChiantiPy in SunPy is in the sunpy.instr.goes module, where data tables calculated using CHIANTI are currently downloaded from the Solar SoftWare (SSW) distribution; these data tables should instead be created using SunPy.

Other potential applications of ChiantiPy in SunPy include:

  1. Conversion of ChiantiPy spectra objects to SunPy Spectra objects.
  2. Calculation of AIA temperature response functions from ChiantiPy contribution functions.

Expected Outcomes: This project would facilitate SunPy becoming independent from Solar SoftWare (SSW) in producing and maintaining the files required by the sunpy.instr.goes module for determining the thermodynamic properties of the emitting plasma observed by GOES. It would also allow SunPy users to calculate spectra exclusively through Python without relying on SSW.

Support for analysis of Solar Energetic Particles

Suggested Mentor(s): David Pérez-Suárez

Difficulty: Beginner

Astronomy knowledge needed: None

Programming skills: Python.

Description

SunPy is able to read a lightcurve from different sources (GOES X-ray, LYRA, NoRH, …), but these are not all of them. SoHO/ERNE (the Energetic and Relativistic Nuclei and Electron experiment on board SoHO) measures one of the important effects in space weather, Solar Energetic Particles (SEP). The data from this instrument (as for the GOES particle measurements) come as plain-text csv files with header information. This project should add ERNE to the SunPy-supported instruments by reading these files in as a lightcurve object and allowing the user to perform the basic operations used when such data are analysed, e.g. energy-range binning, visualisation, etc.
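Purely to illustrate the read-then-wrap step (the actual ERNE export layout is not specified here, so the file name, columns and separator below are made up), the ingestion could start from something like:

```python
import pandas as pd

# Made-up whitespace-separated layout with '#' header lines; the real ERNE
# export format would drive the actual parser.
data = pd.read_csv("erne_export.txt", comment="#", delim_whitespace=True,
                   names=["time", "flux_1_3MeV", "flux_3_10MeV"],
                   parse_dates=["time"], index_col="time")

# One basic operation an SEP analysis needs, e.g. coarser time binning:
hourly = data.resample("1H").mean()
```

The project would wrap the parsed data and its header metadata in a proper lightcurve source class rather than leaving it as a bare DataFrame.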

Lightcurve Refactor

Suggested Mentor(s): Stuart Mumford, Dan Ryan, Andrew Inglis

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Python

Description

The Lightcurve class is one of the three core datatypes in SunPy, along with Map and Spectra. Lightcurve is designed to read in, process and store meta data related to solar physics time series data. Currently, Lightcurve uses the pandas library as its underlying data structure; however, this is subject to change in the future.

Much like the map submodule, lightcurve needs to be able to read in various supported data formats (such as FITS, ascii and others in the future), store their meta data and give users unified access to this metadata independently of the original source of the data.

As currently implemented (as of 0.5) the lightcurve module performs three core tasks:

  1. Download the raw data
  2. Read this data into a pandas dataframe
  3. Store the meta data obtained with the data.

As of the SunPy 0.6 release the first stage will be moved out of lightcurve and into the net subpackage as part of the UnifiedDownloader (name subject to change) Pull Request. This leaves lightcurve in a similar position to map, where the data acquisition is not part of the core data type and is managed separately, and therefore enables the implementation of a factory class like Map for the lightcurve module.

Expected Outcomes

Someone undertaking this project will complete the following tasks:

  1. Become familiar with the UnifiedDownloader code; if it has not been accepted into the SunPy codebase, complete the remaining tasks for this to be achieved.
  2. Re-write any new lightcurve sources that were not included in the UnifiedDownloader code as sources for UnifiedDownloader.
  3. Write a factory class for lightcurve similar to the sunpy.map.Map class. This class will be a generic constructor for lightcurve allowing the user to instantiate any one of the many subclasses of GenericLightcurve present in sunpy.lightcurve.sources. The API design for the factory class is here: https://github.com/sunpy/sunpy-SEP/pull/6
  4. Design and develop a robust method of dealing with lightcurve meta data, which can handle joining different parts of timeseries from different files, each with their own meta data. (See #1122)

IRIS, 4D Cubes and GUI

Suggested Mentors: Steven Christe (NASA GSFC, SunPy), Nabil Freij (Sheffield University)

Difficulty: Intermediate to Expert

Astronomy knowledge needed: None

Programming skills: Python and basic knowledge of GUI design.

Description:

Recently, a new Sun-observing satellite called IRIS was launched. It performs high-resolution, multi-wavelength observations of the solar atmosphere. As a result, the data are saved out as 4D cubes. These cubes have the structure [Time, Wavelength, Spatial]. This format is also used by other ground- and space-based telescopes.
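In plain numpy terms, the kinds of extractions the plugin would expose look like simple slices of such a cube; the array below is just a placeholder with the layout described above:

```python
import numpy as np

# Placeholder cube with the [time, wavelength, y, x] layout described above.
cube = np.random.rand(40, 8, 256, 256)

spectrum = cube[10, :, 128, 64]     # spectrum at one time and one pixel
image = cube[10, 3, :, :]           # image at one time and one wavelength
time_slit = cube[:, 3, :, 100]      # a fixed slit (column) followed in time
```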

Traditionally (and tradition is a powerful thing in astronomy), data analysis is done using a programming language called IDL. Using this language, a GUI called CRISPEX was created and is used to do simple but effective analysis.

This project aims to create a smaller scale version that uses Ginga as a backend. Ginga is a file viewer that was created with astrophysics in mind. It allows basic manipulation of FITS files, which are the standard data container in astrophysics. A Python plugin will be created and integrated into Ginga, allowing the user to open 3D/4D datasets and perform basic analysis, such as slit extraction.

To achieve this, a previous ESA summer project created a cube class. While it was finished, it was never integrated into SunPy. The code was created to hold and manipulate complex datatypes. It is similar in style to the SunPy Map class and follows that convention. However, it has extra features enabling specific data products that the user requires, for example a spectrum, to be extracted. The student will need to become familiar with this code, as small tweaks need to occur before it is added to SunPy.

Finally, the plugin will be written in Python. A background in Qt would be helpful, but it is not required. Ginga supports multiple GUI backends, but we plan to use Qt (a rough plugin skeleton follows the feature list below).

Plugin Features:

  1. Open FITS file and call the correct SunPy Map or Cube class.
  2. Solar coordinate integration.
  3. Perform slit extraction with the ability to choose a time and/or wavelength range.
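As a very rough starting point, a Ginga local plugin is a class with `build_gui`/`start`/`stop` hooks. The skeleton below follows our reading of Ginga's local-plugin conventions and should be checked against the Ginga documentation; everything cube-specific in it (the class name, the loading logic) is hypothetical.

```python
from ginga import GingaPlugin

class IRISCubeViewer(GingaPlugin.LocalPlugin):
    """Hypothetical plugin skeleton for browsing 3D/4D solar cubes."""

    def __init__(self, fv, fitsimage):
        # fv is the main Ginga shell, fitsimage the channel viewer
        super(IRISCubeViewer, self).__init__(fv, fitsimage)
        self.cube = None

    def build_gui(self, container):
        # Qt widgets for time/wavelength sliders and slit selection would be
        # assembled here (omitted in this sketch).
        pass

    def start(self):
        # Called when the plugin is opened: load the current FITS file into
        # the appropriate SunPy Map or Cube class (plugin feature 1).
        pass

    def stop(self):
        # Clean up widgets and references when the plugin is closed.
        pass
```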

SunPy Feature:

  1. Full IRIS support.

X core package

Implement x support for y

Suggested Mentor(s): Name Surname

Difficulty: Beginner, Intermediate or Expert

Astronomy knowledge needed: none (or some background)

Programming skills: Python, C, Julia, Java, super powers

Description

The y class is powerful but …

Implement m support for n

Suggested Mentor(s): Nombre Apellido

Difficulty: Beginner, Intermediate or Expert

Astronomy knowledge needed: none (or some background)

Programming skills: Python, C, Julia, Java, super powers

Description

The n class is super powerful but …

yt

If you are interested in one of the yt ideas, please see the GSoC 2016 Guidelines on the yt bitbucket wiki.

All projects in this section are for yt, an analysis and visualization environment for particle and mesh-based volumetric data. It has readers for most astrophysical simulation codes, as well as a few nuclear engineering simulation codes. It can handle data produced by particle-based codes, as well as data produced by codes that use various types of mesh structures, including uniform and adaptively refined meshes as well as unstructured and semi-structured meshes. yt is able to analyze and visualize these datasets, despite their substantially different on-disk and in-memory formats, using a common language.

To learn more about how to use yt to interact with simulation data, take a look at the quickstart guide, as well as the rest of the yt documentation. We also provide a listing of sample test datasets that can be loaded by yt. We use a variety of public communication channels, including mailing lists, IRC, and a Slack channel that can be joined by anyone interested in yt development.

For more information about contributing to yt, take a look at our developer guide. To see discussions about past yt projects, take a look at the yt enhancement proposal (YTEP) listing.

Integrate yt plots with interactive matplotlib backends

Suggested Mentor(s): Nathan Goldbaum, Matthew Turk

Difficulty: Intermediate

Knowledge needed: Familiarity with matplotlib. Knowledge of matplotlib’s object oriented API a plus.

Programming skills: Python. GUI programming.

Description

Currently, all yt plotting objects have a show() method that displays a version of the plot in Jupyter notebooks. This works for the most part and is relatively simple due to Jupyter’s data model. However, this reliance on the notebook fails for users who work primarily from the command line, either in the vanilla Python interpreter or the IPython command-line application. We receive many requests from confused users who do not understand why show() errors out in the regular Python interpreter or appears to do nothing in IPython, when they expect a GUI window to pop up when they run show().

This project would have a student modify yt’s plotting objects to hook into matplotlib’s interactive backends so that plots can be optionally displayed using a GUI plotting window. Optimally, we would also enable callbacks so that zooming and selecting does the “right thing”, generating high resolution data when it is available.

This is constrained by maintaining backward compatibility: by default yt should not fail when generating plots on headless devices (e.g. when connecting over SSH to a supercomputer).
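One possible shape for this, offered only as a proof-of-concept sketch: inspect the active matplotlib backend and either hand an already-built figure to a pyplot figure manager or fall back to saving it to disk on headless backends. The `display` helper and its fallback behaviour are assumptions, not existing yt API.

```python
import matplotlib
import matplotlib.pyplot as plt

_NON_INTERACTIVE = ("agg", "pdf", "ps", "svg", "cairo", "template")

def display(figure, fallback_path="plot.png"):
    """Show an existing Figure in a GUI window when an interactive backend is
    active; otherwise save it to disk so headless sessions do not fail."""
    backend = matplotlib.get_backend().lower()
    if backend in _NON_INTERACTIVE:
        # e.g. an SSH session on a supercomputer with no display
        figure.savefig(fallback_path)
        return fallback_path
    # Attach the already-built figure to a pyplot-managed canvas so that
    # plt.show() can pop up a window for it (the pyplot fallback mentioned
    # in the deliverables below).
    manager = plt.figure().canvas.manager
    manager.canvas.figure = figure
    figure.set_canvas(manager.canvas)
    plt.show()
```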

Deliverables:

  • A proof of concept demonstrating how to hook into matplotlib’s interactive backends using the matplotlib object-oriented API, or, failing that, how to gracefully fall back to using pyplot instead of the object-oriented API.

  • A YTEP describing the proposed approach for modifying yt’s plotting infrastructure to support matplotlib’s interactive plotting backends.

  • The implementation for the YTEP submitted as a bitbucket pull request to the main yt repository.

Improve test coverage and test performance

Suggested Mentor(s): Kacper Kowalik, Nathan Goldbaum

Difficulty: Beginner to Advanced, depending on where the student takes the project

Knowledge needed: Familiarity with the nose testing package.

Programming skills: Python, Cython

Description

Currently yt’s test suite is split between unit tests (which take about 45 minutes to run) and answer tests, which are normally only run on a continuous integration server. Altogether the tests cover only about a third of the yt codebase, so much of the code in yt still needs test coverage. The tests also take a long time to run, and we would like to reduce the test runtime while simultaneously increasing code coverage.

This project could go in a number of directions:

  • Implement a way to retrofit the current tests for different geometries (e.g. cartesian, cylindrical, and spherical coordinates) and data styles (e.g. particle data, as well as various kinds of mesh data, including uniform resolution, octree, patch AMR, and unstructured meshes). Ideally this would allow us to test all functionality for all possible data styles. This will require learning and improving the “Stream” frontend, which allows the ingestion of in-memory data into yt (see the sketch after this list).

  • Identify areas of the code that are not well tested and devise tests for them. This will require measuring the test coverage of yt’s Python and Cython components. The student working on this will need to gain familiarity with untested or undertested parts of the codebase and add new tests. Optimally the new tests will make use of new reusable infrastructure that will be helpful for tests across the yt codebase.

  • Improve volume rendering and visualization unit tests. Right now visualization tests rely heavily on answer testing and image comparison. It would be more flexible and easier to understand when things go wrong if the tests instead compared with a predicted answer using some sort of simplified geometry or via introspection.
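For the first direction, a hedged sketch of what retrofitting might look like using the Stream frontend: build small in-memory datasets in several geometries and reuse one nose-style generator test across all of them. The helper and assertion names are made up for illustration; `yt.load_uniform_grid` and its `geometry`/`bbox` keywords exist in yt, but the details should be checked against the current documentation.

```python
import numpy as np
import yt

# Geometry name -> domain bounding box (radius/angle ranges chosen to be valid)
GEOMETRIES = {
    "cartesian":   np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]),
    "cylindrical": np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 2 * np.pi]]),
    "spherical":   np.array([[0.0, 1.0], [0.0, np.pi], [0.0, 2 * np.pi]]),
}

def _stream_datasets(shape=(16, 16, 16)):
    """Yield small in-memory datasets built through the Stream frontend."""
    data = {"density": np.random.random(shape)}
    for geometry, bbox in GEOMETRIES.items():
        yield yt.load_uniform_grid(data, shape, geometry=geometry, bbox=bbox)

def test_density_positive_all_geometries():
    # nose-style generator test: one check per geometry
    for ds in _stream_datasets():
        yield _check_density_positive, ds

def _check_density_positive(ds):
    ad = ds.all_data()
    assert float(ad["density"].sum()) > 0.0
```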

Deliverables:

  • Develop a framework for measuring test coverage in yt’s Python and Cython components. Triage the reports to look for areas that are user-facing and have poor test coverage.

  • Make a number of pull requests adding tests across the yt codebase.

  • Modify existing testing infrastructure or develop new test infrastructure to improve testing of yt functionality on different data types.

Domain contexts and domain-specific fields

Suggested Mentor(s): Britton Smith, Matthew Turk

Difficulty: Beginner to Intermediate

Knowledge needed: Undergraduate-level physics. More specific knowledge of astronomy, hydrodynamics, finite-element methods, GIS, meteorology, geophysics, or oceanography a plus

Programming skills: Python

Description

The original focus of yt was to analyze datasets from astrophysical simulations. However, use of yt has been expanding to other scientific domains, such as nuclear physics, meteorology, and geophysics. Still, much of the infrastructure within yt is built upon the assumption that the datasets being loaded are astrophysical and hydrodynamic in nature. This assumption informs the choice of derived fields made available to the user as well as the default unit system. For example, fields such as “Jeans mass” and “X-ray emissivity” in CGS units are of little use to an earthquake simulation.

The goal of this project is to develop a system for domain contexts, sets of fields and unit systems associated with specific scientific domains. Rather than having all fields be made available to all datasets, each dataset is given a domain context, which specifies the relevant fields and most meaningful unit system. Domain contexts could also be subclassed to provide further specificity, for example, cosmology as a subclass of astrophysics.

Deliverables:

  • For each of the existing frontends, identify the relevant field plugins. Create a data structure to associate with each frontend that lists only the relevant plugins. Take the field plugin loading machinery, which currently just loops over all plugins, and have it only load plugins relevant to the loaded frontend.

  • With the above as an example, identify and document all of the places in the code where the domain is assumed to be astronomy. Use this to come up with a set of attributes that minimally describe a scientific domain, i.e., list of field plugins, unit system, etc.

  • Write up a YTEP describing the proposed design and ideas for implementation. Should identify an initial set of domain contexts, sort fields into domain contexts, and sketch how frontends should declare needed domain contexts.

  • Create a domain context class with the identified attributes. Implement a base, an astronomy, and possibly a nuclear engineering domain context, and associate them with the existing frontends (a purely illustrative sketch follows this list).
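The domain context class itself could be quite small. A purely illustrative sketch follows; the class names, plugin names, and attributes are placeholders, and the real set would be fixed in the YTEP.

```python
class DomainContext:
    """Base domain context: which field plugins apply to a dataset and which
    unit system is most meaningful for it."""
    field_plugins = ()      # names of relevant field plugin modules
    unit_system = "mks"     # default unit system for derived fields

class AstrophysicsContext(DomainContext):
    # Plugin names here are placeholders, not yt's actual plugin registry keys
    field_plugins = ("fluid", "species", "cosmology", "x_ray")
    unit_system = "cgs"

class NuclearEngineeringContext(DomainContext):
    field_plugins = ("fluid",)
    unit_system = "mks"

# A frontend might then declare something like ``_domain_context =
# AstrophysicsContext``, and the field plugin loader would consult
# ``ds._domain_context.field_plugins`` instead of looping over every plugin.
```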


Enable volume rendering of octree datasets

Suggested Mentor(s): Matthew Turk, Sam Skillman

Difficulty: Intermediate

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with data structures such as octrees and B-trees.

Description

At present, volume rendering in yt works best with patch-based AMR datasets. Extending this to support octree datasets will enable a much greater diversity of data types and formats to be visualized in this way.

This would include several specific, concrete actions:

  1. Development of viewpoint-dependent traversal ordering for octree datasets
  2. Refactoring grid traversal methods to travel along the octree data structure without explicit parentage links (i.e., using built-in neighbor-finding functions)
  3. Optimizing the parallel decomposition of octrees traversed in this way

Implementation of deep image format

Suggested Mentor(s): Matthew Turk, Kacper Kowalik

Difficulty: Advanced

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with z-buffering.

Description

Deep image compositing can be used to create a notion of depth. This could be utilized for multi-level rendering, for example rendering semi-transparent streamlines inside volumes.

This would require:

  1. Developing a sparse image format data container
  2. Utilizing the aforementioned container for multi-level rendering

Volume Traversal

Suggested Mentor(s): Matthew Turk, Sam Skillman

Difficulty: Advanced

Astronomy knowledge needed: None

Programming skills: Familiarity with Python and Cython, and a familiarity with data structures such as octrees and B-trees.

Description

Currently yt uses several objects that utilize brick decomposition, i.e. a process by which overlapping grids are broken apart until a full tessellation of the domain (or data source) is created with no overlaps; this is done via a kD-tree decomposition. This project aims to enhance current capabilities by providing easy mechanisms for volume traversal. There are two components to this: handling tiles of data, and creating fast methods for passing through the data and moving between tiles.

This would require (a sketch of these interfaces follows the list):

  1. Creating a flexible (in terms of ordering) iterator over the “tiles” that compose a given data object
  2. Designing and implementing an object for storing values returned by the aforementioned iterator, which would:
    • Cache a slice of the grid or data object that it operates on
    • Filter particles from the data object it operates on
    • Provide a mechanism for identifying neighbor objects from a given face index
    • Provide mechanisms for quickly generating vertex-centered or cell-centered data
  3. Implementing a mechanism for integrating paths through tiles, which would:
    • Define a method for determining when a ray has left an object
    • Define a method for selecting the next brick to traverse or connect to
    • Update the value of a ray’s direction
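A purely illustrative Python sketch of the interfaces described above (in practice this would likely live in Cython for speed); every name here is hypothetical and does not correspond to existing yt API.

```python
class Tile:
    """One non-overlapping brick produced by the kD-tree decomposition."""

    def __init__(self, data_source, left_edge, right_edge):
        self.left_edge, self.right_edge = left_edge, right_edge
        # Cache the slice of the grid or data object this tile covers (item 2)
        self._data = data_source

    def neighbor(self, face_index):
        """Return the tile sharing the given face, or None at the domain edge
        (item 2); the lookup would walk the kD-tree, omitted here."""
        raise NotImplementedError

    def vertex_centered(self, field):
        """Generate (and cache) vertex-centered data for ``field`` (item 2)."""
        raise NotImplementedError

def traverse_tiles(data_object, ordering="kdtree"):
    """Flexible-ordering iterator over the tiles that compose ``data_object``
    (item 1); ``ordering`` might, for example, be viewpoint-dependent."""
    raise NotImplementedError

def integrate_ray(ray, start_tile):
    """Walk a ray tile-to-tile (item 3): step until the ray leaves the current
    tile, then use ``neighbor`` on the exit face to pick the next brick and
    update the ray's position and direction."""
    raise NotImplementedError
```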