Astropy: Tables

Interact

Objectives

  • Create tables
  • Access data in tables
  • Combining tables
  • Aggregation
  • Masking
  • Reading/writing

Documentation

For more information about the features presented below, you can read the astropy.table docs.

Creating tables

import numpy as np
from astropy.table import Table
# Creating a table from scratch
t1 = Table()
t1['name'] = ['source 1', 'source 2', 'source 3']
t1['flux'] = [1.2, 2.2, 3.1]
# Looking at the table
t1
Table length=3
nameflux
str8float64
source 11.2
source 22.2
source 33.1
# Adding a column
t1['size'] = [1,5,4]
t1
Table length=3
namefluxsize
str8float64int64
source 11.21
source 22.25
source 33.14
# Accessing a column
t1['size']
<Column name='size' dtype='int64' length=3>
1
5
4
# Converting to a Numpy array
np.array(t1['size'])
array([1, 5, 4])
# Accessing a cell
t1['size'][0]
1
# Accessing a row
t1[0]
Row index=0
namefluxsize
str8float64int64
source 11.21

Units in tables

# Set unit on column
t1['size'].unit = 'cm'
t1
Table length=3
namefluxsize
cm
str8float64int64
source 11.21
source 22.25
source 33.14

Some unitful operations will then work:

t1['size'].to('m')

$[0.01,~0.05,~0.04] \; \mathrm{m}$

However, you may run into unexpected behavior, so if you are planning on using table columns as Quantities, we recommend that you use the QTable class:

type(t1['size'])
astropy.table.column.Column
from astropy.table import QTable
qt1 = QTable(t1)
type(qt1['size'])
astropy.units.quantity.Quantity

Challenge

  1. Make a table that contains three columns: spectral type, temperature, and radius, and incude 5 rows with fake data (or real data if you like, for example from here). Try including units on the columns that can have them.
  2. Find the mean temperature and the maximum radius
  3. Try and find out how to add and remove rows
  4. Add a new column which gives the luminosity (using $L=4\pi R^2 \sigma T^4$)
#1
from astropy import units as u
t = QTable()
t['spectral type'] = ['O5', 'B5', 'A5', 'F5', 'G5']
t['radius'] = [12, 3.9, 1.7, 1.3, 0.92] * u.R_sun
t['temperature'] = [45000, 15000, 8200, 6400, 5700] * u.K
t
QTable length=5
spectral typeradiustemperature
solRadK
str2float64float64
O512.045000.0
B53.915000.0
A51.78200.0
F51.36400.0
G50.925700.0
#2
print('Mean temperature:', np.mean(t['temperature']))
print('Maximum radius:', np.mean(t['radius']))
Mean temperature: 16060.0 K
Maximum radius: 3.964000000000001 solRad

#3
t.add_row({'spectral type': 'K5',
           'temperature': 4300 * u.K,
           'radius': 0.72 * u.R_sun})
t.remove_row(0)
t
QTable length=5
spectral typeradiustemperature
solRadK
str2float64float64
B53.915000.0
A51.78200.0
F51.36400.0
G50.925700.0
K50.724300.0
#4
from numpy import pi
from astropy.constants import sigma_sb
t['luminosity'] = (4 * pi * t['radius'] ** 2 * sigma_sb * t['temperature'] ** 4).to(u.L_sun)
t
QTable length=5
spectral typeradiustemperatureluminosity
solRadKsolLum
str2float64float64float64
B53.915000.0693.7250235023215
A51.78200.011.7718945281512
F51.36400.02.554463553120115
G50.925700.00.8049486705065919
K50.724300.00.15967316182594046

Iterating over tables

It is possible to iterate over rows or over columns. To iterate over rows, simply iterate over the table itself:

for row in t1:
    print(row)
  name   flux size
               cm 
-------- ---- ----
source 1  1.2    1
  name   flux size
               cm 
-------- ---- ----
source 2  2.2    5
  name   flux size
               cm 
-------- ---- ----
source 3  3.1    4

Rows can act like dictionaries, so you can access specific columns from a row:

for row in t1:
    print(row['name'])
source 1
source 2
source 3

Iterating over columns is also easy:

for colname in t1.columns:
    column = t1[colname]
    print(column)
  name  
--------
source 1
source 2
source 3
flux
----
 1.2
 2.2
 3.1
size
 cm 
----
   1
   5
   4

Accessing specific rows from a column object can also be done with the item notation:

for colname in t1.columns:
    column = t1[colname]
    print(column[0])
source 1
1.2
1

Joining tables

from astropy.table import join
t2 = Table()
t2['name'] = ['source 1', 'source 3']
t2['flux2'] = [1,9]
t3 = join(t1, t2, join_type='outer')
t3
Table masked=True length=3
namefluxsizeflux2
cm
str8float64int64int64
source 11.211
source 22.25--
source 33.149
np.mean(t3['flux2'])
5.0

Masked tables

t4 = Table(masked=True)
t4['name'] = ['source 1', 'source 2', 'source 3']
t4['flux'] = [1.2, 2.2, 3.1]
t4['flux'].mask = [1,0,1]
t4
Table masked=True length=3
nameflux
str8float64
source 1--
source 22.2
source 3--

Slicing

Tables can be sliced like Numpy arrays:

obs = Table.read("""name    obs_date    mag_b  mag_v
                    M31     2012-01-02  17.0   17.5
                    M31     2012-01-02  17.1   17.4
                    M101    2012-01-02  15.1   13.5
                    M82     2012-02-14  16.2   14.5
                    M31     2012-02-14  16.9   17.3
                    M82     2012-02-14  15.2   15.5
                    M101    2012-02-14  15.0   13.6
                    M82     2012-03-26  15.7   16.5
                    M101    2012-03-26  15.1   13.5
                    M101    2012-03-26  14.8   14.3
                    """, format='ascii')
obs[1:4]
Table length=3
nameobs_datemag_bmag_v
str4str10float64float64
M312012-01-0217.117.4
M1012012-01-0215.113.5
M822012-02-1416.214.5
obs[obs['mag_b'] > 16]
Table length=4
nameobs_datemag_bmag_v
str4str10float64float64
M312012-01-0217.017.5
M312012-01-0217.117.4
M822012-02-1416.214.5
M312012-02-1416.917.3
obs['mag_b', 'mag_v']
Table length=10
mag_bmag_v
float64float64
17.017.5
17.117.4
15.113.5
16.214.5
16.917.3
15.215.5
15.013.6
15.716.5
15.113.5
14.814.3

Challenge

Starting from the obs table:

  1. Make a new table that shows every other row, starting with the second row? (that is, the second, fourth, sixth, etc. rows).
  2. Make a new table the only contains rows where name is M31
#1
subset1 = obs[1::2]
subset1
Table length=5
nameobs_datemag_bmag_v
str4str10float64float64
M312012-01-0217.117.4
M822012-02-1416.214.5
M822012-02-1415.215.5
M822012-03-2615.716.5
M1012012-03-2614.814.3
#2
subset2 = obs[obs['name'] == 'M31']
subset2
Table length=3
nameobs_datemag_bmag_v
str4str10float64float64
M312012-01-0217.017.5
M312012-01-0217.117.4
M312012-02-1416.917.3

Grouping and Aggregation

obs_by_name = obs.group_by('name')
obs_by_name
Table length=10
nameobs_datemag_bmag_v
str4str10float64float64
M1012012-01-0215.113.5
M1012012-02-1415.013.6
M1012012-03-2615.113.5
M1012012-03-2614.814.3
M312012-01-0217.017.5
M312012-01-0217.117.4
M312012-02-1416.917.3
M822012-02-1416.214.5
M822012-02-1415.215.5
M822012-03-2615.716.5
for group in obs_by_name.groups:
    print(group)
    print("")
name  obs_date  mag_b mag_v
---- ---------- ----- -----
M101 2012-01-02  15.1  13.5
M101 2012-02-14  15.0  13.6
M101 2012-03-26  15.1  13.5
M101 2012-03-26  14.8  14.3

name  obs_date  mag_b mag_v
---- ---------- ----- -----
 M31 2012-01-02  17.0  17.5
 M31 2012-01-02  17.1  17.4
 M31 2012-02-14  16.9  17.3

name  obs_date  mag_b mag_v
---- ---------- ----- -----
 M82 2012-02-14  16.2  14.5
 M82 2012-02-14  15.2  15.5
 M82 2012-03-26  15.7  16.5


obs_by_name.groups.aggregate(np.mean)
Table length=3
namemag_bmag_v
str4float64float64
M10115.00000000000000213.725000000000001
M3117.017.400000000000002
M8215.69999999999999815.5

Writing data

obs.write('test.fits', overwrite=True)
obs.write('test.vot', format='votable', overwrite=True)

Reading data

t4 = Table.read('2mass.tbl', format='ascii.ipac')
t4
Table masked=True length=929
radecclonclaterr_majerr_minerr_angdesignationj_mj_cmsigj_msigcomj_snrh_mh_cmsigh_msigcomh_snrk_mk_cmsigk_msigcomk_snrph_qualrd_flgbl_flgcc_flgndetgal_contammp_flgdistanglej_hh_kj_k
degdegarcsecarcsecdegmagmagmagmagmagmagmagmagmag
float64float64str12str13float64float64int64str16float64float64float64float64float64float64float64float64float64float64float64float64str3str3str3str3str6int64int64float64float64float64float64float64
274.429506-13.87054718h17m43.08s-13d52m13.97s0.080.084518174308-135213916.3050.1420.1436.714.0480.1070.10813.613.2570.0660.06616.5CAA2221110ss06665500975.080151256.4482.2570.7913.048
274.423821-13.8697418h17m41.72s-13d52m11.06s0.060.069018174171-135211014.8020.0580.05926.712.6350.0590.0650.111.7680.0450.04665.2AAA2221110ss66666600993.752042256.8782.1670.8673.034
274.424587-13.73962918h17m41.90s-13d44m22.66s0.080.084518174190-134422616.328------14.3450.0590.0610.413.4050.0460.04714.4UAA0220110cc00366600995.726698284.113--0.94--
274.433933-13.76950218h17m44.14s-13d46m10.21s0.080.084518174414-134610216.2810.0980.0996.814.0570.0350.03613.512.9560.0320.03321.8CAA22211100006556600942.627418278.2522.2241.1013.325
274.437013-13.88569818h17m44.88s-13d53m08.51s0.090.094518174488-135308515.171------14.4120.1520.1529.813.7420.0950.09510.6UBA6220220cc00556600964.105389252.93--0.67--
274.433996-13.75244618h17m44.16s-13d45m08.81s0.080.089018174415-134508816.54------14.5190.0830.0838.813.6040.0430.04412.0UBA0220110cc00566600953.230532281.908--0.915--
274.418138-13.7721518h17m40.35s-13d46m19.74s0.080.089018174035-134619717.98------14.610.0430.0448.113.4560.0560.05713.8UBA02201100000164500996.047248277.25--1.154--
274.433695-13.89904918h17m44.09s-13d53m56.58s0.060.069018174408-135356513.0110.0210.024139.010.9170.020.021243.810.0130.0170.019328.3AAA22211100066666600990.166399250.4662.0940.9042.998
274.425482-13.7714918h17m42.12s-13d46m17.36s0.080.0813518174211-134617316.086------13.7090.0650.06618.612.5030.0440.04533.1UAA62201200c00555500970.896919277.582--1.206--
................................................................................................
274.81801-14.00124518h19m16.32s-14d00m04.48s0.180.16118191632-140004416.240.1130.1135.615.5310.1640.1642.515.252------CDU22011000006060000809.817146149.610.709----
274.822709-14.03725418h19m17.45s-14d02m14.11s0.070.074518191745-140214115.9990.0970.0987.014.0090.0320.03310.013.0770.0350.03616.4CAA22211100006265600931.339773152.7791.990.9322.922
274.880758-13.9995618h19m31.38s-13d59m58.42s0.060.069018193138-135958414.1630.0350.03737.811.1790.020.021135.69.7650.0170.019347.1AAA22211100055666600935.512452137.7622.9841.4144.398
274.652526-14.05510618h18m36.61s-14d03m18.38s0.060.069018183660-140318315.0350.0520.05419.413.0990.040.04127.512.2540.0410.04141.7AAA222111c0056666600908.109808190.6821.9360.8452.781
274.760586-13.99992718h19m02.54s-13d59m59.74s0.080.089018190254-135959716.3290.1220.1235.514.4880.0670.0676.413.6170.0510.05211.1CCA22211100006061600724.557553163.2271.8410.8712.712
274.831132-14.02002718h19m19.47s-14d01m12.10s0.080.084518191947-140112016.203------13.2380.020.02120.412.0160.0230.02443.6UAA02201100000666600891.347132149.27--1.222--
274.972435-13.76037418h19m53.38s-13d45m37.35s0.120.111018195338-134537317.472------16.755------14.4130.0840.0844.8UUD00200100000000600964.82893379.963------
274.870009-13.81777518h19m28.80s-13d49m03.99s0.080.084518192880-134903916.933------14.5140.0640.0656.312.9570.0410.04118.4UCA02201100000266600592.99805893.69--1.557--
274.735323-13.94157518h18m56.48s-13d56m29.67s0.140.144518185647-135629616.643------14.88------14.2910.1160.1176.0UUC00200100000000400498.524438165.968------
274.866294-13.84177818h19m27.91s-13d50m30.40s0.080.084518192791-135030415.615------13.9110.0750.07510.912.7650.1340.13421.9UAE0220110cc00554500591.97725102.147--1.146--

Challenge

Using the t4 table above:

  1. Make a plot that shows j_m-h_m on the x-axis, and h_m-k_m on the y-axis

  2. Make a new table that contains the subset of rows where the j_snr, h_snr, and k_snr columns, which give the signal-to-noise-ratio in the J, H, and K band, are greater than 10, and try and show these points in red in the plot you just made.

  3. Make a new table (based on the full table) that contains only the RA, Dec, and the j_m, h_m and k_m columns, then try and write out this catalog into a format that you can read into another software package. For example, try and write out the catalog into CSV format, then read it into a spreadsheet software package (e.g. Excel, Google Docs, Numbers, OpenOffice). You may run into an issue at this point - if so, take a look at https://github.com/astropy/astropy/issues/7357 to see how to fix it.

#1
import matplotlib.pyplot as plt
plt.scatter(t4['j_m'] - t4['h_m'], t4['h_m'] - t4['k_m'], )
<matplotlib.collections.PathCollection at 0x7f696f91a630>
#2
subset = t4[(t4['j_snr'] > 10) & (t4['h_snr'] > 10) & (t4['k_snr'] > 10)]
subset
Table masked=True length=264
radecclonclaterr_majerr_minerr_angdesignationj_mj_cmsigj_msigcomj_snrh_mh_cmsigh_msigcomh_snrk_mk_cmsigk_msigcomk_snrph_qualrd_flgbl_flgcc_flgndetgal_contammp_flgdistanglej_hh_kj_k
degdegarcsecarcsecdegmagmagmagmagmagmagmagmagmag
float64float64str12str13float64float64int64str16float64float64float64float64float64float64float64float64float64float64float64float64str3str3str3str3str6int64int64float64float64float64float64float64
274.423821-13.8697418h17m41.72s-13d52m11.06s0.060.069018174171-135211014.8020.0580.05926.712.6350.0590.0650.111.7680.0450.04665.2AAA2221110ss66666600993.752042256.8782.1670.8673.034
274.433695-13.89904918h17m44.09s-13d53m56.58s0.060.069018174408-135356513.0110.0210.024139.010.9170.020.021243.810.0130.0170.019328.3AAA22211100066666600990.166399250.4662.0940.9042.998
274.431606-13.78187718h17m43.59s-13d46m54.76s0.060.064518174358-134654713.870.0320.03463.013.4060.060.06124.613.3650.0870.08815.0AAA222111ccc66666600945.318343275.5080.4640.0410.505
274.433361-13.89224618h17m44.01s-13d53m32.09s0.060.064518174400-135332015.1510.0590.0619.413.370.0640.06525.512.5990.0480.04930.3AAA22211100056666600983.384329251.8341.7810.7712.552
274.427483-13.76861218h17m42.60s-13d46m07.00s0.060.069018174259-134607014.4230.0410.04337.913.9260.0640.06515.313.7440.0520.05310.6AAA222222ccc66666600965.406417278.2470.4970.1820.679
274.43155-13.88311118h17m43.57s-13d52m59.20s0.060.069018174357-135259115.7020.0840.08511.713.7250.0910.09118.412.9080.0690.0722.8AAA222222ccc26666600979.746164253.7781.9770.8172.794
274.432195-13.72343318h17m43.73s-13d43m24.36s0.060.069018174372-134324313.8430.0190.02364.611.3630.020.021161.710.2430.0190.021265.6AAA22211100066666600986.230333287.7792.481.123.6
274.429849-13.83867218h17m43.16s-13d50m19.22s0.060.064518174316-135019215.7260.0690.0711.413.9110.0530.05315.513.0250.0690.0720.5AAA222111c0016666600953.670451263.1511.8150.8862.701
274.427993-13.82985618h17m42.72s-13d49m47.48s0.060.069018174271-134947413.3070.0290.032105.812.9190.0630.06338.612.7950.0650.06525.3AEA222212c0c66556600956.908469265.0840.3880.1240.512
................................................................................................
274.788401-13.95384718h19m09.22s-13d57m13.85s0.060.069018190921-135713814.9910.0410.04318.912.7160.0360.03732.911.6880.0240.02665.5AAA222111ccc66665500610.307615149.8752.2751.0283.303
274.797491-14.06662418h19m11.40s-14d03m59.85s0.060.069018191139-140359813.9650.0360.03848.613.0190.0530.05324.912.610.0420.04328.0AAA22211100055556600993.135358160.1090.9460.4091.355
274.804877-13.9126818h19m13.17s-13d54m45.65s0.060.064518191317-135445614.3560.0350.03733.913.5040.0550.05615.913.2730.0490.0515.2AAA22211100066665600525.943465136.2140.8520.2311.083
274.900945-13.90323818h19m36.23s-13d54m11.66s0.060.069018193622-135411612.4540.0280.03182.311.810.030.03175.811.360.0240.02679.9AAA22211100055665500780.501247116.3080.6440.451.094
274.836101-14.03753618h19m20.66s-14d02m15.13s0.060.064518192066-140215115.2170.0610.06214.313.3580.0620.06218.212.4890.0510.05228.2AAA22211100026666600954.544823150.311.8590.8692.728
274.862392-13.8457318h19m26.97s-13d50m44.63s0.060.069018192697-135044614.3690.0410.04331.212.460.030.03141.711.5610.0260.02766.4AAA22211100066663300581.867714103.7991.9090.8992.808
274.858201-13.91829418h19m25.97s-13d55m05.86s0.060.064518192596-135505814.1560.0520.05438.012.130.0510.05256.511.2150.0370.03891.3AAA222111ccc66666600680.283673126.0152.0260.9152.941
274.611341-14.05634718h18m26.72s-14d03m22.85s0.060.069018182672-140322815.0660.070.07118.913.3670.0910.09121.512.6340.0650.06529.4AAA222111ccc66666600949.652829199.191.6990.7332.432
274.880758-13.9995618h19m31.38s-13d59m58.42s0.060.069018193138-135958414.1630.0350.03737.811.1790.020.021135.69.7650.0170.019347.1AAA22211100055666600935.512452137.7622.9841.4144.398
274.652526-14.05510618h18m36.61s-14d03m18.38s0.060.069018183660-140318315.0350.0520.05419.413.0990.040.04127.512.2540.0410.04141.7AAA222111c0056666600908.109808190.6821.9360.8452.781
#2 (continued)
import matplotlib.pyplot as plt
plt.scatter(t4['j_m'] - t4['h_m'],
            t4['h_m'] - t4['k_m'],
            s=5, color='black')
plt.scatter(subset['j_m'] - subset['h_m'],
            subset['h_m'] - subset['k_m'],
            s=30, color='red', alpha=0.5)
<matplotlib.collections.PathCollection at 0x7f696f80ac18>

png

#3
simple = t4['ra', 'dec', 'j_m', 'h_m', 'k_m']
simple
Table masked=True length=929
radecj_mh_mk_m
degdegmagmagmag
float64float64float64float64float64
274.429506-13.87054716.30514.04813.257
274.423821-13.8697414.80212.63511.768
274.424587-13.73962916.32814.34513.405
274.433933-13.76950216.28114.05712.956
274.437013-13.88569815.17114.41213.742
274.433996-13.75244616.5414.51913.604
274.418138-13.7721517.9814.6113.456
274.433695-13.89904913.01110.91710.013
274.425482-13.7714916.08613.70912.503
...............
274.81801-14.00124516.2415.53115.252
274.822709-14.03725415.99914.00913.077
274.880758-13.9995614.16311.1799.765
274.652526-14.05510615.03513.09912.254
274.760586-13.99992716.32914.48813.617
274.831132-14.02002716.20313.23812.016
274.972435-13.76037417.47216.75514.413
274.870009-13.81777516.93314.51412.957
274.735323-13.94157516.64314.8814.291
274.866294-13.84177815.61513.91112.765
#3 (continued)
simple.write('2mass_subset.csv', format='ascii.csv', overwrite=True, comment='#')