Basic Astronomical CCD Image Reduction

This notebook shows an example of basic astronomical image reduction with V-band Orion Nebula images and their associated bias and V-band flat frames. Prior to working through this example, you should be familiar with the types of noise that contribute to CCD images.

Familiarity with python is not assumed, however, where steps are generalized in the text, the comments in the code (look for the hashtags) fill in many of the specifics.

By Amanda Townsend, University of Florida, February 2017

The first step is to import all the python packages we'll be using:

In [1]:
import numpy as np

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
%matplotlib inline

import glob
from astropy.io import fits
from scipy.ndimage import interpolation as interp

from skimage.feature.register_translation import (register_translation, _upsampled_dft)


## This turns off warnings: not a great way to code
## But when we show the images, sometimes we're taking the logarithm of zero and it doesn't like that
## Which would matter if we were doing math, but we're just taking a look at images, so we can ignore it. 
import warnings
warnings.filterwarnings('ignore')
/home/ajtownsend/anaconda2/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')

Next we're going to define a function to display a set of images in a grid, so that we can get an overview of a bunch of images all at the same time:

In [2]:
## function to plot an image cube
## for this code a "cube" means a stack of image data arrays

def plot_grid(datacube,imagenames):
    no_A = len(datacube) ## number of individual images in the cube
    xplots = int(np.around(np.sqrt(no_A))) ## number of image grid columns
    yplots = xplots + 1 ## number of image grid rows--sometimes there are one fewer, but that's okay

#     print no_A, xplots, yplots ## this line is for troubleshooting
    
    gs = gridspec.GridSpec(yplots, xplots) ## define the image grid
    plt.figure(figsize=(15,15)) ## set the figure size
    for i in range(no_A): 
        ## plots each individual image within the image grid: 
        B = datacube[i]
        plt.subplot(gs[i])
        plt.imshow(np.log10(B), origin='lower', cmap='gray')
        plt.title(imagenames[i])

Now we'll look at the images we have in our data folder:

In [3]:
ls *fit && ls *.fits 
bias10.fit*  bias3.fit*  bias5.fit*  bias7.fit*  bias9.fit*
bias2.fit*   bias4.fit*  bias6.fit*  bias8.fit*  bias.fit*
m42.319.fits  m42.321.fits  m42.323.fits    vflat.421.fits  vflat.423.fits
m42.320.fits  m42.322.fits  vflat.420.fits  vflat.422.fits

A few things to note:

  • These are biases instead of darks, so they've got the readout noise but not the dark current. For darks, we would use exactly the same process, but we'd have to keep track of exposure times as well.
  • Some of the files have different names and different extensions. Functionally, there's not a difference between .fits files and .fit files, but it's good to check and make sure we've got the correct names before we try to read in your files. If we tried to load "bias10.fits" we wouldn't find anything, because the file in this folder is called "bias10.fit"!
  • Related: it's generally useful to think about how we name our files before we even start taking images. A lot of current image reduction software will expect all of the images to be in the same folder, and will expect images to have a common name and only differing numbers. (ie: filename01.fits, filename02.fits, filename03.fits...)
  • It's also good practice not to put spaces or dots in the image filenames. Underscores and dashes are good alternatives!

Which brings us to one of the challenges in image reduction--and really, astronomy and science in general. We have to keep track of which images/data we have, which ones we want to use at any given time, and which ones we've edited (and how).

In this case we'll do a few really simple things:

  • Every time we need to apply some step in the reduction process to a set of images, we'll create a list for that set of images.
  • Every time we edit data, we'll save it to a new (but similar) name.
  • There's also a cheat sheet at the very end of this notebook with all of the lists and dictionaries that we're using, to keep track.

(I try to be very careful when saving files: if you write over your original data, you may lose it! I always try to keep a backup of the original data in a separate folder/on a different disk, and be careful what I name the files I'm saving.)

In this case, we'll start with lists of the different types of images we're using: biases, flats, and science images. These lists DO NOT contain any image data, they are just strings with the names of the files in them.

In [4]:
## make a list of the bias files:
bias_list = glob.glob('bias*fit')
# print biaslist ##this line is for troubleshooting

## make a list of v-filter flat images:
vflat_list = glob.glob('vflat*fits')
# print vflatlist ## for troubleshooting

## make a list of raw science images:
## in this case, the raw science frames are v-band images of the orion nebula (messier 42)
m42_list = glob.glob('m42*fits')
# print m42list ## for troubleshooting

## now put all the lists together for a masterlist of all the images:
all_images_list = bias_list + vflat_list + m42_list

Now we can read in the data for the images!

For the master list of images, we'll create a python dictionary which has the filename as a 'key' and the array of pixel values as a 'value' for each image.

In [5]:
raw_image_data = {}
for image_name in all_images_list: raw_image_data[image_name] = fits.getdata(image_name)

# for image in all_images_list: print raw_image_data[image].shape ##for troubleshooting
## (check to make sure they're all the same size)

We can choose any subset of image data by referencing the image dictionary with any of the other lists, for example: raw_image_data[m42_list] will return the image data for all of the science images.

We want to start with the bias frames so that we can subtract the readout noise from the rest of our images. For this example we're going to pull out the bias images from the raw_image_data dictionary defined above. Each image is a 2-dimensional array of pixel values, and we're going to stack them all up in a 3-dimensional array (which is called a cube for naming purposes in the code).

In [6]:
## create an array of bias images
biascube = np.stack([raw_image_data[bias_frame] for bias_frame in bias_list],axis=0)
#print biascube.shape ## this line is for troubleshooting

Now we'll use the handy-dandy plot_grid() function from above to take a look at all of our bias images. This is a good time to double check that we've got all the images we want and only the images we want before we proceed to the next step in the reduction process.

In [7]:
## show the bias image(s)
plot_grid(biascube,bias_list)

Some of the things you might notice in the bias images above are hot pixels, bad columns on the CCD, and some variation in brightness (noise) across the CCD chip due to how the charge is shifted across the CCD in the readout process.

We can combine and take a look at our flats and science frames the same way:

In [8]:
## create an array of v-flat images
vflatcube = np.stack([raw_image_data[vflat_frame] for vflat_frame in vflat_list],axis=0)

## create an array of raw M42 images
m42cube = np.stack([raw_image_data[science_frame] for science_frame in m42_list],axis=0)
In [9]:
## show the v-flat image(s)
plot_grid(vflatcube,vflat_list)