.. _tutorial:

*********************************************
Tutorial for the *MASCARA* reduction pipeline
*********************************************

Working principle
=================

Every *MASCARA* image has a **lst-idx** keyword in its header. Each day is
divided into intervals of one exposure time in Local Sidereal Time (LST),
and the lst-index gives the index of the exposure within that day. Two
images with identical lst-index are therefore exposed at the same Local
Sidereal Time. The lst-index is defined in
:py:func:`mascara.observer.Site.local_sidereal_time`.

The reduction pipeline processes the images in batches of 50. The starting
and end points of each batch are given by the **lst-idx** keyword (a small
sketch of this bookkeeping is given at the end of this section):

- lst-idx mod 50 equals 0  => start of a batch
- lst-idx mod 50 equals 49 => end of a batch

The batches are all treated the same way:

+------------------+----------------------------------+-----------------------------+-----------------------------------+
| lst-idx mod 50   | 0                                | 1 to 48                     | 49                                |
+==================+==================================+=============================+===================================+
|                  | open image                       | open image                  | open image                        |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | get date and lst-idx from header | get date and lst-idx        | get date and lst-idx              |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | calculate altitude and           | calculate alt and az        | calculate alt and az              |
|                  | azimuth coordinates              |                             |                                   |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | calculate the astrometric        |                             |                                   |
|                  | corrections for that batch       |                             |                                   |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | calculate x and y positions      | calculate x and y positions | calculate x and y positions       |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | initialise the stacker           | add image to stacker        | add image to stacker              |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | do photometry                    | do photometry               | do photometry                     |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  | save in dictionary               | update dictionary           | update dictionary                 |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  |                                  |                             | bin images                        |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  |                                  |                             | calculate astrometric corrections |
|                  |                                  |                             | on the binned image               |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  |                                  |                             | calculate x and y positions       |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  |                                  |                             | do photometry on binned image     |
+                  +----------------------------------+-----------------------------+-----------------------------------+
|                  |                                  |                             | save all dictionaries             |
+------------------+----------------------------------+-----------------------------+-----------------------------------+

For each star, the dictionaries contain the following information:

- the stellar identifier, the *ASCC* number from the Kharchenko 2009+ catalogue
- the stellar coordinates (ra, dec)
- the V and B magnitudes
- the stellar spectral type
- the N points of the light curve
- the stellar positions on the CCD for these N images
- the JDmid date, defined as the Julian date at mid-exposure
- the lst date

At the end of a batch of 50 images, the dictionaries are split according to
the stellar identifier. This process takes place in parallel to the
reduction described above. Finally, at the end of the night, all light
curve pieces are combined together and saved into a .hdf5 file.
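To make the batch bookkeeping concrete, here is a minimal sketch of how the
**lst-idx** boundaries can be checked. The function name is illustrative,
not part of the pipeline:

>>> def batch_position(lstidx):
...     """Return 'start', 'end' or 'inside' for a given lst-idx."""
...     if lstidx % 50 == 0:
...         return 'start'
...     elif lstidx % 50 == 49:
...         return 'end'
...     return 'inside'
>>> batch_position(150)
'start'
>>> batch_position(199)
'end'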
Starting the reduction of existing data
=======================================

Requirements
------------

Starting the reduction of existing data requires that:

- the directories are named after 'YYYYMMDD' + site identifier + camera
  identifier. For instance, 20150202LPE, 20150202LPS ... for the night
  2015-02-02 at the La Palma site, for the East and South cameras;
- the raw files are in a subdirectory called 'raw';
- the calibration files are identified by their names, e.g. 'calib_bias.fits'.

Command lines
-------------

The python program **mascprocess** is recommended for reducing existing
data. It is written to be launched directly from the command line. It can
simultaneously reduce:

- a set of nights for one camera,
- the same set of nights for a subset of the cameras,
- or one night for all five cameras.

These options are entered as keywords for **mascprocess**. The program
accepts the following keywords:

- night, to specify the night(s) to reduce. If more than one night,
  separate the nights with a *+* and no space;
- camera, to give the initial of the camera you wish to reduce;
- site, if more than one site is available, to sort the data by observing site.

Additional keywords are:

- quiet, to avoid (most of) the screen prints;
- verbose, to get more of them;
- overwrite, to erase all existing directories (apart from *raw*). The
  default is False, in which case the existing directories are renamed as
  *???_old*.

As a result, the calling sequence is:

- for one camera and four nights, with overwrite:

>>> python mascprocess.py /datadir/ --camera E --night 20150202+20150203+20150214+20150322 -o

- for one night and all cameras:

>>> python mascprocess.py /datadir/ --night 20150203

- for a mix of both:

>>> python mascprocess.py /datadir/ --camera E+S --night 20150203+20150204 -q

Reduction functions
===================

The reduction relies on two main functions, both in the
:mod:`mascara.reduction` module:

- :func:`mascara.reduction.Reduction`
- :func:`mascara.reduction.SaveItAll`

The former is in charge of the reduction itself, as described in the table
above. Every 50 images (or when lstidx mod 50 == 0), **Reduction** saves
the light curves of all the observed stars in a temporary .h5 file. The
name of the file and its location are then communicated to the other
function, :func:`mascara.reduction.SaveItAll`. This function is in charge
of all reading and writing to disk. Since those operations are the
bottleneck of the mascara reduction procedure, the two functions are
designed to run simultaneously via parallel processing. **SaveItAll**
reads in the temporary file and extracts from it the light curves of a few
stars, which are used later on as a diagnostic tool. Furthermore, at the
end of the night, **SaveItAll** is responsible for combining all temporary
files into one .hdf5 file.
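The division of labour between the two functions follows a classic
producer/consumer pattern. The sketch below illustrates the idea with the
:mod:`multiprocessing` module; the function bodies and call signatures are
simplified stand-ins, not the actual ones from :mod:`mascara.reduction`::

    import multiprocessing as mp

    def reduction(queue):
        # Stand-in for mascara.reduction.Reduction: after every batch of
        # 50 images it would write a temporary .h5 file and hand the file
        # name over for archiving.
        for batch in range(3):                  # illustrative: 3 batches
            tmpname = 'tmp_batch%i.h5' % batch  # hypothetical file name
            queue.put(tmpname)
        queue.put(None)                         # signal the end of the night

    def saveitall(queue):
        # Stand-in for mascara.reduction.SaveItAll: consumes the temporary
        # files and would combine them into one .hdf5 file at the end.
        while True:
            tmpname = queue.get()
            if tmpname is None:
                break
            print 'reading and archiving', tmpname

    if __name__ == '__main__':
        queue = mp.Queue()
        writer = mp.Process(target=saveitall, args=(queue,))
        writer.start()
        reduction(queue)   # the reduction runs in the main process
        writer.join()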
Final file format
=================

The .hdf5 files containing the raw light curves all share the same internal
format. There are 3 groups inside each hdf5 file:

- data
- header
- header_table

and one dataset called *global*. *Global* contains all the information
common to all stars inside the file, such as the camera properties and the
station's location.

Accessing the content of an hdf5 file is pretty easy with python.

>>> import h5py
>>> myfile = h5py.File('fLC_20150719LPC.hdf5', 'r')
>>> print myfile.keys()
[u'data', u'header', u'header_table', u'global']

The u'data' means that the key is coded in unicode format. To access the
content under the *global* key:

>>> print myfile['global'].attrs.keys()
[u'STATION', u'CAMERA', u'ALT0', u'AZ0', u'TH0', u'X0', u'Y0', u'AVERSION',
 u'RVERSION', u'CVERSION', u'EXPTIME', u'CCDTEMP', u'NAPER', u'APER0',
 u'APER1', u'NSTAPER', u'SKYRAD0', u'SKYRAD1']
>>> print myfile['global'].attrs.get('CAMERA')
'Central'
>>> print myfile['global'].attrs.get('APER0')
'2.5'

The data group
--------------

The *data* group is indexed by the ASCC stellar identifier number. As such,
there can be up to 25 000 unique keys in the *data* group. Each entry holds
the entire light curve of that star for the night. To retrieve the light
curve of one star:

>>> import h5py
>>> import numpy as np
>>> myf = h5py.File('fLC_20150719LPC.hdf5', 'r')
>>> mystar = '100007'
>>> mydata = np.copy(myf['data'][mystar])   # making a copy of the entry is optional

We have now retrieved a record array. This type of numpy array works like a
dictionary, where each column is stored under a specific keyword. The
keywords used for the light curve arrays are identical for all stars. To
list them:

>>> print mydata.dtype.names
('flag', 'flux0', 'eflux0', 'flux1', 'eflux1', 'sky', 'esky', 'peak', 'x',
 'y', 'alt', 'az', 'ccdtemp', 'exptime', 'jdmid', 'lst', 'lstidx', 'lstseq')
>>> print mydata['flux0'].mean()
5525.723444

But now we may want to know more about that star. This information is found
in the header group.

The header group
----------------

As in the data group, the header information is stored in a record array.
The information is extracted here just as from a normal fits header. Let's
say we want to extract the header of our previous star:

>>> hdr = myf['header'][mystar]
>>> print hdr.dtype.names
('jdstart', 'ra', 'dec', 'vmag', 'bmag', 'spectype', 'blend', 'blendvalue', 'nobs')
>>> print hdr['ra']
array([ 3.30304445])
>>> print hdr['spectype']
array(['AM'], dtype='|S9')
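As a closing example, the *data* and *header* groups can be combined. The
following sketch, which only assumes the file and keywords shown above,
prints the mean flux of the first few stars together with their catalogue
magnitude:

>>> for star in myf['data'].keys()[:3]:
...     lc = myf['data'][star]
...     hdr = myf['header'][star]
...     print star, hdr['vmag'], lc['flux0'].mean()

Do not forget to close the file when you are done:

>>> myf.close()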