Tutorial Step 4: Using the example API

This tutorial picks up where Step 3 left off.

In Step 3, we looked at how data quality information is stored in a GWOSC data file. Now that you know the layout, you may want to wrap some of what you learned into a few handy functions.

Load one data file

As an example, install ReadLIGO with pip. In addition to reading HDF5 files, ReadLIGO also reads gravitational-wave frame files, like those described in the GWpy documentation.

Create a new file called use_readligo.py, and drop the following code into it:

import numpy as np
import matplotlib.pyplot as plt
import readligo as rl

To read in all data from a single GWOSC data file, you can use the loaddata() function:

#----------------------------------------------------------------
# Load all GWOSC data from a single file 
#----------------------------------------------------------------
strain, time, chan_dict = rl.loaddata(
                          'H-H1_LOSC_4_V1-815411200-4096.hdf5', 'H1')

The function loaddata() returns the strain time series, the GPS time of each sample, and a dictionary of all of the data quality flags in the file. As in the previous step of this tutorial, each data quality channel in chan_dict is stored as a 1 Hz time series.
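
To get a feel for the 1 Hz data quality channels, the sketch below summarizes a dictionary shaped like chan_dict. The dictionary contents and the helper name flag_fraction are made up for illustration; real channels come from loaddata(). It assumes each channel is a sequence of 0/1 values with one entry per second.

```python
def flag_fraction(channel):
    """Return the fraction of seconds in which a 1 Hz flag is set."""
    channel = list(channel)
    if not channel:
        return 0.0
    return sum(1 for value in channel if value) / len(channel)


# Hypothetical stand-in for chan_dict (8 seconds of data, invented values)
example_chan_dict = {
    'DATA':        [1, 1, 1, 1, 0, 0, 1, 1],
    'CBCLOW_CAT2': [1, 1, 0, 1, 0, 0, 1, 1],
}

for name, channel in example_chan_dict.items():
    print(name, flag_fraction(channel))
```

With a real file, you could pass chan_dict in place of example_chan_dict to see how much of the file passes each flag.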

GWOSC data files contain gaps where strain data are not available; the DATA flag is 1 at times when strain data are present. To loop over the strain segments that contain usable data:

slice_list = rl.dq_channel_to_seglist(chan_dict['DATA'])
# -- Use a loop variable that does not shadow the built-in slice type
for seg in slice_list:
    time_seg = time[seg]
    strain_seg = strain[seg]
    # -- Do stuff with strain segment here
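
To see why the loop above can index time and strain directly, here is a minimal sketch of how a 1 Hz flag can be turned into array slices. This is not ReadLIGO's implementation; the function name flag_to_slices is hypothetical, and the default sample rate of 4096 Hz is an assumption that should be replaced by the rate of your file.

```python
def flag_to_slices(channel, fs=4096):
    """Convert a 1 Hz 0/1 flag into slices that index an array sampled at fs Hz.

    Each run of consecutive seconds where the flag is set becomes one slice.
    """
    slices = []
    start = None  # second at which the current run began, if any
    for second, value in enumerate(channel):
        if value and start is None:
            start = second
        elif not value and start is not None:
            slices.append(slice(start * fs, second * fs))
            start = None
    if start is not None:
        # The flag was still set at the end of the channel
        slices.append(slice(start * fs, len(channel) * fs))
    return slices


# Tiny example: 5 seconds of flag data at an artificial 4 samples per second
print(flag_to_slices([1, 1, 0, 0, 1], fs=4))
```

In the toy example, seconds 0-1 and second 4 are flagged, so the result covers samples 0 to 8 and 16 to 20.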

If you are working with a large data set, a common use case is to start with a "Segment List", a list of GPS time intervals that we are interested in analyzing. The API provides support for creating segment lists, as well as for reading in data from a given GPS segment.

Load data using a segment list

The functions getstrain() and getsegs() only work with data files that begin at GPS times which are integer multiples of 4096 seconds. This is not true for releases of individual events. To load a single file, please see the example above.

Let's download a few GWOSC data files to see how the API works.

  1. Go to the S5 Data Archive
  2. Choose the S5 data set and the H1 detector
  3. Enter start time 842656000 and end time 842670000
  4. Click Continue to get a list of data files. You should see 4 data files
  5. Click HDF5 to download all 4 data files, and save them in the same directory where you have been working.
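
The count of 4 files follows from the 4096-second alignment described above. The sketch below lists the file start times that cover a GPS range. It assumes that each file also spans 4096 seconds; the function name file_start_times is hypothetical and is not part of ReadLIGO.

```python
FILE_DURATION = 4096  # seconds; assumed length of each data file


def file_start_times(start, stop, duration=FILE_DURATION):
    """Return the GPS start times of the aligned files that overlap [start, stop)."""
    if stop <= start:
        return []
    # Round the requested start down to the nearest multiple of the file length
    first = (start // duration) * duration
    return list(range(first, stop, duration))


starts = file_start_times(842656000, 842670000)
print(len(starts), starts)
```

For the range used in this tutorial, this gives four start times, the first of which is 842653696.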

Once the data files are downloaded and ReadLIGO is installed, the API will make it easy to construct segment lists and load data on demand.

First, let's construct a segment list representing data that passes the CBC Low Mass category 2 data quality flag:

start = 842656000
stop =  842670000
segList = rl.getsegs(start, stop, 'H1', flag='CBCLOW_CAT2')

Typing print(segList) will show the GPS start and stop times of the segments that pass this flag. You can get the same information by requesting segment lists from the Timeline Query Forms.
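
Once you have a segment list, simple bookkeeping is often useful before loading any strain data. The sketch below assumes the segment list can be iterated as (start, stop) pairs of GPS times; the segment values and both helper names are invented for illustration.

```python
def total_livetime(segments):
    """Return the summed duration, in seconds, of all (start, stop) segments."""
    return sum(stop - start for start, stop in segments)


def drop_short(segments, min_duration):
    """Keep only the segments that last at least min_duration seconds."""
    return [(start, stop) for start, stop in segments
            if stop - start >= min_duration]


# Hypothetical segments; real ones would come from getsegs()
example_segments = [
    (842656000, 842657000),
    (842657500, 842657510),
    (842660000, 842670000),
]

print(total_livetime(example_segments))
print(drop_short(example_segments, 64))
```

Dropping very short segments first can save you from loading data that is too brief for your analysis.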

As an example of how to use segment lists to load only data that passes the flag, we can plot the first few seconds of each segment:

#-------------------------------------------
# Plot a few seconds of each "good" segment
#-------------------------------------------
N = 10000
for (begin, end) in segList:
    # -- Use the getstrain() method to load the data
    strain, meta, dq = rl.getstrain(begin, end, 'H1')

    # -- Make a plot
    plt.figure()
    ts = meta['dt']    # -- Time between samples, in seconds
    rel_time = np.arange(0, end-begin, ts)
    plt.plot(rel_time[0:N], strain[0:N])
    plt.xlabel('Seconds since GPS ' + str(begin) )
plt.show()

Running this loop should make a figure for each good segment, and plot the first few seconds of strain data for each segment. In general, a loop over a list of "good segments" is a convenient way to apply data quality information, and only analyze the data you want.

For more examples using the API, try typing help(rl) into the Python interpreter, or take a look at the API Examples page.

You made it!

This concludes the introductory tutorial on reading GWOSC data files. At this point, you should be ready to download and use GWOSC data from the Data & Catalogs page. You may want to look at some of the other tutorials on specific uses of GWOSC data.

Feedback

While we are not staffed to answer specific questions, the GWOSC development team would welcome any feedback you have that may help improve this tutorial. If you found it useful, that would be nice to know, too!