Tutorial Step 4: Using the example API
This tutorial picks up where Step 3 left off.
In Step 3, we looked at how data quality information is stored in a GWOSC data file. Now that you know how data quality information is stored, you may want to wrap some of the things you learned into a few handy functions.
Load one data file
As an example, install ReadLIGO with pip. In addition to reading HDF5 files, ReadLIGO can also read gravitational-wave frame files, like those described in the GWpy documentation.
Create a new file called use_readligo.py, and drop the following code into it:
import numpy as np
import matplotlib.pyplot as plt
import readligo as rl
To read in all data from a single GWOSC data file, you can use the loaddata() function:
#----------------------------------------------------------------
# Load all GWOSC data from a single file
#----------------------------------------------------------------
strain, time, chan_dict = rl.loaddata(
    'H-H1_LOSC_4_V1-815411200-4096.hdf5', 'H1')
The function loaddata() returns the strain time series, the GPS time of each sample, and a dictionary of all of the data quality flags in the file. As in the last step of this tutorial, each data quality channel in chan_dict is stored as a 1 Hz time series.
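Because each data quality channel is sampled at 1 Hz, the flag that applies to a given GPS time is the entry for the second that contains it. The following is a minimal sketch of that lookup. The helper flag_at and the three-second example channel are hypothetical and written only for illustration; they are not part of ReadLIGO.

```python
import math

def flag_at(channel, gps_start, gps_time):
    """Return the 1 Hz flag value that covers gps_time.

    channel   -- sequence of 0/1 values, one per second (hypothetical input)
    gps_start -- GPS time of the first entry in channel
    """
    # Each entry covers one full second, so round down to the containing second
    index = math.floor(gps_time - gps_start)
    if not 0 <= index < len(channel):
        raise ValueError("gps_time is outside the channel")
    return channel[index]

# Made-up three-second channel starting at GPS 100
example = [1, 0, 1]
print(flag_at(example, 100, 101.5))  # 0
```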
GWOSC data files contain gaps where strain data is not available; the DATA flag is 1 at times with strain data. To loop over the strain segments that contain usable data:
slice_list = rl.dq_channel_to_seglist(chan_dict['DATA'])
for seg_slice in slice_list:
    # -- seg_slice selects one contiguous stretch of usable data
    time_seg = time[seg_slice]
    strain_seg = strain[seg_slice]
    # -- Do stuff with strain segment here
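To make the idea behind this loop concrete, here is a rough sketch of how a 1 Hz flag channel can be turned into slices of a strain array. It is a conceptual illustration only, not ReadLIGO's actual implementation; the function name dq_to_slices and the sampling-rate argument fs are assumptions made for this example.

```python
def dq_to_slices(channel, fs=4096):
    """Convert a 1 Hz 0/1 channel into slices of a strain array sampled at fs Hz."""
    slices = []
    start = None  # second at which the current run of 1s began
    for second, flag in enumerate(channel):
        if flag and start is None:
            start = second                      # a run of good data begins
        elif not flag and start is not None:
            slices.append(slice(start * fs, second * fs))
            start = None                        # the run has ended
    if start is not None:
        # The channel ended while the flag was still on
        slices.append(slice(start * fs, len(channel) * fs))
    return slices

# Five seconds of made-up flags, with a toy sampling rate of 4 Hz
print(dq_to_slices([0, 1, 1, 0, 1], fs=4))
# [slice(4, 12, None), slice(16, 20, None)]
```

Each slice covers whole seconds, so indexing the strain and time arrays with it returns only samples recorded while the flag was on.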
If you are working with a large data set, a common use case is to start with a "Segment List", a list of GPS time intervals that we are interested in analyzing. The API provides support for creating segment lists, as well as for reading in data in a given GPS segment.
Load data using a segment list
getstrain() and getsegs() only work with data files that begin at GPS times which are integer multiples of 4096 seconds. This is not true for releases of individual events. To load a single file, please see the example above.
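If you are unsure whether a file meets this requirement, you can check its start time directly. The sketch below assumes the GPS start time is the third hyphen-separated field of the file name, which is inferred from the one file name used earlier in this step; both helper functions are hypothetical and not part of ReadLIGO.

```python
FILE_DURATION = 4096  # seconds, the multiple required by getstrain() and getsegs()

def is_aligned(gps_start, duration=FILE_DURATION):
    """True if gps_start is an integer multiple of duration."""
    return gps_start % duration == 0

def gps_start_from_name(filename):
    """Extract the GPS start time from a file name.

    Assumed layout: <obs>-<description>-<gps start>-<duration>.<ext>
    """
    return int(filename.split('-')[2])

name = 'H-H1_LOSC_4_V1-815411200-4096.hdf5'
print(is_aligned(gps_start_from_name(name)))  # True
```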
Let's download a few GWOSC data files to see how the API works.
- Go to the S5 Data Archive
- Choose the S5 data set and the H1 detector
- Enter start time 842656000 and end time 842670000
- Click Continue to get a list of data files. You should see 4 data files
- Click HDF5 to download all 4 data files, and save them in the same directory where you have been working.
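The count of 4 files follows from the 4096-second file boundaries: the requested range is covered by every file whose 4096-second window overlaps it. A small sketch of that arithmetic, using a hypothetical helper that is not part of ReadLIGO:

```python
def file_start_times(start, stop, duration=4096):
    """GPS start times of the fixed-length files that overlap [start, stop)."""
    # Round the requested start down to the nearest file boundary
    first = (start // duration) * duration
    return list(range(first, stop, duration))

starts = file_start_times(842656000, 842670000)
print(len(starts))  # 4
print(starts)
```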
Once the data files are downloaded and ReadLIGO is installed, the API will make it easy to construct segment lists and load data on demand.
First, let's construct a segment list representing data that passes the CBC Low Mass category 2 data quality flag:
start = 842656000
stop = 842670000
segList = rl.getsegs(start, stop, 'H1', flag='CBCLOW_CAT2')
Typing print(segList) will show the GPS start and stop times of the Science Mode segments for this data. You can get the same information by requesting segment lists from the Timeline Query Forms.
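Once you have a segment list, simple bookkeeping is often useful, such as totalling the available time or discarding segments too short to analyze. The sketch below works on a plain list of (start, stop) pairs with made-up GPS values; it does not use ReadLIGO, and the helper names are hypothetical.

```python
def total_duration(segments):
    """Sum of the lengths, in seconds, of all (start, stop) segments."""
    return sum(stop - start for start, stop in segments)

def longer_than(segments, min_length):
    """Keep only segments lasting at least min_length seconds."""
    return [(start, stop) for start, stop in segments
            if stop - start >= min_length]

# Made-up segments for illustration only
example_segments = [(100, 160), (200, 205), (300, 420)]
print(total_duration(example_segments))   # 185
print(longer_than(example_segments, 60))  # [(100, 160), (300, 420)]
```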
As an example of how to use segment lists to load only Science Mode data, we can plot the first few seconds of each segment:
#-------------------------------------------
# Plot a few seconds of each "good" segment
#-------------------------------------------
N = 10000
for (begin, end) in segList:
    # -- Use the getstrain() method to load the data
    strain, meta, dq = rl.getstrain(begin, end, 'H1')
    # -- Make a plot
    plt.figure()
    dt = meta['dt']  # time between samples, in seconds
    rel_time = np.arange(0, end-begin, dt)
    plt.plot(rel_time[0:N], strain[0:N])
    plt.xlabel('Seconds since GPS ' + str(begin))
# -- Display all of the figures
plt.show()
Running this loop should make a figure for each good segment, and plot the first few seconds of strain data for each segment. In general, a loop over a list of "good segments" is a convenient way to apply data quality information, and only analyze the data you want.
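The loop plots the first N = 10000 samples of each segment. How many seconds that covers depends on the sample spacing. As a quick check, assuming a sampling rate of 4096 Hz (an assumption for this sketch; in practice, use the dt value returned in meta):

```python
fs = 4096        # assumed sampling rate in Hz
dt = 1.0 / fs    # time between samples, in seconds
N = 10000

def samples_for(seconds, fs=fs):
    """Number of samples needed to cover the given number of seconds."""
    return int(round(seconds * fs))

print(N * dt)          # 2.44140625
print(samples_for(5))  # 20480
```

To plot a different length of time, compute N from the duration you want instead of hard-coding it.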
For more examples using the API, try typing help(rl) into the Python interpreter, or take a look at the API Examples page.
You made it!
This concludes the introductory tutorial on reading GWOSC data files. At this point, you should be ready to download and use GWOSC data from the Data & Catalogs page. You may want to look at some of the other tutorials on specific uses of GWOSC data.
Feedback
While we are not staffed to answer specific questions, the GWOSC development team would welcome any feedback you have that may help improve this tutorial. If you found it useful, that would be nice to know, too!