sigpyproc.base

Base classes for manipulating frequency-major order pulsar data.

This module contains the abstract Filterbank class, from which data reading classes inherit.

Classes

Filterbank

Base class for manipulating frequency-major order pulsar data.

class sigpyproc.base.Filterbank

Bases: ABC

Base class for manipulating frequency-major order pulsar data.

The Filterbank class should never be instantiated directly. Instead it should be inherited by data reading classes.
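The inheritance pattern described above can be sketched with the standard abc module. This is only an illustration: ToyFilterbank and InMemoryReader are hypothetical names, not part of sigpyproc, and the toy read_block takes fewer arguments than the real method.

```python
from abc import ABC, abstractmethod


class ToyFilterbank(ABC):
    """Hypothetical stand-in for an abstract filterbank base class."""

    @abstractmethod
    def read_block(self, start, nsamps):
        """Return nsamps rows of channel values starting at sample start."""

    def collapse(self, start, nsamps):
        # Concrete method built on top of the abstract reader:
        # sum over channels for each time sample.
        return [sum(row) for row in self.read_block(start, nsamps)]


class InMemoryReader(ToyFilterbank):
    """Hypothetical reader that serves data from a list of rows."""

    def __init__(self, rows):
        self._rows = rows

    def read_block(self, start, nsamps):
        if start < 0 or start + nsamps > len(self._rows):
            raise ValueError("requested samples are out of range")
        return self._rows[start:start + nsamps]


reader = InMemoryReader([[1, 2], [3, 4], [5, 6]])
series = reader.collapse(1, 2)

# The abstract base itself cannot be instantiated.
try:
    ToyFilterbank()
    base_instantiated = True
except TypeError:
    base_instantiated = False
```

The concrete subclass only supplies the reader; the processing method is inherited unchanged from the base class.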

Attributes:
header

Header metadata of input file.

chan_stats

Channel statistics of the data.

Methods

apply_channel_mask(chan_mask[, mask_value, ...])

Apply a channel mask and write to a new file.

bandpass([method, gulp, start, parallel, nsamps])

Compute the bandpass of the data.

clean_rfi([method, threshold, freq_mask, ...])

Clean RFI from the filterbank data and write to a new file.

collapse([method, gulp, start, nsamps, parallel])

Sum across all frequencies for each time sample.

compute_stats([gulp, start, nsamps])

Compute channelwise statistics of data.

compute_stats_basic([gulp, start, nsamps])

Compute channelwise statistics of data (basic).

dedisperse(dm[, ref_freq, gulp, start, ...])

Dedisperse and collapse to a time series.

downsample([tfactor, ffactor, outfile_name, ...])

Decimate in time and frequency and write to file.

extract_bands(chanstart, nchans[, ...])

Extract a subset of sub-bands and write to file.

extract_chans([chans, outfile_base, ...])

Extract a subset of channels and write to file.

extract_samps(start, nsamps[, outfile_name, ...])

Extract a subset of samples and write to file.

fold(period, dm[, accel, nbins, nints, ...])

Fold the data and return a 3D data cube.

invert_freq([outfile_name, gulp, start, ...])

Invert frequency axis and write to a new file.

read_block(start, nsamps[, fch1, nchans])

Read a data block from the filterbank file stream.

read_chan(ichan[, gulp, start, nsamps])

Read a single frequency channel as a time series.

read_dedisp_block(start, nsamps, dm)

Read a block of dedispersed filterbank data.

read_plan(*[, gulp, start, nsamps, ...])

Read sequential filterbank in gulps and yield.

remove_zerodm([outfile_name, gulp, start, ...])

Remove zero-DM and write to a new file.

requantize(nbits_out[, outfile_name, ...])

Requantize the data and write to a new file.

subband(dm, nsub[, ref_freq, outfile_name, ...])

Subband the data and write to a new file.

abstract property header

Header metadata of input file.

Returns:
Header

Header object containing metadata of the input file.

abstractmethod read_block(start, nsamps, fch1=None, nchans=None)

Read a data block from the filterbank file stream.

Parameters:
start : int

First time sample of the block to be read.

nsamps : int

Number of samples in the block (i.e. block will be nsamps*nchans in size).

fch1 : float, optional

Frequency of the first channel, by default None (header value).

nchans : int, optional

Number of channels in the block, by default None (header value).

Returns:
FilterbankBlock

2-D array of filterbank data with observational metadata.

Raises:
ValueError

If requested nsamps or nchans are out of range.

abstractmethod read_dedisp_block(start, nsamps, dm)

Read a block of dedispersed filterbank data.

Best used in cases where I/O time dominates reading a block of data.

Parameters:
start : int

First time sample of the block to be read.

nsamps : int

Number of samples in the block (i.e. block will be nsamps*nchans in size).

dm : float

Dispersion measure to dedisperse at.

Returns:
FilterbankBlock

2-D array of filterbank data with observational metadata.

Raises:
ValueError

If requested dedispersed nsamps are out of range.

abstractmethod read_plan(*, gulp=16384, start=0, nsamps=None, skipback=0, description=None, quiet=False, allocator=None)

Read sequential filterbank in gulps and yield.

Parameters:
gulp : int, optional

Number of time samples in each read, by default 16384.

start : int, optional

Starting sample to read from, by default 0 (start of file).

nsamps : int, optional

Total number of samples to read, by default None (end of the file).

skipback : int, optional

Number of samples to skip back after each read, by default 0.

description : str, optional

Annotation for progress bar (rich), by default Calling Stack.

quiet : bool, optional

Disable progress bar and logging, by default False.

allocator : Callable[[int], Buffer], optional

An allocator callback that returns an object implementing the Python Buffer Protocol interface (PEP 3118) for the data to be read into, by default None.

Yields:
Iterator[tuple[int, int, ndarray]]

Tuple of number of samples read, index of read, and the unpacked data read.

Raises:
ValueError

If read samples < skipback.

Notes

For each read, the generator yields a tuple x, where:

  • x[0] is the number of samples read

  • x[1] is the index of the read (i.e. x[1]=0 is the first read)

  • x[2] is a 1-D numpy array containing the data that was read

Examples

The normal calling syntax for this function is:

>>> for nsamps_r, ii, data in self.read_plan(**plan_kwargs):
...     # do something
...     pass

where data always contains ``nchans*nsamps`` points.
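To make the gulp loop concrete, here is a minimal in-memory imitation of the yield contract. toy_read_plan is a hypothetical helper, not the library's implementation, and the reshape to (nsamps_r, nchans) assumes that the channel index varies fastest within the 1-D array.

```python
import numpy as np


def toy_read_plan(flat, nchans, gulp, start=0, nsamps=None, skipback=0):
    """Hypothetical gulp reader over an in-memory 1-D array (not sigpyproc)."""
    if skipback >= gulp:
        raise ValueError("skipback must be smaller than gulp")
    total = flat.size // nchans
    stop = total if nsamps is None else start + nsamps
    pos, index = start, 0
    while pos < stop:
        nread = min(gulp, stop - pos)
        if nread < skipback:
            raise ValueError("read samples < skipback")
        # Yield (samples read, read index, flat data) as described in the Notes.
        yield nread, index, flat[pos * nchans:(pos + nread) * nchans]
        if pos + nread >= stop:
            break
        pos += nread - skipback  # step back so consecutive reads overlap
        index += 1


nchans = 4
flat = np.arange(10 * nchans, dtype=np.float32)  # 10 samples, 4 channels

reads = []
for nsamps_r, ii, data in toy_read_plan(flat, nchans, gulp=4):
    block = data.reshape(nsamps_r, nchans)  # rows = samples, columns = channels
    reads.append((nsamps_r, ii, block.shape))

# With skipback=1 each read re-reads the last sample of the previous one.
overlapped = [n for n, _, _ in toy_read_plan(flat, nchans, gulp=4, skipback=1)]
```

The final read is shorter than gulp when the number of samples is not a multiple of it, which is why the loop uses the yielded sample count rather than gulp when reshaping.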

property chan_stats

Channel statistics of the data.

Returns:
ChannelStats | None

Channel statistics object if computed, else None.

compute_stats(gulp=16384, start=0, nsamps=None, **plan_kwargs)

Compute channelwise statistics of data.

Channel statistics include mean, rms, skewness, kurtosis, maxima, and minima.

Parameters:
gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Keyword arguments for read_plan().
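As an illustration of why such statistics can be computed gulp by gulp, the sketch below accumulates per-channel sums over successive blocks of a synthetic array. It is not the library's algorithm: the running-sum formulation is a simple (numerically naive) assumption, "rms" is taken here to mean the standard deviation about the mean, and skewness and kurtosis are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
nchans = 3
data = rng.normal(size=(1000, nchans))  # rows = samples, columns = channels

count = 0
sum_x = np.zeros(nchans)
sum_x2 = np.zeros(nchans)
maxima = np.full(nchans, -np.inf)
minima = np.full(nchans, np.inf)

gulp = 256
for lo in range(0, data.shape[0], gulp):
    block = data[lo:lo + gulp]              # one gulp of samples
    count += block.shape[0]
    sum_x += block.sum(axis=0)              # per-channel running sum
    sum_x2 += (block ** 2).sum(axis=0)      # per-channel running sum of squares
    maxima = np.maximum(maxima, block.max(axis=0))
    minima = np.minimum(minima, block.min(axis=0))

mean = sum_x / count
rms = np.sqrt(sum_x2 / count - mean ** 2)   # standard deviation about the mean
```

Only a handful of per-channel accumulators are held in memory, regardless of how many samples the file contains.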

compute_stats_basic(gulp=16384, start=0, nsamps=None, **plan_kwargs)

Compute channelwise statistics of data (basic).

Channel statistics include mean, rms, maxima, and minima.

Parameters:
gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Keyword arguments for read_plan().

collapse(method='sum', gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Sum across all frequencies for each time sample.

Parameters:
method : {“mean”, “sum”}, optional

Method to collapse the data, by default “sum”.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
TimeSeries

A zero-DM time series.

bandpass(method='sum', gulp=16384, start=0, *, parallel=False, nsamps=None, **plan_kwargs)

Compute the bandpass of the data.

Collapse across all time samples for each frequency channel.

Parameters:
method : {“mean”, “sum”}, optional

Method to collapse the data, by default “sum”.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Keyword arguments for read_plan().

Returns:
TimeSeries

Bandpass of the data.
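The difference between collapse() and bandpass() is the axis that is reduced. The NumPy sketch below shows both reductions on a small block; it assumes a (samples, channels) layout and is not the library's implementation.

```python
import numpy as np

block = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])  # 2 samples x 3 channels

timeseries_sum = block.sum(axis=1)   # collapse: one value per time sample
bandpass_sum = block.sum(axis=0)     # bandpass: one value per channel
bandpass_mean = block.mean(axis=0)   # same reduction with method "mean"
```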

dedisperse(dm, ref_freq='fch1', gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Dedisperse and collapse to a time series.

Parameters:
dm : float

Dispersion measure to dedisperse to.

ref_freq : str | float, optional

Reference frequency to use for dedispersion, by default “fch1”.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
TimeSeries

A dedispersed time series.

Notes

If gulp < maximum dispersion delay, gulp is taken to be twice the maximum dispersion delay.
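The gulp rule in the note can be worked through with the standard cold-plasma dispersion delay. The helper names, the rounding to the nearest sample, and the exact value of the dispersion constant are assumptions made for this sketch; the library may differ in these details.

```python
K_DM = 4.148808e3  # s MHz^2 pc^-1 cm^3, commonly quoted dispersion constant


def max_delay_samples(dm, f_lo_mhz, f_hi_mhz, tsamp):
    """Delay of the lowest channel relative to the highest, in samples."""
    delay_s = K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)
    return int(round(delay_s / tsamp))


def effective_gulp(gulp, max_delay):
    # Rule stated in the note: too-small gulps become twice the maximum delay.
    return 2 * max_delay if gulp < max_delay else gulp


# Hypothetical observation: DM 100 pc cm^-3, 1200-1500 MHz band, 64 us sampling.
delay = max_delay_samples(dm=100.0, f_lo_mhz=1200.0, f_hi_mhz=1500.0, tsamp=64e-6)
small_gulp = effective_gulp(1024, delay)
default_gulp = effective_gulp(16384, delay)
```

A gulp shorter than the dispersion sweep could not contain a complete dedispersed sample, which is why it is enlarged rather than used as given.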

read_chan(ichan, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Read a single frequency channel as a time series.

Parameters:
ichan : int

Channel index to retrieve (0 is the highest frequency channel).

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
TimeSeries

Selected channel as a time series.

Raises:
ValueError

If ichan is out of range (ichan < 0 or ichan >= nchans).

invert_freq(outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Invert frequency axis and write to a new file.

Parameters:
outfile_name : str, optional

Name of output file, by default basename_inverted.fil.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
str

Name of output file.
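Inverting the frequency axis amounts to reversing the channel order of every sample. The sketch below also shows the header values that would keep the frequency labelling consistent; the fch1/foff update is an assumption about what a self-consistent header needs, and the numbers are hypothetical.

```python
import numpy as np

block = np.array([[10, 20, 30],
                  [11, 21, 31]])        # 2 samples x 3 channels
fch1, foff, nchans = 1500.0, -1.0, 3    # hypothetical header values (MHz)

inverted = block[:, ::-1]               # reverse the channel axis
new_fch1 = fch1 + (nchans - 1) * foff   # old last channel becomes the first
new_foff = -foff                        # channel step changes sign
```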

apply_channel_mask(chan_mask, mask_value=0, outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Apply a channel mask and write to a new file.

Parameters:
chan_mask : ndarray

1D Boolean array of channel mask (1 or True for bad channels).

mask_value : float, optional

Value to replace masked channels with, by default 0.

outfile_name : str, optional

Name of the output filterbank file, by default basename_masked.fil.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
str

Name of output file.
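The effect of a channel mask on one block of data can be sketched with NumPy boolean indexing. This shows only the in-memory operation on a (samples, channels) array, not the file handling done by the method.

```python
import numpy as np

block = np.arange(12, dtype=np.float32).reshape(3, 4)  # 3 samples x 4 channels
chan_mask = np.array([False, True, False, True])       # True marks bad channels
mask_value = 0

masked = block.copy()
masked[:, chan_mask] = mask_value  # overwrite every sample of the bad channels
```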

downsample(tfactor=1, ffactor=1, outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Decimate in time and frequency and write to file.

Parameters:
tfactor : int, optional

Factor by which to downsample in time, by default 1.

ffactor : int, optional

Factor by which to downsample in frequency, by default 1.

outfile_name : str, optional

Name of file to write to, by default basename_tfactor_ffactor.fil.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
str

Name of output file.

Raises:
ValueError

If number of channels is not divisible by ffactor.
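A reshape-based sketch of decimation in both axes is shown below. toy_downsample is a hypothetical helper; averaging within each group and dropping trailing samples that do not fill a group are assumptions, since this page does not state how the library combines or trims values.

```python
import numpy as np


def toy_downsample(block, tfactor=1, ffactor=1):
    """Average groups of tfactor samples and ffactor channels (assumed reduction)."""
    nsamps, nchans = block.shape
    if nchans % ffactor != 0:
        raise ValueError("number of channels is not divisible by ffactor")
    nsamps_out = nsamps // tfactor
    trimmed = block[:nsamps_out * tfactor]  # drop samples that do not fill a group
    grouped = trimmed.reshape(nsamps_out, tfactor, nchans // ffactor, ffactor)
    return grouped.mean(axis=(1, 3))        # reduce over both group axes


block = np.arange(16, dtype=float).reshape(4, 4)  # 4 samples x 4 channels
small = toy_downsample(block, tfactor=2, ffactor=2)
```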

extract_samps(start, nsamps, outfile_name=None, gulp=16384, **plan_kwargs)

Extract a subset of samples and write to file.

Parameters:
start : int

Starting time sample to extract.

nsamps : int

Number of time samples to extract.

outfile_name : str, optional

Output file name, by default basename_samps_{start}_{start+nsamps}.fil.

gulp : int, optional

Number of samples in each read, by default 16384.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
str

Name of output file.

Raises:
ValueError

If start or nsamps are out of bounds.

extract_chans(chans=None, outfile_base=None, batch_size=200, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Extract a subset of channels and write to file.

Time series are written to disk with names based on channel number.

Parameters:
chans : ArrayLike, optional

Channel numbers to extract, by default all channels.

outfile_base : str, optional

Base name of output files, by default header.basename.

batch_size : int, optional

Number of channels to extract in each batch, by default 200.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Keyword arguments for read_plan().

Returns:
list[str]

Names of all files written to disk.

Raises:
ValueError

If chans are out of range (chan < 0 or chan >= total channels).

extract_bands(chanstart, nchans, chanpersub=None, outfile_base=None, batch_size=200, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Extract a subset of sub-bands and write to file.

Filterbanks are written to disk with names based on sub-band number.

Parameters:
chanstart : int

Start channel to extract.

nchans : int

Number of channels to extract.

chanpersub : int, optional

Number of channels in each sub-band, by default nchans.

outfile_base : str, optional

Base name of output files, by default header.basename.

batch_size : int, optional

Number of sub-bands to extract in each batch, by default 200.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Keyword arguments for read_plan().

Returns:
list[str]

Names of all files written to disk.

Raises:
ValueError

If chanpersub is less than 1 or greater than nchans.

ValueError

If nchans is not divisible by chanpersub.

ValueError

If chanstart is out of range (chanstart < 0 or chanstart > total channels).
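The three validation rules above determine how the requested channels split into sub-bands. The sketch below mirrors those rules as they are worded here; subband_ranges and total_chans are hypothetical names, and the half-open (first, last) ranges are a choice made for the example.

```python
def subband_ranges(chanstart, nchans, total_chans, chanpersub=None):
    """Return (first, last_exclusive) channel ranges, one per sub-band."""
    if chanpersub is None:
        chanpersub = nchans  # default: a single sub-band holding all channels
    if chanpersub < 1 or chanpersub > nchans:
        raise ValueError("chanpersub must be between 1 and nchans")
    if nchans % chanpersub != 0:
        raise ValueError("nchans is not divisible by chanpersub")
    if chanstart < 0 or chanstart > total_chans:
        raise ValueError("chanstart is out of range")
    return [(lo, lo + chanpersub)
            for lo in range(chanstart, chanstart + nchans, chanpersub)]


ranges = subband_ranges(chanstart=8, nchans=16, total_chans=64, chanpersub=4)
single = subband_ranges(chanstart=0, nchans=16, total_chans=64)
```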

requantize(nbits_out, outfile_name=None, *, remove_bandpass=False, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Requantize the data and write to a new file.

Parameters:
nbits_out : int

Number of bits to requantize the data into.

outfile_name : str, optional

Name of output file, by default basename_digi.fil.

remove_bandpass : bool, optional

Remove the bandpass from the data, by default False.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Keyword arguments for read_plan().

Returns:
str

Name of output file.

Raises:
ValueError

If nbits_out is less than 1 or greater than 32.

remove_zerodm(outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Remove zero-DM and write to a new file.

Remove the channel-weighted zero-DM from the data and write to disk.

Parameters:
outfile_name : str, optional

Name of output file, by default basename_noZeroDM.fil.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

parallel : bool, optional

If True, use parallel processing, by default False.

**plan_kwargs : dict

Keyword arguments for read_plan().

Returns:
str

Name of output file.

Notes

Based on the PRESTO implementation of Eatough, Keane & Lyne 2009 [1].

References

[1]

R. P. Eatough, E. F. Keane, A. G. Lyne, An interference removal technique for radio pulsar searches, MNRAS, Volume 395, Issue 1, May 2009, Pages 410-415.
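The core idea of the zero-DM filter, subtracting the mean over channels from every time sample, can be sketched as follows. This is a simplified, unweighted version on synthetic data: the channel weighting mentioned above is not modelled, and adding the bandpass back to restore channel levels is an assumption of this example.

```python
import numpy as np

rng = np.random.default_rng(1)
block = rng.normal(loc=100.0, scale=1.0, size=(8, 4))  # 8 samples x 4 channels
block[3] += 50.0  # broadband (zero-DM) interference hits every channel at once

zerodm = block.mean(axis=1, keepdims=True)    # mean over channels, per sample
bandpass = block.mean(axis=0, keepdims=True)  # mean over samples, per channel
cleaned = block - zerodm + bandpass           # subtract zero-DM, restore levels
```

After the subtraction every time sample has the same mean across channels, so a signal that is undispersed (and therefore simultaneous in all channels) is suppressed, while a dispersed pulse that crosses channels at different times is largely preserved.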

subband(dm, nsub, ref_freq='fch1', outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Subband the data and write to a new file.

Produce a set of dedispersed subbands from the data.

Parameters:
dm : float

The DM of the subbands.

nsub : int

The number of subbands to produce.

ref_freq : str | float, optional

Reference frequency to use for dedispersion, by default “fch1”.

outfile_name : str, optional

Output file name of subbands, by default basename_DM.subbands.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
str

Name of output file.

fold(period, dm, accel=0, nbins=50, nints=32, nbands=32, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Fold the data and return a 3D data cube.

Fold data into discrete phase, subintegration and subband bins.

Parameters:
period : float

Period in seconds to fold with.

dm : float

Dispersion measure to dedisperse to.

accel : float, optional

Acceleration in m/s/s to fold with, by default 0.

nbins : int, optional

Number of phase bins in output, by default 50.

nints : int, optional

Number of subintegrations in output, by default 32.

nbands : int, optional

Number of subbands in output, by default 32.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
FoldedData

3 dimensional data cube.

Raises:
ValueError

If nbands * nints * nbins is too large.

Notes

If gulp < maximum dispersion delay, gulp is taken to be twice the maximum dispersion delay.
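How a time sample maps to a phase bin and a subintegration can be sketched for the constant-period case. The helper names are hypothetical, acceleration and dedispersion are ignored, and the sampling time and period are powers of two only so that the floating-point arithmetic in the example is exact.

```python
def phase_bin(isamp, tsamp, period, nbins):
    """Phase bin of sample isamp for a constant period (no acceleration)."""
    phase = (isamp * tsamp / period) % 1.0  # fractional rotation, 0 <= phase < 1
    return int(phase * nbins)


def subint_index(isamp, nsamps, nints):
    """Subintegration that sample isamp falls into."""
    return isamp * nints // nsamps


# 2**-10 s sampling and a 2**-4 s period give exactly 64 samples per rotation.
tsamp, period, nbins = 2.0 ** -10, 2.0 ** -4, 32
bins = [phase_bin(i, tsamp, period, nbins) for i in (0, 16, 63, 64, 80)]
first_subint = subint_index(0, nsamps=1000, nints=32)
last_subint = subint_index(999, nsamps=1000, nints=32)
```

Samples one full period apart (0 and 64, or 16 and 80) land in the same phase bin, which is what allows a periodic signal to build up in the folded cube.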

clean_rfi(method='mad', threshold=3, freq_mask=None, custom_funcn=None, mask_value=None, outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Clean RFI from the filterbank data and write to a new file.

Parameters:
method : str, optional

Method to use for cleaning (“mad”, “iqrm”), by default “mad”.

threshold : float, optional

Sigma threshold for finding outliers, by default 3.

freq_mask : list[tuple[float, float]], optional

List of frequency ranges to mask, by default None.

custom_funcn : Callable, optional

Custom function to apply to the mask, by default None.

mask_value : float, optional

Value to replace masked channels with, by default median of channels mean.

outfile_name : str, optional

Output file name, by default None.

gulp : int, optional

Number of samples in each read, by default 16384.

start : int, optional

Start sample, by default 0.

nsamps : int, optional

Number of samples to read, by default all.

**plan_kwargs : dict

Additional keyword arguments for read_plan().

Returns:
tuple[str, RFIMask]

Filename and mask of cleaned data.

Raises:
ValueError

If method is not “mad” or “iqrm”.
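To illustrate what a sigma threshold means for a median-absolute-deviation (MAD) method, the sketch below flags outlying channels in a list of per-channel statistics. It is a generic MAD test written for this example, not the library's code; mad_outliers, the input values, and the choice of statistic are all assumptions.

```python
from statistics import median


def mad_outliers(values, threshold=3.0):
    """Flag values more than threshold robust sigmas from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    sigma = 1.4826 * mad  # scales the MAD to a Gaussian standard deviation
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - med) > threshold * sigma for v in values]


# Hypothetical per-channel standard deviations; channel 4 is contaminated.
chan_std = [1.0, 1.1, 0.9, 1.05, 8.0, 0.95, 1.0, 1.02]
mask = mad_outliers(chan_std, threshold=3.0)
```

Because the median and MAD are barely affected by the contaminated channel, the threshold stays tight around the clean channels, which a mean and standard deviation would not guarantee.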