sigpyproc.base


Base classes for manipulating frequency-major order pulsar data.

This module contains the Filterbank class for manipulating frequency-major order pulsar data.

Classes

Filterbank

Base class for manipulating frequency-major order pulsar data.

class sigpyproc.base.Filterbank

Bases: ABC

Base class for manipulating frequency-major order pulsar data.

The Filterbank class should never be instantiated directly. Instead, it should be subclassed by data-reading classes.
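The pattern can be illustrated with a minimal, standard-library sketch. `ToyFilterbank` and `ToyReader` are hypothetical stand-ins, not sigpyproc classes; they only show why an abstract base cannot be instantiated until a reader supplies the abstract members.

```python
from abc import ABC, abstractmethod


class ToyFilterbank(ABC):
    """Hypothetical stand-in for an abstract filterbank base class."""

    @property
    @abstractmethod
    def header(self) -> dict:
        """Metadata of the input data."""

    @abstractmethod
    def read_block(self, start: int, nsamps: int) -> list:
        """Return nsamps time samples starting at sample `start`."""

    def nchans(self) -> int:
        # Concrete helper shared by every reader, built on the abstract parts.
        return self.header["nchans"]


class ToyReader(ToyFilterbank):
    """Hypothetical reader backed by an in-memory, sample-ordered list."""

    def __init__(self, data: list, nchans: int) -> None:
        self._data = data
        self._header = {"nchans": nchans}

    @property
    def header(self) -> dict:
        return self._header

    def read_block(self, start: int, nsamps: int) -> list:
        n = self.header["nchans"]
        return self._data[start * n : (start + nsamps) * n]


reader = ToyReader([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], nchans=2)
block = reader.read_block(1, 2)  # samples 1 and 2, two channels each

try:
    ToyFilterbank()  # fails: abstract members are not implemented
    instantiated = True
except TypeError:
    instantiated = False
```

Only the subclass that implements every abstract member can be constructed; the shared helper (`nchans` here) then works unchanged for any reader.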

Attributes:

header: Header metadata of input file.
chan_stats: Channel statistics of the data.

Methods

`apply_channel_mask`(chan_mask[, mask_value, ...])	Apply a channel mask and write to a new file.
`bandpass`([method, gulp, start, parallel, nsamps])	Compute the bandpass of the data.
`clean_rfi`([method, threshold, freq_mask, ...])	Clean RFI from the filterbank data and write to a new file.
`collapse`([method, gulp, start, nsamps, parallel])	Sum across all frequencies for each time sample.
`compute_stats`([gulp, start, nsamps])	Compute channelwise statistics of data.
`compute_stats_basic`([gulp, start, nsamps])	Compute channelwise statistics of data (basic).
`dedisperse`(dm[, ref_freq, gulp, start, ...])	Dedisperse and collapse to a time series.
`downsample`([tfactor, ffactor, outfile_name, ...])	Decimate in time and frequency and write to file.
`extract_bands`(chanstart, nchans[, ...])	Extract a subset of sub-bands and write to file.
`extract_chans`([chans, outfile_base, ...])	Extract a subset of channels and write to file.
`extract_samps`(start, nsamps[, outfile_name, ...])	Extract a subset of samples and write to file.
`fold`(period, dm[, accel, nbins, nints, ...])	Fold the data and return a 3D data cube.
`invert_freq`([outfile_name, gulp, start, ...])	Invert frequency axis and write to a new file.
`read_block`(start, nsamps[, fch1, nchans])	Read a data block from the filterbank file stream.
`read_chan`(ichan[, gulp, start, nsamps])	Read a single frequency channel as a time series.
`read_dedisp_block`(start, nsamps, dm)	Read a block of dedispersed filterbank data.
`read_plan`(*[, gulp, start, nsamps, ...])	Read sequential filterbank in gulps and yield.
`remove_zerodm`([outfile_name, gulp, start, ...])	Remove zero-DM and write to a new file.
`requantize`(nbits_out[, outfile_name, ...])	Requantize the data and write to a new file.
`subband`(dm, nsub[, ref_freq, outfile_name, ...])	Subband the data and write to a new file.

abstract property header

Header metadata of input file.

Returns:

Header: Header object containing metadata of the input file.

abstractmethod read_block(start, nsamps, fch1=None, nchans=None)

Read a data block from the filterbank file stream.

Parameters:

start (int): First time sample of the block to be read.
nsamps (int): Number of samples in the block (i.e. block will be nsamps*nchans in size).
fch1 (float, optional): Frequency of the first channel, by default None (header value).
nchans (int, optional): Number of channels in the block, by default None (header value).

Returns:

FilterbankBlock: 2-D array of filterbank data with observational metadata.

Raises:

ValueError: if requested nsamps or nchans are out of range.

abstractmethod read_dedisp_block(start, nsamps, dm)

Read a block of dedispersed filterbank data.

Best used in cases where I/O time dominates reading a block of data.

Parameters:

start (int): First time sample of the block to be read.
nsamps (int): Number of samples in the block (i.e. block will be nsamps*nchans in size).
dm (float): Dispersion measure to dedisperse at.

Returns:

FilterbankBlock: 2-D array of filterbank data with observational metadata.

Raises:

ValueError: if requested dedispersed nsamps are out of range.

abstractmethod read_plan(*, gulp=16384, start=0, nsamps=None, skipback=0, description=None, quiet=False, allocator=None)

Read sequential filterbank in gulps and yield.

Parameters:

gulp (int, optional): Number of time samples in each read, by default 16384.
start (int, optional): Starting sample to read from, by default 0 (start of file).
nsamps (int, optional): Total number of samples to read, by default None (end of the file).
skipback (int, optional): Number of samples to skip back after each read, by default 0.
description (str, optional): Annotation for progress bar (rich), by default Calling Stack.
quiet (bool, optional): Disable progress bar and logging, by default False.
allocator (Callable[[int], Buffer], optional): An allocator callback that returns an object implementing the Python Buffer Protocol interface (PEP 3118) for the data to be read into, by default None.

Yields:

Iterator[tuple[int, int, ndarray]]: Tuple of number of samples read, index of read, and the unpacked data read.

Raises:

ValueError: If read samples < skipback.

Notes

For each read, the generator yields a tuple x, where:

x[0] is the number of samples read

x[1] is the index of the read (i.e. x[1]=0 is the first read)

x[2] is a 1-D numpy array containing the data that was read

Examples

The normal calling syntax for this function is:

>>> for nsamps_r, ii, data in self.read_plan(**plan_kwargs):
...     pass  # do something
where data always contains ``nchans*nsamps_r`` points.
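The gulp loop described above can be sketched without any file I/O. `toy_read_plan` is a hypothetical helper over an in-memory list, not the sigpyproc implementation; in particular, the way `skipback` makes consecutive reads overlap and the stopping rule are assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Iterator


def toy_read_plan(
    data: list[int],
    nchans: int,
    gulp: int = 4,
    start: int = 0,
    nsamps: int | None = None,
    skipback: int = 0,
) -> Iterator[tuple[int, int, list[int]]]:
    """Yield (samples read, read index, flat data) from an in-memory list."""
    if nsamps is None:
        nsamps = len(data) // nchans - start
    if skipback >= gulp:
        raise ValueError("skipback must be smaller than gulp")
    pos, end, index = start, start + nsamps, 0
    # Stop once only already-delivered (overlap) samples would remain.
    while end - pos > skipback:
        nread = min(gulp, end - pos)
        yield nread, index, data[pos * nchans : (pos + nread) * nchans]
        pos += gulp - skipback  # step back so consecutive reads overlap
        index += 1


samples = list(range(20))  # 10 time samples x 2 channels, flattened
reads = [(n, i, len(d)) for n, i, d in toy_read_plan(samples, nchans=2, gulp=4)]
overlapped = [d[0] for _, _, d in toy_read_plan(samples, nchans=2, gulp=4, skipback=1)]
```

Each yielded array is flat and holds `nchans` values per sample read, so the last read can be shorter than `gulp`.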

property chan_stats

Channel statistics of the data.

Returns:

ChannelStats | None: Channel statistics object if computed, else None.

compute_stats(gulp=16384, start=0, nsamps=None, **plan_kwargs)

Compute channelwise statistics of data.

Channel statistics include mean, rms, skewness, kurtosis, maxima, and minima.

Parameters:

gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Keyword arguments for read_plan().

compute_stats_basic(gulp=16384, start=0, nsamps=None, **plan_kwargs)

Compute channelwise statistics of data (basic).

Channel statistics include mean, rms, maxima, and minima.

Parameters:

gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Keyword arguments for read_plan().
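Because the data arrive gulp by gulp, channel statistics have to be accumulated incrementally. The sketch below shows one way to do that for the basic set (mean, rms, maxima, minima). `ToyChannelStats` is a hypothetical accumulator, and treating "rms" as the standard deviation about the mean is an assumption, not something this page states.

```python
import math


class ToyChannelStats:
    """Hypothetical accumulator for per-channel mean, rms, max and min."""

    def __init__(self, nchans: int) -> None:
        self.nchans = nchans
        self.count = 0
        self.total = [0.0] * nchans
        self.total_sq = [0.0] * nchans
        self.maxima = [-math.inf] * nchans
        self.minima = [math.inf] * nchans

    def push(self, gulp_data: list) -> None:
        """Accumulate one flat gulp (nchans values per time sample)."""
        for chan in range(self.nchans):
            values = gulp_data[chan :: self.nchans]
            self.total[chan] += sum(values)
            self.total_sq[chan] += sum(v * v for v in values)
            self.maxima[chan] = max(self.maxima[chan], max(values))
            self.minima[chan] = min(self.minima[chan], min(values))
        self.count += len(gulp_data) // self.nchans

    def mean(self) -> list:
        return [t / self.count for t in self.total]

    def rms(self) -> list:
        # Assumption: "rms" means the standard deviation about the mean.
        return [
            math.sqrt(max(sq / self.count - m * m, 0.0))
            for sq, m in zip(self.total_sq, self.mean())
        ]


stats = ToyChannelStats(nchans=2)
stats.push([1.0, 10.0, 3.0, 10.0])  # first gulp: 2 samples x 2 channels
stats.push([5.0, 10.0, 7.0, 10.0])  # second gulp
```

Only running sums and extrema are kept, so memory use does not grow with the number of samples read.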

collapse(method='sum', gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Sum across all frequencies for each time sample.

Parameters:

method ({“mean”, “sum”}, optional): Method to collapse the data, by default “sum”.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

TimeSeries: A zero-DM time series.

bandpass(method='sum', gulp=16384, start=0, *, parallel=False, nsamps=None, **plan_kwargs)

Compute the bandpass of the data.

Collapse (sum or average, depending on method) over all time samples for each frequency channel.

Parameters:

method ({“mean”, “sum”}, optional): Method to collapse the data, by default “sum”.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Keyword arguments for read_plan().

Returns:

TimeSeries: Bandpass of the data.
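collapse and bandpass reduce the same data along different axes. The following sketch contrasts the two on a flat, sample-ordered block; `toy_collapse` and `toy_bandpass` are hypothetical helpers using plain lists, not the library routines.

```python
def toy_collapse(data: list, nchans: int, method: str = "sum") -> list:
    """Collapse over frequency: one value per time sample."""
    nsamps = len(data) // nchans
    out = [sum(data[i * nchans : (i + 1) * nchans]) for i in range(nsamps)]
    return [v / nchans for v in out] if method == "mean" else out


def toy_bandpass(data: list, nchans: int, method: str = "sum") -> list:
    """Collapse over time: one value per frequency channel."""
    nsamps = len(data) // nchans
    out = [sum(data[chan::nchans]) for chan in range(nchans)]
    return [v / nsamps for v in out] if method == "mean" else out


# 3 time samples x 2 channels, stored sample by sample.
block = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
zero_dm_series = toy_collapse(block, nchans=2)                # [3.0, 7.0, 11.0]
mean_bandpass = toy_bandpass(block, nchans=2, method="mean")  # [3.0, 4.0]
```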

dedisperse(dm, ref_freq='fch1', gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Dedisperse and collapse to a time series.

Parameters:

dm (float): Dispersion measure to dedisperse to.
ref_freq (str | float, optional): Reference frequency to use for dedispersion, by default “fch1”.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

TimeSeries: A dedispersed time series.

Notes

If gulp < maximum dispersion delay, gulp is taken to be twice the maximum dispersion delay.
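As a rough illustration of what dedispersion involves, the sketch below computes per-channel delays from the standard cold-plasma dispersion relation, shifts each channel by its delay, and sums over frequency. The helper names, the channel-major layout, and the rounding to whole samples are assumptions made for readability; the value of the dispersion constant is the commonly quoted one and may differ slightly from the constant used by the library.

```python
K_DM = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3 (commonly quoted value)


def delay_samples(dm: float, freqs_mhz: list, ref_mhz: float, tsamp: float) -> list:
    """Per-channel dispersion delay relative to ref_mhz, in whole samples."""
    return [
        round(K_DM * dm * (f ** -2 - ref_mhz ** -2) / tsamp) for f in freqs_mhz
    ]


def toy_dedisperse(block: list, delays: list) -> list:
    """Undo each channel's delay and sum over channels.

    `block[chan][samp]` is channel-major here, purely for readability.
    """
    nout = len(block[0]) - max(delays)
    return [
        sum(chan[t + d] for chan, d in zip(block, delays)) for t in range(nout)
    ]


def effective_gulp(gulp: int, max_delay: int) -> int:
    # Rule from the note above: a too-small gulp becomes twice the maximum delay.
    return 2 * max_delay if gulp < max_delay else gulp


delays = delay_samples(100.0, [1500.0, 1400.0, 1300.0], ref_mhz=1500.0, tsamp=0.01)
# Synthetic pulse emitted at sample 1, arriving later at lower frequencies.
block = [[1.0 if t == 1 + d else 0.0 for t in range(10)] for d in delays]
series = toy_dedisperse(block, delays)
```

The output is shorter than the input by the maximum delay, which is why a gulp must be at least as long as the largest delay across the band.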

read_chan(ichan, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Read a single frequency channel as a time series.

Parameters:

ichan (int): Channel index to retrieve (0 is the highest frequency channel).
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

TimeSeries: Selected channel as a time series.

Raises:

ValueError: If ichan is out of range (ichan < 0 or ichan >= nchans).
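With sample-ordered data, one channel is simply every nchans-th value of each gulp. A minimal sketch, assuming zero-based channel indices so that valid values run from 0 to nchans - 1 (`toy_read_chan` is a hypothetical helper):

```python
def toy_read_chan(gulps: list, ichan: int, nchans: int) -> list:
    """Collect channel `ichan` from a sequence of flat, sample-ordered gulps."""
    if not 0 <= ichan < nchans:
        raise ValueError(f"Selected channel {ichan} out of range.")
    series = []
    for gulp_data in gulps:
        # Every nchans-th value, starting at ichan, belongs to that channel.
        series.extend(gulp_data[ichan::nchans])
    return series


gulps = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]  # 2 gulps, 3 channels
channel_1 = toy_read_chan(gulps, ichan=1, nchans=3)
```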

invert_freq(outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Invert frequency axis and write to a new file.

Parameters:

outfile_name (str, optional): Name of output file, by default basename_inverted.fil.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

str: Name of output file.
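Inverting the frequency axis amounts to reversing the channel order within every time sample and updating the frequency metadata to match. In the sketch below, `toy_invert_freq` is a hypothetical helper, and the header update (new first-channel frequency, negated channel offset `foff`) is an assumption based on the usual SIGPROC header convention rather than something stated on this page.

```python
def toy_invert_freq(data: list, nchans: int, fch1: float, foff: float):
    """Reverse the channel order of each sample and flip the frequency axis."""
    nsamps = len(data) // nchans
    out = []
    for i in range(nsamps):
        out.extend(reversed(data[i * nchans : (i + 1) * nchans]))
    # The old last channel becomes the new first channel.
    new_fch1 = fch1 + (nchans - 1) * foff
    return out, new_fch1, -foff


inverted, new_fch1, new_foff = toy_invert_freq(
    [1, 2, 3, 4, 5, 6], nchans=3, fch1=1500.0, foff=-1.0
)
```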

apply_channel_mask(chan_mask, mask_value=0, outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Apply a channel mask and write to a new file.

Parameters:

chan_mask (ndarray): 1D Boolean array of channel mask (1 or True for bad channels).
mask_value (float, optional): Value to replace masked channels with, by default 0.
outfile_name (str, optional): Name of the output filterbank file, by default basename_masked.fil.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

str: Name of output file.
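The masking step itself is small: every value belonging to a flagged channel is replaced by mask_value. A sketch on a flat, sample-ordered block, where `toy_apply_channel_mask` is a hypothetical helper and True marks a bad channel, as in the parameter description above:

```python
def toy_apply_channel_mask(data: list, chan_mask: list, mask_value: float = 0.0) -> list:
    """Replace every value of a flagged channel with mask_value."""
    nchans = len(chan_mask)
    return [
        mask_value if chan_mask[i % nchans] else value
        for i, value in enumerate(data)
    ]


block = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # 2 samples x 3 channels
masked = toy_apply_channel_mask(block, chan_mask=[False, True, False])
```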

downsample(tfactor=1, ffactor=1, outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Decimate in time and frequency and write to file.

Parameters:

tfactor (int, optional): Factor by which to downsample in time, by default 1.
ffactor (int, optional): Factor by which to downsample in frequency, by default 1.
outfile_name (str, optional): Name of file to write to, by default basename_tfactor_ffactor.fil.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

str: Name of output file.

Raises:

ValueError: If number of channels is not divisible by ffactor.
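Decimation combines each tfactor x ffactor tile of the time-frequency plane into one output value. The sketch below sums the tile; whether the library sums or averages, and how it treats trailing samples that do not fill a tile, is not stated here, so both choices are assumptions. `toy_downsample` is a hypothetical helper.

```python
def toy_downsample(data: list, nchans: int, tfactor: int = 1, ffactor: int = 1) -> list:
    """Sum tfactor x ffactor tiles of a flat, sample-ordered block."""
    if nchans % ffactor != 0:
        raise ValueError("nchans must be divisible by ffactor")
    nsamps = len(data) // nchans
    new_nchans = nchans // ffactor
    new_nsamps = nsamps // tfactor  # trailing partial tile is dropped
    out = [0.0] * (new_nsamps * new_nchans)
    for t in range(new_nsamps * tfactor):
        for c in range(nchans):
            out[(t // tfactor) * new_nchans + c // ffactor] += data[t * nchans + c]
    return out


block = [float(v) for v in range(16)]  # 4 samples x 4 channels
small = toy_downsample(block, nchans=4, tfactor=2, ffactor=2)
```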

extract_samps(start, nsamps, outfile_name=None, gulp=16384, **plan_kwargs)

Extract a subset of samples and write to file.

Parameters:

start (int): Starting time sample to extract.
nsamps (int): Number of time samples to extract.
outfile_name (str, optional): Output file name, by default basename_samps_{start}_{start+nsamps}.fil.
gulp (int, optional): Number of samples in each read, by default 16384.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

str: Name of output file.

Raises:

ValueError: If start or nsamps are out of bounds.

extract_chans(chans=None, outfile_base=None, batch_size=200, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Extract a subset of channels and write to file.

Time series are written to disk with names based on channel number.

Parameters:

chans (ArrayLike, optional): Channel numbers to extract, by default all channels.
outfile_base (str, optional): Base name of output files, by default header.basename.
batch_size (int, optional): Number of channels to extract in each batch, by default 200.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Keyword arguments for read_plan().

Returns:

list[str]: Names of all files written to disk.

Raises:

ValueError: If chans are out of range (chan < 0 or chan >= total channels).

extract_bands(chanstart, nchans, chanpersub=None, outfile_base=None, batch_size=200, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Extract a subset of sub-bands and write to file.

Filterbanks are written to disk with names based on sub-band number.

Parameters:

chanstart (int): Start channel to extract.
nchans (int): Number of channels to extract.
chanpersub (int, optional): Number of channels in each sub-band, by default nchans.
outfile_base (str, optional): Base name of output files, by default header.basename.
batch_size (int, optional): Number of sub-bands to extract in each batch, by default 200.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Keyword arguments for read_plan().

Returns:

list[str]: Names of all files written to disk.

Raises:

ValueError: If chanpersub is less than 1 or greater than nchans.
ValueError: If nchans is not divisible by chanpersub.
ValueError: If chanstart is out of range (chanstart < 0 or chanstart > total channels).
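The checks listed above determine how the requested channel range is cut into sub-bands. A sketch of that bookkeeping follows; `toy_band_ranges` is a hypothetical helper, and its range check (the whole requested span must fit inside the file) is a stricter assumption than the condition documented above.

```python
def toy_band_ranges(chanstart: int, nchans: int, total_chans: int, chanpersub=None) -> list:
    """Return half-open (first, last + 1) channel ranges, one per sub-band."""
    if chanpersub is None:
        chanpersub = nchans  # default: a single sub-band
    if chanpersub < 1 or chanpersub > nchans:
        raise ValueError("chanpersub must be between 1 and nchans")
    if nchans % chanpersub != 0:
        raise ValueError("nchans must be divisible by chanpersub")
    # Assumption: the full span chanstart..chanstart+nchans must exist.
    if chanstart < 0 or chanstart + nchans > total_chans:
        raise ValueError("requested channels are out of range")
    return [
        (first, first + chanpersub)
        for first in range(chanstart, chanstart + nchans, chanpersub)
    ]


bands = toy_band_ranges(chanstart=8, nchans=16, total_chans=64, chanpersub=4)
```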

requantize(nbits_out, outfile_name=None, *, remove_bandpass=False, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Requantize the data and write to a new file.

Parameters:

nbits_out (int): Number of bits to requantize the data into.
outfile_name (str, optional): Name of output file, by default basename_digi.fil.
remove_bandpass (bool, optional): Remove the bandpass from the data, by default False.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Keyword arguments for read_plan().

Returns:

str: Name of output file.

Raises:

ValueError: If nbits_out is less than 1 or greater than 32.
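For orientation, the sketch below shows a generic linear quantiser that maps values in a chosen range onto the 2**nbits_out available levels and clips anything outside it. This is not the library's scaling scheme, which this page does not describe; `toy_requantize` and its `lo`/`hi` range arguments are hypothetical.

```python
def toy_requantize(data: list, nbits_out: int, lo: float, hi: float) -> list:
    """Map values in [lo, hi] onto integer levels 0 .. 2**nbits_out - 1."""
    if not 1 <= nbits_out <= 32:
        raise ValueError("nbits_out must be between 1 and 32")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    top = 2 ** nbits_out - 1
    scale = top / (hi - lo)
    # Round to the nearest level, then clip values outside [lo, hi].
    return [min(top, max(0, round((v - lo) * scale))) for v in data]


levels = toy_requantize([-1.0, 0.0, 0.4, 1.0, 2.0], nbits_out=2, lo=0.0, hi=1.0)
```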

remove_zerodm(outfile_name=None, gulp=16384, start=0, nsamps=None, *, parallel=False, **plan_kwargs)

Remove zero-DM and write to a new file.

Remove the channel-weighted zero-DM from the data and write to disk.

Parameters:

outfile_name (str, optional): Name of output file, by default basename_noZeroDM.fil.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
parallel (bool, optional): If True, use parallel processing, by default False.
**plan_kwargs (dict): Keyword arguments for read_plan().

Returns:

str: Name of output file.

Notes

Based on the PRESTO implementation of Eatough, Keane & Lyne (2009) [1].

References

[1] R. P. Eatough, E. F. Keane, A. G. Lyne, An interference removal technique for radio pulsar searches, MNRAS, Volume 395, Issue 1, May 2009, Pages 410-415.
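One plausible reading of "channel-weighted zero-DM" removal is sketched below: for each time sample, the total power summed over channels is distributed back to the channels in proportion to the bandpass and subtracted, and the bandpass is then added back so the mean level is preserved. This is an assumption for illustration only; `toy_remove_zerodm` is a hypothetical helper and the library's kernel may differ.

```python
def toy_remove_zerodm(data: list, nchans: int, bandpass: list) -> list:
    """Subtract a bandpass-weighted zero-DM term from every time sample."""
    total = sum(bandpass)
    weights = [b / total for b in bandpass]  # assumed channel weights
    out = []
    for i in range(len(data) // nchans):
        sample = data[i * nchans : (i + 1) * nchans]
        zerodm = sum(sample)  # undispersed (zero-DM) power in this sample
        out.extend(
            v - zerodm * w + b for v, w, b in zip(sample, weights, bandpass)
        )
    return out


bandpass = [1.0, 3.0]
# Sample 0 matches the bandpass; sample 1 carries a broadband burst (x2).
cleaned = toy_remove_zerodm([1.0, 3.0, 2.0, 6.0], nchans=2, bandpass=bandpass)
```

In this toy case the broadband burst in the second sample is removed completely, while the first sample is left unchanged.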

subband(dm, nsub, ref_freq='fch1', outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Subband the data and write to a new file.

Produce a set of dedispersed subbands from the data.

Parameters:

dm (float): The DM of the subbands.
nsub (int): The number of subbands to produce.
ref_freq (str | float, optional): Reference frequency to use for dedispersion, by default “fch1”.
outfile_name (str, optional): Output file name of subbands, by default basename_DM.subbands.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

str: Name of output file.

fold(period, dm, accel=0, nbins=50, nints=32, nbands=32, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Fold the data and return a 3D data cube.

Fold data into discrete phase, subintegration and subband bins.

Parameters:

period (float): Period in seconds to fold with.
dm (float): Dispersion measure to dedisperse to.
accel (float, optional): Acceleration in m/s/s to fold with, by default 0.
nbins (int, optional): Number of phase bins in output, by default 50.
nints (int, optional): Number of subintegrations in output, by default 32.
nbands (int, optional): Number of subbands in output, by default 32.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

FoldedData: 3 dimensional data cube.

Raises:

ValueError: If nbands * nints * nbins is too large.

Notes

If gulp < maximum dispersion delay, gulp is taken to be twice the maximum dispersion delay.
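The core of folding is assigning each sample to a phase bin. The sketch below folds a single time series at a constant period; the subintegration and subband axes of the data cube, dedispersion, and the acceleration term are all omitted. `toy_fold` is a hypothetical helper, not the library routine.

```python
def toy_fold(series: list, tsamp: float, period: float, nbins: int = 50):
    """Accumulate a time series into nbins phase bins at a fixed period."""
    profile = [0.0] * nbins
    counts = [0] * nbins
    for i, value in enumerate(series):
        phase = (i * tsamp / period) % 1.0
        # The modulo guards against a phase that rounds up to exactly 1.0.
        phase_bin = int(phase * nbins) % nbins
        profile[phase_bin] += value
        counts[phase_bin] += 1
    return profile, counts


# A pulse every 4 samples, always at the second sample of the cycle.
series = [1.0 if i % 4 == 1 else 0.0 for i in range(12)]
profile, counts = toy_fold(series, tsamp=1.0, period=4.0, nbins=4)
```

Keeping the per-bin hit counts alongside the sums allows the profile to be normalised when bins receive unequal numbers of samples.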

clean_rfi(method='mad', threshold=3, freq_mask=None, custom_funcn=None, mask_value=None, outfile_name=None, gulp=16384, start=0, nsamps=None, **plan_kwargs)

Clean RFI from the filterbank data and write to a new file.

Parameters:

method (str, optional): Method to use for cleaning (“mad”, “iqrm”), by default “mad”.
threshold (float, optional): Sigma threshold for finding outliers, by default 3.
freq_mask (list[tuple[float, float]], optional): List of frequency ranges to mask, by default None.
custom_funcn (Callable, optional): Custom function to apply to the mask, by default None.
mask_value (float, optional): Value to replace masked channels with, by default median of channels mean.
outfile_name (str, optional): Output file name, by default None.
gulp (int, optional): Number of samples in each read, by default 16384.
start (int, optional): Start sample, by default 0.
nsamps (int, optional): Number of samples to read, by default all.
**plan_kwargs (dict): Additional keyword arguments for read_plan().

Returns:

tuple[str, RFIMask]: Filename and mask of cleaned data.

Raises:

ValueError: If method is not “mad” or “iqrm”.
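For readers unfamiliar with the “mad” option, the sketch below shows a generic median-absolute-deviation outlier test applied to one value per channel. Which channel statistic the library thresholds is not stated on this page, so the input here is simply a hypothetical list of per-channel values, and `toy_mad_mask` is not the library routine.

```python
from statistics import median


def toy_mad_mask(chan_values: list, threshold: float = 3.0) -> list:
    """Flag channels whose value deviates from the median by > threshold sigma."""
    med = median(chan_values)
    mad = median(abs(v - med) for v in chan_values)
    sigma = 1.4826 * mad  # MAD scaled to a Gaussian-equivalent standard deviation
    if sigma == 0:
        return [False] * len(chan_values)  # no spread: nothing to flag
    return [abs(v - med) / sigma > threshold for v in chan_values]


# Hypothetical per-channel statistic; channel 4 is contaminated.
chan_values = [10.0, 10.5, 9.5, 10.2, 50.0, 9.8]
mask = toy_mad_mask(chan_values, threshold=3.0)
```

Because the median and MAD are barely affected by a few extreme channels, the threshold stays meaningful even when strong interference is present.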
