riboraptor package¶

Submodules¶

riboraptor.cli module¶

riboraptor.coherence module¶

riboraptor.coherence.get_periodicity(values, input_is_stream=False)[source]¶

Calculate periodicity with respect to a 1-0-0 signal.

Parameters:	values : array like List of values
Returns:	periodicity : float Periodicity calculated as cross correlation between input and ideal 1-0-0 signal
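A minimal sketch of the idea, assuming that "cross correlation" here means a zero-lag Pearson correlation against a 1-0-0 template of the same length. The names `ideal_signal` and `periodicity_sketch` are hypothetical and not part of riboraptor; the library's own normalisation and lag handling may differ.

```python
import math


def ideal_signal(length):
    """Return a repeating 1-0-0 signal of the requested length."""
    return [1.0 if i % 3 == 0 else 0.0 for i in range(length)]


def periodicity_sketch(values):
    """Zero-lag Pearson correlation between values and a 1-0-0 template.

    Hypothetical helper for illustration only.
    """
    values = [float(v) for v in values]
    ideal = ideal_signal(len(values))
    mean_v = sum(values) / len(values)
    mean_i = sum(ideal) / len(ideal)
    cov = sum((v - mean_v) * (i - mean_i) for v, i in zip(values, ideal))
    norm_v = math.sqrt(sum((v - mean_v) ** 2 for v in values))
    norm_i = math.sqrt(sum((i - mean_i) ** 2 for i in ideal))
    if norm_v == 0 or norm_i == 0:
        # A flat profile carries no periodic information.
        return 0.0
    return cov / (norm_v * norm_i)
```

A profile that is an exact multiple of the template scores 1, while a flat profile scores 0 under this definition.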

riboraptor.coherence.naive_periodicity(values, identify_peak=False)[source]¶

Calculate periodicity in a naive manner

Take the ratio of frame1 counts over avg(frame2+frame3) counts. By default, the first value is treated as the first frame.

Parameters:	values : Series Metagene profile
Returns:	periodicity : float Periodicity
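The ratio described above can be sketched as follows, assuming that avg(frame2+frame3) means the mean of the frame 2 and frame 3 totals. `naive_periodicity_sketch` is a hypothetical name; it ignores the `identify_peak` option of the real function.

```python
def naive_periodicity_sketch(values):
    """Ratio of frame-1 counts to the mean of frame-2 and frame-3 counts.

    The first value is treated as belonging to frame 1.
    """
    frame_totals = [0.0, 0.0, 0.0]
    for position, count in enumerate(values):
        frame_totals[position % 3] += count
    mean_other = (frame_totals[1] + frame_totals[2]) / 2.0
    if mean_other == 0:
        # No signal in frames 2 and 3: the ratio is unbounded or undefined.
        return float("inf") if frame_totals[0] > 0 else float("nan")
    return frame_totals[0] / mean_other
```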

riboraptor.count module¶

Utilities for read counting operations.

riboraptor.count.bam_to_bedgraph(bam, strand=u'both', end_type=u'5prime', saveto=None)[source]¶

Create bedgraph from bam.

Parameters:	bam : str Path to bam file strand : str, optional Use reads mapping to ‘+/-/both’ strands end_type : str Use only the read end given by end_type: ‘5prime’ (5’) or ‘3prime’ (3’) saveto : str, optional Path to write bedgraph
Returns:	genome_cov : str Bedgraph output

riboraptor.count.bedgraph_to_bigwig(bedgraph, sizes, saveto, input_is_stream=False)[source]¶

Convert bedgraph to bigwig.

Parameters:	bedgraph : str Path to bedgraph file sizes : str Path to genome chromosome sizes file or genome name saveto : str Path to write bigwig file input_is_stream : bool True if input is sent through stdin

riboraptor.count.collapse_gene_coverage_to_metagene(gene_coverages, target_length, outfile=None)[source]¶

Collapse gene coverages to specific target length.

Parameters:	gene_coverages : string Path to gene coverages.tsv target_length : int Collapse to target length
Returns:	collapsed_gene_coverage : Series like Collapsed version

riboraptor.count.count_feature_genewise(feature_bed, bam, force_strandedness=False, use_multiprocessing=False)[source]¶

Count features genewise.

Parameters:	bam : str Path to bam file feature_bed : str Path to features bed file
Returns:	counts : dict Genewise feature counts

riboraptor.count.count_reads_bed(bam, region_bed_f, saveto)[source]¶

Count number of reads falling in each region.

Parameters:	bam : str Path to bam file (unique mapping only) region_bed_f : pybedtools.BedTool or str Genomic regions to get distance from prefix : str Prefix to output pickle files
Returns:	counts_by_region : Series Series with counts indexed by gene id region_lengths : Series Series with gene lengths counts_normalized_by_length : Series Series with normalized counts

riboraptor.count.count_reads_in_features(feature_bed, bam, force_strandedness=False, use_multiprocessing=False)[source]¶

Count reads overlapping features.

Parameters:	feature_bed : str Path to features bed file bam : str Path to bam file force_strandedness : bool Should count feature only if on the same strand use_multiprocessing : bool True if multiprocessing mode
Returns:	counts : int Number of intersections between bam and bed

riboraptor.count.count_reads_per_gene(bw, bed, prefix=None, n_cores=16, collapse_intervals=True)[source]¶

Count number of reads falling in each region.

Parameters:

bw : str: Path to bigWig file
bed : pybedtools.BedTool or str: Genomic regions to get distance from
prefix : str: Prefix to output pickle files
n_cores : int: Use multiple cores (Default: 16). Set to 1 to disable multiprocessing
collapse_intervals : bool: Whether the intervals should be collapsed based on the ‘name’ column in gene. Set this to False for things like tRNA, where the tRNA can span multiple chromosomes

Returns:

counts_by_region : Series: Series with counts indexed by gene id
region_lengths : Series: Series with gene lengths
counts_normalized_by_length : Series: Series with normalized counts

riboraptor.count.count_utr5_utr3_cds(bam, utr5_bed=None, cds_bed=None, utr3_bed=None, genome=None, force_strandedness=False, genewise=False, saveto=None, use_multiprocessing=False)[source]¶

One shot counts over UTR5/UTR3/CDS.

Parameters:	bam : str Path to bam file utr5_bed : str Path to 5’UTR feature bed file utr3_bed : str Path to 3’UTR feature bed file cds_bed : str Path to CDS feature bed file saveto : str, optional Path to output file use_multiprocessing : bool Should multiprocessing be used? It has not been well tested whether it really helps
Returns:	counts : dict Dict with keys as feature type and counts as values

riboraptor.count.diff_region_enrichment(numerator, denominator, prefix)[source]¶

Calculate enrichment of counts of one region over another.

Parameters:	numerator : str Path to pickle file denominator : str Path to pickle file prefix : str Prefix to save pickles to
Returns:	enrichment : series

riboraptor.count.export_gene_coverages(bigwig, region_bed_f, saveto, offset_5p=60, offset_3p=0, ignore_tx_version=True)[source]¶

Export all gene coverages.

Parameters:

bigwig : str: Path to bigwig file
region_bed_f : str: Path to region bed file (CDS/3’UTR/5’UTR) with bed name column as gene or a genome name (hg38_utr5, hg38_cds, hg38_utr3)
saveto : str: Path to write output tsv file
offset_5p : int: number of bases to count upstream (5’)
offset_3p : int: number of bases to count downstream (3’)
ignore_tx_version : bool: Should versions be ignored for gene names

Returns:

gene_profiles: file

with the following format:

gene1 5poffset1 3poffset1 length1 mean1 median1 stdev1 cnt1_1 cnt1_2 cnt1_3 …

gene2 5poffset2 3poffset2 length2 mean2 median2 stdev2 cnt2_1 cnt2_2 cnt2_3 cnt2_4 …

riboraptor.count.export_metagene_coverage(bigwig, region_bed_f, max_positions=None, saveto=None, offset_5p=60, offset_3p=0, ignore_tx_version=True)[source]¶

Calculate metagene coverage.

Parameters:

bigwig : str: Path to bigwig file
region_bed_f : str: Path to region bed file (CDS/3’UTR/5’UTR) or a genome name (hg38_utr5, hg38_cds, hg38_utr3)
max_positions: int: Number of positions to consider while calculating the normalized coverage. Higher values are slower to compute
saveto : str: Path to write output tsv file
offset_5p : int: Number of bases to offset upstream(5’)
offset_3p : int: Number of bases to offset downstream(3’)
ignore_tx_version : bool: Should versions be ignored for gene names

Returns:

metagene_profile : series: Metagene profile

riboraptor.count.extract_uniq_mapping_reads(inbam, outbam)[source]¶

Extract only uniquely mapping reads from a bam.

Parameters:	inbam : string Path to input bam file outbam : string Path to write unique reads bam to

riboraptor.count.gene_coverage(gene_name, bed, bw, gene_group=None, offset_5p=0, offset_3p=0, collapse_intervals=True)[source]¶

Get gene coverage.

Parameters:	gene_name : str Gene name bed : str Path to CDS or 5’UTR or 3’UTR bed bw : str Path to bigwig to fetch the scores from offset_5p : int (positive) Number of bases to count upstream (5’) offset_3p : int (positive) Number of bases to count downstream (3’) collapse_intervals : bool Should bed be collapsed based on gene name
Returns:	coverage_combined : series Series with index as position and value as coverage intervals_for_fasta_read : list List of tuples index_to_genomic_pos_map : series gene_offset : int Gene wise offsets

riboraptor.count.gene_coverage_sum(gene_name, bed, bw, collapse_intervals=True)[source]¶

Keep track of only the sum

Parameters:	gene_name : str Name of gene bed : str Path to bed file bw : str Path to bigwig file collapse_intervals : bool Whether the intervals should be collapsed based on the ‘name’ column in gene. Set this to False for things like tRNA, where the tRNA can span multiple chromosomes

riboraptor.count.get_fasta_sequence(fasta, intervals)[source]¶

Extract fasta sequence given a list of intervals.

Parameters:	fasta : str Path to fasta file intervals : list(tuple) A list of tuple in the form [(chrom, start, stop, strand)]
Returns:	seq : list List of sequences at intervals

riboraptor.count.get_region_sizes(bed)[source]¶

Get collapsed lengths of gene in bed.

Parameters:	bed : str Path to bed file
Returns:	region_sizes : dict Region sizes with gene names as key and value as size of this named region

riboraptor.count.htseq_to_cpm(htseq_f, saveto=None)[source]¶

Convert HTSeq counts to CPM.

Parameters:	htseq_f : str Path to HTseq counts file saveto : str, optional Path to output file
Returns:	cpm : dataframe CPM

riboraptor.count.htseq_to_tpm(htseq_f, cds_bed_f, saveto=None)[source]¶

Convert HTSeq counts to TPM.

Parameters:	htseq_f : str Path to HTseq counts file region_sizes : dict Dict with keys as gene and values as length (CDS/Exon) of that gene saveto : str, optional Path to output file
Returns:	tpm : dataframe TPM
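The two conversions can be sketched with the standard CPM and TPM definitions. This is an illustration under assumptions rather than the library's code: `counts_to_cpm` and `counts_to_tpm` are hypothetical names, counts are passed as plain dicts instead of HTSeq files, and rows whose names start with `__` (the summary counters that htseq-count appends) are dropped.

```python
def counts_to_cpm(counts):
    """Scale raw counts so that they sum to one million."""
    # Drop htseq-count summary rows such as "__no_feature".
    genes = {g: c for g, c in counts.items() if not g.startswith("__")}
    total = float(sum(genes.values()))
    return {g: (c * 1e6 / total if total else 0.0) for g, c in genes.items()}


def counts_to_tpm(counts, region_sizes):
    """Length-normalise counts first, then scale the rates to one million."""
    rates = {
        g: counts[g] / float(region_sizes[g])
        for g in counts
        # Skip summary rows and genes with unknown or zero length.
        if not g.startswith("__") and region_sizes.get(g)
    }
    total = sum(rates.values())
    return {g: (r * 1e6 / total if total else 0.0) for g, r in rates.items()}
```

With equal raw counts, a gene half as long ends up with twice the TPM, which is the point of normalising by length before scaling.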

riboraptor.count.interval_coverage(bw, intervals)[source]¶

Get coverage at custom intervals

Parameters:	bw : str Path to bigwig file intervals : list of tuples [(chrom, start, stop, strand)]
Returns:	coverage : list of series Coverage for each interval, sorted orientation-wise

riboraptor.count.mapping_reads_summary(bam, prefix)[source]¶

Count number of mapped reads.

Parameters:	bam : str Path to bam file prefix : str Prefix to save pickle to (optional)
Returns:	counts : counter Counter with keys as number of times read maps and values as number of reads of that type

riboraptor.count.pickle_bed_file(bed, collapse_intervals=True)[source]¶

Create a lookup pickle file for genewise CDS/UTR coordinates.

In order to prevent recalculating the coordinates that should be fetched for each genes’ CDS or UTR regions, they can be stored in a pickle file.

Parameters:	bed : string Path to bed file collapse_intervals : bool Whether the intervals should be collapsed based on the ‘name’ column in gene. Set this to False for things like tRNA, where the tRNA can span multiple chromosomes

riboraptor.count.read_enrichment(read_lengths, enrichment_range=[28, 29, 30, 31, 32], input_is_stream=False, input_is_file=False)[source]¶

Calculate read enrichment for a certain range of lengths

Parameters:	read_lengths : Counter A counter with read lengths and their counts enrichment_range : range or str Range of reads to concentrate upon (28-32 or range(28,33)) input_is_stream : bool True if input is sent through stdin
Returns:	ratio : float Enrichment in this range (Scale 0-1)
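One simple reading of an enrichment on a 0-1 scale is the fraction of reads whose length falls inside the range. The sketch below uses that reading as an assumption; `enrichment_fraction` is a hypothetical name and the library may compute its ratio differently.

```python
from collections import Counter


def enrichment_fraction(read_lengths, enrichment_range=range(28, 33)):
    """Fraction of reads whose length lies in enrichment_range (0 to 1)."""
    total = sum(read_lengths.values())
    if total == 0:
        return 0.0
    in_range = sum(
        count for length, count in read_lengths.items()
        if length in enrichment_range
    )
    return in_range / float(total)


lengths = Counter({28: 50, 30: 30, 35: 20})
fraction = enrichment_fraction(lengths)
```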

riboraptor.count.read_htseq(htseq_f)[source]¶

Read HTSeq file.

Parameters:	htseq_f : str Path to htseq counts file
Returns:	htseq_df : dataframe HTseq counts as in a dataframe

riboraptor.count.read_length_distribution(bam, saveto)[source]¶

Count read lengths.

Parameters:	bam : str Path to bam file saveto : str Path to write output tsv file
Returns:	lengths : counter Counter of read length and counts

riboraptor.count.unique_mapping_reads_count(bam)[source]¶

Count number of mapped reads.

Parameters:	bam : str Path to bam file
Returns:	n_mapped : int Count of mapped reads

riboraptor.download module¶

Utilities to download data from NCBI SRA

riboraptor.download.run_download_sra_script(download_root_location=None, ascp_key_path=None, srp_id_file=None, srp_id_list=None)[source]¶

Download data from SRA.

Parameters:	download_root_location : string Path to download SRA files ascp_key_path : string Location for aspera private key srp_id_list : list List of SRP ids for download srp_id_file : string File containing list of SRP Ids, one per line

riboraptor.dtw module¶

riboraptor.dtw.dtw(X, Y, metric=u'euclidean', ddtw=False, ddtw_order=1)[source]¶

Parameters:

X : array_like: M x D matrix
Y : array_like: N x D matrix
metric : string: The distance metric to use. Can be : ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’. See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
ddtw : bool: Should use derivative DTW where the distance matrix is created using the derivative values at each point rather than the points themselves
ddtw_order : int [1,2]: First order uses the one-difference method; second order uses np.gradient for an approximation up to second order

Returns:

total_cost : float: Total (minimum) cost of warping
pointwise_cost : array_like: M x N matrix with cost at each (i, j)
accumulated_cost : array_like: M x N matrix with (minimum) cost accumulated till (i,j) having started from (0, 0)

riboraptor.dtw.get_path(D)[source]¶

Traceback path of minimum cost

Given accumulated cost matrix D, trace back the minimum cost path

Parameters:	D : array_like M x N matrix as obtained from accumulated_cost using: total_cost, pointwise_cost, accumulated_cost = dtw(X, Y, metric=’euclidean’)
Returns:	traceback_x, traceback_y : array_like M x 1 and N x 1 array containing indices of movement starting from (0, 0) going to (M-1, N-1)
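The accumulation and traceback steps can be illustrated with a pure-Python sketch of classic DTW. It is restricted to 1-D sequences with an absolute-difference cost, whereas the documented function accepts M x D matrices and SciPy metrics; `dtw_sketch` and `traceback_sketch` are hypothetical names and do not reproduce the ddtw options.

```python
def dtw_sketch(x, y):
    """Classic DTW for two 1-D sequences with absolute-difference cost."""
    m, n = len(x), len(y)
    pointwise = [[abs(a - b) for b in y] for a in x]
    acc = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                best_prev = 0.0
            elif i == 0:
                best_prev = acc[0][j - 1]
            elif j == 0:
                best_prev = acc[i - 1][0]
            else:
                best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = pointwise[i][j] + best_prev
    return acc[m - 1][n - 1], pointwise, acc


def traceback_sketch(acc):
    """Walk back from (M-1, N-1) to (0, 0) along the cheapest predecessors."""
    i, j = len(acc) - 1, len(acc[0]) - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        if i == 0:
            j -= 1
        elif j == 0:
            i -= 1
        else:
            # Ties are broken in favour of the diagonal step.
            steps = [
                (acc[i - 1][j - 1], i - 1, j - 1),
                (acc[i - 1][j], i - 1, j),
                (acc[i][j - 1], i, j - 1),
            ]
            _, i, j = min(steps)
        path.append((i, j))
    path.reverse()
    return [p[0] for p in path], [p[1] for p in path]


total_cost, pointwise_cost, accumulated_cost = dtw_sketch([1, 2, 3], [1, 2, 2, 3])
path_x, path_y = traceback_sketch(accumulated_cost)
```

In this example the second sequence repeats the value 2, so the path stays on index 1 of the first sequence for two steps and the total cost is zero.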

riboraptor.dtw.plot_warped_timeseries(x, y, pointwise_cost, accumulated_cost, path, colormap=pyplot.cm.Blues, linecolor=u'#D55E00')[source]¶

riboraptor.fasta module¶

riboraptor.fasta.complete_gene_fasta(utr5_bed_f, cds_bed_f, utr3_bed_f, fasta_f, prefix)[source]¶

Merge 5’UTR, CDS and 3’UTR coordinates to get one fasta.

Parameters:	utr5_bed : str Path to 5’UTR bed cds_bed : str Path to CDS bed utr3_bed : str Path to 3’UTR bed

riboraptor.fasta.export_all_fasta(region_bed_f, chrom_sizes, fasta, prefix, offset_5p=60, offset_3p=0, ignore_tx_version=True)[source]¶

Export all gene coverages.

Parameters:	region_bed_f : str Path to region bed file (CDS/3’UTR/5’UTR) with bed name column as gene chrom_sizes : str Path to chrom.sizes file prefix : str Prefix to write output file offset_5p : int number of bases to count upstream (5’) offset_3p : int number of bases to count downstream (3’) ignore_tx_version : bool Should versions be ignored for gene names

riboraptor.fasta.export_fasta_from_bed(gene_name, bed, chrom_sizes, fasta_f, gene_group=None, offset_5p=0, offset_3p=0)[source]¶

Extract fasta genewise given coordinates in bed file

Parameters:	gene_name : str Gene name bed : str Path to CDS or 5’UTR or 3’UTR bed fasta_f : str Path to fasta file chrom_sizes : str Path to chrom.sizes file offset_5p : int (positive) Number of bases to count upstream (5’) offset_3p : int (positive) Number of bases to count downstream (3’)
Returns:	gene_offset : int Gene wise offsets

riboraptor.fasta.get_fasta_sequence(fasta_f, intervals)[source]¶

Extract fasta sequence given a list of intervals.

Parameters:	fasta_f : str Path to fasta file intervals : list(tuple) A list of tuple in the form [(chrom, start, stop, strand)] NOTE: 1-based start and stop only!
Returns:	seq : list List of sequences at intervals

riboraptor.genome module¶

riboraptor.helpers module¶

All functions that are not so useful, but still useful.

riboraptor.helpers.check_file_exists(filepath)[source]¶

Check if file exists.

Parameters:	filepath : str Path to file

riboraptor.helpers.codon_to_anticodon(codon)[source]¶

Codon to anticodon.

Parameters:	codon : string Input codon

riboraptor.helpers.collapse_bed_intervals(intervals, chromosome_lengths=None, offset_5p=0, offset_3p=0)[source]¶

Collapse intervals in a non-overlapping manner.

NOTE/TODO: This function has a subtle bug: it will be offset by 1 position when the gene is on the negative strand. So essentially, if you have a CDS on the negative strand, the first position should be discarded. Similarly, for the last position of a gene on the + strand, you have an extra position at the end.

Parameters:

intervals : list of tuples: Like [(‘chr1’, 310, 320, ‘+’), (‘chr1’, 321, 330, ‘+’)]
chromosome_lengths : dict: A map of each chromosome’s length. Only used with offset_3p, offset_5p>0
offset_5p : int (positive): Number of bases to count upstream (5’)
offset_3p : int (positive): Number of bases to count downstream (3’)

Returns:

interval_combined : list of tuples: A collapsed version of interval. This is useful when the annotations are overlapping. Example: ‘chr1 310 320 gene1 +’ and ‘chr1 319 324 gene1 +’ collapse to ‘chr1 310 324 gene1 +’
intervals_for_fasta_read : list of tuples: This list can be used to directly fetch fasta from pyfaidx. NOTE: DO NOT do offset adjustments as they are already adjusted for pyfaidx format (1-based for both start and end)
gene_offset_5p, gene_offset_3p : int: Gene wise offsets. These might be different from offset_5p in cases where offset_5p leads to a negative coordinate
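The collapsing step on its own can be sketched as a standard sort-and-merge over interval tuples. This is a simplified illustration: `merge_intervals` is a hypothetical name, and it leaves out the 5’/3’ offsets, the chromosome-length clipping and the strand-specific off-by-one caveat described above.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end, strand) tuples."""
    merged = []
    for chrom, start, end, strand in sorted(intervals):
        last = merged[-1] if merged else None
        overlaps = (
            last is not None
            and last[0] == chrom
            and last[3] == strand
            and start <= last[2]
        )
        if overlaps:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chrom, last[1], max(last[2], end), strand)
        else:
            merged.append((chrom, start, end, strand))
    return merged


collapsed = merge_intervals([("chr1", 319, 324, "+"), ("chr1", 310, 320, "+")])
```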

riboraptor.helpers.create_ideal_periodic_signal(signal_length)[source]¶

Create ideal ribo-seq signal.

Parameters:	signal_length : int Length of signal to create
Returns:	signal : array_like 1-0-0 signal

riboraptor.helpers.get_strandedness(filepath)[source]¶

Parse output of infer_experiment.py from RSeQC to get strandedness.

Parameters:	filepath : str Path to infer_experiment.py output
Returns:	strandedness : str reverse or forward or none

riboraptor.helpers.identify_peaks(coverage)[source]¶

Given coverage array, find the site of maximum density

riboraptor.helpers.list_to_ranges(list_of_int)[source]¶

Convert a list to a list of range objects

Parameters:	list_of_int: list List of integers to be squeezed into range
Returns:	list_of_range: list List of range objects
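A small sketch of squeezing integers into ranges, assuming that consecutive integers are grouped into one `range` each and that duplicates and input order do not matter. `list_to_ranges_sketch` is a hypothetical name, not the library function.

```python
def list_to_ranges_sketch(list_of_int):
    """Group consecutive integers into range objects."""
    ranges = []
    run_start = previous = None
    for value in sorted(set(list_of_int)):
        if run_start is None:
            run_start = previous = value
        elif value == previous + 1:
            previous = value
        else:
            # The run ended: close it and start a new one.
            ranges.append(range(run_start, previous + 1))
            run_start = previous = value
    if run_start is not None:
        ranges.append(range(run_start, previous + 1))
    return ranges
```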

riboraptor.helpers.load_pickle(filepath)[source]¶

Read pickled files easily in Python 2/3

riboraptor.helpers.millify(n)[source]¶

Convert integer to human readable format.

Parameters:	n : int
Returns:	millidx : str Formatted integer
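A sketch of one common way to produce such labels, by repeatedly dividing by 1000 and appending a suffix. The suffix set and the one-decimal formatting are assumptions, and `millify_sketch` is a hypothetical name; the library's exact output format may differ.

```python
def millify_sketch(n):
    """Format a number with K/M/B/T suffixes, one decimal place."""
    suffixes = ["", "K", "M", "B", "T"]
    value = float(n)
    index = 0
    while abs(value) >= 1000 and index < len(suffixes) - 1:
        value /= 1000.0
        index += 1
    return "{:.1f}{}".format(value, suffixes[index])
```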

riboraptor.helpers.mkdir_p(path)[source]¶

Python version of mkdir -p

Parameters:	path : str

riboraptor.helpers.pad_five_prime_or_truncate(some_list, offset_5p, target_len)[source]¶

Pad first the 5prime end and then the 3prime end or truncate

Parameters:	some_list : list Input list offset_5p : int 5’ offset target_len : int Final length of list. If being extended, returns list padded with NAs.

riboraptor.helpers.pad_or_truncate(some_list, target_len)[source]¶

Pad or truncate a list up to given target length

Parameters:	some_list : list Input list target_len : int Final length of list. If being extended, returns list padded with NAs.
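The behaviour described above can be sketched in two lines, assuming that NA is represented by a float NaN. `pad_or_truncate_sketch` is a hypothetical name used only for this illustration.

```python
def pad_or_truncate_sketch(some_list, target_len):
    """Return a list of exactly target_len items, padding with NaN."""
    padding = [float("nan")] * max(0, target_len - len(some_list))
    return list(some_list[:target_len]) + padding


truncated = pad_or_truncate_sketch([1, 2, 3], 2)
padded = pad_or_truncate_sketch([1, 2], 4)
```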

riboraptor.helpers.parse_star_logs(infile, outfile=None)[source]¶

Parse star logs into a dict

Parameters:	infile : str Path to starlogs.final.out file
Returns:	star_info : dict Dict with necessary records parsed

riboraptor.helpers.path_leaf(path)[source]¶

Get path’s tail from a filepath

riboraptor.helpers.r2(x, y)[source]¶

Calculate Pearson correlation between two vectors.

Parameters:	x : array_like Input y : array_like Input

riboraptor.helpers.round_to_nearest(x, base=5)[source]¶

Round to nearest base.

Parameters:	x : float Input
Returns:	v : int Output
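Rounding to the nearest multiple of a base is usually done by dividing, rounding and multiplying back. The sketch below assumes that approach; `round_to_nearest_sketch` is a hypothetical name, and Python 3's `round` resolves exact halves to the even neighbour.

```python
def round_to_nearest_sketch(x, base=5):
    """Round x to the nearest multiple of base and return an int."""
    return int(base * round(float(x) / base))
```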

riboraptor.helpers.set_xrotation(ax, degrees)[source]¶

Rotate labels on x-axis.

Parameters:	ax : matplotlib.Axes Axes object degrees : int Rotation degrees

riboraptor.helpers.summarize_counters(samplewise_dict)[source]¶

Summarize gene counts for a collection of samples.

Parameters:	samplewise_dict : dict A dictionary with key as sample name and value as another dictionary of counts for each gene
Returns:	totals : dict A dictionary with key as sample name and value as total gene count

riboraptor.helpers.summary_stats_two_arrays_welch(old_mean_array, new_array, old_var_array=None, old_n_counter=None, carried_forward_observations=None)[source]¶

Average two arrays using Welch’s method

Parameters:

old_mean_array : Series: Series of previous means with index as positions
old_var_array : Series: Series of previous variances with index as positions
new_array : array like: Series of new observations
old_n_counter : Series: Counts of number of positions at a certain index

Returns:

m : array like: Column wise Mean array
var : array like: Column wise variance

Consider an example: [1,2,3], [1,2,3,4], [1,2,3,4,5]

old = [1,2,3]
new = [1,2,3,4]
counter = [1,1,1]
mean = [1,2,3,4]
var = [na, na, na, na]
carried_forward = [[1,1], [2,2], [3,3], [4]]

old = [1,2,3,4]
new = [1,2,3,4,5]
counter = [2,2,2,1]
mean = [1,2,3,4,5]
var = [0,0,0, na, na]
carried_forward = [[], [], [], [4,4], [5]]
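For comparison, a position-wise running mean and variance over profiles of different lengths can be kept with the standard online update usually attributed to Welford. This is a sketch of that general technique, not the library's implementation: it does not use the carried_forward bookkeeping shown above, so its variance reports 0 for a position as soon as two equal observations exist. All names are hypothetical.

```python
def update_running_stats(means, m2s, counts, new_values):
    """Apply one Welford update per position; lists grow as needed."""
    for position, value in enumerate(new_values):
        if position >= len(means):
            means.append(0.0)
            m2s.append(0.0)
            counts.append(0)
        counts[position] += 1
        delta = value - means[position]
        means[position] += delta / counts[position]
        # m2 accumulates the sum of squared deviations from the mean.
        m2s[position] += delta * (value - means[position])
    return means, m2s, counts


def sample_variances(m2s, counts):
    """Unbiased variance per position; NaN where only one value was seen."""
    return [
        m2 / (n - 1) if n > 1 else float("nan")
        for m2, n in zip(m2s, counts)
    ]


means, m2s, counts = [], [], []
for profile in ([1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]):
    update_running_stats(means, m2s, counts, profile)
variances = sample_variances(m2s, counts)
```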

riboraptor.normalization module¶

riboraptor.normalization.deseq2_normalization(list_of_profiles)[source]¶

Perform DESeq2-like normalization of position-specific scores

Parameters:	list_of_profiles: array-like array of profiles across samples for one gene
Returns:	normalized_profiles: array-like array of profiles across samples
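DESeq2's size-factor estimation is the median-of-ratios method: each sample is divided by the median ratio of its values to the per-position geometric mean across samples. The sketch below applies that idea to profiles, assuming positions containing a zero in any sample are skipped when estimating factors. `size_factors` and `normalize_profiles` are hypothetical names.

```python
import math
import statistics


def size_factors(profiles):
    """Median-of-ratios size factor for each profile (sample)."""
    log_geo_means = {}
    for pos in range(len(profiles[0])):
        column = [profile[pos] for profile in profiles]
        # Positions with a zero have no finite log; leave them out.
        if all(value > 0 for value in column):
            log_geo_means[pos] = sum(math.log(v) for v in column) / len(column)
    factors = []
    for profile in profiles:
        log_ratios = [
            math.log(profile[pos]) - log_mean
            for pos, log_mean in log_geo_means.items()
        ]
        # statistics.median raises StatisticsError if no position qualified.
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors


def normalize_profiles(profiles):
    """Divide every profile by its size factor."""
    factors = size_factors(profiles)
    return [[v / f for v in profile] for profile, f in zip(profiles, factors)]


factors = size_factors([[1, 2, 4], [2, 4, 8]])
normalized = normalize_profiles([[1, 2, 4], [2, 4, 8]])
```

Two profiles that differ only by a constant scale become identical after normalization, which is a quick sanity check for the method.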

riboraptor.plotting module¶

Plotting methods.

riboraptor.plotting.create_wavelet(data, ax)[source]¶

riboraptor.plotting.plot_featurewise_barplot(utr5_counts, cds_counts, utr3_counts, ax=None, saveto=None, **kwargs)[source]¶

Plot barplots for 5’UTR/CDS/3’UTR counts.

Parameters:

utr5_counts : int or dict: Total number of reads in 5’UTR region or alternatively a dictionary/series with genes as key and 5’UTR counts as values
cds_counts : int or dict: Total number of reads in CDS region or alternatively a dictionary/series with genes as key and CDS counts as values
utr3_counts : int or dict: Total number of reads in 3’UTR region or alternatively a dictionary/series with genes as key and 3’UTR counts as values
saveto : str: Path to save output file to (<filename>.png/<filename>.pdf)

riboraptor.plotting.plot_framewise_counts(counts, frames_to_plot=u'all', ax=None, title=None, millify_labels=False, position_range=None, saveto=None, ascii=False, input_is_stream=False, **kwargs)[source]¶

Plot framewise distribution of reads.

Parameters:	counts : Series A series with position as index and value as counts frames_to_plot : str or range A comma separated list of frames to highlight or a range ax : matplotlib.Axes Default none saveto : str Path to save output file to (<filename>.png/<filename>.pdf)

riboraptor.plotting.plot_read_counts(counts, ax=None, marker=None, color=u'royalblue', title=None, label=None, millify_labels=False, identify_peak=True, saveto=None, position_range=None, ascii=False, input_is_stream=False, ylabel=u'Normalized RPF density', **kwargs)[source]¶

Plot RPF density around start/stop codons.

Parameters:

counts : Series/Counter: A series with coordinates as index and counts as values
ax : matplotlib.Axes: Axis to create object on
marker : string: ‘o’/’x’
color : string: Line color
label : string: Label (useful only if plotting multiple objects on same axes)
millify_labels : bool: True if labels should be formatted to read millions/trillions etc
saveto : str: Path to save output file to (<filename>.png/<filename>.pdf)

riboraptor.plotting.plot_read_length_dist(read_lengths, ax=None, millify_labels=True, input_is_stream=False, title=None, saveto=None, ascii=False, **kwargs)[source]¶

Plot read length distribution.

Parameters:	read_lengths : array_like Array of read lengths ax : matplotlib.Axes Axis object millify_labels : bool True if labels should be formatted to read millions/trillions etc input_is_stream : bool True if input is sent through stdin saveto : str Path to save output file to (<filename>.png/<filename>.pdf)

riboraptor.plotting.setup_axis(ax, axis=u'x', majorticks=5, minorticks=1, xrotation=45, yrotation=0)[source]¶

Setup axes defaults

Parameters:	ax : matplotlib.Axes axis : str Setup ‘x’ or ‘y’ axis majorticks : int Length of interval between two major ticks minorticks : int Length of interval between two minor ticks xrotation : int Rotate x axis labels by xrotation degrees yrotation : int Rotate y axis labels by yrotation degrees

riboraptor.plotting.setup_plot()[source]¶

Setup plotting defaults

riboraptor.statistics module¶

riboraptor.statistics.KDE(values)[source]¶

Perform Univariate Kernel Density Estimation.

Wrapper utility around statsmodels for quick KDE.

TODO: scikit-learn has a faster implementation (?)

Parameters:	values : array like
Returns:	support : array_like cdf : array_like

riboraptor.statistics.KS_test(a, b)[source]¶

Perform KS test between a and b values

Parameters:	a, b : array-like Input
Returns:	D : int KS D statistic effect_size : float Maximum difference at point of D-statistic cdf_a, cdf_b : float CDF of a, b

Note: By default this method tests with alternative=lesser, implying that the test will reject H0 when the CDF of b is ‘above’ a

riboraptor.statistics.calculate_cdf(data)[source]¶

Calculate CDF given data points

Parameters:	data : array-like Input values
Returns:	cdf : series Cumulative distribution function calculated at indexed points
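An empirical CDF can be sketched by sorting the data and recording, for each distinct value, the fraction of points at or below it. `empirical_cdf` is a hypothetical name and returns a plain dict rather than the series the library returns.

```python
def empirical_cdf(data):
    """Map each distinct value to the fraction of points <= that value."""
    ordered = sorted(data)
    n = float(len(ordered))
    cdf = {}
    for rank, value in enumerate(ordered, start=1):
        # Later duplicates overwrite earlier ones, keeping the highest rank.
        cdf[value] = rank / n
    return cdf


cdf = empirical_cdf([3, 1, 2, 2])
```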

riboraptor.statistics.series_cdf(series)[source]¶

Calculate cdf of series preserving the index

Parameters:	series : series like
Returns:	cdf : series

riboraptor.utils module¶

riboraptor.utils.determine_cell_type(sample_attribute)[source]¶

riboraptor.utils.get_cell_line_or_tissue(row)[source]¶

riboraptor.utils.get_enrichment_cds_stats(pickle_file)[source]¶

riboraptor.utils.get_fragment_enrichment_score(txt_file)[source]¶

riboraptor.utils.get_strain_type(sample_attribute)[source]¶

riboraptor.utils.get_tissue_type(sample_attribute)[source]¶

riboraptor.utils.load_tpm(path)[source]¶

riboraptor.utils.summary_starlogs_over_runs(directory, list_of_srr)[source]¶

riboraptor.wig module¶

class riboraptor.wig.WigReader(wig_location)[source]¶

Bases: object

Class for reading and querying wigfiles.

get_chromosomes¶

Return list of chromosomes and their sizes as in the wig file.

Returns:	chroms : dict Dictionary with {“chr”: “Length”} format

query(intervals)[source]¶

Query regions for scores.

Parameters:	intervals : list(tuple) A list of tuples with the following format: (chr, chrStart, chrEnd, strand)
Returns:	scores : array_like A numpy array containing scores for each tuple

Module contents¶