thapbi_pict.summary module

Summarise classification results at sample and read level.

This implements the thapbi_pict summary ... command.

The code uses the term metadata for the user-provided information about each sample (supplied as a plain text TSV table), and the term statistics for the internally tracked information about each sample, such as the number of raw reads in the original FASTQ files (recorded in header lines of the intermediate FASTA files).
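To make the distinction concrete, the following minimal sketch loads metadata from a TSV table and statistics from FASTA header lines. It is illustrative only: the `#key:value` header layout, the `raw_fastq` and `marker` field names, the column names, and both helper functions are assumptions made for this example, not the module's actual API.

```python
import csv
import io


def load_metadata(tsv_handle, index_col=0):
    """Map each sample name (taken from the index column) to its row of metadata."""
    reader = csv.reader(tsv_handle, delimiter="\t")
    header = next(reader)
    return {row[index_col]: dict(zip(header, row)) for row in reader if row}


def load_statistics(fasta_handle):
    """Collect key/value pairs from leading '#key:value' header lines (assumed format)."""
    stats = {}
    for line in fasta_handle:
        if not line.startswith("#"):
            break  # header lines end at the first sequence record
        key, _, value = line[1:].strip().partition(":")
        stats[key] = int(value) if value.isdigit() else value
    return stats


# Hypothetical inputs: a two-sample metadata table and one intermediate FASTA file.
metadata = load_metadata(io.StringIO("sample\tsite\nS1\tForest\nS2\tNursery\n"))
statistics = load_statistics(
    io.StringIO("#raw_fastq:1200\n#marker:ITS1\n>seq1_50\nACGT\n")
)
print(metadata["S1"]["site"], statistics["raw_fastq"])
```

In this sketch the metadata comes entirely from the user's table, while the statistics are read back from values recorded earlier in the pipeline.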

thapbi_pict.summary.color_bands(meta_groups, sample_color_bands, default_fmt=None, debug: bool = False) → list

Return a list of formats, one for each sample.
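One plausible way to produce such a list is to move to the next band format whenever the metadata group changes between consecutive samples. The sketch below shows that idea under stated assumptions: the function name `band_formats`, its parameters, and the plain string "formats" are hypothetical, and the real function may apply further rules (for example, falling back to the default format in some cases).

```python
def band_formats(meta_groups, band_cycle, default_fmt=None):
    """Return one format per sample, switching band whenever the group changes."""
    if not band_cycle:
        # No band formats to cycle through, so every sample gets the default.
        return [default_fmt] * len(meta_groups)
    formats = []
    band = -1
    previous = object()  # sentinel that never equals a real group value
    for group in meta_groups:
        if group != previous:
            band = (band + 1) % len(band_cycle)
            previous = group
        formats.append(band_cycle[band])
    return formats


# Hypothetical example: five samples in three groups, two alternating bands.
bands = band_formats(["A", "A", "B", "C", "C"], ["grey", "white"])
print(bands)  # ['grey', 'grey', 'white', 'grey', 'grey']
```

Consecutive samples sharing a group value get the same format, so each group appears as a visually distinct band in the report.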

thapbi_pict.summary.main(inputs, report_stem: str, method: str, min_abundance: int = 1, metadata_file: str | None = None, metadata_encoding: str | None = None, metadata_cols: str | None = None, metadata_groups: str | None = None, metadata_fieldnames: str | None = None, metadata_index: str | None = None, require_metadata: bool = False, show_unsequenced: bool = True, ignore_prefixes: tuple[str] | None = None, biom: bool = False, debug: bool = False) → int

Implement the thapbi_pict summary command.

The expectation is that the inputs represent all the samples from a meaningful group, likely from multiple sequencing runs (plates).

thapbi_pict.summary.read_summary(markers, marker_md5_to_seq, marker_md5_species, marker_md5_abundance, abundance_by_samples, stem_to_meta, meta_names, group_col, sample_stats, stats_fields, output, method, min_abundance=1, excel=None, biom=None, debug=False) → None

Create reads (rows) vs species (cols) report.

The expectation is that the inputs represent all the samples from one (96 well) plate, or some other meaningful batch.

thapbi_pict.summary.sample_summary(sample_species_counts, meta_to_stem, stem_to_meta, meta_names, group_col, sample_stats, stats_fields, show_unsequenced, output, excel, method, min_abundance=1, debug=False)

Create samples (rows) vs species (cols) report.

The expectation is that the inputs represent all the samples from a meaningful group, likely from multiple sequencing runs (plates).
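The samples (rows) vs species (cols) layout can be sketched as follows. This is a simplified illustration, not the module's implementation: it assumes `sample_species_counts` is a nested dictionary of the form `{sample: {species: count}}`, the helper `write_sample_table` is hypothetical, and it omits the metadata and statistics columns that the real report adds.

```python
import io


def write_sample_table(sample_species_counts, output):
    """Write a TSV table with one row per sample and one column per species."""
    species = sorted(
        {sp for counts in sample_species_counts.values() for sp in counts}
    )
    output.write("\t".join(["Sample"] + species) + "\n")
    for sample in sorted(sample_species_counts):
        counts = sample_species_counts[sample]
        # Species not seen in this sample are reported as zero reads.
        row = [sample] + [str(counts.get(sp, 0)) for sp in species]
        output.write("\t".join(row) + "\n")


# Hypothetical read counts for two samples.
buffer = io.StringIO()
write_sample_table(
    {
        "S1": {"Phytophthora ramorum": 120},
        "S2": {"Phytophthora ramorum": 5, "Phytophthora cactorum": 40},
    },
    buffer,
)
table = buffer.getvalue()
print(table)
```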