thapbi_pict.sample_tally module

Prepare a non-redundant TSV file using MD5 naming.

This implements the thapbi_pict sample-tally ... command.
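As an illustration of what "non-redundant" and "MD5 naming" can mean here, the following minimal sketch collapses identical sequences and names each unique sequence by the MD5 checksum of its upper-case text. The helper names md5_name and tally_unique are hypothetical and not part of thapbi_pict, and upper-casing before hashing is an assumption.

```python
import hashlib
from collections import Counter


def md5_name(seq: str) -> str:
    """Return the 32-character hex MD5 digest of the upper-cased sequence.

    Hypothetical helper; upper-casing before hashing is an assumption.
    """
    return hashlib.md5(seq.upper().encode("ascii")).hexdigest()


def tally_unique(reads):
    """Collapse identical sequences into {md5: (sequence, count)}.

    Hypothetical helper showing one way to build a non-redundant tally.
    """
    counts = Counter(seq.upper() for seq in reads)
    return {md5_name(seq): (seq, n) for seq, n in counts.items()}
```

Naming by checksum gives every unique sequence the same identifier regardless of which sample it came from, so per-sample counts can share one row per sequence.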

thapbi_pict.sample_tally.main(inputs: str | list[str], synthetic_controls: list[str], negative_controls: list[str], output: str, session, marker: str | None = None, spike_genus=None, fasta=None, min_abundance: int = 100, min_abundance_fraction: float = 0.001, total_min_abundance: int = 0, min_length: int = 0, max_length: int = 9223372036854775807, denoise_algorithm: str = '-', unoise_alpha: float = 2.0, unoise_gamma: int = 4, gzipped: bool = False, tmp_dir: str | None = None, debug: bool = False, cpu: int = 0) → None

Implement the thapbi_pict sample-tally command.

Arguments min_length and max_length are applied while loading the input per-sample FASTA files.
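A length filter of this kind might look like the sketch below. The function within_length is hypothetical, and treating both bounds as inclusive is an assumption. The default max_length of 9223372036854775807 in the signature equals sys.maxsize on 64-bit platforms, so by default there is effectively no upper limit.

```python
import sys


def within_length(seq: str, min_length: int = 0, max_length: int = sys.maxsize) -> bool:
    """Return True if the sequence length lies within the given bounds.

    Hypothetical helper; inclusive bounds are an assumption.
    """
    return min_length <= len(seq) <= max_length


# Example: keep only sequences of 4 to 6 bases while "loading" a sample.
reads = ["ACG", "ACGT", "ACGTAC", "ACGTACGT"]
kept = [seq for seq in reads if within_length(seq, min_length=4, max_length=6)]
```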

Argument denoise_algorithm is a string: “-” for no read correction (denoising), “unoise-l” for our reimplementation of the UNOISE2 algorithm, or “usearch” or “vsearch” to invoke those tools at the command line.
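For orientation, the published UNOISE2 method treats a sequence as a likely error of a more abundant centroid when their abundance ratio (skew) is at most β(d) = 1 / 2^(αd + 1), where d is the number of differences between them. The sketch below illustrates that published rule only; it is not taken from the thapbi_pict code, and the function names are hypothetical. The default alpha mirrors unoise_alpha in the signature above.

```python
def unoise_beta(d: int, alpha: float = 2.0) -> float:
    """Maximum abundance skew for a sequence d differences from a centroid."""
    return 1.0 / 2 ** (alpha * d + 1)


def looks_like_error(abundance: int, centroid_abundance: int, d: int, alpha: float = 2.0) -> bool:
    """Return True if the skew is small enough to predict a read error.

    Hypothetical helper illustrating the published rule, not this module's code.
    """
    return abundance / centroid_abundance <= unoise_beta(d, alpha)
```

With alpha = 2.0 and d = 1 the cutoff is 1/8, so a sequence seen 10 times next to a centroid seen 100 times would be flagged, while one seen 20 times would not.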

Arguments min_abundance and min_abundance_fraction are applied per sample (after denoising, if used). The min_abundance threshold is raised on a per-pool basis if negative controls are given, and the min_abundance_fraction threshold likewise if synthetic controls are given. Argument spike_genus is a comma-separated string, treated case-insensitively.
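One way to picture how an absolute and a fractional threshold can combine for a single sample, and how a comma-separated genus list can be normalised, is sketched below. Both functions are hypothetical. Taking the larger of the two thresholds and rounding the fractional one up are assumptions, and the per-pool adjustment from control samples is not modelled.

```python
import math


def sample_threshold(sample_total: int, min_abundance: int = 100,
                     min_abundance_fraction: float = 0.001) -> int:
    """Return the read count a sequence must reach in one sample.

    Hypothetical helper; combining the thresholds with max() is an assumption.
    """
    fractional = math.ceil(sample_total * min_abundance_fraction)
    return max(min_abundance, fractional)


def parse_spike_genus(spike_genus: str) -> set[str]:
    """Split a comma-separated genus string into lower-cased names.

    Hypothetical helper; lower-casing is one way to compare case-insensitively.
    """
    return {name.strip().lower() for name in spike_genus.split(",") if name.strip()}
```

Under these assumptions, the absolute threshold dominates for small samples, and the fractional threshold takes over once a sample is large enough that the fraction exceeds it.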