THAPBI PICT is a command line tool, meaning you must open your command line
terminal window and key in instructions to use the tool. The documentation
examples use the $ (dollar sign) to indicate the prompt, followed by the text
to be entered. For example, this should run the tool with no instructions:
$ thapbi_pict
...
Rather than literally printing dot dot dot, the tool should print out some terse help, listing various sub-command names, and an example of how to get more help.
Either -v (minus sign, lower case letter v) or --version (minus, minus,
version in lower case) can be added to find out the version of the tool:
$ thapbi_pict -v
THAPBI PICT v0.8.1
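If a script needs to compare the installed version against a minimum, the version line shown above can be turned into a tuple of integers. This is only a sketch: the function name parse_version is hypothetical, and it assumes the line keeps the "v" followed by dot-separated numbers seen in the example.

```python
import re


def parse_version(banner: str) -> tuple:
    """Extract (major, minor, patch, ...) from a line like 'THAPBI PICT v0.8.1'."""
    # Assumption: the version is a "v" followed by dot-separated integers.
    match = re.search(r"v(\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError(f"No version number found in {banner!r}")
    return tuple(int(part) for part in match.group(1).split("."))


version = parse_version("THAPBI PICT v0.8.1")
print(version)  # (0, 8, 1)
print(version >= (0, 8, 0))  # True
```

Tuples compare element by element, so (0, 10, 0) correctly sorts after (0, 8, 1), which a plain string comparison would get wrong.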
THAPBI PICT follows the sub-command style popularised in bioinformatics by
samtools (also used in the version control tool git). This means most of the
instructions take the form thapbi_pict sub-command ..., where the dots
indicate some additional options.
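To illustrate the sub-command style in general, here is a minimal sketch using the argparse module from the Python standard library. It is not the tool's own code: the program name toy_tool, its sub-commands greet and repeat, and their options are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser in the samtools/git sub-command style."""
    # Hypothetical tool and sub-command names, for illustration only.
    parser = argparse.ArgumentParser(prog="toy_tool")
    subparsers = parser.add_subparsers(dest="command", metavar="sub-command")

    greet = subparsers.add_parser("greet", help="print a greeting")
    greet.add_argument("--name", default="world", help="who to greet")

    repeat = subparsers.add_parser("repeat", help="repeat a word")
    repeat.add_argument("--count", type=int, default=1, help="number of repeats")
    return parser


# The first word selects the sub-command; the rest are that sub-command's options.
args = build_parser().parse_args(["repeat", "--count", "3"])
print(args.command, args.count)  # repeat 3
```

Each sub-command has its own set of options, which is why the help and the required arguments differ from one sub-command to the next.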
The main sub-commands are to do with classifying sequence files and reporting the results, and these are described in the first worked example:
prepare - turn paired FASTQ input files for each sample into de-duplicated FASTA files
sample-tally - pool intermediate files for analysis
classify - produce genus/species level predictions as tab-separated-variable (TSV) files
summary - summarise a set of predictions by sample (with a human-readable report), and by unique sequence and sample (both with Excel reports)
edit-graph - draw the unique sequences as nodes on a graph, connected by edit-distance
assess - compare classifier output to known positive controls
pipeline - run all of the above in sequence
There are further sub-commands to do with making or inspecting an SQLite3 format barcode marker sequence database, most of which are covered in the second worked example, with a custom database:
dump - export a DB as TSV or FASTA format
load-tax - import a copy of the NCBI taxonomy
import - import a FASTA file, e.g. using the NCBI style naming
curated-seq - label prepared reads with known species assignment (single isolate positive controls)
conflicts - report on genus or species level conflicts in the database
And some other miscellaneous commands:
ena-submit - write a TSV table of your paired FASTQ files for use with the ENA interactive submission system.
Start by reading the help for any command using -h, for example:
$ thapbi_pict pipeline -h
...
Most of the commands have required arguments, and if you omit a required argument it will stop with an error:
$ thapbi_pict pipeline
...
thapbi_pict pipeline: error: the following arguments are required: -i/--input, -o/--output
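The error above has the same shape as the message Python's argparse module prints when required options are missing. The sketch below reproduces that behaviour with a stand-in parser. The program name toy_tool and the helper try_parse are hypothetical, and the real tool's parser is not shown here.

```python
import argparse
import contextlib
import io

# Stand-in parser with two required options, mirroring the error shown above.
parser = argparse.ArgumentParser(prog="toy_tool pipeline")
parser.add_argument("-i", "--input", required=True, help="input location")
parser.add_argument("-o", "--output", required=True, help="output location")


def try_parse(argv):
    """Parse argv, returning (exit_code, stderr_text) instead of exiting."""
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            parser.parse_args(argv)
    except SystemExit as err:
        # argparse reports usage errors on stderr and exits with status 2.
        return err.code, stderr.getvalue()
    return 0, stderr.getvalue()


code, message = try_parse([])
print(code)  # 2
print(message.splitlines()[-1])
```

When calling a command line tool from a script, checking for a non-zero exit status is usually more robust than matching the wording of the error message.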