Command Line
THAPBI PICT is a command line tool, meaning you must open your command line
terminal window and key in instructions to use the tool. The documentation
examples use the $ (dollar sign) to indicate the prompt, followed by text
to be entered. For example, this should run the tool with no instructions:
$ thapbi_pict
...
Rather than literally printing dot dot dot, the tool should print out some terse help, listing various sub-command names, and an example of how to get more help.
For example, -v (minus sign, lower case letter v) or --version (minus,
minus, version in lower case) can be added to find out the version of the tool
installed:
$ thapbi_pict -v
THAPBI PICT v0.8.1
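Scripts that record which version of the tool was used could parse this banner. The following is a minimal Python sketch, assuming the output has the form shown above. The helper name parse_version is hypothetical and is not part of THAPBI PICT.

```python
import re


def parse_version(banner):
    """Extract a version tuple from a banner such as 'THAPBI PICT v0.8.1'.

    Assumes the version appears as 'v' followed by dot-separated integers.
    """
    match = re.search(r"v(\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError(f"No version number found in {banner!r}")
    return tuple(int(part) for part in match.group(1).split("."))


print(parse_version("THAPBI PICT v0.8.1"))
```

Returning a tuple of integers rather than a string means versions compare numerically, so (0, 10, 0) sorts after (0, 8, 1).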
THAPBI PICT follows the sub-command style popularised in bioinformatics by
samtools (also used in the version control tool git). This means most
of the instructions take the form thapbi_pict sub-command ..., where the
dots indicate some additional options.
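To illustrate the sub-command style in general terms, here is a minimal sketch using Python's standard argparse module. It is a toy example and is not taken from THAPBI PICT's source code; the program name toy_tool and the sub-commands greet and count are hypothetical.

```python
import argparse


def build_parser():
    """Build a toy parser offering two sub-commands."""
    parser = argparse.ArgumentParser(prog="toy_tool")
    subparsers = parser.add_subparsers(dest="command", metavar="sub-command")

    # Each sub-command gets its own parser, arguments and handler function.
    greet = subparsers.add_parser("greet", help="print a greeting")
    greet.add_argument("name")
    greet.set_defaults(func=lambda args: f"Hello, {args.name}!")

    count = subparsers.add_parser("count", help="count the characters in a word")
    count.add_argument("word")
    count.set_defaults(func=lambda args: str(len(args.word)))
    return parser


def main(argv):
    """Dispatch to the chosen sub-command, or return help if none was given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No sub-command: fall back to terse help listing the sub-commands.
        return parser.format_help()
    return args.func(args)


print(main(["greet", "World"]))
```

The first word after the program name selects the sub-command, and everything after it is interpreted by that sub-command's own parser. This is why each sub-command can have its own options and its own help text.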
The main sub-commands are to do with classifying sequence files and reporting the results, and these are described in the first worked example:
prepare - turn paired FASTQ input files for each sample into de-duplicated FASTA files
fasta-nr and sample-tally - pooling intermediate files for analysis
classify - produce genus/species level predictions as tab-separated-variable (TSV) files
summary - summarise a set of predictions by sample (with human readable report), and by unique sequence and sample (both with Excel reports)
edit-graph - draw the unique sequences as nodes on a graph, connected by edit-distance
assess - compare classifier output to known positive controls
pipeline - run all of the above in sequence
There are further sub-commands to do with making or inspecting an SQLite3 format barcode marker sequence database, most of which are covered in the second worked example, with a custom database:
dump - export a DB as TSV or FASTA format
load-tax - import a copy of the NCBI taxonomy
import - import a FASTA file, e.g. using the NCBI style naming
conflicts - report on genus or species level conflicts in the database
And some other miscellaneous commands:
ena-submit - write a TSV table of your paired FASTQ files for use with the ENA interactive submission system.
Start by reading the help for any command using -h or --help as
follows:
$ thapbi_pict pipeline -h
...
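If you would like to gather the help text for several sub-commands at once, the same -h call can be scripted. The following Python sketch is one possible approach, assuming thapbi_pict is on your PATH. The function names help_command and collect_help are hypothetical, and the list holds only some of the sub-command names given above.

```python
import shutil
import subprocess

# A few of the sub-command names listed in this section.
SUBCOMMANDS = ["prepare", "classify", "summary", "assess", "pipeline"]


def help_command(sub_command):
    """Return the argument list asking one sub-command for its help."""
    return ["thapbi_pict", sub_command, "-h"]


def collect_help(sub_commands=SUBCOMMANDS):
    """Run each help command, returning {name: captured output text}.

    Returns an empty dict if thapbi_pict is not found on the PATH.
    """
    if shutil.which("thapbi_pict") is None:
        return {}
    texts = {}
    for name in sub_commands:
        result = subprocess.run(
            help_command(name), capture_output=True, text=True
        )
        # Keep both streams, as this sketch does not assume which one is used.
        texts[name] = result.stdout + result.stderr
    return texts


for name, text in collect_help().items():
    print(f"== {name} ==")
    print(text)
```

Passing the command as a list of arguments, rather than a single string, avoids any shell quoting issues.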
Most of the commands have required arguments, and if you omit a required argument it will stop with an error:
$ thapbi_pict pipeline
...
thapbi_pict pipeline: error: the following arguments are required: -i/--input, -o/--output
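The error above has the same shape as the message Python's argparse module prints when a required option is missing. The following sketch reproduces that behaviour with a toy parser. It is an illustration only and is not THAPBI PICT's own code; the program name toy_tool and the helper try_parse are hypothetical, and only the option names -i/--input and -o/--output come from the error message shown above.

```python
import argparse

parser = argparse.ArgumentParser(prog="toy_tool pipeline")
parser.add_argument("-i", "--input", required=True, help="where to read from")
parser.add_argument("-o", "--output", required=True, help="where to write to")


def try_parse(argv):
    """Parse argv, returning the namespace or the exit status on failure."""
    try:
        return parser.parse_args(argv)
    except SystemExit as err:
        # argparse has already printed usage and an error message to stderr,
        # then requested exit status 2.
        return err.code


print(try_parse([]))
print(try_parse(["-i", "reads", "-o", "results"]))
```

The first call fails because both required options are missing, while the second returns a namespace holding the two values.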