DaisySuite example

To run DaisySuite on the example dataset, first copy the example folder into a directory of your choice. For instance,

DaisySuite_example .

copies the folder example into the current working directory.

Next, edit the following parameters in example/example.yaml:

  • outputdir (full path to example/output/)
  • ncbidir
  • bwa (in case you are using bwa)
  • yara_index or bwa_index
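Before starting a run, it can help to confirm that the edited values are actually set. The following is only a minimal sketch: it assumes that these parameters appear in example/example.yaml as flat, top-level `key: value` lines, which may not match the real file layout, and the helper names read_flat_yaml and check_config are hypothetical and not part of DaisySuite.

```python
from pathlib import PurePosixPath

def read_flat_yaml(text):
    """Parse top-level 'key: value' lines; comments and indented lines are ignored."""
    settings = {}
    for line in text.splitlines():
        stripped = line.split("#", 1)[0].rstrip()
        # Skip blank lines, indented (nested) lines, and lines without a key.
        if not stripped or stripped[0].isspace() or ":" not in stripped:
            continue
        key, value = stripped.split(":", 1)
        settings[key.strip()] = value.strip().strip("'\"")
    return settings

def check_config(settings):
    """Return a list of problems; an empty list means the basic values are set."""
    problems = []
    for key in ("outputdir", "ncbidir"):
        if not settings.get(key):
            problems.append(f"{key} is not set")
    outputdir = settings.get("outputdir", "")
    if outputdir and not PurePosixPath(outputdir).is_absolute():
        problems.append("outputdir should be a full (absolute) path")
    if not (settings.get("yara_index") or settings.get("bwa_index")):
        problems.append("set yara_index or bwa_index")
    return problems
```

To use it, read the contents of example/example.yaml into a string, pass it through read_flat_yaml, and print whatever check_config returns.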

Finally, you can run DaisySuite:

DaisySuite --configfile example/example.yaml

You can also use multiple threads by adding --cores <thread_number>, e.g. --cores 10, to the command.

DaisyGPS results

You will find the acceptor Escherichia coli str. K-12 substr. DH10B [NC_010473.1] and the donor Helicobacter pylori [NZ_AP014710.1] in the example/output/candidates/sim1HP_candidates.tsv file.

DaisyGPS Results
Type | Name | Accession.Version | TaxID | Parent TaxID | Species TaxID | Abundance | Num. Reads | Unique Reads | Coverage | Validity | Homogeneity | Mapping Error | Property Score | Property
Acceptor | Escherichia coli str. K-12 substr. DH10B | NC_010473.1 | 316385 | 83333 | 562 | 0.946692320209823 | 197800 | 136 | 36.8658152545 | 0.25427675964 | 0.08146934189377475 | 0.021496966633 | 0.0030079101341754125 | 0.17280741774622527
Acceptor | Escherichia coli K-12 | NZ_CP010445.1 | 83333 | 562 | 562 | 0.8952416506332022 | 187050 | 0 | 35.7282033189 | 0.237248626581 | 0.07496815501320375 | 0.0214397576406 | 0.0026711615990942022 | 0.16228047156779624
Donor | [Haemophilus] ducreyi | NZ_CP015434.1 | 730 | 724 | 730 | 0.0015411270328997118 | 322 | 0 | 24.7946611905 | 0.00109715387698 | 0.9254876430438622 | 0.0265838509317 | -2.619313788987036e-05 | -0.9243904891668824
Donor | Salmonella enterica subsp. enterica serovar Anatum str. USDA-ARS-USMARC-1676 | NZ_CP014620.1 | 1454587 | 58712 | 28901 | 0.0006030497085259743 | 126 | 0 | 0.0645783951687 | 0.00108528243123 | 0.9193443687114572 | 0.013492063492100002 | -1.0181504759172115e-05 | -0.9182590862802272
Donor | Klebsiella oxytoca KONIH1 | NZ_CP008788.1 | 1333852 | 571 | 571 | 0.008571920856904919 | 1791 | 0 | 41.2926529358 | 0.00105750960227 | 0.7946631781949077 | 0.0263539921831 | -0.00012507673507004732 | -0.7936056685926377
Donor | Helicobacter pylori | NZ_AP014710.1 | 210 | 209 | 210 | 0.043812039935291806 | 9154 | 9091 | 47.3515414856 | 0.0177922034096 | 0.7999996890279752 | 0.00920544752749 | -0.0006300993983310352 | -0.7822074856183753
Acceptor-like Donor | Escherichia coli | NZ_CP016182.1 | 562 | 561 | 562 | 0.35694799414180284 | 74580 | 0 | 6.99537943225 | 0.0939148463399 | 0.0879420602326042 | 0.0211231786896 | 3.919904897022354e-05 | 0.00597278610729579
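To pick the acceptor and donor candidates out of the candidates file programmatically, a short script along the following lines may help. It is a sketch under two assumptions: that the .tsv file is tab-separated, and that its header uses the column names shown in the table above. The function names are illustrative only, and the sample keeps just four of the columns.

```python
import csv
import io

def read_candidates(handle):
    """Read a tab-separated candidates table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))

def group_by_type(rows):
    """Map each Type value to a list of (name, accession, unique reads) tuples."""
    grouped = {}
    for row in rows:
        entry = (row["Name"], row["Accession.Version"], int(row["Unique Reads"]))
        grouped.setdefault(row["Type"], []).append(entry)
    return grouped

# Trimmed sample with values copied from the table above (only four columns kept).
SAMPLE = (
    "Type\tName\tAccession.Version\tUnique Reads\n"
    "Acceptor\tEscherichia coli str. K-12 substr. DH10B\tNC_010473.1\t136\n"
    "Donor\tKlebsiella oxytoca KONIH1\tNZ_CP008788.1\t0\n"
    "Donor\tHelicobacter pylori\tNZ_AP014710.1\t9091\n"
)

grouped = group_by_type(read_candidates(io.StringIO(SAMPLE)))
for kind, entries in grouped.items():
    # List the candidates of each type, most unique reads first.
    for name, accession, unique in sorted(entries, key=lambda e: -e[2]):
        print(f"{kind}\t{accession}\t{unique}\t{name}")
```

For the real file, replace the io.StringIO(SAMPLE) handle with open("example/output/candidates/sim1HP_candidates.tsv", newline="").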

Daisy results

Furthermore, you will find the base pair positions of the transfer in example/output/hgt_eval/sim1HP.vcf. Bases 1322000 to 1350000 of the donor have been inserted at base 1120262 of the acceptor. This is indicated by two breakpoints in the VCF: one representing the beginning of the insert (acceptor 1120261, donor 1322002 in the results below) and one representing the end of the insert (acceptor 1120263, donor 1350000).

The file example/output/hgt_eval/sim1HP.tsv provides a more intuitive representation of the putatively transferred regions. Note, however, that these candidates have not been filtered by the sampling values.

Daisy TSV header
#AN: Acceptor name
#DN: Donor name
#AS: Acceptor start position
#AE: Acceptor end position
#DS: Donor start position
#DE: Donor end position
#MC: Mean coverage in region
#Split: Total number split-reads per region (including duplicates!)
#PS-S: Pairs spanning HGT boundaries
#PS-W: Pairs within HGT boundaries
#Phage: PS-S and PS-W reads mapping to phage database
#BS:MC/PS-S/PS-W: Percent of bootstrapped random regions with MC/PS-S/PS-W smaller than candidate
Daisy Results TSV
AN DN AS AE MC BS:MC DS DE MC Split PS-S PS-W Phage BS:MC BS:PS-S BS:PS-W
NZ_CP010445.1 NZ_AP014710.1 1880235 1880237 44.00 7 1322002 1350000 94.62 152 182 8712 0.0000 100 100 100
NZ_CP010445.1 NZ_CP015434.1 3904873 3904886 40.54 3 114928 126957 30.41 871 156 884 0.0000 100 100 100
NZ_CP010445.1 NZ_CP015434.1 3904873 3916007 97.63 20 125626 126957 129.68 1571 108 258 0.0000 100 100 100
NZ_CP010445.1 NZ_CP015434.1 3904885 3916007 97.69 18 114927 125626 18.06 253 43 279 0.0000 100 99 100
NC_010473.1 NZ_AP014710.1 1120261 1120263 43.00 3 1322002 1350000 94.62 154 182 8712 0.0000 100 100 100
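The column names MC and BS:MC each occur twice in the header above, so a reader that keys rows by column name would silently drop one of each. The sketch below renames repeated columns by position and computes the length of each donor region. It assumes that the file is tab-separated, that only the legend lines start with '#', and that the column header itself does not. All function names are illustrative and not part of Daisy.

```python
import csv
import io

def unique_names(header):
    """Suffix repeated column names with _2, _3, ... in order of appearance."""
    seen = {}
    names = []
    for name in header:
        seen[name] = seen.get(name, 0) + 1
        names.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return names

def read_daisy_tsv(handle):
    """Return candidate rows as dicts, skipping blank lines and '#' legend lines."""
    lines = (line for line in handle if line.strip() and not line.startswith("#"))
    reader = csv.reader(lines, delimiter="\t")
    names = unique_names(next(reader))
    return [dict(zip(names, row)) for row in reader]

def donor_span(row):
    """Length of the donor region in bases, counting both end positions."""
    return int(row["DE"]) - int(row["DS"]) + 1

# Header and two rows copied from the results table above, joined with tabs.
HEADER = "AN DN AS AE MC BS:MC DS DE MC Split PS-S PS-W Phage BS:MC BS:PS-S BS:PS-W"
ROWS = [
    "NZ_CP010445.1 NZ_AP014710.1 1880235 1880237 44.00 7 1322002 1350000 94.62 152 182 8712 0.0000 100 100 100",
    "NC_010473.1 NZ_AP014710.1 1120261 1120263 43.00 3 1322002 1350000 94.62 154 182 8712 0.0000 100 100 100",
]
SAMPLE = "#AN: Acceptor name\n" + "\n".join("\t".join(l.split()) for l in [HEADER] + ROWS) + "\n"

rows = read_daisy_tsv(io.StringIO(SAMPLE))
for row in rows:
    print(row["AN"], row["DN"], donor_span(row))
```

This only parses the table. It does not apply any filtering by the sampling values.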
Daisy VCF header
##fileformat=VCFv4.2
##source=DAISY
##INFO=<ID=EVENT,Number=1,Type=String,Description="Event identifier for breakends.">
##contig=<ID=NC_010473.1>
##contig=<ID=NZ_CP010445.1>
##contig=<ID=NZ_CP015434.1>
##contig=<ID=NZ_CP014620.1>
##contig=<ID=NZ_CP008788.1>
##contig=<ID=NZ_AP014710.1>
##contig=<ID=NZ_CP016182.1>
Daisy Results VCF
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
NZ_CP010445.1 1880235 BND_1_1 A A[NZ_AP014710.1:1322002[ PASS SVTYPE=BND;EVENT=HGT1 . 1
NZ_CP010445.1 1880237 BND_1_2 G ]NZ_AP014710.1:1350000]G PASS SVTYPE=BND;EVENT=HGT1 . 1
NZ_CP010445.1 3904873 BND_1_1 T T[NZ_CP015434.1:114928[ PASS SVTYPE=BND;EVENT=HGT1 . 1
NZ_CP010445.1 3904886 BND_1_2 C ]NZ_CP015434.1:126957]C PASS SVTYPE=BND;EVENT=HGT1 . 1
NC_010473.1 1120261 BND_1_1 A A[NZ_AP014710.1:1322002[ PASS SVTYPE=BND;EVENT=HGT1 . 1
NC_010473.1 1120263 BND_1_2 G ]NZ_AP014710.1:1350000]G PASS SVTYPE=BND;EVENT=HGT1 . 1
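The ALT column of these records uses the VCF breakend notation, in which the mate contig and position are enclosed in square brackets. As a rough sketch, the donor coordinates of an insert can be recovered from a pair of records as follows. The assumption that the two breakends of one insert appear on consecutive lines is taken from the listing above, and the function names are illustrative.

```python
import re

# Matches the bracketed "contig:position" part of a breakend ALT such as
# "A[NZ_AP014710.1:1322002[" or "]NZ_AP014710.1:1350000]G".
BND_ALT = re.compile(r"[\[\]](?P<contig>[^\[\]]+):(?P<pos>\d+)[\[\]]")

def parse_breakend(alt):
    """Return (mate_contig, mate_position) from a VCF breakend ALT string."""
    match = BND_ALT.search(alt)
    if match is None:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    return match.group("contig"), int(match.group("pos"))

def summarize_insert(start_record, end_record):
    """Combine the two breakends of one insert into a single summary dict."""
    chrom, start_pos, start_alt = start_record
    _, end_pos, end_alt = end_record
    donor, donor_start = parse_breakend(start_alt)
    _, donor_end = parse_breakend(end_alt)
    return {
        "acceptor": chrom,
        "acceptor_positions": (start_pos, end_pos),
        "donor": donor,
        "donor_range": (donor_start, donor_end),
    }

# (CHROM, POS, ALT) taken from the last two records listed above.
start = ("NC_010473.1", 1120261, "A[NZ_AP014710.1:1322002[")
end = ("NC_010473.1", 1120263, "]NZ_AP014710.1:1350000]G")

summary = summarize_insert(start, end)
print(summary)
```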