Why is the "3+2" strategy recommended for full-length transcriptome sequencing?
To date, most published full-length transcriptome studies adopt the "3+2" mode. "3" refers to third-generation sequencing, which detects structural variation more accurately. For species with a reference genome, it can accurately detect alternative splicing and fusion genes and predict new genes. "2" refers to second-generation sequencing, which achieves more accurate quantification than third-generation sequencing. For species without a reference genome, the "3+2" strategy provides accurate reference sequences, which helps subsequent differential expression analysis. In addition, the "3+2" strategy can use the second-generation data to correct the third-generation data.
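As a rough illustration of how the two data types complement each other, the sketch below takes isoform lengths as they might come from full-length long reads and short-read counts assigned to those isoforms, then normalizes them to TPM. TPM is used here only as a common normalization and is an assumption, not something this FAQ prescribes; the isoform names, lengths, and counts are hypothetical.

```python
def tpm(counts, lengths):
    """Convert per-isoform read counts to TPM using isoform lengths in bp."""
    # Reads per base for each isoform, so longer isoforms are not favored.
    rates = {name: counts[name] / lengths[name] for name in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads assigned to any isoform")
    # Scale so that all values sum to one million.
    return {name: rate / total * 1e6 for name, rate in rates.items()}


# Hypothetical isoform lengths (bp), as defined by full-length long reads.
isoform_lengths = {"geneA.iso1": 2000, "geneA.iso2": 1000}
# Hypothetical short-read counts assigned to each isoform.
short_read_counts = {"geneA.iso1": 400, "geneA.iso2": 100}

values = tpm(short_read_counts, isoform_lengths)
for name, value in values.items():
    print(f"{name}\t{value:.1f}")
```

In this toy case the long reads supply the isoform structures and lengths, while the short reads supply the counts used for quantification.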
Does a full-length transcriptome sequencing library need to be fragmented and assembled?
No. Full-length transcriptome sequencing is based on the PacBio sequencing platform, which directly obtains complete transcripts containing the 5' UTR, 3' UTR, and poly(A) tail. It overcomes the short assembled sequences and incomplete transcript information that limit studies of species without a reference genome.
How much data should be generated for full-length transcriptome sequencing?
For the PacBio Sequel platform, 40G of sequencing data is recommended. If lowly expressed genes are of interest, 60G or more is suggested.
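To see why more data helps with lowly expressed genes, a back-of-the-envelope estimate can be sketched as follows. It assumes that "G" means gigabases of sequence, and it uses a hypothetical mean read length and a hypothetical transcript abundance; it also ignores multi-pass redundancy and read filtering, so real counts would be lower.

```python
def expected_reads(data_gb, mean_read_len_bp, transcript_fraction):
    """Rough expected number of reads from one transcript.

    data_gb: total sequencing yield in gigabases (assumed meaning of "G").
    mean_read_len_bp: assumed mean read length in bases.
    transcript_fraction: assumed share of all reads coming from the transcript.
    """
    total_reads = data_gb * 1e9 / mean_read_len_bp
    return total_reads * transcript_fraction


# Hypothetical values: 2 kb mean read length, transcript at one per million reads.
for yield_gb in (40, 60):
    reads = expected_reads(yield_gb, 2000, 1e-6)
    print(f"{yield_gb}G: about {reads:.0f} reads expected")
```

Under these assumptions the expected read count for a rare transcript grows in proportion to the data amount, which is the reason for sequencing deeper when such genes matter.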