Massively parallel sequencing technology has a wide range of applications. Its use in SNP detection is already widespread and promises highly accurate results. The aim of this work is to validate data generated with the Genome Analyser II. SNPs detected with different mapping and SNP estimation parameters implemented in the bioinformatics tool CLC were compared to SNPs detected by conventional Sanger sequencing. Lines of D. mauritiana were used. Because no D. mauritiana genome is available so far, I had to use the genome of D. simulans as the reference genome in CLC. With a stringent parameter set (the short parameter set), which allows few mismatches when reads are mapped against the reference genome, only a few regions that diverge from the reference are recovered. Of the SNPs detected with this parameter set, 60-81% are false positives when compared to Sanger sequencing. A less stringent parameter set yields a very high number of false positive SNPs: twice as many SNPs were recovered with this parameter set, of which 70-80% are false positives. Based on my results, I conclude that the best approach for SNP detection is a first run that permits a high number of mismatches, followed by a run with more stringent values to reduce the high number of false positive SNPs.
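The false positive percentages quoted above can be read as the share of called SNPs that Sanger sequencing does not confirm. The following is a minimal sketch of that calculation, not the procedure used in the book: the function name, the representation of SNPs as plain positions, and the example values are all hypothetical, and it assumes both sets are already restricted to the regions covered by Sanger sequencing.

```python
def false_positive_fraction(called_snps, sanger_snps):
    """Fraction of called SNPs that are not confirmed by Sanger sequencing.

    Both arguments are iterables of SNP positions (hypothetical format),
    assumed to be limited to the Sanger-sequenced regions.
    """
    called = set(called_snps)
    confirmed = set(sanger_snps)
    if not called:
        raise ValueError("no SNPs were called")
    # A false positive is a called SNP that is absent from the Sanger set.
    false_positives = called - confirmed
    return len(false_positives) / len(called)


# Illustrative positions only, not data from the study.
called = {101, 250, 377, 412, 590}
sanger = {101, 250}
fraction = false_positive_fraction(called, sanger)
print(f"{fraction:.0%} of called SNPs are false positives")
```

Computing the fraction once per parameter set would allow the stringent and the less stringent runs to be compared on the same scale.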
|Number of Pages||56|
|Book Type||Mathematics & science|
|Country of Manufacture||India|
|Product Brand||AV Akademikerverlag|