Detailed parameters SOAP3-DP
Step 1: SOAP3 (2 mismatches)
Case 1: Both reads have hits (YES, YES) and the insert size is within range.
• Report the alignments.
Step 1: SOAP3 (2 mismatches)
Case 2: Both reads have hits (YES, YES) but the insert size is out of range.
• Let x = # of all valid hits of read 1.
• Let y = # of all valid hits of read 2.
• If x > 30, retain only the best hits of read 1 and reset x = # of best hits of read 1.
• If y > 30, retain only the best hits of read 2 and reset y = # of best hits of read 2.
In the sub-cases below, YES marks the read whose hits are stored:
a) x, y <= 30: YES NO and NO YES, ARRAY A
b) x <= 30 < y: YES NO, ARRAY A
c) y <= 30 < x: NO YES, ARRAY A
d) 30 < x < y: YES NO, ARRAY B
e) 30 < y <= x: NO YES, ARRAY B
• Store the read ID and hits of the YES read to ARRAY A or B.
Step 1: SOAP3 (2 mismatches)
Case 3: One read has hits but the other has none (YES, NO).
• Let x = # of all valid hits of the YES read.
• If x > 30, retain only the best hits of the YES read and reset x = # of best hits of the YES read.
a) x <= 30: ARRAY A
b) x > 30: ARRAY B
• Store the read ID and hits of the YES read to ARRAY A or B.
Case 4: Neither read has hits (NO, NO).
• Store the read ID to ARRAY C.
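The four Step 1 cases can be condensed into one routing function. The following is a minimal sketch, not SOAP3-DP's actual code: the function and type names are hypothetical, the hit counts are passed in as plain integers, and it assumes the YES/NO marks on the slides are ordered (read 1, read 2), so that YES identifies the read whose hits are stored.

```python
from typing import List, Optional, Tuple

MAX_HITS = 30  # threshold stated on the slides

# (destination, read whose hits are stored); names are illustrative only
Route = Tuple[str, Optional[str]]


def capped(all_hits: int, best_hits: int) -> int:
    """Apply the rule: above MAX_HITS, retain only the best hits."""
    return best_hits if all_hits > MAX_HITS else all_hits


def route_pair(all1: int, best1: int, all2: int, best2: int,
               insert_size_ok: bool) -> List[Route]:
    """Decide where a read pair goes after the SOAP3 step."""
    # Case 4: neither read has a hit.
    if all1 == 0 and all2 == 0:
        return [("C", None)]
    if all1 > 0 and all2 > 0:
        # Case 1: both reads hit and the insert size is within range.
        if insert_size_ok:
            return [("REPORT", None)]
        # Case 2: both reads hit but the insert size is out of range.
        x, y = capped(all1, best1), capped(all2, best2)
        if x <= MAX_HITS and y <= MAX_HITS:   # a) both orientations
            return [("A", "read1"), ("A", "read2")]
        if x <= MAX_HITS:                      # b) x <= 30 < y
            return [("A", "read1")]
        if y <= MAX_HITS:                      # c) y <= 30 < x
            return [("A", "read2")]
        if x < y:                              # d) 30 < x < y
            return [("B", "read1")]
        return [("B", "read2")]                # e) 30 < y <= x
    # Case 3: exactly one read has hits.
    if all1 > 0:
        name, x = "read1", capped(all1, best1)
    else:
        name, x = "read2", capped(all2, best2)
    return [("A" if x <= MAX_HITS else "B", name)]
```

Under this reading, whenever only one read of a Case 2 pair is stored, it is the read with fewer hits.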
Step 2: DEFAULT-DP
Input: ARRAY A
Case 1: Valid paired alignments found.
• Report the alignments.
Case 2: No valid paired alignment found.
• Store the read ID to ARRAY C.
Step 3: NEW DEFAULT-DP
Input: ARRAY B
• Unaligned end: 3 seeds of length 38 at positions 1, 23, 45; then verify by DP.
Case 1: Valid paired alignments found.
• Report the alignments.
Case 2: No valid paired alignment found.
• Store the read ID to ARRAY C.
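To make the seed layout of the new default DP concrete, here is a small sketch that cuts the three seeds out of the unaligned end. It assumes the positions 1, 23 and 45 are 1-based start offsets; the function name is hypothetical and the DP verification itself is not shown.

```python
from typing import List

SEED_LENGTH = 38
SEED_STARTS_1_BASED = (1, 23, 45)  # positions from the slide, assumed 1-based


def new_default_dp_seeds(read: str) -> List[str]:
    """Extract the three fixed seeds from the unaligned end."""
    last_end = SEED_STARTS_1_BASED[-1] - 1 + SEED_LENGTH
    if len(read) < last_end:
        raise ValueError(f"read must be at least {last_end} bases long")
    return [read[s - 1:s - 1 + SEED_LENGTH] for s in SEED_STARTS_1_BASED]
```

With these values the last seed ends at base 82, so neighbouring seeds overlap by 16 bases.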
Step 4: 2-level Deep DP
Input: ARRAY C
ROUND 1: SEEDING for both ends
• Seed length: 26
• Sample rate: 1/13
• Max # of hits allowed: 100
• If there exist pairs of hits within insert size, perform DP for those pairs.
• If (1) there exists a seed with too many hits AND (2) there are no pairs of hits within insert size, go to ROUND 2.
ROUND 2: SEEDING for both ends
• Seed length: 30
• Sample rate: 1/15
• Max # of hits allowed: 1000
• If there exist pairs of hits within insert size, perform DP for those pairs.
Case 1: Valid paired alignments found.
• Report the alignments.
Case 2: No valid paired alignment found.
• Store the read ID of both ends to ARRAY D.
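The two seeding rounds differ only in their parameters, which the sketch below captures. It is an illustration under stated assumptions: "sample rate 1/13" is read as "one seed start every 13 bases", seed starts are 0-based, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoundParams:
    seed_length: int
    step: int       # assumed meaning of "sample rate: 1/step"
    max_hits: int   # max # of hits allowed per seed


ROUND_1 = RoundParams(seed_length=26, step=13, max_hits=100)
ROUND_2 = RoundParams(seed_length=30, step=15, max_hits=1000)


def seed_starts(read_length: int, params: RoundParams) -> List[int]:
    """0-based start offsets of every seed that fits inside the read."""
    return list(range(0, read_length - params.seed_length + 1, params.step))


def needs_round_2(any_seed_over_limit: bool, pairs_in_range: int) -> bool:
    """Round 2 runs only if a seed had too many hits AND no pair was in range."""
    return any_seed_over_limit and pairs_in_range == 0
```

For a 100-base read this gives six seeds in round 1 and five in round 2; in both rounds consecutive seeds overlap by half a seed length.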
Parameters • Default DP
Parameters • New default DP
Parameters • Round-1 Deep DP
• Sample rate: 1 / seedLength
Parameters • Round-2 Deep DP
• Seed distance apart: 0.5 * seedLength
Parameters • Single DP (for length <= 300)
Parameters • Single DP (for length > 300)
• Number of seeds (N) = 3 + readLength / 100
• Seed length: 70
• Seed distance apart (D) = (readLength - H - T) / N, where H and T are each 15% of readLength
• Seed positions: H, H+D, H+2D, …
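A worked example of the long-read seed layout may help. The sketch below applies the formulas above with integer (floor) division throughout, which is an assumption since the slide does not say how fractions are rounded; the function name is hypothetical.

```python
from typing import List

SEED_LENGTH = 70


def single_dp_seed_positions(read_length: int) -> List[int]:
    """Seed start positions H, H+D, H+2D, ... for reads longer than 300."""
    if read_length <= 300:
        raise ValueError("this layout applies only to read lengths > 300")
    n = 3 + read_length // 100          # number of seeds (N)
    h = read_length * 15 // 100         # head margin H, 15% of the read
    t = read_length * 15 // 100         # tail margin T, 15% of the read
    d = (read_length - h - t) // n      # seed distance apart (D)
    return [h + i * d for i in range(n)]
```

For a 400-base read, N = 7, H = T = 60 and D = 40, so the seeds start at 60, 100, 140, 180, 220, 260 and 300; the last 70-base seed ends at base 370, inside the read.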
Other parameters
• Batch size: 6M
• For paired-end reads, if the read length > 150, skip the SOAP3 step and go directly to the deep-DP module.
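These two settings amount to a batching loop and an entry-point check, sketched below. The names are hypothetical, "6M" is assumed to count reads, and routing single-end reads to SOAP3 regardless of length is also an assumption, since the slide only covers the paired-end case.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

BATCH_SIZE = 6_000_000          # "6M" from the slide, assumed to be reads
MAX_PAIRED_LENGTH_FOR_SOAP3 = 150


def first_module(read_length: int, paired: bool) -> str:
    """Pick the module a read enters first."""
    if paired and read_length > MAX_PAIRED_LENGTH_FOR_SOAP3:
        return "deep-dp"        # skip the SOAP3 step
    return "soap3"


def batches(reads: Iterable[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` reads."""
    it = iter(reads)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```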