Final Demo Report: Cross-based

Final Demo Report: Cross-based. 2012/08/13. Advisor: Prof. 詹寶珠. Presenter: 王邦威. Outline: flow chart, method, implementation on GPU, experimental results. Introduction: left and right images. Local-based algorithm: a point P(Lx, Ly) in the left image L and its match P'(Rx, Ry) in the right image R share the same row y; disparity = Lx - Rx. Flow chart: support region construction, matching cost.


Presentation Transcript


  1. Final Demo Report: Cross-based • 2012/08/13 • Advisor: Prof. 詹寶珠 • Presenter: 王邦威

  2. Outline • Flow chart • Method • Implementation on GPU • Experimental results

  3. Introduction (figure: left image and right image)

  4. Local-based algorithm • A point P(Lx, Ly) in the left image L and its match P'(Rx, Ry) in the right image R lie on the same row y • Disparity = Lx - Rx

  5. Flow chart • Support region construction → Matching cost → Cost aggregation → Winner-take-all → Post-processing

  6. Cross-based local support region construction • Two constraints (the formulas are not preserved in this transcript; the surviving symbols are L and d)
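The slide's two constraints are lost, so the following is only a minimal sketch of how one arm of a cross-shaped support region is commonly grown. It assumes the two constraints are an intensity-similarity threshold (`tau`) and a maximum arm length (`max_len`, possibly the slide's L), and it works on a single grayscale row. All names and default values are hypothetical.

```python
def arm_length(row, x, tau, max_len, step):
    """Grow one arm from column x in direction step (-1 = left, +1 = right).

    Assumed constraints (not confirmed by the slide):
      1. the arm is at most max_len pixels long;
      2. every pixel on the arm differs from row[x] by at most tau.
    """
    length = 0
    while length < max_len:
        nx = x + (length + 1) * step
        if nx < 0 or nx >= len(row):       # stop at the image border
            break
        if abs(row[nx] - row[x]) > tau:    # stop at an intensity edge
            break
        length += 1
    return length


def horizontal_arms(row, x, tau=20, max_len=17):
    """Return (left_arm, right_arm) for the pixel at column x."""
    return (arm_length(row, x, tau, max_len, -1),
            arm_length(row, x, tau, max_len, +1))


row = [10, 12, 11, 13, 90, 92, 91]
print(horizontal_arms(row, 2, tau=5, max_len=3))
```

The vertical arms would be grown the same way along a column; the four arms together define the pixel's cross.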

  7. Locally adaptive matching cost aggregation (figure: left image and right image) • Matching cost • C_AD: (formula not preserved in this transcript) • C_census: Hamming distance of the two bit strings that stand for p and pd • Reference: R. Zabih and J. Woodfill, “Non-parametric local transforms for computing visual correspondence,” in Proc. ECCV, 1994, pp. 151–158.
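The two matching costs can be sketched as follows. This assumes grayscale images stored as nested lists, a 3x3 census window, and that pixel p = (x, y) in the left image is compared with pd = (x - d, y) in the right image. The exact C_AD formula, window size, and the way the two costs are combined are not given in the transcript, so treat those as assumptions.

```python
def ad_cost(left, right, x, y, d):
    """Absolute intensity difference between p and its candidate match pd."""
    return abs(left[y][x] - right[y][x - d])


def census(img, x, y, radius=1):
    """Census transform: one bit per neighbour, set when neighbour < centre."""
    bits = 0
    centre = img[y][x]
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            bits = (bits << 1) | (1 if img[y + dy][x + dx] < centre else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two census bit strings."""
    return bin(a ^ b).count("1")


def census_cost(left, right, x, y, d, radius=1):
    return hamming(census(left, x, y, radius), census(right, x - d, y, radius))


left = [[1, 2, 3, 4, 5],
        [6, 9, 2, 7, 1],
        [3, 8, 4, 6, 2]]
# Toy right image: the left image shifted by one pixel (true disparity 1).
right = [r[1:] + [0] for r in left]

print(ad_cost(left, right, 2, 1, 1), census_cost(left, right, 2, 1, 1))
print(ad_cost(left, right, 2, 1, 0), census_cost(left, right, 2, 1, 0))
```

At the true disparity both costs are zero in this toy pair, while the wrong disparity gives non-zero costs.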

  8. Locally adaptive matching cost aggregation • Cost aggregation • Winner-take-all (the formulas are not preserved in this transcript)
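A minimal sketch of the aggregation and winner-take-all steps, assuming the two-pass scheme suggested by the timing slides (horizontal sums first, then vertical sums of those horizontal sums). The arm layout `(left, right, up, down)` and all function names are hypothetical, arms are assumed to be already clipped to the image, and normalization by region size is omitted.

```python
def aggregate(cost, arms):
    """Aggregate one disparity's raw cost over each pixel's cross-based region.

    cost[y][x] : raw matching cost for a single disparity
    arms[y][x] : (left, right, up, down) arm lengths, clipped to the image
    """
    h, w = len(cost), len(cost[0])
    # Pass 1: sum raw costs along each pixel's own horizontal arms.
    horiz = [[sum(cost[y][x - arms[y][x][0]: x + arms[y][x][1] + 1])
              for x in range(w)] for y in range(h)]
    # Pass 2: sum those horizontal sums along the pixel's vertical arms.
    return [[sum(horiz[yy][x]
                 for yy in range(y - arms[y][x][2], y + arms[y][x][3] + 1))
             for x in range(w)] for y in range(h)]


def winner_take_all(agg):
    """agg[d][y][x] -> disparity map holding the lowest-cost d per pixel."""
    h, w = len(agg[0]), len(agg[0][0])
    return [[min(range(len(agg)), key=lambda d: agg[d][y][x])
             for x in range(w)] for y in range(h)]


cost = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
arms = [[(0, 0, 0, 0)] * 3 for _ in range(3)]
arms[1][1] = (1, 1, 1, 1)          # only the centre pixel has a real cross
print(aggregate(cost, arms)[1][1])
print(winner_take_all([[[3, 1]], [[2, 4]]]))
```

Both passes are independent per pixel, which is what makes this step a natural fit for one GPU thread per pixel.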

  9. Left/right consistency check • We apply occlusion treatment via left/right consistency checking (the condition's formula is not preserved in this transcript). • Then we fill in the disparity for invalidated pixels: for each invalidated pixel, we search for its closest valid pixel to the left and to the right.
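The check and the hole filling can be sketched on a single scanline as below. Two points are assumptions, since the slide's condition is missing: the test used here is the usual one, d_L(x) = d_R(x - d_L(x)), and the filled value is the smaller of the two nearest valid disparities. `INVALID`, `lr_check`, and `fill_invalid` are hypothetical names.

```python
INVALID = -1


def lr_check(disp_left, disp_right, tol=0):
    """Invalidate left-view disparities that the right view does not confirm."""
    out = []
    for x, d in enumerate(disp_left):
        xr = x - d                                   # matching column in the right view
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        out.append(d if ok else INVALID)
    return out


def fill_invalid(row):
    """Fill each invalid pixel from its closest valid neighbours on the row."""
    filled = list(row)
    for x, d in enumerate(row):
        if d != INVALID:
            continue
        left = next((row[i] for i in range(x - 1, -1, -1) if row[i] != INVALID), None)
        right = next((row[i] for i in range(x + 1, len(row)) if row[i] != INVALID), None)
        candidates = [v for v in (left, right) if v is not None]
        if candidates:
            filled[x] = min(candidates)              # assumption: prefer the background
    return filled


checked = lr_check([1, 1, 1, 2, 1], [1, 1, 1, 1, 1])
print(checked)
print(fill_invalid(checked))
```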

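  10. Implementation on GPU • Host: CPP file (main program, including the code that calls OpenCV) and DLL file (containing the code that calls the GPU) • Device: GPU (kernel function) • Integrating CUDA and OpenCV • Create one project and package the CUDA code into a DLL • Create a separate project for the main function and the OpenCV code, written in a CPP file • Call the DLL whenever the GPU is needed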

  11. Implementation on GPU • Each image row (of length width) is handled by one block • Number of threads per block: width • Number of blocks: height
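The row-per-block mapping can be illustrated with a sequential Python emulation. This is not CUDA code, and `launch` and `abs_diff_kernel` are hypothetical names. The block index plays the role of the row y and the thread index plays the role of the column x, so every pixel is handled by exactly one (block, thread) pair.

```python
def launch(kernel, num_blocks, threads_per_block, *args):
    """Sequentially emulate a 1-D grid of 1-D blocks (a GPU runs these in parallel)."""
    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            kernel(block_idx, thread_idx, *args)


def abs_diff_kernel(block_idx, thread_idx, left, right, out, width):
    """Per-pixel work item: block index = row y, thread index = column x."""
    y, x = block_idx, thread_idx
    i = y * width + x                  # flat index into row-major image buffers
    out[i] = abs(left[i] - right[i])


width, height = 4, 2
left = list(range(width * height))
right = [0] * (width * height)
out = [None] * (width * height)
launch(abs_diff_kernel, height, width, left, right, out, width)
print(out)
```

For the 384x288 test images this mapping gives 288 blocks of 384 threads each.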

  12. Experimental results • Image size: 384x288 • Execution time: 0.058 s

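  13. Experimental results: timing analysis (AD) • Original program • Computing the matching cost and the horizontal-region cost sums: 0.625 s • Growing the support regions: 2.67 s • Summing each point's horizontal cost sums along the vertical direction, plus WTA: 0.561 s • Post-processing: 0.017 s • Total: 3.941 s • Parallelized program • Computing the matching cost and the horizontal-region cost sums: 0.022 s • Growing the support regions: 0.012 s • Summing each point's horizontal cost sums along the vertical direction, plus WTA: 0.022 s • Post-processing: 0.002 s • Total: 0.058 s • Speedup: 68x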
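The reported speedup can be checked with a few lines of arithmetic. The times are copied from the slide; the dictionary keys are hypothetical labels for the four stages.

```python
def speedup(serial_s, parallel_s):
    """Ratio of original to parallelized running time."""
    return serial_s / parallel_s


# Per-stage times in seconds, as listed on the slide (AD cost only).
serial = {"cost_and_horizontal_sum": 0.625, "region_growing": 2.67,
          "vertical_sum_and_wta": 0.561, "post_processing": 0.017}
parallel = {"cost_and_horizontal_sum": 0.022, "region_growing": 0.012,
            "vertical_sum_and_wta": 0.022, "post_processing": 0.002}

print(round(speedup(3.941, 0.058)))            # reported totals
for stage in serial:
    print(stage, round(speedup(serial[stage], parallel[stage]), 1))
print(round(sum(serial.values()), 3), round(sum(parallel.values()), 3))
```

The reported totals give a ratio of about 68, matching the slide. The listed original-program stages sum to 3.873 s rather than 3.941 s, so the reported total may include time not itemized on the slide.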

  14. Experimental results • Image size: 384x288 • Execution time: 0.273 s

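  15. Experimental results: timing analysis (AD and Census) • Original program • Computing the Census cost: 1.646 s • Summing the horizontal-region costs: 0.406 s • Growing the support regions: 2.67 s • Summing each point's horizontal cost sums along the vertical direction, plus WTA: 0.561 s • Post-processing: 0.017 s • Total: 5.684 s • Parallelized program • Computing the Census cost: 0.21 s • Summing the horizontal-region costs: 0.026 s • Growing the support regions: 0.012 s • Summing each point's horizontal cost sums along the vertical direction, plus WTA: 0.025 s • Post-processing: 0.002 s • Total: 0.273 s • Speedup: 20x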
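To see why the AD and Census variant gains less than the AD-only variant, it helps to look at each stage's share of the parallelized running time. The sketch below uses the times from the slide; the stage labels are hypothetical.

```python
# Parallelized per-stage times in seconds, as listed on the slide (AD and Census).
parallel = {"census_cost": 0.21, "horizontal_sum": 0.026, "region_growing": 0.012,
            "vertical_sum_and_wta": 0.025, "post_processing": 0.002}
reported_total = 0.273


def share(stage):
    """Fraction of the reported parallel total spent in one stage."""
    return parallel[stage] / reported_total


bottleneck = max(parallel, key=parallel.get)
print(bottleneck, round(share(bottleneck), 2))
print(round(1.646 / 0.21, 1))          # speedup of the Census stage alone
print(round(5.684 / 0.273, 1))         # overall speedup from the reported totals
```

The Census cost takes roughly three quarters of the parallelized time and speeds up by only about 8x, so it limits the overall gain to about 20x.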

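  16. Conclusions • Methods that contain many mutually independent computations can easily achieve a good speedup • Appropriate use of shared memory can make the program even faster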

  17. Thank you for your attention!
