Development of a Compact Cluster with Embedded CPUs
Sritrusta Sukaridhoto, Yoshifumi Sasaki, Koichi Ito and Takafumi Aoki
Introduction
• Ubiquitous environment
• Electronic equipment in our surroundings is equipped with embedded CPUs
• The equipment is connected by a network
[Figure: embedded CPUs in home automation, navigation systems, and mobile devices, connected by a network for distributed cooperation / parallel processing. Image sources: http://bsc.jp.yamatake.com/products/secu_ftouchm.html, http://www.kayoo.org/home/mext/joho-kiki/]
Introduction
Development of the Ubiquitous Computing Cluster (UCC)
• Ubiquitous
• Low cost
• Distributed / parallel computing
• Embedded CPUs
• Prototyping environment
Contents
• Introduction
• Cluster Computer Structures
• Implementation
• Performance Evaluations
• Application: Fingerprint Verification
• Conclusions and Future Plans
Cluster Computer Structures
Ubiquitous Computing Cluster Hardware
• Four embedded devices (Node#0 to Node#3) connected to a terminal through a hub over a 100 Mbps LAN
• Power consumption: 60 W (typ.)
Cluster Computer Structures
Specification of Calculation Node
  CPU      SH4 (SH7751R, 266 MHz)
  Memory   64 MB SDRAM
  HDD      120 GB, ATA133, 5400 rpm
  NIC      10/100BASE-T (RTL-8139C+)
  I/F      USB 2.0 × 2 ports
  Power    14 W (typ.)
• Embedded Network Attached Storage (NAS)
• Includes an embedded CPU (SH4), memory, USB I/F, network I/F, and HDD, so it can act as a network computer
• Logically functions as a general-purpose computer
• Small footprint, low power consumption
Cluster Computer Structures
Ubiquitous Computing Cluster Software
• Operating system
  • Debian GNU/Linux (kernel 2.4.21) for SH4
  • Stable inter-processor communication
  • Compact kernel and daemons
• Servers and daemons
  • Inter-processor communication (rsh, rexec, rcp)
  • Login (telnet), file transfer (FTP)
  • Network file sharing (NFS), Network Information Service (NIS)
• Development environment
  • Compilers (GNU gcc-3.0.4, g++, Fortran77)
  • Editors (GNU Emacs, vi)
  • Parallel processing interfaces (MPI, PVM)
Cluster Computer Structures
How it works
• The terminal connects to the UCC for login and file transfer (telnet, FTP)
• The nodes communicate with each other by rsh (inter-process communication)
• Node#0 to Node#3 are connected through a hub by 100 Mbps Fast Ethernet
• Node#0 also works as the administration server (NIS, NFS)
Cluster Computer Structures
Features
• Size: 390 mm × 280 mm × 150 mm; power consumption: 60 W (typ.)
• Each computing node is an embedded CPU
  • Suitable for prototyping next-generation computers
• Uses COTS (Commercial Off-The-Shelf) products
  • Low-cost system
• Uses Linux as the operating system
  • Stable inter-processor communication
  • Open source
[Figure: terminal and UCC; Node#0 to Node#3 (embedded devices) connected through a hub by 100 Mbps Fast Ethernet]
Implementation
• Write a parallel program: hello.c
    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int numprocs, procnum;

        /* Initialize MPI */
        MPI_Init(&argc, &argv);
        /* Rank of this process and total number of processes */
        MPI_Comm_rank(MPI_COMM_WORLD, &procnum);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        printf("Hello world! from %d of %d\n", procnum, numprocs);
        /* Shut down MPI */
        MPI_Finalize();
        return 0;
    }
• Compile: mpicc -o hello hello.c
• Log in to Node#0 from the terminal
• Run: mpirun -np 4 hello (the order of the output lines may vary)
    Hello world! from 0 of 4
    Hello world! from 1 of 4
    Hello world! from 2 of 4
    Hello world! from 3 of 4
Advanced Applications
• The USB ports connect to other devices (fingerprint sensor, USB audio, camera, etc.)
• Speech / voice recognition system
• Fingerprint verification system
• Image processing system
Performance Evaluations
• Pallas MPI Benchmark (PMB)*: performance evaluation of MPI communication
• Ping-Pong: measures the delay when transferring data between 2 processors
• Broadcast: measures the largest delay when transferring data from Node#0 to the other nodes
*http://www.pallas.com/e/products/pmb/
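As a rough illustration of how a ping-pong measurement turns into a transfer rate, the sketch below converts a message size and a measured round-trip time into Mbit/s. It assumes the usual ping-pong convention that the one-way time is half of the round trip. The function names are hypothetical and are not part of PMB.

```c
#include <assert.h>

/* One-way delay, assuming the ping-pong convention that a message
 * goes out and comes back, so one transfer takes half the round trip. */
double one_way_time_us(double round_trip_us)
{
    assert(round_trip_us > 0.0);
    return round_trip_us / 2.0;
}

/* Transfer rate in Mbit/s for a message of `bytes` bytes delivered in
 * `one_way_us` microseconds. Bits per microsecond equals Mbit/s. */
double throughput_mbps(long bytes, double one_way_us)
{
    assert(bytes >= 0 && one_way_us > 0.0);
    return (double)bytes * 8.0 / one_way_us;
}
```

For example, a 1250-byte message with a 200 µs round trip gives a one-way time of 100 µs and a rate of 100 Mbit/s, which would saturate a 100 Mbps link. These numbers are made up for the arithmetic and are not measurements from the UCC.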
Performance Evaluations (cont.)
• Ping-pong communication test
• The maximum transfer rate is 70 Mbps
• This is sufficient performance for 100 Mbps Ethernet
Performance Evaluations (cont.)
• Broadcast communication test
• The maximum broadcast transfer rate is around 36 Mbps
• This is sufficient performance for an ordinary hub
Application: Fingerprint Verification
• Distributed processing for verifying a fingerprint against a database
• Fingerprint matching algorithm: Phase-Only Correlation (POC)
[Figure: an input fingerprint from the fingerprint sensor is matched against the registered fingerprints on Node#0 to Node#3, with each node performing the matching by POC]
What is Phase-Only Correlation (POC)?
• Correlation using only the phase component of the images
• A sharp peak is produced according to the degree of similarity between the images
• Algorithm based on signal processing
[Figure: Example of Phase-Only Correlation. Input image 1, input image 2, and a standard image are each transformed by DFT into amplitude and phase, and the phases are correlated.]
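For reference, the POC function between two images is commonly written as follows. The notation (images $f$ and $g$ of size $N_1 \times N_2$, their DFTs $F$ and $G$) is chosen here for illustration and does not appear on the slides.

```latex
\begin{align}
F(k_1,k_2) &= \sum_{n_1=0}^{N_1-1}\sum_{n_2=0}^{N_2-1} f(n_1,n_2)\,
  e^{-j2\pi\left(\frac{k_1 n_1}{N_1}+\frac{k_2 n_2}{N_2}\right)}
  = A_F(k_1,k_2)\, e^{j\theta_F(k_1,k_2)} \\
R(k_1,k_2) &= \frac{F(k_1,k_2)\,\overline{G(k_1,k_2)}}
  {\left|F(k_1,k_2)\,\overline{G(k_1,k_2)}\right|}
  = e^{j\left(\theta_F(k_1,k_2)-\theta_G(k_1,k_2)\right)} \\
r(n_1,n_2) &= \frac{1}{N_1 N_2}\sum_{k_1=0}^{N_1-1}\sum_{k_2=0}^{N_2-1}
  R(k_1,k_2)\, e^{j2\pi\left(\frac{k_1 n_1}{N_1}+\frac{k_2 n_2}{N_2}\right)}
\end{align}
```

$G$ is defined from $g$ in the same way as $F$ from $f$. When the two images are identical, $R(k_1,k_2) = 1$ everywhere and $r(n_1,n_2)$ is a unit impulse at the origin; as the images become less similar, the peak height drops.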
Fingerprint Matching Algorithm Using POC Function
[Figure: processing flow]
• The 128×128 fingerprint image is transformed by FFT and its phase is extracted (phasing)
• Node#0 to Node#3: each node multiplies the input phase by a registered fingerprint (phase), applies the IFFT, and extracts the peak
• The extracted peaks are compared (peak comparison) and the result is checked
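The final peak-comparison step can be sketched as below: given the correlation peak found for each registered fingerprint, pick the highest one and accept it only if it is high enough. The `PocResult` type, the function names, and the use of a fixed acceptance threshold are assumptions made for illustration; the slides do not state how the check is performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: which registered image was compared and
 * how high its POC peak was. */
typedef struct {
    int image_id;
    double peak;
} PocResult;

/* Return the entry with the highest peak (first one wins on ties). */
PocResult best_match(const PocResult *results, size_t count)
{
    assert(results != NULL && count > 0);
    PocResult best = results[0];
    for (size_t i = 1; i < count; ++i) {
        if (results[i].peak > best.peak) {
            best = results[i];
        }
    }
    return best;
}

/* Accept the match only if its peak reaches the threshold.
 * The threshold value is application-specific and assumed here. */
int is_verified(PocResult best, double threshold)
{
    return best.peak >= threshold;
}
```

In a cluster setting each node would report its local best result to one node, which then applies the same comparison to the collected results.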
Fingerprint Verification Performance Evaluation
• Number of computing nodes: 1, 2, or 4 (can be changed)
• Number of registered fingerprints: 12 (3 images per node with 4 nodes)
• Measured: the time to match the input fingerprint against the registered fingerprints
[Figure: the image from the sensor is matched against the registered fingerprints on Node#0 to Node#3]
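One way to spread the registered fingerprints over 1, 2, or 4 nodes is a contiguous block partition, sketched below. `Slice` and `slice_for_node` are hypothetical names, and the slides do not say how the images are actually assigned to nodes.

```c
#include <assert.h>

/* Hypothetical description of one node's share of the database:
 * index of its first image and how many images it handles. */
typedef struct {
    int first;
    int count;
} Slice;

/* Split `total` images over `nodes` nodes as evenly as possible.
 * The first (total % nodes) nodes receive one extra image. */
Slice slice_for_node(int total, int nodes, int rank)
{
    assert(total >= 0 && nodes > 0 && rank >= 0 && rank < nodes);
    int base = total / nodes;
    int extra = total % nodes;
    Slice s;
    s.count = base + (rank < extra ? 1 : 0);
    s.first = rank * base + (rank < extra ? rank : extra);
    return s;
}
```

With 12 images this gives 12, 6, or 3 images per node for 1, 2, or 4 nodes, which matches the 3 images per node stated for the 4-node case.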
Result
• Processing time: less than 2 seconds with 4 nodes
• Sufficient performance with embedded CPUs
Conclusion
• Development of a Ubiquitous Computing Cluster with embedded CPUs
  • World's smallest cluster computer in size, power consumption, and cost
  • Suitable for prototyping next-generation ubiquitous applications
• Application: fingerprint verification
  • The performance evaluation shows a satisfactory result
  • The UCC is capable of running advanced applications
Future Plans
Next-generation ubiquitous applications for the Ubiquitous Computing Cluster:
• Robot system
• RFID certification
• Computer vision
• Fingerprint verification
• Image tracking
• Cipher
• Fault-tolerant system
• Voice / speech recognition
• Redundant server system
• Face recognition