Categorization of Library Function Call Patterns
Noritoshi Atsumi†, Shinichiro Yamamoto‡, Kiyoshi Agusa†
† Dept. of Information Engineering, Nagoya University
‡ Dept. of Information Science, Aichi Prefectural University
Outline
• Introduction
  • Background
  • The problems on retrieving
• Retrieval of know-how
  • FCDG (Function Call Dependency Graph)
  • Categorization of
    • same FCDGs
    • similar FCDGs
• Our System
• Conclusions and Future Works
Background
• The source code of many programs is readily available
• Such source code contains know-how for coding
[Figure: know-how is retrieved from a source code archive]
The know-how for coding
• Library functions
  • are used in various programs
  • are primitive functions
  • are used in certain combinations
• Library functions are a common vocabulary among developers
• Developers find usages by string retrieval with grep or similar tools
• know-how = combination of library functions
The problems on string retrieval
• Too many retrieval results
• The differences among the retrieval results are indistinct
Library function calls in the FreeBSD source tree:
    fh = fopen("/var/log/log", "a");
    ftrace = fopen(file, "a");
    if ((fp = fopen(name, "r")) == NULL) {
    if ((fp = fopen(dumpfile, "r")) == NULL) {
    fp = fopen("acp", "w");
Number of calls found:
    fopen  : 332
    socket : 212
    getopt : 175
It is necessary to categorize the results.
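The string retrieval described above can be sketched in a few lines of C. This is only a stand-in for grep, not the authors' tool: the function name count_calls and its matching rule (the name, optional blanks, then an opening parenthesis) are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Count textual occurrences of `name` used as a call: the name must not be
 * preceded by an identifier character and must be followed by optional
 * blanks and '('. Hypothetical helper, a rough stand-in for grep. */
static int count_calls(const char *text, const char *name)
{
    size_t len = strlen(name);
    int count = 0;

    assert(len > 0);
    for (const char *p = strstr(text, name); p != NULL;
         p = strstr(p + 1, name)) {
        /* Reject matches inside longer identifiers such as my_fopen. */
        int starts_word = (p == text) ||
            !(isalnum((unsigned char)p[-1]) || p[-1] == '_');
        const char *q = p + len;

        while (*q == ' ' || *q == '\t')
            q++;
        if (starts_word && *q == '(')
            count++;
    }
    return count;
}
```

Every hit is counted the same way, whatever the surrounding code does with the returned value, which is why the results are hard to tell apart.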
[Figure: the retrieval results by string search, shown as ten indistinguishable hits of the form fp = fopen(...)]
Dependencies between library function calls
    if ((sockfd = socket (res->ai_family, res->ai_socktype, res->ai_protocol)) < 0)
        err_sys ("Can't open socket");
    if (udp) { … } }
    if (setsockopt (sockfd, SOL_IP, IP_TOS, (void *) &tos, (socklen_t) sizeof (tos)) != 0) {
        err_sys ("Failed setting IP type of service octet");
    }
    if (!ttcp && !icp) {
        if ((errno == EINTR) && (timeout_flag)) {
            printf ("Timeout while connecting\n");
            close (sockfd);
            continue;
        }
        if ((nr < 0 || nr != n) && timeout_flag) {
            close(sockfd);
        }
    }
    close(sockfd);
• Complex control structures and long code make the dependencies unclear
• How are the library functions combined?
Function Call Dependency Graph [’98Miura]
• Nodes
  • Definition Node: stores the return value of a library function call in a variable
  • Reference Node: refers to the return value of a library function call as an argument of a library function call
  • Controlled Node: depends on the truth value of a condition
  • Control Node: refers to the return value of a library function call in a condition
• Edges (Data and Control Dependencies)
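One possible in-memory layout for such a graph is sketched below. It is not taken from the cited work: the type names, the fixed capacities, and the choice to store node roles as bit flags (so that one call can be, say, both a reference node and a controlled node) are all assumptions of this sketch.

```c
#include <assert.h>
#include <string.h>

/* Node roles from the slide, modelled as bit flags (an assumption). */
enum fcdg_role {
    ROLE_DEFINITION = 1 << 0,  /* return value stored in a variable   */
    ROLE_REFERENCE  = 1 << 1,  /* uses another call's return value    */
    ROLE_CONTROLLED = 1 << 2,  /* executed depending on a condition   */
    ROLE_CONTROL    = 1 << 3   /* condition that tests a return value */
};

enum fcdg_dep { DEP_DATA, DEP_CONTROL };

#define FCDG_MAX_NODES 32      /* hypothetical fixed capacities */
#define FCDG_MAX_EDGES 64

struct fcdg_node { char label[32]; unsigned roles; };
struct fcdg_edge { int from, to; enum fcdg_dep dep; };

struct fcdg {
    struct fcdg_node nodes[FCDG_MAX_NODES];
    struct fcdg_edge edges[FCDG_MAX_EDGES];
    int n_nodes, n_edges;
};

static void fcdg_init(struct fcdg *g)
{
    g->n_nodes = 0;
    g->n_edges = 0;
}

/* Append a node and return its index. */
static int fcdg_add_node(struct fcdg *g, const char *label, unsigned roles)
{
    struct fcdg_node *n;

    assert(g->n_nodes < FCDG_MAX_NODES);
    n = &g->nodes[g->n_nodes];
    strncpy(n->label, label, sizeof n->label - 1);
    n->label[sizeof n->label - 1] = '\0';
    n->roles = roles;
    return g->n_nodes++;
}

/* Append a directed dependency edge between two existing nodes. */
static void fcdg_add_edge(struct fcdg *g, int from, int to, enum fcdg_dep dep)
{
    assert(g->n_edges < FCDG_MAX_EDGES);
    assert(from >= 0 && from < g->n_nodes);
    assert(to >= 0 && to < g->n_nodes);
    g->edges[g->n_edges++] = (struct fcdg_edge){ from, to, dep };
}
```

A caller would add one node per library function call or condition, then one edge per data or control dependency between them.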
The Dependencies in FCDG
• Data dependency: the return value of function call f is referred to in another function call g
    1. g ( … , f ( ), … ) ;
    2. a = f ( ) ; … ; g ( … , a , … ) ;
• Control dependency: whether function call f is executed or not is determined by the condition c
    while ( c ) { f ( ); }
    if ( c ) { f( ); }
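The two forms of data dependency and the control dependency can be seen together in a small, concrete case. The helper names dup_nested and dup_via_variable below are hypothetical; only strlen, malloc, strcpy and memcpy are real library functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Form 1, g(..., f(), ...): the return value of strlen() is used
 * directly inside the argument of malloc(). */
static char *dup_nested(const char *s)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(strlen(s) + 1);   /* malloc depends on strlen (data)     */
    if (copy != NULL)               /* condition tests malloc's return     */
        strcpy(copy, s);            /* strcpy: controlled by the condition */
    return copy;                    /* and data-dependent on malloc        */
}

/* Form 2, a = f(); ...; g(..., a, ...): the return value of strlen()
 * reaches malloc() and memcpy() through the variable len. */
static char *dup_via_variable(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s);
    copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, s, len + 1);   /* copies the terminating '\0' too */
    return copy;
}
```

Both functions behave the same; they differ only in how the data dependency on strlen is written, which is exactly the difference a dependency graph abstracts away.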
Example of FCDG
    fd = fopen (fname, "r");
    if (fd != NULL) {
        ptr = fgets(line, sizeof(line), fd);
        if (ptr != NULL) {
            p = strstr(line, NAME);
            if (p != NULL) {
                p++;
                strcpy(name, p);
            }
        }
        fclose(fd);
    }
[Figure: the FCDG of this code. Its nodes are fopen, fgets, fclose, strstr, strcpy and three "!= NULL" conditions, joined by edges labeled t (true). This graph is the library function call pattern.]
Categorization of same FCDGs
[Figure: the fp = fopen(...) hits from string retrieval are grouped step by step; hits that have the same FCDG are merged into one category]
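A minimal sketch of a "same FCDG" test follows. The slides do not say how equality is decided, so this sketch makes a simplifying assumption: two graphs are treated as the same when they contain the same multiset of labelled edges. That is weaker than true graph isomorphism when a function is called more than once. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One labelled dependency edge: caller label, callee label, and
 * 'd' for a data dependency or 'c' for a control dependency. */
struct edge { const char *from; const char *to; char dep; };

/* Total order on edges, used both for sorting and for comparison. */
static int edge_cmp(const void *a, const void *b)
{
    const struct edge *x = a, *y = b;
    int r = strcmp(x->from, y->from);

    if (r == 0)
        r = strcmp(x->to, y->to);
    if (r == 0)
        r = (x->dep > y->dep) - (x->dep < y->dep);
    return r;
}

/* Return 1 when both edge lists hold the same multiset of edges.
 * Note: sorts both arrays in place. */
static int same_fcdg(struct edge *a, size_t na, struct edge *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    if (na != nb)
        return 0;
    qsort(a, na, sizeof *a, edge_cmp);
    qsort(b, nb, sizeof *b, edge_cmp);
    for (size_t i = 0; i < na; i++)
        if (edge_cmp(&a[i], &b[i]) != 0)
            return 0;
    return 1;
}
```

Grouping then amounts to comparing each newly extracted graph against one representative per existing category and opening a new category when none matches.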
The problems on categorization of same FCDGs
              Number of function calls    Number of FCDG types
    fopen              332                        78
    socket             212                        43
    getopt             175                        22
• The same FCDG found in many programs: has few nodes and contains little know-how
• The same FCDG found in only a few programs: has many nodes and contains much know-how, but there are many similar FCDGs
Similarity between FCDGs
• wij: weight of edge (ni, nj)
  • an edge that occurs in many FCDGs is a natural dependency
  • any other edge is a characteristic element of its FCDG
• sim(Fx, Fy): similarity between FCDGs Fx and Fy
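The exact definitions of wij and sim(Fx, Fy) are not given in the text, so the following is only an illustration of how an edge-weighted similarity could work, not the authors' formula. It assumes a weight of 1/df for each edge, where df is the number of FCDGs that contain the edge, and a weighted Jaccard ratio for sim. Edges are identified by small integer ids; N_EDGE_IDS and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N_EDGE_IDS 8   /* hypothetical size of the edge vocabulary */

/* An FCDG reduced to the set of edges it contains. */
struct fcdg { int has_edge[N_EDGE_IDS]; };

/* Assumed weighting: w[e] = 1 / (number of FCDGs containing edge e),
 * so edges shared by many graphs count less than rare ones. */
static void edge_weights(const struct fcdg *set, size_t n,
                         double w[N_EDGE_IDS])
{
    assert(n > 0);
    for (int e = 0; e < N_EDGE_IDS; e++) {
        int df = 0;

        for (size_t k = 0; k < n; k++)
            df += set[k].has_edge[e] ? 1 : 0;
        w[e] = df > 0 ? 1.0 / df : 0.0;
    }
}

/* Assumed similarity: weight of shared edges over weight of all edges
 * present in either graph (weighted Jaccard), in the range 0..1. */
static double sim(const struct fcdg *x, const struct fcdg *y,
                  const double w[N_EDGE_IDS])
{
    double shared = 0.0, total = 0.0;

    for (int e = 0; e < N_EDGE_IDS; e++) {
        if (x->has_edge[e] || y->has_edge[e])
            total += w[e];
        if (x->has_edge[e] && y->has_edge[e])
            shared += w[e];
    }
    return total > 0.0 ? shared / total : 1.0;  /* two empty graphs match */
}
```

Under these assumptions, two graphs that share only a very common edge score low, while graphs that share a rare edge score high; a threshold on sim would then decide which categories of same FCDGs are merged.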
Categorization of similar FCDGs
[Figure: categories of same FCDGs are merged step by step into categories of similar FCDGs]
Extraction of know-how
• Source: FreeBSD-4.5RELEASE source tree (/usr/src/usr.sbin)
  • target programs: written in C
  • 162 programs, 311,653 lines
• Library functions: those declared in /usr/include
              Function calls    After categorization    After categorization
                                of same FCDGs           of similar FCDGs
    fopen          332                  78                      10
    socket         212                  43                       6
    getopt         175                  22                       4
Retrieval System
[Figure: configuration diagram with the components source code, extract FCDG, categorize, FCDG DB and retrieval system]
Result of Categorization
• Frequent dependencies: socket – setsockopt, socket – bind, socket – close
• Return value check
Pattern 1:
    $1 = socket(); if ($1 < 0) { }
    $2 = setsockopt($1); if ($2 < 0) { }
    $3 = bind($1); if ($3 < 0) { }
    $4 = listen($1); if ($4 < 0) { }
    close($1);
Pattern 2:
    $1 = socket(); if ($1 < 0) { }
    $2 = bind($1); if ($2 < 0) { }
    $3 = listen($1); if ($3 < 0) { }
Pattern 3:
    $1 = socket(); if ($1 < 0) { }
    $2 = setsockopt($1); if ($2 < 0) { }
    $3 = bind($1); if ($3 < 0) { }
    $4 = ioctl($1); if ($4 < 0) { }
    close($1);
Conclusions
• Extract the know-how for the usage of library functions
  • Dependencies between library function calls
  • Extraction of FCDGs
• Retrieve the usage of library functions
  • Categorization of FCDGs
• This makes it easy to find the intended usage
Future Works
• Inter-function dependency analysis
  • To extract many more patterns
• Agent system for coding
  • To navigate coding, like the MS-Office Assistant