Locating Code Changes in Web Application Evolution • Xiaoyin Wang • 10648871
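About Me • Xiaoyin Wang • Direct-entry PhD student, class of 2006 • Institute of Software Engineering • Program Comprehension and Testing Group • Advisor: Hong Mei • Supervising instructor: Lu Zhang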
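Research Background: Code Evolution • Code evolution refers to the modifications made to code after the software has been completed, together with the other activities related to those modifications • Code evolution process model: the programmer's perspective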
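Research Directions • Locating code changes in web application evolution • Bug report deduplication based on natural language and execution information • Web application internationalization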
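Core Challenge of Web Application Evolution: Dynamically Generated Code • How traditional software runs • How web applications run: the server-side statement echo "<input type = button onclick=f(" . $_POST['type'] . ")>" generates the client-side code <input type = button onclick=f(3)>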
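The following minimal Python sketch mirrors the echo statement on this slide. It is not from the slides: `render_button` and the `post_params` dictionary are hypothetical stand-ins for the PHP `echo` and the request data. It only illustrates that the client-side code does not exist in the source as written; it is assembled at run time from constant fragments and user input.

```python
def render_button(post_params):
    """Build client-side code at run time from constants and request data.

    `post_params` is a hypothetical stand-in for PHP's $_POST array.
    """
    return "<input type=button onclick=f(" + post_params["type"] + ")>"


if __name__ == "__main__":
    # The generated page contains f(3) only because the request carried type=3.
    print(render_button({"type": "3"}))
```

A change request that mentions `onclick=f(3)` therefore cannot be located by searching the source text for that string, which is the difficulty the slide points at.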
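Technical Approach to Dynamically Generated Code: String Analysis • String analysis determines all possible values of a string variable; its output is a context-free grammar that covers every possible value of that variable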
Web Application Internationalization • Locating need-to-translate constant strings in web applications • Xiaoyin Wang, Lu Zhang, Tao Xie, Hong Mei, Jiasu Sun • SIGSOFT 2010, to appear
Introduction • I18n: Internationalization (I18n) is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. • L10n: Localization (L10n) is the process of adapting software for a specific region or language by adding locale-specific components and translating text. • Globalization: I18n and L10n are together referred to as globalization.
Globalization Process • [Figure: the developer turns a one-language version into an internationalized version (I18n), in which all language-specific code elements are externalized to property files; L10n then derives the English, German, and Chinese versions from the English, German, and Chinese property files.] • Old software projects • New project with no global plan at first • Using old components
Language-Specific Code Elements • Constant Strings • Date/Number Formats • Currency/Measures • Writing Direction • Color/Culture-related elements • … Constant strings are the most numerous, and some of them are very hard to locate.
Example of I18n and L10n • Original Code Elements • Externalized Code Elements • Property files
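As a hedged illustration of the three items above, the sketch below shows a hard-coded constant string, the same code after the string is externalized, and the property data it reads. None of this is taken from the slides: the keys, the translations, `PROPERTIES`, and `message` are hypothetical, and the property "files" are kept in memory as dictionaries for brevity.

```python
# Hypothetical property "files", one per locale (kept in memory for brevity).
PROPERTIES = {
    "en": {"login.title": "Sign in", "login.submit": "Submit"},
    "de": {"login.title": "Anmelden", "login.submit": "Absenden"},
}


def message(key, locale, fallback="en"):
    """Look up an externalized string, falling back to the default locale."""
    bundle = PROPERTIES.get(locale, {})
    if key in bundle:
        return bundle[key]
    return PROPERTIES[fallback][key]


def login_title_original():
    # Original code element: the user-visible string is hard-coded.
    return "<h1>Sign in</h1>"


def login_title_externalized(locale):
    # Externalized code element: only the key remains in the code.
    return "<h1>" + message("login.title", locale) + "</h1>"


if __name__ == "__main__":
    print(login_title_externalized("de"))
```

Note that the tag text `<h1>` stays in the code while only the user-visible part moves to the property data; telling these two kinds of constant apart is the problem the rest of the talk addresses.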
Basic Idea • We assume that the need-to-translate strings are exactly those strings that are sent to the GUI • [Figure: Constant Strings, String Variables/Expressions, GUI]
String Analysis Example • For the following code fragment: out = ""; i = 0; while (input != null) { input = _POST("addressline" + i); out = out . "<label>" . input.replace('\n', '') . "</label>\n"; i++; } echo out; • String analysis yields the following grammar: • input -> (Σ - \n)* • out0 -> "" • out1 -> out0 | out2 • out2 -> out1 "<label>" input "</label>\n"
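The grammar above can be sanity-checked against a concrete run. The sketch below is an assumption-laden illustration, not the authors' tool: `render` is a hypothetical Python rewrite of the loop that iterates over a given list instead of reading request parameters until one is null, and `OUT_LANGUAGE` is a regular expression that happens to describe the same language as `out1` (possible here because this particular grammar is regular).

```python
import re

# Regular expression for the language of out1: zero or more repetitions of
# "<label>" + (any characters except newline) + "</label>\n".
OUT_LANGUAGE = re.compile(r"(?:<label>[^\n]*</label>\n)*")


def render(address_lines):
    """Simplified version of the loop: one <label> line per input string."""
    out = ""
    for line in address_lines:
        out += "<label>" + line.replace("\n", "") + "</label>\n"
    return out


def in_grammar(text):
    """True if `text` is one of the values the grammar allows for `out`."""
    return OUT_LANGUAGE.fullmatch(text) is not None


if __name__ == "__main__":
    page = render(["1 Main St", "Apt\n2"])
    print(in_grammar(page))
```

Every output of `render` falls inside the grammar, while the grammar may also admit strings that no run produces; string analysis is conservative in that direction.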
Motivation • Approaches that locate need-to-translate strings in traditional applications cannot be applied directly to web applications. The problem is that a web application outputs not only user-visible strings but also tags.
Motivating Example • Constant String inside Tags
Motivating Example • Constant String outside Tags • Constant String in input tags
Flag Propagation Algorithm • Four steps: • Add a left flag and a right flag to each variable in the CFG • Initialize the flags of terminals containing '<' or '>' with inside/outside (for example, "<input name=" is outside on its left and inside on its right, while ">" is inside on its left and outside on its right) • Propagate flags in the CFG by applying the four propagation rules iteratively • The algorithm ends when no further propagation is possible; whether a constant string is outside tags is then determined by the flags of its corresponding terminal
Flag Propagation Algorithm • Four Propagation Rules • If two variables are adjacent on the right side of a production, we propagate between the right flag of the first variable and the left flag of the second variable: S -> A(r) (l)B • If a terminal contains neither '<' nor '>', we propagate between the left flag and the right flag of the terminal: S -> (l)"abc"(r) • We propagate between the right flag of the production's left-side variable and the right flag of the last variable on the production's right side: S(r) -> A B(r) • We propagate between the left flag of the production's left-side variable and the left flag of the first variable on the production's right side: (l)S -> (l)A B
Illustration (1) • S -> P | P S (1) • P -> A "<input name=" D E (2) • A -> "studentID" (3) • D -> "student" (4) • E -> ">" (5)
Illustration (2) Initialization: • (U)S(U) -> (U)P(U) | (U)P(U) (U)S(U) (1) • (U)P(U) -> (U)A(U) (O)"<input name="(I) (U)D(U) (U)E(U) (2) • (U)A(U) -> (U)"studentID"(U) (3) • (U)D(U) -> (U)"student"(U) (4) • (U)E(U) -> (I)">"(O) (5)
Illustration (3) After 1 step: • (U)S(U) -> (U)P(U) | (U)P(U) (U)S(U) (1) • (U)P(U) -> (U)A(O) (O)"<input name="(I) (I)D(U) (I)E(O) (2) • (U)A(O) -> (U)"studentID"(U) (3) • (I)D(U) -> (U)"student"(U) (4) • (I)E(O) -> (I)">"(O) (5)
Illustration (4) Final: • (O)S(O) -> (O)P(O) | (O)P(O) (O)S(O) (1) • (O)P(O) -> (O)A(O) (O)"<input name="(I) (I)D(I) (I)E(O) (2) • (O)A(O) -> (O)"studentID"(O) (3) • (I)D(I) -> (I)"student"(I) (4) • (I)E(O) -> (I)">"(O) (5)
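The four steps and four rules can be sketched in Python as follows. This is a minimal reading of the slides, not the authors' implementation: the grammar encoding, the helper names (`N`, `T`, `propagate_flags`, `outside_tag_strings`), and the choice to skip conflicting flags silently are all assumptions. Variables share one pair of flags across the whole grammar, while each terminal occurrence has its own pair, as in the illustration.

```python
UNKNOWN, INSIDE, OUTSIDE = "U", "I", "O"


def N(name):
    return ("N", name)  # a variable (nonterminal)


def T(text):
    return ("T", text)  # a terminal (constant string)


def initial_flags(text):
    """Left/right flags of a terminal from its first and last angle bracket."""
    brackets = [c for c in text if c in "<>"]
    if not brackets:
        return UNKNOWN, UNKNOWN
    left = OUTSIDE if brackets[0] == "<" else INSIDE
    right = INSIDE if brackets[-1] == "<" else OUTSIDE
    return left, right


def propagate_flags(grammar):
    flags = {}      # (node, "l" or "r") -> flag
    links = []      # pairs of flag slots that must agree
    terminals = {}  # terminal occurrence -> its text

    def slot(node, side):
        flags.setdefault((node, side), UNKNOWN)
        return (node, side)

    for lhs, alternatives in grammar.items():
        for a, rhs in enumerate(alternatives):
            nodes = []
            for p, (kind, value) in enumerate(rhs):
                if kind == "N":
                    node = ("N", value)
                else:
                    node = ("T", lhs, a, p)
                    terminals[node] = value
                    flags[(node, "l")], flags[(node, "r")] = initial_flags(value)
                    if flags[(node, "l")] == UNKNOWN:  # rule 2: no '<' or '>'
                        links.append(((node, "l"), (node, "r")))
                nodes.append(node)
            if not nodes:
                continue
            for first, second in zip(nodes, nodes[1:]):  # rule 1: neighbours
                links.append((slot(first, "r"), slot(second, "l")))
            links.append((slot(("N", lhs), "r"), slot(nodes[-1], "r")))  # rule 3
            links.append((slot(("N", lhs), "l"), slot(nodes[0], "l")))   # rule 4

    changed = True
    while changed:  # iterate until no more propagation can be made
        changed = False
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if flags[src] != UNKNOWN and flags[dst] == UNKNOWN:
                    flags[dst] = flags[src]
                    changed = True
    return flags, terminals


def outside_tag_strings(grammar):
    """Constant strings whose terminal is flagged outside on both sides."""
    flags, terminals = propagate_flags(grammar)
    return sorted(text for node, text in terminals.items()
                  if flags[(node, "l")] == OUTSIDE and flags[(node, "r")] == OUTSIDE)


# The grammar from the illustration slides.
EXAMPLE_GRAMMAR = {
    "S": [[N("P")], [N("P"), N("S")]],
    "P": [[N("A"), T("<input name="), N("D"), N("E")]],
    "A": [[T("studentID")]],
    "D": [[T("student")]],
    "E": [[T(">")]],
}

if __name__ == "__main__":
    print(outside_tag_strings(EXAMPLE_GRAMMAR))
```

On the example grammar, the sketch reaches the same final flags as the illustration: `studentID` ends up outside tags, while `student` ends up inside the `<input>` tag.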
Experiment Setup • Three PHP projects • Lime Survey • Squirrel • MRBS • Table 1
Experiment Result • Basic Result