Dynamic Benchmarking in Software Development

Dynamic Benchmarking Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com Software development through competition

This slide intentionally left blank

Contents • Dynamic Benchmarking Introduction • Uses of the Benchmarking Game model • Software Development (CS 4500) • A Lesson I’ve learned Caution: Slide layout may cause drowsiness.

Benchmarking • Assesses relative performance • Typically by running standardized tests • Produces scores which are then compared • SATs • Other options exist • Allowing software to compete directly • Chess game

The Traditional Approach • Developer A → Software A → Static Benchmark → Score A • Developer B → Software B → Static Benchmark → Score B • Developer C → Software C → Static Benchmark → Score C • Parameterized by the domain.

The Dynamic Approach • Team A → Software A + Benchmark A → Agent • Team B → Software B + Benchmark B → Agent • Team C → Software C + Benchmark C → Agent • Agents → Artificial World (Game) → Agent Ranking • Parameterized by the domain.

An Artificial World: Agent’s View • Agent → Administrator: Beliefs, Challenges, Problems, Solutions • Administrator → Agent: Opponents’ communication, Feedback • Administrator → Results • Problems: Benchmark output • Solutions: Software output • Beliefs/Challenges: statements about algorithms

Problems & Solutions • Problem communication: • Define an instance of a problem in the domain • Solution communication: • Respond to an opponent’s problem • Administrator has a metric for determining how good a solution is • This metric is well defined and known by all
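A minimal sketch of the problem/solution exchange, assuming a toy domain that is not from the slides: a problem is a list of positive integers plus a capacity, and a solution picks items whose sum must not exceed the capacity. The names Problem, Solution and quality are hypothetical; the point is only that the metric is a public, deterministic function every agent can evaluate for itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Problem:
    # Toy domain (an assumption for illustration): positive integers
    # and a positive capacity the chosen items may not exceed.
    items: Tuple[int, ...]
    capacity: int


@dataclass(frozen=True)
class Solution:
    # Indices into Problem.items chosen by the responding agent.
    chosen: Tuple[int, ...]


def quality(problem: Problem, solution: Solution) -> float:
    """Public metric: fraction of the capacity filled, 0.0 if invalid."""
    indices = solution.chosen
    if len(set(indices)) != len(indices):
        return 0.0  # an item was used twice
    if any(i < 0 or i >= len(problem.items) for i in indices):
        return 0.0  # index outside the problem
    total = sum(problem.items[i] for i in indices)
    if total > problem.capacity:
        return 0.0  # over capacity
    return total / problem.capacity
```

Because the metric is known by all, an agent can score its own candidate solutions before submitting one, and the Administrator only has to re-run the same function.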

Beliefs & Challenges • General statements about algorithms • Belief: • Defines a subset of the problems in the domain • Makes a statement about the problems in that subset • Challenge: • A response to a belief of an opponent
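One possible reading of a belief and a challenge, sketched under assumptions the slides do not spell out: the belief claims that every problem in its subset can be solved to at least a stated quality, and a challenger tests the claim by picking a problem from that subset. Belief, resolve_challenge and the outcome labels are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Belief:
    owner: str
    # Defines the subset of the domain the belief talks about.
    in_subset: Callable[[Any], bool]
    # Assumed form of the statement: "every problem in the subset
    # can be solved with at least this quality".
    claimed_quality: float


def resolve_challenge(belief, problem, solve, metric) -> str:
    """Return 'invalid', 'defended' or 'refuted' for one challenge."""
    if not belief.in_subset(problem):
        return "invalid"  # the challenger left the belief's subset
    achieved = metric(problem, solve(problem))
    if achieved >= belief.claimed_quality:
        return "defended"
    return "refuted"
```

Here solve stands for the believer's software and metric for the Administrator's public metric; both are passed in so the sketch stays independent of any particular domain.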

Administrator • Opponents’ communication • Filter all communication through the Administrator for security • Filter information when necessary • Feedback: • Inform agents of rule violations • Inform agents of status changes

Administrator • Results • Track state changes through the game • Produce the agent ranking from the end game state
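A small sketch of how an Administrator might track state changes and derive the ranking from the end state. The Ledger class, the numeric score per agent, and the alphabetical tie-break are all assumptions made for illustration, not details taken from the slides.

```python
class Ledger:
    """Hypothetical Administrator-side record of the game state."""

    def __init__(self, agents):
        self.scores = {name: 0.0 for name in agents}
        self.history = []  # every state change, in order

    def record(self, agent, delta, reason):
        # Only registered agents may change state.
        if agent not in self.scores:
            raise KeyError(f"unknown agent: {agent}")
        self.scores[agent] += delta
        self.history.append((agent, delta, reason))

    def ranking(self):
        # Highest score first; ties broken by name so the result
        # is deterministic.
        return sorted(self.scores, key=lambda a: (-self.scores[a], a))
```

Keeping the full history alongside the scores means the final ranking can be audited or recomputed after the game ends.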

What’s next • Dynamic Benchmarking Introduction • Uses of the Benchmarking Game model • Software Development (CS 4500) • A Lesson I’ve learned If you can read this, you don’t need glasses.

Overhead • Requires mature Administrator, communication system for accurate results • Reuse between domains is possible • Requires new translation for each problem domain

Software Development • Ranks software without a mature benchmark • Dynamic approach excels when a well-defined benchmark does not exist • Creates data to build better benchmarks • Because Agents, not Software, are ranked • Forces developers to consider both their solutions and the problem domain

Education • Motivates students • Mature Administrator/Agent not required • Creates interesting student interaction • Creates a realistic software development environment

What’s next • Dynamic Benchmarking Introduction • Uses of the Benchmarking Game model • Software Development (CS 4500) • A Lesson I’ve learned Yeah, I got nothing.

Specker Challenge Game • The SCG is the basis for Professor Karl Lieberherr’s Software Development class • Uses an arity-3 boolean constraint satisfaction problem (CSP) as its domain • Teams of 2 to 3 produce the components of an Agent
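To make the domain concrete, here is a sketch of an arity-3 boolean CSP. It assumes each constraint is a relation number from 0 to 255 read as an 8-row truth table over three variables, with row index 4a + 2b + c, and it scores an assignment by the fraction of constraints satisfied. The encoding, the bit order, and the function names are assumptions, not the course's specification.

```python
def satisfied(relation, values):
    """True if the three boolean values (0 or 1) satisfy the relation.

    Assumed encoding: bit number 4*a + 2*b + c of the relation is set
    exactly when the row (a, b, c) is allowed.
    """
    a, b, c = values
    row = (a << 2) | (b << 1) | c
    return (relation >> row) & 1 == 1


def fraction_satisfied(constraints, assignment):
    """Share of constraints satisfied by an assignment.

    constraints: list of (relation, (var1, var2, var3)) pairs
    assignment:  dict mapping each variable name to 0 or 1
    """
    if not constraints:
        raise ValueError("a problem needs at least one constraint")
    hits = 0
    for relation, variables in constraints:
        values = tuple(assignment[v] for v in variables)
        if satisfied(relation, values):
            hits += 1
    return hits / len(constraints)
```

Under this encoding, relation 22 allows exactly the rows 001, 010 and 100, that is, "exactly one of the three variables is true".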

(Some of the) Skills Involved • Using outsourced tools • DemeterF (developed by Bryan Chadwick) • Component Market • Dealing with users • Underspecified requirements • Source control • Constraint Satisfaction algorithms • Data mining

Added bonus Domain Knowledge Experts Code So what? Programmers Requirements Limitations How-to Non-technical Requirements Gibberish Salespeople Customers Users

It’s a busy class • Traditional grading would not work • The competition keeps students motivated

What’s next • Dynamic Benchmarking Introduction • Uses of the Benchmarking Game model • Software Development (CS 4500) • A Lesson I’ve learned

Administrator Security • Never accept extra input • Transaction: Challenge: ID, Type, Price • vs. • Transaction: Challenge: ID • Check all necessary input • Transaction: Deliver Problem: ID, Problem • Check: Does the Problem match the Type?
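The two rules above can be sketched as a validation step. In this sketch the Administrator keeps its own record of offered challenges, so an acceptance carries only the ID and the type and price are looked up rather than trusted from the agent. The field names, the transaction kinds, and the matches_type callback are hypothetical.

```python
# Exactly the fields each transaction kind may carry (assumed names).
ALLOWED_FIELDS = {
    "accept_challenge": {"id"},
    "deliver_problem": {"id", "problem"},
}


def validate(kind, fields, offered, matches_type):
    """Return None if the transaction is acceptable, else a reason.

    offered:      Administrator's own record, challenge id -> details
    matches_type: callable(problem, type) -> bool, domain specific
    """
    expected = ALLOWED_FIELDS.get(kind)
    if expected is None:
        return "unknown transaction kind"
    extra = set(fields) - expected
    if extra:
        return f"unexpected fields: {sorted(extra)}"
    missing = expected - set(fields)
    if missing:
        return f"missing fields: {sorted(missing)}"
    challenge = offered.get(fields["id"])
    if challenge is None:
        return "unknown challenge id"
    if kind == "deliver_problem":
        # Check the delivered problem against the recorded type.
        if not matches_type(fields["problem"], challenge["type"]):
            return "problem does not match challenge type"
    return None
```

Rejecting extra fields outright, instead of silently ignoring them, keeps an agent from overriding values such as the price that the Administrator already holds.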

General Lesson • Never trust user input • Sanitize data • Protect against buffer overflows

More General Lesson • It’s good to see things before they can do you or others harm • Users you can yell at • Security flaws that don’t cost money • Underspecified requirements

Thank you! Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com
