The Devil and Packet Trace Anonymization. Authors: Ruoming Pang, Mark Allman, Vern Paxson, Jason Lee. Princeton University, International Computer Science Institute, Lawrence Berkeley National Laboratory (LBNL)
The Devil and Packet Trace Anonymization. Authors: Ruoming Pang, Mark Allman, Vern Paxson, Jason Lee. Princeton University, International Computer Science Institute, Lawrence Berkeley National Laboratory (LBNL). Publication: Computer Communication Review, January 2006. Presenter: Radha V. Maldhure
AGENDA • ANONYMIZATION • PROBLEM WITH CURRENT TECHNIQUES • USE OF ANONYMIZATION • PAPER’S CONTENTS • METHODOLOGY • ANONYMIZATION POLICY • INFORMATION LOSS • VALIDATION • CONCLUSION • CONTRIBUTIONS • WEAKNESSES • SUGGESTIONS
INTRODUCTION [Diagram: released data, e.g. packet traces, reaches both the researcher, who uses it to improve and develop, and the attacker, who uses it to attack; a second panel places anonymization between the released data and the attacker.]
ANONYMIZATION • Releasing network measurement data to the research community • Publishing traces requires a balance between the security needs of the organization and research usefulness • Example: “tcpdpriv” removes TCP options from traces, which prevents physical fingerprinting but also removes research value [Diagram: a balance weighing research usefulness against security needs]
PROBLEM WITH CURRENT TECHNIQUES Existing publicly released traces have the following problems: • No careful guidance on an anonymization policy for public release • No tool that adapts to a particular policy • Example: NLANR’s PMA packet traces
USE OF ANONYMIZATION Some uses of anonymized traces: • Analyzing a web site's performance and availability • Understanding the Internet’s structure and behavior
PAPER’S CONTENTS • Arrives at an acceptable anonymization policy • Presents a tool, “tcpmkpub”, that implements the suggested transformations • Provides meta-data about each trace for analysis
METHODOLOGY • Precise method for anonymization • Purpose of transform • Concerns for appearing traffic • Policy decisions • Anonymization tool
Example Specification: specification of IP header anonymization [figure not reproduced]
ANONYMIZATION POLICY • Focuses on traces that include only packet headers • Presents one possible policy, not necessarily the correct one • It is crucial to prevent users of the trace files from determining: • the identities of specific hosts • the identities of internal hosts, such that a map could be constructed of which hosts support which services • the security practices of the organization
Protocol Stack • Application Layer: FTP / Telnet / SNMP / DNS • Transport Layer: TCP / UDP • Internet Layer: IP / ARP / ICMP / IGMP • Network Interface Layer: Ethernet / ATM / FR
CHECKSUMS Reason to anonymize: checksums in traces are re-calculated for two reasons: • An original checksum gives away information about the packet content even when the application data is removed • Users of the trace still need to determine whether the original checksum was valid Way to anonymize (original checksum Co, calculated checksum Cc): • Replace Co by Cc • Insert “1” into the appropriate checksum field to mark a packet whose original checksum failed
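The checksum rule above can be sketched in Python as follows. This is a minimal illustration, not tcpmkpub's code: the function names `internet_checksum` and `rewrite_checksum` and the `offset` parameter are hypothetical, and the decision "valid original gets Cc, invalid original gets 1" is an assumed reading of the two bullets on the slide.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement checksum over 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_checksum(original: bytes, anonymized: bytes, offset: int = 10) -> bytes:
    """Return the anonymized header with its checksum field rewritten.

    offset is the byte position of the 16-bit checksum field
    (10 for an IPv4 header). Assumed policy: if the original checksum Co
    was valid, store the recalculated checksum Cc; otherwise store 1.
    A valid Cc that happens to equal 1 is ambiguous and is not handled here.
    """
    def zeroed(header: bytes) -> bytes:
        return header[:offset] + b"\x00\x00" + header[offset + 2:]

    co = int.from_bytes(original[offset:offset + 2], "big")
    original_ok = internet_checksum(zeroed(original)) == co
    new_value = internet_checksum(zeroed(anonymized)) if original_ok else 1
    return anonymized[:offset] + new_value.to_bytes(2, "big") + anonymized[offset + 2:]
```

With this rule a trace user can still tell valid from failed checksums, while the stored value no longer depends on the removed payload or the original addresses.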
NETWORK INTERFACE LAYER: Ethernet Address Reason to anonymize: • Ethernet addresses are unique to individual NICs • They can be used by an attacker to uncover the actions of a given user Way to anonymize: • Three different methods of randomizing Ethernet addresses • Scrambling the entire 6-byte address • Scrambling only the lower 3 bytes of the address • Scrambling the lower 3 and upper 3 bytes independently
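The three scrambling options can be compared with a short Python sketch. The keyed hash (HMAC-SHA256) is an assumption chosen for illustration; the slides do not say how tcpmkpub scrambles addresses, and `scramble_mac` and its `mode` names are hypothetical.

```python
import hashlib
import hmac


def _scramble(part: bytes, key: bytes, label: bytes) -> bytes:
    """Keyed, deterministic scrambling of a byte string (same length out).

    Truncating a hash is not a permutation, so two inputs can collide;
    a real tool would need a collision-free mapping.
    """
    digest = hmac.new(key, label + part, hashlib.sha256).digest()
    return digest[:len(part)]


def scramble_mac(mac: bytes, key: bytes, mode: str) -> bytes:
    """Scramble a 6-byte Ethernet address in one of three ways."""
    if len(mac) != 6:
        raise ValueError("Ethernet addresses are 6 bytes long")
    upper, lower = mac[:3], mac[3:]
    if mode == "whole":        # scramble the entire 6-byte address
        return _scramble(mac, key, b"whole")
    if mode == "lower":        # keep the upper 3 bytes, scramble the lower 3
        return upper + _scramble(lower, key, b"lower")
    if mode == "independent":  # scramble upper and lower halves separately
        return _scramble(upper, key, b"upper") + _scramble(lower, key, b"lower")
    raise ValueError(f"unknown mode: {mode}")
```

The trade-off is visible in the modes: "lower" leaves the vendor prefix readable, "independent" hides the vendor but still shows which NICs share one, and "whole" removes both.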
INTERNET LAYER: IP Address Reason to anonymize: • An attacker who knows a host's IP address can obtain an account of its user’s activities • An attacker can plan an attack using information about the services running on the host Way to anonymize: • Remap addresses differently based on the type of address • Multicast addresses are preserved in the anonymized trace
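A type-dependent remapping can be sketched in Python as below. Only the multicast rule comes from the slide; the internal/external split, the `internal_net` default, the keyed hash, and the name `remap_ipv4` are assumptions made for illustration and are not tcpmkpub's scheme.

```python
import hashlib
import hmac
import ipaddress


def remap_ipv4(addr: str, key: bytes, internal_net: str = "10.0.0.0/8") -> str:
    """Remap an IPv4 address according to its type.

    Multicast addresses are preserved. Other addresses are replaced by a
    keyed hash, with internal and external addresses hashed under
    different labels (an assumed policy). A truncated hash can collide
    or land in a reserved range, which a real tool would have to avoid.
    """
    ip = ipaddress.IPv4Address(addr)
    if ip.is_multicast:
        return str(ip)
    internal = ip in ipaddress.ip_network(internal_net)
    label = b"internal" if internal else b"external"
    digest = hmac.new(key, label + ip.packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))
```

Because the mapping is deterministic for a given key, the same host keeps the same anonymized address throughout a trace, which is what connection-level analysis needs.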
TRANSPORT LAYER: TCP/UDP Reason to anonymize: not given Way to anonymize: • Preserves port numbers and sequence numbers but not timestamps • Timestamps are transformed into separate monotonically increasing counters • Research use: uniqueness and transmission order of segments
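The timestamp transformation can be sketched as a rank mapping. This is a two-pass illustration under stated assumptions: one counter per source address, no handling of 32-bit timestamp wraparound, and a hypothetical function name `renumber_timestamps`; the slides do not describe how tcpmkpub implements it.

```python
from collections import defaultdict


def renumber_timestamps(records):
    """Replace TCP timestamp values with per-source counters.

    records: iterable of (source, timestamp_value) pairs in trace order.
    Each source gets its own counter starting at 1. Equal timestamps map
    to equal counters and larger timestamps to larger counters, so
    uniqueness and ordering survive while the clock values do not.
    """
    records = list(records)
    seen = defaultdict(set)
    for source, value in records:          # first pass: collect values
        seen[source].add(value)
    rank = {
        source: {value: i + 1 for i, value in enumerate(sorted(values))}
        for source, values in seen.items()
    }
    return [(source, rank[source][value]) for source, value in records]
```

For example, the values 5000, 5010, 5010, 9000 from one source become 1, 2, 2, 3: order and duplicates are kept, but the gaps that reveal clock rate and uptime are gone.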
INFORMATION LOSS • The effectiveness in preserving information is checked by analyzing the original and anonymized traces • Two tools for analysis: “tcpsum” and “p0f” • tcpsum: crunches each TCP connection in the trace to find the number of packets and bytes sent in each direction • Except for IP addresses, the results of crunching the original and transformed traces matched • No value lost in the transformation • p0f: the paper's explanation of this check was not clear to the presenter
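The tcpsum-style comparison can be illustrated with a small Python sketch. It does not reproduce tcpsum's actual output or options; the packet tuple layout and the names `summarize` and `address_free_view` are hypothetical, and the sketch only shows the idea of comparing per-connection counts while ignoring IP addresses.

```python
from collections import Counter


def summarize(packets):
    """Count packets and bytes per direction for each connection.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, length).
    Returns {connection: [pkts_fwd, bytes_fwd, pkts_rev, bytes_rev]},
    where "forward" is the direction of the first packet seen.
    """
    conns = {}
    for src, sport, dst, dport, length in packets:
        fwd = (src, sport, dst, dport)
        rev = (dst, dport, src, sport)
        if fwd in conns:
            conns[fwd][0] += 1
            conns[fwd][1] += length
        elif rev in conns:
            conns[rev][2] += 1
            conns[rev][3] += length
        else:
            conns[fwd] = [1, length, 0, 0]
    return conns


def address_free_view(conns):
    """Drop IP addresses so two summaries can be compared as multisets."""
    return Counter((key[1], key[3], *stats) for key, stats in conns.items())
```

If `address_free_view(summarize(original))` equals `address_free_view(summarize(anonymized))`, the anonymization changed nothing that this kind of summary measures.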
VALIDATION • Need to validate that the information intended to be masked was indeed transformed or left out of the anonymized trace • Two ad hoc validations: • Inspected the log created by “tcpmkpub” • The log flags all unexpected aspects of a packet trace • Used “ipsumdump” to dump TCP options • Picked out the timestamps, sorted and verified them • Timestamp re-numbering appears accurate
CONTRIBUTIONS • Enumerated and explored the devil-ish details of preparing packet traces • Developed a framework for implementing an anonymization policy, along with the tool “tcpmkpub” • Sets a framework for future work on packet trace anonymization
WEAKNESSES • No timing information for analyzing TCP dynamics • Preserving port number may lead to identification of a particular machine • No performance analysis
SUGGESTIONS • Needs to deal with the different protocols at each layer of the protocol stack • Should present a performance analysis that indicates the tool’s efficiency in terms of: • maintaining security needs • preserving research value