Personal Contact Information
• Charles Shields, Jr., Ph.D., J.D.
• Research Associate at University of Texas at Dallas (UTD)
• cshields@utdallas.edu
• www.utdallas.edu/~cshields
Review of Two Papers
• Richard T. Snodgrass, Stanley Yao, and Christian Collberg, "Tamper Detection in Audit Logs," in Proceedings of the International Conference on Very Large Databases (VLDB), Toronto, Canada, August–September 2004, pp. 504–515.
• Kyri Pavlou and Richard T. Snodgrass, "Forensic Analysis of Database Tampering," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), Chicago, June 2006, pp. 109–120.
"Tamper Detection in Audit Logs“Overview of Paper • Emphasize the fact that audit logs be correct and verifiable • Required now by several US Federal laws (e.g. Sarbanes-Oxley, HIPAA, etc.) • Review of existing audit log techniques • Presentation of their basic idea (converting the audit log to a transaction time database with periodic validation and notarization) • Give some performance enhancements (e.g. opportunistic hashing, linked hashing) • Performance graphs and final summary
Transaction Time Database
• A subset of "temporal databases"
  • http://en.wikipedia.org/wiki/Temporal_database
• A temporal database is a database that tracks, among other things, two different time parameters: valid time and transaction time.
  • Valid time denotes the time period during which a fact is true with respect to the real world (i.e. "real" time).
  • Transaction time refers to the time period during which a fact is stored in the database.
• Bitemporal data combines both valid and transaction time.
Transaction Time Database
• Records and retains the history of its content [1]
  • All past states are retained and can be reconstructed from the information in the DB.
• Past-state reconstruction is enabled by the append-only property [1]:
  • New information is only ever added.
  • No information is ever deleted.
• In addition, the transaction-time component must be auditable. That is:
  • An audit log is maintained.
  • The log can be examined later by a validator.
Transaction Time Database
• The ultimate goal is to have enough information to both:
  • detect a bad event, and
  • determine exactly when, how, and by whom it occurred.
Transaction Time Database
• A transaction-time table contains all the columns a normal database table might have, plus two extra fields: Start and Stop.
  • Start: records when the data item was added to the database (transaction time)
  • Stop: tracks the different states of the row (tuple)
• Example operations that maintain history:
  • Deletion: Stop is marked deleted, but the row is retained
  • Modification: deletion of the old value; insertion of the new one
• Invisible to the user; maintained by the DBMS.
• The extra fields are carried for each tuple (row). A minimal sketch follows.
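To make the Start/Stop mechanics concrete, here is a minimal sketch of an append-only transaction-time table in Python. It is illustrative only: the class, its method names, and the use of wall-clock timestamps are assumptions, not the paper's implementation.

```python
import time

NOW = None  # sentinel meaning "row is still current"

class TransactionTimeTable:
    """Illustrative append-only table: rows are closed off, never removed."""

    def __init__(self):
        self.rows = []  # each row: user columns plus 'start' and 'stop'

    def insert(self, data):
        # Start records the transaction time at which the item was added.
        self.rows.append({**data, "start": time.time(), "stop": NOW})

    def delete(self, predicate):
        # Deletion only sets the Stop time; the row itself is retained.
        t = time.time()
        for row in self.rows:
            if row["stop"] is NOW and predicate(row):
                row["stop"] = t

    def modify(self, predicate, new_data):
        # Modification = logical deletion of the old value + insertion of the new.
        self.delete(predicate)
        self.insert(new_data)

    def as_of(self, t):
        # Reconstruct the table state at any past transaction time t.
        return [r for r in self.rows
                if r["start"] <= t and (r["stop"] is NOW or t < r["stop"])]
```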
"Tamper Detection in Audit Logs“Main Steps of Basic Algorithm • On each modification of a tuple, the DBMS: • Gets a timestamp for the modification • Computes a cryptographically strong one-way hash of the (new) data and the time stamp together. • Sends that value to a trusted notarization service, which sends back a unique Notary ID based on that value. • The Notary ID is then stored with the tuple.
"Tamper Detection in Audit Logs“Important observations – Basic Algorithm • If the data or timestamp are modified, the ID will be inconsistent with the new tuple (i.e. detected when rehashed and re-notarized). • Holds even if intruder has access to the hash function. He can calculate a new hash, but it won’t match the ID. • It is very important that the ID cannot be calculated from the data in the database (i.e. must be calculated by an independent and trusted source): • This prevents an intruder from changing the database and then recalculating the ID.
"Tamper Detection in Audit Logs“Validation • An independent and trusted audit log validation service can then be used to verify the integrity of the DB. • For each tuple (basic algorithm), the validation service will rehash the data and time-stamp, recalculate the ID, and compare. Called a “Validation Event” (VE) • Inconsistencies are reported as an “Corruption Event” (CE).
"Tamper Detection in Audit Logs“ • Modern systems can update thousands of tuples per second, leading to time efficiency problems. • Optimizations seek to minimize the time spent calculating hashes and interacting with the notarization service: • Opportunistic hashing • Reduce the interactions with the notary to one per transaction, rather than to one per tuple. • Linked hashing • Final commit hash done at midnight each day. • Reduces the interactions with the notary to one per day • creates a “hash chain” that can be used in later analysis
Hashing Functions: Verifying the Accuracy of the Copy
• A hashing function can be used to generate a "digest" specific to each file.
  • The digest is usually a hexadecimal number that is, with high probability, unique for each file.
• A hashing function is secure if, for a given algorithm, it is computationally infeasible
  • to find a message that corresponds to a given message digest, or
  • to find two different messages that produce the same message digest.
• In general, any change to a message will, with very high probability, result in a different message digest.
  • A failure of this property is called a "collision".
Hashing Functions: MD5
• The MD5 hash function
  • Historically the most commonly used, although it has been shown to have flaws (practical collisions exist)
  • Developed by Ronald Rivest, 1991
  • Produces a 128-bit (16-byte) digest, conventionally written as 32 hexadecimal characters
• Example of an MD5 hash:
  • md5.exe → 609F46A341FEDEAEEC18ABF9FB7C9647
• Demo (see the sketch below)
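An equivalent of the md5.exe demo using Python's hashlib (the digest shown for "hello" is the well-known MD5 test value; the md5.exe digest is the one from the slide):

```python
import hashlib

# Hash a file the way md5.exe does; the slide hashed md5.exe itself and got
# 609F46A341FEDEAEEC18ABF9FB7C9647 (assumes the file is present).
with open("md5.exe", "rb") as f:
    print(hashlib.md5(f.read()).hexdigest())

# Any change to the input yields a completely different digest:
print(hashlib.md5(b"hello").hexdigest())  # 5d41402abc4b2a76b9719d911017c592
print(hashlib.md5(b"hellp").hexdigest())  # one character changed: unrelated digest
```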
Hashing Functions: Reducing the Workload
• Hashing functions can be used to cut down on the number of files that have to be analyzed (see the sketch after this list).
• Databases of known hash results are maintained (e.g. KFF, the "Known File Filter" in FTK).
• These can be used to identify "known bad" files:
  • Hacking tools
  • Training manuals
  • Contraband photographs
• And to ignore "known good" files:
  • Microsoft Windows files
  • Standard application files
  • Standard build files (corporate server deployments)
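A sketch of hash-based triage. The digest sets are empty placeholders here; a real tool would load them from a known-file database such as the KFF.

```python
import hashlib
from pathlib import Path

KNOWN_GOOD = set()  # placeholder: digests of stock OS/application files
KNOWN_BAD = set()   # placeholder: digests of hacking tools, contraband, etc.

def triage(paths):
    flagged, to_examine = [], []
    for p in paths:
        digest = hashlib.md5(Path(p).read_bytes()).hexdigest()
        if digest in KNOWN_GOOD:
            continue                 # known good: skip entirely
        if digest in KNOWN_BAD:
            flagged.append(p)        # known bad: report immediately
        else:
            to_examine.append(p)     # unknown: queue for manual analysis
    return flagged, to_examine
```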
"Tamper Detection in Audit Logs“Summary of main points • Main contributions of first paper: • the DBMS can maintain a transaction-time audit log in the background • transactions can be cryptographically hashed to generate a secure, one-way hash of the transaction • the transaction hash can be notarized by an independent and trusted service to generate a unique and secure ID value. • optimizations that reduce the overhead
"Forensic Analysis of Database Tampering": Overview of Paper
• Motivates the need for forensic analysis (e.g. legal requirements; the need to determine who, what, and when)
• Reviews the contributions of the first paper
• Defines the "Corruption Diagram" and gives an example
• Gives details of, and comparisons between, the four algorithms discussed
• Describes the notion of "forensic strength"
• Related work, summary
"Forensic Analysis of Database Tampering": Basic Definitions
• Corruption Event (CE): any event that corrupts data or compromises the database
• Corruption time (t_c): the time at which the CE occurred
• Notarization Event (NE), occurring at time t_n
• Notarization interval (I_N): the time between successive NEs
• Validation Event (VE), occurring at time t_v
• Validation interval (I_V): the time between successive VEs
• Corruption locus data (l_c): the data that have been corrupted
• Locus time (t_l): the time when the locus data (l_c) were stored
"Forensic Analysis of Database Tampering"
• A "Corruption Diagram" is used to perform the analysis.
• Terminology (based on the temporal database):
  • The x axis is the transaction time (the "where" axis).
  • The y axis is the actual (i.e. valid) time (the "when" axis).
  • The action axis is the 45-degree line relating transaction time and valid time.
  • t_FVF: time of "first validation failure", i.e. the time when corruption of the log is first detected by a VE.
• See the example on p. 112 of the paper (a data-only event).
"Forensic Analysis of Database Tampering"
• Once corruption has been detected, the forensic analyzer begins working.
• Objective: to define the corruption region, i.e. the bounds on the "where" and "when" of the CE, as narrowly as possible.
• The paper presents four algorithms for doing this:
  • Trivial Forensic Analysis algorithm
  • Monochromatic Forensic Analysis algorithm
  • RGB (Red-Green-Blue) algorithm
  • Polychromatic algorithm
"Forensic Analysis of Database Tampering": Monochromatic Forensic Analysis
• Let's use the Monochromatic Forensic Analysis algorithm to define the corruption region:
  • The analyzer rehashes the log from the beginning to determine the time of "most recent validation success" (t_rvs). This is the Lower Spatial Bound: LSB = t_rvs.
  • Upper Spatial Bound: USB = LSB + I_N
  • Lower Temporal Bound: LTB = t_FVF - I_V
  • Upper Temporal Bound: UTB = t_FVF
• Refer to the example (and the sketch below).
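The four bounds are simple arithmetic once t_rvs, t_FVF, I_N, and I_V are known; a direct transcription in Python (the function name and argument order are illustrative):

```python
def monochromatic_region(t_rvs, t_fvf, i_n, i_v):
    # Corruption region from the Monochromatic algorithm: spatial bounds
    # ("where" in transaction time) and temporal bounds ("when" in real time).
    lsb = t_rvs          # Lower Spatial Bound: last point that still rehashes correctly
    usb = lsb + i_n      # Upper Spatial Bound: one notarization interval later
    ltb = t_fvf - i_v    # Lower Temporal Bound: one validation interval before detection
    utb = t_fvf          # Upper Temporal Bound: time of first validation failure
    return (lsb, usb), (ltb, utb)
```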
"Forensic Analysis of Database Tampering": Classification of Corruption Events
• Time of occurrence:
  • Retroactive CE: locus time (t_l) occurs before the second-to-last validation event (VE)
  • Introactive CE: t_l occurs after the second-to-last VE
• Type of corruption:
  • Data only
  • Backdating: a timestamp is changed to indicate a time earlier than the tuple time
  • Postdating: a timestamp is changed to indicate a later time
"Forensic Analysis of Database Tampering": Classification of Corruption Events
• Crossing the two dimensions leads to 6 different types of Corruption Events:

                Data only   Backdating   Postdating
  Retroactive       x            x            x
  Introactive       x            x            x
"Forensic Analysis of Database Tampering": Classification of Corruption Events
• Each of the four algorithms handles these events differently.
• See Fig. 4 for an example of postdating and backdating CEs.
"Forensic Analysis of Database Tampering": Summary
• The rest of the paper describes the four algorithms in detail:
  • Trivial Forensic Analysis algorithm
    • When t_FVF is detected, return the entire upper triangle of the corruption diagram.
  • Monochromatic Forensic Analysis algorithm
    • Calculate LSB, USB, LTB, and UTB as in the example above.
  • RGB (Red-Green-Blue) algorithm
    • Localizes the corruption region more tightly by rehashing selected portions of the database (instead of the entire hash chain).
  • Polychromatic algorithm
    • Adds additional hash chains to reduce the size of the corruption region to one day.
"Forensic Analysis of Database Tampering": Summary
• The forensic strength of an algorithm is determined by:
  • the effort or work of the analysis, i.e. the effort it takes to calculate t_c, t_l, and t_p
  • the area of the corruption region
  • the uncertainty