Security Aspects in Java Bytecode Engineering (Blackhat Briefings 2002, Las Vegas, Aug 01,02). Marc Schönefeld Software-Architect schonef@acm.org. Agenda. Java Security Architecture: an Overview The JVM: Structures and Concepts Bytecode Basics Bytecode (Reverse) Engineering
Security Aspects in Java Bytecode Engineering (Blackhat Briefings 2002, Las Vegas, Aug 01, 02). Marc Schönefeld, Software-Architect, schonef@acm.org
Agenda • Java Security Architecture: an Overview • The JVM: Structures and Concepts • Bytecode Basics • Bytecode (Reverse) Engineering • Applications of Bytecode (Reverse) Engineering
Java Security Architecture: an overview
How the court defines Java: Findings of Judge Thomas Penfield Jackson, Nov. 5, 1999 • 73. The term "Java" refers to four interlocking elements. • First, there is a Java programming language with which developers can write applications. • Second, there is a set of programs written in Java that expose APIs on which developers writing in Java can rely. These programs are called the "Java class libraries." • The third element is the Java compiler, which translates the code written by the developer into Java "bytecode." • Finally, there are programs called "Java virtual machines," or "JVMs," which translate Java bytecode into instructions comprehensible to the underlying operating system.
Java Architectural Stack (diagram, top to bottom) • EJBeans, Servlets, JSP; Applet; Application • J2EE App-Server; Browser JRE; Standalone JRE, J2SE; Web Start • Java Virtual Machine • Operating System • Hardware
The typical JVM development environment (diagram) • Development: Dog.java is compiled by javac/jikes to Dog.class, Cat.py by jython to Cat.class, Bird.j by Jasmin to Bird.class • Runtime: each class file passes through a class loader and the verifier into the JVM • The system class loader supplies the system classes (String, Object, <other>)
The basic security architecture • Java security (APIs) • (access): The Security manager • (origin): Signed Codebases • (behalf): Principal-based access control (JAAS) • Cryptography • JVM security • Class loaders • Class file verification process • JVM intrinsic security features
Java security: Security Manager and its API • Central instance for access control as far as code is concerned • Policies define access to outer-domain resources • SecurityManager instances enforce policies by throwing SecurityExceptions • By default, Java programs do not have a security manager, so it is a good precaution to install one
Java security: Security Manager and its API • Fine-grained control to limit access to: • Socket connections (create, accept, multicast) • Thread groups • Dynamic library loading (JNI) • Files (read, write, delete) • External shared resources (print job, clipboard) • Program control (exit, top-level window) • Runtime components (member, package, class loader)
Java security: Code Base Authentication • Java archives (JARs) store codebases • Proof of origin can be achieved by signing the JAR's hash • (diagram: a JAR holding Cat.class, Dog.class and Bird.class is signed with a private key, yielding the same JAR plus a signed hash)
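The hash-then-sign idea behind signed codebases can be sketched with the standard java.security API. This is only an illustration of the principle, not the JAR signing format used by jarsigner; the class name SignedHashSketch, the RSA key size and the SHA256withRSA algorithm are assumptions chosen for the example.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignedHashSketch {

    /** Generates a throwaway RSA key pair (assumption: 2048 bits is enough for a demo). */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hashes the data with SHA-256 and signs the hash with the private key. */
    static byte[] sign(byte[] data, KeyPair keys) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks the signature against the data using only the public key. */
    static boolean verify(byte[] data, byte[] signature, KeyPair keys) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(keys.getPublic());
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] original = {1, 2, 3, 4};   // stands in for the bytes of Cat.class
        byte[] tampered = {1, 2, 3, 5};   // one byte changed after signing
        byte[] signature = sign(original, keys);
        System.out.println("original verifies: " + verify(original, signature, keys));
        System.out.println("tampered verifies: " + verify(tampered, signature, keys));
    }
}
```

Any change to the signed bytes makes verification fail, which is what lets a recipient tie a codebase to the holder of the private key.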
Java security: JAAS, security based on principals • Enables login functionality • Username, password • Fingerprint • ... • Execution permitted/denied depending on the identity that runs the code • Policy-based access to functionality • Fine-grained permission handling possible
JVM security: intrinsic features • Non-contiguous memory model, distinct data areas • Java stack frames (execution state) • Method area (bytecode storage) • Garbage-collected heap (object storage) • Type-safe casting • No self-modifying code • Automated garbage collection disallows explicit free operations • Automatic array bounds checking prevents off-by-one and buffer overflow scenarios
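Two of these intrinsic checks are easy to observe from plain Java. The following minimal sketch (class and method names are made up for illustration) shows an off-by-one array read and an invalid cast both being stopped by the JVM at runtime instead of touching unrelated memory.

```java
public class MemorySafetySketch {

    /** Tries to read one element past the end of the array. */
    static String readPastEnd(int[] buffer) {
        try {
            int value = buffer[buffer.length]; // classic off-by-one index
            return "read " + value;
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected";
        }
    }

    /** Tries to treat an arbitrary object as an Integer. */
    static String castToInteger(Object candidate) {
        try {
            Integer number = (Integer) candidate; // checked at runtime by the JVM
            return "cast ok: " + number;
        } catch (ClassCastException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd(new int[4]));
        System.out.println(castToInteger("not a number"));
        System.out.println(castToInteger(42));
    }
}
```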
JVM security Class loaders • Classloaders load a classfile as byte array into the JVM • Can load from • file, • network or • dynamically generated byte array • Can even compile on the fly (so Java behaves like Perl) • Security features • Establishing name spaces • Enforcing separation of trusted system library code from user-supplied code via parent-delegation
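Loading from a dynamically generated byte array, and the name spaces that class loaders establish, can be sketched as follows. The example hand-writes the bytes of an empty class and defines it through a small ClassLoader subclass. The names GeneratedClassSketch, ByteArrayLoader and the class name "Generated" are hypothetical, and the sketch assumes a simple class name without a package.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class GeneratedClassSketch {

    /** Emits a minimal class file: public class <name> extends Object, no members. */
    static byte[] emptyClassBytes(String name) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(49);                                  // major version (assumption: 49 is accepted)
            out.writeShort(5);                                   // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF(name);                // #1 Utf8: class name
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: superclass name
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // access flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class -> #2
            out.writeShort(4);                                   // super_class -> #4
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(0);                                   // methods count
            out.writeShort(0);                                   // attributes count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** A loader that turns a byte array into a class; java.lang.Object comes from the parent. */
    static final class ByteArrayLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Defines the generated class in a fresh loader, and therefore in a fresh name space. */
    static Class<?> load(String name) {
        return new ByteArrayLoader().define(name, emptyClassBytes(name));
    }

    public static void main(String[] args) {
        Class<?> first = load("Generated");
        Class<?> second = load("Generated");
        System.out.println(first.getName() + " extends " + first.getSuperclass().getName());
        System.out.println("same class object: " + (first == second));
    }
}
```

The two loads yield distinct Class objects with the same name, because each loader has its own name space, while the superclass is resolved through parent delegation.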
JVM security: Verifier • Task: check loaded classfile for integrity • 4-step process • 1st step: structural correctness • 2nd step: data type correctness • 3rd step: bytecode checks • 4th step: management of symbolic references (runtime)
JVM security: Classfile verification (diagram) • Source: javac compiles public class Cat { void bite (int times) { ... } } • Alternative source: a bytecode assembler processes .class public Dog .method bite I .invokestatic seekVictim ... .end method .end class • Both yield class file bytes: CA FE BA BE 00 03 00 2D 00 13 07 00 17 12 30 11 .. .. .. • The class loader passes the bytes to the verifier (pass 1, pass 2, pass 3, pass 4) before they reach the JVM
The Verification Process Pass 1: Basic Structural checks • The classloader delivers a byte array • Magic number = 0xCAFEBABE ? • Version id: 1.1=45.3, 1.2=46.0, 1.3=47.0, 1.4=48.0 • All recognized attributes must have the proper length • The class file must not be truncated or have extra bytes at the end • The constant pool must not contain any "superficially unrecognizable information"
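The first two checks of pass 1 can be sketched in a few lines. The class name ClassHeaderSketch is made up, and the sample bytes are the header bytes shown in the verification diagram (CA FE BA BE 00 03 00 2D); this is an illustration of the check, not the verifier's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ClassHeaderSketch {

    /** Returns "major.minor" after checking the magic number, or throws if the header is bad. */
    static String version(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("bad magic number");
            }
            int minor = in.readUnsignedShort(); // minor version is stored first
            int major = in.readUnsignedShort();
            return major + "." + minor;
        } catch (IOException e) {
            // An EOFException here means fewer than eight header bytes were present.
            throw new IllegalArgumentException("truncated class file", e);
        }
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x03, 0x00, 0x2D};
        System.out.println("class file version " + version(header));
    }
}
```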
The Verification Process Pass 2: Check Constant Pool (CP) information • Final classes are not subclassed, and final methods are not overridden. • All classes (except java.lang.Object) must have a superclass. • Check constraints for CP entries: for example, class references in the CP can be resolved via a field to a string reference in the CP. • All field references and method references in the CP must have legal names, classes, and type signatures.
The Verification Process Pass 3: Bytecode verification • Core part of verification • Static constraints • Checking maximal local variable count throughout control flow • Checking control-flow correctness (branch always to start of instruction, not beyond end of code) • All exception handlers are valid (no partial overlap) • ... • Structural constraints • Reachability: subroutines (scope), exception handlers • Data flow: instance initialization and new objects, stack size
The Verification Process Pass 4: delayed checks during runtime • Verifies that the currently executing class is allowed to reference the given class. • The first time an instruction calls a method, or accesses or modifies a field, the verifier checks the following: • Method or field class • Method or field signature • That the currently executing method has access to the given method or field • Inserts "quick" optimized instructions
Problems with JDK verifier: Not enabled by default

    public class IllegalAccess2 {
        public String str = "The trade secret";
        public void test() {
            System.out.println("Test2: " + str);
        }
    }

    public class IllegalAccess {
        public static void main(String args[]) {
            IllegalAccess2 t2 = new IllegalAccess2();
            System.out.println("Test1: " + t2.str);
            t2.test();
            t2.str = "an open hint";
            System.out.println("Test1: " + t2.str);
            System.exit(0);
        }
    }

    D:\entw\java\blackhat>java -classpath . IllegalAccess
    Test1: The trade secret
    Test2: The trade secret
    Test1: an open hint
Problems with JDK verifier: Not enabled by default

    public class IllegalAccess2 {
        private String str = "The trade secret";
        public void test() {
            System.out.println("Test2: " + str);
        }
    }

    public class IllegalAccess {
        public static void main(String[] args) {
            IllegalAccess2 t2 = new IllegalAccess2();
            System.out.println("Test1: " + t2.str);
            t2.test();
            t2.str = "an open hint";
            System.out.println("Test1: " + t2.str);
            System.exit(0);
        }
    }

The field str is now private, so the JVM should reject the access from IllegalAccess. (javac would refuse to compile this access, so IllegalAccess is presumably left compiled against the earlier public version and only IllegalAccess2 is recompiled.) But at least on JDK1.3.1 and 1.4.0_01 2000 the following happens...
Problems with JDK verifier: Not enabled by default

    D:\entw\java\blackhat>java IllegalAccess
    Test1: The trade secret
    Test2: The trade secret
    Test1: an open hint

    D:\entw\java\blackhat>java -verify IllegalAccess
    Exception in thread "main" java.lang.IllegalAccessError: try to access field IllegalAccess2.str from class IllegalAccess
            at IllegalAccess.main(IllegalAccess.java:7)

Only the explicit -verify flag blocks access to the private variable "str"!
Problems with Inner classes • Inner classes can access private fields of outer classes

    final class Outer {
        private String secret = "you will never be able to read this";

        public void alter_secret(String x) {
            secret = x;
        }

        private String reverseSecret() {
            StringBuffer b = new StringBuffer(secret);
            return b.reverse().toString();
        }

        class Inner {
            private String innersecret = secret;
            private String reverseinner = reverseSecret();
        }

        public static void main(String[] args) {
            Outer outer = new Outer();
        }
    }
Problems with Inner classes • You can't access these private fields from other classes via Java code, but you can with a handcrafted bytecode class, because the compiler generates static accessor methods (access$000, access$100) on Outer for use by the inner class: new Outer dup invokespecial Outer/<init> ()V // new Outer dup // dup here to avoid local vars, take it directly from stack dup invokestatic Outer/access$000 (LOuter;)Ljava/lang/String; getstatic java/lang/System/out Ljava/io/PrintStream; swap // correct the positions invokevirtual java/io/PrintStream/println (Ljava/lang/String;)V invokestatic Outer/access$100 (LOuter;)Ljava/lang/String; getstatic java/lang/System/out Ljava/io/PrintStream; swap // correct the positions invokevirtual java/io/PrintStream/println (Ljava/lang/String;)V return
Other Problems with JDK verifier • SDK and JRE 1.3.1_01 or earlier • "A vulnerability in the Java(TM) Runtime Environment Bytecode Verifier may be exploited by an untrusted applet to escalate privileges." • Some Netscape browser versions were also affected
Problems with JDK verifier • You can check your classes with a standalone verifier • The open source verifier "JustIce" is supplied with BCEL from the Apache Jakarta project
Problems with Java security: What is also still missing • Checks in terms of hard and soft limits on • Memory allocation • Thread activation • Excessive memory usage and thread utilization often lead to Denial of Service problems
The JVM: Structures and Concepts
JVM Internals • The architecture • JVM is an abstract concept • Sun just specified the interface • Implementation details depend on the specific product (SUN JDK, IBM JDK, Blackdown) • Java bytecode, the internal language • Independent of CPU type (bytecode) • Stack-oriented, object-oriented, type-safe
Runtime view on a JVM (diagram) • Class loader • Runtime data storage: Method Area (Classes), Heap (Objects) • JVM runtime: PC registers, Stack Frames, Native method stacks • Native methods
Runtime data • Frame: • Saves the runtime state of an execution thread, and therefore holds the information for method execution (program counter) • All frames of a thread are managed in that thread's stack
Runtime data • Method area • Runtime information of the class file • Type information • Constant Pool • Method information • Field information • Class static fields • Reference to the classloader of the class • Reference to reflection anchor (Class)
The Constant Pool • The "constant pool" is a heterogeneous array of data. Each entry in the constant pool can be one of the following: • string, class or interface name, reference to a field or method, numeric value, constant String value • No other part of the class file makes specific references to strings, classes, fields, or methods. All references to constants, and also to names of methods and fields, are made via lookup into the constant pool.
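To make the "heterogeneous array" concrete, here is a minimal sketch of a constant pool walker. It reads only the header and the constant pool, and it handles only the tag values that existed in class files of that era (1, 3 to 12). The class name ConstantPoolSketch and the sample data built by samplePrefix() are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolSketch {

    /** Builds a hypothetical class file prefix: header plus a five-entry constant pool. */
    static byte[] samplePrefix() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);
            out.writeShort(3);                                   // minor version
            out.writeShort(45);                                  // major version
            out.writeShort(6);                                   // pool count = entries + 1
            out.writeByte(1); out.writeUTF("Cat");               // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("The trade secret");  // #3 Utf8
            out.writeByte(8); out.writeShort(3);                 // #4 String -> #3
            out.writeByte(3); out.writeInt(42);                  // #5 Integer
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Lists the constant pool entries, one description per entry. */
    static List<String> describe(byte[] classFile) {
        List<String> entries = new ArrayList<>();
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
            in.skipBytes(8);                       // magic number, minor and major version
            int count = in.readUnsignedShort();
            for (int i = 1; i < count; i++) {      // entry #0 does not exist
                int tag = in.readUnsignedByte();
                String prefix = "#" + i + " ";
                switch (tag) {
                    case 1: entries.add(prefix + "Utf8 " + in.readUTF()); break;
                    case 3: entries.add(prefix + "Integer " + in.readInt()); break;
                    case 4: entries.add(prefix + "Float " + in.readFloat()); break;
                    case 5: entries.add(prefix + "Long " + in.readLong()); i++; break;     // takes two slots
                    case 6: entries.add(prefix + "Double " + in.readDouble()); i++; break; // takes two slots
                    case 7: entries.add(prefix + "Class -> #" + in.readUnsignedShort()); break;
                    case 8: entries.add(prefix + "String -> #" + in.readUnsignedShort()); break;
                    case 9: case 10: case 11: case 12: // field, method, interface method, name-and-type
                        entries.add(prefix + "Ref(tag " + tag + ") -> #" + in.readUnsignedShort()
                                + ", #" + in.readUnsignedShort());
                        break;
                    default: throw new IllegalArgumentException("unsupported tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new IllegalArgumentException("truncated constant pool", e);
        }
        return entries;
    }

    public static void main(String[] args) {
        describe(samplePrefix()).forEach(System.out::println);
    }
}
```

Note how the Class and String entries carry no text of their own: they only point at Utf8 entries, which is the indirection the slide describes.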
The Class File Structure • You can use a class dumper like javap -c or DumpClass to analyze these inner details • (diagram of the class file sections: HEADER, CONSTANT POOL, ACCESS FLAGS (Final, Native, Private, Protected, ...), INTERFACES, FIELDS, METHODS, ATTRIBUTES)
The Class File Format • Java class files are brought into the JVM via the classloader • The class file is basically just a plain byte array, following the rules of the byte code verifier. • All 16-bit and 32-bit quantities are formed by reading in two or four 8-bit bytes, respectively, and joining them together in big-endian format.
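The big-endian rule can be written out by hand with shifts and masks. This is a small sketch with made-up helper names u2 and u4; the sample bytes are the class file header bytes used elsewhere in the slides.

```java
public class BigEndianSketch {

    /** Joins two bytes into an unsigned 16-bit value, most significant byte first. */
    static int u2(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    /** Joins four bytes into an unsigned 32-bit value, most significant byte first. */
    static long u4(byte[] bytes, int offset) {
        return ((long) u2(bytes, offset) << 16) | u2(bytes, offset + 2);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x03, 0x00, 0x2D};
        System.out.println(Long.toHexString(u4(header, 0))); // magic number
        System.out.println(u2(header, 6));                   // major version
    }
}
```

The mask with 0xFF matters because Java bytes are signed; without it, a byte such as 0xCA would be sign-extended and corrupt the result.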
Methods and Fields • The type of a field or method is indicated by a string called its signature. • Fields may have an additional attribute giving the field's initial value. • Methods have an additional CODE attribute giving the java bytecode for executing that method.
The CODE Attribute • Maximum stack space • Maximum number of local variables • The actual bytecode for executing the method • A table of exception handlers, each giving: • Start and end offset into the bytecodes, • An exception type, and • The offset of a handler for the exception
Bytecode Basics
The JVM types • JVM-Types and their prefixes • Byte b • Short s • Integer i (java booleans are mapped to jvm ints!) • Long l • Character c • Single float f • double float d • References a to Classes, Interfaces, Arrays • These Prefixes used in opcodes (iadd, astore,...)
The JVM Instruction Mnemonics • Shuffling (pop, swap, dup, ...) • Calculating (iadd, isub, imul, idiv, ineg,...) • Conversion (d2i, i2b, d2f, i2c,...) • Local storage operation (iload, istore,...) • Array Operation (arraylength, newarray,...) • Object management (get/putfield, invokevirtual, new) • Push operation (aconst_null, iconst_m1,....) • Control flow (nop, goto, jsr, ret, tableswitch,...) • Threading (monitorenter, monitorexit,...)
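How the shuffling and calculating instructions cooperate on the operand stack can be sketched with a toy interpreter. It understands only a handful of real mnemonics in a textual form and is in no way a JVM; the class name StackMachineSketch and the textual program format are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackMachineSketch {

    /** Executes a tiny textual program on an operand stack and returns the ireturn value. */
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instruction : program) {
            String[] parts = instruction.trim().split("\\s+");
            switch (parts[0]) {
                case "bipush": stack.push(Integer.parseInt(parts[1])); break;
                case "iadd": stack.push(stack.pop() + stack.pop()); break;
                case "imul": stack.push(stack.pop() * stack.pop()); break;
                case "isub": {
                    int right = stack.pop();  // the top of the stack is the right operand
                    int left = stack.pop();
                    stack.push(left - right);
                    break;
                }
                case "ineg": stack.push(-stack.pop()); break;
                case "dup": stack.push(stack.peek()); break;
                case "swap": {
                    int top = stack.pop();
                    int below = stack.pop();
                    stack.push(top);
                    stack.push(below);
                    break;
                }
                case "pop": stack.pop(); break;
                case "ireturn": return stack.pop();
                default: throw new IllegalArgumentException("unknown instruction: " + instruction);
            }
        }
        throw new IllegalStateException("program ended without ireturn");
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        System.out.println(run("bipush 2", "bipush 3", "iadd", "bipush 4", "imul", "ireturn"));
        // 7 * 7 using dup instead of a local variable
        System.out.println(run("bipush 7", "dup", "imul", "ireturn"));
    }
}
```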
Bytecode • Each Java bytecode (JBC) instruction is a one-byte opcode followed by zero or more bytes of additional operand information. • Table lookup instructions (tableswitch, lookupswitch) have a flexible length • The wide operation extension allows the base operations to use "large" operands • No self-modifying code • No branching to arbitrary locations, only to the beginning of instructions within the scope of the current method (enforced by the verifier!)
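The "opcode plus operand bytes" layout can be illustrated with a tiny decoder. It knows only eight opcodes (their numeric values are taken from the JVM specification) and rejects everything else; the class name OpcodeDecoderSketch and the output format are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class OpcodeDecoderSketch {

    /** Decodes a code array into "offset: mnemonic operands" lines. */
    static List<String> decode(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x00: lines.add(pc + ": nop"); pc += 1; break;
                case 0x10: // bipush: one signed operand byte
                    lines.add(pc + ": bipush " + code[pc + 1]);
                    pc += 2;
                    break;
                case 0x59: lines.add(pc + ": dup"); pc += 1; break;
                case 0x5F: lines.add(pc + ": swap"); pc += 1; break;
                case 0x60: lines.add(pc + ": iadd"); pc += 1; break;
                case 0x68: lines.add(pc + ": imul"); pc += 1; break;
                case 0xA7: { // goto: signed 16-bit big-endian offset relative to this instruction
                    short offset = (short) (((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF));
                    lines.add(pc + ": goto " + (pc + offset));
                    pc += 3;
                    break;
                }
                case 0xAC: lines.add(pc + ": ireturn"); pc += 1; break;
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // bipush 2, bipush 3, iadd, bipush 4, imul, ireturn
        byte[] code = {0x10, 2, 0x10, 3, 0x60, 0x10, 4, 0x68, (byte) 0xAC};
        decode(code).forEach(System.out::println);
    }
}
```

Because instructions have different lengths, a branch target is only meaningful if it lands on an offset where an instruction starts, which is the property the verifier enforces.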
Bytecode (Reverse) Engineering
Bytecode Engineering tools • Obfuscators • Remove/Manipulate all information that can be used for reverse engineering • Native compilers • „Real“ compile of java bytecodes to native instructions (x86/sparc) • Build your own bytecode • Programmatic Generation • Manipulate classfiles with an API
Obfuscators: Techniques used • Identifier Name Mangling • The JVM does not need useful names for methods and fields • They can be renamed to single-letter identifiers • Constant Pool Name Mangling • Constant pool entries are stored encrypted and decrypted at runtime • Control flow obfuscation • Insertion of phantom variables, stack scrambling • Insertion of ghost branch instructions that rely on the phantom variables' default values and therefore never execute
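Constant pool mangling can be sketched as a string that is stored in scrambled form and unscrambled on every use. The single-byte XOR below is a deliberately weak stand-in for whatever a real obfuscator does; the class name, the key and the method names are made up for illustration.

```java
public class StringManglingSketch {

    private static final int KEY = 0x5A; // hypothetical key; real tools use stronger schemes

    /** What the obfuscator would do at build time: scramble the constant. */
    static String mangle(String clearText) {
        char[] chars = clearText.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            chars[i] ^= KEY;
        }
        return new String(chars);
    }

    /** What the obfuscated program calls at runtime on every retrieval of the constant. */
    static String deobfuscate(String stored) {
        return mangle(stored); // XOR with the same key is its own inverse
    }

    public static void main(String[] args) {
        String stored = mangle("The trade secret");
        System.out.println("readable in class file: " + stored.equals("The trade secret"));
        System.out.println("recovered at runtime:   " + deobfuscate(stored));
    }
}
```

Since the decoding routine and key ship with the program, this only raises the effort for an attacker; it does not hide the constant from someone who runs or analyzes the code.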
Obfuscators: Problems with Obfuscation • Constant value mangling implies processing overhead, since each retrieval from the constant pool needs an extra call to a "deobfuscatename" method • Dynamic class loading may break: classes get new names, so reflection calls like Class.forName("Account") will fail because class "Account" is now known by its obfuscated name "b16"! • And: obfuscation breaks patterns that JIT engines can recognize for optimization
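The reflection problem can be reproduced in miniature. In this sketch the nested class b16 stands in for an obfuscated "Account" class, and the sketch assumes that no class named Account is on the classpath; all names are hypothetical.

```java
public class RenamedClassSketch {

    /** Stands in for the class formerly called Account, after renaming by an obfuscator. */
    static final class b16 { }

    /** Returns true if a class with exactly this name can be loaded. */
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The string literal was not renamed along with the class, so the lookup fails.
        System.out.println("Account found: " + canLoad("Account"));
        System.out.println("b16 found:     " + canLoad(b16.class.getName()));
    }
}
```

Obfuscators typically need an exclusion list for classes that are looked up by name for this reason.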
Protecting the Source Code: Native Compilers • Convert Java bytecode to C • Generate executable via normal C build • Fast execution • Additional decompilation effort needed • Long turnaround times • Even for small Java programs, some commercial products produce monster-size files (67 MB source for Viva.java) • The transformed program may then be vulnerable to buffer overflows and off-by-one errors
Bytecode Reverse Engineering • Decompilation • Get Source code from class files • Graphical Analysis • Rebuild the logical control flow • Disassembly • Get symbolic bytecode from class files
Decompilers: But only one really works • Several available: • MOCHA, by Hanpeter van Vliet, was the first one • JASMINE, a patch of Mocha, crashed less often • JAD, the fast Java Decompiler (written in C++) by Pavel Kouznetsov, is quite usable, but can also be fooled • DJ JAVA, GUI for JAD • NMI, GUI for JAD
Decompilers • General method of decompilation • Rebuilding control flow from the class file code segment • Matching flow patterns to Java language constructs • Associating constants and external references from constant pool entries • Decompilers do not undo the mangling of fields, methods and constant pool entries • Condensed constant pool entries therefore sometimes result in invalid Java identifiers in the generated source
Graphical Analysis: Overview • All Java class files (including obfuscated ones) have to comply with the JVM specification • Although control flow is interleaved in the obfuscated class file, a graphical flow reveals most of the original control flow • The inner structures and dependencies of the methods can be discovered by graphical analysis