Flexible Dynamic Linking for .NET
Susan Eisenbach s.eisenbach@imperial.ac.uk
Alex Buckley a.buckley@imperial.ac.uk
Agenda • Introduction to dynamic linking • Flexibility v. safety at link-time • Developer-centric flexibility • Design issues • Conclusion
Dynamic linking
Turns ldfld MemberDescriptor generated at compile-time into ldfld 0x100405 in the run-time environment by using assembly and class definitions
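The rewriting step on this slide can be modelled in a few lines. The following is only a toy sketch in Python, not the CLR's mechanism: the table CLASS_DEFS, the names mylib and Point, and the offsets are all invented for illustration.

```python
# Toy model of resolution. Class names, field names and offsets are invented.
CLASS_DEFS = {
    ("mylib", "Point"): {"x": 0x8, "y": 0xC},
}

def resolve_field(assembly, cls, field, class_defs=CLASS_DEFS):
    """Look up a symbolic member descriptor and return its concrete offset."""
    try:
        return class_defs[(assembly, cls)][field]
    except KeyError:
        raise LookupError(f"cannot resolve [{assembly}]{cls}::{field}") from None

def link(instruction, class_defs=CLASS_DEFS):
    """Rewrite ('ldfld', (assembly, class, field)) as ('ldfld', offset)."""
    opcode, (assembly, cls, field) = instruction
    return (opcode, resolve_field(assembly, cls, field, class_defs))

compiled = ("ldfld", ("mylib", "Point", "y"))   # what the compiler emitted
linked = link(compiled)                          # what the runtime executes
print(linked[0], hex(linked[1]))                 # prints: ldfld 0xc
```

The point of the sketch is that the compiled instruction carries only names; the numeric operand depends on whichever class definitions are present when linking happens.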
Dynamic linking
• Is taken as given in modern execution environments
• Saves space by sharing code between programs
• Enables binding policies such as:
    • User/system-wide upgrades (v1.0 → v2.0)
    • Servicing policy (v1.1.9 → v1.2)
    • Unification policy (use version corresponding to the CLR)
    • Local and remote probing (check GAC, then URL)
• Supports bytecode verification on the user’s machine
[Diagram: the linker connects a call site in Calc to its target in MSCorLib]
Call site: call void [mscorlib]System.Console::WriteLine(string)
Calc v1.2.3.4, PublicKey=12345, Culture=UK
    Assembly metadata: import mscorlib = MSCorLib, v1.1.1322, PK=9999, Culture=US
    Class metadata: export A,B,C,D
    IL code
MSCorLib v1.1.1322, PublicKey=99999, Culture=US
    Assembly metadata
    Class metadata: export System.Console
    IL code: namespace System; class Console { void WriteLine(string s) {…} }
Linking is constrained by compiler decisions
Source: Console.WriteLine("Hi");
Known to the compiler: call void [mscorlib]System.Console::WriteLine(string)
Available at runtime: call void [monolib]System.Console::WriteLine(string)
Platform- or vendor-specific assemblies might be available at runtime only
• Generic ODBC v. SQLServer ODBC
• Microsoft FTP v. third-party SecureFTP
A concrete example:
• Imperial’s LTSA model-checker can use non-redistributable NASA algorithms
• How to avoid compile-time dependencies on NASA code?
    • Separate compilation still checks dependencies
    • Have to use error-prone reflection
How can I bind to the runtime environment when the compiler forces its choices on me? Or: instead of targeting “the” database library, how can I target “a” database library?
Initial idea: make bytecode more flexible
Assembly type variable
    Compile-time: call void [X]System.Console::WriteLine(…)
    Link-time:    call void [mscorlib]System.Console::WriteLine(…)
Class type variable
    Compile-time: call void [mscorlib]Y::WriteLine(…)
    Link-time:    call void [mscorlib]System.Console::WriteLine(…)
Flexible bytecode
• Enables late binding between program & environment
• Use “generic” classes without naming specific assembly
    • May have many versions of an assembly: Development/Testing/Production/Archive
    • Want to develop binary components off-site, using stub assemblies, then execute on-site, using full assemblies
    • Simplified command-line compilation (fewer /r: args)
    • Augments Fusion re-versioning with renaming
• Use specific assembly without mentioning a class
    • E.g. programmer uses List interface but implementation picked later
Type-safety
• The substitute for an assembly or class should provide members used by the programmer, e.g.
    • Assembly X provides interface List
    • Interface List provides ‘void compare(List l);’
    • Class Y is a subclass of/implements List
So:
• Collect constraints from bytecode
• Search the GAC for suitable assemblies at run-time
• OK?
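The "collect constraints, then search" idea can be sketched as follows. This is a hypothetical Python model, not the deck's implementation: the constraint kinds (has_class, has_member, subtype), the assembly names Collections1 and Collections2, and the class ArrayList standing in for the variable Y are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ClassInfo:
    supertypes: set = field(default_factory=set)  # names of superclasses/interfaces
    members: set = field(default_factory=set)     # member signatures as plain strings

@dataclass
class AssemblyInfo:
    name: str
    classes: dict = field(default_factory=dict)   # class name -> ClassInfo

def satisfies(assembly, constraints):
    """True if the assembly meets every (kind, ...) constraint."""
    for kind, *args in constraints:
        if kind == "has_class":
            ok = args[0] in assembly.classes
        elif kind == "has_member":
            cls, signature = args
            ok = cls in assembly.classes and signature in assembly.classes[cls].members
        elif kind == "subtype":
            sub, sup = args
            ok = sub in assembly.classes and sup in assembly.classes[sub].supertypes
        else:
            raise ValueError(f"unknown constraint kind: {kind}")
        if not ok:
            return False
    return True

def suitable_assemblies(gac, constraints):
    """Names of all assemblies in the (modelled) GAC that meet the constraints."""
    return [a.name for a in gac if satisfies(a, constraints)]

constraints = [
    ("has_class", "List"),
    ("has_member", "List", "void compare(List)"),
    ("subtype", "ArrayList", "List"),
]
gac = [
    AssemblyInfo("Collections1", {
        "List": ClassInfo(members={"void compare(List)"}),
        "ArrayList": ClassInfo(supertypes={"List"}),
    }),
    AssemblyInfo("Collections2", {"List": ClassInfo()}),  # lacks compare and ArrayList
]
print(suitable_assemblies(gac, constraints))  # prints: ['Collections1']
```

Note that such a check is purely structural: it accepts any assembly with matching names and signatures, whatever its behaviour.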
Problems with constraints • Subtype constraints require data-flow analysis • Substitution may be over-constrained by unreachable code • Constraints say nothing about behaviour • Debugging is impractical if unknown components are chosen at run-time
Semantic substitutions
Policy: Programmer has to know valid assemblies and classes
• Custom attributes declare possible substitutions
    [LinkAssembly(Assembly1, Assembly2)]
    [LinkAssembly(Assembly1, Assembly3)]
    [LinkClass(Class1, Class2)]
    [LinkClass(Class1, Class3)]
• Any assembly or class can be independently substituted → All types are type variables
• Still need member constraints, but only to avoid resolution errors, not guide substitutions
    [LinkMember(Assembly1, Class1, B m(D))]
    [LinkMember(Assembly1, Class1, B f)]
Attribute scoping
• Many classes will not require flexible resolution
• Minimise impact by choosing the right scope
    [assembly: LinkClass(…)]
    [module: LinkClass(…)]
    [LinkClass(…)] class App {
        void m()                  // Search class, module, assembly
        [LinkClass(…)] void n()   // Use this method’s LinkClass
• Can rebind an assembly/class name across scopes
• Only the most local scope is used
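The scoping rule (search outwards, use only the most local scope that rebinds the name) can be sketched like this. It is a minimal Python model under stated assumptions: the function nearest_link_class, the scope dictionary layout, and the substitute names AsmChoice, ModChoice, ClsChoice and NChoice are hypothetical.

```python
SCOPE_ORDER = ("method", "class", "module", "assembly")  # most local first

def nearest_link_class(name, scopes):
    """Return (scope, substitutes) from the most local scope that rebinds name.

    scopes maps a scope level to a list of (original, substitute) pairs,
    mirroring [LinkClass(original, substitute)] attributes at that level.
    """
    for level in SCOPE_ORDER:
        substitutes = [to for (frm, to) in scopes.get(level, []) if frm == name]
        if substitutes:
            return level, substitutes   # outer scopes are ignored
    return None, []

enclosing = {
    "assembly": [("Class1", "AsmChoice")],
    "module":   [("Class1", "ModChoice")],
    "class":    [("Class1", "ClsChoice")],
}
scopes_m = dict(enclosing)                                   # m() has no attribute
scopes_n = dict(enclosing, method=[("Class1", "NChoice")])   # n() has its own

print(nearest_link_class("Class1", scopes_m))  # prints: ('class', ['ClsChoice'])
print(nearest_link_class("Class1", scopes_n))  # prints: ('method', ['NChoice'])
```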
Substitution interfaces
• Group substitutions by platform/vendor/maturity:
    [LinkAssembly(A1,ABC,"win32")]
    [LinkAssembly(A2,DEF,"win32")]
    [LinkAssembly(A1,GHI,"win64")]
    [LinkAssembly(A2,JKL,"win64")]
• On a Win32 machine, only the A1→ABC and A2→DEF substitutions will be possible
• Once A1 or A2 has been substituted, we should stay within the "win32" interface
Interface policies
[LinkAssembly(A1,ABC,"win32", LOCAL_INTERFACE)]
    Demand only [LinkAssembly]+[LinkClass] from "win32"
[LinkAssembly(A1,ABC,"win32", LOCAL_INTERFACE_PREFERRED)]
    Try "win32" attributes first, but allow others on failure
[LinkAssembly(A1,ABC,"win32", LOCAL_INTERFACE_EAGER)]
    Eagerly check that all "win32" attributes will succeed
[LinkAssembly(A1,ABC,"win32", ANY_INTERFACE)]
    No restrictions on later attributes
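One way to read the four policies is as rules for which substitutes remain candidates once an interface such as "win32" has been entered. The Python sketch below is an interpretation, not the deck's code: the function candidates, the loadable set, and in particular the reading of LOCAL_INTERFACE_EAGER as "every substitute in the interface must be loadable up front" are assumptions.

```python
from enum import Enum

class Policy(Enum):
    LOCAL_INTERFACE = "local"
    LOCAL_INTERFACE_PREFERRED = "preferred"
    LOCAL_INTERFACE_EAGER = "eager"
    ANY_INTERFACE = "any"

def candidates(attrs, name, interface, policy, loadable=frozenset()):
    """Ordered substitutes for name once `interface` has been entered.

    attrs: list of (original, substitute, interface) triples.
    loadable: assemblies assumed to be present; used only by the eager policy.
    """
    local = [to for (frm, to, itf) in attrs if frm == name and itf == interface]
    other = [to for (frm, to, itf) in attrs if frm == name and itf != interface]
    if policy is Policy.ANY_INTERFACE:
        return [to for (frm, to, _) in attrs if frm == name]
    if policy is Policy.LOCAL_INTERFACE_PREFERRED:
        return local + other
    if policy is Policy.LOCAL_INTERFACE_EAGER:
        # Assumed reading: every substitute in the interface must load up front.
        missing = [to for (_, to, itf) in attrs if itf == interface and to not in loadable]
        if missing:
            raise LookupError(f"interface {interface!r} cannot be satisfied: {missing}")
    return local

ATTRS = [("A1", "ABC", "win32"), ("A2", "DEF", "win32"),
         ("A1", "GHI", "win64"), ("A2", "JKL", "win64")]

print(candidates(ATTRS, "A2", "win32", Policy.LOCAL_INTERFACE))            # ['DEF']
print(candidates(ATTRS, "A2", "win32", Policy.LOCAL_INTERFACE_PREFERRED))  # ['DEF', 'JKL']
print(candidates(ATTRS, "A2", "win32", Policy.ANY_INTERFACE))              # ['DEF', 'JKL']
```

Under this reading, the eager policy fails before any substitution is made if, say, DEF is absent, whereas the plain local policy would only fail when A2 is first resolved.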
Preparing bytecode for flexible linking
[Diagram: toolchain from Hello.cs to bytecode annotated for flexible linking]
Hello.cs:
    [assembly: LinkClass]
    [LinkClass] class App {
        void m() { … }
        [LinkClass] void n() { … }
    }
Compiler: Hello.cs → Hello.exe (assembly metadata, class metadata, IL code)
ILDASM: Hello.exe → Hello.il
    .assembly Hello { .custom instance LinkClass
    .class App extends … { .custom instance LinkClass
Infer member constraints: Hello.il → Hello.il
    .assembly Hello { .custom instance LinkClass .custom instance LinkMember
    .class App extends … { .custom instance LinkClass .custom instance LinkMember
ILASM: reassembles the annotated Hello.il
• Metadata is backward-compatible
• Must be able to compile in some default environment
• Avoid source code: compilers are hard to change, and there are many of them
Just-In-Time substitution
• CLI-compliant linking is very flexible
    • If verification happens, its timing is not specified
    • Timing of resolution is very loose
        • As early as install time, as late as execution time
• But actually, the CLR is lazy
    • Resolves when an expression is JIT-compiled
    • Verification happens at resolution
• We extend resolution to handle [Link*()] attributes
Modifying the SSCLI
[Diagram: components involved in resolution] x86 code; Verifier/JIT compiler; Attribute collection; CEEInfo::findClass/Field/Method; Resolution cache; Standard resolution; Constraint verification; FDL resolution; Assembly/class loading; Fusion; Filesystem
Attribute collection
• A LinkContext encapsulates a single resolution attempt, e.g. call [A]C::m …
    = A fully-qualified member reference needing resolution
    + The nearest [LinkAssembly] and [LinkClass] attributes in scope
    + Set of constraints applying to these attributes
• JIT-compiling a method creates a MasterLinkContext
• Finding custom attributes is easy with Metadata Importers
• LinkContexts in a method share a MasterLinkContext
• Need a LinkContext for caller’s and callee’s scope
Nested LinkContexts
To resolve call [A]C::m(D,E,F):
• (MasterLinkContext is already created)
• Create LinkContext for this instruction
• Use LinkContext to choose substitutions for A and C
• Load (substituted version of) [A]C, and find m
[A]C::m expects to execute in an environment where its own custom attributes are obeyed:
• Create nested LinkContext for method m in [A]C
• Resolve D,E,F under original + nested LinkContexts
Issue: Flexible fields
    class A {
        [LinkAssembly(...)] [LinkClass(B, C)]
        private B f = new B();
        public A() { .. }
    }
In A’s constructor (.ctor):
    newobj instance [..]B
    stfld class [..]B [..]A::f
• newobj could be making a B object destined for any field
• Only at stfld do we find that it is destined for a flexlinked field
• Don’t want to rely on bytecode being in a precise order
• Don’t want to look-ahead in the JIT-compiler
Rely on an IDE to move attributes to the constructor (where f is initialised):
    class A {
        private B f = new B();
        [LinkAssembly(...)] [LinkClass(B, C)]
        public A() { .. }
    }
Issue: Static resolution of flexible fields
• C# 1.x compiler resolves fields statically
• Has the effect of hiding members that should be substituted
• Java 1.3 did the same; changed in 1.4
    // Compile-time env
    class A { String f; }
    class B extends A {}

    [LinkClass(B,C)]
    new B().f;

    ldfld […]A::f   // Does not match the LinkClass(B,…)
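The effect of static field resolution can be reproduced in a small model. This Python sketch is illustrative only: the CLASSES table mirrors the slide's A and B, while the functions declaring_class, emitted_owner and substitute are hypothetical helpers, and the "symbolic" mode simply assumes a compiler that keeps the receiver's static type in the emitted reference.

```python
CLASSES = {
    "A": {"super": None, "fields": {"f"}},
    "B": {"super": "A", "fields": set()},
}

def declaring_class(cls, field_name):
    """Walk up the hierarchy to the class that actually declares the field."""
    while cls is not None:
        if field_name in CLASSES[cls]["fields"]:
            return cls
        cls = CLASSES[cls]["super"]
    raise LookupError(field_name)

def emitted_owner(static_type, field_name, resolve_statically):
    """Class named in the emitted ldfld reference."""
    return declaring_class(static_type, field_name) if resolve_statically else static_type

def substitute(owner, link_class):
    """Apply a [LinkClass(original, substitute)] pair if it names the owner."""
    original, replacement = link_class
    return replacement if owner == original else owner

link_class = ("B", "C")
static = emitted_owner("B", "f", resolve_statically=True)     # 'A'
symbolic = emitted_owner("B", "f", resolve_statically=False)  # 'B'
print(substitute(static, link_class))    # prints: A  (attribute never matches)
print(substitute(symbolic, link_class))  # prints: C
```

Because the statically resolved reference names A, a substitution declared for B has nothing to match against by the time the linker sees the bytecode.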
Conclusion
• CLI is a good home for flexible dynamic linking
    • Different runtimes and frameworks (WinFx, .NETCF, OpenCF, Mono, Portable.NET) have different API implementations → more choices for the programmer
    • Resolution guided by rich metadata
    • Easy to represent FDL-related facts
    • Flexibly-linked bytecode is still verifiable (type-safe)
    • Tiny amounts of code in the right place are very effective
• Future work
    • Modify compiler for independent compilation
        • (Overcome static resolution problem)
    • Implement extended resolution mechanism with aspects
Resolution cache
• Resolving a member reference gives a metadata token
• Which token gets cached depends on the method called first:
    class A {
        [LinkAssembly("mscorlib","msphone",…)]
        [LinkClass("System.Console","Speech.Output",…)]
        void m1() { System.Console.WriteLine(…); }
        // No attributes apply to m2
        void m2() { System.Console.WriteLine(…); }
    }
• Caching flexible members → non-FDL code will use them
• Not caching flexible members → repetitious FDL resolution
• Generate new member refs for flexible members → complex
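The first two trade-offs on this slide can be demonstrated with a toy cache. The Python sketch below is a model only, not the SSCLI's resolution cache: the Resolver class, its cache_flexible switch and the fdl_resolutions counter are hypothetical, and a reference is treated as flexible whenever any substitutions are in scope.

```python
class Resolver:
    def __init__(self, cache_flexible):
        self.cache_flexible = cache_flexible
        self.cache = {}           # member reference -> resolved target
        self.fdl_resolutions = 0  # how often the flexible path ran

    def resolve(self, ref, subs=None):
        """ref is (assembly, class, member); subs holds substitutions in scope."""
        subs = subs or {}
        flexible = bool(subs)
        use_cache = self.cache_flexible or not flexible
        if use_cache and ref in self.cache:
            return self.cache[ref]
        if flexible:
            self.fdl_resolutions += 1
        assembly, cls, member = ref
        target = (subs.get(assembly, assembly), subs.get(cls, cls), member)
        if use_cache:
            self.cache[ref] = target
        return target

REF = ("mscorlib", "System.Console", "WriteLine")
M1_SUBS = {"mscorlib": "msphone", "System.Console": "Speech.Output"}

shared = Resolver(cache_flexible=True)
shared.resolve(REF, M1_SUBS)          # m1 runs first and fills the cache
print(shared.resolve(REF))            # m2 now gets ('msphone', 'Speech.Output', 'WriteLine')

separate = Resolver(cache_flexible=False)
separate.resolve(REF, M1_SUBS)        # m1, first call
separate.resolve(REF, M1_SUBS)        # m1, second call: resolved again
print(separate.resolve(REF), separate.fdl_resolutions)
# prints: ('mscorlib', 'System.Console', 'WriteLine') 2
```

With a shared cache, m2 silently inherits m1's substitution (or the reverse, if m2 runs first); with flexible references kept out of the cache, m2 is correct but every flexible call pays for resolution again.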
True runtime discovery?
• Rather than specifying substitutions through attributes, make bytecode more abstract with variable types
    • [fdl05a_X]Class
    • [Assembly]fdl05c_X
    • [fdl05a_X]fdl05c_X
• Gather constraints on variable types, as we did for classes named by [LinkClass()]
    • Assembly A is variable
    • Variable assembly A has class C
    • [A]C has field f with signature t
    • [A]C has method m with signature t
• ilasm never checks existence of referenced classes, so TypeRefs are implicitly variable
Representing variable types in metadata
    .assembly extern FDRAttributes { version 0.0.0.0 }
    .assembly extern fdl05a_X {
        .custom instance void [FDRAttributes]VariableTypeAttribute::.ctor()
    }
    .assembly HelloWorld {
        .custom instance void [FDRAttributes]VariableAsmHasClassAttribute::.ctor(string,string)
            = ( 01 00 08 … // ...fdl05a_X.fdl05c_X )
        .custom instance void [FDRAttributes]VariableClassHasMethodAttribute::.ctor(string,string,…)
            = ( 01 00 08 … // ...fdl05a_X.fdl05c_X.WriteLine… )
    }
    .class C {
        .method void Main(string[] args) managed {
            call void [fdl05a_X]fdl05c_X::WriteLine(…)
        }
    }
Implementing FDL
[Diagram: control flow of FDL resolution]
JIT-compiler → CEEInfo::findClass/Field/Method
• Add LinkContext to thread’s stack
• Add reference’s info to LinkContext
• Caching
• Find lowest scope level with the appropriate [LinkAssembly()]
• Choose [LinkAssembly()] directives w.r.t. interface policy
• Get class substitutions and constraints relating to assembly
• Find exact substitution
    • For each GAC assembly
        • For each class substitution
            • Check existence of requested member
            • Signature and constraint verification of found member
1) JIT-compile call [X]C::…
2) Ask ClassLoader of current assembly if [X]C is in current module (No)
3) Does current assembly’s metadata have a TypeRef for C? (Yes)
4) TypeRef points to AssemblyRef, which indicates a type variable
5) User input to substitute assembly type variable to a GAC assembly
6) Substituted assembly’s ClassLoader recognises class type variables (‘fdlC…’) and checks map
Assembly binding can only use the names in IL
Compilation environment: class A {B f;} class B {C g;}
    new A().f.g compiles to new A.f[A,B].g[B,C]
Execution environment: class A {D f;} class D {C g;}
    new A.f[A,B].g[B,C] executes as ResolutionError