360 likes | 482 Views
10 DOs and 10 DON’Ts for the SQL Server CLR. Matt Whitfield. Who am I?. A geek, mostly I hang about on ask.sqlservercentral.com I provide (now free-of-charge) SQL Server tools I help run the #SQLSoton user group I work for a small company in Southampton. What’s this presentation about?.
E N D
10 DOs and 10 DON’Ts for the SQL Server CLR Matt Whitfield
Who am I? • A geek, mostly • I hang about on ask.sqlservercentral.com • I provide (now free-of-charge) SQL Server tools • I help run the #SQLSoton user group • I work for a small company in Southampton
What’s this presentation about? • The CLR is a hugely untapped resource in SQL Server • I’m on a bit of a mission to make people more aware of what you can do and how you can do it best • This presentation addresses the ‘how you can do it best’ bit… …at a ‘starter’ level
So what is the CLR anyway? • CLR = Common Language Runtime • In the SQL Server world, we use the term CLR to refer to database objects that are implemented in a CLR language (C#, VB.NET …) • We can make procedures, functions, user-defined types, user-defined aggergates and triggers
DO #1:Know your transactions • The easiest way to wrap your transactions in the CLR is using the TransactionScope() class • If you are just using a single connection, then a local transaction will be issued • However, if you connect using a different connection string (even in the same database) a distributed transaction will be enlisted
DON’T #1:Add strings together • Basic concept in all .NET code • string is Immutable • An instance of a string cannot be changed • Special compiler support for strings can make it look like you can change it • Adding strings together places load on the managed heap and garbage collector • Use StringBuilder instead
DO #2:Know structs v classes • In .NET, there are two types of object • Reference types • Value types • Reference types are passed around by their reference – functions will operate on the same object • Value types are passed around in their entirety – functions will operate on a copy of the object
DO #2:Know structs v classes • What does that mean? DEMO TIME Any volunteers?
DON’T #2:Go overboard • The CLR support in SQL Server is a great programming tool • It allows you to perform an extremely wide variety of operations • It also, therefore, allows you to do dumb stuff • Don’t start trying to implement all your procedures in CLR for the sake of it
DON’T #2:Go overboard • Don’t wrap your T-SQL in a CLR procedure that just passes that T-SQL back to the engine (i.e. don’t make CRUD procedures in the CLR) • Don’t create CLR types when they don’t really add value to the task at hand – remember there is an overhead
DO #3:Dispose your IDisposables • If an object implements IDisposable, then it expects you to call Dispose() on it when you are finished • The idea is that Dispose() gives the object the chance to release unmanaged resources early – before garbage collection • You don’t necessarily know why an object is IDisposable, so you need to trust that it is necessary • There is huge misconception about this
DO #3:Dispose your IDisposables • The easiest way to guarantee disposal is through the using() {} block. • When you exit the scope of a using block, the resource that you allocated in that block is guaranteed to be disposed, no matter how you exit the block (Exception, return) • You can find incorrect examples of this all over the web, MSDN included
DO #3:Dispose your IDisposables DEMO TIME
DON’T #3:Forget that data scales • Data is often bigger in your production environment than in your test environment • Don’t implement anything that assumes a fixed size of data (e.g. a static array without bounds checking) • Don’t assume that an access method that is fastest with a small amount of data will be fastest with a large amount of data
DO #4:Aim to write SAFE code • Permission sets control what your CLR code can do • SAFE is a restrictive permission set, but targeting it means that you cannot affect the stability of the SQL Server Process • EXTERNAL_ACCESS allows access to resources outside of SQL Server, but be careful not to introduce unnecessary waits • Web service call in a trigger – I’m looking right at you
DO #4:Aim to write SAFE code • UNSAFE is called unsafe for a reason. • You need to be very sure about what you are doing under the UNSAFE permission set. • How long will that static variable live for? • What exactly is happening when another thread is started?
DON’T #4:Treat NULL as an afterthought • NULLs happen • Always assume that you will receive a NULL value as a parameter or in data that you read back from the database • Use nullable types to wrap simple types in your data structures if you need to • The Sqlxxx types all represent NULL with the IsNull property • Nullable types represent NULL with the HasValue property
DO #5:Understand the GAC-lack • Only ‘blessed’ assemblies can be loaded from the GAC • Other assemblies need to be loaded through CREATE ASSEMBLY • This can create a maintenance issue if application layer code shares an assembly with the database layer code
DO #5:Understand the GAC-lack • Microsoft.VisualBasic.dll • Microsoft.VisualC.dll • CustomMarshalers.dll • System.Data.OracleClient.dll • Mscorlib.dll • System.Data.dll • System.dll • System.Xml.dll • System.Security.dll • System.Web.Services.dll • System.Data.SqlXml.dll • System.Transactions.dll • System.Configuration.dll
DON’T #5:Use finalisers • Finalisers are the .NET equivalent of destructors – code which is fired when an object is freed. • In .NET, garbage collection does the free, and so calls your finaliser for you • This means that your class is always promoted to a Generation 1 collection… • What on earth is that?
DON’T #5:Use finalisers • Garbage collection is done in generations 0, 1 and 2 • Generation 0 is the cheapest, implying the least work for the garbage collector • A ‘dead’ object with a finaliser is always skipped – guaranteeing promotion to Generation 1 • All objects referenced by the object with the finaliser are also kept alive
DO #6:Understand managed memory • Managed memory isn’t the same as Native memory • The managed memory used by the CLR comes from the MemToLeave address space • Moving objects between native and managed memory is called marshalling, and it has a cost associated with it
DON’T #6:Reinvent the wheel • Does your boss like the ‘Just Do It’ attitude? • It works for Nike, for coding not so much • Take the time to find out if the function you want already exists (it probably does) • If you can’t find it, try community sites – they are immensely helpful
DO #7:Use the context connection • If you need to get data from the database that your code is running against… …use the context connection • Connection string is: “context connection=true;” • Connecting using a standard connection requires elevation to EXTERNAL_ACCESS
DON’T #7:Access data in scalar functions • CLR Scalar Functions can be a lot faster than their T-SQL equivalents, particularly for string manipulations and complex procedural logic • Accessing data in scalar functions is not cool • You will absolutely kill performance by setting up a connection, running a query, returning the result
DO #8:Write efficient code • If you are running CLR code in SQL Server, then you are running it in the place within your architecture that is hardest to scale • It might be easy to write a brute-force algorithm, but when will it become a problem? • When you’re on holiday – and you certainly don’t want to spend the time on re-writing the algorithm to be more efficient then • Put your code in a normal app to profile it
DON’T #8:Over optimise • Hang on… didn’t you just say to write efficient code? • Yes, but there’s a difference between efficient and unintelligible • Simple, well structured code often has suitable performance characteristics • Don’t spend 90% of the time on the last 10% performance gain • Find and solve the major bottlenecks first
DO #9:Understand boxing • No, not the sport, or the day after christmas • We learned about value and reference types earlier • All objects can be treated the same way • So how is a value type passed in a context that expects a reference (e.g. List<T>) • Boxing
DO #9:Understand boxing • Boxing is taking your value type, and enclosing it in an Object instance, so it becomes a reference type • This has implications for equality • 1 == 1, but (object)1 != (object)1 • Why? • Because when boxed, each 1 has it’s own box & reference, and these are not equal
DON’T #9:Move the middle tier in • We’ve established that sometimes it can be useful to run middle tier code in the database • This should not be a default position • Running middle tier code in the database will limit your ability to scale-out – scaling out the database layer is inherently more complex than scaling out a middle tier.
DO #10:Use Dictionary<TKey, TValue> • Dictionary objects allow fast (close to O(1)) access to a large list of objects • Finding an object by key rather than looping through is the exact equivalent of doing an index seek rather than a scan • When implementing objects for dictionary access, over-ride the GetHashCode and Equals methods of Object
DO #10:Use Dictionary<TKey, TValue> • Be careful, though • If two objects would return true when Equals was called between them, then their GetHashCode calls must return the same value • This makes it unsuitable for fuzzy matching (e.g. allowing tolerance between float values)
DON’T #10:Call Environment.Exit • So Mladen Prajdic told me I should put this in. I think he was joking but… • This shows some of the dangers of working in the UNSAFE permission set • What happens when you call Environment.Exit? • It exits your procedure and rolls back?
DOs: • Know your transactions • Understand structs & classes • Dispose your IDisposables • Aim to write SAFE code • Understand the GAC • Understand managed memory • Use the context connection • Write efficient code • Understand boxing • Use Dictionary objects
DON’Ts: • Add strings together • Go overboard • Forget that data scales • Treat NULL as an afterthought • Use finalisers • Re-invent the wheel • Access data in scalar functions • Over-optimise • Move the middle tier in • Call Environment.Exit
Thanks! • Any questions – drop me an email: mattw@code9.net @atlantis_uk on twitter • If you live near Southampton – come to the #SQLSoton user group • If not – find your local user group at Community corner