Django & The OWASP Top 10 • Jarret Raim • Denim Group • 2008
What is Django? • Django is named after Django Reinhardt, a gypsy jazz guitarist from the 1930s to early 1950s. • The ‘D’ is silent. It’s pronounced ‘Jango’. • “Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.” • Developed by in-house programmers for several news sites. • Provides advanced functionality (ORM) and full applications (Admin) out of the box.
Overall Design Goals • Loose Coupling • The various layers of the framework shouldn’t “know” about each other unless absolutely necessary. • Less Code • Django apps should use as little code as possible; they should lack boilerplate. • Django should take full advantage of Python’s dynamic capabilities, such as introspection. • Quick Development • The point of a Web framework in the 21st century is to make the tedious aspects of Web development fast. Django should allow for incredibly quick Web development.
Overall Design Goals • Don’t Repeat Yourself (DRY) • Every distinct concept and/or piece of data should live in one, and only one, place. Redundancy is bad. Normalization is good. • Explicit Is Better Than Implicit • Magic is worth using only if it creates a huge convenience unattainable in other ways, and it isn’t implemented in a way that confuses developers who are trying to learn how to use the feature. • Consistency • The framework should be consistent at all levels. Consistency applies to everything from low-level (the Python coding style used) to high-level (the “experience” of using Django).
The Zen Of Python • Beautiful is better than ugly. • Explicit is better than implicit. • Simple is better than complex. • Complex is better than complicated. • Flat is better than nested. • Sparse is better than dense. • Readability counts. • Special cases aren't special enough to break the rules. • Although practicality beats purity. • Errors should never pass silently. • Unless explicitly silenced. • In the face of ambiguity, refuse the temptation to guess. • There should be one-- and preferably only one --obvious way to do it. • Although that way may not be obvious at first unless you're Dutch. • Now is better than never. • Although never is often better than *right* now. • If the implementation is hard to explain, it's a bad idea. • If the implementation is easy to explain, it may be a good idea. • Namespaces are one honking great idea -- let's do more of those!
• Python is an open source dynamic programming language. • Python can interact heavily with C based plugins for speed. • Python can be run on the CLR (IronPython) or on the JVM (Jython).
A Note About Tools • While perfect IDE integration for a dynamic language is possible, the current options are not great. • Standard options available • Eclipse • Emacs • Vim • Several debuggers • ActiveState • Winpdb • Komodo / Open Komodo • Expensive
Django and MVC • Django appears to be an MVC framework, but redefines some basic terms. • The Controller is the “view” • The View is the “template” • Model stays the same • In Django, the “view” describes the data that gets presented to the user. • It’s not necessarily how the data looks, but which data is presented. • The view describes which data you see, not how you see it. • Where does the “controller” fit in, then? • In Django’s case, it’s probably the framework itself: the machinery that sends a request to the appropriate view, according to the Django URL configuration. • Django is an “MTV” framework • “model”, “template”, and “view.”
Request Handling • Request is handled by server (mod_python, etc). • URLConf routes request to View. • View renders response. • Each section can be extended by middleware layers. • Example middleware • CSRF Protection • Authentication / Authorization • Cache • Transactions
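The layering described above can be sketched without Django at all. This is a framework-free illustration, not Django's middleware API: `hello_view`, `auth_middleware` and `header_middleware` are hypothetical names, and requests and responses are plain dicts.

```python
# Framework-free sketch: these names are illustrative, not Django's API.

def hello_view(request):
    """A 'view': takes a request dict and returns a response dict."""
    return {"status": 200, "body": "Hello, %s" % request.get("user", "anonymous")}


def auth_middleware(view):
    """Reject requests that carry no user before the view runs."""
    def wrapped(request):
        if "user" not in request:
            return {"status": 403, "body": "Forbidden"}
        return view(request)
    return wrapped


def header_middleware(view):
    """Post-process the response after the inner layers have run."""
    def wrapped(request):
        response = view(request)
        response["headers"] = {"X-Handled-By": "sketch"}
        return response
    return wrapped


# The outermost layer runs first on the way in and last on the way out.
handler = header_middleware(auth_middleware(hello_view))

ok = handler({"user": "alice"})
denied = handler({})
```

Each layer can short-circuit the request (as the authentication layer does) or adjust the response on the way out, which is the role the slide assigns to middleware.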
URLConf
urlpatterns = patterns('',
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', views.month_archive),
    (r'^foo/$', views.foobar_view, {'template_name': 'template1.html'}),
    (r'^mydata/birthday/$', views.my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', views.my_view),
    (r'^admin/', include('django.contrib.admin.urls')),
)
• A regex mapping between the URLs of your application and the views that handle them. • Can point to custom views or Django supplied views. • Uses positional or named groups for parameter passing (?P<year>). • Can override view parameters like template name. • URLs can include ‘fake’ captured data.
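How named groups and 'fake' captured data become view arguments can be sketched with the standard `re` module. This is a toy resolver under stated assumptions, not Django's implementation: `ROUTES`, `resolve` and the view bodies are hypothetical, and views here take only the captured values rather than a request.

```python
import re

# Hypothetical stand-in views; real Django views also receive a request.
def month_archive(year, month):
    return "archive for %s-%s" % (year, month)

def my_view(month, day):
    return "birthday on %s %s" % (month, day)

# (pattern, view, extra keyword arguments), tried in order.
ROUTES = [
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', month_archive, {}),
    (r'^mydata/birthday/$', my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', my_view, {}),
]

def resolve(path):
    """Return the first matching view's result, or None (think: 404)."""
    for pattern, view, extra in ROUTES:
        match = re.match(pattern, path)
        if match:
            kwargs = match.groupdict()   # named groups become keyword arguments
            kwargs.update(extra)         # 'fake' captured data from the route
            return view(**kwargs)
    return None
```

With these routes, `resolve('mydata/birthday/')` and `resolve('mydata/jan/06/')` reach the same view with the same arguments, which is what the 'fake' captured data bullet describes.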
URLConf Design Goals • Loose Coupling • URLs in a Django app should not be coupled to the underlying Python code. • For example, one site may put stories at /stories/, while another may use /news/. • Infinite Flexibility • URLs should be as flexible as possible. Any conceivable URL design should be allowed. • Encourage Best Practices • The framework should make it just as easy (or even easier) for a developer to design pretty URLs than ugly ones. • No file extensions or vignette-style commas • Definitive URLs • Technically, foo.com/bar and foo.com/bar/ are two different URLs, and search-engine robots (and some Web traffic-analyzing tools) would treat them as separate pages. Django should make an effort to “normalize” URLs so that search-engine robots don’t get confused.
Simple Views • The job of the view is to build a ‘Context’ containing all data that will be passed to the template. • Views can return any data such as JSON objects, PDFs, streamed data, etc.
import datetime
from django.http import HttpResponse
from django.shortcuts import render_to_response

# Version 1: build the HTML by hand.
def current_datetime(request):
    now = datetime.datetime.now()
    html = "<html><body>It is now %s.</body></html>" % now
    return HttpResponse(html)

# Version 2: the same view, rendered through a template.
def current_datetime(request):
    now = datetime.datetime.now()
    return render_to_response('current_datetime.html', {'current_date': now})
View Design Goals • Simplicity • Writing a view should be as simple as writing a Python function. Developers shouldn’t have to instantiate a class when a function will do. • Use Request Objects • Views should have access to a request object, an object that stores metadata about the current request. The object should be passed directly to a view function, rather than the view function having to access the request data from a global variable. This makes it light, clean and easy to test views by passing in “fake” request objects. • Loose Coupling • A view shouldn’t care about which template system the developer uses, or even whether a template system is used at all. • Differentiate Between GET & POST • GET and POST are distinct; developers should explicitly use one or the other. The framework should make it easy to distinguish between GET and POST data.
Basic Templates
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en">
<head>
  <title>Future time</title>
</head>
<body>
  <h1>My helpful timestamp site</h1>
  <p>In {{ hour_offset }} hour(s), it will be {{ next_time }}.</p>
  <hr>
  <p>Thanks for visiting my site.</p>
</body>
</html>
Template Design Goals • Separate Logic from Presentation • We see a template system as a tool that controls presentation and presentation-related logic, and that’s it. The template system shouldn’t support functionality that goes beyond this basic goal. • Discourage Redundancy • Support template inheritance to support DRY. • Be Decoupled from HTML • Template system should generate any data, not just HTML. • XML should not be used for template languages • Using an XML engine to parse templates introduces a whole new world of human error in editing templates, and incurs an unacceptable level of overhead in template processing. • Assume Designer Competence • Django expects template authors to be comfortable editing HTML directly.
Template Design Goals, Part Deux • Treat whitespace obviously • Any whitespace that’s not in a template tag should be displayed. (No Magic) • Don’t invent a programming language • The template system intentionally doesn’t allow the following: • Assignment to variables • Advanced logic • The Django template system recognizes that templates are most often written by designers, not programmers, and therefore should not assume Python knowledge. • Safety and security • The template system, out of the box, should forbid the inclusion of malicious code such as commands that delete database records. • Extensibility • The template system should recognize that advanced template authors may want to extend its technology.
Template Examples
{% for country in countries %}
<table>
  {% for city in country.city_list %}
  <tr>
    <td>Country #{{ forloop.parentloop.counter }}</td>
    <td>City #{{ forloop.counter }}</td>
    <td>{{ city }}</td>
  </tr>
  {% endfor %}
</table>
{% endfor %}

{{ name|lower }}
{{ my_text|escape|linebreaks }}
{{ bio|truncatewords:"30" }}

{% if today_is_weekend %}
<p>Welcome to the weekend!</p>
{% else %}
<p>Get back to work.</p>
{% endif %}

{% ifequal user currentuser %}
<h1>Welcome!</h1>
{% endifequal %}

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en">
<head>
  <title>{% block title %}{% endblock %}</title>
</head>
<body>
  <h1>My helpful timestamp site</h1>
  {% block content %}{% endblock %}
  {% block footer %}
  <hr>
  <p>Thanks for visiting my site.</p>
  {% endblock %}
</body>
</html>

{% extends "base.html" %}
{% block title %}The current time{% endblock %}
{% block content %}
<p>It is now {{ current_date }}.</p>
{% endblock %}
Interacting with the Database: Models • ORM definitions are simple Python objects. • Additional field types with additional semantics. • Generates database agnostic schemas. • Removes boilerplate (surrogate keys).
class Author(models.Model):
    salutation = models.CharField(maxlength=10)
    first_name = models.CharField(maxlength=30)
    last_name = models.CharField(maxlength=40)
    email = models.EmailField()
    headshot = models.ImageField(upload_to='/tmp')

CREATE TABLE "books_author" (
    "id" serial NOT NULL PRIMARY KEY,
    "salutation" varchar(10) NOT NULL,
    "first_name" varchar(30) NOT NULL,
    "last_name" varchar(40) NOT NULL,
    "email" varchar(75) NOT NULL,
    "headshot" varchar(100) NOT NULL
);
Basic Data Access
>>> from books.models import Publisher
>>> p1 = Publisher(name='X', address='Y',
...     city='Boston', state_province='MA', country='U.S.A.',
...     website='http://www.apress.com/')
>>> p1.save()
>>> publisher_list = Publisher.objects.all()
>>> publisher_list
• Data access is lazy and accomplished as a set of Model methods (think extension methods in .NET). • Generates SQL injection safe statements.
Advanced Data Access
>>> Publisher.objects.filter(name="Apress Publishing")
>>> Publisher.objects.filter(country="USA", state_province="CA")
>>> Publisher.objects.filter(name__contains="press")
>>> Publisher.objects.get(name="Penguin")
>>> Publisher.objects.order_by("name")
>>> Publisher.objects.filter(country="U.S.A.").order_by("-name")
>>> Publisher.objects.all()[0]
• No query is executed until it is ‘read’ by a list, count or slice operation, so queries can be chained freely. • Each query object contains an internal cache to minimize database accesses. • Extension methods (‘name__contains’) exist for most database operations like case (in)sensitive matching, >, <, in, startswith, etc.
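The lazy, chainable, cached behaviour described above can be illustrated with a toy class. This is a sketch of the idea only, not how Django's query objects are implemented: `LazyQuery` and its `evaluations` counter are hypothetical, and the "database" is a list of dicts.

```python
class LazyQuery:
    """Toy stand-in for a lazily evaluated, chainable query."""

    def __init__(self, rows, filters=()):
        self._rows = rows
        self._filters = tuple(filters)
        self._cache = None       # filled on first evaluation
        self.evaluations = 0     # counts simulated "database hits"

    def filter(self, **conditions):
        # Chaining returns a new object; nothing is evaluated yet.
        return LazyQuery(self._rows, self._filters + (conditions,))

    def _evaluate(self):
        if self._cache is None:
            self.evaluations += 1
            self._cache = [
                row for row in self._rows
                if all(row.get(key) == value
                       for cond in self._filters
                       for key, value in cond.items())
            ]
        return self._cache

    def __iter__(self):
        return iter(self._evaluate())

    def __len__(self):
        return len(self._evaluate())


publishers = [
    {"name": "Apress Publishing", "country": "U.S.A.", "state_province": "CA"},
    {"name": "Penguin", "country": "U.S.A.", "state_province": "NY"},
]

query = LazyQuery(publishers).filter(country="U.S.A.").filter(state_province="CA")
before = query.evaluations                   # still 0: chaining did no work
names = [row["name"] for row in query]       # first read triggers evaluation
again = list(query)                          # second read is served from the cache
```

Building the chain costs nothing; only reading the result does, and repeated reads of the same object reuse the cached rows.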
Model Design Goals • Explicit is Better than Implicit • Fields shouldn’t assume certain behaviors based solely on the name of the field. • Include all Relevant Domain Logic • Models should encapsulate every aspect of an “object,” following Martin Fowler’s Active Record design pattern. • SQL Efficiency • The framework should execute SQL statements as few times as possible, and it should optimize statements internally. • Terse, Powerful Syntax • The database API should allow rich, expressive statements in as little syntax as possible. It should not rely on importing other modules or helper objects. • Raw SQL When Needed • The database API should realize it’s a shortcut but not necessarily an end-all-be-all. The framework should make it easy to write custom SQL: entire statements, or just custom WHERE clauses as custom parameters to API calls.
Administration Site Django’s focus on removing the boilerplate work of web development led to the creation of a full-featured, configurable administrative interface to any Django app.
Form Processing • Forms can be auto-generated just like models and can even be pulled directly from a model class. • Includes built in validation for fields like email and image. • Rendering includes screen reader hints like <label> tags and can be rendered in multiple ways to allow for custom CSS usage.
# Choices must be defined before the form class that uses them.
TOPIC_CHOICES = (
    ('general', 'General enquiry'),
    ('bug', 'Bug report'),
    ('suggestion', 'Suggestion'),
)

class ContactForm(forms.Form):
    topic = forms.ChoiceField(choices=TOPIC_CHOICES)
    message = forms.CharField()
    sender = forms.EmailField(required=False)

<h1>Contact us</h1>
<form action="." method="POST">
  <table> {{ form.as_table }} </table>
  <ul> {{ form.as_ul }} </ul>
  <p> {{ form.as_p }} </p>
  <input type="submit" value="Submit" />
</form>
Generic Views • Because it’s such a common task, Django comes with a handful of built-in generic views that make generating list and detail views of objects incredibly easy.
from django.conf.urls.defaults import *
from django.views.generic import list_detail
from books.models import Publisher

publisher_info = {
    "queryset": Publisher.objects.all(),
}

urlpatterns = patterns('',
    (r'^publishers/$', list_detail.object_list, publisher_info)
)

{% block content %}
<h2>Publishers</h2>
<ul>
  {% for publisher in object_list %}
  <li>{{ publisher.name }}</li>
  {% endfor %}
</ul>
{% endblock %}
Middleware • Session Framework • This session framework lets you store and retrieve arbitrary data on a per-site-visitor basis. The cookie carries only a session ID; the data itself is stored on the server. • Users & Authentication • Handles user accounts, groups, permissions, and cookie-based user sessions. • Access limits can be expressed in code, decorator methods or template language. • Caching • Implements several caching mechanisms for development and production. • CSRF Protection • Implements the random seeded form method for protecting from CSRF attacks. • Transaction • This middleware binds a database COMMIT or ROLLBACK to the request / response phase. If a view function runs successfully, a COMMIT is issued. If the view raises an exception, a ROLLBACK is issued.
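The commit-or-rollback rule of the transaction middleware can be sketched with the standard `sqlite3` module. This is an illustration of the rule, not Django's middleware code: `run_in_transaction`, `good_view` and `bad_view` are hypothetical names, and the "views" take a connection instead of a request.

```python
import sqlite3

def run_in_transaction(conn, view):
    """Commit if the view returns normally, roll back if it raises."""
    try:
        result = view(conn)
    except Exception:
        conn.rollback()
        raise
    conn.commit()
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.commit()

def good_view(conn):
    conn.execute("INSERT INTO notes VALUES ('kept')")
    return "ok"

def bad_view(conn):
    conn.execute("INSERT INTO notes VALUES ('discarded')")
    raise ValueError("view failed")

run_in_transaction(conn, good_view)
try:
    run_in_transaction(conn, bad_view)
except ValueError:
    pass  # the failed view's INSERT has been rolled back

rows = [body for (body,) in conn.execute("SELECT body FROM notes")]
```

Only the row written by the successful view survives, so a view that fails halfway leaves no partial writes behind.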
OWASP Top 10 • Cross Site Scripting (XSS) • Injection Flaws • Malicious File Execution • Insecure Direct Object Reference • Cross Site Request Forgery • Information Leakage & Improper Error Handling • Broken Authentication & Session Management • Insecure Cryptographic Storage • Insecure Communications • Failure to Restrict URL Access
URL Mapping as a Security Measure • URL mappings are just regexes, so they are your first line of defense. • \d{4} • Only allows exactly four digits to be input, otherwise a 404 is issued. • Valid: 2008, 3456, 1924, 2345 • Invalid: 23<script>alert('hi!')</script>, -3454, 34, 1, 4h8g, #$#$$#ojif, etc. • Still have to logically validate, but structural validation is easy. • Helps prevent: Injection Flaws, XSS, CSRF, File Execution
urlpatterns = patterns('',
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', views.month_archive),
    (r'^foo/$', views.foobar_view, {'template_name': 'template1.html'}),
    (r'^mydata/birthday/$', views.my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', views.my_view),
    (r'^admin/', include('django.contrib.admin.urls')),
)
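The valid and invalid inputs listed above can be checked directly against a `\d{4}` group with Python's `re` module. The pattern below is a simplified, single-group version of the slide's article URL, assumed here for illustration; `matches` is a hypothetical helper.

```python
import re

# Simplified from the slide's pattern: only the year group is kept.
YEAR_URL = re.compile(r'^articles/(?P<year>\d{4})/$')

valid = ['2008', '3456', '1924', '2345']
invalid = ["23<script>alert('hi!')</script>", '-3454', '34', '1', '4h8g', '#$#$$#ojif']

def matches(value):
    """True if the value would be routed to a view, False if it would 404."""
    return YEAR_URL.match('articles/%s/' % value) is not None

accepted = [value for value in valid + invalid if matches(value)]
```

Only the four-digit values are accepted. This is structural validation only: a year such as 3456 still passes, so the view has to apply its own logical checks.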
Custom Data Typing • Django adds custom fields that are not provided by SQL. • These fields provide validation for model and form data. • Users can define their own fields with custom validation. • unique_for_* constraints • Validator lists for arbitrary fields. • Enums are first class
• CommaSeparatedIntegerField • DateField / DateTimeField • EmailField • FileField, FilePathField • ImageField • IPAddressField • PhoneNumberField • URLField • USStateField
GENDER_CHOICES = (
    ('M', 'Male'),
    ('F', 'Female'),
)
Cross Site Scripting (XSS) • All variables output by the template engine are escaped. • Django provides a ‘safe’ filter to bypass the protection.
{{ name|safe }}
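What escaping does to a hostile value can be shown with the standard library's `html.escape`. This is a stand-in for the template engine's escaping, used here only because it runs without Django; the exact entities Django emits may differ.

```python
from html import escape

# A value an attacker might submit as their 'name'.
name = "<script>alert('hi!')</script>"

# Markup characters are replaced by entities, so the browser
# displays the text instead of executing it.
escaped = escape(name, quote=True)
```

The `safe` filter skips this step, so it belongs only on values the application itself produced or has already sanitized.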
Security – SQL Injection • Django automatically escapes all special SQL parameters, according to the quoting conventions of the database server you’re using (e.g., PostgreSQL or MySQL). • Exception: The where argument to the extra() method. That parameter accepts raw SQL by design. • Exception: Queries done “by hand” using the lower-level database API.
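The difference between hand-built SQL and values passed separately from the statement can be sketched with the standard `sqlite3` module. This is a generic illustration, not Django's database layer: the `publisher` table and the hostile input are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (name TEXT)")
conn.executemany("INSERT INTO publisher VALUES (?)", [("Apress",), ("Penguin",)])

hostile = "x' OR '1'='1"

# Unsafe: string formatting lets the input rewrite the WHERE clause.
unsafe_sql = "SELECT name FROM publisher WHERE name = '%s'" % hostile
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value is bound as a parameter and compared as plain data.
safe_rows = conn.execute(
    "SELECT name FROM publisher WHERE name = ?", (hostile,)
).fetchall()
```

The hand-built query returns every row, while the parameterized one returns none. The two exceptions listed above are exactly the places where a Django developer is back in the first situation.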
Malicious File Execution • Standard Django practice is to have a separate server for all static media files. • Prevents standard directory traversal attacks (PHP). • Provides ImageField for standard user image manipulation and validation. • Custom storage backends can be developed to have specific validation behavior or custom storage. • E.g. Amazon S3, etc.
Insecure Direct Object Reference • Django defines a special field called ‘slug’. • In newspaper editing, a slug is a short name given to an article that is in production. • Examples: /blog/my-blog-entry, /home/this-is-a-slug • Slugs allow for SEO friendly URLs & limit the passing of IDs as GET parameters. • Views can declare the users / roles allowed to call them. • Permission to specific objects still needs to be checked by hand.
from django.contrib.auth.decorators import user_passes_test

@user_passes_test(lambda u: u.has_perm('polls.can_vote'))
def my_view(request):
    # ...
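The gap between a role check and a per-object check can be sketched in plain Python. Everything here is hypothetical (the `User` and `Poll` classes, the `owner` attribute, the `Forbidden` exception); it only illustrates the kind of check that still has to be written by hand.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    perms: frozenset

@dataclass
class Poll:
    slug: str
    owner: str

class Forbidden(Exception):
    pass

def edit_poll(user, poll):
    # Role check: the kind of rule a decorator such as user_passes_test covers.
    if "polls.can_vote" not in user.perms:
        raise Forbidden("missing permission")
    # Object check: not covered by the role; written by hand per object.
    if poll.owner != user.name:
        raise Forbidden("not the owner")
    return "editing %s" % poll.slug
```

A user who holds the right permission but does not own the poll passes the first check and is stopped only by the second, which is the case a role-only decorator would miss.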
Cross Site Request Forgery • Django provides included middleware to implement protection. • Seeds each form with a hash of the session ID plus a secret key. • Ensures that the same hash is passed back with each POST. • Django style guidelines recommend that all GET requests are idempotent.
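The "hash of the session ID plus a secret key" scheme can be sketched with the standard library. This is a minimal illustration, not the middleware's code: SHA-256 is an assumed choice of hash, and `SECRET_KEY`, `make_token` and `is_valid_post` are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = "not-a-real-secret"   # hypothetical value for illustration

def make_token(session_id):
    """Hash the secret key together with the session ID."""
    return hashlib.sha256((SECRET_KEY + session_id).encode("utf-8")).hexdigest()

def is_valid_post(session_id, submitted_token):
    """Accept a POST only if the token matches the one for this session."""
    return hmac.compare_digest(make_token(session_id), submitted_token)

# The token is embedded in each form rendered for this session.
token = make_token("session-abc")
```

An attacker's page can make the victim's browser send the session cookie, but it cannot compute the token without the secret key, so the forged POST fails the check.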
Information Leakage & Improper Error Handling • Django provides a DEBUG setting in the settings.py file. Setting it to False in production prevents the framework from outputting sensitive information in error pages. • The ExceptionMiddleware defines how the framework handles exceptions. • Custom ExceptionMiddleware can be created to handle exceptions.
Session Forging / Hijacking • Django’s session framework doesn’t allow sessions to be contained in the URL. • Unlike PHP, Java, etc. • The only cookie that the session framework uses is a single session ID; all the session data is stored in the database. • Session IDs are stored as hashes (instead of sequential numbers), which makes brute-force guessing impractical. • A user will always get a new session ID if she tries a nonexistent one, which prevents session fixation.
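The last two points can be sketched with a toy server-side session store. This is not Django's session code: `sessions`, `new_session` and `get_session_id` are hypothetical, and `secrets.token_hex` is used as a generic source of unguessable IDs.

```python
import secrets

sessions = {}   # server-side store: session ID -> session data

def new_session():
    """Create a session under a random, non-sequential ID."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = {}
    return session_id

def get_session_id(cookie_value):
    """Reuse a known ID; replace an unknown one instead of adopting it."""
    if cookie_value in sessions:
        return cookie_value
    return new_session()
```

Because an unknown ID is never adopted, an attacker cannot plant a chosen session ID in a victim's browser and then log in behind it.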
Insecure Cryptographic Storage • New Google library to allow for easy encryption, Keyczar.
import binascii
from Crypto.Cipher import Blowfish   # PyCrypto
from django.conf import settings

# Methods defined on a model class that has a social_security_number field.
def _get_ssn(self):
    enc_obj = Blowfish.new(settings.SECRET_KEY)
    return u"%s" % enc_obj.decrypt(binascii.a2b_hex(self.social_security_number)).rstrip()

def _set_ssn(self, ssn_value):
    enc_obj = Blowfish.new(settings.SECRET_KEY)
    # Pad with spaces to a multiple of Blowfish's 8-byte block size.
    repeat = 8 - (len(ssn_value) % 8)
    ssn_value = ssn_value + " " * repeat
    self.social_security_number = binascii.b2a_hex(enc_obj.encrypt(ssn_value))

ssn = property(_get_ssn, _set_ssn)
• Simple to create transparent encrypted storage for model fields.
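The space-padding step in the setter is easy to get wrong, so here it is in isolation. This sketch covers only the padding arithmetic from the slide, with no encryption involved; `pad`, `unpad` and `BLOCK_SIZE` are hypothetical names.

```python
BLOCK_SIZE = 8  # Blowfish works on 8-byte blocks

def pad(value):
    """Append spaces so the length is a multiple of the block size."""
    repeat = BLOCK_SIZE - (len(value) % BLOCK_SIZE)
    return value + " " * repeat

def unpad(value):
    """Strip the padding, as the getter does with rstrip()."""
    return value.rstrip()

padded = pad("123-45-6789")      # 11 characters plus 5 spaces
full_block = pad("12345678")     # already aligned: a whole extra block is added
```

One consequence of this scheme is that a value with meaningful trailing whitespace would not round-trip, which is harmless for an SSN but worth knowing before reusing the pattern for other fields.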
Failure to Restrict URL Access & Insecure Communications • Links not specifically called out in the URLConf don’t exist. • Views can be tagged with roles & permissions. • Static files loaded via forced browsing don’t exist in Django. • The framework has little to do with insecure communications. • Django uses the secure cookie protocol specified by Professor Alex X. Liu of Michigan State University. • Django authentication can be marked to use a secure cookie to force SSL.
Deployment & Scaling • Standard LAMP stack. • Choice of mod_python, mod_wsgi and FastCGI for request forwarding. • Standard Django deployment requires a separate static media server. • Scaling • Offload database. • Increase application servers. • Increase media servers. • Load balance. • Memcached
Deployment & Scaling Over 10 million / day served.