
KNIME

KNIME: Visual Programming for Metabolomics. Stephan Beisken.


Presentation Transcript


  1. KNIME Visual Programming for Metabolomics Stephan Beisken

  2. Visual Programming • “Visual programming languages enable physicians and other computer users with little knowledge of programming to develop computer software. The physician uses a visual paradigm to "draw" the computer interface and then attaches short segments of computer code to buttons, menus, and list boxes.” Ebell, M. H. (1993). Visual programming languages. M.D. Computing : Computers in Medical Practice, 10(5), 305–11.

  3. Motivation • Simplify your (working) life • Data processing and analysis require several tools working together in sequence • Data input and output • Spreadsheets • Data transformation • Transposition, aggregation, string manipulation • IsaCreator • Formatting of tables

  4. Agenda • Introduction • Tutorial • Installation and Extensions • Overview of the Workbench • Nodes and Table Models • Exercises • Introductory Examples • MassCascade • OpenMS • XCMS • Slides, software, workflows, and data for takeaway

  5. Disclaimer • Workflows are great • It does not have to be KNIME, there are many other solutions • Every method that captures information in a consistent manner and enables reproducibility is great • Transparency • Ability to share data and ‘everything’ that was done to the data

  6. Who is already a KNIME user?

  7. Introduction • KNIME: Konstanz Information Miner • http://www.knime.org/ • Developed at University of Konstanz in Germany • Desktop version available free of charge (open source) • Modular platform for building and executing workflows using predefined components: nodes • Core functionality available for tasks such as data mining, analysis, and manipulation • Extra features and functionality available in KNIME through extensions from various groups (community) and vendors • Written in Java based on the Eclipse SDK platform

  8. Workflow Concepts • Workflow execution • Can execute complex, multi-step operations on input data • Can be run by “non-experts” using predefined parameter templates that ensure optimal results • Can be set up for specific measurement systems • Can be shared across researchers
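
The idea of a multi-step workflow driven by a predefined parameter template can be sketched in a few lines of Python. This is only an illustration of the concept, not how KNIME is implemented: the step functions, `WORKFLOW`, `TEMPLATE`, and `run` are hypothetical names, and the input values are made up.

```python
def smooth(values, window):
    """Moving average over the last `window` points (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def threshold(values, minimum):
    """Keep only values at or above `minimum`."""
    return [v for v in values if v >= minimum]


# The workflow is an ordered list of named steps (hypothetical structure).
WORKFLOW = [("smooth", smooth), ("threshold", threshold)]

# A parameter template prepared once by an expert; a "non-expert" only runs it.
TEMPLATE = {"smooth": {"window": 2}, "threshold": {"minimum": 2.0}}


def run(workflow, template, data):
    """Apply each step in sequence, taking its parameters from the template."""
    for name, step in workflow:
        data = step(data, **template[name])
    return data


result = run(WORKFLOW, TEMPLATE, [1.0, 3.0, 5.0, 1.0])
```

Because the steps and their parameters live in plain data structures, the same workflow can be shared and rerun on another data set without changes.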

  9. Functionality • Data manipulation and analysis • File & database I/O, sorting, filtering, grouping, joining, pivoting • Data mining and machine learning • R, WEKA, KNIME, interactive plotting • Cheminformatics • Conversions, similarity, clustering, (Q)SAR analysis, etc. • Scripting integration • R, Perl, Python, Matlab, Octave, Groovy • Reporting and much more • Bioinformatics, HTS & image analysis, network & text mining • Marketing, big data and business analytics
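
For readers who think in scripts, the filtering, sorting, and grouping operations listed above correspond roughly to the following plain Python. This is a hedged sketch with invented rows and column names; in KNIME each step would be a separate node rather than a line of code.

```python
from collections import defaultdict
from statistics import mean

# Made-up example table: one dict per row.
rows = [
    {"sample": "A", "group": "wild type", "intensity": 120.0},
    {"sample": "B", "group": "knockout", "intensity": 80.0},
    {"sample": "C", "group": "wild type", "intensity": 100.0},
    {"sample": "D", "group": "knockout", "intensity": 5.0},
]

# Filtering: drop rows below an (arbitrary) intensity cutoff.
filtered = [r for r in rows if r["intensity"] >= 10.0]

# Sorting: highest intensity first.
ordered = sorted(filtered, key=lambda r: r["intensity"], reverse=True)

# Grouping: collect intensities per group, then aggregate with the mean.
by_group = defaultdict(list)
for r in ordered:
    by_group[r["group"]].append(r["intensity"])
group_means = {g: mean(v) for g, v in by_group.items()}
```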

  10. Modules (Community Extensions) • http://tech.knime.org/community • Chemoinformatics • CDK (EMBL-EBI), RDKit (Novartis), Indigo (GGA), Erl Wood (Eli Lilly), Enalos (NovaMechanics) • ChEMBL and ChEBI (EMBL-EBI) • Bioinformatics • OpenMS (Tübingen, ETH Zurich) • MassCascade (EMBL-EBI) • HCS (MPI), NGS (Konstanz), Image analysis • Integration • Python, Perl, R, Groovy, Matlab (MPI), PDB web services client (Vernalis), REST and SOAP web service support

  11. Workflow Platforms

  12. Applications

  13. Applications cont.

  14. Applications cont.

  15. Applications cont.

  16. Applications cont. Regression Calibration

  17. Advantages • Intuitive to use • No or little programming experience required • Good for prototyping • Lots of functionality • Very modular and flexible • Active community • Extensible • Visual feedback
      Disadvantages • Steep learning curve • Resource greedy • No (free) server edition • Slower execution than standalone scripts

  18. Installation • Download and unzip KNIME • No further setup required • ./knime.ini contains arguments for launch • Install new modules (nodes) from update sites • Explorer and installation wizard provided • Workflows and data are stored in a workspace • ~/<user>/knime/workspace • C:\Users\<user>\knime\workspace • Preferences in: File → Preferences → KNIME

  19. Workbench [Screenshot labels: Auto-layout • Execute • Execute all nodes • Node description • Tabs • Workflow projects • Favorite nodes • Public server • Workflow editor • Node repository • Outline • Console]

  20. Nodes • Node: basic processing unit of a workflow; performs a particular task • Icon with title and sequence number • Input port(s) on the left of the icon • Output port(s) on the right of the icon • Status display (‘traffic lights’) • Red (not ready) • Amber (ready) • Green (executed) • Blue bar during execution (with percentage or flashing) • Right-click menu: configure and execute the node, display the output views, edit the node, and display data for the ports
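
The traffic-light status can be read as a small state machine: a node starts red, turns amber once it is configured, and turns green after execution. The sketch below models that in Python under stated assumptions; `Node`, `Status`, and `row_filter` are hypothetical names for illustration and are not part of KNIME's API.

```python
from enum import Enum


class Status(Enum):
    RED = "not ready"     # e.g. not yet configured
    AMBER = "ready"       # configured, waiting to be executed
    GREEN = "executed"    # output is available at the output port


class Node:
    """Hypothetical stand-in for a workflow node; not a KNIME class."""

    def __init__(self, title, task):
        self.title = title
        self.task = task          # function: (table, settings) -> table
        self.settings = None
        self.output = None
        self.status = Status.RED

    def configure(self, **settings):
        # (Re)configuring discards old output and makes the node ready.
        self.settings = settings
        self.output = None
        self.status = Status.AMBER

    def execute(self, table):
        if self.status is Status.RED:
            raise RuntimeError(f"{self.title}: configure before executing")
        self.output = self.task(table, self.settings)
        self.status = Status.GREEN
        return self.output


def row_filter(table, settings):
    """Keep rows whose value in `column` is at least `minimum`."""
    column, minimum = settings["column"], settings["minimum"]
    return [row for row in table if row[column] >= minimum]


# Made-up input table: one dict per row.
table = [{"mz": 120.5, "intensity": 10.0}, {"mz": 301.2, "intensity": 250.0}]

node = Node("Row Filter", row_filter)
initial_status = node.status                      # Status.RED
node.configure(column="intensity", minimum=100.0)
configured_status = node.status                   # Status.AMBER
result = node.execute(table)                      # node is now Status.GREEN
```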

  21. Dialogs • Double-click opens configuration dialogs • Explicit column types

  22. Tables [Screenshot labels: Table rows • Column specifications • Various renderers • Column types]

  23. Exercises: Preliminaries • Pre-installed KNIME Desktop 2.9.1 • Workflows • starters, xcms, openms, masscascade • Data • FAAH knockout LC/MS data • ESB tomato LC/MS QC data • ChEBI SDFile, KEGG SDFile • Plug-Ins (more in About KNIME → Installation Details) • R (interactive) • Erl Wood, CDK • OpenMS, MassCascade

  24. Exercises: Installation • Open your KNIME directory • ~/Desktop/knime_2.9.1 • ./knime.exe • Memory allocation • ./knime.ini
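
Memory allocation is controlled through the Java options in knime.ini: the maximum heap size is the `-Xmx` line that follows `-vmargs`. As a hedged sketch, the Python helper below rewrites that line in the text of an ini file. The function name `set_max_heap` and the sample contents are assumptions made for illustration; a real knime.ini contains more entries, and editing it by hand in a text editor works just as well.

```python
import re


def set_max_heap(ini_text, size):
    """Return ini_text with the -Xmx line set to `size`, e.g. "4g".

    If no -Xmx line exists, one is appended after -vmargs
    (adding -vmargs first if it is missing).
    """
    if not re.fullmatch(r"\d+[kKmMgG]", size):
        raise ValueError(f"unexpected heap size: {size!r}")
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("-Xmx"):
            lines[i] = f"-Xmx{size}"
            break
    else:
        if "-vmargs" not in [entry.strip() for entry in lines]:
            lines.append("-vmargs")
        lines.append(f"-Xmx{size}")
    return "\n".join(lines) + "\n"


# Illustrative ini contents only; "-Dexample.option" is a placeholder.
sample = "-vmargs\n-Xmx1024m\n-Dexample.option=true\n"
updated = set_max_heap(sample, "4g")
```

KNIME needs a restart before a changed heap size takes effect, since the file is only read at launch.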

  25. Exercises: Starters • More examples available from the Examples repository

  26. Exercises: MassCascade https://bitbucket.org/sbeisken/masscascadeknime/wiki/ExampleWorkflows

  27. Exercises: XCMS http://www.bioconductor.org/packages/devel/data/experiment/manuals/faahKO/man/faahKO.pdf

  28. Exercises: OpenMS http://ftp.mi.fu-berlin.de/OpenMS/release-documentation/OpenMS_tutorial.pdf

  29. Final Remarks • Workflows can make exploratory or repetitive data tasks easier and save time • Extensive data pre-processing functionality • Extensions for statistics, machine learning, bio-, and cheminformatics • Integration of R (XCMS) and spectrometry extensions can help you to build elaborate pipelines and share work • Can help to organize one’s thoughts. • It’s actually quite a bit of fun.

  30. Resources • KNIME Forum • http://www.knime.org/ • KNIME Learning Hub • http://www.knime.org/learning-hub • Quickstart Guide • http://tech.knime.org/files/KNIME_quickstart.pdf • Happy to Help • beisken@ebi.ac.uk

  31. Q&A
