Explore the intersection of biology and technology in animal agriculture, leveraging data for genomic research, gene expression, and protein function. Discover the future of biology with artificial intelligence.
Animal Agriculture in the New Biological Economy • Very high conservation amongst vertebrates for all biological domains • Anatomy • Physiology • Genomics • Gene Expression • Protein Function • Regulatory and Signaling Pathways • etc. • The amount of data being generated surpasses our ability to capture its value. • Future biology will rely on artificial intelligence to capture and analyze the data. • Animal agriculture must evolve to leverage new technology.
What types of data do we want to remember and analyze? • LIMS Data – Historical information about laboratory data generation and analysis • Phenotypic Data – The observable traits or characteristics of an organism • Expression Profiling Data – What genes are expressed when and where in an organism • Genetic Mapping Data – The relative positions of genes on a chromosome and the distance between them. • Physical Mapping Data – A chromosome map of a species that shows the specific physical locations of its genes and/or markers on each chromosome. • Gene and Protein Functional Data – What does it do and how?
Genetics Molecular Genetics Classical Genetics Comparative Genomics Phenotype Sequence Functional Genomics Genotype Proteomics
Minnesota Animal Genome and Ontology Database Annotation Sequencer Pipeline Sequence Analysis Sequence Markers Clones Oligos Genotype Libraries Animal Biological Sample Phenotype Arrays Phenotype Ontology
[Figure: entity-relationship diagram of the full database schema. The table and column names were interleaved during extraction and cannot be reliably regrouped. Recognizable table groups include: oligos, oligo pairs and oligo design runs; sequences, sequencing reactions, quality runs and masking runs; assembly runs, contigs and singletons; BLAST runs and results; markers, maps, linkage groups and chromosomes; genotypes and haplotypes; animals, breed compositions, populations and species; tissues, tissue samples, RNA, cDNA and genomic DNA preparations; clones, libraries and plates; PCR reactions, profiles and products; hybridization arrays, probes, targets and experiments; projects, persons, labs, locations and references.]
A relational data model for gene expression analysis • Track microarray production and use • Track tissues and probe generation • Track hybridizations and data acquisition • Track data normalization and submission • Track analysis • Interpretation
Statistical Analysis • Model: Y = X + a + b + … • F-test • T-test • Fold Change • Slide Quality: Spot Diameter; Spot Area; Footprint; Front Channel and Background; Channel Signal Uniformity • Data Quality: Dynamic Range; Signal-Noise Ratio; Signal Distribution of Front/Background Channel; Sample Correlation and Cluster • Results: Differentially expressed genes; Sample and gene expression pattern cluster; Gene/pathways identification
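To make two of the listed statistics concrete, here is a minimal Python sketch of a log2 fold change and a two-sample t statistic for a single gene. It is only an illustration: the slide does not say which form of t-test was used, so the Welch (unequal variance) form is an assumption, and the intensities are made-up numbers, not data from the study.

```python
import math
from statistics import mean, variance


def log2_fold_change(treated, control):
    """Log2 ratio of the mean treated signal to the mean control signal."""
    return math.log2(mean(treated) / mean(control))


def welch_t(a, b):
    """Two-sample t statistic with unequal variances (Welch form, an assumption)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up spot intensities for one gene, three replicates per group.
control = [100.0, 110.0, 90.0]
treated = [200.0, 220.0, 180.0]

print(log2_fold_change(treated, control))
print(welch_t(treated, control))
```

A log2 fold change of 1 corresponds to a doubling of signal; the t statistic still has to be compared against a t distribution to obtain a p-value.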
Species Clone Plate Tissue Clone Library Genomic DNA Prep Tissue in Tissue Sample cDNA Prep RNA Prep PCR RXN PCR Product Tissue Sample Primer Pair Animal Source DNA Primer F Animal to Source DNA R
Remembering the Hybridization
[Schema diagram, flattened during extraction. Tables shown include source_dna, oligo_as_source_dna, source_dna_as_hyb_target, hyb_probe, hyb_probe_in_experiment, hyb_array, hyb_subarray, hyb_target_on_subarray, hyb_target_in_experiment and hyb_experiment. Columns shown include source_dna_id, source_dna_type, oligo_id, hybridization_probe_id, specific_activity, probe_concentration, array_id, array_name, total_array_row, total_array_column, array_printing_date, array_printer_type, array_printing_person, array_slide_manufactuer, slide_coating_method, slide_coating_date, array_design_date, array_owner, subarray_id, subarray_row_address, subarray_column_address, total_subarray_row, total_subarray_column, array_row_address, array_column_address, hyb_target_description, target_quantity_in_moles, hybridization_experiment_id, hyb_experiment_condition, hyb_experiment_stringency, hyb_experiment_type and hybridization_date.]
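As a rough illustration of how a hybridization can be "remembered" relationally, the sketch below builds three of the tables named in the diagram (hyb_experiment, hyb_probe, hyb_probe_in_experiment) in an in-memory SQLite database and asks which probes were used in an experiment. The table and key names come from the diagram; the assignment of the other columns to tables, the column types, and all demo rows are assumptions made for the example.

```python
import sqlite3

# Column-to-table assignment and types are assumptions; names are from the slide.
SCHEMA = """
CREATE TABLE hyb_experiment (
    hybridization_experiment_id INTEGER PRIMARY KEY,
    hyb_experiment_type TEXT,
    hyb_experiment_condition TEXT
);
CREATE TABLE hyb_probe (
    hybridization_probe_id INTEGER PRIMARY KEY,
    source_dna_id INTEGER,
    probe_concentration REAL
);
CREATE TABLE hyb_probe_in_experiment (
    hybridization_probe_id INTEGER REFERENCES hyb_probe,
    hybridization_experiment_id INTEGER REFERENCES hyb_experiment,
    PRIMARY KEY (hybridization_probe_id, hybridization_experiment_id)
);
"""


def build_demo():
    """Create the sketch schema and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO hyb_experiment VALUES (?, ?, ?)",
        [(1, "two-color", "condition A"), (2, "two-color", "condition B")],
    )
    conn.executemany(
        "INSERT INTO hyb_probe VALUES (?, ?, ?)",
        [(10, 100, 1.5), (11, 101, 2.0)],
    )
    conn.executemany(
        "INSERT INTO hyb_probe_in_experiment VALUES (?, ?)",
        [(10, 1), (11, 1)],
    )
    return conn


def probes_for_experiment(conn, experiment_id):
    """Return the probe ids linked to one hybridization experiment."""
    rows = conn.execute(
        """
        SELECT p.hybridization_probe_id
        FROM hyb_probe AS p
        JOIN hyb_probe_in_experiment AS pe
          ON pe.hybridization_probe_id = p.hybridization_probe_id
        WHERE pe.hybridization_experiment_id = ?
        ORDER BY p.hybridization_probe_id
        """,
        (experiment_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo()
print(probes_for_experiment(conn, 1))
```

The link table hyb_probe_in_experiment is what lets one probe appear in several experiments and one experiment use several probes.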
Porcine Microarray Platform Pig Array-Ready Oligo Set v.1.0 13,297 70-mer oligonucleotide microarray. • Chip Platform: • Slide: Corning GAPS II (25 X 75 X 1.1 mm) • Print Conditions: 3X SSC; • Oligo concentration printed: 20 μM • Print temperature: Room temp (72-74 F); • Print pin number: 48; • Spot diameter: 140-160 μm; • Distance between spots: 240 μm
[Annotation workflow diagram: 13,297 oligos; TIGR TC hits (perfect match); Blastn and Blastx searches against Ensembl DNA, Ensembl cDNA and Ensembl peptide; Gene Ontology.]
* The top hit containing a bit score > 100, an E-value < 0.001, and similarity > 50%
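The hit criteria in the footnote can be expressed as a small filter. The sketch below is a hedged illustration, not the pipeline that was actually used: the `Hit` record and its field names are hypothetical, and ranking hits by bit score to find the "top hit" is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    """Hypothetical record for one BLAST alignment of an oligo to a subject."""
    query_id: str
    subject_id: str
    bit_score: float
    e_value: float
    similarity_pct: float


def passes_cutoffs(hit: Hit) -> bool:
    """Apply the thresholds quoted on the slide."""
    return hit.bit_score > 100 and hit.e_value < 0.001 and hit.similarity_pct > 50


def accepted_top_hit(hits: Iterable[Hit]) -> Optional[Hit]:
    """Return the best-scoring hit if it passes the cutoffs, else None."""
    best = None
    for hit in hits:
        if best is None or hit.bit_score > best.bit_score:
            best = hit
    if best is not None and passes_cutoffs(best):
        return best
    return None


# Made-up hits for one oligo.
hits = [
    Hit("oligo_1", "subject_a", 130.0, 1e-30, 98.0),
    Hit("oligo_1", "subject_b", 90.0, 1e-5, 80.0),
]
print(accepted_top_hit(hits))
```

An oligo whose best hit fails any one of the three thresholds would be counted as having no hit under this reading of the footnote.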
Estimate of Redundancy for SSWG1
Database          Start Sequence Number   Total Hit*      Unique Hit      Duplicates      No Hit*   Redundancy
TIGR5             13,297 oligos           13,297          13,286          11              0         0.08%
TIGR10            13,297 oligos           13,021          12,042          979             276       7.51%
Ensembl Peptide   12,042 TCs              8,608           7,280           1,328           3,434     15.43%
Ensembl cDNA      12,042 TCs              10,836          8,657           2,179           1,206     20.10%
Ensembl HG DNA    1,206 TCs               896             896             unknown         310       unknown
GO Annotation     13,297 oligos           7,600 (oligo)   6,935 (TC)      665 (TC)        5,697     8.75% (TC)
                                                          5,763 (Enst)    1,837 (Enst)              24.17% (Enst)
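The redundancy figures appear to be the number of duplicate hits divided by the total number of hits. The short Python sketch below checks that reading against three rows of the table; the function name is hypothetical, and the formula is inferred from the numbers rather than stated on the slide.

```python
def redundancy(total_hits, unique_hits):
    """Return (duplicates, redundancy in percent rounded to 2 decimals)."""
    duplicates = total_hits - unique_hits
    return duplicates, round(100 * duplicates / total_hits, 2)


# (total hits, unique hits) taken from the table rows.
rows = {
    "TIGR5": (13297, 13286),
    "Ensembl Peptide": (8608, 7280),
    "GO Annotation (Enst)": (7600, 5763),
}
for name, (total, unique) in rows.items():
    print(name, redundancy(total, unique))
```

For the TIGR10 and Ensembl cDNA rows the same formula gives values that differ from the slide by 0.01 in the last digit, which looks like a rounding difference.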
Microarray Analysis of Islet Quality and Function • Diabetes mellitus is a worldwide health problem • (1-2% of the population, 1 million U.S. citizens affected.) • Type 1 diabetes mellitus - autoimmune destruction of the insulin-secreting pancreatic islet cells, a disease of the young. • Current management of diabetes involves daily blood sugar testing, insulin injections, and careful meal planning. • Insulin controls but doesn’t cure diabetes. • >40% of diabetic patients experience complications (can lead to blindness, kidney failure, heart attacks and foot ulcers) • The only way to CURE diabetes is to replace the destroyed beta cells – Pancreas or Islet transplantation.
Islet Transplantation Obligatory Ex-Vivo Culture • Enzymatic Digestion to release islets • Tissue Culture (48 Hours) • Viability Assays • Gene Expression Based Diagnostics?
[Figure: panels A, B, C, D arranged by Cytokine and Glucose] Islets Comprised of 1000-2000 cells • β cells (70%): Insulin • α cells (15%): Glucagon • δ cells (5%) • PP cells (5%) • Endothelial (5%) • Islet Culture Conditions • A: 5.6 mM Glucose • B: 16.7 mM Glucose • C: 5.6 mM Glucose + IL-1β (2 ng/ml) + TNFα (1000 U/ml) + IFNγ (1000 U/ml) • D: 16.7 mM Glucose + IL-1β (2 ng/ml) + TNFα (1000 U/ml) + IFNγ (1000 U/ml)
Intracellular ATP Insulin content
Pig Islet Gene Expression • Porcine Islet Isolation and Culture: Islet cell donors: adult Landrace sows; Culture in medium 199 with pig serum; Culture condition: regular medium 48 h, change to conditional medium for 48 h • RNA Isolation: Qiagen RNeasy mini kit; RNA 6000 Nano LabChip Kit • Hybridization Protocol: Indirect labeling • Data Acquisition: ScanArray 5000 (Scan) • Workflow: Cell isolation, In vitro culture, RNA isolation, Labeling, Hybridization
Differentially expressed spots identified by microarray analysis using GeneSpring and R/maanova methods. The number of spots with a p-value < 0.05 after pairwise examination is shown.
Multifactorial Analysis of Expression Data • The mixed ANOVA models for pair-wise comparison, main effect and interaction determination were: • $y_{gijk} = \mu + D_i + A_j + S_k + \varepsilon_{gijk}$ (Pair-wise comparison) • $y_{gijk} = \mu + D_i + A_j + C_g + G_g + \varepsilon_{gijk}$ (Main effect determination) • $y_{gijk} = \mu + D_i + A_j + C_g + G_g + (CG)_g + \varepsilon_{gijk}$ (Interaction determination) • $\mu$ = mean value; $D$ = dye effect; $A$ = array; $S$ = sample; $C$ = cytokine; $G$ = glucose; $CG$ = interaction effect of glucose and cytokine; $\varepsilon$ = stochastic error
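To show what the cytokine, glucose and interaction terms measure, here is a deliberately simplified Python sketch for one gene in the 2 x 2 design (culture conditions A to D). It is not the mixed model fitted by R/maanova: it ignores the dye and array effects and works only from cell means in a balanced design. The function name and the log-intensity values are made up for the illustration.

```python
from statistics import mean


def factorial_effects(cells):
    """Estimate main effects and interaction from 2 x 2 cell means.

    `cells` maps (cytokine, glucose) to replicate log-intensities, where
    cytokine is 0 (absent) or 1 (present) and glucose is 0 (5.6 mM) or
    1 (16.7 mM).
    """
    m = {key: mean(values) for key, values in cells.items()}
    # Glucose effect at each cytokine level.
    glucose_without_cytokine = m[(0, 1)] - m[(0, 0)]
    glucose_with_cytokine = m[(1, 1)] - m[(1, 0)]
    # Cytokine effect at each glucose level.
    cytokine_low_glucose = m[(1, 0)] - m[(0, 0)]
    cytokine_high_glucose = m[(1, 1)] - m[(0, 1)]
    return {
        "glucose": (glucose_without_cytokine + glucose_with_cytokine) / 2,
        "cytokine": (cytokine_low_glucose + cytokine_high_glucose) / 2,
        # Non-zero when the glucose effect depends on cytokine exposure.
        "interaction": glucose_with_cytokine - glucose_without_cytokine,
    }


# Made-up log2 intensities for one gene, two replicates per condition.
cells = {
    (0, 0): [8.0, 8.2],    # A: low glucose
    (0, 1): [9.0, 9.2],    # B: high glucose
    (1, 0): [8.5, 8.7],    # C: low glucose + cytokines
    (1, 1): [11.0, 11.2],  # D: high glucose + cytokines
}
print(factorial_effects(cells))
```

In this toy example the glucose response is larger in the presence of cytokines, which is the kind of pattern the interaction model is meant to detect.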
Apoptosis interaction network from Pathway Assist showing genes differentially expressed under conditions of glucose and cytokine stress. Differentially expressed genes were identified by microarray using R/maanova analysis with a p-value < 0.05. Genes responding to a predicted interaction of cytokines and glucose are highlighted with green ellipses. Genes with blue ellipses were differentially expressed both in response to glucose alone and to cytokines alone.
Acknowledgements Hering Lab Maria Hardstedt Fahrenkrug Lab Hehuang Xie Min Wang Bhupinder Junej Open Resources Murtaugh Lab TIGR R/maanova ENSEMBL Cheryl Dvorak