Help page - HEMU

Before you read...

This is a simplified version of our user manual, intended to give users a quick overview of the main functions of the toolkits and modules in HEMU.
For advanced or customized analyses on the platform, please consult our DETAILED USER MANUAL HERE.

1 Introduction

1.1 Introduction of HEMU

HEMU has a user-friendly graphical interface that allows users to effortlessly explore genomic data across various representative Andropogoneae species with just a click. Leveraging a total of 4287 RNA-seq datasets from the public database, we applied a widely recognized transcriptome analysis process to construct 73 significant genomes of the Andropogoneae tribe.

We have designed five distinct search toolkits, each described in its own section below. Our server infrastructure is built on Django, MySQL, and Shiny. Various types of visual analysis results are available (further details provided below).

1.2 A useful function — Task ID

The Task ID is visible in the URL and remains valid indefinitely, so you can save it and retrieve the associated results whenever needed.

2 Toolkit I: Genome analysis

2.1 Gene Information & Structure Search

This page visualizes transcript-level gene structures, allowing you to explore the gene structure of representative Andropogoneae species. It provides details about the major transcript, CDS (coding sequences), UTRs (untranslated regions), and other structural information associated with the specified gene.

2.2 Gene Functional Annotation Search

This page lets you explore gene functional annotations across RNA-seq samples that are part of the HEMU catalog. The results table shows the primary transcript ID, a concise functional description, GO/KEGG terms, and KEGG pathway terms.

2.3 Multi-omics Genome Browser

This page displays visualized genomic data for each processed species. A total of 9 species are available, and custom parameters can be applied to visualize multi-omics genome data in a linear view.

2.4 Genome Synteny Viewer

This page presents synteny analyses among three pairs of species, with the results visualized for convenience. Click the green link below each species to view the synteny results for that pair.

2.5 The BLAST Server

We have incorporated part of the BLAST search engine framework from SequenceServer (sequenceserver.com) and supplemented it with nucleotide and protein databases assembled for our species. By selecting the relevant database, you can obtain alignment results (please see user manual page 9, section 2.5).

3 Toolkit II: Transcriptome-derived analysis

3.1 Gene Expression Profile Search

This module provides an online query interface for gene expression in the Andropogoneae transcriptome database. The results are presented in two panels showing gene expression profiles.

Expression Plots: Two interactive expression plots visualize the profile at the sample level and at the tissue level. In the sample-level plot, hover your cursor over any data point to view its ID, the associated TPM/FPKM value, and the sample ID. Likewise, in the tissue-level plot, hovering over a data point shows the tissue type and the corresponding TPM/FPKM value.
Fundamental Information: This section provides essential information about the queried gene, including the frequency and the maximum, minimum, and median expression levels across samples, giving you a basic understanding of the Gene ID you are interested in. Clicking the "Search gene sequence" button takes you to the Raw Sequence Acquisition page, where you can access sequences for the target gene based on genome annotation profiles.
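
The statistics listed under Fundamental Information can be reproduced offline from a downloaded expression table. The following is a minimal Python sketch, not HEMU's own code. It assumes that "frequency" means the fraction of samples whose value exceeds a cutoff; both that interpretation and the default cutoff of 1 are assumptions, and the function name and sample values are hypothetical.

```python
from statistics import median


def summarize_expression(values, expressed_cutoff=1.0):
    """Summarize one gene's expression values (TPM or FPKM) across samples."""
    if not values:
        raise ValueError("at least one expression value is required")
    # Assumed meaning of "frequency": share of samples above the cutoff.
    expressed = sum(1 for v in values if v > expressed_cutoff)
    return {
        "frequency": expressed / len(values),
        "max": max(values),
        "min": min(values),
        "median": median(values),
    }


# Hypothetical TPM values for one gene in five samples.
tpm = [0.0, 0.5, 2.0, 8.0, 4.0]
summary = summarize_expression(tpm)
print(summary)
```

With these made-up values, three of the five samples exceed the cutoff, so the reported frequency is 0.6 and the median is 2.0.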

3.2 Raw Sequence Acquisition

This module retrieves gene, transcript, or protein sequences associated with your chosen Gene ID. Results are displayed as FASTA-formatted sequences (please see user manual page 11, section 3.2).
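
If you copy the FASTA output for downstream use, a small parser is often enough. The sketch below is one possible way to read FASTA text into a dictionary in Python; the record ID and sequence are invented for illustration and do not come from HEMU.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each record ID to its sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = []
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            records[current_id].append(line)
    return {rec_id: "".join(parts) for rec_id, parts in records.items()}


# Hypothetical record; the ID and sequence are made up for illustration.
example = """>GeneA.T001 example transcript
ATGGCGTTA
GCCTGA
"""
sequences = parse_fasta(example)
print(sequences["GeneA.T001"])
print(len(sequences["GeneA.T001"]))
```

For large files, a dedicated library such as Biopython is usually a better choice than a hand-written parser.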

3.3 Differential Gene Expression (DGE) Analysis

This module offers a platform for Differential Gene Expression (DGE) analysis, which investigates variation in expression across different tissues or organs. The results are organized into four sections: Project Summary, Overview & Normalization of Expression Data, Principal Component Analysis, and Differential Analysis (please see user manual page 12, section 3.3).

3.4 GO/KEGG Enrichment

This module enables users to explore functional annotation details of gene transcripts in representative Andropogoneae species. The results table contains the full output of the enrichment analysis and includes options for previewing and downloading the data.
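
This page does not state which statistical test the module uses. Enrichment analyses of this kind are commonly based on a hypergeometric (one-sided Fisher) test, so the Python sketch below illustrates that general idea only and should not be read as HEMU's implementation. All names and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes carrying the term
    sample:     number of genes in the query list
    hits:       query genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 background genes, 5 carry the term,
# and 2 of the 4 query genes carry it.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=2)
print(round(p_value, 4))
```

In practice, p-values from many terms are also corrected for multiple testing, which this sketch leaves out.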

3.5 Weighted Gene Co-expression Network Analysis (WGCNA)

This module divides the overall WGCNA process into three sections. Section 1 covers data curation, filtering, and soft-threshold (sft) selection. Section 2 involves co-expression network construction, and Section 3 focuses on module-trait correlation analysis. Note that the analyses in Section 2 and Section 3 are relatively independent, as the workflow shows (please see user manual page 15, section 3.5).
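
For readers new to WGCNA, the soft threshold is a power applied to the gene-gene correlations so that weak correlations shrink toward zero while strong ones are preserved. The Python sketch below shows this for the unsigned case, where adjacency is the absolute Pearson correlation raised to the chosen power. Whether HEMU builds signed or unsigned networks is not stated here, and the gene names, values, and power of 6 are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant profiles")
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, power):
    """Unsigned adjacency |cor|**power for every gene pair."""
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for g, h in combinations(sorted(expr), 2)
    }


# Hypothetical expression profiles across four samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, 3.0, 2.0, 4.0],
}
adj = soft_threshold_adjacency(expr, power=6)
for pair, weight in adj.items():
    print(pair, round(weight, 4))
```

Here the perfectly correlated pair keeps an adjacency of about 1, while a correlation of 0.8 drops to roughly 0.26, which is the intended effect of soft thresholding.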

4 Toolkit III: Gene family analysis

4.1 Family Identification (HMM & BLASTP)

This module offers two sequence search methods (HMM and BLASTP) for identifying shared domains and differences among gene sequences (please see user manual page 17, section 4.1).

4.2 Phylogenetic Analysis

This page offers a phylogenetic analysis toolkit for the automatic construction of a foundational tree, which can help elucidate the evolutionary and genetic relationships among a set of candidate genes (please see user manual page 19, section 4.2).

4.3 Family Expression Heatmap

This module provides a heatmap generation tool to help you visualize your gene family data. The output consists of a heatmap depicting gene expression levels and a dendrogram illustrating sample clustering, accompanied by a table of the gene expression data across samples.

5 Toolkit IV: Transposable element (TE) analysis

5.1 TE Search

This page provides two methods for querying TEs in genomes, helping researchers identify where TEs are inserted, as well as their classifications and putative biological functions.

5.2 TE (Transposable Elements) Expression Profile Search

TE expression data of representative Andropogoneae species have been analyzed, and an interactive query interface is provided. In our analysis, individual TEs are classified into families based on the 80-80-80 rule proposed by Wicker et al. That is, two elements belong to the same family if they share 80% (or more) sequence identity in at least 80% of their coding or internal domain, or within their terminal repeat regions, or in both, over an aligned segment of at least 80 bp. The results comprise two panels showing the expression profile for each TE ID.

Expression Plots: Two interactive expression plots display the profile at the sample level and at the tissue level. In the sample-level plot, hover your cursor over any data point to see its ID, the TPM/FPKM value, and the sample ID. Similarly, in the tissue-level plot, hovering over a data point shows the tissue type and the corresponding TPM/FPKM value.
Fundamental Information: This panel displays fundamental information about the queried TE, including the frequency and the maximum, minimum, and median expression across samples, giving you a basic understanding of the TE ID you are interested in. A TE family is considered expressed when TPM/FPKM > 1. Buttons for previewing or downloading the raw TE expression data table used to generate the plots are also provided.
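
The 80-80-80 rule used for family classification in this section can be written as a simple check on an alignment between two elements. The Python sketch below is an illustration under stated assumptions, not the pipeline HEMU ran. The three thresholds follow Wicker et al. (80% identity, 80% coverage of the domain or terminal repeat, and an aligned segment of at least 80 bp); the function and parameter names are hypothetical.

```python
def same_te_family(identity_pct, coverage_pct, aligned_length_bp):
    """Return True when an alignment between two TEs meets the 80-80-80 rule.

    identity_pct:      percent sequence identity of the aligned segment
    coverage_pct:      percent of the coding/internal domain or terminal
                       repeat region covered by the alignment
    aligned_length_bp: length of the aligned segment in base pairs
    """
    return (
        identity_pct >= 80.0
        and coverage_pct >= 80.0
        and aligned_length_bp >= 80
    )


# Hypothetical alignments between pairs of elements.
print(same_te_family(92.5, 85.0, 1200))  # all three criteria met
print(same_te_family(92.5, 60.0, 1200))  # coverage too low
print(same_te_family(95.0, 100.0, 50))   # aligned segment too short
```

The identity, coverage, and length values would normally come from a pairwise alignment tool such as BLASTN.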

5.3 TE Chromosomal Distribution Search

This page provides an interactive TE Chromosomal Distribution Search platform, which counts the TEs on each chromosome to show their distribution and support the study of their role in genome structure and function.
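
The per-chromosome counting described above is easy to reproduce on a downloaded annotation. The following Python sketch assumes each TE is given as a (chromosome, start, end) tuple; that record layout and the example coordinates are hypothetical and are not HEMU's data format.

```python
from collections import Counter


def count_tes_per_chromosome(te_records):
    """Count TE annotations per chromosome.

    te_records: iterable of (chromosome, start, end) tuples.
    """
    return Counter(chrom for chrom, _start, _end in te_records)


# Hypothetical TE annotations.
records = [
    ("chr1", 1_000, 5_200),
    ("chr1", 80_000, 86_500),
    ("chr2", 12_000, 13_900),
    ("chr3", 400, 2_100),
    ("chr2", 250_000, 259_000),
    ("chr1", 900_000, 904_000),
]
counts = count_tes_per_chromosome(records)
for chrom in sorted(counts):
    print(chrom, counts[chrom])
```

Dividing each count by the chromosome length would give a density, which is more comparable across chromosomes of different sizes.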

6 Toolkit V: Epigenome analysis

6.1 Chromatin Accessibility Search

This page provides two ways to search for open chromatin regions, both of which return information on all open chromatin regions at the queried location. The results are shown as a table listing every matching region, including its start and end positions, score, p-value, and other details.
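
Conceptually, a location query of this kind is an interval-overlap filter over the stored regions. The Python sketch below shows one way to express that; it is not HEMU's server code. The field names, the example regions, and the use of closed (inclusive) coordinates are all assumptions.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One open chromatin region (hypothetical field layout)."""
    chrom: str
    start: int
    end: int
    score: float
    p_value: float


def regions_overlapping(regions, chrom, start, end):
    """Return regions on `chrom` that overlap the closed interval [start, end]."""
    return [
        r for r in regions
        if r.chrom == chrom and r.start <= end and r.end >= start
    ]


# Hypothetical regions.
regions = [
    Region("chr1", 100, 200, 35.2, 1e-6),
    Region("chr1", 350, 500, 12.8, 3e-4),
    Region("chr1", 600, 700, 20.1, 2e-5),
    Region("chr2", 150, 400, 18.4, 5e-5),
]
hits = regions_overlapping(regions, "chr1", 150, 400)
for region in hits:
    print(region)
```

A linear scan like this is fine for small tables; large genome-wide sets are usually indexed, for example with an interval tree.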

6.2 ChIP-seq Peak Annotation Analysis

This page presents the ChIP-Seq database we built for representative Andropogoneae species, together with our streamlined ChIP-Seq peak annotation workflow. The analysis results include five plots: Heatmap, Density Map, Venn Plot, Pie Plot and TF Binding Loco Distribution Map (please see user manual page 27, section 6.2).

7 Other Resources

7.1 Data Warehouse

All data used for display and visualization are organized into three sections here: Genomic Resources, Transcriptomic Resources, and Epigenomic Resources.

7.2 External Links

This section contains some useful and essential links to other databases related to the project.
