Gene family analysis toolkit: BLASTP-based family identification

BLASTP-based gene family member identification can be performed directly using the HEMU BLAST module.

Below are instructions for curating reference protein sequences and performing the BLASTP search.

Step1: obtain protein reference sequences (RefSeq)

First, obtain representative sequences from the protein family; these serve as queries for identifying family members in other species.
Reference sequences can be curated manually or downloaded from a publicly available RefSeq database.
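
Before submitting curated reference sequences, it can help to check that they are well-formed protein FASTA. The sketch below is one possible check and is not part of HEMU: the function names and the accepted residue alphabet are assumptions chosen for illustration.

```python
from typing import Dict, List

# Assumed alphabet: the 20 standard amino acids plus common ambiguity
# codes (B, X, Z), selenocysteine (U), pyrrolysine (O) and the stop symbol (*).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    current_id = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without an identifier")
            current_id = header[0]  # the ID is the first token of the header
            if current_id in records:
                raise ValueError(f"duplicate sequence ID: {current_id}")
            records[current_id] = ""
        elif current_id is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[current_id] += line.upper()
    return records


def invalid_residues(sequence: str) -> List[str]:
    """Return the sorted characters in `sequence` that are not in the assumed alphabet."""
    return sorted(set(sequence) - VALID_RESIDUES)
```

A sequence for which `invalid_residues` returns a non-empty list probably contains nucleotide data, digits, or copy-paste debris and is worth inspecting before it is used as a query.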

Step2: BLAST RefSeq against proteins of the species of interest

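Set a reasonable E-value threshold (for example, 1E-20) and run the protein-protein BLAST (BLASTP) search.
Hits with an E-value at or below the threshold are candidate family members.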
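
To illustrate how an E-value threshold selects candidate members, the sketch below filters BLASTP hits given as tab-separated text in the 12-column layout that NCBI BLAST+ writes with `-outfmt 6` (query ID, subject ID, ..., E-value in column 11, bit score in column 12). Whether HEMU exports results in this layout is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, evalue)


def filter_hits(tabular_text: str, max_evalue: float = 1e-20) -> List[Hit]:
    """Keep hits whose E-value is at or below `max_evalue`.

    Assumes the 12-column tabular layout; the E-value is the 11th column.
    """
    hits: List[Hit] = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


def candidate_members(hits: List[Hit]) -> List[str]:
    """Return the sorted, de-duplicated subject IDs among the retained hits."""
    return sorted({subject_id for _, subject_id, _ in hits})
```

A stricter threshold (a smaller E-value) reduces false positives but may miss divergent family members, so the retained hits are usually reviewed further, for example by checking for the family's characteristic domains.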