Python to access Uniprot and ClustalO

We have summer interns at Diadem Biotherapeutics, so I'm teaching them some handy bioinformatics and thought I would share. We're using Google Colab to execute the Python code.

1) Let's import the libraries we need

import requests, io
import pandas as pd        

2) Let's define the Uniprot API

Uniprot_API = 'https://rest.uniprot.org/uniprotkb/search?'        

3) Let's write a "do_request" function to access the Uniprot API

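def do_request(Uniprot_API, entry='', **kwargs):
    # Keyword arguments (query, format, fields, size, ...) become URL query parameters
    req = requests.get('%s%s' % (Uniprot_API, entry), params=kwargs)
    if not req.ok:
        # Print the server's error message, then raise an HTTPError
        print(req.text)
        req.raise_for_status()
    return req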

4) Let's use our "do_request" function to query Uniprot, using p53 as an example. To customize the query, more information on query fields and formats can be found here: https://www.uniprot.org/help/api

req = do_request(Uniprot_API, query='gene:p53 AND reviewed:true',
                 format='tsv',
                 fields='accession,id,length,organism_name,organism_id,xref_pdb,xref_hgnc',
                 size='50')

print(req.text)        

Should return the following:

Entry	Entry Name	Length	Organism	Organism (ID)	PDB	HGNC
P04637	P53_HUMAN	393	Homo sapiens (Human)	9606	1A1U;1AIE;1C26;1DT7;1GZH;1H26;1HS5;1JSP;1KZY;1MA3;1OLG;1OLH;1PES;1PET;1SAE;1SAF;1SAK;1SAL;1TSR;1TUP;1UOL;1XQH;1YC5;1YCQ;1YCR;1YCS;2AC0;2ADY;2AHI;2ATA;2B3G;2BIM;2BIN;2BIO;2BIP;2BIQ;2F1X;2FEJ;2FOJ;2FOO;2GS0;2H1L;2H2D;2H2F;2H4F;2H4H;2H4J;2H59;2J0Z;2J10;2J11;2J1W;2J1X;2J1Y;2J1Z;2J20;2J21;2K8F;2L14;2LY4;2MEJ;2MWO;2MWP;2MWY;2MZD;2OCJ;2PCX;2RUK;2VUK;2WGX;2X0U;2X0V;2X0W;2XWR;2YBG;2YDR;2Z5S;2Z5T;3D05;3D06;3D07;3D08;3D09;3D0A;3DAB;3DAC;3IGK;3IGL;3KMD;3KZ8;3LW1;3OQ5;3PDH;3Q01;3Q05;3Q06;3SAK;3TG5;3TS8;3ZME;4AGL;4AGM;4AGN;4AGO;4AGP;4AGQ;4BUZ;4BV2;4HFZ;4HJE;4IBQ;4IBS;4IBT;4IBU;4IBV;4IBW;4IBY;4IBZ;4IJT;4KVP;4LO9;4LOE;4LOF;4MZI;4MZR;4QO1;4RP6;4RP7;4X34;4XR8;4ZZJ;5A7B;5AB9;5ABA;5AOI;5AOJ;5AOK;5AOL;5AOM;5BUA;5ECG;5G4M;5G4N;5G4O;5HOU;5HP0;5HPD;5LAP;5LGY;5MCT;5MCU;5MCV;5MCW;5MF7;5MG7;5MHC;5MOC;5O1A;5O1B;5O1C;5O1D;5O1E;5O1F;5O1G;5O1H;5O1I;5OL0;5UN8;5XZC;6FF9;6FJ5;6GGA;6GGB;6GGC;6GGD;6GGE;6GGF;6LHD;6R5L;6RJZ;6RK8;6RKI;6RKK;6RKM;6RL3;6RL4;6RL6;6RM5;6RM7;6RWH;6RWI;6RWS;6RWU;6RX2;6RZ3;6S39;6S3C;6S40;6S9Q;6SHZ;6SI0;6SI1;6SI2;6SI3;6SI4;6SIN;6SIO;6SIP;6SIQ;6SL6;6SLV;6T58;6V4F;6V4H;6VQO;6VR1;6VR5;6VRM;6VRN;6W51;6XRE;6ZNC;7B46;7B47;7B48;7B49;7B4A;7B4B;7B4C;7B4D;7B4E;7B4F;7B4G;7B4H;7B4N;7BWN;7DHY;7DHZ;7DVD;7EAX;7EDS;7EEU;7EL4;7NMI;7RM4;7V97;7XZX;7XZZ;7YGI;8A31;8A32;8A92;8DC4;8DC6;8DC7;8DC8;8F2H;8F2I;	HGNC:11998;
P10361	P53_RAT	391	Rattus norvegicus (Rat)	10116		
P02340	P53_MOUSE	390	Mus musculus (Mouse)	10090	1HU8;2GEQ;2IOI;2IOM;2IOO;2P52;3EXJ;3EXL;	
Q42578	PER53_ARATH	335	Arabidopsis thaliana (Mouse-ear cress)	3702	1PA2;1QO4;	
O09185	P53_CRIGR	393	Cricetulus griseus (Chinese hamster) (Cricetulus barabensis griseus)	10029		
Q8SPZ3	P53_DELLE	387	Delphinapterus leucas (Beluga whale)	9749		
Q9TTA1	P53_TUPBE	393	Tupaia belangeri (Common tree shrew) (Tupaia glis belangeri)	37347		
P61260	P53_MACFU	393	Macaca fuscata fuscata (Japanese macaque)	9543		
P56424	P53_MACMU	393	Macaca mulatta (Rhesus macaque)	9544		
P79892	P53_HORSE	280	Equus caballus (Horse)	9796		
Q29537	P53_CANLF	381	Canis lupus familiaris (Dog) (Canis familiaris)	9615		
P56423	P53_MACFA	393	Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)	9541		
Q9TUB2	P53_PIG	386	Sus scrofa (Pig)	9823		
Q9W678	P53_BARBU	369	Barbus barbus (Barbel) (Cyprinus barbus)	40830		
P25035	P53_ONCMY	396	Oncorhynchus mykiss (Rainbow trout) (Salmo gairdneri)	8022		
O12946	P53_PLAFE	366	Platichthys flesus (European flounder) (Pleuronectes flesus)	8260		
O57538	P53_XIPHE	342	Xiphophorus hellerii (Green swordtail)	8084		
O93379	P53_ICTPU	376	Ictalurus punctatus (Channel catfish) (Silurus punctatus)	7998		
P79820	P53_ORYLA	352	Oryzias latipes (Japanese rice fish) (Japanese killifish)	8090		
Q92143	P53_XIPMA	342	Xiphophorus maculatus (Southern platyfish) (Platypoecilus maculatus)	8083		
Q9W679	P53_TETMU	367	Tetraodon miurus (Congo puffer)	94908		
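To see what the keyword arguments turn into, here is a small offline sketch that encodes the same parameters by hand with the standard library. The variable names are mine, and the field list is shortened for readability; the result should be roughly the URL that requests builds for us.

```python
from urllib.parse import urlencode

# Same base URL and keyword arguments as the p53 query above
# (the field list is shortened here for readability).
UNIPROT_API = 'https://rest.uniprot.org/uniprotkb/search?'
params = {
    'query': 'gene:p53 AND reviewed:true',
    'format': 'tsv',
    'fields': 'accession,id,length',
    'size': '50',
}

# urlencode percent-escapes each value and joins the pairs with '&'.
full_url = UNIPROT_API + urlencode(params)
print(full_url)
```

Pasting the printed URL into a browser is a quick way to check a query before wiring it into code.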

4) Let's make it pretty with Pandas. Pandas is a great way to work with structured data. More here: https://pandas.pydata.org/docs/

uniprot_list = pd.read_table(io.StringIO(req.text), sep='\t')
uniprot_list.head()

Should return the first five rows as a formatted table.
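One thing worth noticing in the results: the gene:p53 search also matches Peroxidase 53 from Arabidopsis (PER53_ARATH), which is not a tumor antigen p53. Below is a minimal sketch of one way to filter it out, assuming the entry name prefix "P53_" is a good enough test for this result set. It uses only the standard library and three rows copied from the output so it runs without a network connection.

```python
import csv
import io

# Three rows copied from the TSV output above (columns trimmed for brevity).
sample_tsv = (
    "Entry\tEntry Name\tLength\n"
    "P04637\tP53_HUMAN\t393\n"
    "Q42578\tPER53_ARATH\t335\n"
    "P02340\tP53_MOUSE\t390\n"
)

rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))

# Keep only entries whose Uniprot entry name starts with "P53_".
p53_rows = [row for row in rows if row["Entry Name"].startswith("P53_")]

for row in p53_rows:
    print(row["Entry"], row["Entry Name"], row["Length"])
```

With the Pandas table, the same filter would be uniprot_list[uniprot_list['Entry Name'].str.startswith('P53_')].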

5) Let's use our do_request function to get the protein sequences in FASTA format

req = do_request(Uniprot_API, query='gene:p53 AND reviewed:true',
                 format='fasta')
fasta = req.text
print(fasta)        

This should return the following:

>sp|P04637|P53_HUMAN Cellular tumor antigen p53 OS=Homo sapiens OX=9606 GN=TP53 PE=1 SV=4
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
>sp|P10361|P53_RAT Cellular tumor antigen p53 OS=Rattus norvegicus OX=10116 GN=Tp53 PE=1 SV=1
MEDSQSDMSIELPLSQETFSCLWKLLPPDDILPTTATGSPNSMEDLFLPQDVAELLEGPE
EALQVSAPAAQEPGTEAPAPVAPASATPWPLSSSVPSQKTYQGNYGFHLGFLQSGTAKSV
MCTYSISLNKLFCQLAKTCPVQLWVTSTPPPGTRVRAMAIYKKSQHMTEVVRRCPHHERC
SDGDGLAPPQHLIRVEGNPYAEYLDDRQTFRHSVVVPYEPPEVGSDYTTIHYKYMCNSSC
MGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEEHCPELPPG
SAKRALPTSTSSSPQQKKKPLDGEYFTLKIRGRERFEMFRELNEALELKDARAAEESGDS
RAHSSYPKTKKGQSTSRHKKPMIKKVGPDSD
>sp|P02340|P53_MOUSE Cellular tumor antigen p53 OS=Mus musculus OX=10090 GN=Tp53 PE=1 SV=4
MTAMEESQSDISLELPLSQETFSGLWKLLPPEDILPSPHCMDDLLLPQDVEEFFEGPSEA
LRVSGAPAAQDPVTETPGPVAPAPATPWPLSSFVPSQKTYQGNYGFHLGFLQSGTAKSVM
CTYSPPLNKLFCQLAKTCPVQLWVSATPPAGSRVRAMAIYKKSQHMTEVVRRCPHHERCS
DGDGLAPPQHLIRVEGNLYPEYLEDRQTFRHSVVVPYEPPEAGSEYTTIHYKYMCNSSCM
GGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEVLCPELPPGS
AKRALPTCTSASPPQKKKPLDGEYFTLKIRGRKRFEMFRELNEALELKDAHATEESGDSR
AHSSYLKTKKGQSTSRHKKTMVKKVGPDSD
>sp|Q42578|PER53_ARATH Peroxidase 53 OS=Arabidopsis thaliana OX=3702 GN=PER53 PE=1 SV=1
MAVTNLPTCDGLFIISLIVIVSSIFGTSSAQLNATFYSGTCPNASAIVRSTIQQALQSDT
RIGASLIRLHFHDCFVNGCDASILLDDTGSIQSEKNAGPNVNSARGFNVVDNIKTALENA
CPGVVSCSDVLALASEASVSLAGGPSWTVLLGRRDSLTANLAGANSSIPSPIESLSNITF
KFSAVGLNTNDLVALSGAHTFGRARCGVFNNRLFNFSGTGNPDPTLNSTLLSTLQQLCPQ
NGSASTITNLDLSTPDAFDNNYFANLQSNDGLLQSDQELFSTTGSSTIAIVTSFASNQTL
FFQAFAQSMINMGNISPLTGSNGEIRLDCKKVNGS
>sp|O09185|P53_CRIGR Cellular tumor antigen p53 OS=Cricetulus griseus OX=10029 GN=TP53 PE=2 SV=1
MEEPQSDLSIELPLSQETFSDLWKLLPPNNVLSTLPSSDSIEELFLSENVTGWLEDSGGA
LQGVAAAAASTAEDPVTETPAPVASAPATPWPLSSSVPSYKTYQGDYGFRLGFLHSGTAK
SVTCTYSPSLNKLFCQLAKTCPVQLWVNSTPPPGTRVRAMAIYKKLQYMTEVVRRCPHHE
RSSEGDSLAPPQHLIRVEGNLHAEYLDDKQTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDPSGNLLGRNSFEVRICACPGRDRRTEEKNFQKKGEPCPELP
PKSAKRALPTNTSSSPPPKKKTLDGEYFTLKIRGHERFKMFQELNEALELKDAQASKGSE
DNGAHSSYLKSKKGQSASRLKKLMIKREGPDSD
>sp|Q8SPZ3|P53_DELLE Cellular tumor antigen p53 OS=Delphinapterus leucas OX=9749 GN=TP53 PE=2 SV=1
MEESQAELGVEPPLSQETFSDLWKLLPENNLLSSELSPAVDDLLLSPEDVANWLDERPDE
APQMPEPPAPAAPTPAAPAPATSWPLSSFVPSQKTYPGSYGFHLGFLHSGTAKSVTCTYS
PALNKLFCQLAKTCPVQLWVSSPPPPGTRVRAMAIYKKSEYMTEVVRRCPHHERCSDYSD
GLAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNFMCNSSCMGGM
NRRPILTIITLEDSNGNLLGRNSFEVRVCACPGRDRRTEEENFHKKGQSCPELPTGSAKR
ALPTGTSSSPPQKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGESRAHS
SHLKSKKGQSPSRHKKLMFKREGPDSD
>sp|Q9TTA1|P53_TUPBE Cellular tumor antigen p53 OS=Tupaia belangeri OX=37347 GN=TP53 PE=2 SV=1
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPDLNKLFCQLAKTCPVQLWVDSAPPPGTRVRAMAIYKQSQYVTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLHAEYSDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGKLLGRNSFEVRICACPGRDRRTEEENFRKKGESCPKLP
TGSIKRALPTGSSSSPQPKKKPLDEEYFTLQIRGRERFEMLREINEALELKDAMAGKESA
GSRAHSSHLKSKKGQSTSRHRKLMFKTEGPDSD
>sp|P61260|P53_MACFU Cellular tumor antigen p53 OS=Macaca fuscata fuscata OX=9543 GN=TP53 PE=2 SV=1
MEEPQSDPSIEPPLSQETFSDLWKLLPENNVLSPLPSQAVDDLMLSPDDLAQWLTEDPGP
DEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYGFRLGFLHSGTAK
SVTCTYSPDLNKMFCQLAKTCPVQLWVDSTPPPGSRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYSDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPCHQLP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPA
GSRAHSSHLKSKKGQSTSRHKKFMFKTEGPDSD
>sp|P56424|P53_MACMU Cellular tumor antigen p53 OS=Macaca mulatta OX=9544 GN=TP53 PE=2 SV=1
MEEPQSDPSIEPPLSQETFSDLWKLLPENNVLSPLPSQAVDDLMLSPDDLAQWLTEDPGP
DEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYGFRLGFLHSGTAK
SVTCTYSPDLNKMFCQLAKTCPVQLWVDSTPPPGSRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYSDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPCHQLP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPA
GSRAHSSHLKSKKGQSTSRHKKFMFKTEGPDSD
>sp|P79892|P53_HORSE Cellular tumor antigen p53 (Fragment) OS=Equus caballus OX=9796 GN=TP53 PE=2 SV=2
PAVNNLLLSPDVVNWLDEGPDEAPRMPAAPAPLAPAPATSWPLSSFVPSQKTYPGCYGFR
LGFLNSGTAKSVTCTYSPTLNKLFCQLAKTCPVQLLVSSPPPPGTRVRAMAIYKKSEFMT
EVVRRCPHHERCSDSSDGLAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEPPEVGSDC
TTIHYNFMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENF
RKKEEPCPEPPPRSTKRVLSSNTSSSPPQKKKPLDGEYFT
>sp|Q29537|P53_CANLF Cellular tumor antigen p53 OS=Canis lupus familiaris OX=9615 GN=TP53 PE=2 SV=2
MEESQSELNIDPPLSQETFSELWNLLPENNVLSSELCPAVDELLLPESVVNWLDEDSDDA
PRMPATSAPTAPGPAPSWPLSSSVPSPKTYPGTYGFRLGFLHSGTAKSVTWTYSPLLNKL
FCQLAKTCPVQLWVSSPPPPNTCVRAMAIYKKSEFVTEVVRRCPHHERCSDSSDGLAPPQ
HLIRVEGNLRAKYLDDRNTFRHSVVVPYEPPEVGSDYTTIHYNYMCNSSCMGGMNRRPIL
TIITLEDSSGNVLGRNSFEVRVCACPGRDRRTEEENFHKKGEPCPEPPPGSTKRALPPST
SSSPPQKKKPLDGEYFTLQIRGRERYEMFRNLNEALELKDAQSGKEPGGSRAHSSHLKAK
KGQSTSRHKKLMFKREGLDSD
>sp|P56423|P53_MACFA Cellular tumor antigen p53 OS=Macaca fascicularis OX=9541 GN=TP53 PE=2 SV=2
MEEPQSDPSIEPPLSQETFSDLWKLLPENNVLSPLPSQAVDDLMLSPDDLAQWLTEDPGP
DEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYGFRLGFLHSGTAK
SVTCTYSPDLNKMFCQLAKTCPVQLWVDSTPPPGSRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYSDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPCHQLP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPA
GSRAHSSHLKSKKGQSTSRHKKFMFKTEGPDSD
>sp|Q9TUB2|P53_PIG Cellular tumor antigen p53 OS=Sus scrofa OX=9823 GN=TP53 PE=2 SV=1
MEESQSELGVEPPLSQETFSDLWKLLPENNLLSSELSLAAVNDLLLSPVTNWLDENPDDA
SRVPAPPAATAPAPAAPAPATSWPLSSFVPSQKTYPGSYDFRLGFLHSGTAKSVTCTYSP
ALNKLFCQLAKTCPVQLWVSSPPPPGTRVRAMAIYKKSEYMTEVVRRCPHHERSSDYSDG
LAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNFMCNSSCMGGMN
RRPILTIITLEDASGNLLGRNSFEVRVCACPGRDRRTEEENFLKKGQSCPEPPPGSTKRA
LPTSTSSSPVQKKKPLDGEYFTLQIRGRERFEMFRELNDALELKDAQTARESGENRAHSS
HLKSKKGQSPSRHKKPMFKREGPDSD
>sp|Q9W678|P53_BARBU Cellular tumor antigen p53 OS=Barbus barbus OX=40830 GN=tp53 PE=2 SV=1
MAESQEFAELWERNLISTQEAGTCWELINDEYLPSSFDPNIFDNVLTEQPQPSTSPPTAS
VPVATDYPGEHGFKLGFPQSGTAKSVTCTYSSDLNKLFCQLAKTCPVQMVVNVAPPQGSV
IRATAIYKKSEHVAEVVRRCPHHERTPDGDGLAPAAHLIRVEGNSRALYREDDVNSRHSV
VVPYEVPQLGSEFTTVLYNFMCNSSCMGGMNRRPILTIISLETHDGQLLGRRSFEVRVCA
CPGRDRKTEESNFRKDQETKTLDKIPSANKRSLTKDSTSSVPRPEGSKKAKLSGSSDEEI
YTLQVRGKERYEMLKKINDSLELSDVVPPSEMDRYRQKLLTKGKKKDGQTPEPKRGKKLM
VKDEKSDSD
>sp|P25035|P53_ONCMY Cellular tumor antigen p53 OS=Oncorhynchus mykiss OX=8022 GN=tp53 PE=2 SV=1
MADLAENVSLPLSQESFEDLWKMNLNLVAVQPPETESWVGYDNFMMEAPLQVEFDPSLFE
VSATEPAPQPSISTLDTGSPPTSTVPTTSDYPGALGFQLRFLQSSTAKSVTCTYSPDLNK
LFCQLAKTCPVQIVVDHPPPPGAVVRALAIYKKLSDVADVVRRCPHHQSTSENNEGPAPR
GHLVRVEGNQRSEYMEDGNTLRHSVLVPYEPPQVGSECTTVLYNFMCNSSCMGGMNRRPI
LTIITLETQEGQLLGRRSFEVRVCACPGRDRKTEEINLKKQQETTLETKTKPAQGIKRAM
KEASLPAPQPGASKKTKSSPAVSDDEIYTLQIRGKEKYEMLKKFNDSLELSELVPVADAD
KYRQKCLTKRVAKRDFGVGPKKRKKLLVKEEKSDSD
>sp|O12946|P53_PLAFE Cellular tumor antigen p53 OS=Platichthys flesus OX=8260 GN=tp53 PE=2 SV=1
MMDEQGLDGMQILPGSQDSFSELWASVQTPSIATIAEEFDDHLGNLLQNGFDMNLFELPP
EMVAKDSVTPPSSTVPVVTDYPGEYGFQLRFQKSGTAKSVTSTFSELLKKLYCQLAKTSP
VEVLLSKEPPQGAVLRATAVYKKTEHVADVVRRCPHHQTEDTAEHRSHLIRLEGSQRALY
FEDPHTKRQSVTVPYEPPQLGSETTAILLSFMCNSSCMGGMNRRQILTILTLETPDGLVL
GRRCFEVRVCACPGRDRKTDEESSTKTPNGPKQTKKRKQAPSNSAPHTTTVMKSKSSSSA
EEEDKEVFTVLVKGRERYEIIKKINEAFEGAAEKEKAKNKVAVKQELPVPSSGKRLVQRG
ERSDSD
>sp|O57538|P53_XIPHE Cellular tumor antigen p53 OS=Xiphophorus hellerii OX=8084 GN=tp53 PE=2 SV=1
MEEADLTLPLSQDTFHDLWNNVFLSTENESLAPPEGLLSQNMDFWEDPETMQETKNVPTA
PTVPAISNYAGEHGFNLEFNDSGTAKSVTSTYSVKLGKLFCQLAKTTPIGVLVKEEPPQG
AVIRATSVYKKTEHVGEVVKRCPHHQSEDLSDNKSHLIRVEGSQLAQYFEDPNTRRHSVT
VPYERPQLGSEMTTILLSFMCNSSCMGGMNRRPILTILTLETTEGEVLGRRCFEVRVCAC
PGRDRKTEEGNLEKSGTKQTKKRKSAPAPDTSTAKKSKSASSGEDEDKEIYTLSIRGRNR
YLWFKSLNDGLELMDKTGPKIKQEIPAPSSGKRLLKGGSDSD
>sp|O93379|P53_ICTPU Cellular tumor antigen p53 OS=Ictalurus punctatus OX=7998 GN=tp53 PE=2 SV=1
MEGNGERDTMMVEPPDSQEFAELWLRNLIVRDNSLWGKEEEIPDDLQEVPCDVLLSDMLQ
PQSSSSPPTSTVPVTSDYPGLLNFTLHFQESSGTKSVTCTYSPDLNKLFCQLAKTCPVLM
AVSSSPPPGSVLRATAVYKRSEHVAEVVRRCPHHERSNDSSDGPAPPGHLLRVEGNSRAV
YQEDGNTQAHSVVVPYEPPQVGSQSTTVLYNYMCNSSCMGGMNRRPILTIITLETQDGHL
LGRRTFEVRVCACPGRDRKTEESNFKKQQEPKTSGKTLTKRSMKDPPSHPEASKKSKNSS
SDDEIYTLQVRGKERYEFLKKINDGLELSDVVPPADQEKYRQKLLSKTCRKERDGAAGEP
KRGKKRLVKEEKCDSD
>sp|P79820|P53_ORYLA Cellular tumor antigen p53 OS=Oryzias latipes OX=8090 GN=tp53 PE=2 SV=2
MDPVPDLPESQGSFQELWETVSYPPLETLSLPTVNEPTGSWVATGDMFLLDQDLSGTFDD
KIFDIPIEPVPTNEVNPPPTTVPVTTDYPGSYELELRFQKSGTAKSVTSTYSETLNKLYC
QLAKTSPIEVRVSKEPPKGAILRATAVYKKTEHVADVVRRCPHHQNEDSVEHRSHLIRVE
GSQLAQYFEDPYTKRQSVTVPYEPPQPGSEMTTILLSYMCNSSCMGGMNRRPILTILTLE
TEGLVLGRRCFEVRICACPGRDRKTEEESRQKTQPKKRKVTPNTSSSKRKKSHSSGEEED
NREVFHFEVYGRERYEFLKKINDGLELLEKESKSKNKDSGMVPSSGKKLKSN
>sp|Q92143|P53_XIPMA Cellular tumor antigen p53 OS=Xiphophorus maculatus OX=8083 GN=tp53 PE=2 SV=2
MEEADLTLPLSQDTFHDLWNNVFLSTENESLPPPEGLLSQNMDFWEDPETMQETKNVPTA
PTVPAISNYAGEHGFNLEFNDSGTAKSVTSTYSVKLGKLFCQLAKTTPIGVLVKEEPPQG
AVIRATAVYKKTEHVGEVVKRCPHHQSEDLSDNKSHLIRVEGSQLAQYFEDPNTRRHSVT
VPYERPQLGSEMTTILLSFMCNSSCMGGMNRRPILTILTLETTEGEVLGRRCFEVRVCAC
PGRDRKTEEGNLEKSGTKQTKKRKSAPAPDTSTAKKSKSASSGEDEDKEIYTLSIRGRNR
YLWFKSLNDGLELMDKTGPKIKQEIPAPSSGKRLLKGGSDSD
>sp|Q9W679|P53_TETMU Cellular tumor antigen p53 OS=Tetraodon miurus OX=94908 GN=tp53 PE=2 SV=1
MEEENISLPLSQDTFQDLWDNVSAPPISTIQTAALENEAWPAERQMNMMCNFMDSTFNEA
LFNLLPEPPSRDGANSSSPTVPVTTDYPGEYGFKLRFQKSGTAKSVTSTYSEILNKLYCQ
LAKTSLVEVLLGKDPPMGAVLRATAIYKKTEHVAEVVRRCPHHQNEDSAEHRSHLIRMEG
SERAQYFEHPHTKRQSVTVPYEPPQLGSEFTTILLSFMCNSSCMGGMNRRPILTILTLET
QEGIVLGRRCFEVRVCACPGRDRKTEETNSTKMQNDAKDAKKRKSVPTPDSTTIKKSKTA
SSAEEDNNEVYTLQIRGRKRYEMLKKINDGLDLLENKPKSKATHRPDGPIPPSGKRLLHR
GEKSDSD
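If you want to work with individual sequences rather than one big string, the FASTA text can be split into a dictionary. This is a minimal sketch, assuming Uniprot-style headers of the form ">sp|ACCESSION|NAME ..."; parse_fasta is my own helper name, and the sample sequences are truncated to ten residues to keep it short.

```python
def parse_fasta(text):
    """Return {accession: sequence} for Uniprot-style FASTA text."""
    records = {}
    accession = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header looks like ">sp|P04637|P53_HUMAN description ..."
            accession = line.split("|")[1]
            records[accession] = []
        elif accession is not None:
            # Sequence lines are wrapped, so collect them and join later.
            records[accession].append(line)
    return {acc: "".join(parts) for acc, parts in records.items()}


# Truncated sample (first ten residues only), split over two lines on purpose.
sample = """>sp|P04637|P53_HUMAN Cellular tumor antigen p53
MEEPQ
SDPSV
>sp|P10361|P53_RAT Cellular tumor antigen p53
MEDSQSDMSI
"""

sequences = parse_fasta(sample)
for acc, seq in sequences.items():
    print(acc, len(seq), seq)
```

On the real data you would call parse_fasta(fasta) instead of using the sample.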

6) Let's define a helper function to access another URL (for ClustalO)

def get_url(url, **kwargs):
  response = requests.get(url, **kwargs)

  if not response.ok:
    # Print the server's error message, then raise an HTTPError
    print(response.text)
    response.raise_for_status()

  return response

7) Let's submit our FASTA sequences to ClustalO for multiple sequence alignment. The POST request returns a job ID, and our helper function then checks the job's status.

req = requests.post("https://www.ebi.ac.uk/Tools/services/rest/clustalo/run", data={
    "email": "you@example.com",  # placeholder: replace with your own email address
    "iterations": 0,
    "outfmt": "clustal_num",
    "order": "input",
    "sequence": fasta
})

job_id = req.text
print(job_id)

req = get_url(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/{job_id}")
print(req.text)        

This should return something that looks like:

clustalo-R20230708-222339-0868-90023705-p1m
QUEUED        

8) You can re-run the following to check whether the job is finished

req = get_url(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/{job_id}")
print(req.text)        

When ready, you should get:

FINISHED        
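Instead of re-running the cell by hand, the status check can be wrapped in a small polling loop. This is only a sketch: wait_for_job is a name I made up, and I'm assuming the job reports QUEUED or RUNNING while it is still in progress (only QUEUED and FINISHED appear in the output above). The status call is passed in as a function, so the example below runs offline with a stand-in.

```python
import time


def wait_for_job(check_status, interval=5.0, max_checks=60):
    """Call check_status() until it stops returning QUEUED or RUNNING."""
    for _ in range(max_checks):
        status = check_status().strip()
        # Assumption: any status other than these two means the job is done
        # (successfully or not), so hand it back to the caller.
        if status not in ("QUEUED", "RUNNING"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")


# Stand-in for the real status call, so the sketch runs without a network.
fake_statuses = iter(["QUEUED", "RUNNING", "FINISHED"])
final = wait_for_job(lambda: next(fake_statuses), interval=0)
print(final)
```

With the real service, the stand-in would be replaced by something like lambda: get_url(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/{job_id}").text, keeping a polite interval of a few seconds between checks.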

9) Now let's get the results

req = get_url(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/result/{job_id}/aln-clustal_num")
print(req.text)        

You should get a nice multiple sequence alignment

CLUSTAL O(1.2.4) multiple sequence alignment


sp|P04637|P53_HUMAN        ---MEEPQSDPSVEPPLSQETFSDLWKLLPENNV------LSPL-P--SQAMDDLMLSPD	48
sp|P10361|P53_RAT          ---MEDSQSDMSIELPLSQETFSCLWKLLPPDDI------LPTTATGSPNSM-EDLFLPQ	50
sp|P02340|P53_MOUSE        MTAMEESQSDISLELPLSQETFSGLWKLLPPEDI------LPS-----PHCM-DDLLLPQ	48
sp|Q42578|PER53_ARATH      ------------------------------------------------------------	0
sp|O09185|P53_CRIGR        ---MEEPQSDLSIELPLSQETFSDLWKLLPPNNV------LSTLP--SSDSI-EELFLSE	48
sp|Q8SPZ3|P53_DELLE        ---MEESQAELGVEPPLSQETFSDLWKLLPENNL------LSSELS--PA-VDDLLLSPE	48
sp|Q9TTA1|P53_TUPBE        ---MEEPQSDPSVEPPLSQETFSDLWKLLPENNV------LSPL-P--SQAMDDLMLSPD	48
sp|P61260|P53_MACFU        ---MEEPQSDPSIEPPLSQETFSDLWKLLPENNV------LSPL-P--SQAVDDLMLSPD	48
sp|P56424|P53_MACMU        ---MEEPQSDPSIEPPLSQETFSDLWKLLPENNV------LSPL-P--SQAVDDLMLSPD	48
sp|P79892|P53_HORSE        -------------------------------------------------PAVNNLLLSP-	10
sp|Q29537|P53_CANLF        ---MEESQSELNIDPPLSQETFSELWNLLPENNV------LSSELC--PAV--DELLLPE	47
sp|P56423|P53_MACFA        ---MEEPQSDPSIEPPLSQETFSDLWKLLPENNV------LSPL-P--SQAVDDLMLSPD	48
sp|Q9TUB2|P53_PIG          ---MEESQSELGVEPPLSQETFSDLWKLLPENNL------LSSELS--LAAVNDLLLSP-	48
sp|Q9W678|P53_BARBU        ---------------MAESQEFAELWERNLISTQ---------EAGTCWELI-ND----E	31
sp|P25035|P53_ONCMY        --MA---DLAENVSLPLSQESFEDLWKMNLNLVA------VQPPETESWVGY-DNFMMEA	48
sp|O12946|P53_PLAFE        --MMDEQGLDGMQILPGSQDSFSELWASVQTPSIATIAEEF-------------DDHLGN	45
sp|O57538|P53_XIPHE        ---ME----EADLTLPLSQDTFHDLWNNVFLSTENESLA----PPEG---------LLSQ	40
sp|O93379|P53_ICTPU        --MEGNGERDTMMVEPPDSQEFAELWLRNLIVRD---------N--SLWGKE-------E	40
sp|P79820|P53_ORYLA        --------MDPVPDLPESQGSFQELWETVSYPPLETLSLPTVNEPTGSWVATGDMFLLDQ	52
sp|Q92143|P53_XIPMA        ---ME----EADLTLPLSQDTFHDLWNNVFLSTENESLP----PPEG---------LLSQ	40
sp|Q9W679|P53_TETMU        ---ME----EENISLPLSQDTFQDLWDNVSAPPISTIQTAAL--ENEAWPAERQMNMMCN	51
                                                                                       

sp|P04637|P53_HUMAN        DIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYG	108
sp|P10361|P53_RAT          DVAELLEGPEEALQV---S-APAAQEPGTEAPAPVAPASATPWPLSSSVPSQKTYQGNYG	106
sp|P02340|P53_MOUSE        DVEEFFEGPSEALRV---SGAPAAQDPVTETPGPVAPAPATPWPLSSFVPSQKTYQGNYG	105
sp|Q42578|PER53_ARATH      ----------------------------------------------MAVTNLPTCDGLFI	14
sp|O09185|P53_CRIGR        NVTGWLEDSGGALQGVAAAAASTAEDPVTETPAPVASAPATPWPLSSSVPSYKTYQGDYG	108
sp|Q8SPZ3|P53_DELLE        DVANWLDER--PDEAPQMPEP-----PAPAAPTPAAPAPATSWPLSSFVPSQKTYPGSYG	101
sp|Q9TTA1|P53_TUPBE        DIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYG	108
sp|P61260|P53_MACFU        DLAQWLTEDPGPDEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYG	108
sp|P56424|P53_MACMU        DLAQWLTEDPGPDEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYG	108
sp|P79892|P53_HORSE        DVVNWLDEG--PDEAPRMPAA-----P-----APLAPAPATSWPLSSFVPSQKTYPGCYG	58
sp|Q29537|P53_CANLF        SVVNWLDED--SDDAPRMPAT-----SA-----PTAPGPAPSWPLSSSVPSPKTYPGTYG	95
sp|P56423|P53_MACFA        DLAQWLTEDPGPDEAPRMSEAAPPMAPTPAAPTPAAPAPAPSWPLSSSVPSQKTYHGSYG	108
sp|Q9TUB2|P53_PIG          -VTNWLDEN--PDDASRVPAP-----PAATAPAPAAPAPATSWPLSSFVPSQKTYPGSYD	100
sp|Q9W678|P53_BARBU        YLPSSFDPN---------IFDNVL----------TEQPQPSTSPPTASVPVATDYPGEHG	72
sp|P25035|P53_ONCMY        PLQVEFDPS---------LFEVSA---TEPAPQPSISTLDTGSPPTSTVPTTSDYPGALG	96
sp|O12946|P53_PLAFE        LLQNGFDMN---------LFELPP----------EMVAKDSVTPPSSTVPVVTDYPGEYG	86
sp|O57538|P53_XIPHE        ------NMD---------FWE-DP----------ETMQETKNVPTAPTVPAISNYAGEHG	74
sp|O93379|P53_ICTPU        EIPDDLQEV---------PCDVLL---SD-----MLQPQSSSSPPTSTVPVTSDYPGLLN	83
sp|P79820|P53_ORYLA        DLSGTFDDK---------IFDIPI----------EPVPTNEVNPPPTTVPVTTDYPGSYE	93
sp|Q92143|P53_XIPMA        ------NMD---------FWE-DP----------ETMQETKNVPTAPTVPAISNYAGEHG	74
sp|Q9W679|P53_TETMU        FMDSTFNEA---------LFNLLP----------EPPSRDGANSSSPTVPVTTDYPGEYG	92
                                                                           *       *   

sp|P04637|P53_HUMAN        FRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPV-------------------------	143
sp|P10361|P53_RAT          FHLGFLQSGTAKSVMCTYSISLNKLFCQLAKTCPV-------------------------	141
sp|P02340|P53_MOUSE        FHLGFLQSGTAKSVMCTYSPPLNKLFCQLAKTCPV-------------------------	140
sp|Q42578|PER53_ARATH      ISLIVIV----SSIFGTSSAQLNATFY--SGTCPNASAIVRSTIQQALQSDTRIGASLIR	68
sp|O09185|P53_CRIGR        FRLGFLHSGTAKSVTCTYSPSLNKLFCQLAKTCPV-------------------------	143
sp|Q8SPZ3|P53_DELLE        FHLGFLHSGTAKSVTCTYSPALNKLFCQLAKTCPV-------------------------	136
sp|Q9TTA1|P53_TUPBE        FRLGFLHSGTAKSVTCTYSPDLNKLFCQLAKTCPV-------------------------	143
sp|P61260|P53_MACFU        FRLGFLHSGTAKSVTCTYSPDLNKMFCQLAKTCPV-------------------------	143
sp|P56424|P53_MACMU        FRLGFLHSGTAKSVTCTYSPDLNKMFCQLAKTCPV-------------------------	143
sp|P79892|P53_HORSE        FRLGFLNSGTAKSVTCTYSPTLNKLFCQLAKTCPV-------------------------	93
sp|Q29537|P53_CANLF        FRLGFLHSGTAKSVTWTYSPLLNKLFCQLAKTCPV-------------------------	130
sp|P56423|P53_MACFA        FRLGFLHSGTAKSVTCTYSPDLNKMFCQLAKTCPV-------------------------	143
sp|Q9TUB2|P53_PIG          FRLGFLHSGTAKSVTCTYSPALNKLFCQLAKTCPV-------------------------	135
sp|Q9W678|P53_BARBU        FKLGFPQSGTAKSVTCTYSSDLNKLFCQLAKTCPV-------------------------	107
sp|P25035|P53_ONCMY        FQLRFLQSSTAKSVTCTYSPDLNKLFCQLAKTCPV-------------------------	131
sp|O12946|P53_PLAFE        FQLRFQKSGTAKSVTSTFSELLKKLYCQLAKTSPV-------------------------	121
sp|O57538|P53_XIPHE        FNLEFNDSGTAKSVTSTYSVKLGKLFCQLAKTTPI-------------------------	109
sp|O93379|P53_ICTPU        FTLHFQESSGTKSVTCTYSPDLNKLFCQLAKTCPV-------------------------	118
sp|P79820|P53_ORYLA        LELRFQKSGTAKSVTSTYSETLNKLYCQLAKTSPI-------------------------	128
sp|Q92143|P53_XIPMA        FNLEFNDSGTAKSVTSTYSVKLGKLFCQLAKTTPI-------------------------	109
sp|Q9W679|P53_TETMU        FKLRFQKSGTAKSVTSTYSEILNKLYCQLAKTSLV-------------------------	127
                           : * .      .*:  * *  *   :   : *                            

sp|P04637|P53_HUMAN        -------------QLWVDST------PPPGTRVRAMAIYKQ-SQHMTEVVRRCPHHERCS	183
sp|P10361|P53_RAT          -------------QLWVTST------PPPGTRVRAMAIYKK-SQHMTEVVRRCPHHERCS	181
sp|P02340|P53_MOUSE        -------------QLWVSAT------PPAGSRVRAMAIYKK-SQHMTEVVRRCPHHERCS	180
sp|Q42578|PER53_ARATH      LHFHDCFVNGCDASILLDDTGSIQSEKNAGPNVNSARGFNVVDNIKTALENACPGVVSCS	128
sp|O09185|P53_CRIGR        -------------QLWVNST------PPPGTRVRAMAIYKK-LQYMTEVVRRCPHHERSS	183
sp|Q8SPZ3|P53_DELLE        -------------QLWVSSP------PPPGTRVRAMAIYKK-SEYMTEVVRRCPHHERCS	176
sp|Q9TTA1|P53_TUPBE        -------------QLWVDSA------PPPGTRVRAMAIYKQ-SQYVTEVVRRCPHHERCS	183
sp|P61260|P53_MACFU        -------------QLWVDST------PPPGSRVRAMAIYKQ-SQHMTEVVRRCPHHERCS	183
sp|P56424|P53_MACMU        -------------QLWVDST------PPPGSRVRAMAIYKQ-SQHMTEVVRRCPHHERCS	183
sp|P79892|P53_HORSE        -------------QLLVSSP------PPPGTRVRAMAIYKK-SEFMTEVVRRCPHHERCS	133
sp|Q29537|P53_CANLF        -------------QLWVSSP------PPPNTCVRAMAIYKK-SEFVTEVVRRCPHHERCS	170
sp|P56423|P53_MACFA        -------------QLWVDST------PPPGSRVRAMAIYKQ-SQHMTEVVRRCPHHERCS	183
sp|Q9TUB2|P53_PIG          -------------QLWVSSP------PPPGTRVRAMAIYKK-SEYMTEVVRRCPHHERSS	175
sp|Q9W678|P53_BARBU        -------------QMVVNVA------PPQGSVIRATAIYKK-SEHVAEVVRRCPHHERTP	147
sp|P25035|P53_ONCMY        -------------QIVVDHP------PPPGAVVRALAIYKK-LSDVADVVRRCPHHQSTS	171
sp|O12946|P53_PLAFE        -------------EVLLSKE------PPQGAVLRATAVYKK-TEHVADVVRRCPHHQT--	159
sp|O57538|P53_XIPHE        -------------GVLVKEE------PPQGAVIRATSVYKK-TEHVGEVVKRCPHHQS--	147
sp|O93379|P53_ICTPU        -------------LMAVSSS------PPPGSVLRATAVYKR-SEHVAEVVRRCPHHERSN	158
sp|P79820|P53_ORYLA        -------------EVRVSKE------PPKGAILRATAVYKK-TEHVADVVRRCPHHQN--	166
sp|Q92143|P53_XIPMA        -------------GVLVKEE------PPQGAVIRATAVYKK-TEHVGEVVKRCPHHQS--	147
sp|Q9W679|P53_TETMU        -------------EVLLGKD------PPMGAVLRATAIYKK-TEHVAEVVRRCPHHQN--	165
                                         : :            .  :.:   ::   .    : . **      

sp|P04637|P53_HUMAN        D-SDGLAPPQHLIRVEGNLRVEYLDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|P10361|P53_RAT          D-GDGLAPPQHLIRVEGNPYAEYLDDRQTFR-------HSVVVPYEPPEVGSDYTTIHYK	233
sp|P02340|P53_MOUSE        D-GDGLAPPQHLIRVEGNLYPEYLEDRQTFR-------HSVVVPYEPPEAGSEYTTIHYK	232
sp|Q42578|PER53_ARATH      DVLA-LASEASVSLAGGPSWTVLLGRRDSLTANLAGANSSIPSPIE------SLSNITFK	181
sp|O09185|P53_CRIGR        E-GDSLAPPQHLIRVEGNLHAEYLDDKQTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|Q8SPZ3|P53_DELLE        DYSDGLAPPQHLIRVEGNLRAEYLDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	229
sp|Q9TTA1|P53_TUPBE        D-SDGLAPPQHLIRVEGNLHAEYSDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|P61260|P53_MACFU        D-SDGLAPPQHLIRVEGNLRVEYSDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|P56424|P53_MACMU        D-SDGLAPPQHLIRVEGNLRVEYSDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|P79892|P53_HORSE        DSSDGLAPPQHLIRVEGNLRAEYLDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	186
sp|Q29537|P53_CANLF        DSSDGLAPPQHLIRVEGNLRAKYLDDRNTFR-------HSVVVPYEPPEVGSDYTTIHYN	223
sp|P56423|P53_MACFA        D-SDGLAPPQHLIRVEGNLRVEYSDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	235
sp|Q9TUB2|P53_PIG          DYSDGLAPPQHLIRVEGNLRAEYLDDRNTFR-------HSVVVPYEPPEVGSDCTTIHYN	228
sp|Q9W678|P53_BARBU        D-GDGLAPAAHLIRVEGNSRALYREDDVNSR-------HSVVVPYEVPQLGSEFTTVLYN	199
sp|P25035|P53_ONCMY        ENNEGPAPRGHLVRVEGNQRSEYMEDGNTLR-------HSVLVPYEPPQVGSECTTVLYN	224
sp|O12946|P53_PLAFE        --EDTAEHRSHLIRLEGSQRALYFEDPHTKR-------QSVTVPYEPPQLGSETTAILLS	210
sp|O57538|P53_XIPHE        --EDLSDNKSHLIRVEGSQLAQYFEDPNTRR-------HSVTVPYERPQLGSEMTTILLS	198
sp|O93379|P53_ICTPU        DSSDGPAPPGHLLRVEGNSRAVYQEDGNTQA-------HSVVVPYEPPQVGSQSTTVLYN	211
sp|P79820|P53_ORYLA        --EDSVEHRSHLIRVEGSQLAQYFEDPYTKR-------QSVTVPYEPPQPGSEMTTILLS	217
sp|Q92143|P53_XIPMA        --EDLSDNKSHLIRVEGSQLAQYFEDPNTRR-------HSVTVPYERPQLGSEMTTILLS	198
sp|Q9W679|P53_TETMU        --EDSAEHRSHLIRMEGSERAQYFEHPHTKR-------QSVTVPYEPPQLGSEFTTILLS	216
                                      :    *           .          *:  * *      . : :  .

sp|P04637|P53_HUMAN        YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEP	295
sp|P10361|P53_RAT          YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEEH	293
sp|P02340|P53_MOUSE        YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEVL	292
sp|Q42578|PER53_ARATH      FSA-----VGLNTNDLVA----------LSGAHTFGRARCGVFN----NRLFNFSGTGNP	222
sp|O09185|P53_CRIGR        YMCNSSCMGGMNRRPILTIITLEDPSGNLLGRNSFEVRICACPGRDRRTEEKNFQKKGEP	295
sp|Q8SPZ3|P53_DELLE        FMCNSSCMGGMNRRPILTIITLEDSNGNLLGRNSFEVRVCACPGRDRRTEEENFHKKGQS	289
sp|Q9TTA1|P53_TUPBE        YMCNSSCMGGMNRRPILTIITLEDSSGKLLGRNSFEVRICACPGRDRRTEEENFRKKGES	295
sp|P61260|P53_MACFU        YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEP	295
sp|P56424|P53_MACMU        YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEP	295
sp|P79892|P53_HORSE        FMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKEEP	246
sp|Q29537|P53_CANLF        YMCNSSCMGGMNRRPILTIITLEDSSGNVLGRNSFEVRVCACPGRDRRTEEENFHKKGEP	283
sp|P56423|P53_MACFA        YMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEP	295
sp|Q9TUB2|P53_PIG          FMCNSSCMGGMNRRPILTIITLEDASGNLLGRNSFEVRVCACPGRDRRTEEENFLKKGQS	288
sp|Q9W678|P53_BARBU        FMCNSSCMGGMNRRPILTIISLETHDGQLLGRRSFEVRVCACPGRDRKTEESNFRKDQET	259
sp|P25035|P53_ONCMY        FMCNSSCMGGMNRRPILTIITLETQEGQLLGRRSFEVRVCACPGRDRKTEEINLKKQQET	284
sp|O12946|P53_PLAFE        FMCNSSCMGGMNRRQILTILTLETPDGLVLGRRCFEVRVCACPGRDRKTDEESSTKTPNG	270
sp|O57538|P53_XIPHE        FMCNSSCMGGMNRRPILTILTLETTEGEVLGRRCFEVRVCACPGRDRKTEEGNLEK--SG	256
sp|O93379|P53_ICTPU        YMCNSSCMGGMNRRPILTIITLETQDGHLLGRRTFEVRVCACPGRDRKTEESNFKKQQEP	271
sp|P79820|P53_ORYLA        YMCNSSCMGGMNRRPILTILTLET-EGLVLGRRCFEVRICACPGRDRKTEEESRQKTQP-	275
sp|Q92143|P53_XIPMA        FMCNSSCMGGMNRRPILTILTLETTEGEVLGRRCFEVRVCACPGRDRKTEEGNLEK--SG	256
sp|Q9W679|P53_TETMU        FMCNSSCMGGMNRRPILTILTLETQEGIVLGRRCFEVRVCACPGRDRKTEETNSTKMQND	276
                           : .      *:* . :::          : *   *    *.  .    .   .       

sp|P04637|P53_HUMAN        HHELPP---GSTKRALPNNTSSSPQP----------KKKPLDGEYFTLQI----------	332
sp|P10361|P53_RAT          CPELPP---GSAKRALPTSTSSSPQQ----------KKKPLDGEYFTLKI----------	330
sp|P02340|P53_MOUSE        CPELPP---GSAKRALPTCTSASPPQ----------KKKPLDGEYFTLKI----------	329
sp|Q42578|PER53_ARATH      DPTLNSTLLSTLQQLCPQNGSAST-----ITNLDLSTPDAFDNNYFANLQSNDGLLQSDQ	277
sp|O09185|P53_CRIGR        CPELPP---KSAKRALPTNTSSSPPP----------KKKTLDGEYFTLKI----------	332
sp|Q8SPZ3|P53_DELLE        CPELPT---GSAKRALPTGTSSSPPQ----------KKKPLDGEYFTLQI----------	326
sp|Q9TTA1|P53_TUPBE        CPKLPT---GSIKRALPTGSSSSPQP----------KKKPLDEEYFTLQI----------	332
sp|P61260|P53_MACFU        CHQLPP---GSTKRALPNNTSSSPQP----------KKKPLDGEYFTLQI----------	332
sp|P56424|P53_MACMU        CHQLPP---GSTKRALPNNTSSSPQP----------KKKPLDGEYFTLQI----------	332
sp|P79892|P53_HORSE        CPEPPP---RSTKRVLSSNTSSSPPQ----------KKKPLDGEYFT-------------	280
sp|Q29537|P53_CANLF        CPEPPP---GSTKRALPPSTSSSPPQ----------KKKPLDGEYFTLQI----------	320
sp|P56423|P53_MACFA        CHQLPP---GSTKRALPNNTSSSPQP----------KKKPLDGEYFTLQI----------	332
sp|Q9TUB2|P53_PIG          CPEPPP---GSTKRALPTSTSSSPVQ----------KKKPLDGEYFTLQI----------	325
sp|Q9W678|P53_BARBU        KTLDKIPSANK--RSLTKDSTSSVPRPEGSK--KAKLSGSSDEEIYTLQV----------	305
sp|P25035|P53_ONCMY        TLETKTKPAQGIKRAMKEASL--PAPQPGASKKTKSSPAVSDDEIYTLQI----------	332
sp|O12946|P53_PLAFE        PKQTKK-------RKQAPS-NSAPHTTTVMKSKSSSSAEEEDKEVFTVLV----------	312
sp|O57538|P53_XIPHE        TKQTKK-------RKSAP----APDTSTAKKSKSASSGEDEDKEIYTLSI----------	295
sp|O93379|P53_ICTPU        KTSGKT---------LTKRSMKDPPSHPEAS--KKSKNSSSDDEIYTLQV----------	310
sp|P79820|P53_ORYLA        ----KK-------RKVTPN-TS----SSKRKKSHSSGEEEDNREVFHFEV----------	309
sp|Q92143|P53_XIPMA        TKQTKK-------RKSAP----APDTSTAKKSKSASSGEDEDKEIYTLSI----------	295
sp|Q9W679|P53_TETMU        AKDAKK-------RKSVP----TPDSTTIKKSKTASSAEEDNNEVYTLQI----------	315
                                                                    : : :              

sp|P04637|P53_HUMAN        -----RGR-----------ERFEMFRELNEALELKDAQAG-K-EPGGSRAHSS---HLKS	371
sp|P10361|P53_RAT          -----RGR-----------ERFEMFRELNEALELKDARAA-E-ESGDSRAHSS---YPKT	369
sp|P02340|P53_MOUSE        -----RGR-----------KRFEMFRELNEALELKDAHAT-E-ESGDSRAHSS---YLKT	368
sp|Q42578|PER53_ARATH      ELFSTTGSSTIAIVTSFASNQTLFFQAFAQSMINMGNISPLTGSNGEIRLDC------KK	331
sp|O09185|P53_CRIGR        -----RGH-----------ERFKMFQELNEALELKDAQAS-K-GSEDNGAHSS---YLKS	371
sp|Q8SPZ3|P53_DELLE        -----RGR-----------ERFEMFRELNEALELKDAQAG-K-EPGESRAHSS---HLKS	365
sp|Q9TTA1|P53_TUPBE        -----RGR-----------ERFEMLREINEALELKDAMAG-K-ESAGSRAHSS---HLKS	371
sp|P61260|P53_MACFU        -----RGR-----------ERFEMFRELNEALELKDAQAG-K-EPAGSRAHSS---HLKS	371
sp|P56424|P53_MACMU        -----RGR-----------ERFEMFRELNEALELKDAQAG-K-EPAGSRAHSS---HLKS	371
sp|P79892|P53_HORSE        ------------------------------------------------------------	280
sp|Q29537|P53_CANLF        -----RGR-----------ERYEMFRNLNEALELKDAQSG-K-EPGGSRAHSS---HLKA	359
sp|P56423|P53_MACFA        -----RGR-----------ERFEMFRELNEALELKDAQAG-K-EPAGSRAHSS---HLKS	371
sp|Q9TUB2|P53_PIG          -----RGR-----------ERFEMFRELNDALELKDAQTA-R-ESGENRAHSS---HLKS	364
sp|Q9W678|P53_BARBU        -----RGK-----------ERYEMLKKINDSLELSDVVPP-S-EMDRYRQKLLTKG--KK	345
sp|P25035|P53_ONCMY        -----RGK-----------EKYEMLKKFNDSLELSELVPV-A-DADKYRQKCLTKRVAKR	374
sp|O12946|P53_PLAFE        -----KGR-----------ERYEIIKKINEAFEGAAEKEK-A-KNK----------VAVK	344
sp|O57538|P53_XIPHE        -----RGR-----------NRYLWFKSLNDGLELMDKTG-----------------PKIK	322
sp|O93379|P53_ICTPU        -----RGK-----------ERYEFLKKINDGLELSDVVPP-A-DQEKYRQKLLSKTCRKE	352
sp|P79820|P53_ORYLA        -----YGR-----------ERYEFLKKINDGLELLEKESK-S-KN--------------K	337
sp|Q92143|P53_XIPMA        -----RGR-----------NRYLWFKSLNDGLELMDKTG-----------------PKIK	322
sp|Q9W679|P53_TETMU        -----RGR-----------KRYEMLKKINDGLDLLENKP----KSK----------ATHR	345
                                                                                       

sp|P04637|P53_HUMAN        KKG--QSTSRHKKLMFKTEGPDSD	393
sp|P10361|P53_RAT          KKG--QSTSRHKKPMIKKVGPDSD	391
sp|P02340|P53_MOUSE        KKG--QSTSRHKKTMVKKVGPDSD	390
sp|Q42578|PER53_ARATH      VNG-----------------S---	335
sp|O09185|P53_CRIGR        KKG--QSASRLKKLMIKREGPDSD	393
sp|Q8SPZ3|P53_DELLE        KKG--QSPSRHKKLMFKREGPDSD	387
sp|Q9TTA1|P53_TUPBE        KKG--QSTSRHRKLMFKTEGPDSD	393
sp|P61260|P53_MACFU        KKG--QSTSRHKKFMFKTEGPDSD	393
sp|P56424|P53_MACMU        KKG--QSTSRHKKFMFKTEGPDSD	393
sp|P79892|P53_HORSE        ------------------------	280
sp|Q29537|P53_CANLF        KKG--QSTSRHKKLMFKREGLDSD	381
sp|P56423|P53_MACFA        KKG--QSTSRHKKFMFKTEGPDSD	393
sp|Q9TUB2|P53_PIG          KKG--QSPSRHKKPMFKREGPDSD	386
sp|Q9W678|P53_BARBU        KDGQTPEPKRGKKLMVKDEKSDSD	369
sp|P25035|P53_ONCMY        DFG--VGPKKRKKLLVKEEKSDSD	396
sp|O12946|P53_PLAFE        QEL--PVPSSGKRLVQRGERSDSD	366
sp|O57538|P53_XIPHE        QEI--PAPSSGKRLLKGGSDSD--	342
sp|O93379|P53_ICTPU        RDGAAGEPKRGKKRLVKEEKCDSD	376
sp|P79820|P53_ORYLA        DSG--MVPSSGKKLKSN-------	352
sp|Q92143|P53_XIPMA        QEI--PAPSSGKRLLKGGSDSD--	342
sp|Q9W679|P53_TETMU        PDG--PIPPSGKRLLHRGEKSDSD	367
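The alignment comes back in blocks of 60 columns, so a common next step is to stitch the blocks back into one gapped string per sequence. Here is a minimal sketch of that, assuming the layout shown above (ID, whitespace, aligned residues, then a running count). parse_clustal and the two toy sequences seqA and seqB are made up for illustration.

```python
def parse_clustal(text):
    """Join the alignment blocks into one gapped string per sequence ID."""
    aligned = {}
    for line in text.splitlines():
        # Skip blank lines, the CLUSTAL header, and the conservation lines
        # (those start with whitespace instead of a sequence ID).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        seq_id, chunk = fields[0], fields[1]
        aligned[seq_id] = aligned.get(seq_id, "") + chunk
    return aligned


# Toy two-block alignment in the same layout as the ClustalO output.
sample = (
    "CLUSTAL O(1.2.4) multiple sequence alignment\n"
    "\n"
    "seqA      MEEP-QSD\t7\n"
    "seqB      MEDSQQSD\t8\n"
    "          **   ***\n"
    "\n"
    "seqA      PSV\t10\n"
    "seqB      MSI\t11\n"
    "           * \n"
)

alignment = parse_clustal(sample)

# Count columns where every sequence has the same residue (gaps excluded).
conserved = sum(
    1 for column in zip(*alignment.values())
    if len(set(column)) == 1 and column[0] != "-"
)
print(alignment)
print("fully conserved columns:", conserved)
```

On the real result you would call parse_clustal(req.text). Every gapped string should have the same length, which is a handy sanity check.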

Final notes: you can modify p53 to any protein of interest. EMBL-EBI has some great resources. I recommend this video to start: https://www.youtube.com/watch?v=-2g3nFhZkzo

Hope you are having as much summer fun as we are!
