
Undefined bit score (-2147483648)? #419

Open
Andy-B-123 opened this issue Feb 5, 2025 · 4 comments


@Andy-B-123

Hi, I'm running Foldseek on structures predicted with ESMFold, and for some hits I'm getting a negative/undefined bit score.

I am using the Swiss-Prot database downloaded with Foldseek's 'databases' command. I'm on the latest version, and this happens for only 1 of the ~30,000 structures I've generated. However, the affected hits span a wide range of e-values, and since I normally filter on bit score, that filter would remove potentially real hits. I'm not sure whether the cause is the protein sequence/structure or the search itself. I have attached the .pdb file below.

Thank you for having a look!

$ grep '\-2147483648' pgaptmp_006197.foldseek.out
#query	target	fident	alnlen	mismatch	gapopen	qstart	qend	tstart	tend	evalue	bits
pgaptmp_006197	AF-Q45827-F1-model_v4	0.128	749	513	0	259	1007	9	597	2.579E-13	-2147483648
pgaptmp_006197	AF-P07168-F1-model_v4	0.142	985	693	0	199	1183	21	829	7.470E-20	-2147483648
pgaptmp_006197	AF-P07167-F1-model_v4	0.137	987	703	0	198	1184	20	835	2.069E-17	-2147483648
pgaptmp_006197	AF-A7MRY4-F1-model_v4	0.105	1109	739	0	224	1332	9	835	4.819E-15	-2147483648
pgaptmp_006197	AF-Q9KLK7-F1-model_v4	0.147	1208	719	0	124	1331	2	845	1.331E-25	-2147483648
pgaptmp_006197	AF-P58356-F1-model_v4	0.177	1241	744	0	90	1330	4	908	4.257E-20	-2147483648
pgaptmp_006197	AF-P39453-F1-model_v4	0.185	1145	680	0	91	1235	5	839	9.779E-21	-2147483648
pgaptmp_006197	AF-Q55445-F1-model_v4	0.138	726	534	0	241	861	198	923	3.166E-03	-2147483648
pgaptmp_006197	AF-Q5A599-F1-model_v4	0.337	1240	627	0	93	1332	53	999	9.278E-51	-2147483648
pgaptmp_006197	AF-Q86CZ2-F1-model_v4	0.151	1096	633	0	237	1332	454	1199	1.733E-21	-2147483648
pgaptmp_006197	AF-P70388-F1-model_v4	0.108	514	429	0	16	529	430	911	2.586E+00	-2147483648
pgaptmp_006197	AF-Q54YH4-F1-model_v4	0.125	1377	939	0	259	1332	592	1968	3.866E-19	-2147483648
pgaptmp_006197	AF-Q54YZ9-F1-model_v4	0.369	1316	822	0	17	1332	546	1848	1.624E-63	-2147483648
pgaptmp_006197	AF-Q5DU28-F1-model_v4	0.095	470	386	0	159	585	958	1427	1.137E+00	-2147483648
pgaptmp_006197	AF-Q5AHA0-F1-model_v4	0.144	1316	1031	0	15	1330	1259	2464	3.586E-29	-2147483648
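Until a fixed build is in use, hits like the ones above can be rescued by falling back to the e-value whenever the bit score equals the sentinel. The following is a minimal sketch, not part of Foldseek: the function name `split_hits` and both thresholds (`min_bits`, `max_evalue`) are hypothetical and should be set to whatever cutoffs the pipeline already uses. It assumes the 12-column tab-separated layout shown in the header line above.

```python
import csv

INT32_MIN = -2**31  # -2147483648, the value seen in the bits column above


def split_hits(lines, min_bits=50.0, max_evalue=1e-3):
    """Split tab-separated hit lines into (kept, rescued).

    kept:    rows with a usable bit score >= min_bits
    rescued: rows whose bit score is the INT32_MIN sentinel but whose
             e-value still passes max_evalue
    Both thresholds are illustrative assumptions.
    """
    kept, rescued = [], []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header line
        evalue = float(row[10])  # column 11: evalue
        bits = float(row[11])    # column 12: bits
        if bits == INT32_MIN:
            # Bit score is unusable, so judge the hit by e-value instead.
            if evalue <= max_evalue:
                rescued.append(row)
        elif bits >= min_bits:
            kept.append(row)
    return kept, rescued
```

Calling `split_hits(open("pgaptmp_006197.foldseek.out"))` returns the two lists; rows in `rescued` are worth re-scoring once the bit score is reported correctly.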

Command lines to reproduce:

> ${foldseek_path} version
941cd33ff0771cd2e3f144e3293e22a2b87e9fda

> echo ${foldseek_database}
/datasets/work/ev-agi-apps-db/reference/database/foldseek/foldseek_db_Alphafold_Swiss-Prot/foldseek_db_Alphafold_Swiss-Prot

> ${foldseek_path} easy-search pgaptmp_006197.pdb ${foldseek_database} pgaptmp_006197.foldseek.out ./tmp

pgaptmp_006197.pdb.gz

@milot-mirdita
Member

Interesting, this happens on the x86 server, but doesn't happen on Mac. I'll investigate. Thank you for the bug report.

@milot-mirdita
Member

Oh, it does also happen on Mac, but with a bit score of 0 instead of INT_MIN (-2147483648).
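The two values reported here, -2147483648 on x86 and 0 on Mac, are what the respective hardware float-to-int conversions produce for a NaN input: x86 `CVTTSS2SI` returns the "integer indefinite" value 0x80000000, while AArch64 `FCVTZS` returns 0 for NaN and saturates out-of-range values. The thread does not state the root cause, so treating a NaN bit score as the trigger is an assumption, and assuming the Mac is an ARM machine is another. The sketch below only emulates those documented instruction semantics in Python; the function names are made up for illustration.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def x86_cvtt_to_int32(x: float) -> int:
    """Emulate x86 CVTTSS2SI: NaN or out-of-range input yields INT32_MIN."""
    if math.isnan(x) or math.isinf(x):
        return INT32_MIN
    truncated = math.trunc(x)
    if truncated < INT32_MIN or truncated > INT32_MAX:
        return INT32_MIN
    return truncated


def aarch64_fcvtzs_to_int32(x: float) -> int:
    """Emulate AArch64 FCVTZS: NaN yields 0, out-of-range input saturates."""
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return INT32_MAX if x > 0 else INT32_MIN
    return max(INT32_MIN, min(INT32_MAX, math.trunc(x)))


nan = float("nan")
print(x86_cvtt_to_int32(nan))        # -2147483648
print(aarch64_fcvtzs_to_int32(nan))  # 0
```

In C and C++ such a conversion is undefined behavior, which is why the result can legitimately differ between platforms and why both values signal the same underlying problem.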

@milot-mirdita
Member

I pushed a fix. Thanks a lot for providing a test case. This made fixing the issue very straightforward!

@milot-mirdita
Member

By the way, do you have any idea why the structure you attached looks the way it does? The residues around position 266 look very odd.
