FAQ
Frequently asked questions and commonly encountered issues when running OpenFold.
Setup
When running unit tests (e.g. ./scripts/run_unit_tests.sh), I see an error such as
ImportError: version GLIBCXX_3.4.30 not found
Solution: Make sure that the $LD_LIBRARY_PATH environment variable has been set to include the conda library path, e.g.
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
I see a CUDA mismatch error, e.g.
The detected CUDA version (11.8) mismatches the version that was used to compile
PyTorch (12.1). Please make sure to use the same CUDA versions.
Solution: Ensure that your system’s CUDA driver and toolkit match your intended OpenFold installation (CUDA 11 by default). You can check the CUDA driver version with a command such as
nvidia-smi
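The error above compares a detected CUDA version with the version PyTorch was compiled against. A minimal sketch for printing both values side by side follows. It assumes the detected version comes from the nvcc toolkit found on PATH and that the text printed by nvcc --version contains a "release X.Y" fragment; the helper names are illustrative and not part of OpenFold.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`.

    Assumes the output contains a fragment such as "release 11.8".
    Returns None if no such fragment is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_cuda_version():
    """Return the toolkit version reported by nvcc on PATH, or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def pytorch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch
    except ImportError:
        return None
    if torch.version.cuda is None:  # CPU-only build
        return None
    major, minor = torch.version.cuda.split(".")[:2]
    return int(major), int(minor)
```

Calling toolkit_cuda_version() and pytorch_cuda_version() inside the activated environment shows whether the two versions disagree. Either function returns None when the corresponding component cannot be found.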
I get an error involving
fatal error: cuda_runtime.h: No such file or directory
and/or
ninja: build stopped: subcommand failed.
Solution: Something went wrong while building the custom kernels. Try running install_third_party_dependencies.sh again, or try
python3 setup.py install
from inside the OpenFold folder. Make sure to prepend the conda library path to $LD_LIBRARY_PATH as described above before running this.
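Before rebuilding, it can help to confirm that the conda library directory is actually on LD_LIBRARY_PATH. The following is a minimal sketch of such a check, assuming a Linux-style path list and that CONDA_PREFIX is set by the active conda environment; the function name is hypothetical and not part of OpenFold.

```python
import os


def conda_lib_on_library_path(environ=os.environ):
    """Return True if $CONDA_PREFIX/lib is an entry in LD_LIBRARY_PATH.

    `environ` is any mapping of environment variable names to values;
    it defaults to the environment of the current process.
    """
    conda_prefix = environ.get("CONDA_PREFIX")
    if not conda_prefix:
        return False  # no active conda environment detected
    conda_lib = os.path.normpath(os.path.join(conda_prefix, "lib"))
    entries = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    # Compare normalized paths so trailing slashes do not matter.
    return any(os.path.normpath(entry) == conda_lib for entry in entries if entry)


if __name__ == "__main__":
    print("conda lib on LD_LIBRARY_PATH:", conda_lib_on_library_path())
```

If this prints False, rerun the export command from the first answer in this section and then retry the build.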
Training
My model training is hanging on the data loading step:
Solution: While each system is different, here are a few general suggestions:
- Check your $KMP_AFFINITY environment setting and see if it is suitable for your system.
- Adjust the number of data workers used to prepare data with the --num_workers setting. Increasing the number could help with dataset processing speed. However, too many workers could cause an out-of-memory (OOM) issue.
When I reload my pretrained model weights or checkpoints, I get
RuntimeError: Error(s) in loading state_dict for OpenFoldWrapper: Unexpected key(s) in state_dict:
Solution: This suggests that your checkpoint / model weights are in OpenFold v1 format with outdated model layer names. Convert your weights/checkpoints following this guide.
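To see which layer names differ before converting, the parameter names in the checkpoint can be compared with those the model expects. Below is a minimal sketch using plain sets. The function name and the key names in the example are hypothetical placeholders, not real OpenFold layer names.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Compare two collections of parameter names.

    Returns the names present only in the checkpoint ("unexpected")
    and the names present only in the model ("missing"), each sorted.
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    return {
        "unexpected": sorted(checkpoint_keys - model_keys),
        "missing": sorted(model_keys - checkpoint_keys),
    }


# Hypothetical key names, for illustration only; they are not real
# OpenFold layer names.
old_keys = ["block.old_name.weight", "block.shared.bias"]
new_keys = ["block.new_name.weight", "block.shared.bias"]
report = diff_state_dict_keys(old_keys, new_keys)
print(report["unexpected"])  # names the model does not recognize
print(report["missing"])     # names the checkpoint does not provide
```

With a real model, the second argument would be model.state_dict().keys() and the first would be the keys of the state dict loaded from the checkpoint file. Where that state dict sits inside the checkpoint depends on how it was saved, so treat this as a diagnostic aid rather than a replacement for the conversion guide.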