openfold.np.protein

Protein data type.

Classes

Protein(atom_positions, aatype, atom_mask, ...)

Protein structure representation.

Functions

`add_pdb_headers`(prot, pdb_str)	Add PDB headers to an existing PDB string.
`from_pdb_string`(pdb_str[, chain_id])	Takes a PDB string and constructs a Protein object.
`from_prediction`(features, result[, ...])	Assembles a protein from a prediction.
`from_proteinnet_string`(proteinnet_str)
`get_pdb_headers`(prot[, chain_id])
`ideal_atom_mask`(prot)	Computes an ideal atom mask.
`to_modelcif`(prot)	Converts a Protein instance to a ModelCIF string.
`to_pdb`(prot)	Converts a Protein instance to a PDB string.

class Protein(atom_positions, aatype, atom_mask, residue_index, b_factors, chain_index=None, remark=None, parents=None, parents_chain_index=None)

Protein structure representation.

Parameters:

atom_positions (ndarray)
aatype (ndarray)
atom_mask (ndarray)
residue_index (ndarray)
b_factors (ndarray)
chain_index (ndarray | None)
remark (str | None)
parents (Sequence[str] | None)
parents_chain_index (Sequence[int] | None)

aatype: ndarray

atom_mask: ndarray

atom_positions: ndarray

b_factors: ndarray

chain_index: ndarray | None = None

parents: Sequence[str] | None = None

parents_chain_index: Sequence[int] | None = None

remark: str | None = None

residue_index: ndarray
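
The field list above can be mirrored with a small stand-in dataclass to show which arguments are required and which default to None. This is a sketch only: `ProteinSketch` is a hypothetical name, plain nested lists stand in for the ndarray fields, and the shape comments assume an AlphaFold-style layout (one row per residue, one column per atom type), which this page does not state.

```python
import dataclasses
from typing import Any, Optional, Sequence


@dataclasses.dataclass
class ProteinSketch:
    """Hypothetical stand-in mirroring the documented Protein fields."""

    # Required fields (ndarray in the real class; shapes are assumptions).
    atom_positions: Any  # assumed [num_res, num_atom_types, 3]
    aatype: Any          # assumed [num_res]
    atom_mask: Any       # assumed [num_res, num_atom_types]
    residue_index: Any   # assumed [num_res]
    b_factors: Any       # assumed [num_res, num_atom_types]
    # Optional fields, all defaulting to None as in the signature above.
    chain_index: Optional[Any] = None
    remark: Optional[str] = None
    parents: Optional[Sequence[str]] = None
    parents_chain_index: Optional[Sequence[int]] = None


# One residue with two atom slots, using lists instead of arrays.
prot = ProteinSketch(
    atom_positions=[[[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]],
    aatype=[0],
    atom_mask=[[1.0, 1.0]],
    residue_index=[1],
    b_factors=[[0.0, 0.0]],
    remark="toy example",
)
```

Only the first five fields have to be supplied; the remaining four stay None unless set explicitly.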

add_pdb_headers(prot, pdb_str)

Add PDB headers to an existing PDB string. Useful during multi-chain recycling.

Parameters:

prot (Protein)
pdb_str (str)

Return type:

str

from_pdb_string(pdb_str, chain_id=None)

Takes a PDB string and constructs a Protein object.

WARNING: All non-standard residue types will be converted into UNK. All non-standard atoms will be ignored.

Parameters:

pdb_str (str) – The contents of the PDB file.
chain_id (str | None) – If None, the whole PDB file is parsed. If chain_id is specified (e.g. A), only that chain is parsed.

Returns:

A new Protein parsed from the PDB contents.

Return type:

Protein
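
The chain_id filter and the UNK fallback can be illustrated on raw PDB text. The sketch below is not OpenFold's parser: `list_residues` and `pdb_line` are hypothetical helpers that read the fixed-column ATOM/HETATM layout of the PDB format directly, and they report residue identities only (atoms and coordinates are not handled).

```python
from typing import List, Optional, Tuple

# Three-letter codes of the 20 standard amino acids.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def pdb_line(record: str, serial: int, atom: str, res: str,
             chain: str, res_seq: int) -> str:
    """Build a coordinate record up to the residue number (hypothetical helper)."""
    return f"{record:<6}{serial:>5} {atom:<4} {res:>3} {chain}{res_seq:>4}"


def list_residues(pdb_str: str,
                  chain_id: Optional[str] = None) -> List[Tuple[str, int, str]]:
    """Return (chain, residue number, residue type) triples in file order."""
    residues = {}
    for line in pdb_str.splitlines():
        if line[:6] not in ("ATOM  ", "HETATM"):
            continue
        chain = line[21]
        if chain_id is not None and chain != chain_id:
            continue  # only the requested chain is kept
        res_name = line[17:20].strip()
        res_seq = int(line[22:26])
        # Non-standard residue types collapse to UNK.
        res_type = res_name if res_name in STANDARD_RESIDUES else "UNK"
        residues.setdefault((chain, res_seq), res_type)
    return [(chain, seq, name) for (chain, seq), name in residues.items()]


pdb_str = "\n".join([
    pdb_line("ATOM", 1, " CA", "ALA", "A", 1),
    pdb_line("HETATM", 2, " CA", "MSE", "A", 2),  # selenomethionine
    pdb_line("ATOM", 3, " CA", "GLY", "B", 1),
])

whole_file = list_residues(pdb_str)    # chain_id=None: every chain
chain_b = list_residues(pdb_str, "B")  # only chain B
```

With chain_id left as None, all three residues are returned and the MSE residue is reported as UNK; passing "B" keeps only the glycine in chain B.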

from_prediction(features, result, b_factors=None, remove_leading_feature_dimension=True, remark=None, parents=None, parents_chain_index=None)

Assembles a protein from a prediction.

Parameters:

features (Mapping[str, ndarray]) – Dictionary holding model inputs.
result (Mapping[str, Any]) – Dictionary holding model outputs.
b_factors (ndarray | None) – (Optional) B-factors to use for the protein.
remove_leading_feature_dimension (bool) – Whether to remove the leading dimension of the features values.
chain_index – (Optional) Chain indices for multi-chain predictions. This name is described here but does not appear in the signature above.
remark (str | None) – (Optional) Remark about the prediction
parents (Sequence[str] | None) – (Optional) List of template names
parents_chain_index (Sequence[int] | None)

Returns:

A protein instance.

Return type:

Protein
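
The remove_leading_feature_dimension flag suggests that feature values may arrive with an extra leading axis. A minimal sketch of that step, assuming the axis has length 1 and is dropped by taking element 0: `strip_leading_dim` is a hypothetical helper, the feature keys are examples only, and nested lists stand in for ndarrays.

```python
from typing import Any, Dict, Mapping


def strip_leading_dim(features: Mapping[str, Any],
                      remove: bool = True) -> Dict[str, Any]:
    """Drop the first axis of every feature value when `remove` is True."""
    return {name: (value[0] if remove else value)
            for name, value in features.items()}


# Toy features with a leading axis of length 1 (keys are assumptions).
features = {
    "aatype": [[0, 7, 15]],
    "residue_index": [[0, 1, 2]],
}

squeezed = strip_leading_dim(features)
untouched = strip_leading_dim(features, remove=False)
```

The same indexing works unchanged on numpy arrays, since `value[0]` selects the first entry along the leading axis in both cases.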

from_proteinnet_string(proteinnet_str)

Parameters:

proteinnet_str (str)

Return type:

Protein

get_pdb_headers(prot, chain_id=0)

Parameters:

prot (Protein)
chain_id (int)

Return type:

Sequence[str]

ideal_atom_mask(prot)

Computes an ideal atom mask.

Protein.atom_mask typically is defined according to the atoms that are reported in the PDB. This function computes a mask according to heavy atoms that should be present in the given sequence of amino acids.

Parameters:

prot (Protein) – Protein whose fields are numpy.ndarray objects.

Returns:

An ideal atom mask.

Return type:

ndarray
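
The difference between an observed mask and an ideal one can be shown with a reduced atom vocabulary. The sketch below is an illustration, not the library's lookup table: `ATOM_ORDER`, `HEAVY_ATOMS`, and `ideal_mask` are hypothetical names, and only three residue types are covered.

```python
from typing import Dict, List, Set

# Reduced, illustrative atom vocabulary (the real table is larger).
ATOM_ORDER: List[str] = ["N", "CA", "C", "CB", "O", "OG"]

# Heavy atoms expected for each residue type.
HEAVY_ATOMS: Dict[str, Set[str]] = {
    "GLY": {"N", "CA", "C", "O"},
    "ALA": {"N", "CA", "C", "O", "CB"},
    "SER": {"N", "CA", "C", "O", "CB", "OG"},
}


def ideal_mask(sequence: List[str]) -> List[List[int]]:
    """Mark every heavy atom that should exist for each residue."""
    return [[1 if atom in HEAVY_ATOMS[res] else 0 for atom in ATOM_ORDER]
            for res in sequence]


sequence = ["GLY", "SER"]

# Observed mask: the serine side chain (CB, OG) was not resolved.
observed = [[1, 1, 1, 0, 1, 0],
            [1, 1, 1, 0, 1, 0]]
ideal = ideal_mask(sequence)

# Atoms that should be present but are absent from the observed mask.
missing = [[atom for atom, seen, want in zip(ATOM_ORDER, obs, idl)
            if want and not seen]
           for obs, idl in zip(observed, ideal)]
```

Comparing the two masks in this way identifies atoms that the sequence implies but the structure file did not report.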

to_modelcif(prot)

Converts a Protein instance to a ModelCIF string. Chains with identical modelled coordinates are treated as the same polymer entity. If chains differ in their modelled regions, however, no attempt is made to identify them as a single polymer entity.

Parameters:

prot (Protein) – The protein to convert to ModelCIF.

Returns:

ModelCIF string.

Return type:

str

to_pdb(prot)

Converts a Protein instance to a PDB string.

Parameters:

prot (Protein) – The protein to convert to PDB.

Returns:

PDB string.

Return type:

str
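
Writing a PDB string comes down to emitting fixed-width coordinate records. The sketch below formats a single ATOM line following the column layout of the PDB format; `format_atom_line` is a hypothetical helper rather than a function of this module, and the default occupancy of 1.00 is an assumption.

```python
from typing import Sequence


def format_atom_line(serial: int, atom_name: str, res_name: str, chain_id: str,
                     res_seq: int, pos: Sequence[float], b_factor: float,
                     element: str, occupancy: float = 1.0) -> str:
    """Format one 80-column ATOM record."""
    # For the one-letter elements found in proteins, atom names shorter
    # than four characters start in column 14, hence the leading space.
    name = atom_name if len(atom_name) == 4 else f" {atom_name}"
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4}{'':1}"         # record, serial, atom, altLoc
        f"{res_name:>3} {chain_id:>1}{res_seq:>4}{'':4}"  # residue, chain, number, iCode + gap
        f"{pos[0]:>8.3f}{pos[1]:>8.3f}{pos[2]:>8.3f}"     # x, y, z
        f"{occupancy:>6.2f}{b_factor:>6.2f}{'':10}"       # occupancy, B-factor, gap
        f"{element:>2}{'':2}"                             # element, blank charge
    )


line = format_atom_line(1, "CA", "ALA", "A", 1, (1.0, 2.5, -3.25), 50.0, "C")
```

Each field occupies a fixed column range, so the same slices used to write the record (for example columns 31 to 38 for x) can be used to read it back.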