openfold.data.data_transforms
Functions
- add_constant_field(protein, key, value)
- add_distillation_flag(protein, distillation)
- atom37_to_frames(protein, eps=1e-08)
- atom37_to_torsion_angles(protein, prefix='')
Convert coordinates to torsion angles.
This function is extremely sensitive to floating-point imprecision and should be run in double precision whenever possible.
- Parameters:
protein (Dict) – Dictionary containing the input features.
- Returns:
The same dictionary, updated with the torsion-angle features.
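Because the torsion computation is sensitive to rounding, one option is to upcast coordinates to float64 before doing any geometry. The sketch below is not OpenFold's implementation: it uses NumPy rather than PyTorch, and `dihedral_sin_cos` is a hypothetical helper that computes one dihedral angle from four points.

```python
import numpy as np


def dihedral_sin_cos(p0, p1, p2, p3):
    """Return (sin, cos) of the dihedral angle defined by four points.

    Hypothetical helper for illustration; inputs are upcast to float64.
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=np.float64) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)          # unit vector along the central bond
    v = b0 - np.dot(b0, b1) * b1          # b0 projected onto the plane normal to b1
    w = b2 - np.dot(b2, b1) * b1          # b2 projected onto the same plane
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    angle = np.arctan2(y, x)
    return np.sin(angle), np.cos(angle)


# Coordinates stored in single precision are upcast inside the helper.
coords32 = np.array(
    [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1]], dtype=np.float32
)
sin_phi, cos_phi = dihedral_sin_cos(*coords32)
```

Returning the angle as a (sin, cos) pair avoids the discontinuity at ±π that a raw angle would have.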
- block_delete_msa(protein, config)
- cast_to_64bit_ints(protein)
- correct_msa_restypes(protein)
Correct MSA restype to have the same order as rc (residue_constants).
- crop_extra_msa(protein, max_extra_msa)
- crop_templates(protein, max_templates)
- curry1(f)
Supply all arguments but the first.
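A decorator of this kind lets a transform be configured up front and applied to the protein dictionary later. The following is a minimal sketch of how such a decorator can be written, not necessarily the library's exact code; `set_field` is a hypothetical transform used only to demonstrate it.

```python
from functools import wraps


def curry1(f):
    """Supply all arguments but the first."""

    @wraps(f)
    def fc(*args, **kwargs):
        # Capture every argument except the first and wait for that one.
        return lambda x: f(x, *args, **kwargs)

    return fc


@curry1
def set_field(protein, key, value):
    # Hypothetical transform: store a constant under `key`.
    protein[key] = value
    return protein


transform = set_field("is_distillation", 1.0)   # configured, not yet applied
result = transform({"aatype": [0, 1, 2]})       # applied to a feature dict
```

Transforms built this way can be collected in a list and applied one after another to the same dictionary.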
- delete_extra_msa(protein)
- fix_templates_aatype(protein)
- get_backbone_frames(protein)
- get_chi_angles(protein)
- get_chi_atom_indices()
Returns atom indices needed to compute chi angles for all residue types.
- Returns:
A tensor of shape [residue_types=21, chis=4, atoms=4]. The residue types are in the order specified in rc.restypes, with the unknown residue type at the end. For chi angles that are not defined for a residue, the atom indices are set to 0 by default.
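The layout of such a table, including the zero padding for undefined chi angles and the trailing unknown residue type, can be sketched with toy data. The tables below are deliberately tiny and hypothetical (two residue types, five atom names); they are not the real residue constants, and NumPy stands in for PyTorch.

```python
import numpy as np

# Toy tables for illustration only, NOT the real residue constants.
toy_atom_order = {"N": 0, "CA": 1, "C": 2, "CB": 3, "OG": 4}
toy_restypes = ["A", "S"]
toy_chi_angles_atoms = {
    "A": [],                              # alanine: no chi angles
    "S": [["N", "CA", "CB", "OG"]],       # serine: one chi angle
}


def toy_chi_atom_indices(max_chis=4):
    rows = []
    for res in toy_restypes:
        chis = [[toy_atom_order[a] for a in chi] for chi in toy_chi_angles_atoms[res]]
        # Pad undefined chi angles with zeros so every residue has max_chis rows.
        chis += [[0, 0, 0, 0]] * (max_chis - len(chis))
        rows.append(chis)
    # Extra all-zero entry for the unknown residue type at the end.
    rows.append([[0, 0, 0, 0]] * max_chis)
    return np.array(rows, dtype=np.int64)


indices = toy_chi_atom_indices()   # shape [residue_types=3, chis=4, atoms=4]
```

Because padded entries also point at index 0, a separate mask is needed downstream to tell real chi angles from padding.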
- make_all_atom_aatype(protein)
- make_atom14_masks(protein)
Construct denser atom positions (14 dimensions instead of 37).
- make_atom14_masks_np(batch)
- make_atom14_positions(protein)
Constructs denser atom positions (14 dimensions instead of 37).
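The conversion from the sparse 37-slot layout to the dense 14-slot layout amounts to a per-residue gather followed by masking. The sketch below illustrates that idea in NumPy; the index map and mask values are made up for the example, and the variable names are assumptions rather than a statement of the library's internals.

```python
import numpy as np

num_res = 2
# Fake atom37 coordinates: [N_res, 37, 3].
atom37_positions = np.arange(num_res * 37 * 3, dtype=np.float64).reshape(num_res, 37, 3)

# Illustrative per-residue map: for each of the 14 dense slots,
# the index of the atom37 slot to read from (values are made up).
residx_atom14_to_atom37 = np.zeros((num_res, 14), dtype=np.int64)
residx_atom14_to_atom37[0, :5] = [0, 1, 2, 4, 3]
residx_atom14_to_atom37[1, :4] = [0, 1, 2, 4]

# 1.0 where the dense slot holds a real atom, 0.0 where it is padding.
atom14_exists = np.zeros((num_res, 14))
atom14_exists[0, :5] = 1.0
atom14_exists[1, :4] = 1.0

# Gather along the atom axis, then zero out the padded slots.
gathered = np.take_along_axis(
    atom37_positions, residx_atom14_to_atom37[..., None], axis=1
)
atom14_positions = gathered * atom14_exists[..., None]   # [N_res, 14, 3]
```

The masking step matters because padded slots default to index 0 and would otherwise carry a copy of the first atom's coordinates.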
- make_fixed_size(protein, shape_schema, msa_cluster_size, extra_msa_size, num_res=0, num_templates=0)
Guess at the MSA and sequence dimension to make fixed size.
- make_hhblits_profile(protein)
Compute the HHblits MSA profile if not already present.
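An MSA profile of this kind can be understood as the per-position frequency of each residue class across the alignment. The following NumPy sketch shows that computation under two assumptions: the MSA is an integer array of shape [N_seq, N_res], and the alphabet has 22 classes.

```python
import numpy as np

num_classes = 22  # assumed alphabet size for this illustration

# Toy MSA: 4 sequences, 3 positions, integer-encoded residues.
msa = np.array(
    [
        [0, 1, 2],
        [0, 1, 3],
        [0, 4, 2],
        [0, 1, 2],
    ]
)

one_hot = np.eye(num_classes)[msa]    # [N_seq, N_res, num_classes]
profile = one_hot.mean(axis=0)        # [N_res, num_classes], rows sum to 1
```

Each row of `profile` is a probability distribution over residue classes at that position.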
- make_masked_msa(protein, config, replace_fraction, seed)
Create data for BERT on raw MSA.
- make_msa_feat(protein)
Create and concatenate MSA features.
- make_msa_mask(protein)
Mask features are all ones, but will later be zero-padded.
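The reason an all-ones mask is still useful becomes clear once padding is applied: padded rows and columns receive zeros, so the mask marks which entries are real. A small NumPy sketch, with shapes chosen arbitrarily for illustration:

```python
import numpy as np

msa = np.zeros((3, 5), dtype=np.int64)            # 3 sequences, 5 residues
msa_mask = np.ones(msa.shape, dtype=np.float32)   # every real entry is valid

# Later, padding to a fixed size (here 4 x 7) fills the new entries with zeros.
padded_mask = np.pad(msa_mask, ((0, 1), (0, 2)))
```

After padding, `padded_mask.sum()` still equals the number of real MSA entries.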
- make_one_hot(x, num_classes)
- make_pseudo_beta(protein, prefix='')
Create pseudo-beta (alpha for glycine) position and mask.
- make_seq_mask(protein)
- make_template_mask(protein)
- nearest_neighbor_clusters(protein, gap_agreement_weight=0.0)
- pseudo_beta_fn(aatype, all_atom_positions, all_atom_mask)
Create pseudo beta features.
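The pseudo-beta rule (use the beta carbon, or the alpha carbon for glycine, which has no beta carbon) can be sketched as a masked selection. The example below uses NumPy, and the index values for glycine, CA, and CB are assumptions made for the illustration; the helper name is hypothetical.

```python
import numpy as np

# Assumed indices for this sketch: glycine residue type, CA and CB atom slots.
GLY, CA, CB = 7, 1, 3


def pseudo_beta_sketch(aatype, all_atom_positions, all_atom_mask):
    """Pick CB per residue, falling back to CA where the residue is glycine."""
    is_gly = aatype == GLY
    positions = np.where(
        is_gly[..., None],
        all_atom_positions[..., CA, :],
        all_atom_positions[..., CB, :],
    )
    mask = np.where(is_gly, all_atom_mask[..., CA], all_atom_mask[..., CB])
    return positions, mask


aatype = np.array([0, GLY])
positions37 = np.arange(2 * 37 * 3, dtype=np.float64).reshape(2, 37, 3)
mask37 = np.ones((2, 37))
mask37[1, CB] = 0.0   # glycine has no CB atom

pb_positions, pb_mask = pseudo_beta_sketch(aatype, positions37, mask37)
```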
- random_crop_to_size(protein, crop_size, max_templates, shape_schema, subsample_templates=False, seed=None)
Crop randomly to crop_size, or keep as is if shorter than that.
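The cropping rule can be illustrated with a simplified sketch that ignores templates and the shape schema. It assumes every feature has the residue dimension as its first axis, which is not true of the real feature dictionary; `random_crop_sketch` is a hypothetical helper written in NumPy.

```python
import numpy as np


def random_crop_sketch(features, crop_size, rng):
    """Crop all features to a random contiguous window of crop_size residues.

    Assumes (for illustration) that axis 0 of every feature is the residue axis.
    """
    num_res = features["aatype"].shape[0]
    if num_res <= crop_size:
        return features   # shorter than the crop: keep as is
    start = int(rng.integers(0, num_res - crop_size + 1))
    return {k: v[start:start + crop_size] for k, v in features.items()}


rng = np.random.default_rng(0)
features = {
    "aatype": np.arange(10),
    "all_atom_positions": np.zeros((10, 37, 3)),
}
cropped = random_crop_sketch(features, crop_size=4, rng=rng)
unchanged = random_crop_sketch(features, crop_size=20, rng=rng)
```

Using one shared start offset for all features keeps them aligned with each other after cropping.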
- randomly_replace_msa_with_unknown(protein, replace_proportion)
Replace a portion of the MSA with ‘X’.
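One way to picture this transform is as a random Bernoulli mask over MSA entries. The NumPy sketch below assumes that 'X' is encoded as 20 and the gap character as 21, and that gaps are left untouched; both the encoding and the helper name are assumptions for illustration.

```python
import numpy as np

X_IDX, GAP_IDX = 20, 21   # assumed integer codes for 'X' and the gap character


def replace_with_unknown_sketch(msa, replace_proportion, rng):
    """Replace roughly replace_proportion of non-gap entries with X_IDX."""
    chosen = rng.random(msa.shape) < replace_proportion
    chosen &= msa != GAP_IDX          # never overwrite gaps
    return np.where(chosen, X_IDX, msa)


rng = np.random.default_rng(0)
msa = np.array([[0, 5, GAP_IDX], [3, GAP_IDX, 7]])
all_replaced = replace_with_unknown_sketch(msa, 1.0, rng)
none_replaced = replace_with_unknown_sketch(msa, 0.0, rng)
```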
- sample_msa(protein, max_seq, keep_extra, seed=None)
Sample MSA randomly; remaining sequences are stored as extra_*.
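The split into sampled and extra sequences can be sketched as a random permutation of row indices. The example below assumes the first row (typically the query sequence) is always kept in first position, which is a common convention but is stated here as an assumption; `sample_rows_sketch` is a hypothetical NumPy helper.

```python
import numpy as np


def sample_rows_sketch(msa, max_seq, rng):
    """Return (selected, extra) rows; row 0 is assumed to always be selected."""
    num_seq = msa.shape[0]
    shuffled = rng.permutation(np.arange(1, num_seq))
    order = np.concatenate([np.array([0]), shuffled])
    keep, extra = order[:max_seq], order[max_seq:]
    return msa[keep], msa[extra]


rng = np.random.default_rng(0)
msa = np.arange(10).reshape(5, 2)      # 5 sequences, 2 residues
selected, extra = sample_rows_sketch(msa, max_seq=3, rng=rng)
```

Every input row ends up in exactly one of the two outputs, so nothing is discarded at this stage.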
- sample_msa_distillation(protein, max_seq)
- select_feat(protein, feature_list)
- shaped_categorical(probs, epsilon=1e-10)
- squeeze_features(protein)
Remove singleton and repeated dimensions in protein features.
- summarize_clusters(protein)
Produce profile and deletion_matrix_mean within each cluster.
- unsorted_segment_sum(data, segment_ids, num_segments)
Computes the sum along segments of a tensor. Similar to tf.unsorted_segment_sum, but only supports 1-D indices.
- Parameters:
data – A tensor whose segments are to be summed.
segment_ids – The 1-D segment indices tensor.
num_segments – The number of segments.
- Returns:
A tensor of the same data type as the data argument.
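The semantics of a segment sum with 1-D indices can be shown with a short NumPy sketch: row i of the input is added into output row segment_ids[i]. This illustrates the behavior only and is not the library's PyTorch implementation; `segment_sum_sketch` is a hypothetical name.

```python
import numpy as np


def segment_sum_sketch(data, segment_ids, num_segments):
    """Sum rows of `data` that share the same segment id (1-D ids only)."""
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    # np.add.at accumulates correctly even when ids repeat.
    np.add.at(out, segment_ids, data)
    return out


data = np.array([[1, 2], [3, 4], [5, 6]])
segment_ids = np.array([0, 2, 0])
summed = segment_sum_sketch(data, segment_ids, num_segments=3)
```

Segments that receive no rows (segment 1 here) stay at zero, and the ids do not need to be sorted.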