openfold.data.msa_pairing
Pairing logic for multimer data pipeline.
Functions
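block_diag(*arrs, pad_value=0.0) | Like scipy.linalg.block_diag but with an optional padding value.
create_paired_features(chains) | Returns the original chains with paired NUM_SEQ features.
deduplicate_unpaired_sequences(np_chains) | Removes unpaired sequences which duplicate a paired sequence.
merge_chain_features(np_chains_list, pair_msa_sequences, max_templates) | Merges features for multiple chains to single FeatureDict.
pad_features(feature, feature_name) | Add a 'padding' row at the end of the features list.
pair_sequences(examples) | Returns indices for paired MSA sequences across chains.
reorder_paired_rows(all_paired_msa_rows_dict) | Creates a list of indices of paired MSA rows across chains.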
- block_diag(*arrs, pad_value=0.0)
Like scipy.linalg.block_diag but with an optional padding value.
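To illustrate what a padded block-diagonal layout looks like, here is a minimal NumPy-only sketch. The helper name `block_diag_with_pad` is hypothetical and this is not the module's own implementation; it only shows the idea of filling the off-diagonal region with a chosen value instead of zeros.

```python
import numpy as np


def block_diag_with_pad(*arrs, pad_value=0.0):
    """Illustrative sketch: place 2-D blocks on the diagonal, pad elsewhere."""
    arrs = [np.atleast_2d(a) for a in arrs]
    total_rows = sum(a.shape[0] for a in arrs)
    total_cols = sum(a.shape[1] for a in arrs)
    # Start from a matrix filled with the padding value.
    out = np.full((total_rows, total_cols), pad_value, dtype=np.result_type(*arrs))
    row = col = 0
    for a in arrs:
        # Copy each block into its diagonal slot, then advance the offsets.
        out[row:row + a.shape[0], col:col + a.shape[1]] = a
        row += a.shape[0]
        col += a.shape[1]
    return out


a = np.ones((2, 2))
b = 2 * np.ones((1, 3))
result = block_diag_with_pad(a, b, pad_value=-1.0)
print(result)
```

With `pad_value=0.0` the result should match what `scipy.linalg.block_diag` returns for the same inputs.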
- create_paired_features(chains)
Returns the original chains with paired NUM_SEQ features.
- deduplicate_unpaired_sequences(np_chains)
Removes unpaired sequences which duplicate a paired sequence.
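The deduplication idea can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the helper `drop_unpaired_duplicates` is hypothetical, and it works on two plain integer-encoded MSA arrays rather than on the per-chain feature dictionaries the real function receives.

```python
import numpy as np


def drop_unpaired_duplicates(paired_msa, unpaired_msa):
    """Illustrative sketch: keep unpaired rows that do not occur in the paired MSA."""
    # Rows are converted to tuples so they can be stored in a set.
    paired_set = {tuple(row) for row in paired_msa}
    keep_rows = [
        i for i, row in enumerate(unpaired_msa) if tuple(row) not in paired_set
    ]
    return unpaired_msa[keep_rows], keep_rows


paired = np.array([[1, 2, 3], [4, 5, 6]])
unpaired = np.array([[1, 2, 3], [7, 8, 9], [4, 5, 6], [0, 0, 1]])
deduped, kept = drop_unpaired_duplicates(paired, unpaired)
print(kept)
```

Returning the kept row indices alongside the filtered array makes it possible to apply the same selection to any other per-row MSA feature.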
- merge_chain_features(np_chains_list, pair_msa_sequences, max_templates)
Merges features for multiple chains to single FeatureDict.
- Parameters:
- Returns:
Single FeatureDict for entire complex.
- Return type:
- pad_features(feature, feature_name)
Add a ‘padding’ row at the end of the features list.
When an alignment is only partial, the padding row is selected as the ‘paired’ row for any chain that has no paired alignment.
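A minimal sketch of appending such a padding row to a 2-D per-sequence feature is shown below. The helper `append_padding_row` and the pad value 21 are assumptions chosen for illustration; the actual function selects the padding according to `feature_name`, which is not modelled here.

```python
import numpy as np


def append_padding_row(feature, pad_value=0):
    """Illustrative sketch: add one constant row after the last sequence."""
    num_res = feature.shape[1]
    # The padding row keeps the feature's dtype and residue dimension.
    padding = np.full((1, num_res), pad_value, dtype=feature.dtype)
    return np.concatenate([feature, padding], axis=0)


msa = np.array([[3, 7, 1], [3, 7, 2]], dtype=np.int32)
padded = append_padding_row(msa, pad_value=21)  # 21 is a hypothetical pad token
print(padded)
```

Because the padding row is always last, a chain without a paired hit can refer to it with index -1.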
- pair_sequences(examples)
Returns indices for paired MSA sequences across chains.
- reorder_paired_rows(all_paired_msa_rows_dict)
Creates a list of indices of paired MSA rows across chains.
- Parameters:
all_paired_msa_rows_dict (Dict[int, ndarray]) – a mapping from the number of paired chains to the paired indices.
- Returns:
a list of lists, each containing indices of paired MSA rows across chains. The paired-index lists are ordered by:
1. the number of chains in the paired alignment, i.e., all-chain pairings come first;
2. e-values.
- Return type:
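The ordering described for reorder_paired_rows can be sketched as below. This is a hedged illustration, not the module's implementation. It assumes that rows within each chain's MSA are already sorted by e-value, so smaller row indices stand in for better hits and the absolute product of the indices serves as a proxy ranking. It also assumes that -1 marks a chain with no paired row. The name `reorder_rows_sketch` is hypothetical.

```python
import numpy as np


def reorder_rows_sketch(paired_rows_by_count):
    """Illustrative sketch: order groups by chain count, then by an index proxy."""
    ordered = []
    # Groups pairing more chains come first.
    for num_chains in sorted(paired_rows_by_count, reverse=True):
        rows = np.asarray(paired_rows_by_count[num_chains])
        # Proxy score (assumption): small row indices mean better hits.
        score = np.abs(np.prod(rows, axis=1))
        order = np.argsort(score, kind="stable")
        ordered.extend(rows[order].tolist())
    return ordered


example = {
    2: np.array([[4, 1, -1], [1, 2, -1]]),  # pairings covering two of three chains
    3: np.array([[3, 3, 2], [1, 1, 1]]),    # pairings covering all three chains
}
result = reorder_rows_sketch(example)
print(result)
```

In this example the all-chain pairings are emitted before the two-chain pairings, and within each group the row with the smaller proxy score comes first.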