Extracting Contacts Into a Tabular Format¶
Lahuta can extract contacts into a tabular format, which is useful for further analysis and visualization. The NeighborPairs class provides a to_frame method that returns the contacts as a pandas DataFrame. The following example extracts contacts into a DataFrame and then saves them to a CSV file.
Example - Extracting Contacts Into a Tabular Format
# `ns` is an existing NeighborPairs object
# Extracting contacts into a DataFrame
df = ns.to_frame() # (1)!
# Saving the DataFrame to a CSV file
df.to_csv("contacts.csv", index=False) # (2)!
- The to_frame method is used to extract contacts into a pandas DataFrame.
- The to_csv method is used to save the DataFrame to a CSV file.
The following table shows the first 20 rows of the CSV file generated by the above code:
partner1_resids | partner1_resnames | partner1_names | partner1_indices | partner2_resids | partner2_resnames | partner2_names | partner2_indices | distances |
---|---|---|---|---|---|---|---|---|
7 | TYR | CD2 | 133 | 73 | PHE | CG | 1076 | 3.96418 |
7 | TYR | CD2 | 133 | 73 | PHE | CD1 | 1077 | 3.6598 |
7 | TYR | CE2 | 135 | 73 | PHE | CG | 1076 | 3.87264 |
7 | TYR | CE2 | 135 | 73 | PHE | CD1 | 1077 | 3.23808 |
7 | TYR | CE2 | 135 | 73 | PHE | CE1 | 1079 | 3.24311 |
7 | TYR | CE2 | 135 | 73 | PHE | CZ | 1081 | 3.82791 |
7 | TYR | CZ | 136 | 73 | PHE | CD1 | 1077 | 3.94162 |
7 | TYR | CZ | 136 | 73 | PHE | CE1 | 1079 | 3.70295 |
15 | HIS | CD2 | 250 | 90 | HEC | NB | 1190 | 3.49992 |
15 | HIS | CD2 | 250 | 90 | HEC | C4B | 1194 | 3.78276 |
15 | HIS | CD2 | 250 | 90 | HEC | ND | 1206 | 3.87808 |
15 | HIS | CE1 | 251 | 90 | HEC | NB | 1190 | 3.75112 |
15 | HIS | CE1 | 251 | 90 | HEC | ND | 1206 | 3.389 |
15 | HIS | CE1 | 251 | 90 | HEC | C4D | 1210 | 3.88592 |
15 | HIS | NE2 | 252 | 90 | HEC | NB | 1190 | 2.84992 |
15 | HIS | NE2 | 252 | 90 | HEC | C1B | 1191 | 3.58843 |
15 | HIS | NE2 | 252 | 90 | HEC | C4B | 1194 | 3.61449 |
15 | HIS | NE2 | 252 | 90 | HEC | ND | 1206 | 2.85429 |
15 | HIS | NE2 | 252 | 90 | HEC | C1D | 1207 | 3.55377 |
15 | HIS | NE2 | 252 | 90 | HEC | C4D | 1210 | 3.77193 |
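As an illustration of the further analysis mentioned earlier, the atom-level rows can be collapsed into one row per residue pair. This is a minimal sketch that uses only pandas. It assumes the column names shown in the table above and rebuilds a few of its rows by hand instead of calling to_frame; the names `residue_keys`, `summary`, `n_atom_pairs`, and `min_distance` are made up for this example.

```python
import pandas as pd

# A few rows copied from the table above (expanded format).
df = pd.DataFrame(
    {
        "partner1_resids": [7, 7, 7, 15, 15],
        "partner1_resnames": ["TYR", "TYR", "TYR", "HIS", "HIS"],
        "partner1_names": ["CD2", "CD2", "CE2", "CD2", "NE2"],
        "partner1_indices": [133, 133, 135, 250, 252],
        "partner2_resids": [73, 73, 73, 90, 90],
        "partner2_resnames": ["PHE", "PHE", "PHE", "HEC", "HEC"],
        "partner2_names": ["CG", "CD1", "CD1", "NB", "NB"],
        "partner2_indices": [1076, 1077, 1077, 1190, 1190],
        "distances": [3.96418, 3.6598, 3.23808, 3.49992, 2.84992],
    }
)

# Columns that identify a residue pair (hypothetical helper list).
residue_keys = [
    "partner1_resids",
    "partner1_resnames",
    "partner2_resids",
    "partner2_resnames",
]

# One row per residue pair: number of atom pairs and the shortest distance.
summary = (
    df.groupby(residue_keys, sort=False)
    .agg(n_atom_pairs=("distances", "size"), min_distance=("distances", "min"))
    .reset_index()
)
print(summary)
```

With a real DataFrame returned by to_frame, only the grouping and aggregation step would be needed.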
Note
to_frame does not automatically add a label for the type of contact. This is intentional!
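If a label column is needed, for instance when combining DataFrames from several contact computations, it can be added with plain pandas. The sketch below is only one possible approach. The two small frames stand in for DataFrames returned by to_frame, and the column name `contact_type` and the label strings are hypothetical choices, not something Lahuta defines.

```python
import pandas as pd

# Stand-ins for two DataFrames, each obtained from a separate to_frame() call.
tyr_phe = pd.DataFrame(
    {
        "partner1_indices": [133, 135],
        "partner2_indices": [1076, 1077],
        "distances": [3.96418, 3.23808],
    }
)
his_heme = pd.DataFrame(
    {
        "partner1_indices": [252],
        "partner2_indices": [1190],
        "distances": [2.84992],
    }
)

# Tag each frame with a user-chosen label, then stack them into one table.
labeled = pd.concat(
    [
        tyr_phe.assign(contact_type="tyr_phe"),
        his_heme.assign(contact_type="his_heme"),
    ],
    ignore_index=True,
)
print(labeled)
```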
Compact DataFrame¶
Example - Compact DataFrame
from lahuta import Luni
# Extracting contacts into a DataFrame
df = ns.to_frame(df_format="compact") # (1)!
df_format can be either "compact" or "expanded". The latter is the default.
The following table shows the first 20 rows of the compact DataFrame generated by the above code:
partner1 | partner2 | distances |
---|---|---|
7-TYR-CD2-133 | 73-PHE-CG-1076 | 3.96418 |
7-TYR-CD2-133 | 73-PHE-CD1-1077 | 3.6598 |
7-TYR-CE2-135 | 73-PHE-CG-1076 | 3.87264 |
7-TYR-CE2-135 | 73-PHE-CD1-1077 | 3.23808 |
7-TYR-CE2-135 | 73-PHE-CE1-1079 | 3.24311 |
7-TYR-CE2-135 | 73-PHE-CZ-1081 | 3.82791 |
7-TYR-CZ-136 | 73-PHE-CD1-1077 | 3.94162 |
7-TYR-CZ-136 | 73-PHE-CE1-1079 | 3.70295 |
15-HIS-CD2-250 | 90-HEC-NB-1190 | 3.49992 |
15-HIS-CD2-250 | 90-HEC-C4B-1194 | 3.78276 |
15-HIS-CD2-250 | 90-HEC-ND-1206 | 3.87808 |
15-HIS-CE1-251 | 90-HEC-NB-1190 | 3.75112 |
15-HIS-CE1-251 | 90-HEC-ND-1206 | 3.389 |
15-HIS-CE1-251 | 90-HEC-C4D-1210 | 3.88592 |
15-HIS-NE2-252 | 90-HEC-NB-1190 | 2.84992 |
15-HIS-NE2-252 | 90-HEC-C1B-1191 | 3.58843 |
15-HIS-NE2-252 | 90-HEC-C4B-1194 | 3.61449 |
15-HIS-NE2-252 | 90-HEC-ND-1206 | 2.85429 |
15-HIS-NE2-252 | 90-HEC-C1D-1207 | 3.55377 |
15-HIS-NE2-252 | 90-HEC-C4D-1210 | 3.77193 |
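The compact strings can be split back into separate columns when needed. The following sketch assumes the identifiers are joined with "-" in the order residue ID, residue name, atom name, atom index, as the table above suggests, and that atom names contain no hyphen. It rebuilds two rows by hand instead of calling to_frame; the variable names are illustrative.

```python
import pandas as pd

# Two rows copied from the compact table above.
compact = pd.DataFrame(
    {
        "partner1": ["7-TYR-CD2-133", "15-HIS-NE2-252"],
        "partner2": ["73-PHE-CG-1076", "90-HEC-NB-1190"],
        "distances": [3.96418, 2.84992],
    }
)

fields = ["resids", "resnames", "names", "indices"]
parts = []
for partner in ("partner1", "partner2"):
    # Split from the right so a negative residue ID (leading "-") stays intact.
    split = compact[partner].str.rsplit("-", n=3, expand=True)
    split.columns = [f"{partner}_{field}" for field in fields]
    parts.append(split)

expanded = pd.concat(parts + [compact[["distances"]]], axis=1)

# The split yields strings; convert the numeric columns back to integers.
int_cols = [c for c in expanded.columns if c.endswith(("_resids", "_indices"))]
expanded = expanded.astype({c: int for c in int_cols})
print(expanded)
```

Splitting from the right with `n=3` is a deliberate choice: it keeps a leading minus sign attached to the residue ID, which a plain left-to-right split would not.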
Adding Annotations¶
Some types of contacts, mainly plane-plane contacts, carry additional information. Lahuta tracks this information internally, but to display it in the DataFrame you need to pass the annotations argument. The following example shows how to add annotations to the DataFrame.
Example - Adding Annotations
from lahuta import Luni
# Extracting contacts into a DataFrame
df = ns.to_frame(annotations=True) # (1)!
annotations is set to True to add annotations to the DataFrame.
The following table shows the DataFrame generated by the above code:
partner1_resids | partner1_resnames | partner1_names | partner1_indices | partner2_resids | partner2_resnames | partner2_names | partner2_indices | distances | theta_angles | normal_angles | ring1_atoms | ring2_atoms | contact_labels |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7 | TYR | CD1 | 132 | 73 | PHE | CD1 | 1077 | 4.74552 | 61.2032 | 27.9044 | [132, 133, 134, 135, 136, 137] | [1077, 1078, 1079, 1080, 1081, 1082] | EE |
15 | HIS | ND1 | 249 | 90 | HEC | C1B | 1191 | 4.43278 | 41.4592 | 83.7813 | [1191, 1192, 1193, 1194, 1195] | [249, 250, 251, 252, 253] | OE |
15 | HIS | ND1 | 249 | 90 | HEC | C1D | 1207 | 4.54436 | 49.4147 | 89.0591 | [1207, 1208, 1209, 1210, 1211] | [249, 250, 251, 252, 253] | OE |
31 | TRP | CD1 | 458 | 31 | TRP | NE1 | 460 | 2.18248 | 89.4365 | 4.1208 | [458, 459, 460, 461, 462] | [460, 462, 463, 464, 465, 466] | EE |
Notice that we now get additional columns: theta_angles, normal_angles, ring1_atoms, ring2_atoms, and contact_labels. These give the angles between the two planes, the atoms that make up each of the two rings, and a label for the contact type.
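The annotation columns can be used like any other DataFrame column. The sketch below rebuilds a subset of the columns from the table above by hand, counts the contacts per label, and filters on normal_angles. The 30-degree cutoff is an arbitrary value chosen for illustration and is not a Lahuta default; the variable names are likewise made up for this example.

```python
import pandas as pd

# A subset of the columns from the annotated table above.
annotated = pd.DataFrame(
    {
        "partner1_resnames": ["TYR", "HIS", "HIS", "TRP"],
        "partner2_resnames": ["PHE", "HEC", "HEC", "TRP"],
        "distances": [4.74552, 4.43278, 4.54436, 2.18248],
        "theta_angles": [61.2032, 41.4592, 49.4147, 89.4365],
        "normal_angles": [27.9044, 83.7813, 89.0591, 4.1208],
        "contact_labels": ["EE", "OE", "OE", "EE"],
    }
)

# Number of contacts per label.
label_counts = annotated["contact_labels"].value_counts()
print(label_counts)

# Rows whose normal_angles value is below an illustrative 30-degree cutoff.
small_normal_angle = annotated[annotated["normal_angles"] < 30.0]
print(small_normal_angle)
```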
Note
See the API for plane-plane contacts for more information.
to_frame API¶
Convert the NeighborPairs object to a pandas DataFrame.
The method provides two formatting options. The 'compact' format contains two columns, one per atom, that combine each atom's identifiers into a single string, and one column for distances. The 'expanded' format contains eight columns of atom identifiers (residue ID, residue name, atom name, and atom index for each atom of the pair) and one column for distances.
If annotations is True, the resulting DataFrame will also include annotation columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df_format | str | The format of the DataFrame. It can be either "compact" or "expanded". Defaults to "expanded". | 'expanded' |
annotations | bool | Whether to include annotations in the DataFrame. Defaults to False. | False |
Returns:
Type | Description |
---|---|
DataFrame | A pandas DataFrame containing the atom pairs and their distances. |
Source code in lahuta/core/neighbors.py
Adding Annotations¶
Add annotations to the existing NeighborPairs object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
annotations | dict[str, NDArray[Any]] | A dictionary containing the annotations to be added. | required |
Source code in lahuta/core/neighbors.py