Extracting Contacts Into a Tabular Format¶

Lahuta can export contacts to a tabular format for further analysis and visualization. The NeighborPairs class provides a to_frame method that returns the contacts as a pandas DataFrame. The following example extracts the contacts into a DataFrame and saves them to a CSV file.

Example - Extracting Contacts Into a Tabular Format

# Extracting contacts into a DataFrame (ns is an existing NeighborPairs object)
df = ns.to_frame() # (1)!

# Saving the DataFrame to a CSV file
df.to_csv("contacts.csv", index=False) # (2)!

(1) The to_frame method is used to extract contacts into a pandas DataFrame.
(2) The to_csv method is used to save the DataFrame to a CSV file.

The following table shows the first 20 rows of the CSV file generated by the above code:

partner1_resids	partner1_resnames	partner1_names	partner1_indices	partner2_resids	partner2_resnames	partner2_names	partner2_indices	distances
7	TYR	CD2	133	73	PHE	CG	1076	3.96418
7	TYR	CD2	133	73	PHE	CD1	1077	3.6598
7	TYR	CE2	135	73	PHE	CG	1076	3.87264
7	TYR	CE2	135	73	PHE	CD1	1077	3.23808
7	TYR	CE2	135	73	PHE	CE1	1079	3.24311
7	TYR	CE2	135	73	PHE	CZ	1081	3.82791
7	TYR	CZ	136	73	PHE	CD1	1077	3.94162
7	TYR	CZ	136	73	PHE	CE1	1079	3.70295
15	HIS	CD2	250	90	HEC	NB	1190	3.49992
15	HIS	CD2	250	90	HEC	C4B	1194	3.78276
15	HIS	CD2	250	90	HEC	ND	1206	3.87808
15	HIS	CE1	251	90	HEC	NB	1190	3.75112
15	HIS	CE1	251	90	HEC	ND	1206	3.389
15	HIS	CE1	251	90	HEC	C4D	1210	3.88592
15	HIS	NE2	252	90	HEC	NB	1190	2.84992
15	HIS	NE2	252	90	HEC	C1B	1191	3.58843
15	HIS	NE2	252	90	HEC	C4B	1194	3.61449
15	HIS	NE2	252	90	HEC	ND	1206	2.85429
15	HIS	NE2	252	90	HEC	C1D	1207	3.55377
15	HIS	NE2	252	90	HEC	C4D	1210	3.77193
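
As a sketch of the further analysis this format enables, the snippet below reads a few of the rows above back in with pandas, keeps the contacts under a distance cutoff, and summarises them per residue pair. The rows are copied from the table and parsed from an in-memory string rather than from contacts.csv so that the example runs on its own. The cutoff of 3.5 is an arbitrary value chosen for illustration, not a Lahuta default.

```python
import io

import pandas as pd

# A few rows copied from the table above, standing in for contacts.csv.
csv_text = """partner1_resids,partner1_resnames,partner1_names,partner1_indices,partner2_resids,partner2_resnames,partner2_names,partner2_indices,distances
7,TYR,CD2,133,73,PHE,CG,1076,3.96418
7,TYR,CE2,135,73,PHE,CD1,1077,3.23808
15,HIS,NE2,252,90,HEC,NB,1190,2.84992
15,HIS,NE2,252,90,HEC,ND,1206,2.85429
"""
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the closer contacts (illustrative cutoff, not a library default).
close = df[df["distances"] < 3.5]

# Summarise per residue pair: number of atom-atom contacts and shortest distance.
residue_keys = [
    "partner1_resids",
    "partner1_resnames",
    "partner2_resids",
    "partner2_resnames",
]
summary = (
    close.groupby(residue_keys)["distances"]
    .agg(n_contacts="count", min_distance="min")
    .reset_index()
)
print(summary)
```

With a real export, replacing the in-memory string with pd.read_csv("contacts.csv") should give the same column layout, since the file was written with index=False.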

Note

to_frame does not automatically add a contact-type label to each row. This is intentional.

Compact DataFrame¶

Example - Compact DataFrame

from lahuta import Luni

# Extracting contacts into a DataFrame
df = ns.to_frame(df_format="compact") # (1)!

(1) df_format can be either "compact" or "expanded"; "expanded" is the default.

The following table shows the first 20 rows of the compact DataFrame generated by the above code:

partner1	partner2	distances
7-TYR-CD2-133	73-PHE-CG-1076	3.96418
7-TYR-CD2-133	73-PHE-CD1-1077	3.6598
7-TYR-CE2-135	73-PHE-CG-1076	3.87264
7-TYR-CE2-135	73-PHE-CD1-1077	3.23808
7-TYR-CE2-135	73-PHE-CE1-1079	3.24311
7-TYR-CE2-135	73-PHE-CZ-1081	3.82791
7-TYR-CZ-136	73-PHE-CD1-1077	3.94162
7-TYR-CZ-136	73-PHE-CE1-1079	3.70295
15-HIS-CD2-250	90-HEC-NB-1190	3.49992
15-HIS-CD2-250	90-HEC-C4B-1194	3.78276
15-HIS-CD2-250	90-HEC-ND-1206	3.87808
15-HIS-CE1-251	90-HEC-NB-1190	3.75112
15-HIS-CE1-251	90-HEC-ND-1206	3.389
15-HIS-CE1-251	90-HEC-C4D-1210	3.88592
15-HIS-NE2-252	90-HEC-NB-1190	2.84992
15-HIS-NE2-252	90-HEC-C1B-1191	3.58843
15-HIS-NE2-252	90-HEC-C4B-1194	3.61449
15-HIS-NE2-252	90-HEC-ND-1206	2.85429
15-HIS-NE2-252	90-HEC-C1D-1207	3.55377
15-HIS-NE2-252	90-HEC-C4D-1210	3.77193
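
If you only have the compact form and later need the individual fields, they can be recovered with pandas string operations. The sketch below assumes each label has the form resid-resname-name-index joined by hyphens, as in the table above; it would need adjusting for negative residue numbers or names that themselves contain a hyphen. The rows are copied from the table, and the column names mirror the expanded format.

```python
import pandas as pd

# Two rows copied from the compact table above.
compact = pd.DataFrame(
    {
        "partner1": ["7-TYR-CD2-133", "15-HIS-NE2-252"],
        "partner2": ["73-PHE-CG-1076", "90-HEC-NB-1190"],
        "distances": [3.96418, 2.84992],
    }
)

fields = ["resids", "resnames", "names", "indices"]
parts = []
for partner in ("partner1", "partner2"):
    # Split "7-TYR-CD2-133" into four columns, one per field.
    split = compact[partner].str.split("-", expand=True)
    split.columns = [f"{partner}_{field}" for field in fields]
    parts.append(split)

expanded = pd.concat(parts + [compact[["distances"]]], axis=1)

# The split produces strings; convert the numeric fields back to integers.
for partner in ("partner1", "partner2"):
    for field in ("resids", "indices"):
        column = f"{partner}_{field}"
        expanded[column] = expanded[column].astype(int)

print(expanded)
```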

Adding Annotations¶

Some types of contacts, mainly plane-plane contacts, carry additional information. Lahuta tracks this internally, but to display it in the DataFrame you need to set the annotations argument to True. The following example shows how to add annotations to the DataFrame.

Example - Adding Annotations

from lahuta import Luni

# Extracting contacts into a DataFrame
df = ns.to_frame(annotations=True) # (1)!

(1) annotations is set to True to add the annotation columns to the DataFrame.

The following table shows the DataFrame generated by the above code:

partner1_resids	partner1_resnames	partner1_names	partner1_indices	partner2_resids	partner2_resnames	partner2_names	partner2_indices	distances	theta_angles	normal_angles	ring1_atoms	ring2_atoms	contact_labels
7	TYR	CD1	132	73	PHE	CD1	1077	4.74552	61.2032	27.9044	[132, 133, 134, 135, 136, 137]	[1077, 1078, 1079, 1080, 1081, 1082]	EE
15	HIS	ND1	249	90	HEC	C1B	1191	4.43278	41.4592	83.7813	[1191, 1192, 1193, 1194, 1195]	[249, 250, 251, 252, 253]	OE
15	HIS	ND1	249	90	HEC	C1D	1207	4.54436	49.4147	89.0591	[1207, 1208, 1209, 1210, 1211]	[249, 250, 251, 252, 253]	OE
31	TRP	CD1	458	31	TRP	NE1	460	2.18248	89.4365	4.1208	[458, 459, 460, 461, 462]	[460, 462, 463, 464, 465, 466]	EE

The DataFrame now has five additional columns: theta_angles, normal_angles, ring1_atoms, ring2_atoms, and contact_labels. Together they give the angles between the two planes, the atom indices that make up each ring, and a label for the contact type.
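
The annotation columns can be used like any other DataFrame columns. The sketch below rebuilds the four annotated rows from the table by hand (a subset of the columns, so that it runs without Lahuta), counts the contacts per label, and uses the ring atom lists to find rows whose two rings share atoms. The derived column names ring1_size, ring2_size, and shared_atoms are introduced here for illustration and are not produced by to_frame.

```python
import pandas as pd

# Values copied from the annotated table above (subset of columns).
annotated = pd.DataFrame(
    {
        "partner1_resnames": ["TYR", "HIS", "HIS", "TRP"],
        "partner2_resnames": ["PHE", "HEC", "HEC", "TRP"],
        "distances": [4.74552, 4.43278, 4.54436, 2.18248],
        "ring1_atoms": [
            [132, 133, 134, 135, 136, 137],
            [1191, 1192, 1193, 1194, 1195],
            [1207, 1208, 1209, 1210, 1211],
            [458, 459, 460, 461, 462],
        ],
        "ring2_atoms": [
            [1077, 1078, 1079, 1080, 1081, 1082],
            [249, 250, 251, 252, 253],
            [249, 250, 251, 252, 253],
            [460, 462, 463, 464, 465, 466],
        ],
        "contact_labels": ["EE", "OE", "OE", "EE"],
    }
)

# How many contacts carry each label?
label_counts = annotated["contact_labels"].value_counts()

# Ring sizes, derived from the atom index lists.
annotated["ring1_size"] = annotated["ring1_atoms"].apply(len)
annotated["ring2_size"] = annotated["ring2_atoms"].apply(len)

# Atom indices that appear in both rings of a row.
annotated["shared_atoms"] = [
    sorted(set(ring1) & set(ring2))
    for ring1, ring2 in zip(annotated["ring1_atoms"], annotated["ring2_atoms"])
]
shares_atoms = annotated[annotated["shared_atoms"].apply(len) > 0]

print(label_counts)
print(shares_atoms[["partner1_resnames", "partner2_resnames", "shared_atoms"]])
```

In this sample only the TRP row has rings with atoms in common, which is one way to spot contacts reported between two rings of the same residue.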

Note

See the API for plane-plane contacts for more information.

`to_frame` API¶

Convert the NeighborPairs object to a pandas DataFrame.

The method provides two formatting options. The 'compact' format contains two columns for atom indices and one column for distances. The 'expanded' format contains four columns for atom indices (two columns for each atom pair) and one column for distances. If annotations is True, the resulting DataFrame will also include annotation columns.

Parameters:

Name	Type	Description	Default
`df_format`	`str`	The format of the DataFrame. It can be either "compact" or "expanded". Defaults to "expanded".	`'expanded'`
`annotations`	`bool`	Whether to include annotations in the DataFrame. Defaults to False.	`False`

Returns:

Type	Description
`DataFrame`	A pandas DataFrame containing the atom pairs and their distances.

Source code in lahuta/core/neighbors.py

def to_frame(
    self,
    df_format: Literal["compact", "expanded"] = "expanded",
    annotations: bool = False,
) -> pd.DataFrame:
    """Convert the NeighborPairs object to a pandas DataFrame.

    The method provides two formatting options. The 'compact' format contains two columns
    for atom indices and one column for distances. The 'expanded' format contains four columns
    for atom indices (two columns for each atom pair) and one column for distances.
    If `annotations` is True, the resulting DataFrame will also include annotation columns.

    Args:
        df_format (str, optional): The format of the DataFrame. It can be either "compact" or "expanded".
                                    Defaults to "expanded".
        annotations (bool, optional): Whether to include annotations in the DataFrame. Defaults to False.

    Returns:
        A pandas DataFrame containing the atom pairs and their distances.
    """
    if annotations:
        return self._create_df(df_format, self.annotations)

    return self._create_df(df_format)

`add_annotations` API¶

Add annotations to the existing NeighborPairs object.

Parameters:

Name	Type	Description	Default
`annotations`	`dict[str, NDArray[Any]]`	A dictionary containing the annotations to be added.	required

Source code in lahuta/core/neighbors.py

def add_annotations(self, annotations: dict[str, NDArray[Any]]) -> None:
    """Add annotations to the existing NeighborPairs object.

    Args:
        annotations (dict[str, NDArray[Any]]): A dictionary containing the annotations to be added.
    """
    for value in annotations.values():
        assert len(value) == self.pairs.shape[0]

    self._annotations.update(annotations)
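
The length check above means that every annotation array must have one entry per pair. The sketch below illustrates this with ToyPairs, a hypothetical stand-in that copies only the add_annotations logic shown above; it is not Lahuta's NeighborPairs class, and the pair indices and annotation names are made up for the example.

```python
import numpy as np


class ToyPairs:
    """Hypothetical stand-in mirroring only the add_annotations logic above."""

    def __init__(self, pairs):
        self.pairs = np.asarray(pairs)  # shape (n_pairs, 2)
        self._annotations = {}

    def add_annotations(self, annotations):
        # Every annotation array must have one entry per pair.
        for value in annotations.values():
            assert len(value) == self.pairs.shape[0]

        self._annotations.update(annotations)


toy = ToyPairs([[133, 1076], [135, 1077], [252, 1190]])

# Three pairs, three labels: accepted.
toy.add_annotations({"contact_labels": np.array(["a", "b", "c"])})

# Three pairs, two values: rejected before anything is stored.
try:
    toy.add_annotations({"too_short": np.array([1.0, 2.0])})
    rejected = False
except AssertionError:
    rejected = True

print(sorted(toy._annotations), rejected)
```

Because the check is an assert statement, it is skipped when Python runs with the -O flag, so callers should not rely on it as the only validation of their inputs.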
