Atom Type Definitions

Provide string representations of various atom types and categories.

The various types of atoms represented here include metals, standard amino acids, different types of bond acceptors and donors, ionisable atoms, hydrophobic atoms, carbonyl atoms, and aromatic atoms.

Variables
_METALS_STR (str): A string of metal atom types.
_STANDARD_AA_STR (str): A string of standard amino acid atom types.
_HA_ATOM_TYPES (str): A string of hydrogen bond acceptor atom types.
_HD_ATOM_TYPES (str): A string of hydrogen bond donor atom types.
_XA_ATOM_TYPES (str): A string of halogen bond acceptor atom types.
_XD_ATOM_TYPES (str): A string of halogen bond donor atom types.
_WHA_ATOM_TYPES (str): A string of weak hydrogen bond acceptor atom types.
_WHD_ATOM_TYPES (str): A string of weak hydrogen bond donor atom types.
_POS_IONISABLE_ATOM_TYPES (str): A string of positive ionisable atom types.
_NEG_IONISABLE_ATOM_TYPES (str): A string of negative ionisable atom types.
_HYDROPHOBE_ATOM_TYPES (str): A string of hydrophobic atom types.
_CARBONYL_OXYGEN_ATOM_TYPES (str): A string of carbonyl oxygen atom types.
_CARBONYL_CARBON_ATOM_TYPES (str): A string of carbonyl carbon atom types.
_AROMATIC_ATOM_TYPES (str): A string of aromatic atom types.
RESIDUE_SYNONYMS (dict): A dictionary of residue names and their synonyms.
STANDARD_AMINO_ACIDS (set): A set of standard amino acids.

The atom types are initially defined as comma-separated strings. These strings are then processed into sets for easy and efficient access throughout the rest of the library.
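As a rough illustration of the string-to-set idea, the sketch below splits a shortened, hypothetical atom-type string into a set and tests membership. The names `_EXAMPLE_ATOM_TYPES`, `example_atom_types` and `is_listed` are made up for this example. The library's own conversion goes through `parse_atom_types`, which also expands residue synonyms, so treat this as a simplified picture rather than the actual code path.

```python
# Hypothetical, shortened atom-type string (the real strings are much longer).
_EXAMPLE_ATOM_TYPES = "ASPOD1,ASPOD2,GLUOE1,GLUOE2"

# Naive conversion: split on commas and collect the tokens in a set.
example_atom_types = set(_EXAMPLE_ATOM_TYPES.split(","))


def is_listed(resname: str, atom_name: str) -> bool:
    """Return True if the residue/atom pair is in the example set."""
    return resname + atom_name in example_atom_types


print(is_listed("ASP", "OD1"))  # True
print(is_listed("ASP", "CB"))   # False
```

A set gives constant-time membership checks on average, which matters when the same lookup is repeated for every atom in a structure.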

_METALS_STR module-attribute

_METALS_STR = upper()

_HA_ATOM_TYPES module-attribute

_HA_ATOM_TYPES = 'ALAO,ARGO,ASNO,ASPO,CYSO,GLNO,GLUO,GLYO,HISO,ILEO,LEUO,LYSO,METO,PHEO,PROO,SERO,THRO,TRPO,TYRO,VALO,ALAOXT,ARGOXT,ASNOXT,ASPOXT,CYSOXT,GLNOXT,GLUOXT,GLYOXT,HISOXT,ILEOXT,LEUOXT,LYSOXT,METOXT,PHEOXT,PROOXT,SEROXT,THROXT,TRPOXT,TYROXT,VALOXT,ASNOD1,ASNND2,ASPOD1,ASPOD2,GLNOE1,GLNNE2,GLUOE1,GLUOE2,HISND1,HISCE1,HISNE2,HISCD2,METSD,CYSSG,SEROG,THROG1,TYROH'

_HD_ATOM_TYPES module-attribute

_HD_ATOM_TYPES = 'ALAN,ARGN,ASNN,ASPN,CYSN,GLNN,GLUN,GLYN,HISN,ILEN,LEUN,LYSN,METN,PHEN,SERN,THRN,TRPN,TYRN,VALN,ARGNE,ARGNH1,ARGNH2,ASNND2,ASNOD1,CYSSG,GLNNE2,GLNOE1,HISND1,HISCE1,HISNE2,HISCD2,LYSNZ,SEROG,THROG1,TRPNE1,TYROH'

_XA_ATOM_TYPES module-attribute

_XA_ATOM_TYPES = 'ALAO,ARGO,ASNO,ASPO,CYSO,GLNO,GLUO,GLYO,HISO,ILEO,LEUO,LYSO,METO,PHEO,PROO,SERO,THRO,TRPO,TYRO,VALO,ALAOXT,ARGOXT,ASNOXT,ASPOXT,CYSOXT,GLNOXT,GLUOXT,GLYOXT,HISOXT,ILEOXT,LEUOXT,LYSOXT,METOXT,PHEOXT,PROOXT,SEROXT,THROXT,TRPOXT,TYROXT,VALOXT,ASNOD1,ASNND2,ASPOD1,ASPOD2,GLNOE1,GLNNE2,GLUOE1,GLUOE2,HISND1,HISCE1,HISNE2,HISCD2,METSD,CYSSG,SEROG,THROG1,TYROH'

_XD_ATOM_TYPES module-attribute

_XD_ATOM_TYPES = ''

_WHA_ATOM_TYPES module-attribute

_WHA_ATOM_TYPES = 'ALAO,ARGO,ASNO,ASPO,CYSO,GLNO,GLUO,GLYO,HISO,ILEO,LEUO,LYSO,METO,PHEO,PROO,SERO,THRO,TRPO,TYRO,VALO,ALAOXT,ARGOXT,ASNOXT,ASPOXT,CYSOXT,GLNOXT,GLUOXT,GLYOXT,HISOXT,ILEOXT,LEUOXT,LYSOXT,METOXT,PHEOXT,PROOXT,SEROXT,THROXT,TRPOXT,TYROXT,VALOXT,ASNOD1,ASNND2,ASPOD1,ASPOD2,GLNOE1,GLNNE2,GLUOE1,GLUOE2,HISND1,HISCE1,HISNE2,HISCD2,METSD,CYSSG,SEROG,THROG1,TYROH'

_WHD_ATOM_TYPES module-attribute

_WHD_ATOM_TYPES = 'ALACA,ARGCA,ASNCA,ASPCA,CYSCA,GLNCA,GLUCA,GLYCA,HISCA,ILECA,LEUCA,LYSCA,METCA,PHECA,PROCA,SERCA,THRCA,TRPCA,TYRCA,VALCA,ALACB,ARGCB,ARGCG,ARGCD,ASNCB,ASPCB,CYSCB,GLNCB,GLNCG,GLUCB,GLUCG,GLNCB,HISCB,ILECB,ILECG1,ILECD1,ILECG2,LEUCB,LEUCG,LEUCD1,LEUCD2,LYSCB,LYSCG,LYSCD,LYSCE,METCB,METCG,METCE,PHECB,PHECG,PHECD1,PHECD2,PHECE1,PHECE2,PHECZ,PROCB,PROCG,PROCD,SERCB,THRCB,THRCG2,TRPCB,TRPCD1TRPCE3,TRPCZ3,TRPCH2,TRPCZ2,TYRCB,TYRCD1,TYRCD2,TYRCE1,TYRCE2,TRYCB,VALCB,VALCG1,VALCG2'
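A few entries in the string above look unusual, for example `TRPCD1TRPCE3` (possibly a missing comma), `TRYCB` (no residue named `TRY` appears in `RESIDUE_SYNONYMS`) and a repeated `GLNCB`. The sketch below is one possible way to flag such entries. `find_suspicious_entries` and `KNOWN_RESIDUES` are hypothetical names, not part of the library, and the four-character limit on atom names is an assumption taken from the PDB file format.

```python
from collections import Counter

# Assumption: the 20 residue names used as keys of RESIDUE_SYNONYMS.
KNOWN_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_suspicious_entries(atom_types_string: str) -> dict[str, list[str]]:
    """Report tokens that may be typos in a comma-separated atom-type string."""
    # Ignore empty tokens so that an empty string produces an empty report.
    tokens = [t for t in atom_types_string.split(",") if t]
    counts = Counter(tokens)
    return {
        # Residue prefix (first three characters) is not a known residue name.
        "unknown_residue": sorted({t for t in tokens if t[:3] not in KNOWN_RESIDUES}),
        # Atom name longer than four characters (PDB-format assumption).
        "long_atom_name": sorted({t for t in tokens if len(t[3:]) > 4}),
        # Token listed more than once.
        "duplicates": sorted(t for t, n in counts.items() if n > 1),
    }


report = find_suspicious_entries("GLNCB,GLUCG,GLNCB,TRPCD1TRPCE3,TRYCB")
print(report["unknown_residue"])  # ['TRYCB']
print(report["long_atom_name"])   # ['TRPCD1TRPCE3']
print(report["duplicates"])       # ['GLNCB']
```

A flagged entry is only a hint that something may be off; whether it is a real error depends on what the library authors intended.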

_POS_IONISABLE_ATOM_TYPES module-attribute

_POS_IONISABLE_ATOM_TYPES = 'ARGNE,ARGCZ,ARGNH1,ARGNH2,HISCG,HISND1,HISCE1,HISNE2,HISCD2,LYSNZ'

_NEG_IONISABLE_ATOM_TYPES module-attribute

_NEG_IONISABLE_ATOM_TYPES = 'ASPOD1,ASPOD2,GLUOE1,GLUOE2'

_HYDROPHOBE_ATOM_TYPES module-attribute

_HYDROPHOBE_ATOM_TYPES = 'ALACB,ARGCB,ARGCG,ASNCB,ASPCB,CYSCB,GLNCB,GLNCG,GLUCB,GLUCG,GLNCB,HISCB,ILECB,ILECG1,ILECD1,ILECG2,LEUCB,LEUCG,LEUCD1,LEUCD2,LYSCB,LYSCG,LYSCD,METCB,METCG,METSD,METCE,PHECB,PHECG,PHECD1,PHECD2,PHECE1,PHECE2,PHECZ,PROCB,PROCG,THRCG2,TRPCB,TRPCG,TRPCD2,TRPCE3,TRPCZ3,TRPCH2,TRPCZ2,TRYCB,TYRCG,TYRCD1,TYRCD2,TYRCE1,TYRCE2,VALCB,VALCG1,VALCG2'

_CARBONYL_OXYGEN_ATOM_TYPES module-attribute

_CARBONYL_OXYGEN_ATOM_TYPES = 'ALAO,ARGO,ASNO,ASPO,CYSO,GLNO,GLUO,GLYO,HISO,ILEO,LEUO,LYSO,METO,PHEO,PROO,SERO,THRO,TRPO,TYRO,VALO'

_CARBONYL_CARBON_ATOM_TYPES module-attribute

_CARBONYL_CARBON_ATOM_TYPES = 'ALAC,ARGC,ASNC,ASPC,CYSC,GLNC,GLUC,GLYC,HISC,ILEC,LEUC,LYSC,METC,PHEC,PROC,SERC,THRC,TRPC,TYRC,VALC'

_AROMATIC_ATOM_TYPES module-attribute

_AROMATIC_ATOM_TYPES = 'HISCG,HISND1,HISCE1,HISNE2,HISCD2,PHECG,PHECD1,PHECD2,PHECE1,PHECE2,PHECZ,TRPCG,TRPCD1,TRPCD2,TRPNE1,TRPCE2,TRPCE3,TRPCZ2,TRPCZ3,TRPCH2,TYRCG,TYRCD1,TYRCD2,TYRCE1,TYRCE2,TYRCZ'

RESIDUE_SYNONYMS module-attribute

RESIDUE_SYNONYMS = {'HIS': ['HIS', 'HSD', 'HSE', 'HSP', 'HIE', 'HIP', 'HID', 'HIS1', 'HIS2', 'HISA', 'HISB', 'HISD', 'HISE', 'HISH', 'HYP'], 'PHE': ['PHE'], 'TYR': ['TYR'], 'ALA': ['ALA', 'ALAD'], 'ARG': ['ARG', 'ARGN'], 'ASN': ['ASN', 'ASN1'], 'ASP': ['ASP', 'ASPH'], 'CYS': ['CYS', 'CYS1', 'CYS2', 'CYSH', 'CYX', 'CYM'], 'GLN': ['GLN', 'GLH'], 'GLU': ['GLU', 'GLUH'], 'GLY': ['GLY'], 'ILE': ['ILE'], 'LEU': ['LEU'], 'LYS': ['LYS', 'LYSH', 'LYN'], 'MET': ['MET'], 'PRO': ['PRO'], 'SER': ['SER'], 'THR': ['THR'], 'TRP': ['TRP'], 'VAL': ['VAL']}
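One common use for a synonym table like this is mapping a force-field-specific residue name back to its canonical three-letter name. The sketch below inverts a small subset of the dictionary for that purpose. `RESIDUE_SYNONYMS_SUBSET`, `CANONICAL_NAME` and `canonical_residue` are hypothetical names introduced for illustration and are not documented as part of the library.

```python
# Subset of RESIDUE_SYNONYMS, copied from the listing above.
RESIDUE_SYNONYMS_SUBSET = {
    "HIS": ["HIS", "HSD", "HSE", "HSP", "HIE", "HIP", "HID"],
    "CYS": ["CYS", "CYS1", "CYS2", "CYSH", "CYX", "CYM"],
    "LYS": ["LYS", "LYSH", "LYN"],
}

# Hypothetical reverse index: synonym -> canonical residue name.
CANONICAL_NAME = {
    synonym: canonical
    for canonical, synonyms in RESIDUE_SYNONYMS_SUBSET.items()
    for synonym in synonyms
}


def canonical_residue(resname: str):
    """Return the canonical residue name, or None if the name is unknown."""
    return CANONICAL_NAME.get(resname.upper())


print(canonical_residue("hie"))  # HIS
print(canonical_residue("CYX"))  # CYS
print(canonical_residue("XYZ"))  # None
```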

STANDARD_AMINO_ACIDS module-attribute

STANDARD_AMINO_ACIDS = prot_res

module: lahuta.config._atom_type_strings.py

Type: set[str]: A set of standard amino acids. Taken from MDAnalysis.core.selection.ProteinSelection.

parse_atom_types_string

parse_atom_types_string(_atom_types_string)

Parse a comma-separated string of atom types into a dictionary that maps each three-letter residue name to the list of its atom names.

Parameters:
_atom_types_string (str): A string of atom types. Required.

Returns:
dict[str, list[str]]: A dictionary of residue names and atom parts.

Source code in lahuta/config/_atom_type_strings.py
def parse_atom_types_string(_atom_types_string: str) -> dict[str, list[str]]:
    """Parse a string of atom types into a dictionary of residue names and atom parts.

    Args:
        _atom_types_string (str): A string of atom types.

    Returns:
        dict[str, list[str]]: A dictionary of residue names and atom parts.
    """
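    atom_parts: dict[str, list[str]] = {}
    for atom_type in _atom_types_string.split(","):
        # The first three characters name the residue; the remainder is the atom name.
        residue_name = atom_type[:3]
        atom_part = atom_type[3:]

        # Group atom names by residue, keeping them in input order.
        if residue_name not in atom_parts:
            atom_parts[residue_name] = []
        atom_parts[residue_name].append(atom_part)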

    return atom_parts
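To make the grouping concrete, here is a small usage sketch. The function body is repeated so the snippet runs on its own, and the input strings are made-up examples rather than values taken from the library.

```python
def parse_atom_types_string(_atom_types_string: str) -> dict[str, list[str]]:
    """Copy of the function shown above, repeated so the snippet is self-contained."""
    atom_parts: dict[str, list[str]] = {}
    for atom_type in _atom_types_string.split(","):
        residue_name = atom_type[:3]
        atom_part = atom_type[3:]
        if residue_name not in atom_parts:
            atom_parts[residue_name] = []
        atom_parts[residue_name].append(atom_part)
    return atom_parts


grouped = parse_atom_types_string("ASPOD1,ASPOD2,GLUOE1")
print(grouped)  # {'ASP': ['OD1', 'OD2'], 'GLU': ['OE1']}

# An empty string still yields one (empty) token, because "".split(",") == [""].
empty = parse_atom_types_string("")
print(empty)  # {'': ['']}
```

Note that tokens are not stripped, so a string with spaces after the commas (for example `"ASPOD1, ASPOD2"`) would produce a residue key that starts with a space.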

parse_atom_types

parse_atom_types(_atom_types_string)

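Parse a comma-separated string of atom types into a set of atom types, expanded so that every synonym in RESIDUE_SYNONYMS is combined with each atom name listed for its canonical residue.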

Parameters:
_atom_types_string (str): A string of atom types. Required.

Returns:
set[str]: A set of atom types.

Source code in lahuta/config/_atom_type_strings.py
def parse_atom_types(_atom_types_string: str) -> set[str]:
    """Parse a string of atom types into a set of atom types.

    Args:
        _atom_types_string (str): A string of atom types.

    Returns:
        set[str]: A set of atom types.
    """
    res_atoms = parse_atom_types_string(_atom_types_string)

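    atom_types: set[str] = set()
    for residue, synonyms in RESIDUE_SYNONYMS.items():
        # Residues with no entry in the input string contribute nothing.
        atom_parts = res_atoms.get(residue)
        if atom_parts is None:
            continue
        # Pair every synonym of the residue with each of its atom names.
        for synonym in synonyms:
            for atom_part in atom_parts:
                atom_types.add(synonym + atom_part)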

    return atom_types
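The synonym expansion can be seen in a small, self-contained sketch. It uses condensed copies of the two functions and only a two-entry subset of `RESIDUE_SYNONYMS`, so the output is illustrative and does not match what the full library table would produce.

```python
# Two entries copied from RESIDUE_SYNONYMS; the real dictionary has twenty.
RESIDUE_SYNONYMS = {
    "LYS": ["LYS", "LYSH", "LYN"],
    "ASP": ["ASP", "ASPH"],
}


def parse_atom_types_string(_atom_types_string: str) -> dict[str, list[str]]:
    """Condensed copy: group atom names by three-letter residue prefix."""
    atom_parts: dict[str, list[str]] = {}
    for atom_type in _atom_types_string.split(","):
        atom_parts.setdefault(atom_type[:3], []).append(atom_type[3:])
    return atom_parts


def parse_atom_types(_atom_types_string: str) -> set[str]:
    """Condensed copy: combine every residue synonym with each atom name."""
    res_atoms = parse_atom_types_string(_atom_types_string)
    atom_types: set[str] = set()
    for residue, synonyms in RESIDUE_SYNONYMS.items():
        for atom_part in res_atoms.get(residue, []):
            for synonym in synonyms:
                atom_types.add(synonym + atom_part)
    return atom_types


expanded = parse_atom_types("LYSNZ,ASPOD1")
print(sorted(expanded))  # ['ASPHOD1', 'ASPOD1', 'LYNNZ', 'LYSHNZ', 'LYSNZ']

# Tokens whose residue prefix is not a key of RESIDUE_SYNONYMS are dropped.
print(parse_atom_types("TRYCB"))  # set()

# An empty input string produces an empty set.
print(parse_atom_types(""))  # set()
```

Because unknown residue prefixes are skipped without an error, a misspelled entry in one of the atom-type strings would silently vanish from the resulting set.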