Matchers
lahuta.core.matchers
The classes in this module match SMARTS patterns against the atoms of a molecule. The resulting matches are how atom types are assigned to the atoms of a molecule.
Classes:
Name | Description |
---|---|
SmartsMatcherBase | Abstract base class for SMARTS pattern matching. |
SmartsMatcher | Sequential SMARTS pattern matching. |
ParallelSmartsMatcher | Parallel SMARTS pattern matching. |
SmartsMatcherBase
Bases: ABC
A base class for different implementations of SMARTS pattern matching on molecules.
This abstract class needs to be inherited by any class that implements SMARTS pattern matching. The subclass must implement the compute method.
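The contract described above can be illustrated with a minimal sketch. It does not import lahuta; `MatcherBase` and `EmptyMatcher` are hypothetical stand-ins that only mirror the documented shape (an abstract `compute(mol)` method), and the plain dict returned here stands in for the sparse matrix.

```python
from abc import ABC, abstractmethod


class MatcherBase(ABC):
    """Hypothetical stand-in for the documented abstract base class."""

    @abstractmethod
    def compute(self, mol):
        """Subclasses return the atom-type matches for ``mol``."""
        raise NotImplementedError


class EmptyMatcher(MatcherBase):
    """Smallest valid subclass: reports no matches for any molecule."""

    def compute(self, mol):
        # A plain dict stands in for the sparse matrix in this sketch.
        return {}


# The base class cannot be instantiated because compute is abstract.
try:
    MatcherBase()
    base_instantiable = True
except TypeError:
    base_instantiable = False

matches = EmptyMatcher().compute(mol=None)
```

Because `compute` is marked abstract, Python refuses to instantiate the base class directly, so a subclass that forgets to implement `compute` fails at construction time rather than at call time.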
Source code in lahuta/core/matchers.py
compute (abstractmethod)
compute(mol)
Abstract method for SMARTS pattern matching.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | MolType | A molecule object to match patterns on. | required |
Raises:
Type | Description |
---|---|
NotImplementedError | This is an abstract method that needs to be implemented in the subclass. |
Returns:
Type | Description |
---|---|
dok_matrix | A sparse matrix of atom types that match the SMARTS patterns in the given molecule. |
SmartsMatcher
Bases: SmartsMatcherBase
Matches SMARTS patterns to atoms in a molecule.
This class performs sequential SMARTS pattern matching on atoms in a molecule. It inherits from the SmartsMatcherBase abstract base class.
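As a rough illustration of sequential matching, the sketch below visits each atom type and each of its patterns in turn and records hits in a dict-of-keys structure. Everything here is an assumption made for illustration: the molecule is a plain list of element symbols, the patterns are Python predicates rather than SMARTS, `match_sequentially` is a hypothetical helper, and the row/column layout (atoms by atom types) is a guess at how the sparse matrix might be organized.

```python
def match_sequentially(mol, patterns_by_type):
    """Visit atom types one after another and record every matching atom.

    mol: list of element symbols (stand-in for a real molecule object).
    patterns_by_type: dict mapping an atom type name to a list of
        predicates (stand-ins for compiled SMARTS patterns).
    Returns a dict mapping (atom_index, type_index) -> 1.
    """
    type_index = {name: i for i, name in enumerate(patterns_by_type)}
    matrix = {}
    for name, patterns in patterns_by_type.items():
        for pattern in patterns:
            for atom_index, atom in enumerate(mol):
                if pattern(atom):
                    matrix[(atom_index, type_index[name])] = 1
    return matrix


# Illustrative atom types; the names are not taken from lahuta.
patterns = {
    "polar": [lambda a: a == "O", lambda a: a == "N"],
    "carbon": [lambda a: a == "C"],
}
matrix = match_sequentially(["C", "O", "N", "C"], patterns)
```

Only the non-zero entries are stored, which is the same idea a dictionary-of-keys sparse matrix relies on.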
compute
compute(mol)
Perform SMARTS pattern matching on a molecule.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | MolType | A molecule object to match patterns on. | required |
Returns:
Type | Description |
---|---|
dok_matrix | A sparse matrix of atom types that match the SMARTS patterns in the given molecule. |
ParallelSmartsMatcher
Bases: SmartsMatcherBase
Matches SMARTS patterns to atoms in a molecule using multiple threads.
This class performs SMARTS pattern matching on atoms in a molecule using multiple threads for improved performance. It inherits from the SmartsMatcherBase abstract base class.
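One way to spread such matching over threads is sketched below with the standard library's `ThreadPoolExecutor`. This is not the class's actual implementation: `match_one_type` and `match_in_threads` are hypothetical helpers, the molecule is a list of element symbols, and predicates stand in for SMARTS patterns.

```python
from concurrent.futures import ThreadPoolExecutor


def match_one_type(mol, type_index, patterns):
    """Return (atom_index, type_index) pairs for a single atom type."""
    return [
        (atom_index, type_index)
        for pattern in patterns
        for atom_index, atom in enumerate(mol)
        if pattern(atom)
    ]


def match_in_threads(mol, patterns_by_type, max_workers=4):
    """Submit one task per atom type and merge the results afterwards."""
    matrix = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(match_one_type, mol, i, patterns)
            for i, patterns in enumerate(patterns_by_type.values())
        ]
        # Merge in the calling thread so workers never share mutable state.
        for future in futures:
            for key in future.result():
                matrix[key] = 1
    return matrix


# Illustrative atom types; the names are not taken from lahuta.
patterns = {
    "polar": [lambda a: a in {"N", "O"}],
    "carbon": [lambda a: a == "C"],
}
matrix_threads = match_in_threads(["C", "O", "N", "C"], patterns)
```

In CPython, threads speed up CPU-bound work only when the underlying matching code releases the global interpreter lock, so any real gain depends on the library doing the matching.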
precompute_ob_smarts
precompute_ob_smarts()
Precomputes and stores the Open Babel SMARTS patterns for all atom types.
Returns:
Type | Description |
---|---|
dict[str, list[ObSmartPatternType]] | A dictionary with atom type names as keys and lists of precomputed Open Babel SMARTS patterns as values. |
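The shape of that return value, a mapping from atom type name to a list of ready-to-use pattern objects, can be sketched as follows. To keep the example runnable without Open Babel, regular expressions over element symbols stand in for SMARTS patterns. `TYPE_DEFINITIONS` and `precompute_patterns` are hypothetical names, and the atom types are invented for illustration.

```python
import re

# Hypothetical atom type definitions. Each type may have several patterns.
TYPE_DEFINITIONS = {
    "halogen": ["F", "Cl|Br"],
    "chalcogen": ["O|S"],
}


def precompute_patterns(definitions):
    """Compile every pattern once so that matching can reuse the objects."""
    return {
        name: [re.compile(source) for source in sources]
        for name, sources in definitions.items()
    }


compiled = precompute_patterns(TYPE_DEFINITIONS)

# The compiled objects can now be reused for any number of molecules.
is_halogen = any(p.fullmatch("Br") for p in compiled["halogen"])
```

Compiling once and reusing the result avoids repeating the parsing step for every molecule, which is the usual motivation for a precompute step of this kind.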
match_ob_smarts
match_ob_smarts(ob_smart, mol, atypes, atom_type)
Match an Open Babel SMARTS pattern to a molecule.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ob_smart | ObSmartPatternType | An Open Babel SMARTS pattern. | required |
mol | MolType | A molecule object to match the pattern on. | required |
atypes | dict[str, int] | A dictionary of atom types. | required |
atom_type | str | The name of the atom type that the SMARTS pattern represents. | required |
Returns:
Type | Description |
---|---|
list[tuple[Any, int]] | A list of tuples, where each tuple contains the matched atom's index and the corresponding atom type. |
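A small sketch of how the four arguments could combine into that list of tuples is given below. It assumes, without confirmation from the source, that `atypes` maps atom type names to integer codes and that the second tuple element is that code. `match_pattern` is a hypothetical stand-in, and a predicate over element symbols replaces the Open Babel pattern.

```python
def match_pattern(pattern, mol, atypes, atom_type):
    """Return (atom_index, type_code) for every atom the pattern accepts."""
    # Assumption: atypes maps an atom type name to its integer code.
    type_code = atypes[atom_type]
    return [
        (atom_index, type_code)
        for atom_index, atom in enumerate(mol)
        if pattern(atom)
    ]


# Illustrative codes and molecule; none of these values come from lahuta.
atypes = {"carbon": 0, "polar": 1}
pairs = match_pattern(
    lambda atom: atom in {"N", "O"},
    ["C", "O", "N"],
    atypes,
    "polar",
)
```

Returning plain (index, code) pairs keeps each call independent of the others, so results from several patterns can be collected and merged later.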
compute
compute(mol)
Perform SMARTS pattern matching on a molecule using multiple threads.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | MolType | A molecule object to match patterns on. | required |
Returns:
Type | Description |
---|---|
dok_matrix | A sparse matrix of atom types that match the SMARTS patterns in the given molecule. |