RegexPattern
Represents a combinator structure for building more complex regexes.
It might be worth working with this combinator structure lazily so that we can drill down into the expression structure. That way we could define a sort of regex calculus for building up higher-order regexes while still being able to recursively inspect their subparts.
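As a rough illustration of the lazy combinator idea described above, here is a minimal sketch. `PatternNode`, its `find` method, and the format-string convention for `pat` are hypothetical stand-ins invented for this example; they are not the library's `RegexPattern` implementation.

```python
import re

class PatternNode:
    """Hypothetical stand-in for a combinator node (not the library's RegexPattern)."""
    def __init__(self, pat, children=(), name=None, joiner=""):
        self.pat = pat                  # plain regex for a leaf, or a "{}" template for a parent
        self.children = tuple(children)
        self.name = name
        self.joiner = joiner
        self._cache = None              # the regex string is only assembled on demand

    def build(self):
        # Lazily assemble the regex string from the children, caching the result
        if self._cache is None:
            if self.children:
                inner = self.joiner.join(c.build() for c in self.children)
                self._cache = self.pat.format(inner)
            else:
                self._cache = self.pat
        return self._cache

    def find(self, name):
        # Recursively drill down to a named subpattern
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

digit = PatternNode(r"\d", name="digit")
integer = PatternNode("(?:{})+", [digit], name="integer")
pair = PatternNode("({})", [integer, integer], name="pair", joiner=r",\s*")

match = re.match(pair.build(), "12, 34")
```

Because each node keeps references to its children rather than a flattened string, the higher-order pattern `pair` can still be inspected piece by piece via `pair.find("digit")`.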
__init__(self, pat, name=None, children=None, parents=None, dtype=None, repetitions=None, key=None, joiner='', join_function=None, wrapper_function=None, suffix=None, prefix=None, parser=None, handler=None, default_value=None, capturing=None, allow_inner_captures=False):
pat
:str | callable
name
:str
dtype
:Any
repetitions
:Any
key
:Any
joiner
:Any
children
:Any
parents
:Any
wrapper_function
:Any
suffix
:Any
prefix
:Any
parser
:Any
handler
:Any
capturing
:Any
allow_inner_captures
:Any
@property
pat(self):
@property
children(self):
:returns
:tuple[RegexPattern]
@property
child_count(self):
:returns
:int
@property
child_map(self):
Returns a map from names to subregexes for the named regex components
:returns
:Dict[str, RegexPattern]
@property
parents(self):
:returns
:tuple[RegexPattern]
@property
joiner(self):
:returns
:str
@property
join_function(self):
:returns
:function
@property
suffix(self):
:returns
:str | RegexPattern
@property
prefix(self):
:returns
:str | RegexPattern
@property
dtype(self):
Returns the StructuredType for the matched object
The basic thing we do is build the type from the contained child dtypes. The process effectively works like this: if there’s a single object, we use its dtype no matter what; otherwise, we add together our type objects one by one, allowing the StructuredType to handle the calculus.
After we’ve built our raw types, we compute the shape on top of these, using the assigned repetitions object. One thing I realize now I failed to do is to include the effects of sub-repetitions: only a single one will ever get called.
:returns
:None | StructuredType
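The dtype-building procedure described above can be sketched roughly as follows. This is only an illustration under several assumptions: `combine_dtypes` is a hypothetical helper, plain tuples stand in for `StructuredType`, children without a dtype are skipped, and `repetitions` is taken to be either `None` or a single integer.

```python
def combine_dtypes(child_dtypes, repetitions=None):
    """Hypothetical helper returning (dtype, shape); tuples stand in for StructuredType."""
    # Assumption: children that carry no dtype are ignored
    dtypes = [d for d in child_dtypes if d is not None]
    if not dtypes:
        return None
    if len(dtypes) == 1:
        # A single object: use its dtype no matter what
        combined = dtypes[0]
    else:
        # Otherwise add the type objects together one by one
        combined = ()
        for d in dtypes:
            combined = combined + (d,)
    # The shape is computed on top of the raw types from the repetitions object;
    # as in the description above, repetitions of sub-patterns are not considered
    shape = () if repetitions is None else (repetitions,)
    return combined, shape

single = combine_dtypes([float])
triple = combine_dtypes([int, float, str], repetitions=3)
```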
@property
is_repeating(self):
@property
capturing(self):
get_capturing_groups(self, allow_inners=None):
We walk down the tree to find the children with capturing groups in them and then find the outermost RegexPattern for those, unless allow_inners is on, in which case we pull them all
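The tree walk described above might look something like the following sketch. The `Node` dataclass and the `capturing_groups` function are hypothetical and exist only for this example; whether the real method also considers the starting node itself is an assumption made here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical tree node with a capturing flag (not the library's RegexPattern)."""
    name: str
    capturing: bool = False
    children: list = field(default_factory=list)

def capturing_groups(node, allow_inners=False):
    found = []
    if node.capturing:
        found.append(node)
        if not allow_inners:
            # Outermost capturing node found: do not descend any further
            return found
    for child in node.children:
        found.extend(capturing_groups(child, allow_inners))
    return found

inner = Node("b", capturing=True)
tree = Node("root", children=[
    Node("a", capturing=True, children=[inner]),
    Node("c", capturing=True),
])

outer_names = [n.name for n in capturing_groups(tree)]
all_names = [n.name for n in capturing_groups(tree, allow_inners=True)]
```

With `allow_inners` off, the nested node `b` is hidden behind its capturing ancestor `a`; with it on, every capturing node is collected in depth-first order.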
@property
captures(self):
Subtly different from capturing in that it tells us whether we need to use the group in post-processing
:returns
:_
@property
capturing_groups(self):
Returns the capturing children for the pattern
:returns
:_
@property
named_groups(self):
Returns the named children for the pattern
:returns
:_
combine(self, other, *args, **kwargs):
Combines self and other
other
:RegexPattern | str
:returns
:str | callable
wrap(self, *args, **kwargs):
Applies the wrapper function
build(self, joiner=None, prefix=None, suffix=None, recompile=True, no_captures=False, verbose=False):
@property
compiled(self):
add_parent(self, parent):
remove_parent(self, parent):
add_child(self, child):
add_children(self, children):
remove_child(self, child):
insert_child(self, index, child):
invalidate_cache(self):
__copy__(self):
__add__(self, other):
Combines self and other
other
:RegexPattern
:returns
:_
__radd__(self, other):
Combines self and other
other
:RegexPattern
:returns
:_
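To show why both `__add__` and `__radd__` are useful for this kind of combination, here is a minimal sketch with a hypothetical `Pat` class (not the library's `RegexPattern`). It assumes that combining simply concatenates the underlying pattern strings, which the source does not confirm.

```python
import re

class Pat:
    """Hypothetical minimal pattern wrapper used only for this illustration."""
    def __init__(self, pat):
        self.pat = pat

    @staticmethod
    def _as_str(other):
        return other.pat if isinstance(other, Pat) else str(other)

    def __add__(self, other):
        # Handles Pat + Pat and Pat + str
        return Pat(self.pat + self._as_str(other))

    def __radd__(self, other):
        # Handles str + Pat, where str.__add__ returns NotImplemented
        return Pat(self._as_str(other) + self.pat)

number = Pat(r"\d+")
labelled = "x=" + number + Pat(r"\s*;")
```

Without `__radd__`, the expression `"x=" + number` would raise a `TypeError`, since `str` does not know how to add a `Pat`.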
__call__(self, other, *args, name=None, dtype=None, repetitions=None, key=None, joiner=None, join_function=None, wrap_function=None, suffix=None, prefix=None, multiline=None, parser=None, handler=None, capturing=None, default=None, allow_inner_captures=None, **kwargs):
Wraps self around other
other
:RegexPattern
:returns
:_
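A rough sketch of what "wrapping self around other" could mean in practice is given below. The `Pat` class, its `{}` template convention, and the way `name` becomes a named group are all assumptions made for illustration; the real `__call__` accepts many more options than are shown here.

```python
import re

class Pat:
    """Hypothetical wrapper pattern whose pat is a template with one {} slot."""
    def __init__(self, pat):
        self.pat = pat

    def __call__(self, other, name=None):
        inner = other.pat if isinstance(other, Pat) else str(other)
        wrapped = self.pat.format(inner)      # place other inside self
        if name is not None:
            # Assumption: a name turns the result into a named capturing group
            wrapped = "(?P<{}>{})".format(name, wrapped)
        return Pat(wrapped)

Optional = Pat("(?:{})?")
Repeating = Pat("(?:{})+")

sign = Optional("[+-]")
digits = Repeating(Pat(r"\d"), name="digits")
match = re.match(digits.pat, "2024-01")
```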
__repr__(self):
__str__(self):
__getitem__(self, item):
match(self, txt):
search(self, txt):
findall(self, txt):
finditer(self, txt):
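The four method names above mirror those on compiled patterns in Python's standard `re` module. Assuming they delegate to the compiled regex (the source does not say so explicitly), the differences between them can be illustrated with `re` alone:

```python
import re

compiled = re.compile(r"(\d+)\s*(cm|mm)")
txt = "width 12 cm, height 7mm"

at_start = compiled.match(txt)       # only matches at the beginning of the text
anywhere = compiled.search(txt)      # first match at any position
groups = compiled.findall(txt)       # list of group tuples for every match
spans = [m.span() for m in compiled.finditer(txt)]  # match objects, one per hit
```

Here `match` finds nothing because the text does not start with a number, while `search`, `findall`, and `finditer` all locate the two measurements.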