Python Backport Compiler Utilities
Utility library for the Python bpc backport compiler.
Currently, the three individual tools (f2format, poseur,
walrus) depend on this repo. The bpc compiler is a
work in progress.
Module contents
- exception bpc_utils.BPCInternalError(message, context)
Bases: RuntimeError
An internal bug occurred in BPC tools.
Initialize BPCInternalError.
- exception bpc_utils.BPCRecoveryError
Bases: RuntimeError
Error during file recovery.
- exception bpc_utils.BPCSyntaxError
Bases: SyntaxError
Syntax error detected when parsing code.
- class bpc_utils.BaseContext(node, config, *, indent_level=0, raw=False)
Bases: ABC
Abstract base class for general conversion context.
Initialize BaseContext.
- Parameters:
node (NodeOrLeaf) – parso AST
config (Config) – conversion configurations
indent_level (int) – current indentation level
raw (bool) – raw processing flag
- final __iadd__(code)
Support of the += operator.
If self._prefix_or_suffix is True, then the code will be appended to self._prefix; else it will be appended to self._suffix.
- Parameters:
code (str) – code string
- Returns:
self
- Return type:
- final __str__()
Returns a stripped version of self._buffer.
- Return type:
- final _process(node)
Recursively process parso AST.
All processing methods for a specific node type are defined as _process_{type}. This method first checks whether such a processing method exists. If so, it calls that method on the node; otherwise it traverses all children of node and applies the same logic to each child.
- Parameters:
node (NodeOrLeaf) – parso AST
- Return type:
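The dispatch rule described for _process can be sketched with a name-based lookup. This is only an illustration: Node and DemoContext are hypothetical stand-ins rather than parso or bpc_utils classes, and appending a leaf's value to the buffer is an assumption about what happens at the bottom of the traversal.

```python
class Node:
    """Hypothetical stand-in for a parso node (not the parso API)."""

    def __init__(self, type, value="", children=None):
        self.type = type
        self.value = value
        self.children = children or []


class DemoContext:
    """Hypothetical context that dispatches on the node type."""

    def __init__(self):
        self.buffer = ""

    def process(self, node):
        # Look for a handler named after the node type, e.g. _process_name.
        handler = getattr(self, "_process_" + node.type, None)
        if handler is not None:
            handler(node)
            return
        # Assumption: a leaf without a handler is copied verbatim.
        if not node.children:
            self.buffer += node.value
            return
        # No handler: apply the same logic to every child.
        for child in node.children:
            self.process(child)

    def _process_name(self, node):
        # Illustrative conversion: upper-case every name leaf.
        self.buffer += node.value.upper()


tree = Node("expr", children=[
    Node("name", "spam"),
    Node("operator", "+"),
    Node("name", "eggs"),
])
ctx = DemoContext()
ctx.process(tree)
print(ctx.buffer)
```

Adding support for another node type then only requires defining one more _process_{type} method on the subclass.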
- final _walk(node)
Start traversing the AST module.
The method traverses all children of node. For each child, it first checks whether the child has the target expression. If so, it sets self._prefix_or_suffix to False and saves the preceding child as self._node_before_expr. It then processes the child with self._process.
- Parameters:
node (NodeOrLeaf) – parso AST
- Return type:
- final static extract_whitespaces(code)
Extract preceding and succeeding whitespaces from the code given.
- abstract has_expr(node)
Check if node has the target expression.
- Parameters:
node (NodeOrLeaf) – parso AST
- Return type:
- Returns:
whether node has the target expression
- final classmethod mangle(cls_name, var_name)
Mangle variable names.
This method mangles variable names as described in Python documentation about mangling and further normalizes the mangled variable name through normalize().
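As a rough illustration of the two steps named here, the sketch below applies Python's private name mangling rule and then NFKC normalization (the form Python itself applies to identifiers under PEP 3131). The function names mirror the documented ones, but the bodies are assumptions and not the bpc_utils implementation.

```python
import unicodedata


def normalize(name):
    # Identifiers are compared in NFKC form (PEP 3131).
    return unicodedata.normalize("NFKC", name)


def mangle(cls_name, var_name):
    # Only names with two leading underscores and without two trailing
    # underscores are subject to private name mangling.
    if not var_name.startswith("__") or var_name.endswith("__"):
        return normalize(var_name)
    # Leading underscores of the class name are dropped; a class name made
    # only of underscores leaves the variable name unchanged.
    stripped = cls_name.lstrip("_")
    if not stripped:
        return normalize(var_name)
    return normalize("_" + stripped + var_name)


print(mangle("Foo", "__bar"))
print(mangle("Foo", "__init__"))
```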
- final static missing_newlines(prefix, suffix, expected, linesep)
Count missing blank lines for code insertion given surrounding code.
- final static normalize(name)
Normalize variable names.
This method normalizes variable names as described in Python documentation about identifiers and PEP 3131.
- final static split_comments(code, linesep)
Separates prefixing comments from code.
This method separates prefixing comments and suffixing code. It is rather useful when inserting code might break shebang and encoding cookies (PEP 263), etc.
- _buffer
Final converted result.
- _indent_level
Current indentation level.
- _indentation
Indentation sequence.
- _node_before_expr
Preceding node with the target expression, i.e. the insertion point.
- _prefix
Code before insertion point.
- _prefix_or_suffix
Flag to indicate whether buffer is now self._prefix.
- _root
Root node given by the node parameter.
- _suffix
Code after insertion point.
- _uuid_gen
UUID generator.
- config
Internal configurations.
- property string
Returns conversion buffer (self._buffer).
- class bpc_utils.Config(**kwargs)
Bases: MutableMapping[str, object]
Configuration namespace.
This class is inspired by argparse.Namespace and stores internal attributes and/or configuration variables.
>>> config = Config(foo='var', bar=True)
>>> config.foo
'var'
>>> config['bar']
True
>>> config.bar = 'boo'
>>> del config['foo']
>>> config
Config(bar='boo')
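One way to get this dual attribute and item access is to back a MutableMapping with the instance __dict__. The class below is a minimal sketch under that assumption; DemoConfig is a hypothetical name and the real Config may be implemented differently.

```python
from collections.abc import MutableMapping


class DemoConfig(MutableMapping):
    """Hypothetical namespace whose items live in the instance __dict__."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __getitem__(self, key):
        return self.__dict__[key]

    def __setitem__(self, key, value):
        self.__dict__[key] = value

    def __delitem__(self, key):
        del self.__dict__[key]

    def __iter__(self):
        return iter(self.__dict__)

    def __len__(self):
        return len(self.__dict__)

    def __repr__(self):
        # Sort the keys so the representation is stable.
        items = ", ".join("%s=%r" % (k, self.__dict__[k]) for k in sorted(self.__dict__))
        return "%s(%s)" % (type(self).__name__, items)


config = DemoConfig(foo='var', bar=True)
config.bar = 'boo'   # attribute assignment updates the mapping
del config['foo']    # item deletion removes the attribute
print(config)
```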
- class bpc_utils.Placeholder(name)
Bases: object
Placeholder for string interpolation.
Placeholder objects can be concatenated with str, other Placeholder objects and StringInterpolation objects via the '+' operator.
Placeholder objects should be regarded as immutable. Please do not modify the _name internal attribute. Build new objects instead.
Initialize Placeholder.
- property name
Returns the name of this placeholder.
- class bpc_utils.StringInterpolation(*args)
Bases: object
A string with placeholders to be filled in.
This looks like an object-oriented format string, but it makes sure that string literals are always interpreted literally (so there is no need to escape them manually). The boundaries between string literals and placeholders are very clear. Filling in a placeholder will never inject a new placeholder, which protects string integrity for multiple-round interpolation.
>>> s1 = '%(injected)s'
>>> s2 = 'hello'
>>> s = StringInterpolation('prefix ', Placeholder('q1'), ' infix ', Placeholder('q2'), ' suffix')
>>> str(s % {'q1': s1} % {'q2': s2})
'prefix %(injected)s infix hello suffix'
(This can be regarded as an improved version of string.Template.safe_substitute().)
Multiple-round interpolation is tricky to do with a traditional format string. In order to do things correctly and avoid format string injection vulnerabilities, you need to perform escapes very carefully.
>>> fs = 'prefix %(q1)s infix %(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'q2'
>>> fs = 'prefix %(q1)s infix %%(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'injected'
>>> fs % {'q1': s1.replace('%', '%%')} % {'q2': s2}
'prefix %(injected)s infix hello suffix'
StringInterpolation objects can be concatenated with str, Placeholder objects and other StringInterpolation objects via the '+' operator.
StringInterpolation objects should be regarded as immutable. Please do not modify the _literals and _placeholders internal attributes. Build new objects instead.
Initialize StringInterpolation.
args will be concatenated to construct a StringInterpolation object.
>>> StringInterpolation('prefix', Placeholder('data'), 'suffix')
StringInterpolation('prefix', Placeholder('data'), 'suffix')
- Parameters:
args (Union[str, Placeholder, StringInterpolation]) – the components to construct a StringInterpolation object
- __mod__(substitutions)
Substitute the placeholders in this StringInterpolation object with string values (if possible) according to the substitutions mapping.
>>> StringInterpolation('prefix ', Placeholder('data'), ' suffix') % {'data': 'hello'}
StringInterpolation('prefix hello suffix')
- Parameters:
substitutions (Mapping[str, object]) – a mapping from placeholder names to the values to be filled in; all values are converted into str
- Return type:
- Returns:
a new StringInterpolation object with as many placeholders substituted as possible
- __str__()
Returns the fully-substituted string interpolation result.
>>> str(StringInterpolation('prefix hello suffix'))
'prefix hello suffix'
- Return type:
- Returns:
the fully-substituted string interpolation result
- Raises:
ValueError – if there are still unsubstituted placeholders in this StringInterpolation object
- classmethod from_components(literals, placeholders)
Construct a StringInterpolation object from literals and placeholders components. This method is more efficient than the StringInterpolation() constructor, but it is mainly intended for internal use.
>>> StringInterpolation.from_components(
...     ('prefix', 'infix', 'suffix'),
...     (Placeholder('data1'), Placeholder('data2'))
... )
StringInterpolation('prefix', Placeholder('data1'), 'infix', Placeholder('data2'), 'suffix')
- Parameters:
placeholders (Iterable[Placeholder]) – the Placeholder components in order
- Return type:
- Returns:
the constructed StringInterpolation object
- Raises:
TypeError – if literals is str; if literals contains non-str values; if placeholders contains non-Placeholder values
ValueError – if the length of literals is not exactly one more than the length of placeholders
- iter_components()
Generator to iterate all components of this StringInterpolation object in order.
>>> list(StringInterpolation('prefix', Placeholder('data'), 'suffix').iter_components())
['prefix', Placeholder('data'), 'suffix']
- Yields:
the components of this StringInterpolation object in order
- property literals
Returns the literal components in this StringInterpolation object.
- property placeholders
Returns the Placeholder components in this StringInterpolation object.
- property result
Alias of StringInterpolation.__str__() to get the fully-substituted string interpolation result.
>>> StringInterpolation('prefix hello suffix').result
'prefix hello suffix'
- class bpc_utils.UUID4Generator(dash=True)
Bases: object
UUID 4 generator wrapper to prevent UUID collisions.
Constructor of UUID 4 generator wrapper.
- Parameters:
dash (bool) – whether the generated UUID string has dashes or not
- bpc_utils.TaskLock()
Function that returns a lock for possibly concurrent tasks.
- Return type:
- Returns:
a lock for possibly concurrent tasks
- bpc_utils.detect_encoding(code)
Detect encoding of Python source code as specified in PEP 263.
- Parameters:
code (bytes) – the code to detect encoding
- Return type:
- Returns:
the detected encoding, or the default encoding (utf-8)
- Raises:
SyntaxError – if both a BOM and a cookie are present, but disagree
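The standard library offers comparable PEP 263 detection through tokenize.detect_encoding(), which the sketch below wraps. sniff_encoding is a hypothetical helper, and the exact names returned by bpc_utils.detect_encoding may differ from what tokenize reports (for instance, tokenize normalizes latin-1 to iso-8859-1 and reports utf-8-sig when a BOM is present).

```python
import io
import tokenize


def sniff_encoding(code):
    """Return the source encoding of ``code`` (bytes) per PEP 263."""
    # tokenize.detect_encoding() expects a readline callable over bytes and
    # returns (encoding, lines_consumed).
    encoding, _ = tokenize.detect_encoding(io.BytesIO(code).readline)
    return encoding


print(sniff_encoding(b"print('hi')\n"))                 # no cookie: default
print(sniff_encoding(b"# -*- coding: latin-1 -*-\n"))   # explicit cookie

try:
    # A UTF-8 BOM followed by a conflicting cookie is rejected.
    sniff_encoding(b"\xef\xbb\xbf# coding: latin-1\n")
except SyntaxError as error:
    print("rejected:", error)
```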
- bpc_utils.detect_files(files)
Get a list of Python files to be processed according to user input.
This will perform glob expansion on Windows, make all paths absolute, resolve symbolic links and remove duplicates.
- Parameters:
files (Iterable[str]) – a list of files and directories to process (usually provided by users on command-line)
- Return type:
- Returns:
a list of Python files to be processed
See also
See expand_glob_iter() for more information.
- bpc_utils.detect_indentation(code)
Detect indentation of Python source code.
- Parameters:
code (Union[str, bytes, TextIO, NodeOrLeaf]) – the code to detect indentation
- Return type:
- Returns:
the detected indentation sequence
- Raises:
TokenError – when the source code fails to tokenize in certain cases; see the documentation of TokenError for more details
Notes
In case of mixed indentation, the result is decided by voting on the number of occurrences of each indentation value (spaces and tabs).
When there is a tie between spaces and tabs, 4 spaces are preferred, following PEP 8.
- bpc_utils.detect_linesep(code)
Detect linesep of Python source code.
- Parameters:
code (Union[str, bytes, TextIO, NodeOrLeaf]) – the code to detect linesep
- Returns:
the detected linesep (one of '\n', '\r\n' and '\r')
- Return type:
Notes
In case of mixed linesep, the result is decided by voting on the number of occurrences of each linesep value.
When there is a tie, LF is preferred to CRLF, and CRLF is preferred to CR.
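The voting and tie-breaking rule in these notes can be sketched in a few lines. vote_linesep is a hypothetical helper that only accepts str input; how bpc_utils reads bytes, file objects or parso nodes is not modelled here.

```python
import re
from collections import Counter

# Preference order used to break ties: LF, then CRLF, then CR.
_PREFERENCE = ["\n", "\r\n", "\r"]


def vote_linesep(code):
    """Pick the most frequent line separator in ``code``."""
    # The alternation tries '\r\n' first, so a CRLF pair is never counted
    # as a separate CR and LF.
    counts = Counter(re.findall(r"\r\n|\r|\n", code))
    # Higher count wins; on equal counts the earlier preference wins.
    return max(_PREFERENCE, key=lambda sep: (counts[sep], -_PREFERENCE.index(sep)))


print(repr(vote_linesep("a\r\nb\nc\r\nd\n")))  # tie between CRLF and LF
print(repr(vote_linesep("a\rb\r\nc\r")))       # CR is the majority
```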
- bpc_utils.first_non_none(*args)
Return the first non-None value from a list of values.
- Parameters:
*args – variable length argument list
If one positional argument is provided, it should be an iterable of the values.
If two or more positional arguments are provided, then the value list is the positional argument list.
- Returns:
the first non-None value; if all values are None or the sequence is empty, return None
- Raises:
TypeError – if no arguments provided
- bpc_utils.first_truthy(*args)
Return the first truthy value from a list of values.
- Parameters:
*args – variable length argument list
If one positional argument is provided, it should be an iterable of the values.
If two or more positional arguments are provided, then the value list is the positional argument list.
- Returns:
the first truthy value; if no truthy value is found or the sequence is empty, return None
- Raises:
TypeError – if no arguments provided
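The argument convention shared by first_non_none() and first_truthy() (one iterable, or two or more separate values) can be sketched as follows. The _sketch functions and the _values helper are hypothetical names and only follow the behaviour documented above.

```python
def _values(args):
    # Hypothetical helper implementing the documented argument convention.
    if not args:
        raise TypeError("no arguments provided")
    if len(args) == 1:
        return args[0]  # a single argument is treated as an iterable of values
    return args         # otherwise the arguments themselves are the values


def first_truthy_sketch(*args):
    return next((value for value in _values(args) if value), None)


def first_non_none_sketch(*args):
    return next((value for value in _values(args) if value is not None), None)


print(first_truthy_sketch(0, "", "x", "y"))
print(first_non_none_sketch([None, 0, 1]))
```

The second call shows the difference between the two functions: 0 is falsy but not None, so first_non_none_sketch returns it.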
- bpc_utils.get_parso_grammar_versions(minimum=None)
Get Python versions that parso supports to parse grammar.
- bpc_utils.map_tasks(func, iterable, posargs=None, kwargs=None, *, processes=None, chunksize=None)
Execute tasks in parallel if multiprocessing is available, otherwise execute them sequentially.
- Parameters:
func (Callable[..., TypeVar(T)]) – the task function to execute
posargs (Optional[Iterable[object]]) – additional positional arguments to pass to func
kwargs (Optional[Mapping[str, object]]) – keyword arguments to pass to func
processes (Optional[int]) – the number of worker processes (default: auto determine)
- Return type:
- Returns:
the return values of the task function applied on the input items and additional arguments
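A minimal sketch of the parallel-or-sequential fallback is shown below. It rests on several assumptions that the documentation above does not confirm: each item is passed as the first positional argument followed by posargs and kwargs, processes=1 runs sequentially, and map_tasks_sketch, _apply and scale are hypothetical names.

```python
import functools

try:
    import multiprocessing as mp
except ImportError:  # multiprocessing may be unavailable on some platforms
    mp = None


def _apply(func, posargs, kwargs, item):
    # Assumed calling convention: item first, then the extra arguments.
    return func(item, *posargs, **kwargs)


def map_tasks_sketch(func, iterable, posargs=None, kwargs=None, *,
                     processes=None, chunksize=None):
    task = functools.partial(_apply, func, tuple(posargs or ()), dict(kwargs or {}))
    # Assumption: run sequentially without multiprocessing or with one worker.
    if mp is None or processes == 1:
        return [task(item) for item in iterable]
    # func must be picklable (e.g. defined at module level) for the pool.
    with mp.Pool(processes) as pool:
        return pool.map(task, iterable, chunksize)


def scale(value, factor, *, offset=0):
    return value * factor + offset


results = map_tasks_sketch(scale, [1, 2, 3], posargs=[10],
                           kwargs={'offset': 1}, processes=1)
print(results)
```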
- bpc_utils.parse_boolean_state(s)
Parse a boolean state from a string representation.
These values are regarded as True: '1', 'yes', 'y', 'true', 'on'
These values are regarded as False: '0', 'no', 'n', 'false', 'off'
Value matching is case insensitive.
- Parameters:
s (Optional[str]) – string representation of a boolean state
- Return type:
- Returns:
- Raises:
ValueError – if s is an invalid boolean state value
See also
See _boolean_state_lookup for default lookup mapping values.
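The lookup described above amounts to a case-insensitive dictionary search. The sketch below uses the documented values; returning None for a None input is an assumption (the Returns field is not shown here), and parse_boolean_state_sketch is a hypothetical name.

```python
_BOOLEAN_STATES = {
    '1': True, 'yes': True, 'y': True, 'true': True, 'on': True,
    '0': False, 'no': False, 'n': False, 'false': False, 'off': False,
}


def parse_boolean_state_sketch(s):
    # Assumption: a missing value is passed through as None.
    if s is None:
        return None
    try:
        return _BOOLEAN_STATES[s.lower()]  # case-insensitive matching
    except KeyError:
        raise ValueError("invalid boolean state value %r" % s) from None


print(parse_boolean_state_sketch('Yes'))
print(parse_boolean_state_sketch('OFF'))
```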
- bpc_utils.parse_indentation(s)
Parse indentation from a string representation.
If an integer or a string representing a positive integer n is specified, then indentation is n spaces.
If 't' or 'tab' is specified, then indentation is tab.
If '\t' (the tab character itself) or a string consisting only of the space character (U+0020) is specified, it is returned directly.
Value matching is case insensitive.
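The three rules above can be sketched as a short chain of checks. parse_indentation_sketch is a hypothetical name, and raising ValueError for anything else is an assumption, since the error behaviour is not documented in this entry.

```python
def parse_indentation_sketch(s):
    if isinstance(s, int):
        if s <= 0:
            raise ValueError("indentation width must be positive")
        return " " * s
    # A literal tab or a run of spaces is returned unchanged.
    if s == "\t" or (s and set(s) == {" "}):
        return s
    if s.lower() in ("t", "tab"):  # case-insensitive keywords
        return "\t"
    if s.isdecimal() and int(s) > 0:
        return " " * int(s)
    # Assumption: all other values are rejected.
    raise ValueError("invalid indentation value %r" % s)


print(repr(parse_indentation_sketch("4")))
print(repr(parse_indentation_sketch("TAB")))
```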
- bpc_utils.parse_linesep(s)
Parse linesep from a string representation.
These values are regarded as '\n': '\n', 'lf'
These values are regarded as '\r\n': '\r\n', 'crlf'
These values are regarded as '\r': '\r', 'cr'
Value matching is case insensitive.
- Parameters:
- Returns:
the parsed linesep result, return None if input is None or empty string
- Return type:
Optional[Linesep]
- Raises:
ValueError – if s is an invalid linesep value
See also
See _linesep_lookup for default lookup mapping values.
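A minimal sketch of this mapping, following the documented values and the documented None result for None or empty input. parse_linesep_sketch and _LINESEPS are hypothetical names, not the library's own.

```python
_LINESEPS = {
    '\n': '\n', 'lf': '\n',
    '\r\n': '\r\n', 'crlf': '\r\n',
    '\r': '\r', 'cr': '\r',
}


def parse_linesep_sketch(s):
    if not s:  # None or empty string
        return None
    try:
        return _LINESEPS[s.lower()]  # case-insensitive matching
    except KeyError:
        raise ValueError("invalid linesep value %r" % s) from None


print(repr(parse_linesep_sketch('CRLF')))
print(repr(parse_linesep_sketch('\r')))
```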
- bpc_utils.parse_positive_integer(s)
Parse a positive integer from a string representation.
- bpc_utils.parso_parse(code, filename=None, *, version=None)
Parse Python source code with parso.
- Parameters:
- Return type:
- Returns:
parso AST
- Raises:
BPCSyntaxError – when source code contains syntax errors
- bpc_utils.recover_files(archive_file_or_dir, *, rr=False, rs=False)
Recover files from a tar archive, optionally removing the archive file and archive directory after recovery.
This function supports three modes:
- Normal mode (when rr and rs are both False): Recover from the archive file specified by archive_file_or_dir.
- Recover and remove (when rr is True): Recover from the archive file specified by archive_file_or_dir, and remove this archive file after recovery.
- Recover from the only file in the archive directory (when rs is True): If the directory specified by archive_file_or_dir contains exactly one (regular) file, recover from that file and remove the archive directory.
Specifying both rr and rs as True is not accepted.
- Parameters:
- Raises:
ValueError – when rr and rs are both True
BPCRecoveryError – when rs is True, and the directory specified by archive_file_or_dir is empty, contains more than one item, or contains a non-regular file
- Return type:
- bpc_utils.Linesep
Type alias for Literal['\n', '\r\n', '\r'].
Internal utilities
- bpc_utils.argparse._boolean_state_lookup
-
A mapping from string representation to boolean states. The values are used for parse_boolean_state().
- bpc_utils.argparse._linesep_lookup
- Type:
Final[Dict[str, Linesep]]
A mapping from string representation to linesep. The values are used for parse_linesep().
- bpc_utils.fileprocessing.LOOKUP_TABLE = '_lookup_table.json'
File name for the lookup table in the archive file.
- Type:
Final[str]
- bpc_utils.fileprocessing.is_python_filename(filename)
Determine whether a file is a Python source file by its extension.
- bpc_utils.fileprocessing.expand_glob_iter(pattern)
Wrapper function to perform glob expansion.
- class bpc_utils.logging.BPCLogHandler
Bases: StreamHandler
Handler used to format BPC logging records.
Initialize BPCLogHandler.
- format(record)
Format the specified record based on log level.
The record will be formatted based on its log level in the following flavour:
DEBUG – [%(levelname)s] %(asctime)s %(message)s
INFO – %(message)s
WARNING – Warning: %(message)s
ERROR – Error: %(message)s
CRITICAL – Error: %(message)s
- format_templates = {'CRITICAL': 'Error: %(message)s', 'DEBUG': '[%(levelname)s] %(asctime)s %(message)s', 'ERROR': 'Error: %(message)s', 'INFO': '%(message)s', 'WARNING': 'Warning: %(message)s'}
- time_format = '%Y-%m-%d %H:%M:%S.%f%z'
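Per-level formatting of this kind can be sketched by overriding StreamHandler.format(). The example reuses the templates listed above; LevelTemplateHandler is a hypothetical class, the fallback template for unknown levels is an assumption, and the custom time_format is left out, so DEBUG records would use the default asctime format of logging.Formatter.

```python
import io
import logging


class LevelTemplateHandler(logging.StreamHandler):
    """Hypothetical handler that picks a format template per log level."""

    format_templates = {
        'DEBUG': '[%(levelname)s] %(asctime)s %(message)s',
        'INFO': '%(message)s',
        'WARNING': 'Warning: %(message)s',
        'ERROR': 'Error: %(message)s',
        'CRITICAL': 'Error: %(message)s',
    }

    def format(self, record):
        # Assumption: unknown levels fall back to the bare message.
        template = self.format_templates.get(record.levelname, '%(message)s')
        return logging.Formatter(template).format(record)


stream = io.StringIO()
logger = logging.getLogger('bpc_demo')
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(LevelTemplateHandler(stream))

logger.warning('deprecated option')
logger.info('done')
logger.error('failed')
print(stream.getvalue(), end='')
```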
- bpc_utils.misc.current_time_with_tzinfo()
Get the current time with local time zone information.
- Return type:
- Returns:
datetime object representing current time with local time zone information
- class bpc_utils.misc.MakeTextIO(obj)
Bases: object
Context wrapper class to handle str and file objects together.
- Variables:
Initialize context.
- bpc_utils.multiprocessing.mp
- Type:
Optional[ModuleType]
- Value:
<module 'multiprocessing'>
An alias of the Python builtin multiprocessing module if available.
- bpc_utils.multiprocessing._mp_map_wrapper(args)
Map wrapper function for multiprocessing.
- bpc_utils.multiprocessing._mp_init_lock(lock)
Initialize lock for multiprocessing.
- Parameters:
lock (ContextManager[None]) – the lock to be shared among tasks
- Return type:
- bpc_utils.multiprocessing.task_lock
- Type:
ContextManager[None]
A lock for possibly concurrent tasks.