Python Backport Compiler Utilities

Utility library for the Python bpc backport compiler.

Currently, the three individual tools (f2format, poseur, walrus) depend on this repo. The bpc compiler is a work in progress.

Module contents

exception bpc_utils.BPCInternalError(message, context)

Bases: RuntimeError

Raised when an internal bug occurs in BPC tools.

Initialize BPCInternalError.

Parameters:

message (object) – the error message
context (str) – a description of the context, location, or component where the bug occurred

Raises:

TypeError – if context is not str
ValueError – if message (when converted to str) or context is empty or only contains whitespace characters

exception bpc_utils.BPCRecoveryError

Bases: RuntimeError

Error during file recovery.

exception bpc_utils.BPCSyntaxError

Bases: SyntaxError

Syntax error detected when parsing code.

class bpc_utils.BaseContext(node, config, *, indent_level=0, raw=False)

Bases: ABC

Abstract base class for general conversion context.

Initialize BaseContext.

Parameters:

node (NodeOrLeaf) – parso AST
config (Config) – conversion configurations
indent_level (int) – current indentation level
raw (bool) – raw processing flag

final __iadd__(code)

Support of the += operator.

If self._prefix_or_suffix is True, then the code will be appended to self._prefix; else it will be appended to self._suffix.

Parameters:: code (str) – code string
Returns:: self
Return type:: BaseContext

final __str__()

Returns a stripped version of self._buffer.

Return type:: str

abstract _concat()

Concatenate final string.

Return type:: None

final _process(node)

Recursively process parso AST.

All processing methods for a specific node type are defined as _process_{type}. This method first checks whether such a method exists for the type of node. If so, it calls that method on the node; otherwise it traverses all children of node and applies the same logic to each child.

Parameters:: node (NodeOrLeaf) – parso AST
Return type:: None
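
The dispatch rule described for _process can be sketched without parso. This is a minimal illustration under stated assumptions, not the library's implementation: ToyNode, ToyContext and _process_name are hypothetical names, and copying unhandled leaves verbatim is an assumption made only so the example produces output.

```python
class ToyNode:
    """Hypothetical stand-in for a parso node: a type, a value and children."""

    def __init__(self, type_, value='', children=()):
        self.type = type_
        self.value = value
        self.children = list(children)


class ToyContext:
    """Hypothetical context that dispatches on the node type."""

    def __init__(self):
        self.buffer = ''

    def _process(self, node):
        # Look for a handler named after the node type, e.g. _process_name.
        handler = getattr(self, '_process_' + node.type, None)
        if handler is not None:
            handler(node)
        elif node.children:
            # No handler: apply the same logic to every child.
            for child in node.children:
                self._process(child)
        else:
            # Assumption for this sketch: unhandled leaves are copied verbatim.
            self.buffer += node.value

    def _process_name(self, node):
        # Example handler: upper-case every "name" leaf.
        self.buffer += node.value.upper()


tree = ToyNode('expr', children=[
    ToyNode('name', 'spam'),
    ToyNode('operator', ' + '),
    ToyNode('name', 'eggs'),
])
ctx = ToyContext()
ctx._process(tree)
print(ctx.buffer)  # SPAM + EGGS
```

Looking handlers up by name keeps the traversal generic: supporting a new node type only requires adding one more _process_{type} method.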

final _walk(node)

Start traversing the AST module.

The method traverses all children of node. For each child, it first checks whether the child has the target expression. If so, it sets self._prefix_or_suffix to False and saves the previous child as self._node_before_expr. Then it processes the child with self._process.

Parameters:: node (NodeOrLeaf) – parso AST
Return type:: None

final static extract_whitespaces(code)

Extract preceding and succeeding whitespace from the given code.

Parameters:: code (str) – the code to extract whitespace from
Return type:: Tuple[str, str]
Returns:: a tuple of preceding and succeeding whitespaces in code
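
One plausible way to compute such a tuple is sketched below. The function name extract_whitespaces_sketch is hypothetical, and attributing an all-whitespace string entirely to the leading part is an assumption of this sketch, not documented behaviour.

```python
def extract_whitespaces_sketch(code):
    """Return (leading, trailing) whitespace of code (hypothetical helper)."""
    without_leading = code.lstrip()
    # Everything removed by lstrip() is the leading whitespace.
    leading = code[:len(code) - len(without_leading)]
    # Everything removed by rstrip() from the remainder is the trailing part.
    trailing = without_leading[len(without_leading.rstrip()):]
    return leading, trailing


print(extract_whitespaces_sketch('  \tx = 1 \n'))  # ('  \t', ' \n')
```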

abstract has_expr(node)

Check if node has the target expression.

Parameters:: node (NodeOrLeaf) – parso AST
Return type:: bool
Returns:: whether node has the target expression

final classmethod mangle(cls_name, var_name)

Mangle variable names.

This method mangles variable names as described in Python documentation about mangling and further normalizes the mangled variable name through normalize().

Parameters:

cls_name (str) – class name
var_name (str) – variable name

Return type:

str

Returns:

mangled and normalized variable name
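
The following sketch applies Python's private name mangling rule and then NFKC normalization. It is an illustration of the rule as stated in the Python language reference, written under the assumption that mangle behaves the same way; mangle_sketch is a hypothetical name.

```python
import unicodedata


def mangle_sketch(cls_name, var_name):
    """Apply private name mangling, then NFKC normalization (hypothetical helper)."""
    # Only names that start with two underscores and do not end with two
    # underscores are mangled.
    if var_name.startswith('__') and not var_name.endswith('__'):
        stripped = cls_name.lstrip('_')
        # A class name consisting only of underscores leaves the name as-is.
        if stripped:
            var_name = '_' + stripped + var_name
    return unicodedata.normalize('NFKC', var_name)


print(mangle_sketch('Ham', '__spam'))    # _Ham__spam
print(mangle_sketch('_Ham', '__spam'))   # _Ham__spam
print(mangle_sketch('Ham', '__init__'))  # __init__
```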

final static missing_newlines(prefix, suffix, expected, linesep)

Count missing blank lines for code insertion given surrounding code.

Parameters:

prefix (str) – preceding source code
suffix (str) – succeeding source code
expected (int) – number of expected blank lines
linesep (Linesep) – line separator

Return type:

int

Returns:

number of blank lines to add

final static normalize(name)

Normalize variable names.

This method normalizes variable names as described in Python documentation about identifiers and PEP 3131.

Parameters:: name (str) – variable name as it appears in the source code
Return type:: str
Returns:: normalized variable name
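
PEP 3131 specifies that identifiers are converted to normal form NFKC while parsing. The sketch below shows why a tool that rewrites variable names needs the same step: two different spellings in the source can refer to one variable. The function name normalize_sketch is hypothetical, and that normalize is implemented this way is an assumption.

```python
import unicodedata


def normalize_sketch(name):
    """Normalize an identifier to NFKC, as PEP 3131 specifies (hypothetical helper)."""
    return unicodedata.normalize('NFKC', name)


# U+FB01 is the single-character ligature "fi".
spelled_with_ligature = '\ufb01le'
print(normalize_sketch(spelled_with_ligature))  # file

# The Python parser performs the same normalization, so assigning to the
# ligature spelling creates a variable named "file".
namespace = {}
exec(spelled_with_ligature + ' = 1', namespace)
print('file' in namespace)  # True
```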

final static split_comments(code, linesep)

Separate prefixing comments from code.

This method splits the given code into its prefixing comments and the code that follows. It is useful when inserting code might break leading comment lines such as a shebang or an encoding cookie (PEP 263).

Parameters:

code (str) – the code to split comments from
linesep (Linesep) – line separator

Return type:

Tuple[str, str]

Returns:

a tuple of prefix comments and suffix code

_buffer: Final converted result.

_indent_level: Current indentation level.

_indentation: Indentation sequence.

_linesep

Line separator.

Type:: Final[Linesep]

_node_before_expr: Node preceding the one that has the target expression, i.e. the insertion point.

_pep8: PEP 8 compliant conversion flag.

_prefix: Code before insertion point.

_prefix_or_suffix: Flag to indicate whether buffer is now self._prefix.

_root: Root node given by the node parameter.

_suffix: Code after insertion point.

_uuid_gen: UUID generator.

config: Internal configurations.

property string: Returns conversion buffer (self._buffer).

class bpc_utils.Config(**kwargs)

Bases: MutableMapping[str, object]

Configuration namespace.

This class is inspired by argparse.Namespace and stores internal attributes and/or configuration variables.

>>> config = Config(foo='var', bar=True)
>>> config.foo
'var'
>>> config['bar']
True
>>> config.bar = 'boo'
>>> del config['foo']
>>> config
Config(bar='boo')
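
A namespace that supports both attribute and mapping access, as in the session above, can be sketched by backing a MutableMapping with the instance dictionary. ConfigSketch is a hypothetical name and the real class may be implemented differently.

```python
from collections.abc import MutableMapping


class ConfigSketch(MutableMapping):
    """Hypothetical namespace with attribute and mapping access."""

    def __init__(self, **kwargs):
        # Attributes live in the instance dictionary, so config.foo works.
        self.__dict__.update(kwargs)

    def __getitem__(self, key):
        return self.__dict__[key]

    def __setitem__(self, key, value):
        self.__dict__[key] = value

    def __delitem__(self, key):
        del self.__dict__[key]

    def __iter__(self):
        return iter(self.__dict__)

    def __len__(self):
        return len(self.__dict__)

    def __repr__(self):
        items = ', '.join('%s=%r' % pair for pair in self.__dict__.items())
        return '%s(%s)' % (type(self).__name__, items)


config = ConfigSketch(foo='var', bar=True)
config.bar = 'boo'
del config['foo']
print(config)  # ConfigSketch(bar='boo')
```

One caveat of this design: a key such as 'items' or 'keys' would shadow the mapping method of the same name on that instance.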

class bpc_utils.Placeholder(name)

Bases: object

Placeholder for string interpolation.

Placeholder objects can be concatenated with str, other Placeholder objects and StringInterpolation objects via the ‘+’ operator.

Placeholder objects should be regarded as immutable. Please do not modify the _name internal attribute. Build new objects instead.

Initialize Placeholder.

Parameters:: name (str) – name of the placeholder
Raises:: TypeError – if name is not str

property name: Returns the name of this placeholder.

class bpc_utils.StringInterpolation(*args)

Bases: object

A string with placeholders to be filled in.

This resembles an object-oriented format string, except that string literals are always interpreted literally, so no manual escaping is needed. The boundaries between string literals and placeholders are unambiguous. Filling in a placeholder never injects a new placeholder, which protects string integrity across multiple rounds of interpolation.

>>> s1 = '%(injected)s'
>>> s2 = 'hello'
>>> s = StringInterpolation('prefix ', Placeholder('q1'), ' infix ', Placeholder('q2'), ' suffix')
>>> str(s % {'q1': s1} % {'q2': s2})
'prefix %(injected)s infix hello suffix'

(This can be regarded as an improved version of string.Template.safe_substitute().)

Multiple-round interpolation is tricky to do with a traditional format string. In order to do things correctly and avoid format string injection vulnerabilities, you need to perform escapes very carefully.

>>> fs = 'prefix %(q1)s infix %(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'q2'
>>> fs = 'prefix %(q1)s infix %%(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'injected'
>>> fs % {'q1': s1.replace('%', '%%')} % {'q2': s2}
'prefix %(injected)s infix hello suffix'

StringInterpolation objects can be concatenated with str, Placeholder objects and other StringInterpolation objects via the ‘+’ operator.

StringInterpolation objects should be regarded as immutable. Please do not modify the _literals and _placeholders internal attributes. Build new objects instead.

Initialize StringInterpolation. args will be concatenated to construct a StringInterpolation object.

>>> StringInterpolation('prefix', Placeholder('data'), 'suffix')
StringInterpolation('prefix', Placeholder('data'), 'suffix')

Parameters:: args (Union[str, Placeholder, StringInterpolation]) – the components to construct a StringInterpolation object

__mod__(substitutions)

Substitute the placeholders in this StringInterpolation object with string values (if possible) according to the substitutions mapping.

>>> StringInterpolation('prefix ', Placeholder('data'), ' suffix') % {'data': 'hello'}
StringInterpolation('prefix hello suffix')

Parameters:: substitutions (Mapping[str, object]) – a mapping from placeholder names to the values to be filled in; all values are converted into str
Return type:: StringInterpolation
Returns:: a new StringInterpolation object with as many placeholders substituted as possible

__str__()

Returns the fully-substituted string interpolation result.

>>> str(StringInterpolation('prefix hello suffix'))
'prefix hello suffix'

Return type:: str
Returns:: the fully-substituted string interpolation result
Raises:: ValueError – if there are still unsubstituted placeholders in this StringInterpolation object

classmethod from_components(literals, placeholders)

Construct a StringInterpolation object from literals and placeholders components. This method is more efficient than the StringInterpolation() constructor, but it is mainly intended for internal use.

>>> StringInterpolation.from_components(
...     ('prefix', 'infix', 'suffix'),
...     (Placeholder('data1'), Placeholder('data2'))
... )
StringInterpolation('prefix', Placeholder('data1'), 'infix', Placeholder('data2'), 'suffix')

Parameters:

literals (Iterable[str]) – the literal components in order
placeholders (Iterable[Placeholder]) – the Placeholder components in order

Return type:

StringInterpolation

Returns:

the constructed StringInterpolation object

Raises:

TypeError – if literals is str; if literals contains non-str values; if placeholders contains non-Placeholder values
ValueError – if the length of literals is not exactly one more than the length of placeholders

iter_components()

Generator to iterate all components of this StringInterpolation object in order.

Return type:: Generator[Union[str, Placeholder], None, None]

>>> list(StringInterpolation('prefix', Placeholder('data'), 'suffix').iter_components())
['prefix', Placeholder('data'), 'suffix']

Yields:: the components of this StringInterpolation object in order

property literals: Returns the literal components in this StringInterpolation object.

property placeholders: Returns the Placeholder components in this StringInterpolation object.

property result

Alias of StringInterpolation.__str__() to get the fully-substituted string interpolation result.

>>> StringInterpolation('prefix hello suffix').result
'prefix hello suffix'

class bpc_utils.UUID4Generator(dash=True)

Bases: object

UUID 4 generator wrapper to prevent UUID collisions.

Constructor of UUID 4 generator wrapper.

Parameters:: dash (bool) – whether the generated UUID string has dashes or not

gen()

Generate a new UUID 4 string that is guaranteed not to collide with used UUIDs.

Return type:: str
Returns:: a new UUID 4 string
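
A collision guarantee of this kind can be obtained by remembering every value handed out and retrying on a repeat. The sketch below shows that idea under the assumption that the wrapper works similarly; UUID4GeneratorSketch and its attribute names are hypothetical.

```python
import uuid


class UUID4GeneratorSketch:
    """Hypothetical wrapper that never returns the same UUID 4 string twice."""

    def __init__(self, dash=True):
        self.dash = dash
        self.used = set()  # every string returned so far

    def gen(self):
        while True:
            value = uuid.uuid4()
            # str() gives the dashed form, .hex the 32-character form.
            text = str(value) if self.dash else value.hex
            if text not in self.used:
                self.used.add(text)
                return text


generator = UUID4GeneratorSketch(dash=False)
first = generator.gen()
second = generator.gen()
print(first != second)  # True
```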

bpc_utils.TaskLock()

Function that returns a lock for possibly concurrent tasks.

Return type:: ContextManager[None]
Returns:: a lock for possibly concurrent tasks

bpc_utils.archive_files(files, archive_dir)

Archive the list of files into a tar file.

Parameters:

files (Iterable[str]) – a list of files to be archived (should be absolute paths)
archive_dir (str) – the directory to save the archive

Return type:

str

Returns:

path to the generated tar archive

bpc_utils.detect_encoding(code)

Detect encoding of Python source code as specified in PEP 263.

Parameters:

code (bytes) – the code to detect encoding

Return type:

str

Returns:

the detected encoding, or the default encoding (utf-8)

Raises:

TypeError – if code is not a bytes string
SyntaxError – if both a BOM and a cookie are present, but disagree
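
The standard library already implements PEP 263 detection in tokenize.detect_encoding, which reads from a readline callable. The sketch below wraps it for a bytes argument. Whether bpc_utils.detect_encoding delegates to this function is an assumption; detect_encoding_stdlib is a hypothetical name, and the standard library reports 'utf-8-sig' when a BOM is present, which may differ from this library.

```python
import io
import tokenize


def detect_encoding_stdlib(code):
    """Detect the PEP 263 encoding of source bytes (hypothetical helper)."""
    if not isinstance(code, bytes):
        raise TypeError('code should be bytes')
    # tokenize.detect_encoding() inspects at most the first two lines.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(code).readline)
    return encoding


print(detect_encoding_stdlib(b'x = 1\n'))                       # utf-8
print(detect_encoding_stdlib(b'# -*- coding: latin-1 -*-\n'))   # iso-8859-1

try:
    # A UTF-8 BOM followed by a conflicting cookie is rejected.
    detect_encoding_stdlib(b'\xef\xbb\xbf# coding: latin-1\nx = 1\n')
except SyntaxError as error:
    print('SyntaxError:', error)
```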

bpc_utils.detect_files(files)

Get a list of Python files to be processed according to user input.

This will perform glob expansion on Windows, make all paths absolute, resolve symbolic links and remove duplicates.

Parameters:: files (Iterable[str]) – a list of files and directories to process (usually provided by users on command-line)
Return type:: List[str]
Returns:: a list of Python files to be processed

Internal utilities

bpc_utils.argparse._boolean_state_lookup

Type:: Final[Dict[str, bool]]

A mapping from string representation to boolean states. The values are used for parse_boolean_state().

bpc_utils.argparse._linesep_lookup

Type:: Final[Dict[str, Linesep]]

A mapping from string representation to linesep. The values are used for parse_linesep().

bpc_utils.fileprocessing.has_gz_support

Type:: bool

Whether gzip is supported.

bpc_utils.fileprocessing.LOOKUP_TABLE = '_lookup_table.json'

File name for the lookup table in the archive file.

Type:: Final[str]

bpc_utils.fileprocessing.is_python_filename(filename)

Determine whether a file is a Python source file by its extension.

Parameters:: filename (str) – the name of the file
Return type:: bool
Returns:: whether the file is a Python source file

bpc_utils.fileprocessing.expand_glob_iter(pattern)

Wrapper function to perform glob expansion.

Parameters:: pattern (str) – the pattern to expand
Return type:: Iterator[str]
Returns:: an iterator of expansion result

class bpc_utils.logging.BPCLogHandler

Bases: StreamHandler

Handler used to format BPC logging records.

Initialize BPCLogHandler.

format(record)

Format the specified record based on log level.

The record will be formatted based on its log level in the following flavour:

`DEBUG`	`[%(levelname)s] %(asctime)s %(message)s`
`INFO`	`%(message)s`
`WARNING`	`Warning: %(message)s`
`ERROR`	`Error: %(message)s`
`CRITICAL`	`Error: %(message)s`

Parameters:: record (LogRecord) – the log record
Return type:: str
Returns:: the formatted log string

format_templates = {'CRITICAL': 'Error: %(message)s', 'DEBUG': '[%(levelname)s] %(asctime)s %(message)s', 'ERROR': 'Error: %(message)s', 'INFO': '%(message)s', 'WARNING': 'Warning: %(message)s'}

time_format = '%Y-%m-%d %H:%M:%S.%f%z'
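
Formatting by log level can be sketched with the standard logging module by overriding format() in a StreamHandler subclass. This is an illustration under stated assumptions rather than the handler's actual code: LevelFormatHandler and the logger name bpc_sketch are hypothetical, and the sketch leaves %(asctime)s at the logging default instead of applying time_format.

```python
import io
import logging


class LevelFormatHandler(logging.StreamHandler):
    """Hypothetical handler that picks a format template per log level."""

    format_templates = {
        'DEBUG': '[%(levelname)s] %(asctime)s %(message)s',
        'INFO': '%(message)s',
        'WARNING': 'Warning: %(message)s',
        'ERROR': 'Error: %(message)s',
        'CRITICAL': 'Error: %(message)s',
    }

    def format(self, record):
        # Fall back to the bare message for custom levels.
        template = self.format_templates.get(record.levelname, '%(message)s')
        return logging.Formatter(template).format(record)


stream = io.StringIO()
logger = logging.getLogger('bpc_sketch')
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(LevelFormatHandler(stream))

logger.info('converted %d files', 3)
logger.warning('skipped %s', 'a.py')
logger.error('failed')
print(stream.getvalue(), end='')
```

The three calls write "converted 3 files", "Warning: skipped a.py" and "Error: failed" on separate lines.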

bpc_utils.misc.is_windows

Type:: bool

Whether the current operating system is Windows.

bpc_utils.misc.current_time_with_tzinfo()

Get the current time with local time zone information.

Return type:: datetime
Returns:: datetime object representing current time with local time zone information
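
With the standard library, a timezone-aware local time can be obtained by calling astimezone() on a naive local datetime. The sketch below shows this approach; that the library uses the same call is an assumption, and current_time_sketch is a hypothetical name.

```python
from datetime import datetime


def current_time_sketch():
    """Return the current local time as an aware datetime (hypothetical helper)."""
    # astimezone() without arguments attaches the system local time zone.
    return datetime.now().astimezone()


now = current_time_sketch()
# An aware datetime can render the %z offset used by time_format above.
print(now.strftime('%Y-%m-%d %H:%M:%S.%f%z'))
```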

class bpc_utils.misc.MakeTextIO(obj)

Bases: object

Context wrapper class to handle str and file objects together.

Variables:

obj (Union[str, TextIO]) – the object to manage in the context
sio (Optional[StringIO]) – the I/O object to manage in the context only if self.obj is str
pos (Optional[int]) – the original offset of self.obj, only if self.obj is a seekable file object

Initialize context.

Parameters:: obj (Union[str, TextIO]) – the object to manage in the context

obj

Type:: Union[str, TextIO]

The object to manage in the context.

sio

Type:: StringIO

The I/O object to manage in the context only if self.obj is str.

pos

Type:: int

The original offset of self.obj, only if self.obj is a seekable TextIO.

__enter__()

Enter context.

If self.obj is str, a StringIO will be created and returned.
If self.obj is a seekable file object, it will be rewound to the beginning and returned.
If self.obj is an unseekable file object, it will be returned directly.

Return type:: TextIO

__exit__(exc_type, exc_value, traceback)

Exit context.

If self.obj is str, the StringIO (self.sio) will be closed.
If self.obj is a seekable file object, its stream position (self.pos) will be restored.

Return type:: None
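
The enter and exit behaviour described above can be sketched as follows. MakeTextIOSketch and first_line are hypothetical names; the sketch follows the three documented cases and is not the library's code.

```python
import io


class MakeTextIOSketch:
    """Hypothetical context manager accepting either str or a text file object."""

    def __init__(self, obj):
        self.obj = obj
        self.sio = None  # set only when obj is str
        self.pos = None  # set only when obj is a seekable file object

    def __enter__(self):
        if isinstance(self.obj, str):
            self.sio = io.StringIO(self.obj)
            return self.sio
        if self.obj.seekable():
            # Remember the caller's position, then rewind.
            self.pos = self.obj.tell()
            self.obj.seek(0)
        return self.obj

    def __exit__(self, exc_type, exc_value, traceback):
        if self.sio is not None:
            self.sio.close()
        elif self.pos is not None:
            # Put the stream back where the caller left it.
            self.obj.seek(self.pos)


def first_line(source):
    """Read the first line from a str or a file object alike."""
    with MakeTextIOSketch(source) as file:
        return file.readline()


print(repr(first_line('a = 1\nb = 2\n')))  # 'a = 1\n'

stream = io.StringIO('x = 1\ny = 2\n')
stream.read()  # move to the end before handing the stream over
offset = stream.tell()
print(repr(first_line(stream)))  # 'x = 1\n'
```

Restoring the position on exit lets a helper inspect a file from the beginning without disturbing code that is part-way through reading it.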

bpc_utils.multiprocessing.CPU_CNT

Type:: int

Number of CPUs for multiprocessing support.

bpc_utils.multiprocessing.mp

Type:: Optional[ModuleType]
Value:: <module 'multiprocessing'>

An alias of the Python standard library multiprocessing module, if available.

bpc_utils.multiprocessing.parallel_available

Type:: bool

Whether parallel execution is available.

bpc_utils.multiprocessing._mp_map_wrapper(args)

Map wrapper function for multiprocessing.

Parameters:: args (Tuple[Callable[..., TypeVar(T)], Iterable[object], Mapping[str, object]]) – the function to execute, the positional arguments and the keyword arguments packed into a tuple
Return type:: TypeVar(T)
Returns:: the function execution result
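
A wrapper of this shape is useful because pool-style map functions pass exactly one argument per task, so the callable and its arguments have to travel together in one tuple. The sketch below unpacks such a tuple; mp_map_wrapper_sketch and scale are hypothetical names, and the built-in map stands in for a process pool so that the example runs anywhere.

```python
def mp_map_wrapper_sketch(args):
    """Unpack (func, positional args, keyword args) and call func (hypothetical helper)."""
    func, posargs, kwargs = args
    return func(*posargs, **kwargs)


def scale(value, factor=1):
    """Example task function."""
    return value * factor


# Each task bundles the callable with its arguments in a single tuple.
tasks = [(scale, (n,), {'factor': 10}) for n in range(3)]
results = list(map(mp_map_wrapper_sketch, tasks))
print(results)  # [0, 10, 20]
```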

bpc_utils.multiprocessing._mp_init_lock(lock)

Initialize lock for multiprocessing.

Parameters:: lock (ContextManager[None]) – the lock to be shared among tasks
Return type:: None

bpc_utils.multiprocessing.task_lock

Type:: ContextManager[None]

A lock for possibly concurrent tasks.
