Skip to content

Latest commit

 

History

History
329 lines (228 loc) · 6.1 KB

20200425_import.md

File metadata and controls

329 lines (228 loc) · 6.1 KB

%title: Import Anything: Playing with Python's Import System %author: Remote Python Pizza 🍕 @gahjelle %date: April 25, 2020

-> # Geir Arne Hjelle <-

-> realpython.com <- ^

Disclaimer:

I've been so preoccupied with whether or not I could, I didn’t stop to think if I should.


Python Imports -- Behind the Scenes

At a high level, three things happen when you import a module: ^

  1. The module is found on your system ^

  2. The module is loaded into memory ^

  3. The module is bound to a name


importlib

The import machinery is exposed in importlib. Consider the following code:

>>> import numpy as np

^

Using importlib, you can rewrite the import:

>>> import importlib
>>> np = importlib.import_module("numpy")

^

Note that importlib handles the finding (1) and loading (2). You need to do the binding (3) yourself.


Name Binding

Python always imports whole modules:

>>> from math import pi as π
>>> π
3.141592653589793

^

>>> math
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined

The name math is not bound, but the whole math module has been imported.


Name Binding

sys.modules is a module cache that stores references to all loaded modules:

>>> from math import pi as π
>>> π
3.141592653589793

^

>>> import sys
>>> math = sys.modules["math"]
>>> math.cos(π)
-1.0

You can reference math through sys.modules to call math.cos().


Python's Import System

Let's look at finding and loading in more detail. When doing an import, the following steps are carried out:

  1. Check if the module is already available in sys.modules. If it is, the import process ends immediately. ^

  2. Find the module using finders. Each finder implements a different strategy. Default finders can find built-in modules, frozen modules, and modules on the import path. ^

  3. Load the module using a loader. Which loader to use is specified by the finder that found the module using a ModuleSpec.


Finders

sys.meta_path defines which finders to use:

>>> import sys
>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>,
 <class '_frozen_importlib.FrozenImporter'>,
 <class '_frozen_importlib_external.PathFinder'>]

^

You can manipulate sys.meta_path:

>>> sys.meta_path.clear()
>>> import math
ModuleNotFoundError: No module named 'math'

Writing Your Own Finder

A finder is any class that implements a .find_spec() class method:

class DebugFinder:
    @classmethod
    def find_spec(cls, name, path, target=None):
        print(f"Importing {name!r}")
        return None

^

Finders can terminate in one of three ways:

  1. Returning None if it doesn't find the module
  2. Returning a ModuleSpec if it finds the module
  3. Raising ModuleNotFoundError if the module can't be imported

Using Your Own Finder

To use your own finder, add it to sys.meta_path:

>>> sys.meta_path.insert(0, DebugFinder)

^

>>> import csv
Importing 'csv'
Importing 're'
Importing 'enum'
Importing 'sre_compile'
Importing '_sre'
Importing 'sre_parse'
Importing 'sre_constants'
Importing 'copyreg'
Importing '_csv'

Writing Your Own Loader

Loaders are classes that implement .create_module() and .exec_module(): ^

  • .create_module() creates the module object. You can usually return None, which will use Python's default module creator. ^

  • .exec_module() adds content and functionality to the module object. This is what you will usually implement yourself. ^

Think of .create_module() and .exec_module() for modules as analogues to .__new__() and .__init__() for classes.


Import CSV Files

This example is inspired by Aleksey Bilogur's article Import almost anything in Python.

You'll add finders and loaders that can directly import CSV files:

$ cat people.csv
name,language,address
"Geir Arne",Python,Norway
Guido,Python,USA
Hadley,R,"New Zealand"

Import CSV Files

csv_importer will create a finder and a loader, and add it to sys.meta_path. You can then import CSV files: ^

>>> import csv_importer
>>> import people  # Imports people.csv
>>> people.name
('Geir Arne', 'Guido', 'Hadley')
    
>>> people.data[1]
{'name': 'Guido', 'language': 'Python', 'address': 'USA'}

The CSV Finder

CsvImporter will implement both the finder and the loader. .csv_path is used to reference which file to load.

import csv
import pathlib
import sys
from importlib.machinery import ModuleSpec
    
class CsvImporter:
    def __init__(self, csv_path):
        self.csv_path = csv_path
    
sys.meta_path.append(CsvImporter)

The CSV Finder

.find_spec looks for a given CSV file in sys.path:

    @classmethod
    def find_spec(cls, name, path, target=None):
        package, _, module_name = name.rpartition(".")
        csv_file_name = f"{module_name}.csv"
        directories = sys.path if path is None else path
        for directory in directories:
            csv_path = pathlib.Path(directory) / csv_file_name
            if csv_path.exists():
                return ModuleSpec(name, cls(csv_path))

The CSV Loader

Use Python's default system to create the module:

    def create_module(self, spec):
        return None

The CSV Loader

Read the CSV file and store it in the module's .__dict__:

    def exec_module(self, module):
        with self.csv_path.open() as fid:
            rows = csv.DictReader(fid)
            data = list(rows)
            fieldnames = tuple(rows.fieldnames)
        
        values = zip(*(row.values() for row in data))
        fields = dict(zip(fieldnames, values))
        
        module.__dict__.update(fields)
        module.__dict__["data"] = data
        module.__dict__["fieldnames"] = fieldnames

-> # Thank You For Your Attention <-

^

-> - Me: @gahjelle <- -> - Code: github.com/gahjelle/talks <- -> - Real Python: realpython.com <-