2.6. Unpack Assignment Expression

  • Since Python 3.8: PEP 572 -- Assignment Expressions

  • Also known as "Walrus operator"

  • Also known as "Named expression"

During discussion of this PEP, the operator became informally known as "the walrus operator". The construct's formal name is "Assignment Expressions" (as per the PEP title), but they may also be referred to as "Named Expressions" (e.g. the CPython reference implementation uses that name internally). [1]

Guido van Rossum stepped down as Python's BDFL (Benevolent Dictator For Life) after accepting PEP 572 -- Assignment Expressions:

Figure: ../../_images/unpack-assignmentexpr-bdfl.png

2.6.1. Syntax

Scalar:

(x := <VALUE>)

Comprehension:

result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)]
result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)
          and (<VARIABLE3> := <EXPR>)]
result = [<RETURN>
          for <VARIABLE1> in <ITERABLE>
          if (<VARIABLE2> := <EXPR>)
          and (<VARIABLE3> := <EXPR>)
          or (<VARIABLE4> := <EXPR>)]

2.6.2. Example

  • First, it defines an identifier with a value

  • Then, it returns the value of that identifier

  • Both operations happen in the same line

>>> x = 1
>>> print(x)
1
>>> print(x = 1)
Traceback (most recent call last):
TypeError: 'x' is an invalid keyword argument for print()
>>> print(x := 1)
1

2.6.3. What It Is Not

  • It is not a substitute for the assignment statement (equals sign)

>>> x = 1
>>> print(x)
1
>>> x := 1
Traceback (most recent call last):
SyntaxError: invalid syntax

2.6.4. Processing Streams

  • Processing streams in chunks

Imagine we have a temperature sensor that streams values. A process receives the values from the stream and appends them to a file. Let's simulate that process by writing temperature measurements to the file:

>>> with open('/tmp/myfile.txt', mode='w') as file:
...     _ = file.write('21.1,21.1,21.2,21.2,21.3,22.4,')

Note that every value has a fixed length of 4 bytes plus a comma (the 5th byte). We cannot read the whole file into memory as we normally would, because the file may be huge, much larger than the RAM in our computer.

We will process the file by reading 5 bytes of data (one measurement) at a time:

>>> file = open('/tmp/myfile.txt')
>>>
>>> value = file.read(5)
>>> while value:
...     print(f'Processing... {value.removesuffix(",")}')
...     value = file.read(5)
Processing... 21.1
Processing... 21.1
Processing... 21.2
Processing... 21.2
Processing... 21.3
Processing... 22.4

As you can see, there are two places where we define the number of bytes and read the data. The first file.read() is needed to enter the loop. The second file.read() is needed to advance through the file until the end. Using an assignment expression, we can write the read only once:

>>> file = open('/tmp/myfile.txt')
>>>
>>> while value := file.read(5):
...     print(f'Processing... {value.removesuffix(",")}')
Processing... 21.1
Processing... 21.1
Processing... 21.2
Processing... 21.2
Processing... 21.3
Processing... 22.4

Now imagine that the chunk is not 5 bytes, but a larger block of data (for example, ten megabytes at once). The construct makes even more sense then.

Always remember to close the file at the end:

>>> file.close()
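The same loop can be wrapped in a generator and combined with a context manager, so the file is closed automatically. This is only a minimal sketch: the helper name read_chunks() and the temporary file are assumptions made for illustration and are not part of the original example.

```python
import os
import tempfile


def read_chunks(path, size):
    """Yield consecutive chunks of at most `size` characters (hypothetical helper)."""
    with open(path) as file:               # closed automatically on exit
        while chunk := file.read(size):    # an empty string at EOF ends the loop
            yield chunk


# A small temporary file stands in for the sensor log
with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as tmp:
    tmp.write('21.1,21.1,21.2,21.2,21.3,22.4,')
    path = tmp.name

measurements = [float(chunk.removesuffix(','))
                for chunk in read_chunks(path, size=5)]
os.remove(path)

print(measurements)  # [21.1, 21.1, 21.2, 21.2, 21.3, 22.4]
```

For real data the size argument would be much larger; only one chunk is held in memory at a time.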

2.6.5. Checking Match

>>> import re
>>>
>>> DATA = 'mwatney@nasa.gov'

Typically, the result of a regular expression search has to be checked for None before it is used further:

>>> result = re.search(r'@nasa\.gov', DATA)
>>>
>>> if result:
...     print(result)
<re.Match object; span=(7, 16), match='@nasa.gov'>

An assignment expression merges these two separate lines into one coherent statement:

>>> if result := re.search(r'@nasa\.gov', DATA):
...     print(result)
<re.Match object; span=(7, 16), match='@nasa.gov'>
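An assignment expression is also convenient when several patterns are tried in turn, because each branch binds and tests its match in one place. The following is a minimal sketch; the classify() function and both patterns are hypothetical and serve only as an illustration.

```python
import re


def classify(email):
    """Return a label for the address (hypothetical helper for illustration)."""
    if match := re.fullmatch(r'([a-z]+)@nasa\.gov', email):
        return f'NASA: {match.group(1)}'
    elif match := re.fullmatch(r'([a-z]+)@esa\.int', email):
        return f'ESA: {match.group(1)}'
    else:
        return 'unknown'


print(classify('mwatney@nasa.gov'))   # NASA: mwatney
print(classify('avogel@esa.int'))     # ESA: avogel
print(classify('user@example.com'))   # unknown
```

Without the assignment expression, each additional pattern would need its own assignment and a further level of nesting.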

2.6.6. Comprehensions

Let's define data:

>>> DATA = ['Mark Watney',
...         'Melissa Lewis',
...         'Rick Martinez']

A typical comprehension would require calling str.split() multiple times:

>>> result = [{'firstname': fullname.split()[0],
...            'lastname': fullname.split()[1]}
...           for fullname in DATA]
>>>
>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]

An assignment expression allows defining a variable that can be reused within the comprehension. It is not only clearer and more readable, but also saves time and memory, especially if the function call is expensive:

>>> result = [{'firstname': name[0], 'lastname': name[1]}
...           for fullname in DATA
...           if (name := fullname.split())]
>>>
>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]

You can define multiple assignment expressions if needed:

>>> result = [{'firstname': firstname, 'lastname': lastname}
...           for fullname in DATA
...           if (name := fullname.split())
...           and (firstname := name[0])
...           and (lastname := name[1])]
>>>
>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]
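Keep in mind that the if clause still filters: when the assigned value is falsy (0, an empty string, an empty list), the element is dropped. A minimal sketch with made-up data follows; the VALUES list and the comparison with None are illustrative assumptions, and the comparison is only one of several possible ways to keep falsy values.

```python
VALUES = ['0', '1', '2']

# The value 0 is falsy, so the first element is silently filtered out
dropped = [number
           for text in VALUES
           if (number := int(text))]

# Comparing with None is always true here, so the clause only binds the name
kept = [number
        for text in VALUES
        if (number := int(text)) is not None]

print(dropped)  # [1, 2]
print(kept)     # [0, 1, 2]
```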

2.6.7. Assignment vs Assignment Expression

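An assignment expression evaluates to the assigned value:

>>> (x := 1)
1
>>>
>>> print(x)
1

An assignment statement with a comma creates a tuple:

>>> x = 1, 2
>>>
>>> print(x)
(1, 2)

In an assignment expression the walrus operator binds more tightly than the comma, so only 1 is assigned to x:

>>> (x := 1, 2)
(1, 2)
>>>
>>> print(x)
1
>>> result = (x := 1, 2)
>>>
>>> print(result)
(1, 2)

Augmented assignment has no assignment expression counterpart:

>>> x = 0
>>> x += 1
>>>
>>> print(x)
1
>>> x = 0
>>> x +:= 1
Traceback (most recent call last):
SyntaxError: invalid syntax

An assignment expression can bind only a simple name, not a subscript:

>>> data = {}
>>> data['commander'] = 'Mark Watney'
>>>
>>> data = {}
>>> data['commander'] := 'Mark Watney'
Traceback (most recent call last):
SyntaxError: cannot use assignment expressions with subscript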

2.6.8. Use Case - 0x01

  • Reusing Results

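>>> def run(x):
...     return 1
>>>
>>>
>>> x = 0
>>>
>>> # run() is called three times
>>> result = [run(x), run(x)+1, run(x)+2]
>>>
>>> # run() is called once and its result is reused
>>> result = [res := run(x), res+1, res+2]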
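To see the difference, the calls can be counted. This is a minimal sketch, assuming a hypothetical run() that increments a global counter each time it is called; the counter and the returned values are made up for illustration.

```python
calls = 0


def run(x):
    """Stand-in for an expensive function; counts how often it is called."""
    global calls
    calls += 1
    return x * 10


x = 1

calls = 0
without = [run(x), run(x)+1, run(x)+2]
calls_without = calls            # run() was called three times

calls = 0
with_walrus = [res := run(x), res+1, res+2]
calls_with = calls               # run() was called once

print(without, calls_without)    # [10, 11, 12] 3
print(with_walrus, calls_with)   # [10, 11, 12] 1
```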

2.6.9. Use Case - 0x02

>>> from pprint import pprint

We want to convert:

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""

Into:

>>> pprint(result)  
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor')]

Using loop:

>>> result = []
>>> for line in DATA.splitlines():
...     records = line.split(',')
...     features = tuple(map(float, records[0:4]))
...     species = (records[-1],)
...     result.append(features + species)

Using comprehension:

>>> result = [tuple(map(float, line.split(',')[0:4])) + (line.split(',')[-1],)
...           for line in DATA.splitlines()]

Using comprehension with assignment expression:

>>> result = [tuple(map(float, records[0:4])) + (records[-1],)
...           for line in DATA.splitlines()
...           if (records := line.split(','))]

Using comprehension with multiple assignment expressions:

>>> result = [tuple(features) + (species,)
...           for line in DATA.splitlines()
...           if (records := line.split(','))
...           and (features := map(float, records[0:4]))
...           and (species := records[-1])]

2.6.10. Use Case - 0x03

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""
>>> %%timeit -r 1000 -n 1000  
... result = []
... for line in DATA.splitlines():
...     *values, species = line.split(',')
...     values = map(float, values)
...     row = tuple(values) + (species,)
...     result.append(row)
3.18 µs ± 394 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> %%timeit -r 1000 -n 1000  
... result = [tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1])]
2.97 µs ± 386 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> %%timeit -r 1000 -n 1000  
... result = (tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1]))
577 ns ± 53.3 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)

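Note that a generator expression does not compute any values when it is created. It only creates a generator object, which produces the values when it is iterated. This is why the last measurement is so much faster: it times the creation of the generator, not the processing of the data.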
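The laziness can be observed directly. This is a minimal sketch, assuming a hypothetical parse() helper that records every line it processes; it is not the code that was timed above.

```python
DATA = """5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

processed = []


def parse(line):
    """Convert one CSV line to a tuple and record that it was processed."""
    processed.append(line)
    *values, species = line.split(',')
    return tuple(map(float, values)) + (species,)


result = (parse(line) for line in DATA.splitlines())
count_after_creation = len(processed)    # 0, no line has been parsed yet

rows = list(result)                      # iteration does the actual work
count_after_iteration = len(processed)   # 3

print(count_after_creation, count_after_iteration)  # 0 3
```

A fair comparison with the list comprehension would therefore consume the generator, for example with list(result).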

2.6.11. Use Case - 0x04

>>> DATA = """5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor"""
>>> result = [tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1])]
>>>
>>> result   
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor')]
>>> result = (tuple(values) + (species,)
...           for line in DATA.splitlines()
...           if (row := line.split(','))
...           and (values := map(float, row[:-1]))
...           and (species := row[-1]))
>>>
>>> result  
<generator object <genexpr> at 0x...>
>>>
>>> next(result)
(5.8, 2.7, 5.1, 1.9, 'virginica')
>>>
>>> next(result)
(5.1, 3.5, 1.4, 0.2, 'setosa')
>>>
>>> next(result)
(5.7, 2.8, 4.1, 1.3, 'versicolor')
>>>
>>> next(result)
Traceback (most recent call last):
StopIteration

2.6.12. Use Case - 0x05

>>> DATA = [{'is_astronaut': True,  'name': 'Mark Watney'},
...         {'is_astronaut': True,  'name': 'Melissa Lewis'},
...         {'is_astronaut': False, 'name': 'José Jiménez'},
...         {'is_astronaut': True,  'name': 'Rick Martinez'},
...         {'is_astronaut': False, 'name': 'Alex Vogel'}]

Comprehension:

>>> result = [{'firstname': person['name'].title().split()[0],
...            'lastname': person['name'].title().split()[1]}
...           for person in DATA
...           if person['is_astronaut']]

One assignment expression:

>>> result = [{'firstname': name[0],
...            'lastname': name[1]}
...           for person in DATA
...           if person['is_astronaut']
...           and (name := person['name'].title().split())]

Many assignment expressions:

>>> result = [{'firstname': firstname,
...            'lastname': lastname}
...           for person in DATA
...           if person['is_astronaut']
...           and (name := person['name'].title().split())
...           and (firstname := name[0])
...           and (lastname := name[1])]

In all cases the result is the same:

>>> print(result)  
[{'firstname': 'Mark', 'lastname': 'Watney'},
 {'firstname': 'Melissa', 'lastname': 'Lewis'},
 {'firstname': 'Rick', 'lastname': 'Martinez'}]

2.6.13. Use Case - 0x06

>>> DATA = [{'is_astronaut': True,  'name': 'Mark Watney'},
...         {'is_astronaut': True,  'name': 'Melissa Lewis'},
...         {'is_astronaut': False, 'name': 'José Jiménez'},
...         {'is_astronaut': True,  'name': 'Rick Martinez'},
...         {'is_astronaut': False, 'name': 'Alex Vogel'}]
>>>
>>>
>>> astronauts = [{'firstname': firstname, 'lastname': lastname}
...                for person in DATA
...                if person['is_astronaut']
...                and (name := person['name'].split())
...                and (firstname := name[0].capitalize())
...                and (lastname := f'{name[1][0]}.')]
>>>
>>> print(astronauts)  
[{'firstname': 'Mark', 'lastname': 'W.'},
 {'firstname': 'Melissa', 'lastname': 'L.'},
 {'firstname': 'Rick', 'lastname': 'M.'}]

2.6.14. Use Case - 0x07

In the following example, dataclasses are used to automatically generate the __init__() method based on the attributes:

>>> from dataclasses import dataclass
>>> from pprint import pprint
>>>
>>>
>>> @dataclass
... class Iris:
...     sepal_length: float
...     sepal_width: float
...     petal_length: float
...     petal_width: float
>>>
>>>
>>> class Versicolor(Iris):
...     pass
>>>
>>> class Virginica(Iris):
...     pass
>>>
>>> class Setosa(Iris):
...     pass
>>>
>>>
>>> DATA = [
...    ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
...    (5.8, 2.7, 5.1, 1.9, 'virginica'),
...    (5.1, 3.5, 1.4, 0.2, 'setosa'),
...    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
...    (6.3, 2.9, 5.6, 1.8, 'virginica'),
...    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
...    (4.7, 3.2, 1.3, 0.2, 'setosa'),
...    (7.0, 3.2, 4.7, 1.4, 'versicolor')]
>>>
>>>
>>> result = [iris(*values)
...           for *values, species in DATA[1:]
...           if (clsname := species.capitalize())
...           and (iris := globals()[clsname])]
>>>
>>> pprint(result, width=120)
[Virginica(sepal_length=5.8, sepal_width=2.7, petal_length=5.1, petal_width=1.9),
 Setosa(sepal_length=5.1, sepal_width=3.5, petal_length=1.4, petal_width=0.2),
 Versicolor(sepal_length=5.7, sepal_width=2.8, petal_length=4.1, petal_width=1.3),
 Virginica(sepal_length=6.3, sepal_width=2.9, petal_length=5.6, petal_width=1.8),
 Versicolor(sepal_length=6.4, sepal_width=3.2, petal_length=4.5, petal_width=1.5),
 Setosa(sepal_length=4.7, sepal_width=3.2, petal_length=1.3, petal_width=0.2),
 Versicolor(sepal_length=7.0, sepal_width=3.2, petal_length=4.7, petal_width=1.4)]
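Looking classes up in globals() works here, but it depends on every class being defined at module level under exactly the capitalized species name. A possible alternative is sketched below, assuming that an explicit mapping is acceptable; the SPECIES dictionary is a hypothetical name that does not appear in the original example.

```python
from dataclasses import dataclass


@dataclass
class Iris:
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


class Versicolor(Iris):
    pass


class Virginica(Iris):
    pass


class Setosa(Iris):
    pass


# Hypothetical explicit registry used instead of globals()
SPECIES = {'versicolor': Versicolor,
           'virginica': Virginica,
           'setosa': Setosa}

DATA = [
    ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor')]

# SPECIES.get() returns None for an unknown species, so such rows are skipped
result = [iris(*values)
          for *values, species in DATA[1:]
          if (iris := SPECIES.get(species))]

print(result)
```

Here the filtering behaviour of the if clause is used on purpose: rows with an unrecognized species are left out instead of raising KeyError.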

2.6.15. Use Case - 0x09

>>> import re
>>>
>>>
>>> data = 'mark.watney@nasa.gov'
>>> pattern = r'([a-z]+)\.([a-z]+)@nasa\.gov'

Procedural approach:

>>> match = re.match(pattern, data)
>>> result = match.groups() if match else None

A one-line conditional expression has to perform the match twice in order to get the result:

>>> result = re.match(pattern, data).groups() if re.match(pattern, data) else None

An assignment expression allows defining a variable and reusing it:

>>> result = x.groups() if (x := re.match(pattern, data)) else None

In all cases the result is the same:

>>> print(result)
('mark', 'watney')

2.6.16. Use Case - 0x0A

>>> 
... from ninja import Router
... from django.contrib.auth import authenticate, login
... from backend.auth.schemas import LoginRequest, SessionIdResponse
... from backend.common.schemas import ResponseUnauthorized
...
... router = Router()
...
...
... @router.api_operation(
...     methods=['POST'],
...     path='session/',
...     response={
...         200: SessionIdResponse,
...         401: ResponseUnauthorized},
...     summary='Authenticate using Cookies and SessionID')
... def session(request, data: LoginRequest):
...     username = data.username
...     password = data.password
...     if user := authenticate(request, username=username, password=password):
...         login(request, user)
...         return 200, {'sessionid': request.session.session_key}
...     else:
...         return 401, {'details': 'Invalid credentials'}

2.6.17. References

[1] Angelico, C., Peters, T., and van Rossum, G. PEP 572 -- Assignment Expressions. Python Software Foundation. Year: 2018. Retrieved: 2020-12-04. URL: https://www.python.org/dev/peps/pep-0572/#abstract

2.6.18. Assignments

Code 2.25. Solution
"""
* Assignment: Unpack Assignment Expression
* Complexity: medium
* Lines of code: 6 lines
* Time: 13 min

English:
    1. Split `DATA` by lines and then by colon `:`
    2. Extract system accounts
       (users whose UID [third field] is less than 1000)
    3. Return list of system account logins
    4. Solve using list comprehension and assignment expression
    5. Mind the `root` user who has `uid == 0`
       (make sure it is not filtered out by the if statement)
    6. Run doctests - all must succeed

Hint:
    * `str.splitlines()`
    * `str.strip()`
    * `str.split()`
    * `int()`
    * `bool(0) == False`
    * `bool('0') == True`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert len(result) > 0, \
    'Variable `result` cannot be empty'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is str for x in result), \
    'All rows in `result` should be str'

    >>> result
    ['root', 'bin', 'daemon', 'adm', 'shutdown', 'halt', 'nobody', 'sshd']
"""

DATA = """root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
watney:x:1000:1000:Mark Watney:/home/watney:/bin/bash
lewis:x:1001:1001:Melissa Lewis:/home/lewis:/bin/bash
martinez:x:1002:1002:Rick Martinez:/home/martinez:/bin/bash"""

# system account usernames (UID [third field] is less than 1000)
# type: list[str]
result = ...