mirror of https://github.com/python/cpython.git
GH-77265: Document NaN handling in statistics functions that sort or count (#94676)
* Document NaN handling in functions that sort or count
* Update Doc/library/statistics.rst
  Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
* Update Doc/library/statistics.rst
  Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
* Fix trailing whitespace and rewrap text
  Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
parent 264b3ddfd5
commit ef61b259e3
@@ -35,6 +35,35 @@ and implementation-dependent. If your input data consists of mixed types,
you may be able to use :func:`map` to ensure a consistent result, for
example: ``map(float, input_data)``.

Some datasets use ``NaN`` (not a number) values to represent missing data.
Since NaNs have unusual comparison semantics, they cause surprising or
undefined behaviors in the statistics functions that sort data or that count
occurrences. The functions affected are ``median()``, ``median_low()``,
``median_high()``, ``median_grouped()``, ``mode()``, ``multimode()``, and
``quantiles()``. The ``NaN`` values should be stripped before calling these
functions::

    >>> from statistics import median
    >>> from math import isnan
    >>> from itertools import filterfalse

    >>> data = [20.7, float('NaN'),19.2, 18.3, float('NaN'), 14.4]
    >>> sorted(data)  # This has surprising behavior
    [20.7, nan, 14.4, 18.3, 19.2, nan]
    >>> median(data)  # This result is unexpected
    16.35

    >>> sum(map(isnan, data))    # Number of missing values
    2
    >>> clean = list(filterfalse(isnan, data))  # Strip NaN values
    >>> clean
    [20.7, 19.2, 18.3, 14.4]
    >>> sorted(clean)  # Sorting now works as expected
    [14.4, 18.3, 19.2, 20.7]
    >>> median(clean)       # This result is now well defined
    18.75
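
The following is not part of the commit. It is a minimal sketch of why the "unusual comparison semantics" mentioned above affect both sorting and counting, plus one possible way to wrap the stripping step. The helper name ``split_missing`` is hypothetical, and the sketch assumes every value is accepted by ``math.isnan``. The key-counting result reflects CPython behavior and may differ on other implementations.

```python
from collections import Counter
from math import isnan
from statistics import median

nan = float("nan")

# Every ordering comparison with NaN is False, and NaN is not equal to
# itself, so a sort has no consistent place to put it.
comparisons = [nan < 1.0, nan > 1.0, nan == nan]  # all False

# Two separately created NaN objects do not compare equal, so counting
# treats them as two distinct keys (observed on CPython).
nan_keys = len(Counter([float("nan"), float("nan")]))


def split_missing(values):
    """Hypothetical helper: return (values without NaN, number of NaN removed)."""
    clean = [x for x in values if not isnan(x)]
    return clean, len(values) - len(clean)


data = [20.7, float("NaN"), 19.2, 18.3, float("NaN"), 14.4]
clean, missing = split_missing(data)
result = median(clean)  # computed on NaN-free data only
```

Returning the count of removed values alongside the cleaned list keeps the amount of missing data visible to the caller, which the filtering step alone would hide.
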

Averages and measures of central location
-----------------------------------------