区别在于:np.mean() 只能计算算数平均值,而np.average() 通过添加 weights 参数可以计算加权平均值。
np.mean() 源码:
@array_function_dispatch(_mean_dispatcher)
def mean(a, axis=None, dtype=None, out=None, keepdims=np._NoValue):
"""
Compute the arithmetic mean along the specified axis.
Returns the average of the array elements. The average is taken over
the flattened array by default, otherwise over the specified axis.
`float64` intermediate and return values are used for integer inputs.
Parameters
----------
a : array_like
Array containing numbers whose mean is desired. If `a` is not an
array, a conversion is attempted.
axis : None or int or tuple of ints, optional
Axis or axes along which the means are computed. The default is to
compute the mean of the flattened array.
.. versionadded:: 1.7.0
If this is a tuple of ints, a mean is performed over multiple axes,
instead of a single axis or all the axes as before.
dtype : data-type, optional
Type to use in computing the mean. For integer inputs, the default
is `float64`; for floating point inputs, it is the same as the
input dtype.
out : ndarray, optional
Alternate output array in which to place the result. The default
is ``None``; if provided, it must have the same shape as the
expected output, but the type will be cast if necessary.
See `doc.ufuncs` for details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left
in the result as dimensions with size one. With this option,
the result will broadcast correctly against the input array.
If the default value is passed, then `keepdims` will not be
passed through to the `mean` method of sub-classes of
`ndarray`, however any non-default value will be. If the
sub-class' method does not implement `keepdims` any
exceptions will be raised.
Returns
-------
m : ndarray, see dtype parameter above
If `out=None`, returns a new array containing the mean values,
otherwise a reference to the output array is returned.
See Also
--------
average : Weighted average
std, var, nanmean, nanstd, nanvar
Notes
-----
The arithmetic mean is the sum of the elements along the axis divided
by the number of elements.
Note that for floating-point input, the mean is computed using the
same precision the input has. Depending on the input data, this can
cause the results to be inaccurate, especially for `float32` (see
example below). Specifying a higher-precision accumulator using the
`dtype` keyword can alleviate this issue.
By default, `float16` results are computed using `float32` intermediates
for extra precision.
Examples
--------
>>> a = np.array([[1, 2], [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis=0)
array([ 2., 3.])
>>> np.mean(a, axis=1)
array([ 1.5, 3.5])
In single precision, `mean` can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.mean(a)
0.54999924
Computing the mean in float64 is more accurate:
>>> np.mean(a, dtype=np.float64)
0.55000000074505806
"""
kwargs = {}
if keepdims is not np._NoValue:
kwargs['keepdims'] = keepdims
if type(a) is not mu.ndarray:
try:
mean = a.mean
except AttributeError:
pass
else:
return mean(axis=axis, dtype=dtype, out=out, **kwargs)
return _methods._mean(a, axis=axis, dtype=dtype,
out=out, **kwargs)
参数:
- a:必须是数组。
- axis:默认条件下是flatten的array,可以指定相应的轴。
- 如果是二维矩阵,axis=0返回纵轴的平均值,axis=1返回横轴的平均值。
- 算术平均值是沿轴的元素总和除以元素的数量。既然是除法,就涉及到一个精确度的问题。
- 对于浮点输入,平均值的计算使用与输入相同的精度计算,这可能会导致结果不准确,特别是对于float32来说。为了缓解这个问题,我们可以使用dtype关键字指定更高精度的累加器。
np.average()源码:
@array_function_dispatch(_average_dispatcher)
def average(a, axis=None, weights=None, returned=False):
"""
Compute the weighted average along the specified axis.
Parameters
----------
a : array_like
Array containing data to be averaged. If `a` is not an array, a
conversion is attempted.
axis : None or int or tuple of ints, optional
Axis or axes along which to average `a`. The default,
axis=None, will average over all of the elements of the input array.
If axis is negative it counts from the last to the first axis.
.. versionadded:: 1.7.0
If axis is a tuple of ints, averaging is performed on all of the axes
specified in the tuple instead of a single axis or all the axes as
before.
weights : array_like, optional
An array of weights associated with the values in `a`. Each value in
`a` contributes to the average according to its associated weight.
The weights array can either be 1-D (in which case its length must be
the size of `a` along the given axis) or of the same shape as `a`.
If `weights=None`, then all data in `a` are assumed to have a
weight equal to one.
returned : bool, optional
Default is `False`. If `True`, the tuple (`average`, `sum_of_weights`)
is returned, otherwise only the average is returned.
If `weights=None`, `sum_of_weights` is equivalent to the number of
elements over which the average is taken.
Returns
-------
retval, [sum_of_weights] : array_type or double
Return the average along the specified axis. When `returned` is `True`,
return a tuple with the average as the first element and the sum
of the weights as the second element. `sum_of_weights` is of the
same type as `retval`. The result dtype follows a genereal pattern.
If `weights` is None, the result dtype will be that of `a` , or ``float64``
if `a` is integral. Otherwise, if `weights` is not None and `a` is non-
integral, the result type will be the type of lowest precision capable of
representing values of both `a` and `weights`. If `a` happens to be
integral, the previous rules still applies but the result dtype will
at least be ``float64``.
Raises
------
ZeroDivisionError
When all weights along axis are zero. See `numpy.ma.average` for a
version robust to this type of error.
TypeError
When the length of 1D `weights` is not the same as the shape of `a`
along axis.
See Also
--------
mean
ma.average : average for masked arrays -- useful if your data contains
"missing" values
numpy.result_type : Returns the type that results from applying the
numpy type promotion rules to the arguments.
Examples
--------
>>> data = range(1,5)
>>> data
[1, 2, 3, 4]
>>> np.average(data)
2.5
>>> np.average(range(1,11), weights=range(10,0,-1))
4.0
>>> data = np.arange(6).reshape((3,2))
>>> data
array([[0, 1],
[2, 3],
[4, 5]])
>>> np.average(data, axis=1, weights=[1./4, 3./4])
array([ 0.75, 2.75, 4.75])
>>> np.average(data, weights=[1./4, 3./4])
Traceback (most recent call last):
...
TypeError: Axis must be specified when shapes of a and weights differ.
>>> a = np.ones(5, dtype=np.float128)
>>> w = np.ones(5, dtype=np.complex64)
>>> avg = np.average(a, weights=w)
>>> print(avg.dtype)
complex256
"""
a = np.asanyarray(a)
if weights is None:
avg = a.mean(axis)
scl = avg.dtype.type(a.size/avg.size)
else:
wgt = np.asanyarray(weights)
if issubclass(a.dtype.type, (np.integer, np.bool_)):
result_dtype = np.result_type(a.dtype, wgt.dtype, 'f8')
else:
result_dtype = np.result_type(a.dtype, wgt.dtype)
# Sanity checks
if a.shape != wgt.shape:
if axis is None:
raise TypeError(
"Axis must be specified when shapes of a and weights "
"differ.")
if wgt.ndim != 1:
raise TypeError(
"1D weights expected when shapes of a and weights differ.")
if wgt.shape[0] != a.shape[axis]:
raise ValueError(
"Length of weights not compatible with specified axis.")
# setup wgt to broadcast along axis
wgt = np.broadcast_to(wgt, (a.ndim-1)*(1,) + wgt.shape)
wgt = wgt.swapaxes(-1, axis)
scl = wgt.sum(axis=axis, dtype=result_dtype)
if np.any(scl == 0.0):
raise ZeroDivisionError(
"Weights sum to zero, can't be normalized")
avg = np.multiply(a, wgt, dtype=result_dtype).sum(axis)/scl
if returned:
if scl.shape != avg.shape:
scl = np.broadcast_to(scl, avg.shape).copy()
return avg, scl
else:
return avg