## Introduction

SciPy is an enormous Python library for scientific computing. This article will explain how to get started with SciPy, survey what the library has to offer, and give some examples of how to use it for common tasks.

## First steps with SciPy

The SciPy download page has links to the SourceForge download sites for SciPy and NumPy. (SciPy depends on NumPy and so both packages must be installed in order to use SciPy.) The version of SciPy (and NumPy) must be compatible with your version of Python. At the time of this writing, SciPy is available for Python 2.6 and earlier. In particular, Python 3.0 is not yet supported.

IronPython cannot use SciPy directly because much of SciPy is implemented in C and at this time IronPython can only directly import modules implemented in Python. However, the Ironclad package enables IronPython to use SciPy. See Numerical computing in IronPython with Ironclad for details.

To start using SciPy, import the `scipy` package. By convention, the `scipy` package is often imported with the `sp` abbreviation for ease of use.

>>> import scipy as sp

There is some functionality at the root of the `scipy` hierarchy, but most functionality is located in sub-packages that must be imported separately. For example, the `erf` function is located in the `special` sub-package for special functions. To call the `erf` function, you need to first import the `special` sub-package.

>>> from scipy import special
>>> special.erf(2.0)
0.99532226501895271

## Getting help

The SciPy documentation page has links to extensive documentation available in HTML, PDF, and CHM (HTML help) formats.

As with any Python package, you can also find help for SciPy objects using Python's `help()` function from the command line. However, sometimes `help` is unhelpful when it comes to SciPy. The function `scipy.info()` is analogous to the standard `help()` function but specialized to give better documentation for SciPy objects. When `scipy.info()` is given a string argument, it does a search for matching objects. When `scipy.info()` is given an object, it returns documentation specific to that object. For example, if `scipy` was imported as `sp`, then:

>>> sp.info("gamma")

returns documentation on both the gamma probability distribution and the gamma function. But,

>>> sp.info(special.gamma)

only returns documentation for the gamma function.

## Library overview

The following table lists the sub-packages of `scipy` along with a brief description of each. The next section will give examples using some of the more common sub-packages.

| Sub-package | Description |
| --- | --- |
| `cluster` | Clustering algorithms |
| `constants` | Mathematical and physical constants |
| `fftpack` | Fourier transforms |
| `integrate` | Numerical integration |
| `interpolate` | Interpolation |
| `io` | Input and output |
| `linalg` | Linear algebra |
| `maxentropy` | Maximum entropy models |
| `misc` | Miscellaneous |
| `ndimage` | Multi-dimensional image processing |
| `odr` | Orthogonal distance regression |
| `optimize` | Optimization |
| `signal` | Signal processing |
| `sparse` | Sparse matrices |
| `spatial` | Spatial algorithms and data structures |
| `special` | Special functions |
| `stats` | Statistical functions |
| `stsci` | Image processing |
| `weave` | C/C++ integration |

## Examples

SciPy is huge. The draft SciPy Reference Guide is currently 632 pages. This article will only illustrate a tiny sampling of what you can do with SciPy, focusing on some of the more common applications.

### Special functions

The `special` sub-package contains mathematical functions beyond those included in the standard Python `math` module. The most commonly used special function is probably the gamma function, Γ(x). The following example shows how to access it from SciPy.

>>> from scipy.special import gamma
>>> gamma(0.5)
1.7724538509055159

SciPy contains a dozen functions related to the gamma function. For example, there is a separate function `gammaln` just to return the logarithm of the gamma function. This may seem redundant, but it is very practical. Since the gamma function grows very quickly, it can easily overflow, and so the logarithm of the gamma function is often more useful than the gamma function itself. Some of the other gamma-related functions in SciPy include the incomplete gamma function `gammainc`, the beta function `beta`, and the logarithmic derivative of the gamma function `psi`. Gamma functions are far from the only special functions included in SciPy. All the most commonly used special functions are included: Bessel functions, Fresnel integrals, etc.
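The overflow problem is easy to see in a short experiment. The following is a minimal sketch rather than a recommended recipe; the helper `log_binomial` is a hypothetical name introduced only for this illustration, not a SciPy function.

```python
import math

from scipy.special import gamma, gammaln

# gamma(x) passes the largest double (about 1.8e308) between x = 171 and
# x = 172, so a direct evaluation at x = 200 overflows to infinity.
direct = gamma(200.0)

# gammaln returns the logarithm of |gamma(x)|, which stays a modest number.
log_value = gammaln(200.0)


def log_binomial(n, k):
    """Logarithm of the binomial coefficient C(n, k), computed in log space."""
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)


# Exponentiate only at the end, when the result is known to be representable.
small_case = math.exp(log_binomial(10, 3))  # C(10, 3) = 120, up to rounding

print(direct, log_value, small_case)
```

Working in log space and exponentiating last is the usual reason to reach for `gammaln`: ratios of enormous gamma values become differences of moderate numbers.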

Some of the functions in `special` would not be classified as “special” in a mathematical sense. These are elementary functions that are included because they present numerical difficulties. For example, `scipy.special` contains a function `log1p` that evaluates log(1 + x). This function may seem useless at first: why not just use `math.log(1 + x)` instead of `log1p`? The problem is that in applications, you often need to evaluate log(1 + x) for small values of x. If x is sufficiently small (e.g., less than 10^{-16}), then 1 + x will equal 1 to machine precision, and so `math.log(1 + x)` will simply return log(1), which equals 0. The function `log1p` evaluates log(1 + x) indirectly without first computing 1 + x.
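The difference shows up as soon as x drops below machine precision. The value of `x` below is chosen only for illustration.

```python
import math

from scipy.special import log1p

x = 1e-20  # far below double-precision machine epsilon (about 2.2e-16)

# 1 + x rounds to exactly 1.0, so the naive form loses x entirely.
naive = math.log(1 + x)

# log1p works from x directly; for tiny x, log(1 + x) is approximately x.
accurate = log1p(x)

print(naive)     # 0.0
print(accurate)
```

The naive result has no correct digits at all, while `log1p` keeps essentially full relative accuracy.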

### Constants

The `constants` sub-package contains a wide variety of physical constants. The following code will display a few common constants:

>>> from scipy import constants
>>> constants.c # speed of light
299792458.0
>>> constants.h # Planck's constant
6.6260693000000002e-034
>>> constants.N_A # Avogadro's number
6.0221415000000003e+023

The dictionary `physical_constants` has descriptive strings as keys. The values are triples containing a constant's value, its units of measurement, and the uncertainty in the value. For example, the dictionary gives this information on the mass of an electron.

>>> constants.physical_constants["electron mass"]
(9.1093825999999998e-031, 'kg', 1.5999999999999999e-037)
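Because each entry is a triple, it unpacks naturally into three names. The sketch below also searches the dictionary keys with an ordinary comprehension; the variable names are illustrative, and the numbers printed depend on the data set shipped with the installed version of SciPy.

```python
from scipy import constants

# Each entry is a (value, unit, uncertainty) triple.
value, unit, uncertainty = constants.physical_constants["electron mass"]

print("electron mass = %g %s (+/- %g)" % (value, unit, uncertainty))

# Collect the keys that mention "electron mass"; the exact set of keys
# depends on the installed version of SciPy.
matches = sorted(key for key in constants.physical_constants
                 if "electron mass" in key)
print(matches[:5])
```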

In addition to physical constants, `constants` contains information on units. For example, `constants.nautical_mile` equals 1852, the number of meters in a nautical mile. And, in case you ever wondered, `constants.troy_ounce` equals 0.0311034768, the number of kilograms in a troy ounce. There is also support for SI and binary unit prefixes. For example, the SI prefix `constants.kilo` equals 10^{3} = 1000.0, and the binary prefix `constants.kibi` equals 2^{10} = 1024.
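Since the unit constants are plain numbers expressed in SI units, a conversion is just a multiplication or a division. The quantities below are made up for the example.

```python
from scipy import constants

# Multiply by a unit constant to convert into SI units; divide to convert back.
distance_nmi = 12.5                          # hypothetical distance, nautical miles
distance_m = distance_nmi * constants.nautical_mile

mass_kg = 0.25                               # hypothetical mass, kilograms
mass_ozt = mass_kg / constants.troy_ounce

# Prefixes are multipliers as well.
buffer_bytes = 64 * constants.kibi           # 64 KiB expressed in bytes
power_w = 1.5 * constants.kilo               # 1.5 kW expressed in watts

print(distance_m, mass_ozt, buffer_bytes, power_w)
```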

### Integration

The `integrate` sub-package contains several routines for numerical integration. The most commonly used routine is `quad` (named for “quadrature”, an old-fashioned name for integration). The first argument to `quad` is a function of one variable to integrate. For simple functions, it is convenient to use `lambda` to define the function inline, though of course integrands can be defined elsewhere using `def`. The `quad` function returns a pair: the value of the integral and an estimate of the error in the value. The following code integrates cos(e^{x}) between the limits of 2 and 3.

>>> import math
>>> from scipy import integrate
>>> integrate.quad(lambda x: math.cos(math.exp(x)), 2, 3)
(-0.063708480528704675, 2.4175070627010321e-014)
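To get a feel for the error estimate, it helps to integrate something whose answer is known. The sketch below defines the integrand with `def` and compares the estimate with the true error; the function name `integrand` is arbitrary.

```python
import math

from scipy import integrate


def integrand(x):
    """sin(x), whose integral from 0 to pi is exactly 2."""
    return math.sin(x)


value, error_estimate = integrate.quad(integrand, 0, math.pi)

# The exact answer is known here, so the true error can be computed directly.
true_error = abs(value - 2.0)
print(value, error_estimate, true_error)
```

For a smooth integrand like this one, the reported estimate is typically a conservative bound on the true error.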

To specify infinite limits of integration in `quad`, use the constant `scipy.inf` for ∞, as in the following example:

>>> from scipy import inf
>>> integrate.quad(lambda x: math.exp(-x*x), -inf, inf)
(1.7724538509055159, 1.4202636780944923e-008)
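An infinite limit is just a floating-point infinity, so any spelling of it should work. As a small sketch, assuming Python 3.5 or later (where `math.inf` exists), a one-sided infinite range looks like this:

```python
import math

from scipy import integrate

# math.inf is an ordinary floating-point infinity, so it can mark an
# unbounded limit of integration.
value, error_estimate = integrate.quad(lambda x: math.exp(-x), 0, math.inf)

# The exact value of this integral is 1.
print(value, error_estimate)
```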

The `integrate` sub-package contains other integration routines, such as `dblquad` for double integrals and `tplquad` for triple integrals. It also contains `odeint` for numerically solving systems of ordinary differential equations.
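As a rough illustration of two of these routines, the sketch below evaluates a double integral and solves a single differential equation, both with known exact answers. The problems and names are chosen only for the example. Note the argument order `dblquad` expects: the integrand takes the inner variable first.

```python
import math

import numpy as np
from scipy import integrate

# Double integral of x*y over 0 <= x <= 1, 0 <= y <= 2; the exact value is 1.
# The integrand is written f(y, x), and the inner limits are functions of x.
area_value, area_error = integrate.dblquad(
    lambda y, x: x * y,   # integrand, inner variable first
    0, 1,                 # limits for x (outer integral)
    lambda x: 0,          # lower limit for y
    lambda x: 2,          # upper limit for y
)


def decay(y, t):
    """Right-hand side of the equation dy/dt = -y."""
    return -y


# Solve dy/dt = -y with y(0) = 1 on [0, 1]; the exact solution is exp(-t).
times = np.linspace(0.0, 1.0, 11)
solution = integrate.odeint(decay, [1.0], times)   # one row per time point

print(area_value, solution[-1, 0], math.exp(-1.0))
```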

### Probability and statistics

The `stats` sub-package contains a wealth of functions for probability and statistics. The library currently features 81 continuous distributions and 12 discrete distributions. The following example shows how to compute the probability that a normal (Gaussian) random variable with mean 0 and standard deviation 3 takes on a value less than 5. It also shows how to generate five random samples from the same distribution.

>>> from scipy.stats import norm
>>> norm.cdf(5, 0, 3)
0.9522096477271853
>>> norm.rvs(0, 3, size=5)
array([ 4.85229537, 3.0104119 , 1.13189841, 5.19688369, -2.97970912])
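When the same parameters are used repeatedly, it can be tidier to fix them once. The sketch below "freezes" the distribution and also calls the inverse CDF and the survival function. It assumes a reasonably recent SciPy in which `rvs` accepts a `random_state` argument; on a version without it, drop that argument.

```python
from scipy.stats import norm

# "Freeze" the distribution so the parameters are given only once.
dist = norm(loc=0, scale=3)

p = dist.cdf(5)      # P(X < 5), the same value as norm.cdf(5, 0, 3)
x = dist.ppf(p)      # inverse CDF; recovers 5 up to rounding
tail = dist.sf(5)    # survival function, P(X > 5) = 1 - cdf

# random_state fixes the seed so repeated runs produce the same samples.
samples = dist.rvs(size=5, random_state=0)

print(p, x, tail, samples)
```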

See Probability distributions in SciPy for more examples of working with probability distributions.

The `stats` sub-package has simple functions such as `std` for computing the sample standard deviation of an array of numbers. It has more sophisticated functions such as `glm` for working with linear regression, analysis of variance, etc. It also contains functions for numerous statistical tests and common chores such as producing histograms.
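As one small example of a statistical test, the sketch below runs a one-sample t-test with `stats.ttest_1samp`. The data are made up. The sample standard deviation is computed here with NumPy's `std` rather than with `stats.std`; that substitution is an assumption made so the example does not depend on which functions a particular version of `stats` provides.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, made up for this example.
sample = np.array([2.1, 1.9, 2.4, 2.0, 2.3, 1.8])

# Sample standard deviation: ddof=1 divides by n - 1 rather than n.
sample_sd = np.std(sample, ddof=1)

# One-sample t-test of the null hypothesis that the population mean is 2.0.
t_statistic, p_value = stats.ttest_1samp(sample, 2.0)

print(sample_sd, t_statistic, p_value)
```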

Note that some statistical functionality is located outside the `stats` sub-package. For example, orthogonal distance regression support is contained in its own sub-package `odr`. Also, you might use the `linalg` sub-package in conjunction with `stats`.
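One place where `linalg` meets statistics is ordinary least squares. The following is a minimal sketch, not the only way to fit a line: it builds a design matrix by hand and solves it with `linalg.lstsq`. The data are constructed to lie exactly on a known line so the result is easy to check.

```python
import numpy as np
from scipy import linalg

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones for the intercept and a column for x.
design = np.column_stack([np.ones_like(x), x])

# lstsq returns the least-squares solution first, followed by diagnostics.
coefficients = linalg.lstsq(design, y)[0]
intercept, slope = coefficients

print(intercept, slope)
```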

## Further resources

The SciPy.org website has links to further documentation, including tutorials and cookbook examples.

The Mathesaurus is a sort of Rosetta stone for comparing mathematical environments such as SciPy, Matlab, R, etc. The resources on that site are useful even if you do not know one of the other packages and just want a cheat sheet for doing various tasks in Python (especially SciPy).

The IPython shell is a popular environment for working interactively with SciPy. Also, many SciPy users use matplotlib to create plots from either the standard Python command line or from IPython.