Python API

The Iris flower data set, or Fisher’s Iris data set, is a multivariate data set introduced by Sir Ronald Aylmer Fisher (1936) as an example of discriminant analysis. It is sometimes called Anderson’s Iris data set because Edgar Anderson collected the data to quantify the geographic variation of Iris flowers in the Gaspé Peninsula.

For more information: http://en.wikipedia.org/wiki/Iris_flower_data_set

References

1. Fisher, R.A. “The use of multiple measurements in taxonomic problems”, Annals of Eugenics, 7, Part II, 179-188 (1936); also in “Contributions to Mathematical Statistics” (John Wiley, NY, 1950).

2. Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis. (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.

3. Dasarathy, B.V. (1980) “Nosing Around the Neighborhood: A New System Structure and Classification Rule for Recognition in Partially Exposed Environments”. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 1, 67-71.

4. Gates, G.W. (1972) “The Reduced Nearest Neighbor Rule”. IEEE Transactions on Information Theory, May 1972, 431-433.

bob.db.iris.names = ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width']

Names of the features for each entry in the dataset.

bob.db.iris.stats = {'Petal Length': [1.0, 6.9, 3.76, 1.76, 0.949], 'Petal Width': [0.1, 2.5, 1.2, 0.76, 0.9565], 'Sepal Length': [4.3, 7.9, 5.84, 0.83, 0.7826], 'Sepal Width': [2.0, 4.4, 3.05, 0.43, -0.4194]}

These are basic statistics for each of the features in the whole dataset.

bob.db.iris.stat_names = ['Minimum', 'Maximum', 'Mean', 'Std.Dev.', 'Correlation']

These are the names of the statistics, in the order in which they appear in each list of the stats variable.
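Because each list in stats follows the order of stat_names, the two can be zipped together to look up a statistic by name. The sketch below copies the values shown above into local variables so that it runs without the package installed; labelled_stats is a hypothetical helper name, not part of bob.db.iris.

```python
# Values copied from the bob.db.iris.stat_names and bob.db.iris.stats
# attributes documented above, so this runs without the package.
stat_names = ['Minimum', 'Maximum', 'Mean', 'Std.Dev.', 'Correlation']
stats = {
    'Petal Length': [1.0, 6.9, 3.76, 1.76, 0.949],
    'Petal Width': [0.1, 2.5, 1.2, 0.76, 0.9565],
    'Sepal Length': [4.3, 7.9, 5.84, 0.83, 0.7826],
    'Sepal Width': [2.0, 4.4, 3.05, 0.43, -0.4194],
}

# Hypothetical helper structure: pair every value with its statistic name.
labelled_stats = {
    feature: dict(zip(stat_names, values))
    for feature, values in stats.items()
}

petal_length_mean = labelled_stats['Petal Length']['Mean']
sepal_width_min = labelled_stats['Sepal Width']['Minimum']
```

With the package available, the same dictionary comprehension should work on bob.db.iris.stats and bob.db.iris.stat_names directly.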

bob.db.iris.data()

Loads Fisher’s Iris Dataset from a text file and returns it.

This set is small and simple enough not to require an SQL backend. We keep its single file as text and load it on the fly every time this method is called.

We return a dictionary containing the 3 classes of Iris plants catalogued in this dataset. Each dictionary entry is a 2D numpy.ndarray of 64-bit floats with 50 rows, one per sample. Each row holds the 4 features described by “names”.
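A common next step with a dictionary of this shape is to stack the per-class arrays into one feature matrix and one label vector. The following is a minimal sketch under stated assumptions: it uses a zero-filled stand-in with the documented layout (3 classes, each a 50 x 4 float64 array) instead of calling bob.db.iris.data(), and the keys class_a, class_b, class_c and the function stack_with_labels are hypothetical names chosen for illustration.

```python
import numpy as np

# Feature order copied from bob.db.iris.names above.
names = ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width']


def stack_with_labels(dataset):
    """Stack a {class_name: (n, 4) array} mapping into X and y arrays."""
    class_names = sorted(dataset)
    X = np.vstack([dataset[c] for c in class_names])
    # One integer label per row, following the sorted class order.
    y = np.concatenate(
        [np.full(len(dataset[c]), i) for i, c in enumerate(class_names)]
    )
    return X, y, class_names


# Stand-in with the documented layout. The keys and the zero values are
# placeholders, not real measurements; with the package installed you
# would pass the result of bob.db.iris.data() instead.
dataset = {
    key: np.zeros((50, 4), dtype=np.float64)
    for key in ('class_a', 'class_b', 'class_c')
}

X, y, class_names = stack_with_labels(dataset)

# Select one feature column by name rather than by a hard-coded index.
petal_length = X[:, names.index('Petal Length')]
```

Looking up the column through names keeps the code correct even if the feature order is not memorized.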

bob.db.iris.get_config()

Returns a string containing the configuration information.