This package describes the database API for Bob. Database APIs establish how your programs can query for file lists using known pre-coded protocols that assure reproducibility. This package contains only the base API for you to create and distribute new databases and a single, very simple example using the publicly available Iris Flower Dataset.
Building a database package for Bob is much like building a satellite package. For examples and details, have a look at our satellite package portal.
The db package contains simplified APIs to access data for various databases that can be used in Biometry, Machine Learning or Pattern Classification.
Some utilities shared by many of the databases.
Bases: object
An object that handles the connection to SQLite databases.
Initializes the connector
Keyword arguments
Creates a read-only session to an SQLite database. If read-only sessions are not supported by the underlying sqlite3 python DB driver, then a normal session is returned. A warning is emitted in case the underlying filesystem does not support locking properly.
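The fallback behaviour described above can be illustrated with the standard library alone. The following is a minimal sketch, not the connector's actual implementation: it uses the plain sqlite3 module rather than a session object, and the names open_readonly and demo are hypothetical. It assumes a Python build whose sqlite3 module accepts URI filenames.

```python
import os
import pathlib
import sqlite3
import tempfile


def open_readonly(path):
    """Open an SQLite file read-only, or normally if that is unsupported.

    Hypothetical helper: this only mirrors the documented fallback idea.
    """
    # as_uri() yields a percent-encoded "file:///..." URI; mode=ro asks
    # SQLite itself to refuse any write on this connection.
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    try:
        return sqlite3.connect(uri, uri=True)
    except sqlite3.NotSupportedError:
        # The underlying SQLite library lacks URI support: fall back to
        # a normal read-write connection.
        return sqlite3.connect(path)


def demo():
    """Create a tiny database, reopen it read-only and try to write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.sql3")
        rw = sqlite3.connect(path)
        rw.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, path TEXT)")
        rw.execute("INSERT INTO file (path) VALUES ('a/b')")
        rw.commit()
        rw.close()

        ro = open_readonly(path)
        rows = ro.execute("SELECT path FROM file").fetchall()
        try:
            ro.execute("INSERT INTO file (path) VALUES ('c/d')")
            wrote = True
        except sqlite3.OperationalError:
            # SQLite rejects writes on a connection opened with mode=ro.
            wrote = False
        ro.close()
        return rows, wrote
```

Calling demo() reads the stored row back and reports whether the write attempt went through on the read-only connection.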
Raises a NotImplementedError if the dbtype is not supported.
Creates an engine connected to an SQLite database with no locks. If engines without locks are not supported by the underlying sqlite3 python DB driver, then a normal engine is returned. A warning is emitted if the underlying filesystem does not support locking properly in this case.
Raises a NotImplementedError if the dbtype is not supported.
Creates a session to an SQLite database with no locks. If sessions without locks are not supported by the underlying sqlite3 python DB driver, then a normal session is returned. A warning is emitted if the underlying filesystem does not support locking properly in this case.
Raises a NotImplementedError if the dbtype is not supported.
This module defines, among other less important constructions, a management interface that can be used by Bob to display information about the database and manage installed files.
Bases: object
Base manager for Bob databases
Returns a python iterable with all auxiliary files needed.
The values should be given relative to the location of the python file that declares the database.
Returns the type of auxiliary files you have for this database
If you return ‘sqlite’, then we append special actions such as ‘dbshell’ on ‘bob_dbmanage.py’ automatically for you. Otherwise, we don’t.
If you use auxiliary text files, just return ‘text’. We may provide special services for those types in the future.
Use the special name ‘builtin’ if this database is an integral part of Bob.
Sets up the base parser for this database.
Keyword arguments:
Returns a subparser, ready for commands to be added to it.
Adds commands to a given (argparse) parser.
This method allows you to define special commands that your database can perform when called from the common driver, for example create or checkfiles.
You are not obliged to override this method. If you do, you can define your own commands. You don’t have to worry about stock commands such as files or version. They are hooked in automatically, depending on the values you return for type() and files().
Keyword arguments
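To make the command mechanism concrete, here is a minimal sketch built on argparse only. It does not use the base manager class: the functions add_commands, checkfiles and build_parser, the "manage" program name and the --directory option are all hypothetical and only illustrate how a database-specific command can be attached to a subparser.

```python
import argparse


def checkfiles(args):
    """Hypothetical command body: report which directory would be checked."""
    return "checking files under %s" % args.directory


def add_commands(subparsers):
    """Register a 'checkfiles' command on a set of argparse subparsers."""
    parser = subparsers.add_parser("checkfiles", help="check database files")
    parser.add_argument("-d", "--directory", default=".",
                        help="root directory holding the data files")
    # Store the callable so the driver can dispatch with args.func(args).
    parser.set_defaults(func=checkfiles)


def build_parser():
    """Build a toy driver: <prog> <database> <command> [options]."""
    top = argparse.ArgumentParser(prog="manage")
    databases = top.add_subparsers(dest="database")
    iris = databases.add_parser("iris", help="Iris flower dataset")
    add_commands(iris.add_subparsers(dest="command"))
    return top
```

With this layout, build_parser().parse_args(["iris", "checkfiles", "-d", "/data"]) yields a namespace whose func attribute is the command to run.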
The Iris flower data set or Fisher’s Iris data set is a multivariate data set introduced by Sir Ronald Aylmer Fisher (1936) as an example of discriminant analysis. It is sometimes called Anderson’s Iris data set because Edgar Anderson collected the data to quantify the geographic variation of Iris flowers in the Gaspé Peninsula.
For more information: http://en.wikipedia.org/wiki/Iris_flower_data_set
References:
1. Fisher, R.A. “The use of multiple measurements in taxonomic problems”, Annals of Eugenics, 7, Part II, 179-188 (1936); also in “Contributions to Mathematical Statistics” (John Wiley, NY, 1950).
2. Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis. (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
3. Dasarathy, B.V. (1980) “Nosing Around the Neighborhood: A New System Structure and Classification Rule for Recognition in Partially Exposed Environments”. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 1, 67-71.
4. Gates, G.W. (1972) “The Reduced Nearest Neighbor Rule”. IEEE Transactions on Information Theory, May 1972, 431-433.
Names of the features for each entry in the dataset.
These are basic statistics for each of the features in the whole dataset.
These are the statistics available in each column of the stats variable.
Loads from (text) file and returns Fisher’s Iris Dataset.
This set is small and simple enough not to require an SQL backend. We keep the single file it has in text and load it on-the-fly every time this method is called.
We return a dictionary containing the 3 classes of Iris plants catalogued in this dataset. Each dictionary entry contains a 2D numpy.ndarray of 64-bit floats with 50 rows. Each row is an array with 4 features as described by “names”.
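The returned structure can be sketched as follows. This is an illustration under assumptions, not the package's loader: it assumes a comma-separated text layout with four numeric columns followed by a class label (the layout of the commonly distributed iris.data file), it parses a four-line sample instead of the full 150-entry file, and the names SAMPLE and load_iris are hypothetical.

```python
import io

import numpy

# A few rows in the assumed "f1,f2,f3,f4,label" text layout.
SAMPLE = """\
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def load_iris(stream):
    """Parse text rows into a dict mapping class label to a 2D float64 array."""
    rows = {}
    for line in stream:
        fields = line.strip().split(",")
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        features = [float(value) for value in fields[:4]]
        rows.setdefault(fields[4], []).append(features)
    # One (entries x 4) array of 64-bit floats per class.
    return {label: numpy.array(values, dtype="float64")
            for label, values in rows.items()}


data = load_iris(io.StringIO(SAMPLE))
```

On the full dataset each array would have 50 rows; with the sample above, the array for "Iris-setosa" has two rows and four columns.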