Game
The dataset was created and donated to the UCI ML Repository by John Tromp (tromp ‘@’ cwi.nl).
Samples total | 67557
Dimensionality | 42
Features | categorical
Targets | str: {“win”, “loss”, “draw”}
Task(s) | classification
Description
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced.
The symbol x represents the first player; o the second. Each instance records the state of the game as a position on a 6x7 grid board, which gives the 42 categorical features. The outcome class is the game-theoretical value for the first player.
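To make the board representation concrete, the following is a minimal sketch that arranges one flat 42-value position into a 6x7 grid. It rests on two assumptions that are not stated above and should be checked against the dataset's column names: that the features are ordered column by column from the bottom of each column upward, and that empty cells are marked with `b`. The helper names `to_grid` and `render` and the sample position are hypothetical.

```python
ROWS, COLS = 6, 7


def to_grid(features):
    """Arrange a flat 42-value position into 6 rows of 7 cells, top row first.

    Assumes column-major ordering: the first 6 values are the leftmost
    column from bottom to top, the next 6 the second column, and so on.
    """
    if len(features) != ROWS * COLS:
        raise ValueError("expected 42 cell values")
    # columns[c][r] is the cell in column c at height r (r = 0 is the bottom)
    columns = [features[c * ROWS:(c + 1) * ROWS] for c in range(COLS)]
    return [[columns[c][r] for c in range(COLS)] for r in reversed(range(ROWS))]


def render(features):
    """Return a printable board, using '.' for cells assumed blank ('b')."""
    return "\n".join(
        " ".join("." if cell == "b" else cell for cell in row)
        for row in to_grid(features)
    )


# Hypothetical 8-ply position: four x pieces and four o pieces
position = ["b"] * (ROWS * COLS)
position[0:2] = ["x", "o"]                # leftmost column
position[18:22] = ["x", "o", "x", "o"]    # middle column
position[36:38] = ["x", "o"]              # rightmost column

grid = to_grid(position)
print(render(position))
```

If the real column order differs, only the slicing inside `to_grid` needs to change.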
Example
Note that to use the game dataset, the categorical data in the features array must be encoded numerically. There are a number of numeric encoding mechanisms, such as sklearn.preprocessing.OrdinalEncoder or sklearn.preprocessing.OneHotEncoder, that may be used as follows:
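from sklearn.preprocessing import OneHotEncoder
from yellowbrick.datasets import load_game

# Load the categorical features and the win/loss/draw target
X, y = load_game()
# One-hot encode every feature; the result is a sparse matrix by default
X = OneHotEncoder().fit_transform(X)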
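To illustrate how the two encoders mentioned above differ, here is a small sketch on synthetic data, so it runs without downloading anything. The toy array and the symbol set `x`, `o`, `b` (with `b` standing for a blank cell) are assumptions for illustration; the real categories should be read from the loaded data.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Synthetic stand-in for the feature array: 20 positions, 42 cells each
rng = np.random.default_rng(0)
X_toy = rng.choice(["x", "o", "b"], size=(20, 42))

# Fix the categories per column so the output width does not depend on
# which symbols happen to appear in the toy sample
categories = [["b", "o", "x"]] * X_toy.shape[1]

# One-hot: one binary column per (feature, category) pair, sparse by default
X_onehot = OneHotEncoder(categories=categories).fit_transform(X_toy)

# Ordinal: one integer code per feature, same width as the input
X_ordinal = OrdinalEncoder(categories=categories).fit_transform(X_toy)

print(X_onehot.shape, X_ordinal.shape)
```

One-hot encoding avoids implying an order among the symbols, at the cost of three times as many columns; ordinal encoding is more compact but imposes an arbitrary ordering that some estimators will treat as meaningful.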
Citation
Downloaded from the UCI Machine Learning Repository on May 4, 2017.
Loader
- yellowbrick.datasets.loaders.load_game(data_home=None, return_dataset=False)
Load the Connect-4 game multivariate and spatial dataset that is well suited to multiclass classification tasks. The dataset contains 67557 instances with 42 categorical attributes and a discrete target.
Note that the game data is stored with categorical features that need to be numerically encoded before use with scikit-learn estimators. We recommend using sklearn.preprocessing.OneHotEncoder for this task and developing a Pipeline with this dataset.
The Yellowbrick datasets are hosted online and when requested, the dataset is downloaded to your local computer for use. Note that if the dataset hasn’t been downloaded before, an Internet connection is required. However, if the data is cached locally, no data will be downloaded. Yellowbrick checks the known signature of the dataset against the downloaded data to ensure the download completed successfully.
Datasets are stored alongside the code, but the location can be specified with the data_home parameter or the $YELLOWBRICK_DATA environment variable.
- Parameters
- data_home : str, optional
The path on disk where data is stored. If not passed in, it is looked up from $YELLOWBRICK_DATA or the default returned by get_data_home.
- return_dataset : bool, default=False
Return the raw dataset object instead of X and y numpy arrays to get access to alternative targets, extra features, content and meta.
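The lookup order described for `data_home` can be sketched in plain Python. This is not Yellowbrick's implementation: `resolve_data_home` and `DEFAULT_DATA_HOME` are hypothetical names, and the default path is a placeholder for whatever location the library itself chooses.

```python
import os
from pathlib import Path

# Placeholder only; the real default is decided by the library
DEFAULT_DATA_HOME = Path("default_data_home")


def resolve_data_home(data_home=None, environ=None):
    """Pick a data directory: explicit argument, then env var, then default."""
    environ = os.environ if environ is None else environ
    if data_home is not None:
        # An explicitly passed path always wins
        return Path(data_home)
    env_value = environ.get("YELLOWBRICK_DATA")
    if env_value:
        # Fall back to the environment variable when it is set and non-empty
        return Path(env_value)
    return DEFAULT_DATA_HOME


print(resolve_data_home("/tmp/explicit"))
print(resolve_data_home(environ={"YELLOWBRICK_DATA": "/tmp/from_env"}))
print(resolve_data_home(environ={}))
```

In practice this means either passing `data_home` to the loader or setting `$YELLOWBRICK_DATA` before calling it.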
- Returns
- X : array-like with shape (n_instances, n_features) if return_dataset=False
A pandas DataFrame or numpy array describing the instance features.
- y : array-like with shape (n_instances,) if return_dataset=False
A pandas Series or numpy array describing the target vector.
- dataset : Dataset instance if return_dataset=True
The Yellowbrick Dataset object provides an interface for accessing the data in a variety of formats, as well as the associated metadata and content.
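The recommendation to combine the encoder with a Pipeline could look roughly like the sketch below. The choice of LogisticRegression, the helper name `build_game_pipeline`, and the synthetic arrays are all assumptions made so the example runs offline; with the real data, the toy arrays would be replaced by the X and y returned from load_game().

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def build_game_pipeline():
    """Encode categorical features, then fit a multiclass classifier."""
    return Pipeline([
        # Ignore symbols at predict time that were not seen during fit
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Synthetic stand-in for the dataset: 60 positions with random labels
rng = np.random.default_rng(42)
X_toy = rng.choice(["x", "o", "b"], size=(60, 42))
y_toy = rng.choice(["win", "loss", "draw"], size=60)

model = build_game_pipeline().fit(X_toy, y_toy)
predictions = model.predict(X_toy[:5])
print(predictions)
```

Keeping the encoder inside the pipeline means it is fitted only on the training portion during cross-validation, which avoids leaking information from held-out data.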