Accord.MachineLearning
Learning algorithm for decision stumps.
The algorithm is mostly intended to be used to create
weak classifiers in the context of an AdaBoost
learning algorithm. Please refer to the AdaBoost class for more examples
on using the classifier in this scenario. A simple example is shown below:
It is also possible to use this algorithm as a standalone learning algorithm.
An example is given below:
Gets or sets the model being trained.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Contains Boosting related techniques for creating classifier ensembles and other composition models.
The namespace class diagram is shown below.
Adapter for models that do not implement a .Decide function.
The type for the weak classifier model.
Gets or sets the weak decision model.
Gets or sets the decision function used by the weak decision model.
Creates a new Weak classifier given a
classification model and its decision function.
The classifier.
The classifier decision function.
Computes the classifier decision for a given input.
The input vector.
The model's decision label.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Simple classifier based on decision margins that
are perpendicular to one of the space dimensions.
The classifier is mostly intended to be used as a weak classifier
in the context of an AdaBoost learning algorithm. Please refer to the
AdaBoost class for more examples on using the classifier in this scenario.
A simple example is shown below:
It is also possible to use the decision stump as a standalone classifier through
its learning algorithm. An example is given below:
Initializes a new instance of the class.
Gets the decision threshold for this linear classifier.
Gets the index of the attribute which this
classifier will use to compare against
the decision threshold.
Gets or sets the comparison to be performed.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Initializes a new instance of the class.
The number of inputs for this classifier.
Gets the direction of the comparison
(if greater than or less than).
Computes the output class label for a given input.
The input vector.
The most likely class label for the given input.
Teaches the stump classifier to recognize
the class labels of the given input samples.
The input vectors.
The class labels corresponding to each input vector.
The weights associated with each input vector.
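The stump classifier and its weighted training described above can be illustrated with a minimal Python sketch. This is not the library's API: the names `Stump` and `learn_stump` are hypothetical, and the exhaustive search over midpoint thresholds is an assumption about how such a learner could work.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Stump:
    index: int        # attribute (column) compared against the threshold
    threshold: float  # decision threshold
    greater: bool     # True: decide positive when x[index] > threshold

    def decide(self, x: Sequence[float]) -> bool:
        value = x[self.index]
        return value > self.threshold if self.greater else value < self.threshold


def learn_stump(inputs, labels, weights=None) -> Stump:
    """Pick the (index, threshold, direction) with the lowest weighted error."""
    n = len(inputs)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights when none are given
    best, best_error = None, float("inf")
    for index in range(len(inputs[0])):
        values = sorted({x[index] for x in inputs})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for threshold in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for greater in (True, False):
                stump = Stump(index, threshold, greater)
                error = sum(w for x, y, w in zip(inputs, labels, weights)
                            if stump.decide(x) != y)
                if error < best_error:
                    best, best_error = stump, error
    if best is None:
        raise ValueError("no attribute has two distinct values to split on")
    return best


inputs = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
labels = [False, False, True, True]
stump = learn_stump(inputs, labels)
print(stump)  # the first attribute separates the two classes at 2.5
```

Passing non-uniform weights makes the learner favor splits that classify the heavily weighted samples correctly, which is what a boosting algorithm relies on.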
Model construction (fitting) delegate.
The type of the model to be created.
The current weights for the input samples.
A model trained over the weighted samples.
Extra parameters that can be passed to AdaBoost's model learning function.
AdaBoost learning algorithm.
The type of the model to be trained.
AdaBoost, short for "Adaptive Boosting", is a machine learning meta-algorithm
formulated by Yoav Freund and Robert Schapire who won the Gödel Prize in 2003
for their work. It can be used in conjunction with many other types of learning
algorithms to improve their performance. The output of the other learning algorithms
('weak learners') is combined into a weighted sum that represents the final output of
the boosted classifier. AdaBoost is adaptive in the sense that subsequent weak learners
are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost
is sensitive to noisy data and outliers. In some problems it can be less susceptible to
the overfitting problem than other learning algorithms. The individual learners can be
weak, but as long as the performance of each one is slightly better than random guessing
(e.g., their error rate is smaller than 0.5 for binary classification), the final model
can be proven to converge to a strong learner.
Every learning algorithm will tend to suit some problem types better than others, and will
typically have many different parameters and configurations to be adjusted before achieving
optimal performance on a dataset. AdaBoost (with decision trees as the weak learners) is often
referred to as the best out-of-the-box classifier. When used with decision tree learning,
information gathered at each stage of the AdaBoost algorithm about the relative 'hardness'
of each training sample is fed into the tree growing algorithm such that later trees tend
to focus on harder-to-classify examples.
References:
-
Wikipedia contributors. "AdaBoost." Wikipedia, The Free Encyclopedia. Wikipedia, The
Free Encyclopedia, 10 Aug. 2017. Web. 7 Sep. 2017
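The reweighting scheme described above can be sketched in a few lines of Python. This is a minimal illustration of discrete AdaBoost for labels in {-1, +1}, using single-feature threshold rules as weak learners; the function names (`fit_stump`, `adaboost`, `predict`) are hypothetical and do not correspond to the library's classes.

```python
import math


def fit_stump(xs, ys, ws):
    """Weak learner: best single-feature threshold rule under weights ws.
    Returns (weighted_error, index, threshold, sign)."""
    best = None
    for index in range(len(xs[0])):
        for threshold in sorted({x[index] for x in xs}):
            for sign in (+1, -1):
                error = sum(w for x, y, w in zip(xs, ys, ws)
                            if (sign if x[index] >= threshold else -sign) != y)
                if best is None or error < best[0]:
                    best = (error, index, threshold, sign)
    return best


def stump_predict(stump, x):
    _, index, threshold, sign = stump
    return sign if x[index] >= threshold else -sign


def adaboost(xs, ys, rounds=10):
    """Discrete AdaBoost; returns a list of (alpha, stump) pairs."""
    n = len(xs)
    ws = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(xs, ys, ws)
        error = stump[0]
        if error >= 0.5:           # no better than random guessing: stop
            break
        error = max(error, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, stump))
        # Raise the weights of misclassified samples, lower the others.
        ws = [w * math.exp(-alpha * y * stump_predict(stump, x))
              for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# No single threshold separates these labels, but three boosted stumps do.
xs = [[0], [1], [2], [3], [4], [5]]
ys = [+1, +1, -1, -1, +1, +1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

Each round fits the weak learner on the current sample weights, so later stumps concentrate on the samples that earlier ones got wrong.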
Initializes a new instance of the class.
Initializes a new instance of the class.
The model to be learned.
Initializes a new instance of the class.
The model to be learned.
The model fitting function.
Gets or sets the model being trained.
Gets or sets the maximum number of iterations
performed by the learning algorithm.
Please use MaxIterations instead.
Gets or sets the relative tolerance used to
detect convergence of the learning algorithm.
Gets or sets the error limit before learning stops. Default is 0.5.
Gets or sets the fitting function which creates
and trains a model given a weighted data set.
Gets or sets a function that takes a set of parameters and creates
a learning algorithm for learning each stage of the boosted classifier.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Runs the learning algorithm.
The input samples.
The corresponding output labels.
The classifier error.
Runs the learning algorithm.
The input samples.
The corresponding output labels.
The weights for each of the samples.
The classifier error.
Computes the error ratio, the number of
misclassifications divided by the total
number of samples in a dataset.
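As a small illustration of the error ratio defined above, the following Python sketch counts misclassifications and divides by the number of samples. The function name `error_ratio` and the callable-based classifier are assumptions made for the example.

```python
def error_ratio(classify, inputs, labels):
    """Fraction of samples whose predicted label differs from the expected one."""
    if not inputs:
        raise ValueError("the dataset must contain at least one sample")
    misses = sum(1 for x, y in zip(inputs, labels) if classify(x) != y)
    return misses / len(inputs)


# A toy classifier that predicts True for positive first coordinates.
inputs = [[1], [-1], [2], [-3]]
labels = [True, True, True, False]
ratio = error_ratio(lambda x: x[0] > 0, inputs, labels)
print(ratio)  # 0.25: one of the four samples is misclassified
```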
Weighted Weak Classifier.
The type of the weak classifier.
Weighted Weak Classifier.
The type of the weak classifier.
The type of the input vectors accepted by the classifier.
Gets or sets the weight associated
with the weak classifier.
Gets or sets the weak
classifier associated with the weight.
Boosted classification model.
The type of the weak classifier.
Initializes a new instance of the class.
Initializes a new instance of the class.
The initial boosting weights.
The initial weak classifiers.
Computes the output class label for a given input.
The input vector.
The most likely class label for the given input.
Boosted classification model.
The type of the weak classifier.
The type of the input vectors accepted by the classifier.
Initializes a new instance of the class.
Initializes a new instance of the class.
The initial boosting weights.
The initial weak classifiers.
Boosted classification model.
The type of the weak classifier.
The type of the weighted classifier.
The type of the input vectors accepted by the classifier.
Gets the list of weighted weak models
contained in this boosted classifier.
Initializes a new instance of the class.
Initializes a new instance of the class.
The initial boosting weights.
The initial weak classifiers.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Adds a new weak classifier and its corresponding
weight to the end of this boosted classifier.
The weight of the weak classifier.
The weak classifier.
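A boosted classifier of this kind can be pictured as a list of weighted weak classifiers whose votes are summed. The Python sketch below assumes a binary weighted vote (each weak classifier contributes plus or minus its weight); the class name `BoostedClassifier` is hypothetical, and the library's actual decision rule may differ.

```python
class BoostedClassifier:
    """Hypothetical container of (weight, weak classifier) pairs."""

    def __init__(self):
        self.models = []  # list of (weight, classifier) tuples

    def add(self, weight, classifier):
        """Append a weak classifier and its weight to the end of the ensemble."""
        self.models.append((weight, classifier))

    def decide(self, x):
        """Weighted vote: a True vote adds the weight, a False vote subtracts it."""
        score = sum(w if clf(x) else -w for w, clf in self.models)
        return score > 0


boost = BoostedClassifier()
boost.add(0.4, lambda x: x[0] > 0)
boost.add(0.7, lambda x: x[1] > 0)
boost.add(0.5, lambda x: x[0] + x[1] > 1)

print(boost.decide([1, -2]))  # False: only the first classifier votes True
print(boost.decide([3, -1]))  # True: the first and third outvote the second
```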
Gets or sets the weighted weak classifier at the specified index.
Returns an enumerator that iterates through this collection.
An object that can be used to iterate through the collection.
Returns an enumerator that iterates through this collection.
An object that can be used to iterate through the collection.
Common interface for Weak classifiers
used in Boosting mechanisms.
Computes the output class label for a given input.
The input vector.
The most likely class label for the given input.
Data cluster.
Initializes a new instance of the class.
The collection that contains this instance as a field.
The number of clusters K.
The distance metric to consider.
Randomizes the clusters inside a dataset.
The data to randomize the algorithm.
The seeding strategy to be used. Default is .
Randomizes the clusters inside a dataset.
The data to randomize the algorithm.
The seeding strategy to be used. Default is .
The parallelization options for this procedure.
Only relevant for the .
Data cluster.
Gets the cluster's centroid.
Computes the distortion of the cluster, measured
as the average distance between the cluster points
and its centroid.
The input points.
The average distance between all points
in the cluster and the cluster centroid.
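The distortion measure described above can be sketched as follows in Python, assuming the Euclidean distance (the actual distance depends on the metric configured for the cluster collection). The function name `distortion` is chosen for the example.

```python
import math


def distortion(points, centroid):
    """Average Euclidean distance between the cluster points and the centroid."""
    if not points:
        raise ValueError("the cluster must contain at least one point")
    return sum(math.dist(p, centroid) for p in points) / len(points)


# Two points at distance 5 from the centroid and one point on it.
value = distortion([[0, 0], [6, 8], [3, 4]], [3, 4])
print(value)
```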
Initializes a new instance of the class.
The collection that contains this instance as a field.
The number of clusters K.
The distance metric to consider.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
Gets or sets the clusters' centroids.
The clusters' centroids.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the less the average distance.
The average square distance from the data points to the nearest
clusters' centroids.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
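The distance-based transformation and the average square distance described above can be illustrated with a short Python sketch. It assumes the Euclidean distance and ignores sample weights; `transform` and `average_square_distance` are names chosen for the example, not the library's signatures.

```python
import math


def transform(points, centroids):
    """Map each point to the vector of its distances to every centroid."""
    return [[math.dist(p, c) for c in centroids] for p in points]


def average_square_distance(points, centroids):
    """Mean squared distance from each point to its nearest centroid."""
    nearest = [min(row) for row in transform(points, centroids)]
    return sum(d * d for d in nearest) / len(points)


centroids = [[0, 0], [10, 0]]
points = [[3, 4], [10, 5]]
features = transform(points, centroids)
score = average_square_distance(points, centroids)
print(features)  # one row per point, one column per cluster
print(score)
```

A lower average square distance indicates that the points sit closer to their nearest centroids.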
Common interface for collections of clusters.
Gets the number of clusters in the collection.
Gets the collection of clusters currently modeled by the clustering algorithm.
Gets the proportion of samples in each cluster.
Gets the cluster at the given index.
The index of the cluster. This should also be the class label of the cluster.
An object holding information about the selected cluster.
Common interface for clusters that contain centroids, where the centroid data type might be different
from the data type of the data being clustered.
Gets or sets the clusters' centroids.
The clusters' centroids.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the less the average distance.
The average square distance from the data points to the nearest
clusters' centroids.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Common interface for clusters that contain centroids which are of the same data type
as the clustered data.
Balanced K-Means algorithm. Note: The balanced clusters will be
available in the Labels property of this instance!
The Balanced k-Means algorithm attempts to find a clustering where each cluster
has approximately the same number of data points. The Balanced k-Means implementation
used in the framework solves an assignment problem to enforce balance between the clusters.
Note: the learning method of this class will
return the centroids of balanced clusters, but please note that these centroids
cannot be used to obtain balanced clusterings for another (or even the same) data
set. Instead, in order to inspect the balanced clustering that has been obtained
after calling it, please take a look at the
contents of the Labels property.
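To illustrate why balanced labels differ from nearest-centroid labels, the Python sketch below solves the balanced assignment step by brute force. This is only a stand-in for a proper assignment-problem solver (such as the Hungarian method) and is viable only for tiny inputs; `balanced_assign` is a hypothetical name, and the sketch assumes the number of points is a multiple of the number of clusters.

```python
from itertools import permutations


def balanced_assign(points, centroids):
    """Assign points to clusters so every cluster gets the same number of points,
    minimizing the total squared distance. Brute force over slot permutations."""
    n, k = len(points), len(centroids)
    if n % k != 0:
        raise ValueError("this sketch assumes n is a multiple of k")
    # Each cluster owns n // k slots; a permutation maps points to slots.
    slots = [c for c in range(k) for _ in range(n // k)]

    def cost(i, c):
        return sum((a - b) ** 2 for a, b in zip(points[i], centroids[c]))

    best = min(permutations(slots),
               key=lambda perm: sum(cost(i, c) for i, c in enumerate(perm)))
    return list(best)


points = [[0], [1], [2], [10]]
centroids = [[1], [10]]
labels = balanced_assign(points, centroids)
print(labels)  # [0, 0, 1, 1], although the point at 2 is nearer to the first centroid
```

Because the point at 2 is forced into the second cluster, assigning new data to the nearest centroid would not reproduce this balanced labeling, which is why the balanced labels have to be read from the learner itself.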
References:
-
M. I. Malinen and P. Fränti, "Balanced K-means for Clustering", Joint Int. Workshop on Structural, Syntactic,
and Statistical Pattern Recognition (S+SSPR 2014), LNCS 8621, 32-41, Joensuu, Finland, August 2014.
-
M. I. Malinen, "New alternatives for k-Means clustering." PhD thesis. Available in:
http://cs.uef.fi/sipu/pub/PhD_Thesis_Mikko_Malinen.pdf
How to perform clustering with Balanced K-Means.
Gets the labels assigned for each data point in the last
call to the learning method.
The labels.
Initializes a new instance of the Balanced K-Means algorithm.
The number of clusters to divide the input data into.
The distance function to use. Default is to
use the distance.
Initializes a new instance of the Balanced K-Means algorithm.
The number of clusters to divide the input data into.
Learns a model that can map the given inputs to the desired outputs. Note:
the model created by this function will not be able to produce balanced
clusterings. To retrieve the balanced labels, check the Labels
property of this class after calling this function.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Fast k-means clustering algorithm.
The Mini-Batch K-Means clustering algorithm is a modification of the K-Means
algorithm.
In each iteration, it uses only a portion of data to update the cluster centroids with the gradient step.
The subsets of data are called mini-batches and are randomly sampled from the whole dataset in each iteration.
Mini-Batch K-Means is faster than k-means for large datasets since batching reduces computational time of the algorithm.
The algorithm is composed of the following steps:
-
Place K points into the space represented by the objects that are
being clustered. These points represent initial group centroids.
-
Form a batch by choosing B objects from the whole input dataset.
For each object in the batch, determine the group that has the closest centroid.
Then, update the centroid with a gradient step.
-
Repeat step 2 until the centroids converge or the maximal number of iterations has been performed.
References:
-
D. Sculley. Web-Scale K-Means Clustering. Available on:
https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf
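The steps listed above can be sketched in Python as follows. The per-centroid learning rate (one over the number of points the centroid has absorbed so far) follows the scheme in the cited paper; the function names, the random initialization from the data, and the fixed iteration count in place of a convergence test are assumptions made for brevity.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mini_batch_kmeans(data, k, batch_size, iterations, seed=0):
    rng = random.Random(seed)
    # Step 1: pick K initial centroids from the data.
    centroids = [list(p) for p in rng.sample(data, k)]
    counts = [0] * k
    # Step 3: repeat step 2 (here for a fixed number of iterations).
    for _ in range(iterations):
        # Step 2: sample a mini-batch and find the nearest centroid of each object.
        batch = rng.sample(data, batch_size)
        nearest = [min(range(k), key=lambda c: sq_dist(x, centroids[c]))
                   for x in batch]
        for x, c in zip(batch, nearest):
            counts[c] += 1
            rate = 1.0 / counts[c]  # per-centroid learning rate
            # Gradient step: move the centroid towards the object.
            centroids[c] = [(1 - rate) * m + rate * v
                            for m, v in zip(centroids[c], x)]
    return centroids


data = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
centroids = mini_batch_kmeans(data, k=2, batch_size=3, iterations=50)
print(centroids)
```

Since every update is a convex combination of the previous centroid and a data point, the centroids always remain inside the range spanned by the data.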
Gets the labels assigned for each data point in the last
call to .
The labels.
Initializes a new instance of the Mini-Batch K-Means algorithm.
The number of clusters to divide the input data into.
The size of batches.
The distance function to use. Default is to
use the Euclidean distance.
Initializes a new instance of the Mini-Batch K-Means algorithm.
The number of clusters to divide the input data into.
The size of batches.
Gets or sets the size of the batch used during initialization.
Gets or sets the number of different initializations of the centroids.
Gets or sets the size of batches.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Creates a random batch.
The size of the model input dataset.
The size of the batch.
An array of indices of the input objects which the created batch contains.
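A random batch of this kind can be drawn as in the Python sketch below. Sampling without replacement is an assumption, since the text does not say whether an index may repeat within a batch; `create_random_batch` is a hypothetical name.

```python
import random


def create_random_batch(data_size, batch_size, rng=random):
    """Return batch_size distinct indices drawn uniformly from range(data_size)."""
    if batch_size > data_size:
        raise ValueError("the batch cannot be larger than the dataset")
    return rng.sample(range(data_size), batch_size)


batch = create_random_batch(100, 10, random.Random(1))
print(batch)  # ten distinct indices between 0 and 99
```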
Initializes the centroids.
The model inputs.
The weight of importance for each input sample.
k-Medoids cluster collection.
k-Medoids' cluster.
Initializes a new instance of the class.
The number of clusters K.
The distance metric to use.
Gets or sets the clusters' centroids.
The clusters' centroids.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
The distance.
Gets the collection of clusters currently modeled by the clustering algorithm.
The clusters.
Gets the proportion of samples in each cluster.
Gets the number of clusters in the collection.
The count.
Gets the at the specified index.
The index.
GaussianCluster.
Randomizes the clusters inside a dataset.
The data to randomize the algorithm.
The seeding strategy to be used. Default is .
The parallelization options for this procedure.
Only relevant for the .
Array of point indices, if clusters were bound to points, null otherwise.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The data.
The labels.
The weights.
The average square distance from the data points to the nearest
clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the less the average distance.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Computes a numerical score measuring the association between
the given vector and a given class.
The input vector.
The index of the class whose score will be computed.
System.Double.
k-Medoids clustering using PAM (Partition Around Medoids) algorithm.
From Wikipedia:
The k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift
algorithm. Both the k-means and k-medoids algorithms are partitional (breaking the dataset up into groups)
and both attempt to minimize the distance between points labeled to be in a cluster and a point designated
as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses datapoints as centers
(medoids or exemplars) and works with a generalization of the Manhattan Norm to define distance between
datapoints instead of L2. This method was proposed in 1987[1] for the work with L1 norm and other distances.
The most common realisation of k-medoid clustering is the Partitioning Around Medoids (PAM) algorithm.
PAM uses a greedy search which may not find the optimum solution, but it is faster than exhaustive search.
[1] Kaufman, L. and Rousseeuw, P.J. (1987), Clustering by means of Medoids, in Statistical Data Analysis
Based on the L1–Norm and Related Methods, edited by Y. Dodge, North-Holland, 405–416.
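The greedy swap search behind PAM can be sketched in Python as follows. This is a simplified first-improvement variant, not the library's implementation: it starts from the first k points as medoids (an assumption) and keeps swapping a medoid with a non-medoid whenever the total cost drops. The names `pam` and `total_cost` are chosen for the example.

```python
from itertools import product


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its closest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist):
    medoids = list(range(k))  # naive initialization: the first k points
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        # Try swapping every medoid slot with every non-medoid point.
        for slot, candidate in product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
            cost = total_cost(points, trial, dist)
            if cost < best:  # keep the swap only if it lowers the cost
                medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels


points = [[1, 1], [2, 1], [1, 2], [8, 8], [9, 8], [8, 9]]
medoids, labels = pam(points, 2, manhattan)
print(medoids, labels)  # [3, 0] [1, 1, 1, 0, 0, 0]
```

The medoids are indices of actual data points, so any distance function can be plugged in without having to compute a mean.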
How to perform K-Medoids clustering with PAM algorithm.
Gets the clusters found by k-Medoids.
Gets the number of clusters.
Gets the dimensionality of the data space.
Gets or sets whether the clustering distortion error (the
average distance between all data points and the cluster
centroids) should be computed at the end of the algorithm.
The result will be stored in . Default is true.
Gets or sets the distance function used
as a distance metric between data points.
Gets or sets the maximum number of iterations to
be performed by the method. If set to zero, no
iteration limit will be imposed. Default is 0.
Gets or sets the relative convergence threshold
for stopping the algorithm. Default is 1e-5.
Gets the number of iterations performed in the
last call to this class' Compute methods.
Gets the cluster distortion error (the average distance
between data points and the cluster centroids) after the
last call to this class' Compute methods.
Gets or sets the strategy used to initialize the
centroids of the clustering algorithm. Default is
.
Initializes a new instance of the PartitioningAroundMedoids algorithm.
The number of clusters to divide the input data into.
The distance function to use. Default is to
use the distance.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
points
Not enough points. There should be more points than the number K of clusters.
Implementation of the PAM algorithm.
k-Medoids clustering using PAM (Partition Around Medoids) algorithm.
This is the specialized, non-generic version of the k-Medoids algorithm
that is set to work on arrays.
Initializes a new instance of the k-Medoids algorithm.
The number of clusters to divide the input data into.
k-Medoids clustering using Voronoi iteration algorithm.
From Wikipedia:
The k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift
algorithm. Both the k-means and k-medoids algorithms are partitional (breaking the dataset up into groups)
and both attempt to minimize the distance between points labeled to be in a cluster and a point designated
as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses datapoints as centers
(medoids or exemplars) and works with a generalization of the Manhattan Norm to define distance between
datapoints instead of L2. This method was proposed in 1987[1] for the work with L1 norm and other distances.
The Voronoi iteration algorithm (or Lloyd's algorithm) is one possible implementation of k-medoids
clustering. It was suggested in [2] and [3].
[1] Kaufman, L. and Rousseeuw, P.J. (1987), Clustering by means of Medoids, in Statistical Data Analysis
Based on the L1–Norm and Related Methods, edited by Y. Dodge, North-Holland, 405–416.
[2] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, Springer (2001), 468–469.
[3] H. S. Park, C. H. Jun, A simple and fast algorithm for K-medoids clustering, Expert Systems with Applications,
36, (2) (2009), 3336–3341.
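A Voronoi (Lloyd-style) iteration for k-medoids alternates between assigning points to the nearest medoid and re-selecting each cluster's medoid as its most central member. The Python sketch below illustrates this under simple assumptions: the first k points serve as initial medoids, and the names `voronoi_iteration` and `assign` are hypothetical.

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids, dist):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: dist(p, points[medoids[j]]))
            for p in points]


def voronoi_iteration(points, k, dist, max_iterations=100):
    medoids = list(range(k))  # naive initialization: the first k points
    for _ in range(max_iterations):
        labels = assign(points, medoids, dist)  # assignment step
        new_medoids = []
        for j in range(k):
            members = [i for i, label in enumerate(labels) if label == j]
            if not members:  # keep the old medoid if the cluster emptied out
                new_medoids.append(medoids[j])
                continue
            # Update step: the member with the smallest total distance to the others.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[m]) for m in members)))
        if new_medoids == medoids:  # converged: no medoid changed
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)


points = [[1, 1], [2, 1], [1, 2], [8, 8], [9, 8], [8, 9]]
medoids, labels = voronoi_iteration(points, 2, manhattan)
print(medoids, labels)  # [0, 3] [0, 0, 0, 1, 1, 1]
```

Each iteration only inspects points within a cluster when updating its medoid, which makes it cheaper per step than evaluating every possible swap.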
How to perform K-Medoids clustering with Voronoi iteration algorithm.
Initializes a new instance of the VoronoiIteration algorithm.
The number of clusters to divide the input data into.
The distance function to use. Default is to
use the distance.
Helper class - cluster information.
Index of the medoid point for this cluster.
Cost of this cluster, i.e. sum of distances of all
cluster member points to the medoid point.
Set of member point indices.
Initializes a new ClusterInfo object.
Reset object to the initial state.
Implementation of the Voronoi Iteration algorithm.
k-Medoids clustering using Voronoi iteration algorithm.
Initializes a new instance of the k-Medoids algorithm.
The number of clusters to divide the input data into.
Base class for tree inducing (learning) algorithms.
Gets or sets the maximum allowed height when learning a tree. If
set to zero, the tree can have an arbitrary height. Default is 0.
Gets or sets the maximum number of variables that
can enter the tree. A value of zero indicates there
is no limit. Default is 0 (there is no limit on the
number of variables).
Gets or sets the collection of attributes to
be processed by the induced decision tree.
Gets or sets how many times one single variable can be integrated into the decision process. In the original
ID3 algorithm, a variable can join only one time per decision path (path from the root to a leaf). If set to
zero, a single variable can participate as many times as needed. Default is 1.
Gets or sets the decision trees being learned.
Gets how many times each attribute has already been used in the current path.
In the original C4.5 and ID3 algorithms, attributes could be re-used only once,
but in the framework implementation this behaviour can be adjusted by setting
the property.
Initializes a new instance of the class.
The attributes to be processed by the induced tree.
Adds the specified variable to the list of attributes.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Computes the split information measure.
The total number of samples.
The partitioning.
An extra partition containing only missing values.
The split information for the given partitions.
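Split information, as used by C4.5 to normalize information gain, is the entropy of the partition sizes. The Python sketch below computes it for a list of partitions; treating the missing-value partition as one more group is an assumption about how it enters the measure, and `split_information` is a name chosen for the example.

```python
import math


def split_information(total, partitions, missing=None):
    """-sum(|S_i| / |S| * log2(|S_i| / |S|)) over the non-empty partitions."""
    groups = list(partitions)
    if missing:
        groups.append(missing)  # assumption: missing values form an extra group
    info = 0.0
    for group in groups:
        p = len(group) / total
        if p > 0:
            info -= p * math.log2(p)
    return info


# Eight samples: four in one branch, two in another, two with missing values.
value = split_information(8, [[0, 1, 2, 3], [4, 5]], missing=[6, 7])
print(value)  # 1.5
```

Dividing the information gain of a split by this value yields the gain ratio, which penalizes splits that scatter the samples over many small partitions.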
Contains learning algorithms for inducing
Decision Trees.
C4.5 Learning algorithm for Decision Trees.
References:
-
Quinlan, J. R. C4.5: Programs for Machine Learning. Morgan
Kaufmann Publishers, 1993.
-
Quinlan, J. R. Improved use of continuous attributes in c4.5. Journal
of Artificial Intelligence Research, 4:77-90, 1996.
-
Mitchell, T. M. Machine Learning. McGraw-Hill, 1997. pp. 55-58.
-
Wikipedia, the free encyclopedia. ID3 algorithm. Available on
http://en.wikipedia.org/wiki/ID3_algorithm
This example shows the simplest way to induce a decision tree with continuous variables.
This is the same example as above, but the decision variables are specified manually.
This example shows how to handle missing values in the training data.
The next example shows how to induce a decision tree for a more complicated example, again
using a codebook to manage how input
variables should be encoded. It also shows how to obtain a compiled version of the decision
tree for deciding the class labels for new samples with maximum performance.
The next example shows how to estimate the true performance of a decision tree model using cross-validation:
The next example shows how to find the best parameters for a decision tree using grid-search cross-validation:
Gets or sets the step at which the samples will
be divided when dividing continuous columns in
binary classes. Default is 1.
Creates a new C4.5 learning algorithm.
Creates a new C4.5 learning algorithm.
The attributes to be processed by the induced tree.
Creates a new C4.5 learning algorithm.
The decision tree to be generated.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Runs the learning algorithm, creating a decision
tree modeling the given inputs and outputs.
The inputs.
The corresponding outputs.
The error of the generated tree.
Computes the prediction error for the tree
over a given set of inputs and outputs.
The input points.
The corresponding output labels.
The percentage error of the prediction.
ID3 (Iterative Dichotomizer 3) learning algorithm
for Decision Trees.
References:
- Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81-106.
- Mitchell, T. M. Machine Learning. McGraw-Hill, 1997. pp. 55-58.
- Wikipedia, the free encyclopedia. ID3 algorithm. Available on http://en.wikipedia.org/wiki/ID3_algorithm
This example shows the simplest way to induce a decision tree with discrete variables.
This example shows a common textbook example, and how to induce a decision tree using a
codebook to convert string (text) variables into discrete symbols.
Gets or sets whether all nodes are obligated to provide
a true decision value. If set to false, some leaf nodes
may contain null. Default is false.
Creates a new ID3 learning algorithm.
Creates a new ID3 learning algorithm.
The decision tree to be generated.
Creates a new ID3 learning algorithm.
The attributes to be processed by the induced tree.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Runs the learning algorithm, creating a decision
tree modeling the given inputs and outputs.
The inputs.
The corresponding outputs.
The error of the generated tree.
Computes the prediction error for the tree
over a given set of inputs and outputs.
The input points.
The corresponding output labels.
The percentage error of the prediction.
Contains discrete and continuous Decision Trees, with
support for automatic code generation, tree pruning and
the creation of decision rule sets.
Numeric comparison category.
The node does no comparison.
The node compares for equality.
The node compares for non-equality.
The node compares for greater-than or equality.
The node compares for greater-than.
The node compares for less-than.
The node compares for less-than or equality.
Extension methods for enumeration values.
Returns a string that represents this instance.
The comparison type.
A string that represents this instance.
Collection of decision nodes. A decision branch specifies the index of
an attribute whose current value should be compared against its children
nodes. The type of the comparison is specified in each child node.
Gets or sets the index of the attribute to be
used in this stage of the decision process.
Gets the attribute that is being used in
this stage of the decision process, given
by the current attribute index.
Gets or sets the decision node that contains this collection.
Initializes a new instance of the class.
Initializes a new instance of the class.
The decision node to which
this collection belongs.
Initializes a new instance of the class.
Index of the attribute to be processed.
The children nodes. Each child node should be
responsible for a possible value of a discrete attribute, or for
a region of a continuous-valued attribute.
Adds the elements of the specified collection to the end of the collection.
The child nodes to be added.
Returns a string that represents this instance.
A string that represents this instance.
Contains classes to prune decision trees, removing
unneeded nodes in an attempt to improve generalization.
Reduced error pruning.
Initializes a new instance of the class.
The tree to be pruned.
The pruning set inputs.
The pruning set outputs.
Computes one pass of the pruning algorithm.
Error-based pruning.
References:
- Lior Rokach, Oded Maimon. The Data Mining and Knowledge Discovery Handbook, Chapter 9, Decision Trees. Springer, 2nd ed. 2010, XX, 1285 p. 40 illus. Available at: http://www.ise.bgu.ac.il/faculty/liorr/hbchap9.pdf
// Suppose you have the following input and output data
// and would like to learn the relationship between the
// inputs and outputs by using a Decision Tree:
double[][] inputs = ...
int[] output = ...
// To prune a decision tree, we need to split the data into
// training and pruning groups. Let's say we have 100 samples,
// and would like to reserve 50 samples for training, and 50
// for pruning:
// Gather the first half for the training set
var trainingInputs = inputs.Submatrix(0, 49);
var trainingOutput = output.Submatrix(0, 49);
// Gather the second half of the data for pruning
var pruningInputs = inputs.Submatrix(50, 99);
var pruningOutput = output.Submatrix(50, 99);
// Create the decision tree
DecisionTree tree = new DecisionTree( ... );
// Learn our tree using the training data
C45Learning c45 = new C45Learning(tree);
double trainingError = c45.Run(trainingInputs, trainingOutput);
// Now we can attempt to prune the tree using the pruning groups
ErrorBasedPruning prune = new ErrorBasedPruning(tree, pruningInputs, pruningOutput);
// Gain threshold
prune.Threshold = 0.1;
double lastError;
double error = Double.PositiveInfinity;
do
{
// Now we can start pruning the tree as
// long as the error doesn't increase
lastError = error;
error = prune.Run();
} while (error < lastError);
Initializes a new instance of the class.
The tree to be pruned.
The pruning set inputs.
The pruning set outputs.
Gets or sets the minimum allowed gain threshold
to prune the tree. Default is 0.01.
Computes one pass of the pruning algorithm.
Attempts to prune a node's subtrees.
Whether the current node was changed or not.
Random Forest learning algorithm.
This example shows the simplest way to induce a decision tree with continuous variables.
The next example shows how to induce a decision tree with continuous variables using a
codebook to manage how input
variables should be encoded.
Gets or sets the number of trees in the random forest.
Gets or sets the number of trees in the random forest.
Gets or sets how many times the same variable can
enter a tree's decision path. Default is 100.
Gets or sets the collection of attributes to
be processed by the induced decision tree.
Gets the proportion of samples used to train each
of the trees in the decision forest. Default is 0.632.
Gets or sets the maximum proportion of variables that
can be used by each tree in the decision
forest. Default is 1 (always use all variables).
Creates a new decision forest learning algorithm.
Creates a new decision forest learning algorithm.
Creates a new decision forest learning algorithm.
The attributes to be processed by the induced tree.
Creates a new decision forest learning algorithm.
The attributes to be processed by the induced tree.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Runs the learning algorithm with the given data.
The input points.
The class label for each point.
Random Forest.
Represents a random forest of decision trees. For
sample usage and examples of learning, please see the documentation
page for the random forest learning algorithm.
This example shows the simplest way to induce a decision tree with continuous variables.
The next example shows how to induce a decision tree with continuous variables using a
codebook to manage how input
variables should be encoded.
Gets the trees in the random forest.
Gets the number of classes that can be recognized
by this random forest.
Gets or sets the parallelization options for this algorithm.
Gets or sets a cancellation token that can be used
to cancel the algorithm while it is running.
Creates a new random forest.
The trees to be added to the forest.
Creates a new random forest.
The number of trees to be added to the forest.
An array specifying the attributes to be processed by the trees.
The number of classes in the classification problem.
Creates a new random forest.
The number of trees in the forest.
The number of classes in the classification problem.
Computes the decision output for a given input vector.
The input vector.
The forest decision for the given vector.
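A forest decision is commonly obtained by letting every tree vote and returning the most voted class. The sketch below assumes plain majority voting and models each tree as a function from an input vector to a class label; the class ForestVoteSketch and its signature are hypothetical and are not the library's implementation.

```csharp
using System;
using System.Collections.Generic;

public static class ForestVoteSketch
{
    // Each "tree" is modeled as a function mapping an input vector to a class label.
    public static int Decide(IList<Func<double[], int>> trees, double[] input, int numberOfClasses)
    {
        // Count one vote per tree for the class it predicts
        int[] votes = new int[numberOfClasses];
        foreach (Func<double[], int> tree in trees)
            votes[tree(input)]++;

        // Return the class with the most votes (the lowest index wins ties)
        int best = 0;
        for (int c = 1; c < numberOfClasses; c++)
        {
            if (votes[c] > votes[best])
                best = c;
        }
        return best;
    }
}
```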
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Called when the object is being deserialized.
Contains sets of decision rules that can be created from
Decision Trees.
Antecedent expression for decision rules.
Gets the index of the variable used as the
left hand side term of this expression.
Gets the comparison being made between the variable
value at the variable index and the right hand side value.
Gets the right hand side of this expression.
Creates a new instance of the class.
The variable index.
The comparison to be made using the value at
the variable index and the value to be compared against.
The value to be compared against.
Checks if this antecedent applies to a given input.
An input vector.
True if the input element at the variable index
compares to the right hand side value using the antecedent's comparison; false
otherwise.
Determines whether the specified object
is equal to this instance.
The object to compare with this instance.
true if the specified object
is equal to this instance; otherwise, false.
Determines whether the specified object
is equal to this instance.
The object to compare with this instance.
true if the specified object
is equal to this instance; otherwise, false.
Returns a hash code for this instance.
A hash code for this instance, suitable for use in
hashing algorithms and data structures like a hash table.
Returns a string that represents this instance.
A string that represents this instance.
Implements the operator ==.
Implements the operator !=.
Decision rule set.
Decision rule sets can be created from decision trees using their
ToRules method. An example is shown below.
Obsolete. Please use instead.
Initializes a new instance of the class.
Initializes a new instance of the class.
A set of decision rules.
Creates a new decision rule set from a decision tree.
A decision tree.
A decision rule set that is completely
equivalent to the given decision tree.
Computes the decision output for a given input.
An input vector.
The decision output for the given
input.
Adds a new decision rule to the set.
The decision rule to be added.
Adds a collection of new decision rules to the set.
The collection of decision rules to be added.
Removes all rules from this set.
Gets the number of rules in this set.
Removes a given rule from the set.
The decision rule to be removed.
True if the rule was removed; false otherwise.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns an enumerator that iterates through a collection.
An object
that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object
that can be used to iterate through the collection.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Decision Rule.
The simplest way to create a set of decision rules is by extracting them from an existing decision tree.
The example below shows how to create a simple decision tree and convert it to a set of rules
using its ToRules method.
Gets the decision variables handled by this rule.
Gets the expressions that
must be fulfilled in order for this rule to be applicable.
Gets or sets the output of this decision rule, given
when all conditions are met.
Initializes a new instance of the class.
The decision variables handled by this decision rule.
The output value, given after all antecedents are met.
The antecedent conditions that lead to the output.
Initializes a new instance of the class.
The decision variables handled by this decision rule.
The output value, given after all antecedents are met.
The antecedent conditions that lead to the output.
Initializes a new instance of the class.
The output value, given after all antecedents are met.
The antecedent conditions that lead to the output.
Gets the number of antecedents contained
in this rule.
Checks whether the rule applies to a given input vector.
An input vector.
True, if the input matches the rule's
antecedents; otherwise, false.
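To make the matching logic concrete: a rule applies when every one of its antecedents holds for the input, and each antecedent compares one input element against a fixed value. The following is a minimal sketch under that reading; ComparisonType, SimpleAntecedent and SimpleRule are hypothetical stand-ins written for illustration, not the library's classes.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the comparison kinds listed in this documentation
public enum ComparisonType { Equal, NotEqual, GreaterThanOrEqual, GreaterThan, LessThan, LessThanOrEqual }

public sealed class SimpleAntecedent
{
    public readonly int Index;                 // index of the input variable (left hand side)
    public readonly ComparisonType Comparison; // comparison to apply
    public readonly double Value;              // right hand side value

    public SimpleAntecedent(int index, ComparisonType comparison, double value)
    {
        Index = index;
        Comparison = comparison;
        Value = value;
    }

    // True if the input element at Index compares to Value using Comparison
    public bool Match(double[] input)
    {
        double x = input[Index];
        switch (Comparison)
        {
            case ComparisonType.Equal: return x == Value;
            case ComparisonType.NotEqual: return x != Value;
            case ComparisonType.GreaterThanOrEqual: return x >= Value;
            case ComparisonType.GreaterThan: return x > Value;
            case ComparisonType.LessThan: return x < Value;
            case ComparisonType.LessThanOrEqual: return x <= Value;
            default: throw new InvalidOperationException("Unknown comparison.");
        }
    }
}

public sealed class SimpleRule
{
    public readonly SimpleAntecedent[] Antecedents;
    public readonly int Output;

    public SimpleRule(int output, params SimpleAntecedent[] antecedents)
    {
        Output = output;
        Antecedents = antecedents;
    }

    // The rule applies only when all of its antecedents are satisfied
    public bool Match(double[] input)
    {
        return Antecedents.All(a => a.Match(input));
    }
}
```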
Creates a new decision rule from a decision tree's
node. This node must be a leaf, cannot be the root, and
should have one output value.
A node from a decision tree.
A decision rule representing the given node.
Gets whether this rule and another rule have
the same antecedents but different outputs.
True if the two rules are contradictory;
false otherwise.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Returns an enumerator that iterates through a collection.
An object that
can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that
can be used to iterate through the collection.
Returns a hash code for this instance.
A hash code for this instance, suitable for use in hashing
algorithms and data structures like a hash table.
Determines whether the specified object is equal to this instance.
The object to compare with this instance.
true if the specified object
is equal to this instance; otherwise, false.
Determines whether the specified object is equal to this instance.
The object to compare with this instance.
true if the specified object
is equal to this instance; otherwise, false.
Compares this instance to another decision rule.
Implements the operator <.
Implements the operator >.
Implements the operator ==.
Implements the operator !=.
Decision rule simplification algorithm.
Gets or sets the underlying hypothesis test
size parameter used to reject hypotheses.
Initializes a new instance of the class.
The decision set to be simplified.
Computes the reduction algorithm.
A set of training inputs.
The outputs corresponding to each of the inputs.
The average error after the reduction.
Computes the average decision error.
A set of input vectors.
A set of corresponding output vectors.
The average misclassification rate.
Checks if two variables can be eliminated.
Checks if two variables can be eliminated.
Decision Tree C# Writer.
Initializes a new instance of the class.
Creates C# code for the tree.
Attribute category.
Attribute is discrete-valued.
Attribute is continuous-valued.
Decision attribute.
Gets the name of the attribute.
Gets the nature of the attribute (i.e. real-valued or discrete-valued).
Gets the valid range of the attribute.
Creates a new decision variable.
The name of the attribute.
The range of valid values for this attribute. Default is [0;1].
Creates a new decision variable.
The name of the attribute.
The attribute's nature (i.e. real-valued or discrete-valued).
Creates a new decision variable.
The name of the attribute.
The range of valid values for this attribute.
Creates a new discrete-valued decision variable.
The name of the attribute.
The number of possible values for this attribute.
Creates a new continuous decision variable.
The name of the attribute.
Creates a new continuous decision variable.
The name of the attribute.
The range of valid values for this attribute. Default is [0;1].
Creates a new discrete decision variable.
The name of the attribute.
The range of valid values for this attribute.
Creates a new discrete-valued decision variable.
The name of the attribute.
The number of possible values for this attribute.
Returns a string that represents this instance.
A string that represents this instance.
Creates a set of decision variables from a codebook.
The ordered dictionary containing information about the variables.
An array of decision variable objects
initialized with the values from the codebook.
Creates a set of decision variables from a codebook.
The codebook containing information about the variables.
The columns to consider as decision variables.
An array of decision variable objects
initialized with the values from the codebook.
Creates a set of decision variables from input data.
The input data.
An array of decision variable objects
initialized with the values from the input data.
Creates a set of decision variables from input data.
The input data.
An array of decision variable objects
initialized with the values from the input data.
Creates a set of decision variables from input data.
The input data.
An array of decision variable objects
initialized with the values from the input data.
Collection of decision attributes.
Initializes a new instance of the class.
The list to initialize the collection.
Decision Tree (Linq) Expression Creator.
Initializes a new instance of the class.
The decision tree.
Creates an expression for the tree.
A strongly typed lambda expression in the form
of an expression tree
representing the decision tree.
Decision tree (for both discrete and continuous classification problems).
Represents a decision tree which can be compiled to code at run-time. For sample usage
and example of learning, please see the documentation pages for the
ID3 and C4.5 learning algorithms.
It is also possible to create random forests using
the random forest learning algorithm.
This example shows the simplest way to induce a decision tree with discrete variables.
This example shows a common textbook example, and how to induce a decision tree using a
codebook to convert string (text) variables into discrete symbols.
For more examples with discrete variables, please see the ID3 learning algorithm.
This example shows the simplest way to induce a decision tree with continuous variables.
For more examples with continuous variables, please see the C4.5 learning algorithm.
The next example shows how to estimate the true performance of a decision tree model using cross-validation:
Gets or sets the root node for this tree.
Gets the collection of attributes processed by this tree.
Creates a new decision tree to process
the given attributes and the given
number of possible classes.
An array specifying the attributes to be processed by this tree.
The number of possible output classes for the given attributes.
Computes the tree decision for a given input.
The input data.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
The location where the class labels should be stored.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
The node where the decision starts.
A predicted class for the given input.
Computes the tree decision for a given input.
The input data.
The node where the decision starts.
A predicted class for the given input.
Returns an enumerator that iterates through the tree.
An object that can be
used to iterate through the collection.
Traverses the tree using a tree
traversal method. Can be iterated with a foreach loop.
The tree traversal method. Common methods are
available in the static tree traversal class.
An object which can be used to
traverse the tree using the chosen traversal method.
Traverses a subtree using a tree
traversal method. Can be iterated with a foreach loop.
The tree traversal method. Common methods are
available in the static tree traversal class.
The root of the subtree to be traversed.
An object which can be used to
traverse the tree using the chosen traversal method.
Transforms the tree into a set of decision rules.
A decision rule set created from this tree.
Creates an Expression Tree representation
of this decision tree, which can in turn be compiled into code.
A tree in the form of an expression tree.
Creates a .NET assembly (.dll) containing a static class of
the given name implementing the decision tree. The class will
contain a single static Compute method implementing the tree.
The name of the assembly to generate.
The name of the generated static class.
Creates a .NET assembly (.dll) containing a static class of
the given name implementing the decision tree. The class will
contain a single static Compute method implementing the tree.
The name of the assembly to generate.
The namespace which should contain the class.
The name of the generated static class.
Generates a C# class implementing the decision tree.
The name for the generated class.
A string containing the generated class.
Generates a C# class implementing the decision tree.
The name for the generated class.
The writer to which the class should be written.
Computes the height of the tree, defined as the
greatest distance (in links) between the tree's
root node and its leaves.
The tree's height.
Obsolete. Please use (or use it as an extension method).
Obsolete. Please use (or use it as an extension method).
Obsolete. Please use .
Obsolete. Please use .
Deprecated. Please use the NumberOfOutputs property instead.
Deprecated. Please use the NumberOfInputs property.
Deprecated. Please use the Decide() method instead.
Deprecated. Please use the Decide() method instead.
Deprecated. Please use the Decide() method instead.
Deprecated. Please use the Decide() method instead.
Decision Tree (DT) Node.
Each node of a decision tree can play two roles. When a node is not a leaf, it
contains a decision branch with a collection of child nodes. The
branch specifies an attribute index, indicating which column from the data set
(the attribute) should be compared against its children's values. The type of the
comparison is specified by each of the children. When a node is a leaf, it
contains the output value to be decided when the node is reached.
Gets or sets the value this node responds to
whenever this node acts as a child node. This
value is set only when the node has a parent.
Gets or sets the type of the comparison which
should be done against the node's value.
If this is a leaf node, gets or sets the output
value to be decided when this node is reached.
If this is not a leaf node, gets or sets the collection
of child nodes for this node, together with the attribute
determining the reasoning process for those children.
Gets or sets the parent of this node. If this is a root
node, the parent is null.
Gets the decision tree containing this node.
Creates a new decision node.
The owner tree for this node.
Gets a value indicating whether this instance is a root node (has no parent).
true if this instance is a root; otherwise, false.
Gets a value indicating whether this instance is a leaf (has no children).
true if this instance is a leaf; otherwise, false.
Computes whether a value satisfies
the condition imposed by this node.
The value x.
true if the value satisfies this node's
condition; otherwise, false.
Computes whether a value satisfies
the condition imposed by this node.
The value x.
true if the value satisfies this node's
condition; otherwise, false.
Returns a string that represents this instance.
A string that represents this instance.
Returns a string that represents this instance.
A string that represents this instance.
Computes the height of the node, defined as the
distance (in number of links) between the tree's
root node and this node.
The node's height.
Returns an enumerator that iterates through the node's subtree.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through the node's subtree.
An object that can be used to iterate through the collection.
Tree enumeration method delegate.
An enumerator traversing the tree.
Common traversal methods for n-ary trees.
Breadth-first traversal method.
Depth-first traversal method.
Post-order tree traversal method.
Adapted from John Cowan (1998) recommendation.
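The three traversal orders differ only in when a node is reported relative to its children. The sketch below shows them on a generic n-ary tree; the Node and TraversalSketch types are hypothetical, and depth-first is assumed here to mean pre-order, which may differ from the order the library uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> Children = new List<Node>();

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class TraversalSketch
{
    // Visits nodes level by level, starting at the root
    public static IEnumerable<Node> BreadthFirst(Node root)
    {
        var queue = new Queue<Node>();
        queue.Enqueue(root);
        while (queue.Count > 0)
        {
            Node current = queue.Dequeue();
            yield return current;
            foreach (Node child in current.Children)
                queue.Enqueue(child);
        }
    }

    // Visits a node before its children (pre-order), going as deep as possible first
    public static IEnumerable<Node> DepthFirst(Node root)
    {
        var stack = new Stack<Node>();
        stack.Push(root);
        while (stack.Count > 0)
        {
            Node current = stack.Pop();
            yield return current;
            // Push children in reverse so that the leftmost child is visited first
            for (int i = current.Children.Count - 1; i >= 0; i--)
                stack.Push(current.Children[i]);
        }
    }

    // Visits all children of a node before the node itself
    public static IEnumerable<Node> PostOrder(Node root)
    {
        foreach (Node child in root.Children)
            foreach (Node descendant in PostOrder(child))
                yield return descendant;
        yield return root;
    }

    // Helper that concatenates the names of the visited nodes
    public static string Names(IEnumerable<Node> nodes)
    {
        return string.Join("", nodes.Select(n => n.Name));
    }
}
```

Because each method returns a lazy sequence, the result can be consumed directly in a foreach loop, as the traversal documentation above describes.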
Base class for K-Nearest Neighbor (k-NN) algorithms.
The type of the model being learned.
The type of the input data.
The type for distance functions that can be used with this algorithm.
Initializes a new instance of the class.
Gets or sets the distance function used
as a distance metric between data points.
Gets or sets the number of nearest neighbors to be used
in the decision. Default is 5.
The number of neighbors.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Gets the set of points given
as input of the algorithm.
The input points.
Gets the set of labels associated
with each point.
Computes a numerical score measuring the association between
the given input vector and a given
class index.
The input vector.
The index of the class whose score will be computed.
System.Double.
Gets the top k points that are the closest
to a given reference point.
The query point whose neighbors will be found.
The label for each neighboring point.
An array containing the top k points that are
at the closest possible distance to the query point.
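The neighbor search and the resulting decision can be sketched with a brute-force scan: sort the training points by distance to the query, keep the first k, and take a vote over their labels. This is only an illustration that assumes Euclidean distance and unweighted voting; NearestNeighborSketch and its methods are hypothetical names, and a real implementation may use a spatial index instead of sorting.

```csharp
using System;
using System.Linq;

public static class NearestNeighborSketch
{
    // Euclidean distance between two vectors of equal length
    public static double Euclidean(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }

    // Returns the indices of the k training points closest to the query
    public static int[] GetNearestNeighbors(double[][] points, double[] query, int k)
    {
        return Enumerable.Range(0, points.Length)
            .OrderBy(i => Euclidean(points[i], query))
            .Take(k)
            .ToArray();
    }

    // Returns the most frequent label among the k nearest neighbors
    public static int Decide(double[][] points, int[] labels, double[] query, int k)
    {
        return GetNearestNeighbors(points, query, k)
            .GroupBy(i => labels[i])
            .OrderByDescending(g => g.Count())
            .First()
            .Key;
    }
}
```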
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Mini-batch data shuffling options.
Do not perform any shuffling.
Shuffle the data only once, before any batches are created.
Re-shuffles the data after every epoch.
Utility class for preparing mini-batches of data.
Creates a method to partition a given dataset into mini-batches of equal size.
The type of the input data.
The type of the output data.
The input data to be partitioned into mini-batches.
The output data to be partitioned into mini-batches.
The weights for the data to be partitioned into mini-batches.
The size of the batch.
The maximum number of mini-batches that should be created until the method stops.
The maximum number of epochs that should be run until the method stops.
The data shuffling options.
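The parameters above (batch size, epoch limit, shuffling) can be illustrated with a small generator. This is a sketch only: the MiniBatchSketch class is hypothetical, it reduces the shuffling options to a single flag, and it assumes that leftover samples that do not fill a whole batch are dropped, which may not match the library's behavior.

```csharp
using System;
using System.Collections.Generic;

public static class MiniBatchSketch
{
    // Lazily yields equally sized mini-batches for the requested number of epochs
    public static IEnumerable<T[]> Create<T>(T[] data, int batchSize, int maxEpochs,
        bool reshuffleEveryEpoch, Random random)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Start with the identity ordering of the samples
        int[] order = new int[data.Length];
        for (int i = 0; i < order.Length; i++)
            order[i] = i;

        for (int epoch = 0; epoch < maxEpochs; epoch++)
        {
            if (reshuffleEveryEpoch)
                Shuffle(order, random);

            // Only full batches are produced; a trailing remainder is skipped
            for (int start = 0; start + batchSize <= data.Length; start += batchSize)
            {
                T[] batch = new T[batchSize];
                for (int j = 0; j < batchSize; j++)
                    batch[j] = data[order[start + j]];
                yield return batch;
            }
        }
    }

    // Fisher-Yates shuffle of the sample ordering
    private static void Shuffle(int[] values, Random random)
    {
        for (int i = values.Length - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            int tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
        }
    }
}
```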
Utility class for preparing mini-batches of data.
Initializes a new instance of the class.
The input data that should be divided into batches.
The weight for the data that should be divided into batches.
Inheritors should use this method to create a new instance of a mini-batch object.
You can use this method to create mini-batches containing different objects related
to the mini-batch, such as auxiliary data, privileged information, etc.
DataSubset<TInput>.
Utility class for preparing mini-batches of data.
Gets or sets the output associated with each input instance
that should be divided among the mini-batches at every epoch.
Initializes a new instance of the class.
The input data that should be divided into batches.
The output for the data that should be divided into batches.
The weight for the data that should be divided into batches.
Inheritors should use this method to create a new instance of a mini-batch object.
You can use this method to create mini-batches containing different objects related
to the mini-batch, such as auxiliary data, privileged information, etc.
Inheritors should use this method to put the current sample in the
given mini-batch at the given position.
The mini-batch being constructed.
The position in mini-batch where the current sample must be put.
Inheritors should use this method to prepare the data for the next batches,
for example by reshuffling according to the ordering passed as argument.
The ordering for the samples in the new batches.
Utility class for preparing mini-batches of data.
Gets or sets the input instances that should be divided among the mini-batches at every epoch.
Gets or sets the weights associated with each data instance
that should be divided among the mini-batches at every epoch.
Gets or sets the size of the mini-batch that will be generated.
Gets or sets options about how and when data should be shuffled.
Gets the number of samples in each epoch.
Gets or sets the number of mini-batches that are going to be generated for each epoch.
Gets or sets the maximum number of iterations for which mini-batches should be generated.
Gets or sets the maximum number of epochs for which mini-batches should be generated.
Gets or sets the current iteration counter.
The index of the current iteration.
Gets or sets the current epoch counter.
The index of the current epoch.
Gets or sets the current sample counter.
The index of the current sample.
Gets or sets the current mini-batch counter.
The index of the current mini-batch.
Initializes a new instance of the class.
The input data that should be divided into batches.
The weight for the data that should be divided into batches.
Inheritors should use this method to create a new instance of a mini-batch object.
You can use this method to create mini-batches containing different objects related
to the mini-batch, such as auxiliary data, privileged information, etc.
Inheritors should use this method to put the current sample in the
given mini-batch at the given position.
The mini-batch being constructed.
The position in mini-batch where the current sample must be put.
Inheritors should use this method to prepare the data for the next batches,
for example by reshuffling according to the ordering passed as argument.
The ordering for the samples in the new batches.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
k-Fold cross-validation.
Cross-validation is a technique for estimating the performance of a predictive
model. It can be used to measure how the results of a statistical analysis will
generalize to an independent data set. It is mainly used in settings where the
goal is prediction, and one wants to estimate how accurately a predictive model
will perform in practice.
One round of cross-validation involves partitioning a sample of data into
complementary subsets, performing the analysis on one subset (called the
training set), and validating the analysis on the other subset (called the
validation set or testing set). To reduce variability, multiple rounds of
cross-validation are performed using different partitions, and the validation
results are averaged over the rounds.
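The partitioning and averaging described above can be sketched independently of any particular model. In the sketch below, KFoldSketch is a hypothetical helper, and fitAndScore stands for any user-supplied function that trains on the training indices and returns a score on the validation indices.

```csharp
using System;
using System.Linq;

public static class KFoldSketch
{
    // Randomly assigns each sample to one of the folds, keeping fold sizes balanced
    public static int[] AssignFolds(int samples, int folds, Random random)
    {
        if (folds < 1 || folds > samples)
            throw new ArgumentException("The number of folds must be between 1 and the number of samples.");

        int[] assignment = new int[samples];
        for (int i = 0; i < samples; i++)
            assignment[i] = i % folds;

        // Fisher-Yates shuffle so that fold membership is random
        for (int i = samples - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            int tmp = assignment[i];
            assignment[i] = assignment[j];
            assignment[j] = tmp;
        }
        return assignment;
    }

    // Runs one round per fold and averages the validation scores
    public static double Evaluate(int[] assignment, int folds, Func<int[], int[], double> fitAndScore)
    {
        double total = 0;
        for (int fold = 0; fold < folds; fold++)
        {
            int current = fold;
            int[] validation = Enumerable.Range(0, assignment.Length)
                .Where(i => assignment[i] == current).ToArray();
            int[] training = Enumerable.Range(0, assignment.Length)
                .Where(i => assignment[i] != current).ToArray();
            total += fitAndScore(training, validation);
        }
        return total / folds;
    }
}
```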
References:
- Wikipedia, The Free Encyclopedia. Cross-validation (statistics). Available on: http://en.wikipedia.org/wiki/Cross-validation_(statistics)
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
The type of the learning algorithm used to learn the model.
Gets the array of data set indexes contained in each fold.
Gets the array of fold indices for each point in the data set.
Gets the number of folds in the k-fold cross validation.
Initializes a new instance of the class.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the given outputs from the given inputs.
Please set the Learner property before calling the Learn(x, y) method.
or
The number of folds cannot exceed the total number of samples in the data set.
Creates a list of the sample indices that should serve as the validation set.
The input data from where subsamples should be drawn.
The output data from where subsamples should be drawn.
The number of folds to be created.
The indices of the samples in the original set that should compose the validation set.
Gets a subset of the training and testing sets.
Index of the subsample.
The input data x.
The output data y.
The weights of each sample.
A that defines a
data split of a subsample of the dataset.
Subset of a larger dataset.
The type of the input data in this dataset.
Gets or sets the input data in the dataset.
Gets or sets the weights associated with each input sample in the dataset.
Gets or sets the indices of the samples of this subset
in relation to the original dataset they belong to.
Gets or sets a user-defined tag that can be associated with this instance.
Gets or sets the size of this subset as a proportion in
relation to the original dataset this subset comes from.
Gets or sets an index associated with this subset, if applicable.
Initializes a new instance of the class.
Initializes a new instance of the class.
The size of the data subset.
The total size of the dataset that contains this subset.
Initializes a new instance of the class.
The index associated with this subset, if any.
The input instances in this subset.
The weights associated with the input instances.
The indices of the input instances in relation to the original dataset.
Subset of a larger dataset.
The type of the input data in this dataset.
The type of the output data in this dataset.
Gets or sets the input data in the dataset.
Initializes a new instance of the class.
Initializes a new instance of the class.
Initializes a new instance of the class.
The index associated with this subset, if any.
The input instances in this subset.
The output instances in this subset.
The weights associated with the input instances.
The indices of the input instances in relation to the original dataset.
Range of parameters to be tested in a grid search.
Gets or sets the range of values that should be tested for this parameter.
Gets or sets the index of the current value in the search,
whose value will be shown in the property.
Gets the current value being considered during the grid-search.
Gets the number of values that this parameter can assume (the length of the parameter range).
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Performs an implicit conversion from to .
The range to be converted.
The value of the parameter's .
Grid search procedure for automatic parameter tuning.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Grid Search tries to find the best combination of parameters across a range of possible values that produces the best fit model. If there
are two parameters, each with 10 possible values, Grid Search will try an exhaustive evaluation of the model using every combination of points,
resulting in 100 model fits.
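To make the count concrete, the following is a minimal sketch of exhaustive enumeration, independent of the framework's own grid-search classes. The names `GridEnumerationSketch`, `Combinations` and `Best` are hypothetical, and the `error` callback is an assumed stand-in for fitting a model and measuring its loss.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GridEnumerationSketch
{
    // Enumerates every combination formed by taking one value from each range.
    public static IEnumerable<double[]> Combinations(double[][] ranges)
    {
        int[] index = new int[ranges.Length];
        while (true)
        {
            yield return ranges.Select((r, p) => r[index[p]]).ToArray();

            // Advance the index vector like an odometer, last position first.
            int pos = ranges.Length - 1;
            while (pos >= 0 && ++index[pos] == ranges[pos].Length)
            {
                index[pos] = 0;
                pos--;
            }
            if (pos < 0)
                yield break; // every position wrapped around: all combinations visited
        }
    }

    // Evaluates the error of every combination and keeps the one with the lowest error.
    public static double[] Best(double[][] ranges, Func<double[], double> error)
    {
        double[] best = null;
        double bestError = double.PositiveInfinity;
        foreach (double[] candidate in Combinations(ranges))
        {
            double e = error(candidate);
            if (e < bestError)
            {
                bestError = e;
                best = candidate;
            }
        }
        return best;
    }
}
```

With two ranges of 10 values each, `Combinations` yields 10 × 10 = 100 candidates, which is why the cost of grid search grows multiplicatively with every added parameter.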
The framework offers different ways to use grid search: one version is strongly-typed using generics
and the other might need some manual casting. The example below shows how to perform grid-search in
a non-strongly typed way:
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters
being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly
when using them in the specification of the
property.
The next example shows how to perform grid-search in a strongly typed way:
The code above uses anonymous types and generics to create a specialized
class that keeps the anonymous type given as .
Its main disadvantage is the (high) increase in type complexity, making the use of the var keyword almost
mandatory.
It is also possible to create grid-search objects using convenience methods from the static class:
Finally, it is also possible to combine grid-search with cross-validation,
as shown in the examples below:
Creates a new algorithm.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn the model.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
A grid-search algorithm that has been configured with the given parameters.
Creates a new algorithm.
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type of the learning algorithm used to learn the model.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
A grid-search algorithm that has been configured with the given parameters.
Creates a new combined with
algorithms.
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type of the learning algorithm used to learn the model.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The number of folds in the k-fold cross-validation. Default is 10.
A grid-search algorithm that has been configured with the given parameters.
Creates a new combined with
algorithms.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn the model.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The number of folds in the k-fold cross-validation. Default is 10.
A grid-search algorithm that has been configured with the given parameters.
Grid search procedure for automatic parameter tuning.
The type of the machine learning model whose parameters should be searched.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Grid Search tries to find the best combination of parameters across a range of possible values that produces the best fit model. If there
are two parameters, each with 10 possible values, Grid Search will try an exhaustive evaluation of the model using every combination of points,
resulting in 100 model fits.
The framework offers different ways to use grid search: one version is strongly-typed using generics
and the other might need some manual casting. The example below shows how to perform grid-search in
a non-strongly typed way:
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters
being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly
when using them in the specification of the
property.
The next example shows how to perform grid-search in a strongly typed way:
The code above uses anonymous types and generics to create a specialized
class that keeps the anonymous type given as .
Its main disadvantage is the (high) increase in type complexity, making the use of the var keyword almost
mandatory.
It is also possible to create grid-search objects using convenience methods from the static class:
Finally, it is also possible to combine grid-search with cross-validation,
as shown in the examples below:
Initializes a new instance of the class.
Grid search procedure for automatic parameter tuning.
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Grid Search tries to find the best combination of parameters across a range of possible values that produces the best fit model. If there
are two parameters, each with 10 possible values, Grid Search will try an exhaustive evaluation of the model using every combination of points,
resulting in 100 model fits.
The framework offers different ways to use grid search: one version is strongly-typed using generics
and the other might need some manual casting. The example below shows how to perform grid-search in
a non-strongly typed way:
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters
being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly
when using them in the specification of the
property.
The next example shows how to perform grid-search in a strongly typed way:
The code above uses anonymous types and generics to create a specialized
class that keeps the anonymous type given as .
Its main disadvantage is the (high) increase in type complexity, making the use of the var keyword almost
mandatory.
It is also possible to create grid-search objects using convenience methods from the static class:
Finally, it is also possible to combine grid-search with cross-validation,
as shown in the examples below:
Initializes a new instance of the class.
Inheritors of this class should return the number of possible parameter values for
each parameter in the grid-search range. For example, if a problem should search
parameters in the range {0, 1, ... 9} (10 values) and {-1, -2, -3 } (3 values), this
method should return { 10, 3 }.
The number of possibilities for each parameter.
Inheritors of this class should specify how to get actual values for the parameters
given an index vector in the grid-search space. Those indices indicate which values
should be returned. For example, if there are two parameters in the problem, the first
ranging over {10, 20, 30} and the second over {0.1, 0.01, 0.001}, then for the index
vector { 1, 2 } this method should return { 20, 0.001 }.
The indices in grid-search space.
The parameters at the location indicated by .
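The two contracts above (reporting the length of each range, and mapping an index vector to concrete values) can be sketched as follows. This is an illustration only: `GridIndexSketch`, `LengthsOf` and `ValuesAt` are hypothetical names, and the ranges are assumed to be stored as a jagged array of doubles rather than in the framework's own range types.

```csharp
using System;
using System.Linq;

public static class GridIndexSketch
{
    // Returns the number of candidate values for each parameter,
    // e.g. { 10, 3 } for a 10-value range and a 3-value range.
    public static int[] LengthsOf(double[][] ranges)
    {
        return ranges.Select(r => r.Length).ToArray();
    }

    // Maps an index vector in grid-search space to the actual parameter values:
    // position p of the result is the indices[p]-th value of range p.
    public static double[] ValuesAt(double[][] ranges, int[] indices)
    {
        if (ranges.Length != indices.Length)
            throw new ArgumentException("One index is required per parameter range.");

        double[] values = new double[ranges.Length];
        for (int p = 0; p < ranges.Length; p++)
            values[p] = ranges[p][indices[p]];
        return values;
    }
}
```

For the ranges {10, 20, 30} and {0.1, 0.01, 0.001}, calling `ValuesAt` with the index vector { 1, 2 } selects the second value of the first range and the third value of the second range.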
Grid search procedure for automatic parameter tuning.
Grid Search tries to find the best combination of parameters across a range of possible values that produces the best fit model. If there
are two parameters, each with 10 possible values, Grid Search will try an exhaustive evaluation of the model using every combination of points,
resulting in 100 model fits.
The framework offers different ways to use grid search: one version is strongly-typed using generics
and the other might need some manual casting. The example below shows how to perform grid-search in
a non-strongly typed way:
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters
being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly
when using them in the specification of the
property.
The next example shows how to perform grid-search in a strongly typed way:
The code above uses anonymous types and generics to create a specialized
class that keeps the anonymous type given as .
Its main disadvantage is the (high) increase in type complexity, making the use of the var keyword almost
mandatory.
It is also possible to create grid-search objects using convenience methods from the static class:
Finally, it is also possible to combine grid-search with cross-validation,
as shown in the examples below:
Creates a range of parameter values that should be searched during .
The type of the parameter values.
The values to be included in .
Creates a range of parameter values that should be searched during .
The type of the parameter values.
Creates a range of parameter values that should be searched during .
The type of the parameter values.
Creates a range of parameter values that should be searched during .
The type of the parameter values.
Creates a new algorithm.
The type of the input data. Default is double[].
The type of the output data. Default is int.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn the model.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The input data to be used during training.
The output data to be used during training.
A grid-search algorithm that has been configured with the given parameters.
Creates a new algorithm.
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The input data to be used during training.
The output data to be used during training.
A grid-search algorithm that has been configured with the given parameters.
Creates a new combined with
algorithms.
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The number of folds in the k-fold cross-validation. Default is 10.
The input data to be used during training.
The output data to be used during training.
A grid-search algorithm that has been configured with the given parameters.
Creates a new combined with
algorithms.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
The range of parameters to consider during search.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The number of folds in the k-fold cross-validation. Default is 10.
The input data to be used during training.
The output data to be used during training.
A grid-search algorithm that has been configured with the given parameters.
Grid search procedure for automatic parameter tuning.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Grid Search tries to find the best combination of parameters across a range of possible values that produces the best fit model. If there
are two parameters, each with 10 possible values, Grid Search will try an exhaustive evaluation of the model using every combination of points,
resulting in 100 model fits.
The framework offers different ways to use grid search: one version is strongly-typed using generics
and the other might need some manual casting. The example below shows how to perform grid-search in
a non-strongly typed way:
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters
being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly
when using them in the specification of the
property.
The next example shows how to perform grid-search in a strongly typed way:
The code above uses anonymous types and generics to create a specialized
class that keeps the anonymous type given as .
Its main disadvantage is the (high) increase in type complexity, making the use of the var keyword almost
mandatory.
It is also possible to create grid-search objects using convenience methods from the static class:
Finally, it is also possible to combine grid-search with cross-validation,
as shown in the examples below:
Initializes a new instance of the class.
Inheritors of this class should return the number of possible parameter values for
each parameter in the grid-search range. For example, if a problem should search
parameters in the range {0, 1, ... 9} (10 values) and {-1, -2, -3 } (3 values), this
method should return { 10, 3 }.
The number of possibilities for each parameter.
Inheritors of this class should specify how to get actual values for the parameters
given an index vector in the grid-search space. Those indices indicate which values
should be returned. For example, if there are two parameters in the problem, the first
ranging over {10, 20, 30} and the second over {0.1, 0.01, 0.001}, then for the index
vector { 1, 2 } this method should return { 20, 0.001 }.
The indices in grid-search space.
The parameters at the location indicated by .
Function signature for a function that creates a machine learning model
given a set of parameter values. This function should use the parameters to create and configure
a learning algorithm that can in turn
be used to create a new machine learning model with those parameters.
The training parameters.
A learning algorithm that can be used
to create and train machine learning models.
Base class for methods.
The type of the object that should hold the results of the grid search (e.g. ).
The type of the machine learning model whose parameters should be searched.
The type that specifies how ranges of the parameter values are represented.
The type that specifies how the value for a single parameter is represented.
The type of the learning algorithm used to learn the model.
The type of the input data. Default is double[].
The type of the output data. Default is int.
The range of parameters to consider during search.
Gets or sets a function
that can be used to create a given training parameters.
Gets or sets a function that can be used to create
new machine learning models using the current
learning algorithm.
Gets or sets a function that can
be used to measure how far the actual model predictions were from the expected ground-truth.
Initializes a new instance of the class.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Inheritors of this class should specify how to get actual values for the parameters
given an index vector in the grid-search space. Those indices indicate which values
should be returned. For example, if there are two parameters in the problem, the first
ranging over {10, 20, 30} and the second over {0.1, 0.01, 0.001}, then for the index
vector { 1, 2 } this method should return { 20, 0.001 }.
The indices in grid-search space.
The parameters at the location indicated by .
Inheritors of this class should return the number of possible parameter values for
each parameter in the grid-search range. For example, if a problem should search
parameters in the range {0, 1, ... 9} (10 values) and {-1, -2, -3 } (3 values), this
method should return { 10, 3 }.
The number of possibilities for each parameter.
Function signature for a function that creates a machine learning model
from a subset of the training data. This function
should take a subset of the data as input, and create a
algorithm that can create a model using this given subset.
The subset of the training data that the model should be trained on.
A learning algorithm that can be used
to create and train machine learning models.
Function signature for a function that specifies how a learning algorithm
should be used to create a new machine learning model.
The teacher learning algorithm.
The input data in the dataset.
The output data in the dataset.
The weights for each instance in the dataset.
A machine learning model that has been learnt from the
input and output data using
the given teaching algorithm.
The type of the machine learning model.
The type of the learning algorithm used to learn the model.
The type of the input data.
The type of the output data or labels.
Function signature for a function that can compute a performance metric (i.e. a ) from
a set of (ground-truth) and (model prediction) output
values. Additional information about the metric (such as its variance) or the learning problem (such as the
expected number of classes) can be set in the object passed as the parameter.
The ground-truth data that the model was supposed to predict.
The data that the model has actually predicted.
A info object (e.g. ) that can be used to obtain more information
about the data split being evaluated and store additional information about the computed metric.
A metric that measures how far the model predictions were from the expected ground-truth.
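As an illustration of the kind of function this signature describes, the sketch below computes a zero-one loss for class labels, that is, the fraction of predictions that differ from the ground truth. It is a simplified assumption: `LossSketch.ZeroOneLoss` is a hypothetical helper that omits the extra info parameter mentioned above, and it is not the framework's own loss implementation.

```csharp
using System;

public static class LossSketch
{
    // Fraction of positions where the predicted label differs from the expected
    // label: 0.0 means every prediction was right, 1.0 means every one was wrong.
    public static double ZeroOneLoss(int[] expected, int[] actual)
    {
        if (expected.Length != actual.Length)
            throw new ArgumentException("Both arrays must have the same length.");
        if (expected.Length == 0)
            throw new ArgumentException("At least one sample is required.");

        int mistakes = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            if (expected[i] != actual[i])
                mistakes++;
        }
        return (double)mistakes / expected.Length;
    }
}
```

Because lower values mean better predictions, a metric of this shape can be minimized directly when comparing models or parameter combinations.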
Base class for performance measurement methods based on splitting the data into multiple sets,
such as ,
and .
The type of the result learned by the validation method (e.g. ).
The type of the machine learning model.
The type of the learning algorithm used to learn the model.
The type of the input data.
The type of the output data or labels.
Gets or sets a value to be used as the in case the model throws
an exception during learning. Default is null (exceptions will not be ignored).
Gets or sets a function
that can be used to create a from a subset of the learning dataset.
Gets or sets a function that can
be used to measure how far the actual model predictions were from the expected ground-truth.
Gets or sets a function that can be used to create
new machine learning models using the current
learning algorithm.
Initializes a new instance of the class.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns and evaluates a model in a single subset of the data.
The subset of the data containing the training and testing subsets where
a model should be trained and evaluated, respectively.
The index of this subset, if applicable.
A object containing the created model
and its performance on the training and validation sets.
Bootstrap validation analysis results.
Gets the 0.632 bootstrap estimate.
Gets the number of subsamples taken to compute the bootstrap estimate.
Initializes a new instance of the class.
The models created during the cross-validation runs.
Bootstrap method for generalization performance measurements (with
support for stratification and default loss function for classification).
The type of the machine learning model.
The type of the input data.
Gets or sets a value indicating whether the prevalence of an output
label should be balanced between training and testing sets. Default is false.
true if this instance is stratified; otherwise, false.
Initializes a new instance of the class.
Draws the bootstrap samples from the population.
The input data from where subsamples should be drawn.
The output data from where subsamples should be drawn.
The number of samples to draw.
The size of the samples to be drawn.
The indices of the samples in the original set.
Bootstrap method for generalization performance measurements.
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
Gets or sets the number B of bootstrap samplings
to be drawn from the population dataset.
Gets or sets the number of samples to be drawn in each subsample. If
set to zero, all samples in the entire dataset will be selected.
Gets the bootstrap samples drawn from the population dataset as indices.
Initializes a new instance of the class.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Please set the Learner property before calling the Learn(x, y) method.
Draws the bootstrap samples from the population.
The input data from where subsamples should be drawn.
The output data from where subsamples should be drawn.
The number of samples to draw.
The size of the samples to be drawn.
The indices of the samples in the original set.
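A minimal sketch of drawing bootstrap samples as indices is given below. It assumes plain sampling with replacement and treats a sample size of zero as "use the full dataset size"; the names `BootstrapSketch`, `DrawSamples`, `OutOfBag` and `Estimate632` are hypothetical. `Estimate632` uses the textbook 0.632 weighting of training error and out-of-bag error, which may differ in detail from how the framework computes its own estimate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BootstrapSketch
{
    // Draws b samples of indices from a population, sampling with replacement.
    // A sampleSize of zero is taken to mean the size of the whole population.
    public static int[][] DrawSamples(int populationSize, int b, int sampleSize, int seed)
    {
        if (sampleSize == 0)
            sampleSize = populationSize;

        var rng = new Random(seed);
        int[][] samples = new int[b][];
        for (int i = 0; i < b; i++)
        {
            samples[i] = new int[sampleSize];
            for (int j = 0; j < sampleSize; j++)
                samples[i][j] = rng.Next(populationSize); // same index may repeat
        }
        return samples;
    }

    // Indices never drawn into the sample; these can serve as the validation set.
    public static int[] OutOfBag(int populationSize, int[] sample)
    {
        var drawn = new HashSet<int>(sample);
        return Enumerable.Range(0, populationSize).Where(i => !drawn.Contains(i)).ToArray();
    }

    // Textbook 0.632 estimate: a weighted mix of training and out-of-bag error.
    public static double Estimate632(double trainingError, double outOfBagError)
    {
        return 0.368 * trainingError + 0.632 * outOfBagError;
    }
}
```

Because indices are drawn with replacement, each sample usually leaves part of the population out, and those left-out points are what makes an out-of-bag validation error available.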
Gets a subset of the training and testing sets.
Index of the subsample.
The input data x.
The output data y.
The weights of each sample.
A that defines a
data split of a subsample of the dataset.
Class for representing results acquired through a
k-fold cross-validation analysis.
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
Gets the total number of data samples in the entire data set.
Gets the average number of data samples in
each cross-validation fold of the data set.
Gets the models created for each fold of the cross validation.
Gets or sets a tag for user-defined information.
Gets the number of inputs accepted by the model.
The number of inputs.
This property is read only.
Gets the number of outputs generated by the model.
The number of outputs.
This property is read only.
Initializes a new instance of the class.
The models created during the cross-validation runs.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Please specify how the results of the different models should be combined by setting the CombineMethod property.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location where the result of
this transformation should be stored.
The output generated by applying this
transformation to the given input.
Please specify how the results of the different models should be combined by setting the CombineMethod property.
Gets or sets the method used to combine the scores of different classifiers.
Base class for performance measurement methods based on splitting the data into multiple sets,
such as ,
and .
The type of the result learned by the validation method (e.g. ).
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
Initializes a new instance of the class.
Common interface for data splits.
The type of the input being partitioned into splits.
The type of the output being partitioned into splits.
Gets or sets the index of the split in relation to the original dataset, if applicable.
Training-Validation-Testing data split.
The type of the input being partitioned into splits.
The type of the output being partitioned into splits.
Gets or sets the index of the split in relation to the original dataset, if applicable.
Initializes a new instance of the class.
Initializes a new instance of the class.
The index associated with this subset, if any.
The input instances in this subset.
The output instances in this subset.
The weights associated with the input instances.
The indices of the training instances in relation to the original dataset.
The indices of the validation instances in relation to the original dataset.
The indices of the testing instances in relation to the original dataset.
Training-Validation-Testing data split.
The type of the input being partitioned into splits.
The type of the output being partitioned into splits.
Gets or sets the index of the split in relation to the original dataset, if applicable.
Initializes a new instance of the class.
Initializes a new instance of the class.
The index associated with this subset, if any.
The input instances in this subset.
The output instances in this subset.
The weights associated with the input instances.
The indices of the training instances in relation to the original dataset.
The indices of the validation instances in relation to the original dataset.
Contains results from the grid-search procedure.
The type of the machine learning model whose parameters should be searched.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Contains results from the grid-search procedure.
The type of the machine learning model whose parameters should be searched.
The type that specifies how the value for a single parameter is represented.
The type of the input data. Default is double[].
The type of the output data. Default is int.
Gets all combinations of parameters tried.
Gets all models created during the search.
Gets the error for each of the created models.
Gets exceptions found during the learning of each of the created models, if any.
Gets the index of the best found model
in the collection.
Gets the best model found.
Gets the best parameter combination found.
Gets the minimum validation error found. If this
result has been retrieved through Grid-Search Cross-Validation,
this will correspond to the minimum average validation error
for the different data splits (validation folds).
Gets the size of the grid used in the grid-search.
Gets the number of inputs accepted by the model.
Gets the number of outputs generated by the model.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location where the result of
this transformation should be stored.
The output generated by applying this
transformation to the given input.
Non-generic interface for .
Gets or sets the index of the current value in the search.
Gets the number of values that this parameter can assume (the length of the parameter range).
Training and validation errors of a model.
The type of the model.
Gets or sets the set name (e.g. "Training" or "Testing").
The name of this set.
Gets the model.
Gets the indices of the samples in this subset in the original complete dataset.
Gets the number of samples in this subset.
Gets how much this subset represents, in proportion, of the original dataset.
Gets the metric value for the model in the current set.
Gets the variance of the validation value for the model, if available.
Gets the standard deviation of the validation value for the model, if available.
Gets or sets a tag for user-defined information.
Initializes a new instance of the class.
The model computed in this subset.
The indices of the samples in this subset.
The name of this set.
The proportion of samples in this subset, compared to the full dataset.
Information class to store the training and validation errors of a model.
Gets the total number of samples contained in the subset used to teach the .
Gets the average number of samples between the
and sets.
The average number of samples.
Gets the model.
Gets or sets the index of this split.
The data set split index.
Gets or sets a tag for user-defined information.
Initializes a new instance of the class.
The model.
The index of this subset in relation to the entire set, if applicable.
Gets or sets the number of inputs accepted by the model.
The number of inputs.
Gets or sets the number of outputs generated by the model.
The number of outputs.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
k-Fold cross-validation (with support for stratification and default loss function for classification).
Cross-validation is a technique for estimating the performance of a predictive
model. It can be used to measure how the results of a statistical analysis will
generalize to an independent data set. It is mainly used in settings where the
goal is prediction, and one wants to estimate how accurately a predictive model
will perform in practice.
One round of cross-validation involves partitioning a sample of data into
complementary subsets, performing the analysis on one subset (called the
training set), and validating the analysis on the other subset (called the
validation set or testing set). To reduce variability, multiple rounds of
cross-validation are performed using different partitions, and the validation
results are averaged over the rounds.
References:
- Wikipedia, The Free Encyclopedia. Cross-validation (statistics). Available at:
http://en.wikipedia.org/wiki/Cross-validation_(statistics)
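The partition-and-average procedure described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the library's implementation: k_fold_indices and cross_validate are hypothetical names, and fit_and_score stands for any user-supplied function that trains on the training indices and returns a score on the validation indices.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, fit_and_score, seed=0):
    """Each round validates on one fold and trains on the remaining folds;
    the validation results are averaged over the k rounds."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, validation in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(training, validation))
    return sum(scores) / k


folds = k_fold_indices(10, 3)
# Hypothetical scorer: returns the fraction of samples held out in each round.
mean_score = cross_validate(10, 3, lambda train, val: len(val) / (len(train) + len(val)))
```

Every sample lands in exactly one validation fold, so each sample is used for validation once and for training k - 1 times.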
The type of the machine learning model.
The type of the input data.
Gets or sets a value indicating whether the prevalence of an output
label should be balanced between training and testing sets. Default is false.
true if this instance is stratified; otherwise, false.
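One way to picture stratification is to deal the samples of each class separately across the folds, so that every fold keeps roughly the label proportions of the full set. The Python sketch below assumes this round-robin strategy; stratified_folds is a hypothetical name and the library may balance the folds differently.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k, seed=0):
    """Assign samples to k folds so that each fold keeps roughly
    the same label proportions as the whole dataset."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        # Deal the samples of this class round-robin across the folds.
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, 4)
```

With 8 samples of class 0 and 4 of class 1, each of the 4 folds receives two samples of class 0 and one of class 1.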
Initializes a new instance of the class.
Creates a list of the sample indices that should serve as the validation set.
The input data from which subsamples should be drawn.
The output data from which subsamples should be drawn.
The number of folds to be created.
The indices of the samples in the original set that should compose the validation set.
k-Fold cross-validation.
Cross-validation is a technique for estimating the performance of a predictive
model. It can be used to measure how the results of a statistical analysis will
generalize to an independent data set. It is mainly used in settings where the
goal is prediction, and one wants to estimate how accurately a predictive model
will perform in practice.
One round of cross-validation involves partitioning a sample of data into
complementary subsets, performing the analysis on one subset (called the
training set), and validating the analysis on the other subset (called the
validation set or testing set). To reduce variability, multiple rounds of
cross-validation are performed using different partitions, and the validation
results are averaged over the rounds.
References:
- Wikipedia, The Free Encyclopedia. Cross-validation (statistics). Available at:
http://en.wikipedia.org/wiki/Cross-validation_(statistics)
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
Initializes a new instance of the class.
Split-Set Validation (with support for stratification and default loss function for classification).
The type of the machine learning model.
The type of the input data.
Gets or sets a value indicating whether the prevalence of an output
label should be balanced between training and testing sets. Default is false.
true if this instance is stratified; otherwise, false.
Initializes a new instance of the class.
Creates a list of the sample indices that should serve as the validation set.
The input data from which subsamples should be drawn.
The output data from which subsamples should be drawn.
The indices of the samples in the original set that should compose the validation set.
Split-Set Validation.
The type of the machine learning model.
The type of the input data.
The type of the output data or labels.
Gets the group labels assigned to each of the data samples.
Gets or sets the proportion of samples that should be
reserved in the validation set. Default is 20%.
Gets or sets the proportion of samples that should be
reserved in the training set. Default is 80%.
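The two proportions are complementary: with the default 20% reserved for validation, the remaining 80% is used for training. A minimal Python sketch of such a split follows; split_set is a hypothetical helper, and rounding the validation size to the nearest integer is an assumption of this sketch.

```python
import random


def split_set(n_samples, validation_proportion=0.2, seed=0):
    """Return (training_indices, validation_indices) for a single random split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_validation = round(n_samples * validation_proportion)
    validation = sorted(indices[:n_validation])
    training = sorted(indices[n_validation:])
    return training, validation


training, validation = split_set(50)
```

For 50 samples this reserves 10 indices for validation and 40 for training, with no index shared between the two sets.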
Gets the indices of elements in the validation set.
Gets the indices of elements in the training set.
Initializes a new instance of the class.
Creates a list of the sample indices that should serve as the validation set.
The input data from which subsamples should be drawn.
The output data from which subsamples should be drawn.
The indices of the samples in the original set that should compose the validation set.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Please set the Learner property before calling the Learn(x, y) method.
Training-Validation-Testing data split.
The type of the input being partitioned into splits.
The type of the output being partitioned into splits.
Gets or sets the index of the split in relation to the original dataset, if applicable.
Initializes a new instance of the class.
Initializes a new instance of the class.
The index associated with this subset, if any.
The input instances in this subset.
The output instances in this subset.
The weights associated with the input instances.
The indices of the training instances in relation to the original dataset.
The indices of the validation instances in relation to the original dataset.
Training-Validation split.
The type being separated in training and validation splits.
Gets or sets the training split.
Gets or sets the validation split.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Training-Validation-Test split.
The type being separated in training, validation and test splits.
Gets or sets the training split.
Gets or sets the validation split.
Gets or sets the testing split.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Training-Test split.
The type being separated in training and test splits.
Gets or sets the training split.
Gets or sets the testing split.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Codebook learning statistics for models.
Gets or sets the number of instances (i.e. images or audio signals) in the training set.
The number of instances (i.e. images or audio signals).
Gets or sets the total number of descriptors seen in the training set.
The total number of descriptors.
Gets or sets the count distribution of the descriptors seen in the training set.
Gets or sets the minimum and maximum number of descriptors per instance seen in the training set.
Gets or sets the number of instances (i.e. images or audio signals)
actually used in the learning of the .
The number of instances.
Gets or sets the number of descriptors actually used
in the learning of the .
The total number of descriptors.
Gets or sets the count distribution of the descriptors actually
used in the learning of the .
Gets or sets the minimum and maximum number of descriptors per instance
actually used in the learning of the .
Bag of words.
The bag-of-words (BoW) model can be used to extract fixed-length
features from otherwise variable-length representations.
The following example shows how to use Bag-of-Words to convert other kinds of sequences
into fixed-length representations. In particular, we apply Bag-of-Words to convert data
from the PENDIGITS handwritten digit recognition dataset and afterwards convert their
representations using a .
Constructs a new .
Bag of words.
The bag-of-words (BoW) model can be used to extract fixed-length
features from otherwise variable-length representations.
The following example shows how to use Bag-of-Words to convert other kinds of sequences
into fixed-length representations. In particular, we apply Bag-of-Words to convert data
from the PENDIGITS handwritten digit recognition dataset and afterwards convert their
representations using a .
Constructs a new .
Base class for Bag of Audiovisual Words implementations.
Gets the number of words in this codebook.
Gets or sets the maximum number of descriptors that should be used
to learn the codebook. Default is 0 (meaning to use all descriptors).
The maximum number of samples.
Gets or sets the maximum number of descriptors per image that should be
used to learn the codebook. Default is 0 (meaning to use all descriptors).
The maximum number of samples per image.
Gets the clustering algorithm used to create this model.
Gets the feature extractor used to identify features in the input data.
Gets the number of inputs accepted by the model.
The number of inputs.
Gets the number of outputs generated by the model.
The number of outputs.
Gets statistics about the last codebook learned.
Constructs a new .
Initializes this instance.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Executes a parallel for using the feature detector in a thread-safe way.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Generic learn method implementation that should work for any input type.
This method is useful for re-using code between methods that accept Bitmap,
BitmapData, UnmanagedImage, filenames as strings, etc.
The input type.
The inputs.
The weights.
A function that knows how to process the input
and extract features from it.
The trained model.
Base class for Bag of Visual Words implementations.
Gets the number of words in this codebook.
Gets the clustering algorithm used to create this model.
Gets the number of inputs accepted by the model.
The number of inputs.
Gets the number of outputs generated by the model.
The number of outputs.
Constructs a new .
Initializes this instance.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Weighting schemes for term-frequency (TF).
Binary TF variant (0, 1).
Raw frequency variant (f_{t,d}).
Log normalization (1 + log(f_{t,d})).
Double normalization (0.5 + 0.5 * f_{t,d} / max_{t'} f_{t',d}).
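The four term-frequency variants listed above can be compared side by side on the raw counts of one document. This Python sketch is only an illustration: term_frequency and the scheme names are hypothetical, and returning 0 for absent terms under log normalization is an assumption, since log(0) is undefined.

```python
import math


def term_frequency(counts, scheme="raw"):
    """counts[t] is f_{t,d}, the raw number of occurrences of term t in one document."""
    peak = max(counts) if counts else 0

    def weigh(f):
        if scheme == "binary":
            return 1.0 if f > 0 else 0.0
        if scheme == "raw":
            return float(f)
        if scheme == "log":
            # Assumption: absent terms get weight 0 instead of 1 + log(0).
            return 1.0 + math.log(f) if f > 0 else 0.0
        if scheme == "double":
            return 0.5 + 0.5 * f / peak if peak > 0 else 0.0
        raise ValueError("unknown scheme: " + scheme)

    return [weigh(f) for f in counts]


counts = [0, 1, 4]
binary = term_frequency(counts, "binary")
raw = term_frequency(counts, "raw")
log_tf = term_frequency(counts, "log")
double = term_frequency(counts, "double")
```

Note that double normalization never drops below 0.5, even for terms that do not occur in the document, which dampens the bias toward longer documents.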
Weighting schemes for Inverse Document Frequency (IDF).
Unary (1).
Inverse document frequency, log(N / n_t).
Smooth inverse document frequency, log(N / (1 + n_t)).
Max inverse document frequency, log(max_{t'} n_{t'} / n_t).
Probabilistic inverse document frequency, log((N - n_t) / n_t).
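The inverse document frequency variants can be written directly from the formulas above, with N the total number of documents and n_t the number of documents containing term t. The Python sketch below is illustrative only: inverse_document_frequency and the scheme names are hypothetical, and it assumes 0 < n_t < N so that every logarithm is defined.

```python
import math


def inverse_document_frequency(doc_counts, n_docs, scheme="idf"):
    """doc_counts[t] is n_t; n_docs is N. Assumes 0 < n_t < N for every term."""
    peak = max(doc_counts)

    def weigh(n_t):
        if scheme == "unary":
            return 1.0
        if scheme == "idf":
            return math.log(n_docs / n_t)
        if scheme == "smooth":
            return math.log(n_docs / (1 + n_t))
        if scheme == "max":
            return math.log(peak / n_t)
        if scheme == "probabilistic":
            return math.log((n_docs - n_t) / n_t)
        raise ValueError("unknown scheme: " + scheme)

    return [weigh(n_t) for n_t in doc_counts]


doc_counts = [1, 2, 4]  # n_t for three terms
n_docs = 8              # N
idf = inverse_document_frequency(doc_counts, n_docs, "idf")
smooth = inverse_document_frequency(doc_counts, n_docs, "smooth")
max_idf = inverse_document_frequency(doc_counts, n_docs, "max")
prob = inverse_document_frequency(doc_counts, n_docs, "probabilistic")
```

A TF-IDF vector is then the element-wise product of a term-frequency vector and the chosen IDF vector.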
Term Frequency - Inverse Document Frequency (TF-IDF).
Gets the number of documents that contain each code word. Each element
is associated with a word, and the value of the element gives the number
of documents that contain this word.
Gets the total number of documents considered by this TF-IDF.
Gets the number of words in this codebook.
Gets the number of inputs accepted by the model.
The number of inputs.
Gets the number of outputs generated by the model.
The number of outputs.
Gets or sets the inverse document frequency (IDF) definition to be used.
Gets or sets the term frequency (TF) definition to be used.
Gets or sets a value indicating whether new words should be added to the
dictionary in the next call to .
Gets the inverse document frequency vector used to scale term-frequency vectors.
Constructs a new .
Constructs a new .
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Examples are available in the main documentation page for
. One of those examples is reproduced below:
Bag of words.
The bag-of-words (BoW) model can be used to extract fixed-length
features from otherwise variable-length representations.
The following example shows how to use Bag-of-Words to convert other kinds of sequences
into fixed-length representations. In particular, we apply Bag-of-Words to convert data
from the PENDIGITS handwritten digit recognition dataset and afterwards convert their
representations using a .
Gets the number of words in this codebook.
Gets the number of outputs generated by the model.
The number of outputs.
Gets the number of inputs accepted by the model.
The number of inputs.
Gets the forward dictionary which translates
string tokens to integer labels.
Gets the reverse dictionary which translates
integer labels into string tokens.
Gets or sets the maximum number of occurrences of a word which
should be registered in the feature vector. Default is 1 (if a
word occurs, corresponding feature is set to 1).
Constructs a new .
The texts to build the bag of words model from.
Constructs a new .
The texts to build the bag of words model from.
Constructs a new .
Computes the Bag of Words model.
Gets the codeword representation of a given text.
The text to be processed.
An integer vector whose length equals the number of words
in the codebook.
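The interplay between the forward dictionary, the reverse dictionary, the occurrence cap and the resulting feature vector can be sketched as follows. This is an illustrative Python sketch that assumes already-tokenized texts; build_codebook and feature_vector are hypothetical names, and ignoring tokens that are not in the dictionary is an assumption of this sketch.

```python
def build_codebook(texts):
    """Build the forward (token -> label) and reverse (label -> token) dictionaries."""
    string_to_code, code_to_string = {}, {}
    for tokens in texts:
        for token in tokens:
            if token not in string_to_code:
                label = len(string_to_code)
                string_to_code[token] = label
                code_to_string[label] = token
    return string_to_code, code_to_string


def feature_vector(tokens, string_to_code, maximum_occurrence=1):
    """Count each known word, capping every count at maximum_occurrence."""
    vector = [0] * len(string_to_code)
    for token in tokens:
        label = string_to_code.get(token)  # unknown tokens are skipped
        if label is not None and vector[label] < maximum_occurrence:
            vector[label] += 1
    return vector


texts = [["the", "cat", "sat"], ["the", "dog", "sat", "sat"]]
forward, reverse = build_codebook(texts)
capped = feature_vector(texts[1], forward)                        # cap of 1: presence only
counted = feature_vector(texts[1], forward, maximum_occurrence=2)
```

With the default cap of 1 the vector only records whether a word is present; raising the cap lets repeated words contribute up to that many counts.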
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The output generated by applying this
transformation to the given input.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Applies the transformation to an input, producing an associated output.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Applies the transformation to a set of input vectors,
producing an associated set of output vectors.
The input data to which
the transformation should be applied.
The location to where to store the
result of this transformation.
The output generated by applying this
transformation to the given input.
Creates the default clustering algorithm for Bag-of-Words models (k-means).
The number of clusters for k-means.
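Once a clustering algorithm such as k-means has produced the cluster centers (the codebook), any number of descriptors can be turned into a vector with one entry per cluster. The Python sketch below assumes the centers are already available and uses squared Euclidean distance; nearest_word and bag_of_words are hypothetical names, not library functions.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bag_of_words(descriptors, codebook):
    """Histogram with one bin per codeword, whatever the number of descriptors."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, codebook)] += 1
    return histogram


# Hypothetical codebook with three cluster centers.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
few = [(1.0, 1.0), (9.0, 9.5)]
many = [(0.5, 0.2), (0.1, 9.0), (9.0, 9.0), (10.0, 11.0), (1.0, 0.0)]
few_histogram = bag_of_words(few, codebook)
many_histogram = bag_of_words(many, codebook)
```

Both inputs yield a vector of length 3 even though they contain 2 and 5 descriptors respectively, which is what makes the representation usable by fixed-length classifiers.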
Association rule.
The item type.
Gets or sets the set of items that triggers the
activation of this association rule.
Gets or sets the set of items that are also likely to
be included in the original list given that
the input items are present.
Gets or sets the number of cases that support this rule
(the number of times this association has been seen in the
training set).
Gets or sets the confidence of this rule (as a percentage).
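Support and confidence can be computed directly from a list of transactions. In this Python sketch, support and confidence are hypothetical helpers; confidence is returned as a fraction between 0 and 1, which can be multiplied by 100 to read it as a percentage.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return support(x | y, transactions) / support(x, transactions)


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

For the rule bread -> milk, two transactions support the rule and three contain bread, so the confidence is 2/3.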
Determines whether this rule can be applied to a given input.
The set of elements (being bought, or processed).
True, if this rule can be applied to the given set of inputs; false otherwise.
Determines whether this rule can be applied to a given input.
The set of elements (being bought, or processed).
True, if this rule can be applied to the given set of inputs; false otherwise.
Returns a string that represents this instance.
A string that represents this instance.
A-priori algorithm for association rule mining.
References:
- Anita Wasilewska, Lecture Notes. Available at:
http://www3.cs.stonybrook.edu/~cse634/lecture_notes/07apriori.pdf
Initializes a new instance of the class.
The minimum number of times a rule should be detected (also known as its support)
before it can be registered as a permanent rule in the learned classifier.
The minimum confidence in an association rule before it is
registered.
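The role of the two thresholds can be illustrated with a compact version of the level-wise A-priori search. This Python sketch is a textbook-style illustration under stated assumptions, not the library's implementation: apriori and rules are hypothetical names, support is an absolute count and confidence is a fraction.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset seen at least min_support times."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    frequent = count({frozenset([i]) for t in transactions for i in t})
    result = dict(frequent)
    size = 2
    while frequent:
        # Join step: combine frequent itemsets into candidates with `size` items.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, size - 1))}
        frequent = count(candidates)
        result.update(frequent)
        size += 1
    return result


def rules(frequent, min_confidence):
    """Derive rules X -> Y from the frequent itemsets, keeping the confident ones."""
    found = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(itemset, r):
                x = frozenset(x)
                conf = n / frequent[x]  # support(X and Y) / support(X)
                if conf >= min_confidence:
                    found.append((x, itemset - x, n, conf))
    return found


transactions = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
frequent = apriori(transactions, min_support=3)
found = rules(frequent, min_confidence=0.7)
```

The pruning step relies on the fact that an itemset cannot be frequent unless all of its subsets are frequent, which is also why the lookup of the antecedent's support in rules always succeeds.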
A-priori algorithm for association rule mining.
The dataset item type. Default is int.
References:
- Anita Wasilewska, Lecture Notes. Available at:
http://www3.cs.stonybrook.edu/~cse634/lecture_notes/07apriori.pdf
Gets the set of most frequent items and the respective
number of times they appear in the training dataset.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
The token.
Initializes a new instance of the class.
The minimum number of times a rule should be detected (also known as its support)
before it can be registered as a permanent rule in the learned classifier.
The minimum confidence in an association rule before it is
registered.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Association rule matcher.
The item type.
Initializes a new instance of the class.
The number of distinct items in the dataset.
The association rules between items of the dataset.
Gets the number of items seen by the model during training.
Gets the number of association rules seen by the model.
Gets the number of classes expected and recognized by the classifier.
The number of classes.
Gets or sets the association rules in this model.
Gets or sets the minimum confidence threshold used to
determine whether a rule applies to an input or not.
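How the confidence threshold interacts with rule matching can be sketched as follows. This Python sketch assumes that a rule applies when all of its triggering items are present in the input and its confidence reaches the threshold; suggest is a hypothetical function and the rule tuples are made-up examples.

```python
def suggest(basket, rules, threshold):
    """rules is a list of (x, y, confidence) tuples. Return the (y, confidence)
    pairs of every applicable rule, most confident first."""
    basket = set(basket)
    applicable = [(y, conf) for x, y, conf in rules
                  if conf >= threshold and x <= basket]
    applicable.sort(key=lambda pair: pair[1], reverse=True)
    return applicable


# Hypothetical rules: (triggering items, suggested items, confidence).
rules = [
    ({"bread"}, {"butter"}, 0.6),
    ({"bread", "milk"}, {"eggs"}, 0.9),
    ({"tea"}, {"sugar"}, 0.95),
]
loose = suggest({"bread", "milk"}, rules, threshold=0.5)
strict = suggest({"bread", "milk"}, rules, threshold=0.8)
```

Raising the threshold removes the weaker bread -> butter rule, while the tea -> sugar rule never applies because tea is not in the input.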
Predicts a class label vector for the given input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
Predicts a class label vector for the given input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
A class-label that best describes the input according
to this classifier.
Predicts a class label vector for each input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
Predicts a class label vector for each input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
The output generated by applying this transformation to the given input.
Predicts a class label vector for the given input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
Predicts a class label vector for the given input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
Predicts a class label vector for each input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
Predicts a class label vector for each input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
An array where the distances will be stored,
avoiding unnecessary memory allocations.
A class-label that best describes the input according
to this classifier.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The output generated by applying this transformation to the given input.
Base class for Naive Bayes learning algorithms.
The type for the Naive Bayes model to be learned.
The univariate distribution to be used as components in the Naive Bayes distribution.
The type for the samples modeled by the distribution.
The fitting options for the independent distribution.
Gets or sets the parallelization options for this algorithm.
Gets or sets the model being learned.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Gets or sets whether the class priors should be estimated
from the data.
Gets or sets the fitting options to use when
estimating the class-specific distributions.
Gets or sets the distribution creation function. This function can
be used to specify how the initial distributions of the model should
be created. By default, this function attempts to call the empty
constructor of the distribution using Activator.CreateInstance().
Constructs a new Naïve Bayes learning algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair.
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair.
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair.
A model that has learned how to produce the desired outputs given the inputs.
Fits one of the distributions in the naive bayes model.
Base class for Naive Bayes learning algorithms.
The type for the Naive Bayes model to be learned.
The univariate distribution to be used as components in the Naive Bayes distribution.
The type for the samples modeled by the distribution.
The fitting options for the independent distribution.
The individual fitting options for the component distributions.
Fits one of the distributions in the naive bayes model.
Naïve Bayes learning algorithm.
For basic examples on how to learn a Naive Bayes algorithm, please see
page. The following examples show how to set
more specialized learning settings for Normal (Gaussian) models.
Creates an instance of the model to be learned.
Naïve Bayes learning algorithm.
For basic examples on how to learn a Naive Bayes algorithm, please see
page. The following examples show how to set
more specialized learning settings for Normal (Gaussian) models.
Creates an instance of the model to be learned.
Naïve Bayes learning algorithm.
For basic examples on how to learn a Naive Bayes algorithm, please see
page. The following examples show how to set
more specialized learning settings for Normal (Gaussian) models.
Creates an instance of the model to be learned.
Naïve Bayes learning algorithm for discrete distribution models.
For basic examples on how to learn a Naive Bayes algorithm, please see
page. The following examples show how to set
more specialized learning settings for discrete models.
Creates an instance of the model to be learned.
Bayes decision algorithm (not naive).
The type for the distributions used to model each class.
The type for the samples modeled by the distributions.
Gets the probability distributions for each class and input.
A TDistribution[,] array in which each row corresponds to a
class and each column corresponds to an input variable. Each element
of this array is a probability distribution modeling
the occurrence of the input variable in the corresponding class.
Gets the prior beliefs for each class.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initialization function used to create the distribution functions for
each class. Those will be available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initialization function used to create the distribution functions for
each class. Those will be available in the property.
Computes the log-likelihood that the given input vector
belongs to the specified class.
The input vector.
The index of the class whose score will be computed.
Naïve Bayes Classifier for arbitrary distributions of arbitrary elements.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initialization function used to create the distribution functions for
each class. Those will be available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initialization function used to create the distribution functions for
each class. Those will be available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Naïve Bayes Classifier for arbitrary distributions.
A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem
with strong (naive) independence assumptions. A more descriptive term for the underlying probability
model would be "independent feature model".
In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular
feature of a class is unrelated to the presence (or absence) of any other feature, given the class
variable. In spite of their naive design and apparently over-simplified assumptions, naive Bayes
classifiers have worked quite well in many complex real-world situations.
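The independence assumption described above can be written out as a short worked formula. The notation is introduced here for illustration only: $c$ denotes a class, $x_1, \dots, x_n$ the feature values of one input, and $\hat{c}$ the predicted class.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} \;=\; \arg\max_{c} \Big[ \log P(c) + \sum_{i=1}^{n} \log P(x_i \mid c) \Big]
```

Working with the sum of logarithms rather than the product of probabilities avoids numerical underflow when there are many features.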
This class implements an arbitrary-distribution (real-valued) Naive-Bayes classifier. There is
also a special named constructor to create classifiers
assuming normal distributions for each variable. For a discrete (integer-valued) distribution
classifier, please see .
References:
- Wikipedia contributors. "Naive Bayes classifier." Wikipedia, The Free Encyclopedia.
  Wikipedia, The Free Encyclopedia, 16 Dec. 2011. Web. 5 Jan. 2012.
This page contains two examples, one using text and another one using normal double vectors.
The first example is the classic example given by Tom Mitchell. If you are not interested
in text or in this particular example, please jump to the second example below.
In the first example, we will be using a mixed-continuous version of the famous Play Tennis
example by Tom Mitchell (1998). In Mitchell's example, one would like to infer if a person
would play tennis or not based solely on four input variables. The original variables were
categorical, but in this example, two of them will be categorical and two will be continuous.
The rows, or instances presented below represent days on which the behavior of the person
has been registered and annotated, pretty much building our set of observation instances for
learning:
In order to estimate a discrete Naive Bayes, we will first convert this problem to a simpler
representation. Since some variables are categories, it does not matter whether they are represented
as strings or as numbers, since both are just symbols for the event they represent. Since numbers
are more easily representable than text strings, we will convert the problem to use a discrete
alphabet through the use of a codebook.
A codebook effectively transforms any distinct possible value for a variable into an integer
symbol. For example, “Sunny” could as well be represented by the integer label 0, “Overcast”
by 1, “Rain” by 2, and the same goes for the other variables. So:
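As a language-neutral illustration of the codebook idea, the following is a minimal Python sketch. It is not the framework's codification API; the helper name build_codebook and the short sample column are hypothetical and exist only for this example.

```python
def build_codebook(values):
    """Assign consecutive integer symbols to distinct values,
    in order of first appearance."""
    codebook = {}
    for value in values:
        if value not in codebook:
            codebook[value] = len(codebook)
    return codebook


# Hypothetical sample column, used only for illustration.
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Overcast"]

codebook = build_codebook(outlook)
encoded = [codebook[v] for v in outlook]

print(codebook)  # {'Sunny': 0, 'Overcast': 1, 'Rain': 2}
print(encoded)   # [0, 0, 1, 2, 2, 1]
```

The same dictionary can be inverted to translate a predicted integer label back to its original string.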
Now that we already have our learning input/output pairs, we should specify our
Bayes model. We will be trying to build a model to predict the last column, entitled
“PlayTennis”. For this, we will be using the “Outlook”, “Temperature”, “Humidity” and
“Wind” as predictors (the variables which we will use for our decision).
Now that we have created and estimated our classifier, we
can query the classifier for new input samples through the
NaiveBayes{TDistribution}.Decide(double[]) method.
In this second example, we will be creating a simple multi-class
classification problem using integer vectors and learning a discrete
Naive Bayes on those vectors.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
A function that can initialize the distribution components of all classes
modeled by this Naive Bayes. This distribution will be cloned and made
available in the property. The first argument
in the function should be the classIndex, and the second the variableIndex.
Gets the probability distributions for each class and input.
A TDistribution[,] array in which each row corresponds to a
class and each column corresponds to an input variable. Each element
of this array is a probability distribution modeling
the occurrence of the input variable in the corresponding class.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Saves the Naïve Bayes model to a stream.
The stream to which the Naïve Bayes model is to be serialized.
Saves the Naïve Bayes model to a file.
The path to the file to which the Naïve Bayes model is to be serialized.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
Obsolete.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
Obsolete.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
Obsolete.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
Obsolete.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. Those distributions
will be made available in the property.
Naïve Bayes Classifier.
A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem
with strong (naive) independence assumptions. A more descriptive term for the underlying probability
model would be "independent feature model".
In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular
feature of a class is unrelated to the presence (or absence) of any other feature, given the class
variable. In spite of their naive design and apparently over-simplified assumptions, naive Bayes
classifiers have worked quite well in many complex real-world situations.
This class implements a discrete (integer-valued) Naive-Bayes classifier. There is also a
special named constructor to create classifiers assuming normal
distributions for each variable. For arbitrary distribution classifiers, please see
.
References:
- Wikipedia contributors. "Naive Bayes classifier." Wikipedia, The Free Encyclopedia.
  Wikipedia, The Free Encyclopedia, 16 Dec. 2011. Web. 5 Jan. 2012.
In this example, we will be using the famous Play Tennis example by Tom Mitchell (1998).
In Mitchell's example, one would like to infer if a person would play tennis or not
based solely on four input variables. Those variables are all categorical, meaning that
there is no order between the possible values for the variable (i.e. there is no order
relationship between Sunny and Rain; one is neither bigger nor smaller than the other, they are
just distinct). Moreover, the rows, or instances presented below represent days on which the
behavior of the person has been registered and annotated, pretty much building our set of
observation instances for learning:
Obs: The DataTable representation is not required, and instead the NaiveBayes could
also be trained directly on integer arrays containing the integer codewords.
In order to estimate a discrete Naive Bayes, we will first convert this problem to a simpler
representation. Since all variables are categories, it does not matter whether they are represented
as strings or as numbers, since both are just symbols for the event they represent. Since numbers
are more easily representable than text strings, we will convert the problem to use a discrete
alphabet through the use of a codebook.
A codebook effectively transforms any distinct possible value for a variable into an integer
symbol. For example, “Sunny” could as well be represented by the integer label 0, “Overcast”
by 1, “Rain” by 2, and the same goes for the other variables. So:
Now that we already have our learning input/output pairs, we should specify our
Bayes model. We will be trying to build a model to predict the last column, entitled
“PlayTennis”. For this, we will be using the “Outlook”, “Temperature”, “Humidity” and
“Wind” as predictors (the variables which we will use for our decision). Since those
are categorical, we must specify, at the moment of creation of our Bayes model, the
number of possible symbols for each of those variables.
Now that we have created and estimated our classifier, we
can query the classifier for new input samples through the
Decide method.
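To make the learn-then-decide workflow concrete, here is a minimal, self-contained Python sketch of a discrete Naive Bayes. It is an illustration under stated assumptions and not the framework's implementation: the function names, the add-one (Laplace) smoothing, and the tiny data set are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def learn_discrete_naive_bayes(inputs, outputs, symbols, classes):
    """Estimate log-priors and per-class, per-variable log-probabilities.

    symbols[j] is the number of possible symbols of input variable j.
    Add-one smoothing is an assumption of this sketch; it keeps every
    probability strictly positive so that the logarithms are defined.
    """
    n = len(outputs)
    class_counts = Counter(outputs)
    log_prior = [math.log((class_counts[c] + 1) / (n + classes))
                 for c in range(classes)]
    log_cond = []
    for c in range(classes):
        rows = [x for x, y in zip(inputs, outputs) if y == c]
        per_variable = []
        for j, k in enumerate(symbols):
            counts = Counter(row[j] for row in rows)
            per_variable.append(
                [math.log((counts[s] + 1) / (len(rows) + k)) for s in range(k)])
        log_cond.append(per_variable)
    return log_prior, log_cond


def decide(model, x):
    """Return the class index with the highest log-posterior score."""
    log_prior, log_cond = model
    scores = [lp + sum(lc[j][v] for j, v in enumerate(x))
              for lp, lc in zip(log_prior, log_cond)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical data: two binary variables; the class follows the first one.
inputs = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [1, 1]]
outputs = [0, 0, 1, 1, 0, 1]

model = learn_discrete_naive_bayes(inputs, outputs, symbols=[2, 2], classes=2)
print(decide(model, [0, 1]))  # 0
print(decide(model, [1, 0]))  # 1
```

Note that the number of symbols per variable is passed in explicitly, mirroring the requirement stated above that it be known when the model is created.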
Please note that, while the example uses a DataTable to show how data stored in tables
can be loaded in the framework, it is not necessary at all to use DataTables in your own, final
code. For example, please consider the same example shown above, but without DataTables:
In this second example, we will be creating a simple multi-class
classification problem using integer vectors and learning a discrete
Naive Bayes on those vectors.
Like all other learning algorithms in the framework, it is also possible to obtain a better measure
of the performance of the Naive Bayes algorithm using cross-validation, as shown in the example below:
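The idea behind k-fold cross-validation can be sketched in a few lines of Python. This is a generic illustration and not the framework's cross-validation class: the helper names are hypothetical, and the majority-label learner is only a placeholder standing in for any learning algorithm.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(inputs, outputs, k, learn, predict):
    """Average held-out accuracy over k folds."""
    accuracies = []
    for fold in k_fold_indices(len(inputs), k):
        held_out = set(fold)
        train = [i for i in range(len(inputs)) if i not in held_out]
        model = learn([inputs[i] for i in train], [outputs[i] for i in train])
        hits = sum(predict(model, inputs[i]) == outputs[i] for i in fold)
        accuracies.append(hits / len(fold))
    return sum(accuracies) / k


def learn_majority(xs, ys):
    """Placeholder learner: remembers the most frequent training label."""
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    """Placeholder predictor: always answers with the remembered label."""
    return model


# Hypothetical data, used only to exercise the functions.
inputs = [[i] for i in range(6)]
outputs = [0, 0, 0, 0, 0, 1]

score = cross_validate(inputs, outputs, 3, learn_majority, predict_majority)
print(round(score, 4))  # 0.8333
```

In practice the samples would usually be shuffled before being split into folds; the contiguous split is kept here so that the result is easy to verify by hand.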
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of symbols for each input variable.
Gets the number of symbols for each input in the model.
Gets the probability distributions for each class and input.
A TDistribution[,] array in which each row corresponds to a
class and each column corresponds to an input variable. Each element
of this array is a probability distribution modeling
the occurrence of the input variable in the corresponding class.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
The prior probabilities for each output class.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
The prior probabilities for each output class.
Constructs a new Naïve Bayes Classifier.
The number of output classes.
The number of input variables.
An initial distribution to be used to initialize all independent
distribution components of this Naive Bayes. This distribution will
be cloned and made available in the property.
Obsolete.
Obsolete.
Gets the number of symbols for each input in the model.
Gets the number of possible output classes.
Gets the number of inputs in the model.
Computes the most likely class for a given instance.
The input instance.
The most likely class for the instance.
Computes the most likely class for a given instance.
The input instance.
The log-likelihood for the instance.
The response probabilities for each class.
The most likely class for the instance.
Saves the Naïve Bayes model to a stream.
The stream to which the Naïve Bayes model is to be serialized.
Saves the Naïve Bayes model to a file.
The path to the file to which the Naïve Bayes model is to be serialized.
Loads a machine from a stream.
The stream from which the Naïve Bayes model is to be deserialized.
The deserialized machine.
Loads a machine from a file.
The path to the file from which the Naïve Bayes model is to be deserialized.
The deserialized machine.
Loads a machine from a stream.
The stream from which the Naïve Bayes model is to be deserialized.
The deserialized machine.
Loads a machine from a file.
The path to the file from which the Naïve Bayes model is to be deserialized.
The deserialized machine.
Common interface for Bag of Words objects.
The type of the element to be
converted to a fixed-length vector representation.
Gets the number of words in this codebook.
Gets the codeword representation of a given value.
The value to be processed.
A double vector whose length equals the number of words
in the codebook.
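As an illustration of what a fixed-length codeword representation looks like, the following Python sketch counts how often each codebook word occurs in a token sequence. It is a generic bag-of-words example and not the framework's implementation; the function name, the vocabulary, and the choice to ignore unknown tokens are assumptions of this sketch.

```python
def codeword(tokens, vocabulary):
    """Return a vector with one entry per vocabulary word, holding the
    number of times that word occurs in tokens. Tokens that are not in
    the vocabulary are ignored."""
    index = {word: i for i, word in enumerate(vocabulary)}
    vector = [0.0] * len(vocabulary)
    for token in tokens:
        i = index.get(token)
        if i is not None:
            vector[i] += 1.0
    return vector


# Hypothetical vocabulary and input, used only for illustration.
vocabulary = ["cat", "dog", "fish"]
features = codeword(["dog", "cat", "dog", "bird"], vocabulary)
print(features)  # [1.0, 2.0, 0.0]
```

Whatever the length of the input, the output always has as many entries as there are words in the vocabulary.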
Base class for a data cluster.
Gets the collection to which this cluster belongs.
Gets the label for this cluster.
Gets the proportion of samples contained in this cluster.
Initializes a new instance of the class.
The collection that contains this instance as a field.
The number of clusters K.
Gets the cluster definitions.
Gets the number of clusters in the collection.
Gets the cluster at the given index.
The index of the cluster. This should also be the class label of the cluster.
An object holding information about the selected cluster.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Binary split clustering algorithm.
How to perform clustering with Binary Split.
Gets or sets whether cluster proportions
should be calculated after the learning algorithm has finished computing the clusters. Default
is false.
true to compute proportions after learning; otherwise, false.
Initializes a new instance of the Binary Split algorithm.
The number of clusters to divide the input data into.
The distance function to use. Default is to
use the distance.
Initializes a new instance of the Binary Split algorithm.
The number of clusters to divide the input data into.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Gaussian Mixture Model Cluster Collection.
This class contains information about all
Gaussian clusters found during a
estimation.
Given a new sample, this class can be used to find the nearest cluster related
to this sample through the Nearest method.
Gaussian Mixture Model cluster.
This class contains information about a Gaussian cluster found
during a estimation. Clusters
are often contained within a .
Gets the logarithm of the probability density function of the
underlying Gaussian probability distribution
evaluated at point x.
An observation.
The log-probability of x occurring
in the weighted Gaussian distribution.
Gets the probability density function of the
underlying Gaussian probability distribution
evaluated in point x.
An observation.
The probability of x occurring
in the weighted Gaussian distribution.
Gets a copy of the normal distribution associated with this cluster.
Gets the deviance of the points in relation to the cluster.
The input points.
The deviance, measured as -2 * the log-likelihood
of the input points in this cluster.
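The deviance defined above, -2 times the log-likelihood, can be illustrated with a short Python sketch. To stay self-contained it assumes an axis-aligned Gaussian (diagonal covariance) and ignores the cluster's mixing weight; both simplifications, as well as the function names, are assumptions of this example and not the framework's implementation.

```python
import math


def gaussian_log_pdf(x, mean, variance):
    """Log-density of an axis-aligned (diagonal-covariance) Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, variance))


def deviance(points, mean, variance):
    """-2 times the total log-likelihood of the points under the Gaussian."""
    return -2.0 * sum(gaussian_log_pdf(p, mean, variance) for p in points)


# Hypothetical points evaluated under a standard normal in two dimensions.
points = [[0.0, 0.0], [1.0, -1.0]]
d = deviance(points, mean=[0.0, 0.0], variance=[1.0, 1.0])
print(d)
```

For these two points the value works out to 4 log(2π) + 2: each point contributes 2 log(2π), and the second one adds its squared distance from the mean. Lower deviance means the points are better explained by the cluster.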
Gets the mean vector associated with this cluster.
Gets the variance vector associated with this cluster.
Gets the clusters' variance-covariance matrices.
The clusters' variance-covariance matrices.
Gets the cluster's coefficient component.
Gets the component distribution.
Initializes a new instance of the class.
The number of components in the mixture.
Gets the proportion of samples in each cluster.
Gets the mean vectors for the clusters.
Gets the variance for each of the clusters.
Gets the covariance matrices for each of the clusters.
Gets the mixture model represented by this clustering.
Computes the log-likelihood that the given input vector
belongs to the specified class.
The input vector.
The index of the class whose score will be computed.
System.Double.
Gets the deviance of the points in relation to the model.
The input points.
The deviance, measured as -2 * the log-likelihood of the input points.
Gets a copy of the mixture distribution modeled by this Gaussian Mixture Model.
Initializes the model with initial values obtained
through a run of the K-Means clustering algorithm.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Gets or sets the clusters' centroids.
The clusters' centroids.
Gets the collection of clusters currently modeled by the clustering algorithm.
The clusters.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
The distance.
Gets the number of clusters in the collection.
The count.
Gets the at the specified index.
The index.
GaussianCluster.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The data.
The labels.
The weights.
The average square distance from the data points to the nearest
clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the smaller the average distance.
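The following Python sketch illustrates this measure: each point is assigned to its nearest centroid and the squared distances are averaged. It assumes the squared Euclidean distance and unweighted points; the function names and the sample data are hypothetical and not taken from the framework.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def average_square_distance(points, centroids):
    """Mean squared distance from each point to its nearest centroid."""
    total = sum(min(squared_euclidean(p, c) for c in centroids)
                for p in points)
    return total / len(points)


# Hypothetical centroids and two data sets: one tight, one more spread out.
centroids = [[0.0, 0.0], [10.0, 10.0]]
tight = [[0.0, 1.0], [1.0, 0.0], [10.0, 9.0], [9.0, 10.0]]
loose = [[0.0, 3.0], [3.0, 0.0], [10.0, 7.0], [7.0, 10.0]]

print(average_square_distance(tight, centroids))  # 1.0
print(average_square_distance(loose, centroids))  # 9.0
```

The tightly grouped data set yields the smaller value, which matches the interpretation given above.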
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
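A minimal Python sketch of this kind of transformation is shown below: every input point becomes a feature vector with one distance per cluster. It assumes the Euclidean distance and ignores the optional weights; the function name and the data are hypothetical and this is not the framework's implementation.

```python
import math


def transform(points, centroids):
    """Map each point to the vector of its distances to every centroid."""
    return [[math.dist(p, c) for c in centroids] for p in points]


# Hypothetical points and centroids, used only for illustration.
points = [[0.0, 0.0], [3.0, 4.0]]
centroids = [[0.0, 0.0], [3.0, 0.0]]

features = transform(points, centroids)
print(features)  # [[0.0, 3.0], [5.0, 4.0]]
```

The result has one row per input point and one column per cluster, so it can be fed to any algorithm that expects fixed-length feature vectors.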
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Mean shift cluster collection.
Mean shift cluster.
Gets the cluster modes.
Initializes a new instance of the class.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
A class-label that best describes the input according
to this classifier.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the smaller the average distance.
The average square distance from the data points to the nearest
clusters' centroids.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Gets the number of clusters in the collection.
The count.
Gets the collection of clusters currently modeled by the clustering algorithm.
The clusters.
Gets the proportion of samples in each cluster.
Gets the at the specified index.
The index.
GaussianCluster.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
k-Means cluster collection.
k-Means' cluster.
Gets the covariance matrix for the samples in this cluster.
Initializes a new instance of the class.
The number of clusters K.
The distance metric to consider.
Gets the dimensionality of the data space.
Gets the clusters' variance-covariance matrices.
The clusters' variance-covariance matrices.
Gets or sets the clusters' centroids.
The clusters' centroids.
Gets the proportion of samples in each cluster.
Gets the collection of clusters currently modeled by the clustering algorithm.
The clusters.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
The distance.
Gets the number of clusters in the collection.
The count.
Gets the at the specified index.
The index.
GaussianCluster.
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The data.
The labels.
The weights.
The average square distance from the data points to the nearest
clusters' centroids.
The average distance from centroids can be used as a measure
of the "goodness" of the clustering. The more the data are
aggregated around the centroids, the smaller the average distance.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Computes a numerical score measuring the association between
the given vector and a given
.
The input vector.
The index of the class whose score will be computed.
System.Double.
Randomizes the clusters inside a dataset.
The data used to randomize the clusters.
The seeding strategy to be used. Default is .
Common interface for clustering algorithms.
The type of the data being clustered, such as .
Divides the input data into a number of clusters.
The data on which to run the algorithm.
The labellings for the input data.
Gets the collection of clusters currently modeled by the clustering algorithm.
Common interface for clustering algorithms.
The type of the data being clustered, such as .
The type of the weights associated with each point, such as or .
Divides the input data into a number of clusters.
The data on which to run the algorithm.
The weight associated with each data point.
The labellings for the input data.
Common interface for cluster collections.
The type of the data being clustered, such as .
Gets the number of clusters in the collection.
Common interface for cluster collections.
The type of the data being clustered, such as .
The type of the clusters considered by a clustering algorithm.
Gets the cluster at the given index.
The index of the cluster. This should also be the class label of the cluster.
An object holding information about the selected cluster.
k-Modes cluster collection.
k-Modes' cluster.
Gets the dimensionality of the data space.
Initializes a new instance of the class.
The number of clusters K.
The distance metric to use.
Gets or sets the clusters' centroids.
The clusters' centroids.
Gets or sets the distance function used to measure the distance
between a point and the cluster centroid in this clustering definition.
The distance.
Gets the collection of clusters currently modeled by the clustering algorithm.
The clusters.
Gets the proportion of samples in each cluster.
Gets the number of clusters in the collection.
The count.
Gets the at the specified index.
The index.
GaussianCluster.
Randomizes the clusters inside a dataset.
The data to randomize the algorithm.
The seeding strategy to be used. Default is .
Calculates the average square distance from the data points
to the nearest clusters' centroids.
The data.
The labels.
The weights.
The average square distance from the data points to the nearest
clusters' centroids.
The average distance from the centroids can be used as a measure
of the "goodness" of the clustering. The more tightly the data are
aggregated around the centroids, the smaller the average distance.
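As an illustration of this measure, the following is a minimal sketch in plain Python, not the framework's implementation. The function name `average_squared_distance` is hypothetical, the Euclidean distance is assumed, and point weights are ignored for brevity.

```python
def average_squared_distance(points, centroids):
    """Mean squared Euclidean distance from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        # Squared distance to the closest centroid only.
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total / len(points)


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
tight = [(0.0, 1.0), (10.0, 11.0)]  # centroids placed inside each group
loose = [(5.0, 5.0), (5.0, 6.0)]    # centroids placed between the groups

# A lower value indicates data aggregated more closely around the centroids.
print(average_squared_distance(points, tight))
print(average_squared_distance(points, loose))
```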
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
Transform data points into feature vectors containing the
distance between each point and each of the clusters.
The input points.
The label of each input point.
The weight associated with each point.
An optional matrix to store the computed transformation.
A vector containing the distance between the input points and the clusters.
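The transformation described above can be sketched as follows. This is an illustrative plain-Python version under the assumption that the Euclidean distance is used; the function name `transform` and its signature are hypothetical and do not mirror the framework's overloads (labels, weights and the optional result matrix are omitted).

```python
import math


def transform(points, centroids):
    """Map each point to a feature vector holding its distance to every centroid."""
    return [[math.dist(p, c) for c in centroids] for p in points]


centroids = [(0.0, 0.0), (3.0, 0.0)]
features = transform([(0.0, 0.0), (3.0, 4.0)], centroids)
# Each row has one entry per cluster.
print(features)
```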
Returns an enumerator that iterates through the collection.
An enumerator that can be used to iterate through the collection.
Returns an enumerator that iterates through a collection.
An object that can be used to iterate through the collection.
Computes a numerical score measuring the association between
the given vector and a given
.
The input vector.
The index of the class whose score will be computed.
System.Double.
Mean shift clustering algorithm.
Mean shift is a non-parametric feature-space analysis technique originally
presented in 1975 by Fukunaga and Hostetler. It is a procedure for locating
the maxima of a density function given discrete data sampled from that function.
The method iteratively seeks the location of the modes of the distribution using
local updates.
As it is, the method would be intractable; however, some clever optimizations such as
the use of appropriate data structures and seeding strategies as shown in Lee (2011)
and Carreira-Perpinan (2006) can improve its computational speed.
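To make the iterative mode-seeking step concrete, here is a naive sketch in plain Python. It assumes a uniform kernel and the Euclidean distance, uses every input point as a starting position, and applies none of the optimizations mentioned above; the function name `mean_shift` and its parameters are hypothetical and are not the framework's API.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-3):
    """Shift each point toward the local mean until it stops moving, then merge modes."""
    modes = []
    for start in points:
        current = list(start)
        for _ in range(max_iter):
            # Uniform kernel: every point within `bandwidth` counts equally.
            neighbours = [p for p in points if math.dist(p, current) <= bandwidth]
            mean = [sum(axis) / len(neighbours) for axis in zip(*neighbours)]
            shift = math.dist(mean, current)
            current = mean
            if shift < tol:
                break
        modes.append(current)

    # Modes closer than the bandwidth are treated as the same cluster.
    clusters, labels = [], []
    for m in modes:
        for i, c in enumerate(clusters):
            if math.dist(m, c) < bandwidth:
                labels.append(i)
                break
        else:
            clusters.append(m)
            labels.append(len(clusters) - 1)
    return clusters, labels


data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
clusters, labels = mean_shift(data, bandwidth=2.0)
print(len(clusters), labels)
```

The naive loop compares every point against every other point at each iteration, which is why spatial data structures and seeding are needed for larger inputs.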
References:
- Wikipedia, The Free Encyclopedia. Mean-shift. Available on:
  http://en.wikipedia.org/wiki/Mean-shift
- Comaniciu, Dorin, and Peter Meer. "Mean shift: A robust approach toward
  feature space analysis." Pattern Analysis and Machine Intelligence, IEEE
  Transactions on 24.5 (2002): 603-619. Available at:
  http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1000236
- Conrad Lee. Scalable mean-shift clustering in a few lines of python. The
  Sociograph blog, 2011. Available at:
  http://sociograph.blogspot.com.br/2011/11/scalable-mean-shift-clustering-in-few.html
- Carreira-Perpinan, Miguel A. "Acceleration strategies for Gaussian mean-shift image
  segmentation." Computer Vision and Pattern Recognition, 2006 IEEE Computer Society
  Conference on. Vol. 1. IEEE, 2006. Available at:
  http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1640881
The following example demonstrates how to use the Mean Shift algorithm with
a uniform kernel to solve a clustering task:
The following example demonstrates how to use the Mean Shift algorithm for color clustering. It is the same code which can be
found in the
color clustering sample application.
The original image is shown below:
The resulting image will be:
Gets the clusters found by Mean Shift.
Gets or sets the used to
compute distances between points in the clustering.
Gets or sets the bandwidth (radius, or smoothness)
parameter to be used in the mean-shift algorithm.
Gets or sets the maximum number of neighbors which should be
used to determine the direction of the mean-shift during the
computations. Default is zero (unlimited number of neighbors).
Gets or sets whether the algorithm can use parallel
processing to speed up computations. Enabling parallel
processing can, however, produce different results
at each run.
Gets or sets whether to use the agglomeration shortcut,
meaning the algorithm will stop early when it detects that
a sample is going to follow the same path as another sample
when running in parallel.
Gets or sets whether to use seeding to initialize the algorithm.
With seeding, new points will be sampled from a uniform grid in
the range of the input points to be used as seeds. Otherwise, the
input points themselves will be used as the initial centroids for
the algorithm.
Gets or sets whether cluster labels should be computed
at the end of the learning iteration. Setting to False
might save a few computations in case they are not necessary.
Gets or sets whether cluster proportions should be computed
at the end of the learning iteration. Setting to False
might save a few computations in case they are not necessary.
Gets the dimension of the samples being
modeled by this clustering algorithm.
Gets or sets the maximum number of iterations to
be performed by the method. If set to zero, no
iteration limit will be imposed. Default is 0.
Gets or sets the relative convergence threshold
for stopping the algorithm. Default is 1e-3.
Gets or sets the density kernel to be used in the algorithm.
Default is to use the .
Creates a new algorithm.
Creates a new algorithm.
The bandwidth (also known as radius) to consider around samples.
The density kernel function to use.
Creates a new algorithm.
The dimension of the samples to be clustered.
The bandwidth (also known as radius) to consider around samples.
The density kernel function to use.
Divides the input data into clusters.
The data on which to run the algorithm.
Divides the input data into clusters.
The data on which to run the algorithm.
The weight associated with each data point.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data .
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data .
Barnes-Hut t-SNE.
The code contained in this class was adapted from Laurens van der Maaten's excellent
BH t-SNE code from https://github.com/lvdmaaten/bhtsne/. The original license is
listed below:
Copyright (c) 2014, Laurens van der Maaten (Delft University of Technology)
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software
must display the following acknowledgement:
This product includes software developed by the Delft University of Technology.
4. Neither the name of the Delft University of Technology nor the names of
its contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY LAURENS VAN DER MAATEN ''AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
EVENT SHALL LAURENS VAN DER MAATEN BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
Initializes a new instance of the class.
Gets or sets t-SNE's perplexity value. Default is 50.
Gets or sets t-SNE's Theta value. Default is 0.5.
Not supported.
Applies the transformation to an input, producing an associated output.
The input data to which the transformation should be applied.
The location where the result of this transformation should be stored.
The output generated by applying this transformation to the given input.
Boltzmann distribution exploration policy.
The class implements an exploration policy based on the Boltzmann distribution.
According to the policy, action a at state s is selected with the following probability:
p(s, a) = \frac{\exp(Q(s, a) / t)}{\sum_{b} \exp(Q(s, b) / t)}
where Q(s, a) is the estimate (usefulness) of action a at state s and
t is the temperature parameter.
Temperature parameter of Boltzmann distribution. Should be greater than 0.
The property sets the balance between exploration and greedy actions.
If temperature is low, then the policy tends to be more greedy.
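The selection rule can be sketched as follows. This is an illustrative plain-Python version, not the framework's code: the names `boltzmann_probabilities` and `boltzmann_choose` are hypothetical, and subtracting the largest estimate before exponentiating is an added numerical safeguard that leaves the probabilities unchanged.

```python
import math
import random


def boltzmann_probabilities(estimates, temperature):
    """Selection probability of each action under the Boltzmann distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    # Shifting by the maximum avoids overflow in exp(); the ratios stay the same.
    top = max(estimates)
    weights = [math.exp((q - top) / temperature) for q in estimates]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_choose(estimates, temperature, rng=random):
    """Draw an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(estimates, temperature)
    return rng.choices(range(len(estimates)), weights=probs)[0]


estimates = [1.0, 2.0, 3.0]
print(boltzmann_probabilities(estimates, temperature=1.0))   # moderate exploration
print(boltzmann_probabilities(estimates, temperature=0.05))  # close to greedy
```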
Initializes a new instance of the class.
Temperature parameter of Boltzmann distribution.
Choose an action.
Action estimates.
Returns selected action.
The method chooses an action depending on the provided estimates. The
estimates can be any sort of estimate that values the usefulness of the action
(expected total reward, discounted reward, etc).
Epsilon greedy exploration policy.
The class implements the epsilon greedy exploration policy. According to the policy,
the best action is chosen with probability 1-epsilon. Otherwise,
with probability epsilon, any other action, except the best one, is
chosen randomly.
The epsilon value is also known as the exploration rate.
Epsilon value (exploration rate), [0, 1].
The value determines the amount of exploration driven by the policy.
If the value is high, the policy leans toward exploration, choosing a random
action other than the best one. If the value is low, the policy is more
greedy, choosing the best action found so far.
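A minimal sketch of this rule in plain Python is given below. The function name `epsilon_greedy` is hypothetical, and the exploratory choice is assumed to be uniform over the actions other than the best one, as the description states.

```python
import random


def epsilon_greedy(estimates, epsilon, rng=random):
    """Return the best action with probability 1 - epsilon, another action otherwise."""
    best = max(range(len(estimates)), key=lambda a: estimates[a])
    if len(estimates) > 1 and rng.random() < epsilon:
        # Explore: pick uniformly among all actions except the best one.
        others = [a for a in range(len(estimates)) if a != best]
        return rng.choice(others)
    return best


estimates = [0.1, 0.9, 0.3]
rng = random.Random(0)
print([epsilon_greedy(estimates, 0.2, rng) for _ in range(10)])
```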
Initializes a new instance of the class.
Epsilon value (exploration rate).
Choose an action.
Action estimates.
Returns selected action.
The method chooses an action depending on the provided estimates. The
estimates can be any sort of estimate that values the usefulness of the action
(expected total reward, discounted reward, etc).
Exploration policy interface.
The interface describes exploration policies, which are used in Reinforcement
Learning to explore state space.
Choose an action.
Action estimates.
Returns selected action.
The method chooses an action depending on the provided estimates. The
estimates can be any sort of estimate that values the usefulness of the action
(expected total reward, discounted reward, etc).
Roulette wheel exploration policy.
The class implements the roulette wheel exploration policy. According to the policy,
action a at state s is selected with the following probability:
p(s, a) = \frac{Q(s, a)}{\sum_{b} Q(s, b)}
where Q(s, a) is the estimate (usefulness) of action a at state s.
The exploration policy may be applied only when the action estimates (usefulness)
are positive values greater than 0.
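The proportional selection rule can be illustrated with the following plain-Python sketch. The names `roulette_probabilities` and `roulette_wheel` are hypothetical; rejecting non-positive estimates with an exception is an assumption made here to reflect the restriction stated above.

```python
import random


def roulette_probabilities(estimates):
    """Selection probability of each action, proportional to its estimate."""
    if any(q <= 0 for q in estimates):
        raise ValueError("all action estimates must be greater than 0")
    total = sum(estimates)
    return [q / total for q in estimates]


def roulette_wheel(estimates, rng=random):
    """Spin the wheel once and return the index of the selected action."""
    probs = roulette_probabilities(estimates)
    spin = rng.random()
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if spin < cumulative:
            return action
    return len(estimates) - 1  # guards against floating-point rounding


print(roulette_probabilities([1.0, 3.0]))
print(roulette_wheel([1.0, 3.0], random.Random(0)))
```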
Initializes a new instance of the class.
Choose an action.
Action estimates.
Returns selected action.
The method chooses an action depending on the provided estimates. The
estimates can be any sort of estimate that values the usefulness of the action
(expected total reward, discounted reward, etc).
Tabu search exploration policy.
The class implements a simple tabu search exploration policy,
which allows certain actions to be marked as tabu for a specified number of
iterations. The actual exploration and choice among non-tabu actions
is done by the base exploration policy.
Base exploration policy.
The base exploration policy is the policy used
to choose among non-tabu actions.
Initializes a new instance of the class.
Total actions count.
Base exploration policy.
Choose an action.
Action estimates.
Returns selected action.
The method chooses an action depending on the provided estimates. The
estimates can be any sort of estimate that values the usefulness of the action
(expected total reward, discounted reward, etc). The action is chosen from
non-tabu actions only.
Reset tabu list.
Clears the tabu list, making all actions allowed.
Set tabu action.
Action to set tabu for.
Tabu time in iterations.
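The interplay between the tabu list and the base policy can be sketched as below. This is an illustrative plain-Python class, not the framework's implementation: the class and method names are hypothetical, the base policy is modeled as a callable that returns an index into the estimates it receives, and falling back to all actions when every action is tabu is an assumption.

```python
class TabuSearchPolicy:
    """Wraps a base policy and hides actions that are currently tabu."""

    def __init__(self, action_count, base_policy):
        self.base_policy = base_policy
        self.tabu = [0] * action_count  # remaining tabu iterations per action

    def set_tabu_action(self, action, tabu_time):
        self.tabu[action] = tabu_time

    def reset_tabu_list(self):
        self.tabu = [0] * len(self.tabu)

    def choose_action(self, estimates):
        allowed = [a for a, t in enumerate(self.tabu) if t == 0]
        # Each call counts as one iteration, so the tabu timers are aged here.
        self.tabu = [max(0, t - 1) for t in self.tabu]
        if not allowed:
            allowed = list(range(len(estimates)))  # assumption: never block everything
        picked = self.base_policy([estimates[a] for a in allowed])
        return allowed[picked]


def greedy(estimates):
    """Simplest possible base policy: index of the largest estimate."""
    return max(range(len(estimates)), key=lambda a: estimates[a])


policy = TabuSearchPolicy(3, greedy)
policy.set_tabu_action(1, 2)  # forbid action 1 for the next two choices
print([policy.choose_action([0.2, 0.9, 0.5]) for _ in range(3)])
```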
Robust circle estimator with RANSAC.
Gets the RANSAC estimator used.
Gets the final set of inliers detected by RANSAC.
Creates a new RANSAC 2D circle estimator.
Inlier threshold.
Inlier probability.
Produces a robust estimation of the circle
passing through the given (noisy) points.
A set of (possibly noisy) points.
The circle passing through the points.
Produces a robust estimation of the circle
passing through the given (noisy) points.
A set of (possibly noisy) points.
The circle passing through the points.
Produces a robust estimation of the circle
passing through the given (noisy) points.
A set of (possibly noisy) points.
The circle passing through the points.
Produces a robust estimation of the circle
passing through the given (noisy) points.
A set of (possibly noisy) points.
The circle passing through the points.
Robust line estimator with RANSAC.
Gets the RANSAC estimator used.
Gets the final set of inliers detected by RANSAC.
Creates a new RANSAC line estimator.
Inlier threshold.
Inlier probability.
Produces a robust estimation of the line
passing through the given (noisy) points.
A set of (possibly noisy) points.
The line passing through the points.
Produces a robust estimation of the line
passing through the given (noisy) points.
A set of (possibly noisy) points.
The line passing through the points.
Produces a robust estimation of the line
passing through the given (noisy) points.
A set of (possibly noisy) points.
The line passing through the points.
Produces a robust estimation of the line
passing through the given (noisy) points.
A set of (possibly noisy) points.
The line passing through the points.
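For readers unfamiliar with RANSAC, the general idea behind a robust line estimate can be sketched as follows. This is a generic plain-Python illustration, not the framework's estimator: the names `fit_line` and `ransac_line` are hypothetical, a fixed number of iterations is used instead of deriving it from an inlier probability, and the final model is not refitted to the inliers.

```python
import math
import random


def fit_line(p, q):
    """Line a*x + b*y + c = 0 through p and q, with (a, b) normalised to unit length."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    return a / norm, b / norm, -(a * p[0] + b * p[1]) / norm


def ransac_line(points, threshold, iterations=200, seed=0):
    """Keep the two-point line that has the most points within `threshold` of it."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p == q:
            continue  # duplicate points do not define a line
        a, b, c = fit_line(p, q)
        inliers = [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Ten points on y = 2x + 1 plus two gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(3.0, 40.0), (7.0, -20.0)]
line, inliers = ransac_line(points, threshold=0.01)
print(len(inliers))
```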
Robust plane estimator with RANSAC.
Gets the RANSAC estimator used.
Gets the final set of inliers detected by RANSAC.
Creates a new RANSAC 3D plane estimator.
Inlier threshold.
Inlier probability.
Produces a robust estimation of the plane
passing through the given (noisy) points.
A set of (possibly noisy) points.
The plane passing through the points.
K-Nearest Neighbor (k-NN) algorithm.
The type of the input data.
The k-nearest neighbor algorithm (k-NN) is a method for classifying objects
based on closest training examples in the feature space. It is amongst the simplest
of all machine learning algorithms: an object is classified by a majority vote of
its neighbors, with the object being assigned to the class most common amongst its
k nearest neighbors (k is a positive integer, typically small).
If k = 1, then the object is simply assigned to the class of its nearest neighbor.
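The majority-vote rule can be sketched in a few lines of plain Python. This is an illustration only, not the framework's classifier: the function name `knn_decide` is hypothetical, the Euclidean distance is assumed as the default, and ties are resolved in favor of the class seen first among the nearest neighbors.

```python
import math
from collections import Counter


def knn_decide(k, inputs, outputs, query, distance=math.dist):
    """Label `query` with the most common class among its k nearest training points."""
    order = sorted(range(len(inputs)), key=lambda i: distance(inputs[i], query))
    votes = Counter(outputs[i] for i in order[:k])
    return votes.most_common(1)[0][0]


inputs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
outputs = [0, 0, 0, 1, 1, 1]
print(knn_decide(3, inputs, outputs, (0.5, 0.5)))
print(knn_decide(3, inputs, outputs, (5.5, 5.5)))
```

Passing a different callable as `distance` changes the metric without touching the voting logic.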
References:
- Wikipedia contributors. "K-nearest neighbor algorithm." Wikipedia, The
  Free Encyclopedia. Wikipedia, The Free Encyclopedia, 10 Oct. 2012. Web.
  9 Nov. 2012. http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm
The first example shows how to create and use a k-Nearest Neighbor algorithm to classify
a set of numeric vectors in a multi-class decision problem involving 3 classes. It also shows
how to compute class decisions for a new sample and how to measure the performance of a classifier.
The second example shows how to use a different distance metric when computing k-NN:
The k-Nearest neighbor algorithm implementation in the framework can also be used with any instance
data type. For such cases, the framework offers a generic version of the classifier. The third example
shows how to use the generic kNN classifier to perform the direct classification of actual text samples:
Gets or sets the parallelization options for this algorithm.
Creates a new .
Creates a new .
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the result will be stored,
avoiding unnecessary memory allocations.
System.Double[].
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the scores will be stored,
avoiding unnecessary memory allocations.
System.Double[][].
Gets the top points
that are the closest to a given reference point.
The query point whose neighbors will be found.
The label for each neighboring point.
An array containing the top points that are
at the closest possible distance to .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Creates a new .
The number of nearest neighbors to be used in the decision.
The input data points.
The associated labels for the input points.
The distance measure to use in the decision.
Creates a new .
The number of nearest neighbors to be used in the decision.
The number of classes in the classification problem.
The input data points.
The associated labels for the input points.
The distance measure to use in the decision.
Gets the number of class labels
handled by this classifier.
Computes the most likely label of a new given point.
A point to be classified.
The most likely label for the given point.
Computes the most likely label of a new given point.
A point to be classified.
A value between 0 and 1 giving
the strength of the classification in relation to the
other classes.
The most likely label for the given point.
Computes the most likely label of a new given point.
A point to be classified.
The distance score for each possible class.
The most likely label for the given point.
Pair of class labels.
The structure is the equivalent of a
where the tuple elements are called and instead of
and . It is
mainly used to index or provide access to individual binary models within a
(i.e. through and
) and in the definition of the
structure.
Gets the first class in the pair.
Gets the second class in the pair.
Initializes a new instance of the struct.
The first class index in the pair.
The second class index in the pair.
Converts to a tuple (class_a, class_b).
Returns a that represents this instance.
A that represents this instance.
Indicates whether the current object is equal to another object of the same type.
An object to compare with this object.
true if the current object is equal to the parameter; otherwise, false.
Determines whether the specified is equal to this instance.
The object to compare with the current instance.
true if the specified is equal to this instance; otherwise, false.
Returns a hash code for this instance.
A hash code for this instance, suitable for use in hashing algorithms and data structures like a hash table.
Decision between two class labels. Indicates the class index of the first
class, the class index of the adversary, and the class index of the winner.
The structure is used to represent the outcome of a binary classifier for the
problem of deciding between two classes. For example, suppose that, given
the problem of deciding between class #4 and class #2, a binary classifier has decided that
class #2 is more likely than class #4. This could be represented
by instantiating the structure using Decision(i: 4, j: 2, winner: 2).
The structure is more likely to be used or found when dealing with strategies
for creating multi-class and/or multi-label classifiers using a set of binary classifiers, such as when using
and . In the example below, we
will extract the sequence of binary classification problems and their respective decisions when evaluating a
multi-class SVM using the one-vs-one decision strategy for multi-class problems:
Gets the adversarial classes.
Gets the class label of the winner.
Initializes a new instance of the struct.
The first class index.
The second class index.
The class index that won.
Converts to a triplet (class_a, class_b, winner).
Returns a that represents this instance.
A that represents this instance.
Parameters for learning a binary decision model. An object of this class is passed by
or
to instruct how binary learning algorithms should create their binary classifiers.
The type of the binary model to be learned.
The input type for the binary classifiers.
Gets or sets the binary model to be learned.
Gets or sets the input data that should be used to train the classifier.
Gets or sets the output data that should be used to train the classifier.
Gets or sets the class pair that the classifier will be designated
to learn. For classifiers, the first element
in the pair designates the class to be learned against all others.
Initializes a new instance of the class.
The binary model to be learned.
The inputs to be used.
The outputs to be used.
The class labels for the problem to be learned.
Base learning algorithm for multi-class classifiers.
The type for the inner binary classifiers used in the one-vs-rest approach.
The type of the model being learned.
Base learning algorithm for multi-class classifiers.
The type for the samples handled by the classifier. Default is double[].
The type for the inner binary classifiers used in the one-vs-rest approach.
The type of the model being learned.
Gets or sets the model being learned.
Gets or sets a function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-rest classification strategy.
Gets or sets a value indicating whether the entire training algorithm should stop
in case an exception has been detected at just one of the inner binary learning
problems. Default is true (execution will not be stopped).
Gets or sets a value indicating whether the learning algorithm should generate multi-label
(as opposed to multi-class) models. If left unspecified, the type of the model will be determined
automatically depending on which overload of the
method will be called first by the executing code.
Occurs when the learning of a subproblem has started.
Occurs when the learning of a subproblem has finished.
Initializes a new instance of the class.
Sets a callback function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-rest classification strategy. Calling this method
sets the property.
Sets a callback function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-rest classification strategy. Calling this method
sets the property.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Raises the event.
The instance containing the event data.
Raises the event.
The instance containing the event data.
Base learning algorithm for multi-class classifiers.
The type for the inner binary classifiers used in the one-vs-one approach.
The type of the model being learned.
Base learning algorithm for multi-class classifiers.
The type for the samples handled by the classifier. Default is double[].
The type for the inner binary classifiers used in the one-vs-one approach.
The type of the model being learned.
Gets or sets the model being learned.
Gets or sets a function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-one classification strategy.
Gets or sets a value indicating whether the entire training algorithm should stop
in case an exception has been detected at just one of the inner binary learning
problems. Default is true (execution will not be stopped).
Occurs when the learning of a subproblem has started.
Occurs when the learning of a subproblem has finished.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Sets a callback function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-one classification strategy. Calling this method
sets the property.
Sets a callback function that takes a set of parameters and creates
a learning algorithm for learning each of the binary inner classifiers
needed by the one-vs-one classification strategy. Calling this method
sets the property.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Raises the event.
The instance containing the event data.
Raises the event.
The instance containing the event data.
Subproblem progress event argument.
One of the classes belonging to the subproblem.
One of the classes belonging to the subproblem.
Gets the progress of the overall problem,
ranging from zero up to .
Gets the maximum value for the current .
Initializes a new instance of the class.
One of the classes in the subproblem.
The other class in the subproblem.
Base class for multi-class classifiers built from binary
classifiers using the "one-vs-rest" construction.
The type for the inner binary classifiers.
The input type handled by the classifiers. Default is double.
Initializes a new instance of the class.
The number of classes in the multi-label classification problem.
A function to create the inner binary classifiers.
Gets the binary classifier for a particular class index.
Index of the class.
A that has been trained
to distinguish between the chosen class and all other classes.
Gets or sets the inner binary classifiers used to distinguish
between each class and all other classes.
The classifier index.
A that has been trained
to distinguish between the chosen class and all other classes.
Gets or sets the binary classifiers that have been trained
to distinguish between each class and all other classes.
Gets the total number of binary models in this one-vs-rest
multi-label configuration. Should be equal to the
(number of classes).
Computes a numerical score measuring the association between
the given vector and a given
.
The input vector.
The index of the class whose score will be computed.
The class label associated with the input
vector, as predicted by the classifier.
Computes whether a class label applies to an input vector.
The input vectors that should be classified as
any of the possible classes.
The class label index to be tested.
A boolean value indicating whether the given
class label applies to the vector.
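The one-vs-rest construction can be illustrated with the plain-Python sketch below. It is not the framework's API: each binary model is represented by a hypothetical scoring function whose positive output means "this class applies", and the function names `one_vs_rest_decide` and `one_vs_rest_classify` are assumptions made for the example.

```python
def one_vs_rest_decide(models, x):
    """Multi-label decision: one boolean per class, each from its own binary model."""
    return [model(x) > 0 for model in models]


def one_vs_rest_classify(models, x):
    """Multi-class decision: the class whose binary model gives the highest score."""
    scores = [model(x) for model in models]
    return scores.index(max(scores))


# Three hypothetical binary scorers, one per class.
models = [
    lambda x: x[0] - 1.0,
    lambda x: x[1] - 1.0,
    lambda x: -x[0] - x[1],
]

print(one_vs_rest_decide(models, (2.0, 3.0)))
print(one_vs_rest_classify(models, (2.0, 3.0)))
```

There is exactly one binary model per class, which is why the number of models equals the number of classes in this configuration.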
Computes a log-likelihood measuring the association between
the given vector and a given
.
The input vector.
The index of the class whose score will be computed.
The class label associated with the input
vector, as predicted by the classifier.
Returns an enumerator that iterates through the collection.
A that can be used to iterate through the collection.
Returns an enumerator that iterates through the collection.
A that can be used to iterate through the collection.
Base class for multi-class classifiers built from binary
classifiers using the "one-vs-rest" construction.
The type for the inner binary classifiers.
Initializes a new instance of the class.
The number of classes in the multi-label classification problem.
A function to create the inner binary classifiers.
Decision strategies for
Multi-class Support Vector Machines.
Max-voting method (also known as 1-vs-1 decision).
Elimination method (also known as DAG decision).
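The difference between the two strategies can be sketched as follows. This is a generic plain-Python illustration rather than the framework's code: `duel(i, j)` is a hypothetical callable standing in for the binary machine trained on classes i and j, and the order in which the elimination method pairs candidates (first against last) is an assumption.

```python
from itertools import combinations


def decide_by_voting(class_count, duel):
    """Evaluate every pair of classes; the class with the most wins is returned."""
    votes = [0] * class_count
    path = []
    for i, j in combinations(range(class_count), 2):
        winner = duel(i, j)
        votes[winner] += 1
        path.append((i, j, winner))  # (first class, adversary, winner)
    return votes.index(max(votes)), path


def decide_by_elimination(class_count, duel):
    """Discard the loser of each comparison until a single class remains."""
    candidates = list(range(class_count))
    path = []
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        winner = duel(i, j)
        path.append((i, j, winner))
        candidates.remove(j if winner == i else i)
    return candidates[0], path


# Hypothetical per-class scores; the higher score wins each binary problem.
scores = [0.1, 0.7, 0.3, 0.5]
duel = lambda i, j: i if scores[i] > scores[j] else j

print(decide_by_voting(4, duel))
print(decide_by_elimination(4, duel))
```

With n classes, voting evaluates n(n-1)/2 binary problems while elimination evaluates n-1, which is why the elimination method is the faster of the two.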
Contains classes related to Support Vector Machines (SVMs).
Contains linear machines,
kernel machines, multi-class machines, SVM-DAGs
(Directed Acyclic Graphs), multi-label classification
and also offers support for the probabilistic output calibration
of SVM outputs.
This namespace contains both standard s and the
kernel extension given by s. For multiple
classes or categories, the framework offers s
and s. Multi-class machines can be used for
cases where a single class should be picked from a list of several class labels, and
the multi-label machine for cases where multiple class labels might be detected for a
single input vector. The multi-class machines also support two types of classification:
the faster decision based on Decision Directed Acyclic Graphs, and the more traditional
based on a Voting scheme.
Learning can be achieved using the standard
(SMO) algorithm. However, the framework can also learn Least Squares SVMs (LS-SVMs) using , and even calibrate SVMs to produce probabilistic outputs
using . A
wide variety of kernel functions is available in the statistics namespace, and
new kernels can be created easily using the interface.
The namespace class diagram is shown below.
Please note that class diagrams for each of the inner namespaces are
also available within their own documentation pages.
Common interface for binary support vector machines.
The type of the input data handled by the machine.
Gets or sets the collection of weights used by this machine.
Gets or sets the collection of support vectors used by this machine.
Gets or sets the threshold (bias) term for this machine.
Gets or sets the kernel used by this machine.
Gets whether this machine has been calibrated to
produce probabilistic outputs (through the Probability
method).
If this machine has a linear kernel, compresses all
support vectors into a single parameter vector.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Base class for calibration algorithms.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Gets or sets the input vectors for training.
Gets or sets the output labels for each training vector.
Initializes a new instance of the class.
The machine to be calibrated.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
The machine to be calibrated.
Initializes a new instance of the class.
Gets whether the machine being learned is linear.
Gets the machine's function.
Gets the machine to be taught.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Runs the learning algorithm.
Obsolete.
Obsolete.
Base class for learning algorithms.
Initializes a new instance of the class.
Obsolete.
Gets or sets the input vectors for training.
Gets or sets the output labels for each training vector.
Complexity (cost) parameter C. Increasing the value of C forces the creation
of a more accurate model that may not generalize well. If this value is not
set and the complexity heuristic is enabled, the framework
will automatically guess a value for C. If this value is manually set to
something else, the heuristic will be automatically
disabled and the given value will be used instead.
The cost parameter C controls the trade-off between allowing training
errors and forcing rigid margins. It creates a soft margin that permits
some misclassifications. Increasing the value of C increases the cost of
misclassifying points and forces the creation of a model that fits the
training data more closely but may not generalize well.
If this value is not set and the complexity heuristic is enabled,
the framework will automatically guess a suitable value for C. If this value
is manually set to something else, the class will respect the new value
and automatically disable the heuristic.
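The trade-off controlled by C can be made concrete with the textbook soft-margin objective, 0.5·||w||² + C·Σ max(0, 1 − y·(w·x + b)). The sketch below is a Python illustration under that assumption; it is not taken from the framework, and `soft_margin_objective` is a hypothetical name.

```python
def soft_margin_objective(w, b, inputs, labels, c):
    """Textbook soft-margin SVM objective: margin term plus C times the hinge losses."""
    regularizer = 0.5 * sum(wk * wk for wk in w)
    slack = 0.0
    for x, y in zip(inputs, labels):
        margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
        slack += max(0.0, 1.0 - margin)        # zero when the point is outside the margin
    return regularizer + c * slack

# Two candidate models for the same three one-dimensional points.
inputs = [[2.0], [0.5], [-2.0]]
labels = [1, 1, -1]
wide_margin = [0.5]     # small weights, but one point violates the margin
tight_fit = [2.0]       # larger weights, no margin violations
```

With a small C the wide-margin model has the lower objective; with a large C the margin violation dominates and the tight fit wins, which is the behavior the description above refers to.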
Gets or sets the positive class weight. This should be a
value higher than 0 indicating how much of the
parameter C should be applied to instances carrying the positive label.
Gets or sets the negative class weight. This should be a
value higher than 0 indicating how much of the
parameter C should be applied to instances carrying the negative label.
Gets or sets the weight ratio between positive and negative class
weights. This ratio controls how much of the
parameter C should be applied to the positive class.
A weight ratio less than one, such as 1/10 (0.1), means that 10% of C will
be applied to the positive class, while 100% of C will be applied to the
negative class.
A weight ratio greater than one, such as 10/1 (10), means that 100% of C will
be applied to the positive class, while 10% of C will be applied to the
negative class.
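The rule described above can be written as a small helper. This is a Python sketch of the stated rule only, not the framework's code; `class_costs` is a hypothetical name.

```python
def class_costs(c, weight_ratio):
    """Split C into (positive cost, negative cost) following the weight-ratio rule."""
    if weight_ratio <= 0:
        raise ValueError("weight_ratio must be greater than zero")
    if weight_ratio < 1:
        # The positive class gets a fraction of C, the negative class gets all of it.
        return c * weight_ratio, c
    # The positive class gets all of C, the negative class gets a fraction of it.
    return c, c / weight_ratio
```

A ratio of exactly one leaves both classes with the full value of C.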
Gets or sets a value indicating whether the Complexity parameter C
should be computed automatically by employing a heuristic rule.
Default is true.
true if complexity should be computed automatically; otherwise, false.
Gets or sets whether initial values for some kernel parameters
should be estimated from the data, if possible. Default is true.
Gets or sets a value indicating whether the weight ratio to be used between
the cost values for negative and positive instances should
be computed automatically from the data proportions. Default is false.
true if the weighting coefficient should be computed
automatically from the data; otherwise, false.
Gets or sets the kernel function used to create a
kernel Support Vector Machine. If this property
is set, automatic kernel parameter estimation will be
disabled.
Gets or sets the cost values associated with each input vector.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Runs the main body of the learning algorithm.
Computes the error rate for a given set of input and outputs.
Obsolete.
Obsolete.
Base class for regression learning algorithms.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Gets or sets the cost values associated with each input vector.
Initializes a new instance of the class.
Obsolete.
Complexity (cost) parameter C. Increasing the value of C forces the creation
of a more accurate model that may not generalize well. If this value is not
set and the complexity heuristic is enabled, the framework
will automatically guess a value for C. If this value is manually set to
something else, the heuristic will be automatically
disabled and the given value will be used instead.
The cost parameter C controls the trade-off between allowing training
errors and forcing rigid margins. It creates a soft margin that permits
some misclassifications. Increasing the value of C increases the cost of
misclassifying points and forces the creation of a model that fits the
training data more closely but may not generalize well.
If this value is not set and the complexity heuristic is enabled,
the framework will automatically guess a suitable value for C. If this value
is manually set to something else, the class will respect the new value
and automatically disable the heuristic.
Insensitivity zone ε. Increasing the value of ε can result in fewer
support vectors in the created model. Default value is 1e-3.
Parameter ε controls the width of the ε-insensitive zone, used to fit the training
data. The value of ε can affect the number of support vectors used to construct the
regression function. The larger ε is, the fewer support vectors are selected. On the
other hand, larger values of ε result in flatter estimates.
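The effect of ε can be illustrated with the standard ε-insensitive loss, which ignores residuals smaller than ε. The Python sketch below is illustrative and independent of the framework; the function names are hypothetical, and counting points outside the tube is used here only as a rough proxy for the number of support vectors.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Zero inside the tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(residual) - epsilon)

def count_outside_tube(residuals, epsilon):
    """Points outside the tube are the ones that contribute to the loss."""
    return sum(1 for r in residuals if abs(r) > epsilon)

# Hypothetical residuals (prediction minus target) of some regression fit.
residuals = [0.05, -0.2, 0.5, -0.01]
```

Widening the tube leaves fewer points outside it, which matches the statement that a larger ε tends to produce fewer support vectors.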
Gets or sets the individual weight of each sample in the training set. If set
to null, all samples will be assumed equal weight. Default is null.
Gets or sets a value indicating whether the Complexity parameter C
should be computed automatically by employing a heuristic rule.
Default is false.
true if complexity should be computed automatically; otherwise, false.
Gets or sets whether initial values for some kernel parameters
should be estimated from the data, if possible. Default is true.
Gets whether the machine to be learned
has a kernel.
Gets or sets the kernel function used to create a
kernel Support Vector Machine. If this property
is set, automatic kernel parameter estimation will be
disabled.
Gets or sets the input vectors for training.
Gets or sets the output values for each calibration vector.
Gets the machine to be taught.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Runs the learning algorithm.
Obsolete.
Obsolete.
Common interface for Support Vector Machine learning algorithms.
Common interface for Support Vector Machine learning algorithms.
Common interface for Support Vector Machine learning algorithms.
Obsolete.
Obsolete.
Common interface for Support Vector Machine learning algorithms.
Gets or sets the support vector machine being learned.
Obsolete.
Obsolete.
Least Squares SVM (LS-SVM) learning algorithm.
References:
- Suykens, J. A. K., et al. "Least squares support vector machine classifiers: a large scale
  algorithm." European Conference on Circuit Theory and Design, ECCTD. Vol. 99. 1999. Available at:
  http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.6438
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Least Squares SVM (LS-SVM) learning algorithm.
References:
- Suykens, J. A. K., et al. "Least squares support vector machine classifiers: a large scale
  algorithm." European Conference on Circuit Theory and Design, ECCTD. Vol. 99. 1999. Available at:
  http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.6438
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for Least Squares SVM (LS-SVM) learning algorithm.
Constructs a new Least Squares SVM (LS-SVM) learning algorithm.
Convergence tolerance. Default value is 1e-6.
The criterion for completing the model training process. The default is 1e-6.
Gets or sets the size of the cache used to partially
store the kernel matrix. Default is the
number of input vectors.
Runs the main body of the learning algorithm.
Obsolete.
Averaged Stochastic Gradient Descent (ASGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Averaged Stochastic Gradient Descent (ASGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
BaseAveragedStochasticGradientDescent<SupportVectorMachine<TKernel>, TKernel, System.Double[]>.
Averaged Stochastic Gradient Descent (ASGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
BaseAveragedStochasticGradientDescent<SupportVectorMachine<TKernel, TInput>, TKernel, TInput>.
Averaged Stochastic Gradient Descent (ASGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
BaseAveragedStochasticGradientDescent<SupportVectorMachine<TKernel, TInput>, TKernel, TInput>.
Base class for Averaged Stochastic Gradient Descent algorithm implementations.
The kernel and loss function types
are passed as generic parameters (constrained to be structs) because this is the only
way to force the compiler to emit separate native code for this class, whose
performance-critical sections can then be inlined.
The type of the model being learned.
The type of the kernel function to use.
The type of the input to consider.
The type of the loss function to use.
Gets or sets the kernel function used to create a
kernel Support Vector Machine.
Gets or sets the loss function to be used.
Default is to use the .
Gets or sets the learning rate for the SGD algorithm.
Gets or sets the number of iterations that should be
performed by the algorithm during learning.
Default is 0 (iterate until convergence).
Please use MaxIterations instead.
Gets or sets the current epoch counter.
Gets or sets the parallelization options for this algorithm.
Gets or sets a cancellation token that can be used
to cancel the algorithm while it is running.
Gets or sets the maximum relative change in the watched value
after an iteration of the algorithm used to detect convergence.
Default is 1e-3. If set to 0, the loss will not be computed
during learning and execution will be faster.
Gets or sets the lambda regularization term. Default is 0.5.
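How the learning rate, the lambda regularization term and the averaging step interact can be sketched as below. This is a simplified Python illustration of averaged SGD with the hinge loss, not the framework's implementation: the bias term is omitted, the decaying step size `eta0 / (1 + lam * eta0 * t)` is an assumption, and `train_asgd` and `predict` are hypothetical names.

```python
import random

def train_asgd(inputs, labels, lam=0.5, eta0=1.0, epochs=50, seed=0):
    """Averaged SGD for a linear SVM (hinge loss, no bias); returns the averaged weights."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    w = [0.0] * dim          # current SGD iterate
    avg_w = [0.0] * dim      # running average of all iterates
    order = list(range(len(inputs)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = inputs[i], labels[i]
            eta = eta0 / (1.0 + lam * eta0 * t)             # assumed decay schedule
            margin = y * sum(wk * xk for wk, xk in zip(w, x))
            w = [wk * (1.0 - eta * lam) for wk in w]        # shrink: gradient of the L2 term
            if margin < 1.0:                                # hinge loss is active
                w = [wk + eta * y * xk for wk, xk in zip(w, x)]
            t += 1
            avg_w = [ak + (wk - ak) / t for ak, wk in zip(avg_w, w)]
    return avg_w

def predict(w, x):
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) >= 0 else -1

# Small linearly separable toy problem.
inputs = [[2.0, 2.0], [3.0, 1.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -1.0], [-1.0, -3.0]]
labels = [1, 1, 1, -1, -1, -1]
model = train_asgd(inputs, labels)
```

Returning the average of the iterates rather than the last one is what distinguishes the averaged variant from plain SGD; it smooths out the noise of the individual updates.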
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Renormalize the weights.
Compute the norm of the weights.
Compute the norm of the averaged weights.
Perform one iteration of the SGD algorithm with specified gains
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Support vector regression using the Fan-Chen-Lin (LibSVM) algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Support vector regression using the Fan-Chen-Lin (LibSVM) algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Support vector regression using the Fan-Chen-Lin (LibSVM) algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for Fan-Chen-Lin (LibSVM) regression algorithms.
Constructs a new support vector regression learning algorithm.
A support vector machine.
Constructs a new support vector regression learning algorithm.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Convergence tolerance. Default value is 1e-2.
The criterion for completing the model training process. The default is 0.01.
Gets or sets a value indicating whether to use
shrinking heuristics during learning. Default is true.
true to use shrinking; otherwise, false.
Runs the learning algorithm.
Stochastic Gradient Descent (SGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Stochastic Gradient Descent (SGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Stochastic Gradient Descent (SGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Stochastic Gradient Descent (SGD) for training linear support vector machines.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Inheritors should implement this function to produce a new instance
with the same characteristics of the current object.
Base class for Stochastic Gradient Descent algorithm implementations.
The type of the model being learned.
The type of the kernel function to use.
The type of the input to consider.
The type of the loss function to use.
The kernel and loss function types
are passed as generic parameters (constrained to be structs) because this is the only
way to force the compiler to emit separate native code for this class, whose
performance-critical sections can then be inlined.
Gets or sets the kernel function used to create a
kernel Support Vector Machine.
Gets or sets the loss function to be used.
Default is to use the .
Gets or sets the learning rate for the SGD algorithm.
Please use MaxIterations instead.
Gets or sets the number of iterations that should be
performed by the algorithm during learning.
Default is 0 (iterate until convergence).
Gets or sets the maximum relative change in the watched value
after an iteration of the algorithm used to detect convergence.
Default is 1e-5.
Gets or sets the lambda regularization term. Default is 0.5.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Renormalize the weights.
Compute the norm of the weights.
Perform one iteration of the SGD algorithm with specified gains.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce the desired outputs given the inputs.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Inheritors should implement this function to produce a new
instance with the same characteristics of the current object.
L1-regularized L2-loss
Support Vector Machine learning (-s 5).
This class implements a learning algorithm
specifically crafted for linear machines only. It provides an L1-regularized,
L2-loss coordinate descent learning algorithm for optimizing the primal form of
learning. The code is based on liblinear's solve_l1r_l2_svc
method, whose original description is provided below.
Liblinear's solver -s 5: L1R_L2LOSS_svc. A coordinate descent
algorithm for L2-loss SVM problems in the primal.
min_w \sum |wj| + C \sum max(0, 1-yi w^T xi)^2,
Given: x, y, Cp, Cn and eps as the stopping tolerance
See Yuan et al. (2010) and appendix of LIBLINEAR paper, Fan et al. (2008)
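The objective stated above can be evaluated directly, which is handy for checking what the L1 penalty and the squared hinge (L2) loss each contribute. The following is a Python sketch of the formula only, using a single cost C for both classes; it does not implement the coordinate descent solver, and `l1r_l2loss_objective` is a hypothetical name.

```python
def l1r_l2loss_objective(w, inputs, labels, c):
    """sum_j |w_j| + C * sum_i max(0, 1 - y_i * w^T x_i)^2"""
    l1_penalty = sum(abs(wj) for wj in w)
    squared_hinge = 0.0
    for x, y in zip(inputs, labels):
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        squared_hinge += max(0.0, 1.0 - margin) ** 2
    return l1_penalty + c * squared_hinge
```

The L1 penalty is what drives individual weights to exactly zero, so solutions of this problem tend to be sparse in the features.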
The following example shows how to obtain a
from a linear . It contains exactly the same data
used in the documentation page for
.
Obsolete.
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L1-regularized L2-loss
Support Vector Machine learning (-s 5).
This class implements a learning algorithm
specifically crafted for linear machines only. It provides an L1-regularized,
L2-loss coordinate descent learning algorithm for optimizing the primal form of
learning. The code is based on liblinear's solve_l1r_l2_svc
method, whose original description is provided below.
Liblinear's solver -s 5: L1R_L2LOSS_svc. A coordinate descent
algorithm for L2-loss SVM problems in the primal.
min_w \sum |wj| + C \sum max(0, 1-yi w^T xi)^2,
Given: x, y, Cp, Cn and eps as the stopping tolerance
See Yuan et al. (2010) and appendix of LIBLINEAR paper, Fan et al. (2008)
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for linear coordinate descent learning algorithm.
Constructs a new coordinate descent algorithm for L1-loss and L2-loss SVM dual problems.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Convergence tolerance. Default value is 0.1.
The criterion for completing the model training process. The default is 0.1.
Runs the learning algorithm.
Obsolete.
Obsolete.
Different categories of loss functions that can be used to learn
support vector machines.
Hinge-loss function.
Squared hinge-loss function.
L2-regularized, L1 or L2-loss dual formulation
Support Vector Machine learning (-s 1 and -s 3).
This class implements a learning algorithm
specifically crafted for linear machines only. It provides an L2-regularized, L1
or L2-loss coordinate descent learning algorithm for optimizing the dual form of
learning. The code is based on liblinear's solve_l2r_l1l2_svc
method, whose original description is provided below.
Liblinear's solver -s 1: L2R_L2LOSS_SVC_DUAL and -s 3:
L2R_L1LOSS_SVC_DUAL. A coordinate descent algorithm for L1-loss and
L2-loss SVM problems in the dual.
min_\alpha 0.5(\alpha^T (Q + D)\alpha) - e^T \alpha,
s.t. 0 <= \alpha_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
D is a diagonal matrix
In L1-SVM case:
upper_bound_i = Cp if y_i = 1
upper_bound_i = Cn if y_i = -1
D_ii = 0
In L2-SVM case:
upper_bound_i = INF
D_ii = 1/(2*Cp) if y_i = 1
D_ii = 1/(2*Cn) if y_i = -1
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Algorithm 3 of Hsieh et al., ICML 2008.
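The dual problem above can be solved one coordinate at a time. The sketch below is a simplified Python rendering of that idea, following the problem statement given here (Q, D, the upper bounds, Cp, Cn and eps); it omits the bias term and liblinear's shrinking heuristic, assumes no input vector is all zeros, and is not the framework's code. The name `dual_coordinate_descent` is hypothetical.

```python
import random

def dual_coordinate_descent(X, y, cp=1.0, cn=1.0, loss="l1", eps=0.1,
                            max_iter=1000, seed=0):
    """Coordinate descent on the L1-loss or L2-loss linear SVM dual (no bias, no shrinking)."""
    n, dim = len(X), len(X[0])
    if loss == "l1":
        upper = [cp if yi > 0 else cn for yi in y]
        diag = [0.0] * n
    else:  # "l2"
        upper = [float("inf")] * n
        diag = [0.5 / cp if yi > 0 else 0.5 / cn for yi in y]
    qd = [sum(v * v for v in X[i]) + diag[i] for i in range(n)]  # Q_ii + D_ii
    alpha = [0.0] * n
    w = [0.0] * dim                      # w = sum_i alpha_i * y_i * x_i
    rng = random.Random(seed)
    order = list(range(n))
    for _ in range(max_iter):
        rng.shuffle(order)
        pg_max, pg_min = float("-inf"), float("inf")
        for i in order:
            g = y[i] * sum(wk * xk for wk, xk in zip(w, X[i])) - 1.0 + diag[i] * alpha[i]
            # Projected gradient: ignore directions that would leave the box.
            if alpha[i] == 0.0:
                pg = min(g, 0.0)
            elif alpha[i] == upper[i]:
                pg = max(g, 0.0)
            else:
                pg = g
            pg_max, pg_min = max(pg_max, pg), min(pg_min, pg)
            if abs(pg) > 1e-12:
                old = alpha[i]
                alpha[i] = min(max(old - g / qd[i], 0.0), upper[i])
                step = (alpha[i] - old) * y[i]
                w = [wk + step * xk for wk, xk in zip(w, X[i])]
        if pg_max - pg_min <= eps:       # stopping tolerance
            break
    return w, alpha
```

Keeping w up to date after every coordinate step is what makes each update cost only one pass over a single input vector instead of a full kernel row.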
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using the Linear Dual Coordinate Descent algorithm.
The following example shows how to obtain a
from a linear . It contains exactly the same data
used in the documentation page for
.
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized, L1 or L2-loss dual formulation
Support Vector Machine learning (-s 1 and -s 3).
This class implements a learning algorithm
specifically crafted for linear machines only. It provides an L2-regularized, L1
or L2-loss coordinate descent learning algorithm for optimizing the dual form of
learning. The code is based on liblinear's solve_l2r_l1l2_svc
method, whose original description is provided below.
Liblinear's solver -s 1: L2R_L2LOSS_SVC_DUAL and -s 3:
L2R_L1LOSS_SVC_DUAL. A coordinate descent algorithm for L1-loss and
L2-loss SVM problems in the dual.
min_\alpha 0.5(\alpha^T (Q + D)\alpha) - e^T \alpha,
s.t. 0 <= \alpha_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
D is a diagonal matrix
In L1-SVM case:
upper_bound_i = Cp if y_i = 1
upper_bound_i = Cn if y_i = -1
D_ii = 0
In L2-SVM case:
upper_bound_i = INF
D_ii = 1/(2*Cp) if y_i = 1
D_ii = 1/(2*Cn) if y_i = -1
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Algorithm 3 of Hsieh et al., ICML 2008.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using the Linear Dual Coordinate Descent algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized, L1 or L2-loss dual formulation
Support Vector Machine learning (-s 1 and -s 3).
This class implements a learning algorithm
specifically crafted for linear machines only. It provides an L2-regularized, L1
or L2-loss coordinate descent learning algorithm for optimizing the dual form of
learning. The code is based on liblinear's solve_l2r_l1l2_svc
method, whose original description is provided below.
Liblinear's solver -s 1: L2R_L2LOSS_SVC_DUAL and -s 3:
L2R_L1LOSS_SVC_DUAL. A coordinate descent algorithm for L1-loss and
L2-loss SVM problems in the dual.
min_\alpha 0.5(\alpha^T (Q + D)\alpha) - e^T \alpha,
s.t. 0 <= \alpha_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
D is a diagonal matrix
In L1-SVM case:
upper_bound_i = Cp if y_i = 1
upper_bound_i = Cn if y_i = -1
D_ii = 0
In L2-SVM case:
upper_bound_i = INF
D_ii = 1/(2*Cp) if y_i = 1
D_ii = 1/(2*Cn) if y_i = -1
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Algorithm 3 of Hsieh et al., ICML 2008.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using the Linear Dual Coordinate Descent algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for Linear Dual Coordinate Descent.
Constructs a new coordinate descent algorithm for L1-loss and L2-loss SVM dual problems.
Gets or sets the cost function that
should be optimized. Default is
.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Convergence tolerance. Default value is 0.1.
The criterion for completing the model training process. The default is 0.1.
Runs the learning algorithm.
Obsolete.
L2-regularized L2-loss linear support vector classification (primal).
This class implements an L2-regularized L2-loss support vector machine
learning algorithm that operates in the primal form of the optimization
problem. This method has been based on liblinear's l2r_l2_svc_fun
problem specification, optimized using a
Trust-region Newton method. This method might be faster than the often
preferred .
Liblinear's solver -s 2: L2R_L2LOSS_SVC. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss linear support vector
classification.
The following example shows how to obtain a
from a linear . It contains exactly the same data
used in the documentation page for
.
Obsolete.
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized L2-loss linear support vector classification (primal).
This class implements an L2-regularized L2-loss support vector machine
learning algorithm that operates in the primal form of the optimization
problem. This method has been based on liblinear's l2r_l2_svc_fun
problem specification, optimized using a
Trust-region Newton method. This method might be faster than the often
preferred .
Liblinear's solver -s 2: L2R_L2LOSS_SVC. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss linear support vector
classification.
The following example shows how to obtain a
from a linear . It contains exactly the same data
used in the documentation page for
.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for L2-regularized L2-loss linear support vector classification (primal).
Initializes a new instance of the class.
Obsolete.
Obsolete.
L2-regularized L2-loss linear support vector classification (primal).
This class implements an L2-regularized L2-loss support vector machine
learning algorithm that operates in the primal form of the optimization
problem. This method has been based on liblinear's l2r_l2_svc_fun
problem specification, optimized using a
Trust-region Newton method. This method might be faster than the often
preferred .
Liblinear's solver -s 2: L2R_L2LOSS_SVC. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss linear support vector
classification.
Constructs a new Newton method algorithm for L2-regularized
Support Vector Classification problems in the primal form (-s 2).
Convergence tolerance. Default value is 0.1.
The criterion for completing the model training process. The default is 0.1.
Gets or sets the maximum number of iterations that should
be performed until the algorithm stops. Default is 1000.
Runs the learning algorithm.
Obsolete.
Obsolete.
Obsolete.
One-against-one Multi-class Support Vector Machine Learning Algorithm
This class can be used to train Kernel Support Vector Machines with
any algorithm using a one-against-one strategy. The underlying
training algorithm can be configured by defining the
property.
One example of a learning algorithm that can be used with this class is the
Sequential Minimal Optimization
(SMO) algorithm.
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Computes the error ratio, the number of
misclassifications divided by the total
number of samples in a dataset.
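The error ratio defined above amounts to the following computation. This is a Python sketch of the definition only; `error_ratio` is a hypothetical name and not the framework's method.

```python
def error_ratio(expected, predicted):
    """Number of misclassified samples divided by the total number of samples."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    mistakes = sum(1 for e, p in zip(expected, predicted) if e != p)
    return mistakes / len(expected)
```

A value of 0 means every sample was classified correctly, and 1 means every sample was misclassified.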
Obsolete.
Initializes a new instance of the class.
Obsolete.
Obsolete.
Converts
into a lambda function that can be passed to the
property of a learning algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Gets or sets the kernel function to be used to learn the
kernel support
vector machines.
One-against-one Multi-class Support Vector Machine Learning Algorithm
This class can be used to train Kernel Support Vector Machines with
any algorithm using a one-against-one strategy. The underlying
training algorithm can be configured by defining the
property.
One example of a learning algorithm that can be used with this class is the
Sequential Minimal Optimization
(SMO) algorithm.
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
Initializes a new instance of the class.
Base class for multi-class support vector learning algorithms.
Gets or sets the kernel function to be used to learn the
kernel support
vector machines.
Base class for multi-class support vector learning algorithms.
Gets or sets the kernel function to be used to learn the
kernel support
vector machines.
One-against-one Multi-class Support Vector Machine Learning Algorithm
This class can be used to train Kernel Support Vector Machines with
any algorithm using a one-against-one strategy. The underlying
training algorithm can be configured by defining the
property.
One example of a learning algorithm that can be used with this class is the
Sequential Minimal Optimization
(SMO) algorithm.
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
Initializes a new instance of the class.
The existing machine to be learned.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Converts
into a lambda function that can be passed to the
property of a learning algorithm.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Gets or sets the kernel function to be used to learn the
kernel support
vector machines.
One-against-all Multi-label Support Vector Machine Learning Algorithm
This class can be used to train Kernel Support Vector Machines with
any algorithm using a one-against-all strategy. The underlying
training algorithm can be configured by defining the
property.
One example of learning algorithm that can be used with this class is the
Sequential Minimal Optimization
(SMO) algorithm.
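As a rough illustration of the one-against-all (one-vs-rest) multi-label strategy, the sketch below trains one binary decision function per class and lets each one decide independently, so a sample may receive several labels or none. It is plain Python, not this library's API; the function names and the nearest-centroid stand-in for the binary learner are assumptions of the sketch.

```python
def train_binary(xs, ys):
    """Stand-in binary learner (assumption): a nearest-centroid rule.

    Requires at least one +1 and one -1 sample.
    """
    def centroid(label):
        pts = [x for x, y in zip(xs, ys) if y == label]
        return [sum(col) / len(pts) for col in zip(*pts)]

    pos, neg = centroid(+1), centroid(-1)

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return lambda x: +1 if sqdist(x, pos) <= sqdist(x, neg) else -1


def one_vs_rest_fit(xs, label_sets, n_classes):
    """Train one 'class c versus everything else' machine per class."""
    machines = []
    for c in range(n_classes):
        ys = [+1 if c in labels else -1 for labels in label_sets]
        machines.append(train_binary(xs, ys))
    return machines


def one_vs_rest_decide(machines, x):
    """Multi-label decision: every machine answers on its own."""
    return [decide(x) > 0 for decide in machines]


xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label_sets = [set(), {0}, {1}, {0, 1}]
machines = one_vs_rest_fit(xs, label_sets, n_classes=2)
both = one_vs_rest_decide(machines, [0.9, 0.8])
only_first = one_vs_rest_decide(machines, [0.9, 0.1])
```

Unlike the one-against-one scheme, only one machine per class is needed, but every machine is trained on the full data set.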
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
Initializes a new instance of the class.
One-against-all Multi-label Support Vector Machine Learning Algorithm
This class can be used to train Kernel Support Vector Machines with
any algorithm using a one-against-all strategy. The underlying
training algorithm can be configured by defining the
property.
One example of learning algorithm that can be used with this class is the
Sequential Minimal Optimization
(SMO) algorithm.
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
Initializes a new instance of the class.
Base class for multi-label support vector learning algorithms.
Gets or sets the kernel function to be used to learn the
kernel support
vector machines.
One-class Support Vector Machine learning algorithm.
The following example shows how to use a one-class SVM.
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
One-class Support Vector Machine learning algorithm.
The following example shows how to use a one-class SVM.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
One-class Support Vector Machine learning algorithm.
The following example shows how to use a one-class SVM.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
One-class Support Vector Machine Learning Algorithm.
Gets or sets the classifier being learned.
Gets or sets the kernel function used to create a
kernel Support Vector Machine. If this property
is set, will be
set to false.
Gets or sets whether initial values for some kernel parameters
should be estimated from the data, if possible. Default is true.
Constructs a new one-class support vector learning algorithm.
A support vector machine.
Constructs a new one-class support vector learning algorithm.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Gets or sets a cancellation token that can be used to
stop the learning algorithm while it is running.
Convergence tolerance. Default value is 1e-2.
The criterion for completing the model training process. The default is 0.01.
Gets or sets a value indicating whether to use
shrinking heuristics during learning. Default is true.
true to use shrinking; otherwise, false.
Controls the number of outliers accepted by the algorithm. This
value provides an upper bound on the fraction of training errors
and a lower bound on the fraction of support vectors. Default is 0.5.
The summary description is given in Chang and Lin,
"LIBSVM: A Library for Support Vector Machines", 2013.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data .
Obsolete.
Obsolete.
L1-regularized logistic regression (probabilistic SVM)
learning algorithm (-s 6).
This class implements a learning algorithm
crafted specifically for probabilistic linear machines. It provides an
L1-regularized coordinate descent learning algorithm for optimizing the learning
problem. The code is based on liblinear's solve_l1r_lr
method, whose original description is provided below.
Liblinear's solver -s 6: L1R_LR.
A coordinate descent algorithm for L1-regularized
logistic regression (probabilistic svm) problems.
min_w \sum |wj| + C \sum log(1+exp(-yi w^T xi)),
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Yuan et al. (2011) and appendix of LIBLINEAR paper, Fan et al. (2008)
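To make the objective above concrete, the following plain-Python sketch evaluates it for a given weight vector. It only computes the function value and is not the coordinate descent solver itself. The function names are hypothetical, and applying Cp to positive samples and Cn to negative samples is an assumption based on the "Given" line above.

```python
import math


def log1p_exp(z):
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def l1r_lr_objective(w, xs, ys, c_pos, c_neg):
    """sum_j |w_j| + sum_i C_i * log(1 + exp(-y_i * w^T x_i))."""
    l1_penalty = sum(abs(wj) for wj in w)
    loss = 0.0
    for x, y in zip(xs, ys):
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        cost = c_pos if y > 0 else c_neg  # per-class cost (assumption)
        loss += cost * log1p_exp(-margin)
    return l1_penalty + loss


xs = [[1.0, 0.0], [0.0, 1.0]]
ys = [1, -1]
at_zero = l1r_lr_objective([0.0, 0.0], xs, ys, 1.0, 1.0)
at_w = l1r_lr_objective([1.0, -1.0], xs, ys, 1.0, 1.0)
```

At w = 0 every sample contributes C * log(2), which is a convenient sanity check for any implementation of this objective.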
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Constructs a new Newton method algorithm for L1-regularized
logistic regression (probabilistic linear vector machine).
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Obsolete.
L1-regularized logistic regression (probabilistic SVM)
learning algorithm (-s 6).
This class implements a learning algorithm
crafted specifically for probabilistic linear machines. It provides an
L1-regularized coordinate descent learning algorithm for optimizing the learning
problem. The code is based on liblinear's solve_l1r_lr
method, whose original description is provided below.
Liblinear's solver -s 6: L1R_LR.
A coordinate descent algorithm for L1-regularized
logistic regression (probabilistic svm) problems.
min_w \sum |wj| + C \sum log(1+exp(-yi w^T xi)),
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Yuan et al. (2011) and appendix of LIBLINEAR paper, Fan et al. (2008)
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Constructs a new Newton method algorithm for L1-regularized
logistic regression (probabilistic linear vector machine).
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for L1-regularized logistic regression (probabilistic SVM) learning algorithm (-s 6).
Initializes a new instance of the class.
Gets or sets the maximum number of iterations that should
be performed until the algorithm stops. Default is 1000.
Gets or sets the maximum number of line searches
that can be performed per iteration. Default is 20.
Gets or sets the maximum number of inner iterations that can
be performed by the inner solver algorithm. Default is 100.
Convergence tolerance. Default value is 0.01.
The criterion for completing the model training process. The default is 0.01.
Runs the learning algorithm.
Obsolete.
L2-regularized logistic regression (probabilistic support
vector machine) learning algorithm in the dual form (-s 7).
This class implements a learning algorithm
crafted specifically for probabilistic linear machines. It provides an
L2-regularized coordinate descent learning algorithm for optimizing the dual form
of the learning problem. The code is based on liblinear's
solve_l2r_lr_dual method, whose original description is provided below.
Liblinear's solver -s 7: L2R_LR_DUAL. A coordinate descent
algorithm for the dual of L2-regularized logistic regression problems.
min_\alpha 0.5(\alpha^T Q \alpha) + \sum \alpha_i log (\alpha_i)
+ (upper_bound_i - \alpha_i) log (upper_bound_i - \alpha_i),
s.t. 0 <= \alpha_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
upper_bound_i = Cp if y_i = 1
upper_bound_i = Cn if y_i = -1
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Algorithm 5 of Yu et al., MLJ 2010.
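The dual objective above can be evaluated directly, which may help when checking a solver's progress. The sketch below is plain Python with hypothetical names; it uses the identity alpha^T Q alpha = w^T w with w = sum_i alpha_i y_i x_i, treats 0 * log(0) as 0, and assumes Cp and Cn are the upper bounds for positive and negative samples as stated above. It does not perform any optimization.

```python
import math


def xlogx(v):
    """v * log(v), with the usual convention 0 * log(0) = 0."""
    return 0.0 if v == 0.0 else v * math.log(v)


def l2r_lr_dual_objective(alpha, xs, ys, c_pos, c_neg):
    """0.5 * alpha^T Q alpha + sum_i [a_i log a_i + (U_i - a_i) log(U_i - a_i)]."""
    n, dim = len(alpha), len(xs[0])
    # Since Q_ij = y_i y_j x_i^T x_j, alpha^T Q alpha equals ||w||^2.
    w = [sum(alpha[i] * ys[i] * xs[i][d] for i in range(n)) for d in range(dim)]
    quadratic = 0.5 * sum(wd * wd for wd in w)

    entropy = 0.0
    for a, y in zip(alpha, ys):
        upper = c_pos if y > 0 else c_neg
        if not 0.0 <= a <= upper:
            raise ValueError("alpha must satisfy 0 <= alpha_i <= upper_bound_i")
        entropy += xlogx(a) + xlogx(upper - a)
    return quadratic + entropy


xs = [[1.0, 0.0], [0.0, 1.0]]
ys = [1, -1]
interior = l2r_lr_dual_objective([0.5, 0.5], xs, ys, 1.0, 1.0)
boundary = l2r_lr_dual_objective([0.0, 1.0], xs, ys, 1.0, 1.0)
```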
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Constructs a new Newton method algorithm for L2-regularized
logistic regression (probabilistic linear SVMs) dual problems.
Obsolete.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized logistic regression (probabilistic support
vector machine) learning algorithm in the dual form (-s 7).
This class implements a learning algorithm
crafted specifically for probabilistic linear machines. It provides an
L2-regularized coordinate descent learning algorithm for optimizing the dual form
of the learning problem. The code is based on liblinear's
solve_l2r_lr_dual method, whose original description is provided below.
Liblinear's solver -s 7: L2R_LR_DUAL. A coordinate descent
algorithm for the dual of L2-regularized logistic regression problems.
min_\alpha 0.5(\alpha^T Q \alpha) + \sum \alpha_i log (\alpha_i)
+ (upper_bound_i - \alpha_i) log (upper_bound_i - \alpha_i),
s.t. 0 <= \alpha_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
upper_bound_i = Cp if y_i = 1
upper_bound_i = Cn if y_i = -1
Given: x, y, Cp, Cn, and eps as the stopping tolerance
See Algorithm 5 of Yu et al., MLJ 2010.
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Constructs a new Newton method algorithm for L2-regularized
logistic regression (probabilistic linear SVMs) dual problems.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for L2-regularized logistic regression (probabilistic support
vector machine) learning algorithm in the dual form (-s 7).
Constructs a new Newton method algorithm for L2-regularized
logistic regression (probabilistic linear SVMs) dual problems.
Gets or sets the maximum number of iterations that should
be performed until the algorithm stops. Default is 1000.
Gets or sets the maximum number of inner iterations that can
be performed by the inner solver algorithm. Default is 100.
Convergence tolerance. Default value is 0.1.
The criterion for completing the model training process. The default is 0.1.
Runs the learning algorithm.
Obsolete.
L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm in the primal.
This class implements an L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm that operates in the primal form of the
optimization problem. This method is based on liblinear's l2r_lr_fun
problem specification, optimized using a
Trust-region Newton method.
Liblinear's solver -s 0: L2R_LR. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss logistic regression.
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Obsolete.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Initializes a new instance of the class.
L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm in the primal.
This class implements an L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm that operates in the primal form of the
optimization problem. This method is based on liblinear's l2r_lr_fun
problem specification, optimized using a
Trust-region Newton method.
Liblinear's solver -s 0: L2R_LR. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss logistic regression.
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm in the primal.
This class implements an L2-regularized L2-loss logistic regression (probabilistic
support vector machine) learning algorithm that operates in the primal form of the
optimization problem. This method is based on liblinear's l2r_lr_fun
problem specification, optimized using a
Trust-region Newton method.
Liblinear's solver -s 0: L2R_LR. A trust-region Newton
algorithm for the primal of L2-regularized, L2-loss logistic regression.
Probabilistic SVMs are exactly the same as logistic regression models
trained using a large-margin decision criterion. As such, any linear SVM
learning algorithm can be used to obtain
objects as well.
The following example shows how to obtain a
from a probabilistic linear . It contains
exactly the same data used in the
documentation page for .
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for probabilistic Newton Method learning.
Constructs a new Newton method algorithm for L2-regularized logistic
regression (probabilistic linear SVMs) primal problems (-s 0).
Convergence tolerance. Default value is 0.01.
The criterion for completing the model training process. The default is 0.01.
Gets or sets the maximum number of iterations that should
be performed until the algorithm stops. Default is 1000.
Runs the learning algorithm.
Obsolete.
Probabilistic Output Calibration for Linear machines.
Instead of producing probabilistic outputs, Support Vector Machines
express their decisions in the form of a distance from support vectors in
feature space. In order to convert the SVM outputs into probabilities,
Platt (1999) proposed the calibration of the SVM outputs using a sigmoid
(Logit) link function. Later, Lin et al. (2007) provided a corrected and
improved version of Platt's probabilistic outputs. This class implements
the latter.
This class is not an actual learning algorithm, but a calibrator.
Machines passed as input to this algorithm should already have been trained
by a proper learning algorithm such as
Sequential Minimal Optimization (SMO).
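The calibration idea described above can be illustrated with a short plain-Python sketch. It shows the sigmoid that maps an SVM output f to a probability, the regularized targets from Platt (1999), and the negative log-likelihood that a calibrator would minimize over the two sigmoid parameters. The names are hypothetical and the minimization itself is left out, so this is not the procedure implemented by this class.

```python
import math


def platt_probability(f, a, b):
    """P(y = +1 | f) = 1 / (1 + exp(a * f + b)), evaluated stably."""
    z = a * f + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def platt_targets(labels):
    """Regularized targets used instead of hard 0/1 labels."""
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    return [t_pos if y > 0 else t_neg for y in labels]


def negative_log_likelihood(scores, labels, a, b):
    """Objective minimized over (a, b) when fitting the sigmoid."""
    total = 0.0
    for f, t in zip(scores, platt_targets(labels)):
        p = platt_probability(f, a, b)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


scores = [2.0, 1.0, -1.5, -0.5]   # uncalibrated SVM outputs (made-up values)
labels = [1, 1, -1, -1]
flat = negative_log_likelihood(scores, labels, 0.0, 0.0)
sloped = negative_log_likelihood(scores, labels, -1.0, 0.0)
```

A negative slope parameter makes larger SVM outputs map to larger probabilities, which is why the sloped sigmoid fits these made-up scores better than the flat one.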
This class can also be used in combination with
or to learn s
using the one-vs-one or one-vs-all multi-class decision strategies, respectively.
References:
- John C. Platt. 1999. Probabilistic Outputs for Support Vector Machines and Comparisons to
  Regularized Likelihood Methods. In Advances in Large Margin Classifiers (1999), pp. 61-74.
- Hsuan-Tien Lin, Chih-Jen Lin, and Ruby C. Weng. 2007. A note on Platt's probabilistic outputs
  for support vector machines. Mach. Learn. 68, 3 (October 2007), 267-276.
The following example shows how to calibrate an SVM that has
been trained to perform a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO and calibrated using Platt's scaling.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
The support vector machine to be calibrated.
Obsolete.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
Probabilistic Output Calibration for Kernel machines.
Instead of producing probabilistic outputs, Support Vector Machines
express their decisions in the form of a distance from support vectors in
feature space. In order to convert the SVM outputs into probabilities,
Platt (1999) proposed the calibration of the SVM outputs using a sigmoid
(Logit) link function. Later, Lin et al. (2007) provided a corrected and
improved version of Platt's probabilistic outputs. This class implements
the latter.
This class is not an actual learning algorithm, but a calibrator.
Machines passed as input to this algorithm should already have been trained
by a proper learning algorithm such as
Sequential Minimal Optimization (SMO).
This class can also be used in combination with
or to learn s
using the one-vs-one or one-vs-all multi-class decision strategies, respectively.
References:
- John C. Platt. 1999. Probabilistic Outputs for Support Vector Machines and Comparisons to
  Regularized Likelihood Methods. In Advances in Large Margin Classifiers (1999), pp. 61-74.
- Hsuan-Tien Lin, Chih-Jen Lin, and Ruby C. Weng. 2007. A note on Platt's probabilistic outputs
  for support vector machines. Mach. Learn. 68, 3 (October 2007), 267-276.
The following example shows how to calibrate an SVM that has
been trained to perform a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO and calibrated using Platt's scaling.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
The support vector machine to be calibrated.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
Probabilistic Output Calibration for structured Kernel machines.
Instead of producing probabilistic outputs, Support Vector Machines
express their decisions in the form of a distance from support vectors in
feature space. In order to convert the SVM outputs into probabilities,
Platt (1999) proposed the calibration of the SVM outputs using a sigmoid
(Logit) link function. Later, Lin et al (2007) provided a corrected and
improved version of Platt's probabilistic outputs. This class implements
the later.
This class is not an actual learning algorithm, but a calibrator.
Machines passed as input to this algorithm should already have been trained
by a proper learning algorithm such as
Sequential Minimal Optimization (SMO).
This class can also be used in combination with
or to learn s
using the one-vs-one or one-vs-all multi-class decision strategies, respectively.
References:
- John C. Platt. 1999. Probabilistic Outputs for Support Vector Machines and Comparisons to
  Regularized Likelihood Methods. In Advances in Large Margin Classifiers (1999), pp. 61-74.
- Hsuan-Tien Lin, Chih-Jen Lin, and Ruby C. Weng. 2007. A note on Platt's probabilistic outputs
  for support vector machines. Mach. Learn. 68, 3 (October 2007), 267-276.
The following example shows how to calibrate an SVM that has
been trained to perform a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO and calibrated using Platt's scaling.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
The support vector machine to be calibrated.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
Probabilistic Output Calibration.
Initializes a new instance of Platt's Probabilistic Output Calibration algorithm.
The support vector machine to be calibrated.
Gets or sets the maximum number of
iterations. Default is 100.
Gets or sets the tolerance under which the
answer must be found. Default is 1e-5.
Gets or sets the minimum step size used
during line search. Default is 1e-10.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Initializes a new instance of the class.
Obsolete.
Obsolete.
Obsolete.
Obsolete.
Coordinate descent algorithm for the L1 or L2-loss linear Support
Vector Regression (epsilon-SVR) learning problem in the dual form
(-s 12 and -s 13).
This class implements a learning algorithm
crafted specifically for linear machines. It provides an L2-regularized, L1-
or L2-loss coordinate descent learning algorithm for optimizing the dual form of
the learning problem. The code is based on liblinear's solve_l2r_l1l2_svr
method, whose original description is provided below.
Liblinear's solver -s 12: L2R_L2LOSS_SVR_DUAL and -s 13:
L2R_L1LOSS_SVR_DUAL. A coordinate descent algorithm for L1-loss and
L2-loss linear epsilon-vector regression (epsilon-SVR).
min_\beta 0.5\beta^T (Q + diag(lambda)) \beta - p \sum_{i=1}^l|\beta_i| + \sum_{i=1}^l yi\beta_i,
s.t. -upper_bound_i <= \beta_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
D is a diagonal matrix
In L1-SVM case:
upper_bound_i = C
lambda_i = 0
In L2-SVM case:
upper_bound_i = INF
lambda_i = 1/(2*C)
Given: x, y, p, C and eps as the stopping tolerance
See Algorithm 4 of Ho and Lin, 2012.
Obsolete.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Coordinate descent algorithm for the L1 or L2-loss linear Support
Vector Regression (epsilon-SVR) learning problem in the dual form
(-s 12 and -s 13).
This class implements a learning algorithm
crafted specifically for linear machines. It provides an L2-regularized, L1-
or L2-loss coordinate descent learning algorithm for optimizing the dual form of
the learning problem. The code is based on liblinear's solve_l2r_l1l2_svr
method, whose original description is provided below.
Liblinear's solver -s 12: L2R_L2LOSS_SVR_DUAL and -s 13:
L2R_L1LOSS_SVR_DUAL. A coordinate descent algorithm for L1-loss and
L2-loss linear epsilon-vector regression (epsilon-SVR).
min_\beta 0.5\beta^T (Q + diag(lambda)) \beta - p \sum_{i=1}^l|\beta_i| + \sum_{i=1}^l yi\beta_i,
s.t. -upper_bound_i <= \beta_i <= upper_bound_i,
where Qij = yi yj xi^T xj and
D is a diagonal matrix
In L1-SVM case:
upper_bound_i = C
lambda_i = 0
In L2-SVM case:
upper_bound_i = INF
lambda_i = 1/(2*C)
Given: x, y, p, C and eps as the stopping tolerance
See Algorithm 4 of Ho and Lin, 2012.
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for Coordinate descent algorithm for the L1 or L2-loss linear Support
Vector Regression (epsilon-SVR) learning problem in the dual form (-s 12 and -s 13).
Constructs a new coordinate descent algorithm for L1-loss and L2-loss SVM dual problems.
Gets or sets the cost function that
should be optimized. Default is
.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Convergence tolerance. Default value is 0.1.
The criterion for completing the model training process. The default is 0.1.
Runs the learning algorithm.
Obsolete.
L2-regularized L2-loss linear support vector regression
(SVR) learning algorithm in the primal formulation (-s 11).
This class implements an L2-regularized L2-loss support vector regression (SVR)
learning algorithm that operates in the primal form of the optimization problem.
This method is based on liblinear's l2r_l2_svr_fun problem specification,
optimized using a trust-region Newton method.
Liblinear's solver -s 11: L2R_L2LOSS_SVR. A trust-region Newton algorithm
for the primal of L2-regularized, L2-loss linear epsilon-vector regression (epsilon-SVR).
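For orientation, the sketch below evaluates the primal objective usually associated with L2-regularized, L2-loss epsilon-SVR: 0.5 * w^T w + C * sum_i max(0, |y_i - w^T x_i| - p)^2. This form is an assumption taken from the general liblinear formulation rather than from this class, the function name is hypothetical, and no trust-region Newton optimization is performed.

```python
def l2r_l2loss_svr_objective(w, xs, ys, c, p):
    """0.5 * ||w||^2 + C * sum of squared epsilon-insensitive residuals."""
    regularizer = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for x, y in zip(xs, ys):
        residual = abs(y - sum(wj * xj for wj, xj in zip(w, x)))
        # Residuals inside the tube of half-width p cost nothing.
        excess = max(0.0, residual - p)
        loss += excess * excess
    return regularizer + c * loss


xs = [[1.0], [2.0]]
ys = [1.05, 3.0]
value = l2r_l2loss_svr_objective([1.0], xs, ys, c=2.0, p=0.1)
```

In this toy data the first sample lies inside the insensitive tube and contributes nothing, while the second one is penalized quadratically for the part of its residual that exceeds p.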
Initializes a new instance of the class.
Obsolete.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
L2-regularized L2-loss linear support vector regression
(SVR) learning algorithm in the primal formulation (-s 11).
This class implements an L2-regularized L2-loss support vector regression (SVR)
learning algorithm that operates in the primal form of the optimization problem.
This method is based on liblinear's l2r_l2_svr_fun problem specification,
optimized using a trust-region Newton method.
Liblinear's solver -s 11: L2R_L2LOSS_SVR. A trust-region Newton algorithm
for the primal of L2-regularized, L2-loss linear epsilon-vector regression (epsilon-SVR).
Initializes a new instance of the class.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for the Newton method linear regression learning algorithm.
Constructs a new Newton method algorithm for L2-regularized
support vector regression (SVR-SVMs) primal problems.
Convergence tolerance. Default value is 0.01.
The criterion for completing the model training process. The default is 0.01.
Gets or sets the maximum number of iterations that should
be performed until the algorithm stops. Default is 1000.
Runs the learning algorithm.
Obsolete.
Gets the selection strategy to be used in SMO.
Uses the sequential selection strategy, as
suggested by Keerthi et al.'s algorithm 1.
Always selects the worst violating pair
to be optimized first, as suggested in
Keerthi et al.'s algorithm 2.
Uses a second-order selection algorithm,
the same one used by LibSVM's implementation.
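The "worst violating pair" strategy mentioned above can be sketched as follows. The code is plain Python written in the first-order notation commonly used to describe LIBSVM-style working set selection, not the code of this library: `grad` is assumed to hold the gradient of the dual objective, (Q alpha)_t - 1 with Q_ij = y_i y_j K(x_i, x_j), and all names are hypothetical.

```python
def select_max_violating_pair(grad, y, alpha, c):
    """Return (i, j, gap) for the pair that most violates optimality.

    i maximizes -y_t * grad_t over samples whose alpha can still move "up",
    j minimizes it over samples whose alpha can still move "down".
    The dual solution is optimal (within tolerance) once gap is small.
    """
    best_up, i = float("-inf"), -1
    best_low, j = float("inf"), -1
    for t in range(len(y)):
        score = -y[t] * grad[t]
        in_up = (y[t] > 0 and alpha[t] < c) or (y[t] < 0 and alpha[t] > 0)
        in_low = (y[t] > 0 and alpha[t] > 0) or (y[t] < 0 and alpha[t] < c)
        if in_up and score > best_up:
            best_up, i = score, t
        if in_low and score < best_low:
            best_low, j = score, t
    return i, j, best_up - best_low


# At the usual starting point alpha = 0 the gradient is -1 everywhere.
y = [1, -1, 1, -1]
start = select_max_violating_pair([-1.0] * 4, y, [0.0] * 4, c=1.0)

# At an optimal point of a two-sample problem the gap vanishes.
done = select_max_violating_pair([0.0, 0.0], [1, -1], [0.5, 0.5], c=1.0)
```

A training loop would stop once the returned gap drops below the convergence tolerance, and otherwise optimize the selected pair.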
Sequential Minimal Optimization (SMO) Algorithm
SMO is an algorithm for solving large quadratic programming (QP)
optimization problems, widely used for the training of support vector machines.
First developed by John C. Platt in 1998, SMO breaks up large QP problems into
a series of the smallest possible QP problems, which are then solved analytically.
This class follows the original algorithm by Platt, with additional modifications
by Keerthi et al.
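The "smallest possible QP problem" involves just two Lagrange multipliers, and its solution has a closed form. The plain-Python sketch below shows that single analytic step for a chosen pair, following the textbook description of Platt's update. The names are hypothetical, the kernel is passed as a precomputed matrix, and the bias update and pair selection heuristics are deliberately omitted, so this is not the full algorithm implemented by this class.

```python
def decision_value(kernel, y, alpha, b, i):
    """f(x_i) = sum_k alpha_k * y_k * K(x_k, x_i) + b."""
    return sum(alpha[k] * y[k] * kernel[k][i] for k in range(len(y))) + b


def smo_pair_step(kernel, y, alpha, b, i, j, c):
    """Optimize alpha[i] and alpha[j] analytically; return the new alpha list."""
    error_i = decision_value(kernel, y, alpha, b, i) - y[i]
    error_j = decision_value(kernel, y, alpha, b, j) - y[j]

    # Box bounds that keep 0 <= alpha <= c and sum(alpha * y) unchanged.
    if y[i] != y[j]:
        low, high = max(0.0, alpha[j] - alpha[i]), min(c, c + alpha[j] - alpha[i])
    else:
        low, high = max(0.0, alpha[i] + alpha[j] - c), min(c, alpha[i] + alpha[j])

    # Curvature of the objective along the feasible line.
    eta = kernel[i][i] + kernel[j][j] - 2.0 * kernel[i][j]
    if eta <= 0.0 or low >= high:
        return list(alpha)  # this sketch simply skips degenerate pairs

    new_j = alpha[j] + y[j] * (error_i - error_j) / eta
    new_j = min(high, max(low, new_j))          # clip to the box
    new_i = alpha[i] + y[i] * y[j] * (alpha[j] - new_j)

    updated = list(alpha)
    updated[i], updated[j] = new_i, new_j
    return updated


# Two 1-D points, x = +1 (label +1) and x = -1 (label -1), linear kernel.
kernel = [[1.0, -1.0], [-1.0, 1.0]]
y = [1, -1]
alpha = smo_pair_step(kernel, y, [0.0, 0.0], 0.0, 0, 1, c=1.0)
```

For this two-point problem a single step already reaches the maximum-margin solution, with both points sitting exactly on the margin.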
This class can also be used in combination with
or to learn s
using the one-vs-one or one-vs-all multi-class decision strategies, respectively.
References:
- Wikipedia, The Free Encyclopedia. Sequential Minimal Optimization. Available at:
  http://en.wikipedia.org/wiki/Sequential_Minimal_Optimization
- John C. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support
  Vector Machines. 1998. Available at: http://research.microsoft.com/en-us/um/people/jplatt/smoTR.pdf
- S. S. Keerthi et al. Improvements to Platt's SMO Algorithm for SVM Classifier Design.
  Technical Report CD-99-14. Available at: http://www.cs.iastate.edu/~honavar/keerthi-svm.pdf
- J. P. Lewis. A Short SVM (Support Vector Machine) Tutorial. Available at:
  http://www.idiom.com/~zilla/Work/Notes/svmtutorial.pdf
The following example shows how to use an SVM to learn a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO.
The same as before, but using a Gaussian kernel.
The following example shows how to learn a simple binary SVM using
a precomputed kernel matrix obtained from a Gaussian kernel.
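A precomputed kernel (Gram) matrix simply stores the kernel value for every pair of training samples. The plain-Python sketch below builds one for the XOR inputs using a Gaussian kernel of the form exp(-||x - z||^2 / (2 * sigma^2)). The parameterization by `sigma` and the function names are assumptions of this sketch, and passing the matrix to a learner is not shown.

```python
import math


def gaussian_kernel(x, z, sigma):
    """k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def gram_matrix(xs, sigma):
    """Matrix whose (i, j) entry is the kernel value between samples i and j."""
    return [[gaussian_kernel(a, b, sigma) for b in xs] for a in xs]


xor_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
K = gram_matrix(xor_inputs, sigma=1.0)
```

The resulting matrix is symmetric with ones on its diagonal, since every sample is at distance zero from itself.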
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Obsolete.
Obsolete.
Initializes a new instance of the class.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each input.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Sequential Minimal Optimization (SMO) Algorithm.
SMO is an algorithm for solving large quadratic programming (QP)
optimization problems, widely used for the training of support vector machines.
First developed by John C. Platt in 1998, SMO breaks up large QP problems into
a series of the smallest possible QP problems, which are then solved analytically.
This class follows the original algorithm by Platt, with additional modifications
by Keerthi et al.
This class can also be used in combination with
or to learn s
using the one-vs-one or one-vs-all multi-class decision strategies, respectively.
References:
- Wikipedia, The Free Encyclopedia. Sequential Minimal Optimization. Available at:
  http://en.wikipedia.org/wiki/Sequential_Minimal_Optimization
- John C. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support
  Vector Machines. 1998. Available at: http://research.microsoft.com/en-us/um/people/jplatt/smoTR.pdf
- S. S. Keerthi et al. Improvements to Platt's SMO Algorithm for SVM Classifier Design.
  Technical Report CD-99-14. Available at: http://www.cs.iastate.edu/~honavar/keerthi-svm.pdf
- J. P. Lewis. A Short SVM (Support Vector Machine) Tutorial. Available at:
  http://www.idiom.com/~zilla/Work/Notes/svmtutorial.pdf
The following example shows how to use an SVM to learn a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO.
The same as before, but using a Gaussian kernel.
The following example shows how to learn a simple binary SVM using
a precomputed kernel matrix obtained from a Gaussian kernel.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Sequential Minimal Optimization (SMO) Algorithm (for arbitrary data types).
The SMO algorithm is an algorithm for solving large quadratic programming (QP)
optimization problems, widely used for the training of support vector machines.
First developed by John C. Platt in 1998, SMO breaks up large QP problems into
a series of smallest possible QP problems, which are then solved analytically.
This class follows the original algorithm by Platt with additional modifications
by Keerthi et al.
This class can also be used in combination with multi-class or multi-label
learning algorithms to learn multi-class or multi-label support vector machines
using the one-vs-one or one-vs-all decision strategies, respectively.
References:
-
Wikipedia, The Free Encyclopedia. Sequential Minimal Optimization. Available on:
http://en.wikipedia.org/wiki/Sequential_Minimal_Optimization
-
John C. Platt, Sequential Minimal Optimization: A Fast Algorithm for Training Support
Vector Machines. 1998. Available on: http://research.microsoft.com/en-us/um/people/jplatt/smoTR.pdf
-
S. S. Keerthi et al. Improvements to Platt's SMO Algorithm for SVM Classifier Design.
Technical Report CD-99-14. Available on: http://www.cs.iastate.edu/~honavar/keerthi-svm.pdf
-
J. P. Lewis. A Short SVM (Support Vector Machine) Tutorial. Available on:
http://www.idiom.com/~zilla/Work/Notes/svmtutorial.pdf
The following example shows how to use an SVM to learn a simple XOR function.
The next example shows how to solve a multi-class problem using a one-vs-one SVM
where the binary machines are learned using SMO.
The same as before, but using a Gaussian kernel.
The following example shows how to learn a simple binary SVM using
a precomputed kernel matrix obtained from a Gaussian kernel.
Creates an instance of the model to be learned. Inheritors
of this abstract class must define this method so new models
can be created from the training data.
Base class for Sequential Minimal Optimization.
Initializes a new instance of the class.
Epsilon for round-off errors. Default value is 1e-6.
Convergence tolerance. Default value is 1e-2.
The criterion for completing the model training process. The default is 0.01.
Gets or sets the pair selection
strategy to be used during optimization.
Gets or sets the cache size to partially store the kernel
matrix. Default is the same number of input vectors, meaning
the entire kernel matrix will be computed and cached in memory.
If set to zero, the cache will be disabled and all operations will
be computed as needed.
In order to know how many rows can fit within a given amount of memory, you can use
.
Be sure to also test the algorithm with the cache disabled, as sometimes the
cost of the extra memory allocations needed by the cache will be higher than
the cost of evaluating the kernel function, especially for fast kernels such
as .
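The trade-off described above can be illustrated with a minimal sketch of a kernel row cache. The class name KernelRowCache, its members, and the least-recently-used eviction policy are assumptions made for this illustration; the text above does not state how the library evicts rows.

```python
from collections import OrderedDict


class KernelRowCache:
    """Hypothetical cache of kernel matrix rows (illustration only)."""

    def __init__(self, kernel, inputs, capacity):
        self.kernel = kernel
        self.inputs = inputs
        self.capacity = capacity  # number of rows kept; 0 disables the cache
        self.rows = OrderedDict()
        self.hits = 0
        self.misses = 0

    def row(self, i):
        """Returns row i of the kernel matrix, computing it if needed."""
        if self.capacity > 0 and i in self.rows:
            self.rows.move_to_end(i)  # mark as most recently used
            self.hits += 1
            return self.rows[i]
        self.misses += 1
        values = [self.kernel(self.inputs[i], x) for x in self.inputs]
        if self.capacity > 0:
            self.rows[i] = values
            if len(self.rows) > self.capacity:
                self.rows.popitem(last=False)  # assumed policy: evict oldest row
        return values


if __name__ == "__main__":
    cache = KernelRowCache(lambda a, b: a * b, [1.0, 2.0, 3.0], capacity=2)
    for index in [0, 1, 0, 2, 1]:
        cache.row(index)
    print(cache.hits, cache.misses)
```

With a capacity equal to the number of input vectors every row is computed at most once; with a capacity of zero every request recomputes the row, which can still be the faster option for cheap kernels.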
Gets or sets a value indicating whether shrinking heuristics should be used. Default is false. Note:
this property can only be used when is set to .
true to use shrinking heuristics; otherwise, false.
Gets the value for the Lagrange multipliers
(alpha) for every observation vector.
Gets or sets whether to produce compact models. Compact
formulation is currently limited to linear models.
Gets the indices of the active examples (examples whose
corresponding Lagrange multiplier is different from zero).
Gets the indices of the non-bounded examples (examples which
have the corresponding Lagrange multipliers between 0 and C).
Gets the indices of the examples at the boundary (examples
which have the corresponding Lagrange multipliers equal to C).
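The three index sets above can be sketched as a simple partition of the multiplier vector. The function name and the tolerance eps are hypothetical and chosen only for this illustration.

```python
def partition_multipliers(alpha, c, eps=1e-12):
    """Splits example indices by the value of their Lagrange multiplier.

    Returns (active, non_bounded, at_bound):
      active      -> alpha different from zero
      non_bounded -> alpha strictly between 0 and c
      at_bound    -> alpha equal to c
    The tolerance eps is an assumption of this sketch.
    """
    active = [i for i, a in enumerate(alpha) if a > eps]
    non_bounded = [i for i in active if alpha[i] < c - eps]
    at_bound = [i for i in active if alpha[i] >= c - eps]
    return active, non_bounded, at_bound


if __name__ == "__main__":
    print(partition_multipliers([0.0, 0.5, 1.0, 0.0, 1.0], c=1.0))
```

Every active example is either non-bounded or at the boundary, so the last two lists always partition the first.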
Runs the learning algorithm.
Chooses which multipliers to optimize using heuristics.
Analytically solves the optimization problem for two Lagrange multipliers.
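As an illustration of the analytic step, the sketch below follows the two-multiplier update from Platt's 1998 report. The function name and argument order are hypothetical, and the handling of a non-positive curvature (eta) is simplified to "leave the pair unchanged", which is an assumption of this sketch rather than a description of the library's behaviour.

```python
def smo_pair_update(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """Jointly optimizes two Lagrange multipliers; returns (new_a1, new_a2).

    y1, y2 are labels in {-1, +1}; e1, e2 are the prediction errors
    f(x_i) - y_i; k11, k12, k22 are kernel values; c is the complexity bound.
    """
    # Ends of the feasible segment for a2, from the box constraint
    # 0 <= alpha <= c and the equality constraint y1*a1 + y2*a2 = constant.
    if y1 != y2:
        low, high = max(0.0, a2 - a1), min(c, c + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - c), min(c, a1 + a2)

    # Curvature of the objective along the constraint line.
    eta = k11 + k22 - 2.0 * k12
    if low >= high or eta <= 0.0:
        return a1, a2  # simplification: skip degenerate pairs

    # Unconstrained optimum for a2, then clipped to the feasible segment.
    new_a2 = a2 + y2 * (e1 - e2) / eta
    new_a2 = min(high, max(low, new_a2))

    # a1 moves in the opposite direction to keep the equality constraint.
    new_a1 = a1 + y1 * y2 * (a2 - new_a2)
    return new_a1, new_a2


if __name__ == "__main__":
    # Two orthogonal unit vectors with opposite labels, starting from zero.
    print(smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 0.0, 1.0, c=10.0))
```

The update keeps the sum of y_i * alpha_i unchanged, which is why only two multipliers can be optimized at a time.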
Computes the SVM output for a given point.
Obsolete.
Exact support vector reduction through linear dependency elimination.
The following example shows how to reduce the number of support vectors in
an SVM by removing vectors that are linearly dependent on one another.
Initializes a new instance of the class.
The machine to be reduced.
Exact support vector reduction through linear dependency elimination.
The following example shows how to reduce the number of support vectors in
an SVM by removing vectors that are linearly dependent on one another.
Initializes a new instance of the class.
The machine to be reduced.
Exact support vector reduction through linear dependency elimination.
The following example shows how to reduce the number of support vectors in
an SVM by removing vectors that are linearly dependent on one another.
Initializes a new instance of the class.
The machine to be reduced.
Exact support vector reduction through linear dependency elimination.
The following example shows how to reduce the number of support vectors in
an SVM by removing vectors that are linearly dependent on one another.
Gets or sets the minimum threshold that is used to determine
whether a weight will be kept in the machine or not. Default
is 1e-12.
Creates a new algorithm.
The machine to be reduced.
Learns a model that can map the given inputs to the given outputs.
Runs the learning algorithm.
Obsolete.
Creates a new algorithm.
The machine to be reduced.
One-against-one Multi-class Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. One of the ways
to extend the original SVM algorithm to multiple classes is to build a one-
against-one scheme where multiple SVMs specialize to recognize each of the
available classes. By using a competition scheme, the original multi-class
classification problem is then reduced to n*(n-1)/2 smaller binary problems.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
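A one-against-one scheme over n classes uses one binary machine per unordered pair of classes, that is, n*(n-1)/2 machines. The sketch below enumerates those pairs and resolves the competition by simple voting. The function names, the CENTERS data and the toy pairwise rule are hypothetical stand-ins for trained binary machines.

```python
from itertools import combinations


def one_vs_one_pairs(n_classes):
    """Lists every unordered pair of classes (one binary problem each)."""
    return list(combinations(range(n_classes), 2))


def decide_by_voting(x, n_classes, pairwise_winner):
    """Runs every pairwise machine and returns the class with most votes."""
    votes = [0] * n_classes
    for i, j in one_vs_one_pairs(n_classes):
        votes[pairwise_winner(i, j, x)] += 1
    return max(range(n_classes), key=votes.__getitem__)


# Toy stand-in for the binary machines: the class whose center is closer wins.
CENTERS = [0.0, 5.0, 10.0]


def nearest_center_winner(i, j, x):
    return i if abs(x - CENTERS[i]) <= abs(x - CENTERS[j]) else j


if __name__ == "__main__":
    print(len(one_vs_one_pairs(4)))
    print(decide_by_voting(4.2, 3, nearest_center_winner))
```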
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
A function to create the inner binary support vector machines.
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
The kernel function to be used in the machine.
One-against-one Multi-class Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. One of the ways
to extend the original SVM algorithm to multiple classes is to build a one-
against-one scheme where multiple SVMs specialize to recognize each of the
available classes. By using a competition scheme, the original multi-class
classification problem is then reduced to n*(n-1)/2 smaller binary problems.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
The kernel function to be used in the machine.
Constructs a new Multi-class Kernel Support Vector Machine.
The machines to be used in each of the pair-wise class subproblems.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Gets the total number of machines
in this multi-class classifier.
Gets the number of classes.
Gets the number of inputs of the machines.
Gets the classifiers for each subproblem.
Computes the given input to produce the corresponding output.
An input vector.
The decision label for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The output of the machine. If this is a
probabilistic machine, the
output is the probability of the positive class. If this is
a standard machine, the output is the distance to the decision
hyperplane in feature space.
The decision label for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The output of the machine. If this is a
probabilistic machine, the
output is the probability of the positive class. If this is
a standard machine, the output is the distance to the decision
hyperplane in feature space.
The decision path followed by the Decision
Directed Acyclic Graph used by the
elimination method.
The decision label for the given input.
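The elimination method can be pictured as a walk through a decision directed acyclic graph: each pairwise machine eliminates one candidate class, so n classes need only n-1 evaluations. The sketch below is a minimal illustration; the function names, the toy pairwise rule and the order in which pairs are visited (first remaining class against the last) are assumptions, not the library's documented traversal.

```python
def decide_by_elimination(x, n_classes, pairwise_winner):
    """Returns (decision, path) after eliminating one class per comparison."""
    candidates = list(range(n_classes))
    path = []
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        winner = pairwise_winner(i, j, x)
        path.append((i, j, winner))
        if winner == i:
            candidates.pop()    # class j is eliminated
        else:
            candidates.pop(0)   # class i is eliminated
    return candidates[0], path


# Toy stand-in for the binary machines: the class whose center is closer wins.
DDAG_CENTERS = [0.0, 5.0, 10.0, 15.0]


def closer_center(i, j, x):
    return i if abs(x - DDAG_CENTERS[i]) <= abs(x - DDAG_CENTERS[j]) else j


if __name__ == "__main__":
    decision, path = decide_by_elimination(4.2, 4, closer_center)
    print(decision, path)
```

The returned path records each (class, class, winner) comparison, which corresponds to the decision path mentioned above.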
Computes the given input to produce the corresponding output.
An input vector.
The model response for each class.
The decision label for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The
multi-class classification method to use.
The model response for each class.
The output of the machine. If this is a
probabilistic machine, the
output is the probability of the positive class. If this is
a standard machine, the output is the distance to the decision
hyperplane in feature space.
The decision label for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The
multi-class classification method to use.
The model response for each class.
The class decision for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The
multi-class classification method to use.
The output of the machine. If this is a
probabilistic machine, the
output is the probability of the positive class. If this is
a standard machine, the output is the distance to the decision
hyperplane in feature space.
The class decision for the given input.
Computes the given input to produce the corresponding output.
An input vector.
The
multi-class classification method to use.
The class decision for the given input.
Gets whether this machine has been calibrated to
produce probabilistic outputs (through the Probability(TInput)
and Probabilities(TInput) methods).
Saves the machine to a stream.
The stream to which the machine is to be serialized.
Saves the machine to a file.
The path to the file to which the machine is to be serialized.
Loads a machine from a stream.
The stream from which the machine is to be deserialized.
The deserialized machine.
Loads a machine from a file.
The path to the file from which the machine is to be deserialized.
The deserialized machine.
One-against-one Multi-class Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. One of the ways
to extend the original SVM algorithm to multiple classes is to build a one-
against-one scheme where multiple SVMs specialize to recognize each of the
available classes. By using a competition scheme, the original multi-class
classification problem is then reduced to n*(n-1)/2 smaller binary problems.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-class support vector
machine using the algorithm.
The following example shows how to learn a non-linear, multi-class support
vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce
probability estimates (instead of simple class separation distances). The
following example shows how to use
within to generate a probabilistic
SVM:
Gets or sets the kernel function used in all machines at once.
Gets or sets the minimum number of shared support vectors that a
machine should have for kernel evaluation caching to be enabled.
Default is 64.
The cache threshold.
If the inner machines have a linear kernel, compresses
their support vectors into a single parameter vector for
each machine.
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
A function to create the inner binary support vector machines.
Gets the total number of support vectors
in the entire multi-class machine.
Gets the number of unique support
vectors in the multi-class machine.
Gets the number of shared support
vectors in the multi-class machine.
Compute SVM output with support vector sharing.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The location where to store the class-labels.
The class label that best describes the input according
to this classifier.
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the result will be stored,
avoiding unnecessary memory allocations.
Resets the cache and machine statistics
so they can be recomputed on next evaluation.
Gets the total kernel evaluations performed in the last call
to Decide(TInput) and similar functions in the current thread.
The number of total kernel evaluations.
Gets the number of cache hits during the last call
to Decide(TInput) and similar functions in the current thread.
The number of cache hits in the last decision.
Performs application-defined tasks associated with
freeing, releasing, or resetting unmanaged resources.
Releases unmanaged and - optionally - managed resources
true to release both managed and unmanaged resources;
false to release only unmanaged resources.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
One-against-one Multi-class Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. One of the ways
to extend the original SVM algorithm to multiple classes is to build a one-
against-one scheme where multiple SVMs specialize to recognize each of the
available classes. By using a competition scheme, the original multi-class
classification problem is then reduced to n*(n-1)/2 smaller binary problems.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
The number of inputs (length of the input vectors) accepted by the machine.
The kernel function to be used.
One-against-all Multi-label Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. Multiple label
problems are problems in which an input sample is allowed to belong to one
or more classes. A way to implement multi-label classes in support vector
machines is to build a one-against-all decision scheme where multiple SVMs
are trained to detect each of the available classes.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
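In a one-against-all multi-label scheme each class has its own binary machine, and every machine is asked independently whether its label applies, so an input can receive zero, one or several labels. The sketch below illustrates this with hypothetical linear scorers standing in for trained machines; none of the names belong to the library.

```python
def multilabel_decide(x, scorers):
    """Returns one boolean per class: True where that class's score is positive."""
    return [scorer(x) > 0 for scorer in scorers]


# Hypothetical per-class scoring functions (stand-ins for trained machines).
SCORERS = [
    lambda x: x[0] - 1.0,         # label 0: first feature is large
    lambda x: x[1] - 1.0,         # label 1: second feature is large
    lambda x: x[0] + x[1] - 3.0,  # label 2: both features are large
]


if __name__ == "__main__":
    for point in [(2.0, 2.0), (2.0, 0.0), (0.0, 0.0)]:
        print(point, multilabel_decide(point, SCORERS))
```

A multi-class decision, by contrast, would pick a single winning class from the same scores.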
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Initializes a new instance of the class.
The number of classes in the multi-label classification problem.
A function to create the inner binary classifiers.
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
The kernel function to be used in the machine.
One-against-all Multi-label Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. Multiple label
problems are problems in which an input sample is allowed to belong to one
or more classes. A way to implement multi-label classes in support vector
machines is to build a one-against-all decision scheme where multiple SVMs
are trained to detect each of the available classes.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
Initializes a new instance of the class.
The number of inputs accepted by the machine.
The number of classes to be handled by the machine.
The kernel function to be used in the machine.
Initializes a new instance of the class.
The existing machines for detecting each of the classes against all other classes.
Gets the classifier for class .
Saves the machine to a stream.
The stream to which the machine is to be serialized.
Saves the machine to a file.
The path to the file to which the machine is to be serialized.
Loads a machine from a stream.
The stream from which the machine is to be deserialized.
The deserialized machine.
Loads a machine from a file.
The path to the file from which the machine is to be deserialized.
The deserialized machine.
Gets the number of classes.
Gets the number of inputs of the machines.
Gets the classifiers for each subproblem.
Computes the given input to produce the corresponding output.
An input vector.
The output for the given input.
The decision label for the given input.
Computes the given input to produce the corresponding outputs.
An input vector.
The model response for each class.
The decision label for the given input.
Computes the given input to produce the corresponding outputs.
An input vector.
The decision label for the given input.
Probability computation strategies for
Probabilities should be computed per class, meaning the probabilities among all the classes do not need to sum
up to one. This is the default when dealing with multi-label (as opposed to multi-class) classification problems.
Probabilities should be normalized to sum up to one. This will be done using the softmax
function considering the output probabilities of all classes.
Probabilities should be normalized to sum up to one. However, in a one-vs-rest setting, one of the machines will
have been trained to specifically distinguish between the winning class and the rest of the classes. As such, the
output of this machine will be used to determine the true probability of the winning class, and the complement of
this probability will be divided among the remaining (losing) classes. The probabilities of the losing classes
will be determined using a softmax.
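The two normalizing strategies can be sketched as follows. The function names are hypothetical, and the sketch assumes the softmax is applied directly to the per-class output values; the text above does not say whether the library applies it to probabilities, log-likelihoods or raw scores.

```python
import math


def softmax(values):
    """Numerically stable softmax; the result sums to one."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sums_to_one(per_class):
    """Normalizes all class outputs jointly with a softmax."""
    return softmax(per_class)


def sums_to_one_with_emphasis_on_winner(per_class):
    """Keeps the winner's own probability; shares the complement among losers."""
    winner = max(range(len(per_class)), key=per_class.__getitem__)
    losers = [i for i in range(len(per_class)) if i != winner]
    result = [0.0] * len(per_class)
    result[winner] = per_class[winner]
    if losers:
        remainder = 1.0 - per_class[winner]
        shares = softmax([per_class[i] for i in losers])
        for i, share in zip(losers, shares):
            result[i] = remainder * share
    return result


if __name__ == "__main__":
    outputs = [0.9, 0.2, 0.2]  # per-class probabilities; they need not sum to one
    print(sums_to_one(outputs))
    print(sums_to_one_with_emphasis_on_winner(outputs))
```

In the second strategy the winning class keeps exactly the probability reported by its own machine, whereas the plain softmax changes every value.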
One-against-all Multi-label Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. Multiple label
problems are problems in which an input sample is allowed to belong to one
or more classes. A way to implement multi-label classes in support vector
machines is to build a one-against-all decision scheme where multiple SVMs
are trained to detect each of the available classes.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Gets the total number of support vectors
in the entire multi-label machine.
Gets the number of unique support
vectors in the multi-label machine.
Gets the number of shared support
vectors in the multi-label machine.
Gets or sets the that should be used when computing probabilities
using the and related methods.
Gets or sets the parallelization options used
when deciding the class of a new sample.
If the inner machines have a linear kernel, compresses
their support vectors into a single parameter vector for
each machine.
Compute SVM output with support vector sharing.
Resets the cache and machine statistics
so they can be recomputed on next evaluation.
Computes whether a class label applies to an input vector.
The input vectors that should be classified as
any of the possible classes.
The class label index to be tested.
A boolean value indicating whether the given
class label applies to the vector.
Computes class-label decisions for the given input.
The input vectors that should be classified as
any of the possible classes.
The location where to store the class-labels.
A set of class-labels that best describe the
vectors according to this classifier.
Predicts a class label vector for the given input vector, returning a
numerical score measuring the strength of association of the input vector
to each of the possible classes.
A set of input vectors.
The class labels associated with each input
vector, as predicted by the classifier. If passed as null, the classifier
will create a new array.
An array where the scores will be stored,
avoiding unnecessary memory allocations.
Predicts a class label vector for the given input vector, returning the
log-likelihoods of the input vector belonging to each possible class.
The input vector.
The class label predicted by the classifier.
An array where the log-likelihoods will be stored,
avoiding unnecessary memory allocations.
Predicts a class label vector for the given input vector, returning the
log-likelihoods of the input vector belonging to each possible class.
The input vector.
The class label predicted by the classifier.
An array where the log-likelihoods will be stored,
avoiding unnecessary memory allocations.
Computes a numerical score measuring the association between
the given input vector and a given class.
The input vector.
The index of the class whose score will be computed.
System.Double.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
An array where the scores will be stored,
avoiding unnecessary memory allocations.
The class label that best describes the input according
to this classifier.
Performs application-defined tasks associated with
freeing, releasing, or resetting unmanaged resources.
Releases unmanaged and - optionally - managed resources
true to release both managed and unmanaged resources;
false to release only unmanaged resources.
Gets the total kernel evaluations performed in the last call
to Decide(TInput) and similar functions in the current thread.
The number of total kernel evaluations.
Gets the number of cache hits during the last call
to Decide(TInput) and similar functions in the current thread.
The number of cache hits in the last decision.
Initializes a new instance of the class.
The number of classes in the multi-label classification problem.
A function to create the inner binary classifiers.
One-against-all Multi-label Kernel Support Vector Machine Classifier.
The Support Vector Machine is by nature a binary classifier. Multiple label
problems are problems in which an input sample is allowed to belong to one
or more classes. A way to implement multi-label classes in support vector
machines is to build a one-against-all decision scheme where multiple SVMs
are trained to detect each of the available classes.
Currently this class supports only Kernel machines as the underlying classifiers.
If a Linear Support Vector Machine is needed, specify a Linear kernel in the
constructor at the moment of creation.
References:
-
http://courses.media.mit.edu/2006fall/mas622j/Projects/aisen-project/index.html
-
http://nlp.stanford.edu/IR-book/html/htmledition/multiclass-svms-1.html
The following example shows how to learn a linear, multi-label (one-vs-rest) support
vector machine using the algorithm.
The following example shows how to learn a non-linear, multi-label (one-vs-rest)
support vector machine using the kernel and the
algorithm.
Support vector machines can have their weights calibrated in order to produce probability
estimates (instead of simple class separation distances). The following example shows how
to use within
to generate a probabilistic SVM:
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
The number of inputs (length of the input vectors) accepted by the machine.
The kernel function to be used.
Linear Support Vector Machine (SVM).
This class implements a linear support vector machine classifier. For its kernel
counterpart, which can produce non-linear decision boundaries, please check
and .
Note: a linear SVM model can be converted to and
. This means that linear and logistic regressions
can be created using any of the highly optimized LIBLINEAR learning algorithms
such as , ,
and .
References:
-
http://en.wikipedia.org/wiki/Support_vector_machine
-
http://www.kernel-machines.org/
The first example shows how to learn a linear SVM. However, since the
problem being learned is not linearly separable, the classifier will
not be able to produce a perfect decision boundary.
The second example shows how to learn an SVM using a standard kernel
that operates on vectors of doubles. With kernels, it is possible to
produce non-linear boundaries that perfectly separate the data.
The third example shows how to learn an SVM using a Sparse kernel that
operates on sparse vectors.
Initializes a new instance of the class.
The number of inputs for this machine.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Performs an explicit conversion from to .
The linear regression to be converted.
The result of the conversion.
Performs an explicit conversion from to .
The linear regression to be converted.
The result of the conversion.
Performs an explicit conversion from to .
The logistic regression to be converted.
The result of the conversion.
Performs an explicit conversion from to .
The logistic regression to be converted.
The result of the conversion.
Creates a new linear
with the given set of linear .
The machine's linear coefficients.
The index of the intercept term in the given weights vector.
A whose linear coefficients
are defined by the given vector.
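The role of the intercept index can be illustrated with a small sketch that splits a flat weight vector into the coefficients and the intercept (threshold) term. The function name and the default index of zero are assumptions made for this illustration only.

```python
def split_weights(weights, intercept_index=0):
    """Splits a flat weight vector into (coefficients, intercept).

    The element at intercept_index is taken as the intercept; all other
    elements keep their relative order as the linear coefficients.
    """
    intercept = weights[intercept_index]
    coefficients = weights[:intercept_index] + weights[intercept_index + 1:]
    return coefficients, intercept


if __name__ == "__main__":
    print(split_weights([0.5, 2.0, -1.0]))
    print(split_weights([0.5, 2.0, -1.0], intercept_index=2))
```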
Sparse Kernel Support Vector Machine (kSVM)
The original optimal hyperplane algorithm (SVM) proposed by Vladimir Vapnik in 1963 was a
linear classifier. However, in 1992, Bernhard Boser, Isabelle Guyon and Vapnik suggested
a way to create non-linear classifiers by applying the kernel trick (originally proposed
by Aizerman et al.) to maximum-margin hyperplanes. The resulting algorithm is formally
similar, except that every dot product is replaced by a non-linear kernel function.
This allows the algorithm to fit the maximum-margin hyperplane in a transformed feature space.
The transformation may be non-linear and the transformed space high dimensional; thus though
the classifier is a hyperplane in the high-dimensional feature space, it may be non-linear in
the original input space.
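To make the effect of replacing the dot product with a kernel concrete, the sketch below evaluates a kernel machine's output as a weighted sum of kernel values plus a threshold. The support vectors, weights and Gaussian width are chosen by hand for the XOR problem (they are not learned, and none of the names come from the library), yet they are enough to separate data that no linear boundary in the input space can.

```python
import math


def gaussian(x, z, sigma_squared=0.5):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma_squared))


def score(x, support_vectors, weights, threshold, kernel):
    """Machine output: sum_i weight_i * k(sv_i, x) + threshold."""
    total = sum(w * kernel(sv, x) for sv, w in zip(support_vectors, weights))
    return total + threshold


# Hand-picked machine: every XOR point is a support vector weighted by its label.
XOR_POINTS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
XOR_LABELS = [-1.0, 1.0, 1.0, -1.0]


def decide(x):
    return score(x, XOR_POINTS, XOR_LABELS, 0.0, gaussian) > 0


if __name__ == "__main__":
    for point in XOR_POINTS:
        print(point, decide(point))
```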
The machines are also able to learn sequence classification problems in which the input vectors
can have arbitrary length. For an example on how to do that, please see the documentation page
for the DynamicTimeWarping kernel.
References:
-
http://en.wikipedia.org/wiki/Support_vector_machine
-
http://www.kernel-machines.org/
The first example shows how to learn an SVM using a
standard kernel that operates on vectors of doubles.
The second example shows how to learn an SVM using a
Sparse kernel that operates on sparse vectors.
Initializes a new instance of the class.
The number of inputs for this machine.
The kernel function to be used.
Converts a linear-kernel machine into an array of
linear coefficients. The first position in the array is the
threshold (bias) value. If this
machine is not linear, an exception will be thrown.
An array of linear coefficients representing this machine.
Thrown if the kernel function is not linear.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Sparse Kernel Support Vector Machine (kSVM)
The original optimal hyperplane algorithm (SVM) proposed by Vladimir Vapnik in 1963 was a
linear classifier. However, in 1992, Bernhard Boser, Isabelle Guyon and Vapnik suggested
a way to create non-linear classifiers by applying the kernel trick (originally proposed
by Aizerman et al.) to maximum-margin hyperplanes. The resulting algorithm is formally
similar, except that every dot product is replaced by a non-linear kernel function.
This allows the algorithm to fit the maximum-margin hyperplane in a transformed feature space.
The transformation may be non-linear and the transformed space high dimensional; thus though
the classifier is a hyperplane in the high-dimensional feature space, it may be non-linear in
the original input space.
The machines are also able to learn sequence classification problems in which the input vectors
can have arbitrary length. For an example on how to do that, please see the documentation page
for the DynamicTimeWarping kernel.
References:
- http://en.wikipedia.org/wiki/Support_vector_machine
- http://www.kernel-machines.org/
The first example shows how to learn an SVM using a
standard kernel that operates on vectors of doubles.
The second example shows how to learn an SVM using a
Sparse kernel that operates on sparse vectors.
Gets or sets the kernel used by this machine.
Gets whether this machine has been calibrated to
produce probabilistic outputs (through the Probability(TInput)
method).
Gets or sets the collection of support vectors used by this machine.
Gets or sets the collection of weights used by this machine.
Gets or sets the threshold (bias) term for this machine.
Initializes a new instance of the class.
The length of the input vectors expected by the machine.
The kernel function to be used.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the result will be stored,
avoiding unnecessary memory allocations.
System.Double[].
Predicts a class label vector for the given input vectors, returning the
log-likelihood that the input vector belongs to its predicted class.
The input vector.
An array where the log-likelihoods will be stored,
avoiding unnecessary memory allocations.
System.Double[].
If this machine has a linear kernel, compresses all
support vectors into a single parameter vector.
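With a linear kernel, k(x, y) is the plain dot product, so the weighted sum over support vectors collapses into one weight vector w = sum_i a_i * s_i. The following is a minimal Python sketch of that idea under this assumption; it is not the library's implementation, and all function names are hypothetical.

```python
def compress(support_vectors, weights):
    """Fold all support vectors into one weight vector: w = sum_i a_i * s_i."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for a, s in zip(weights, support_vectors):
        for j in range(dim):
            w[j] += a * s[j]
    return w


def score_expanded(x, support_vectors, weights, threshold):
    """Score using the full support-vector expansion with a linear kernel."""
    return threshold + sum(
        a * sum(si * xi for si, xi in zip(s, x))
        for a, s in zip(weights, support_vectors)
    )


def score_compressed(x, w, threshold):
    """Score using the single compressed weight vector."""
    return threshold + sum(wi * xi for wi, xi in zip(w, x))
```

Both scoring functions give the same value for any input, but the compressed form needs one dot product instead of one per support vector.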
Computes the given input to produce the corresponding output.
For a binary decision problem, the decision for the negative
or positive class is typically computed by taking the sign of
the machine's output.
An input vector.
The output of the machine. If this is a
probabilistic machine, the
output is the probability of the positive class. If this is
a standard machine, the output is the distance to the decision
hyperplane in feature space.
The decision label for the given input.
Computes the given input to produce the corresponding output.
For a binary decision problem, the decision for the negative
or positive class is typically computed by taking the sign of
the machine's output.
An input vector.
The output for the given input. In a typical classification
problem, the sign of this value should be considered as the class label.
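As a rough illustration of how such an output can be computed, the sketch below sums the kernel responses of the support vectors, weighted and shifted by the threshold, and takes the sign as the class. It is a generic Python sketch, not the library's code: the Gaussian kernel, the function names, and the choice to treat an output of exactly zero as the positive class are all assumptions.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def machine_output(x, support_vectors, weights, threshold, kernel=gaussian_kernel):
    """Weighted kernel sum over the support vectors plus the threshold (bias)."""
    return threshold + sum(w * kernel(s, x) for w, s in zip(weights, support_vectors))


def decide(x, support_vectors, weights, threshold, kernel=gaussian_kernel):
    """True (positive class) when the output is non-negative, False otherwise."""
    return machine_output(x, support_vectors, weights, threshold, kernel) >= 0
```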
Converts a -kernel
machine into an array of linear coefficients. The first position
in the array is the value.
An array of linear coefficients representing this machine.
Gets the number of inputs accepted by this machine.
If the number of inputs is zero, this means the machine
accepts an indefinite number of inputs. This is often the
case for kernel vector machines using a sequence kernel.
Obsolete.
Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance.
Performs an explicit conversion from to .
The linear Support Vector Machine to be converted.
The result of the conversion.
Performs an explicit conversion from to .
The linear Support Vector Machine to be converted.
The result of the conversion.
One-Vs-One construction for solving multi-class
classification using a set of binary classifiers.
The type for the binary classifier to be used.
The type for the classifier inputs. Default is double[].
Gets the pair of class indices handled by each inner binary classification model.
Gets the inner binary classification models.
Gets or sets the parallelization options for this algorithm.
Gets or sets a cancellation token that can be used
to cancel the algorithm while it is running.
Gets or sets the multi-class classification method to be
used when deciding for the class of a given input vector.
Default is .
Gets or sets whether to track the decision path associated
with each decision. The track will be available through the
method. Default is true.
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
A function to create the inner binary classifiers.
Gets the last decision path used during the last call to any of the
model evaluation (Decide, Distance, LogLikelihood, Probability) methods
in the current thread. This method is thread-safe and returns the value
obtained in the last call on the current thread.
Gets the last decision path without cloning.
Gets the inner binary classification model used to distinguish between
the given pair of classes.
The class index for the first class.
The class index for the second class.
A binary classifier that can distinguish between the given classes.
Gets or sets the inner binary classification model used
to distinguish between the given pair of classes.
The class index for the first class.
The class index for the second class.
A binary classifier that can distinguish between the given classes.
Gets an inner binary classification model inside this
classifier, together with the pair of classes that it has been designed to
distinguish.
The index of the model (up to ).
Gets the number of inner binary classification models used by
this instance. It should correspond to (c * (c - 1)) / 2
where c is the number of classes.
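The count (c * (c - 1)) / 2 is simply the number of unordered class pairs. The Python sketch below enumerates those pairs and combines the pairwise decisions with a simple voting rule, which is only one possible decision method. The names, the pair ordering, and the `binary_decide` callback are hypothetical and do not reflect the library's API.

```python
from itertools import combinations


def class_pairs(num_classes):
    """All unordered pairs of class indices, one per inner binary model."""
    return list(combinations(range(num_classes), 2))


def decide_by_voting(x, num_classes, binary_decide):
    """Each pairwise model votes for one of its two classes; most votes wins.

    binary_decide(i, j, x) is a hypothetical callback returning True when
    the model for the pair (i, j) prefers class i over class j.
    """
    votes = [0] * num_classes
    for i, j in class_pairs(num_classes):
        votes[i if binary_decide(i, j, x) else j] += 1
    # Ties are resolved in favor of the lowest class index.
    return max(range(num_classes), key=lambda c: votes[c])
```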
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The class label that best describes the input according
to this classifier.
Computes a class-label decision for a given input.
The input vector that should be classified into
one of the possible classes.
The location where to store the class-labels.
The class label that best describes the input according
to this classifier.
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the result will be stored,
avoiding unnecessary memory allocations.
Predicts a class label vector for the given input vector, returning the
log-likelihoods of the input vector belonging to each possible class.
The input vector.
An array where the log-likelihoods will be stored,
avoiding unnecessary memory allocations.
Computes the log-likelihood that the given input vector
belongs to the specified .
The input vector.
The index of the class whose score will be computed.
Returns an enumerator that iterates through all machines
contained inside this multi-class support vector machine.
Returns an enumerator that iterates through all machines
contained inside this multi-class support vector machine.
One-Vs-One construction for solving multi-class
classification using a set of binary classifiers.
The type for the binary classifier to be used.
Initializes a new instance of the class.
The number of classes in the multi-class classification problem.
A function to create the inner binary classifiers.
Obsolete. Please refer to instead.
Obsolete. Please use instead.
Gets the number B of bootstrap samplings
to be drawn from the population dataset.
Gets the total number of samples in the population dataset.
Gets the bootstrap samples drawn from
the population dataset as indices.
Gets or sets the model fitting function.
The fitting function should accept an array of integers containing the
indexes for the training samples, an array of integers containing the
indexes for the validation samples and should return information about
the model fitted using those two subsets of the available data.
Gets or sets a value indicating whether to use parallel
processing through the use of multiple threads or not.
Default is true.
true to use multiple threads; otherwise, false.
Creates a new Bootstrap estimation algorithm.
The size of the complete dataset.
The number B of bootstrap resamplings to perform.
Creates a new Bootstrap estimation algorithm.
The size of the complete dataset.
The number B of bootstrap resamplings to perform.
The number of samples in each subsample. Default
is to use the total number of samples in the population dataset.
Creates a new Bootstrap estimation algorithm.
The size of the complete dataset.
The indices of the bootstrap samplings.
Gets the indices for the training and validation
sets for the specified validation fold index.
The index of the validation fold.
The indices for the observations in the training set.
The indices for the observations in the validation set.
Computes the cross validation algorithm.
Gets the number of instances in training and validation
sets for the specified validation fold index.
The index of the bootstrap sample.
The number of instances in the training set.
The number of instances in the validation set.
Draws the bootstrap samples from the population.
The size of the samples to be drawn.
The number of samples to draw.
The size of the samples to be drawn.
The indices of the samples in the original set.
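Bootstrap resampling draws indices from the population with replacement. The Python sketch below illustrates this, and additionally assumes that the validation set of each resampling consists of the indices that were never drawn (the "out-of-bag" samples); that split rule and all names here are assumptions, not the library's documented behavior.

```python
import random


def draw_bootstrap(population_size, resamples, subsample_size=None, seed=None):
    """Draw index lists with replacement, one per bootstrap resampling."""
    rng = random.Random(seed)
    size = population_size if subsample_size is None else subsample_size
    return [[rng.randrange(population_size) for _ in range(size)]
            for _ in range(resamples)]


def split_indices(population_size, sample):
    """Training set: the drawn indices. Validation set: indices never drawn."""
    drawn = set(sample)
    validation = [i for i in range(population_size) if i not in drawn]
    return list(sample), validation
```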
Obsolete. Please refer to instead.
Gets the
object used to generate this result.
Gets the performance statistics for the training set.
Gets the performance statistics for the validation set.
Gets the 0.632 bootstrap estimate.
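In its usual textbook form, the 0.632 estimate blends the optimistic training value with the pessimistic validation value using fixed weights. A one-function Python sketch of that formula follows; how the library aggregates values across resamplings before applying it is not stated here, so the inputs are assumed to be already averaged.

```python
def bootstrap_632(training_value, validation_value):
    """0.632 bootstrap estimate: 0.368 * training + 0.632 * validation."""
    return 0.368 * training_value + 0.632 * validation_value
```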
Initializes a new instance of the class.
The that is creating this result.
The models created during the cross-validation runs.
Saves the result to a stream.
The stream to which the result is to be serialized.
Saves the result to a stream.
The stream to which the result is to be serialized.
Loads a result from a stream.
The stream from which the result is to be deserialized.
The deserialized result.
Loads a result from a stream.
The path to the file from which the result is to be deserialized.
The deserialized result.
Obsolete. Please refer to instead.
Gets the validation value for the model.
Gets the variance of the validation
value for the model, if available.
Gets the training value for the model.
Gets the variance of the training
value for the model, if available.
Gets or sets a tag for user-defined information.
The training value for the model.
The validation value for the model.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
k-Modes algorithm.
The k-Modes algorithm is a variant of the k-Means which instead of locating means attempts to locate
the modes of a set of points. As the algorithm does not require explicit numeric manipulation of the
input points (such as addition and division to compute the means), the algorithm can be used with
arbitrary (generic) data structures.
How to perform clustering with K-Modes.
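To make the idea concrete, here is a minimal, generic Python sketch of k-modes on tuples of categorical values. It assumes a Hamming distance and caller-supplied initial modes; it does not use or mirror the library's API, and all names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(points):
    """Most frequent value in each column of a non-empty list of tuples."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*points))


def k_modes(points, initial_modes, max_iterations=100):
    """Alternate assignment and mode update until the modes stop changing."""
    modes = list(initial_modes)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest mode.
        labels = [min(range(len(modes)), key=lambda k: hamming(p, modes[k]))
                  for p in points]
        # Update step: recompute each mode column by column.
        new_modes = []
        for k in range(len(modes)):
            members = [p for p, l in zip(points, labels) if l == k]
            # Keep the old mode if a cluster ends up empty.
            new_modes.append(column_modes(members) if members else modes[k])
        if new_modes == modes:
            break
        modes = new_modes
    return modes, labels
```

Note that no arithmetic is performed on the values themselves, only equality tests and counting, which is why the approach works for non-numeric data.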
Gets the clusters found by K-modes.
Gets the number of clusters.
Gets the dimensionality of the data space.
Gets or sets whether the clustering distortion error (the
average distance between all data points and the cluster
centroids) should be computed at the end of the algorithm.
The result will be stored in . Default is true.
Gets or sets the distance function used
as a distance metric between data points.
Gets or sets the maximum number of iterations to
be performed by the method. If set to zero, no
iteration limit will be imposed. Default is 0.
Gets or sets the relative convergence threshold
for stopping the algorithm. Default is 1e-5.
Gets the number of iterations performed in the
last call to this class' Compute methods.
Gets the cluster distortion error (the average distance
between data points and the cluster centroids) after the
last call to this class' Compute methods.
Gets or sets the strategy used to initialize the
centroids of the clustering algorithm. Default is
.
Initializes a new instance of the KModes algorithm.
The number of clusters into which the input data is divided.
The distance function to use. Default is to
use the distance.
Initializes a new instance of the KModes algorithm.
The number of clusters into which the input data is divided.
The distance function to use. Default is to
use the distance.
Divides the input data into K clusters.
The data on which to run the algorithm.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data .
points
Not enough points. There should be more points than the number K of clusters.
Determines if the algorithm has converged by comparing the
centroids between two consecutive iterations.
The previous centroids.
The new centroids.
Returns true if all centroids had a percentage change
less than the convergence threshold. Returns false otherwise.
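A relative-change test of this kind can be sketched as follows in Python, assuming numeric centroid coordinates. The handling of a zero previous coordinate (falling back to the absolute change) is an assumption made here to avoid division by zero, and the function name is hypothetical.

```python
def has_converged(old_centroids, new_centroids, threshold=1e-5):
    """True if every coordinate changed by a relative amount below threshold."""
    for old, new in zip(old_centroids, new_centroids):
        for o, n in zip(old, new):
            # Assumption: use the absolute change when the old value is zero.
            denom = abs(o) if o != 0 else 1.0
            if abs(o - n) / denom >= threshold:
                return False
    return True
```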
k-Modes algorithm.
The k-Modes algorithm is a variant of the k-Means which instead of
locating means attempts to locate the modes of a set of points. As
the algorithm does not require explicit numeric manipulation of the
input points (such as addition and division to compute the means),
the algorithm can be used with arbitrary (generic) data structures.
This is the specialized, non-generic version of the K-Modes algorithm
that is set to work on arrays.
Initializes a new instance of the K-Modes algorithm.
The number of clusters into which the input data is divided.
Modes for storing models.
Stores a model on each iteration. This is the most
intensive method, but enables a quick restoration
of any point on the learning history.
Stores only the model which had shown the minimum
validation value in the training history. All other
models are discarded and only their validation and
training values will be registered.
Stores only the model which had shown the maximum
validation value in the training history. All other
models are discarded and only their validation and
training values will be registered.
Early stopping training procedure.
The early stopping training procedure monitors a validation set
during training to determine when a learning algorithm has stopped
learning and started to overfit the data. This class keeps a history
of training and validation errors and will keep the best model found
during learning.
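The procedure can be sketched in a few lines of Python. This is a simplified illustration, not the library's implementation: it stops at the first iteration whose validation error fails to improve by more than the tolerance, and the `step` callback and return shape are hypothetical.

```python
def early_stopping(step, max_iterations=0, tolerance=1e-5):
    """Run step() until the validation error stops improving.

    step() is a hypothetical callback that runs one training iteration and
    returns (training_error, validation_error, model_snapshot).
    max_iterations == 0 means no iteration limit.
    Returns (converged, best_model, history).
    """
    history = []
    best_error, best_model = float("inf"), None
    iteration = 0
    while max_iterations == 0 or iteration < max_iterations:
        train_err, valid_err, model = step()
        history.append((train_err, valid_err))
        iteration += 1
        if valid_err < best_error - tolerance:
            best_error, best_model = valid_err, model
        else:
            # Validation error no longer improves: treat as converged.
            return True, best_model, history
    return False, best_model, history
```

Keeping only the best snapshot corresponds to storing the model with the minimum validation value; storing every snapshot would allow restoring any point in the history at a higher memory cost.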
The type of the model to be trained.
Gets or sets the maximum number of iterations
performed by the early stopping algorithm. Default
is 0 (run until convergence).
Gets or sets the minimum tolerance value used
to determine convergence. Default is 1e-5.
Gets the history of training and validation values
registered at each iteration of the learning algorithm.
Gets the model with minimum validation error found during learning.
Gets the model with maximum validation error found during learning.
Gets or sets the storage policy for the procedure.
Gets or sets the iteration function for the procedure. This
function will be called on each iteration and should run one
iteration of the learning algorithm for the given model.
Creates a new early stopping procedure object.
Starts the model training, calling the
on each iteration.
True if the model training has converged, false otherwise.
Range of parameters to be tested in a grid search.
Gets or sets the name of the parameter to which the range belongs.
Gets or sets the range of values that should be tested for this parameter.
Constructs a new GridSearchRange object.
The name for this parameter.
The starting value for this range.
The end value for this range.
The step size for this range.
Constructs a new GridSearchRange object.
The name for this parameter.
The array of values to try.
Gets the array of GridSearchParameters to try.
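Conceptually, a range defined by a start, an end, and a step expands into a list of values, and a grid search then tries every combination of values across all parameters. The Python sketch below illustrates this; treating the end value as inclusive is an assumption, and the function names and dictionary layout are hypothetical rather than the library's API.

```python
import math
from itertools import product


def range_values(start, end, step):
    """Values start, start + step, ... not exceeding end (end assumed inclusive)."""
    # The small epsilon guards against floating-point round-off in the division.
    count = int(math.floor((end - start) / step + 1e-9)) + 1
    return [start + i * step for i in range(count)]


def parameter_grid(ranges):
    """Cartesian product of named value lists, one dict per combination."""
    names = list(ranges)
    return [dict(zip(names, combo))
            for combo in product(*(ranges[n] for n in names))]
```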
GridSearchRange collection.
Constructs a new collection of GridSearchRange objects.
Returns the identifying value for an item on this collection.
Adds a parameter range to the end of the GridSearchRangeCollection.
Contains the name and value of a parameter that should be used during fitting.
Gets the name of the parameter.
Gets the value of the parameter.
Constructs a new parameter.
The name for the parameter.
The value for the parameter.
Determines whether the specified object is equal
to the current GridSearchParameter object.
Returns the hash code for this GridSearchParameter.
Compares two GridSearchParameters for equality.
Compares two GridSearchParameters for inequality.
Returns a that represents this instance.
A that represents this instance.
Performs an implicit conversion from to .
The parameter to be converted.
The value of the parameter's .
Grid search parameter collection.
Constructs a new collection of GridSearchParameter objects.
Constructs a new collection of GridSearchParameter objects.
Returns the identifying value for an item on this collection.
Obsolete. Please refer to instead.
Gets the model.
Gets the validation value for the model.
Gets the variance of the validation
value for the model, if available.
Gets the training value for the model.
Gets the variance of the training
value for the model, if available.
Gets or sets a tag for user-defined information.
Creates a new Cross-Validation Values class.
The training value for the model.
The validation value for the model.
Creates a new Cross-Validation Values class.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The validation value for the model.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
Obsolete. Please refer to instead.
Gets the
object used to generate this result.
Gets the performance statistics for the training set.
Gets the performance statistics for the validation set.
Gets the models created for each fold of the cross validation.
Gets or sets a tag for user-defined information.
Initializes a new instance of the class.
The that is creating this result.
The models created during the cross-validation runs.
Saves the result to a stream.
The stream to which the result is to be serialized.
Saves the result to a stream.
The stream to which the result is to be serialized.
Loads a result from a stream.
The stream from which the result is to be deserialized.
The deserialized result.
Loads a result from a stream.
The path to the file from which the result is to be deserialized.
The deserialized result.
Obsolete. Please refer to instead.
Creates a new Cross-Validation Values class.
The training value for the model.
The validation value for the model.
Creates a new Cross-Validation Values class.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The validation value for the model.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The validation value for the model.
The variance of the training values.
The variance of the validation values.
Creates a new Cross-Validation Values class.
The fitted model.
The training value for the model.
The variance of the training values.
Summary statistics for a cross-validation trial.
Gets the values acquired during the cross-validation.
Most often those will be the errors for each folding.
Gets the variance for each value acquired during the cross-validation.
Most often those will be the error variance for each folding.
Gets the number of samples used to compute the variance
of the values acquired during the cross-validation.
Gets the mean of the performance statistics.
Gets the variance of the performance statistics.
Gets the standard deviation of the performance statistics.
Gets the pooled variance of the performance statistics.
Gets the pooled standard deviation of the performance statistics.
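The pooled variance combines per-fold variances by weighting each one with its degrees of freedom. The Python sketch below uses the standard definition, sum((n_i - 1) * var_i) / sum(n_i - 1); whether the library applies exactly this weighting is an assumption, and the function names are hypothetical.

```python
import math


def pooled_variance(sizes, variances):
    """Pooled variance: sum((n_i - 1) * var_i) / sum(n_i - 1)."""
    numerator = sum((n - 1) * v for n, v in zip(sizes, variances))
    denominator = sum(n - 1 for n in sizes)
    return numerator / denominator


def pooled_standard_deviation(sizes, variances):
    """Square root of the pooled variance."""
    return math.sqrt(pooled_variance(sizes, variances))
```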
Gets or sets a tag for user-defined information.
Create a new cross-validation statistics class.
The number of samples used to compute the statistics.
The performance statistics gathered during the run.
Create a new cross-validation statistics class.
The number of samples used to compute the statistics.
The performance statistics gathered during the run.
The variance of the statistics gathered during the run, if available.
k-Fold cross-validation. Please only use the static methods contained in this class,
the rest are marked as obsolete.
Cross-validation is a technique for estimating the performance of a predictive
model. It can be used to measure how the results of a statistical analysis will
generalize to an independent data set. It is mainly used in settings where the
goal is prediction, and one wants to estimate how accurately a predictive model
will perform in practice.
One round of cross-validation involves partitioning a sample of data into
complementary subsets, performing the analysis on one subset (called the
training set), and validating the analysis on the other subset (called the
validation set or testing set). To reduce variability, multiple rounds of
cross-validation are performed using different partitions, and the validation
results are averaged over the rounds.
References:
- Wikipedia, The Free Encyclopedia. Cross-validation (statistics). Available on:
  http://en.wikipedia.org/wiki/Cross-validation_(statistics)
Obsolete. Please use instead.
Obsolete. Please use instead.
Obsolete. Please use instead.
Obsolete. Please use instead.
Obsolete. Please use instead.
Obsolete. Please use Classes.Random(labels, classes, folds) instead.
Creates a new algorithm.
The type of the machine learning model whose parameters should be searched.
The type of the learning algorithm used to learn .
The type of the input data. Default is double[].
The type of the output data. Default is int.
The number of folds in the k-fold cross-validation. Default is 10.
A function that can create a given training parameters.
A function that can measure how far model predictions are from the expected ground-truth.
A function that specifies how to create a new model using the teacher learning algorithm.
The input data to be used during training.
The output data to be used during training.
A grid-search algorithm that has been configured with the given parameters.
Obsolete. Please use instead.
Obsolete. Please use instead.
Gets or sets the model fitting function.
The fitting function should accept an array of integers containing the
indexes for the training samples, an array of integers containing the
indexes for the validation samples and should return information about
the model fitted using those two subsets of the available data.
Gets the array of data set indexes contained in each fold.
Gets the array of fold indices for each point in the data set.
Gets the number of folds in the k-fold cross validation.
Gets the total number of data samples in the data set.
Gets or sets a value indicating whether to use parallel
processing through the use of multiple threads or not.
Default is true.
true to use multiple threads; otherwise, false.
Creates a new k-fold cross-validation algorithm.
The total number of samples in the entire dataset.
Creates a new k-fold cross-validation algorithm.
The total number of samples in the entire dataset.
The number of folds, usually denoted as k (default is 10).
Creates a new k-fold cross-validation algorithm.
A vector containing class labels.
The number of different classes in .
The number of folds, usually denoted as k (default is 10).
Creates a new k-fold cross-validation algorithm.
An already created set of fold indices for each sample in a dataset.
The total number of folds referenced in the parameter.
Gets the indices for the training and validation
sets for the specified validation fold index.
The index of the validation fold.
The indices for the observations in the training set.
The indices for the observations in the validation set.
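The fold bookkeeping behind k-fold cross-validation can be sketched in Python as below: every sample receives a fold index, and for a given validation fold the remaining samples form the training set. This is a generic illustration with hypothetical names, and the `fit` callback stands in for whatever model fitting function the caller supplies.

```python
import random


def assign_folds(num_samples, k=10, seed=None):
    """Return a fold index (0..k-1) for each sample, assigned at random."""
    folds = [i % k for i in range(num_samples)]
    random.Random(seed).shuffle(folds)
    return folds


def partition(folds, validation_fold):
    """Indices of the training and validation sets for one validation fold."""
    training = [i for i, f in enumerate(folds) if f != validation_fold]
    validation = [i for i, f in enumerate(folds) if f == validation_fold]
    return training, validation


def cross_validate(num_samples, k, fit, seed=None):
    """Average the value returned by fit over all k folds.

    fit(training_indices, validation_indices) is a hypothetical callback
    that trains on the first set and returns an error on the second.
    """
    folds = assign_folds(num_samples, k, seed)
    errors = [fit(*partition(folds, f)) for f in range(k)]
    return sum(errors) / k
```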
Gets the number of instances in training and validation
sets for the specified validation fold index.
The index of the validation fold.
The number of instances in the training set.
The number of instances in the validation set.
Computes the cross validation algorithm.
Gaussian mixture model clustering.
Gaussian Mixture Models are among the most widely used model-based
clustering methods. This specialized class provides a wrapper
around the
Mixture<NormalDistribution> distribution and provides
mixture initialization using the K-Means clustering algorithm.
Gets or sets the maximum number of iterations to
be performed by the method. If set to zero, no
iteration limit will be imposed. Default is 0.
Gets or sets the convergence criterion for the
Expectation-Maximization algorithm. Default is 1e-3.
The convergence threshold.
Gets or sets whether cluster labels should be computed
at the end of the learning iteration. Setting to False
might save a few computations in case they are not necessary.
Gets or sets whether the log-likelihood should be computed
at the end of the learning iteration. Setting to False
might save a few computations in case they are not necessary.
Gets the log-likelihood of the model at the last iteration.
Gets or sets how many random initializations to try.
Default is 3.
Gets how many iterations have been performed in the last call
to .
Gets or sets whether to make computations using the
log-domain. This might improve accuracy on large datasets.
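The benefit of working in the log domain is that very small densities no longer underflow to zero before they are combined. The Python sketch below shows the usual log-sum-exp trick on a one-dimensional Gaussian mixture; it is a generic illustration with hypothetical names and says nothing about how the library implements this option.

```python
import math


def log_sum_exp(values):
    """log(sum(exp(v))) computed stably by factoring out the maximum."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_log_density(x, weights, means, variances):
    """Log-density of a 1-D Gaussian mixture, computed in the log domain."""
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - (x - mu) ** 2 / (2.0 * v)
        for w, mu, v in zip(weights, means, variances)
    ]
    return log_sum_exp(terms)
```

For a point far from every component, the direct computation would yield a density of exactly zero and an undefined logarithm, while the log-domain version still returns a finite value.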
Gets or sets the fitting options for the component
Gaussian distributions of the mixture model.
The fitting options for inner Gaussian distributions.
Gets the Gaussian components of the mixture model.
Initializes a new instance of the class.
The number of clusters in the clustering problem. This will be
used to set the number of components in the mixture model.
Initializes a new instance of the class.
The initial solution as a K-Means clustering.
Initializes a new instance of the class.
The initial solution as a mixture of normal distributions.
Initializes a new instance of the class.
The initial solution as a mixture of normal distributions.
Divides the input data into K clusters modeling each
cluster as a multivariate Gaussian distribution.
Divides the input data into K clusters modeling each
cluster as a multivariate Gaussian distribution.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data .
Initializes the model with initial values obtained
through a run of the K-Means clustering algorithm.
Initializes the model with initial values obtained
through a run of the K-Means clustering algorithm.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Initializes the model with initial values.
Gets a copy of the mixture distribution modeled by this Gaussian Mixture Model.
Divides the input data into K clusters modeling each
cluster as a multivariate Gaussian distribution.
Divides the input data into K clusters modeling each
cluster as a multivariate Gaussian distribution.
Gets the collection of clusters currently modeled by the
clustering algorithm.
Options for Gaussian Mixture Model fitting.
This class provides different options that can be passed to a
object when calling its
method.
Gets or sets the convergence criterion for the
Expectation-Maximization algorithm. Default is 1e-3.
The convergence threshold.
Gets or sets the maximum number of iterations
to be performed by the Expectation-Maximization
algorithm. Default is zero (iterate until convergence).
Gets or sets whether to make computations using the
log-domain. This might improve accuracy on large datasets.
Gets or sets the sample weights. If set to null,
all samples are assumed to have equal weight. Default
is null.
Gets or sets the fitting options for the component
Gaussian distributions of the mixture model.
The fitting options for inner Gaussian distributions.
Gets or sets parallelization options.
Initializes a new instance of the class.
K-Nearest Neighbor (k-NN) algorithm.
The k-nearest neighbor algorithm (k-NN) is a method for classifying objects
based on closest training examples in the feature space. It is amongst the simplest
of all machine learning algorithms: an object is classified by a majority vote of
its neighbors, with the object being assigned to the class most common amongst its
k nearest neighbors (k is a positive integer, typically small).
If k = 1, then the object is simply assigned to the class of its nearest neighbor.
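The majority vote described above can be sketched in a few lines of Python. This is a brute-force illustration with hypothetical names, not the library's implementation; ties between classes are resolved in favor of the class whose neighbor is encountered first, which is an arbitrary choice made here.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_decide(k, inputs, labels, point, distance=euclidean):
    """Label of `point` by majority vote among its k nearest training points."""
    nearest = sorted(range(len(inputs)),
                     key=lambda i: distance(inputs[i], point))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Because the distance is passed in as a function, the same sketch works with any metric and any data type for which a distance can be defined.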
References:
- Wikipedia contributors. "K-nearest neighbor algorithm." Wikipedia, The
  Free Encyclopedia. Wikipedia, The Free Encyclopedia, 10 Oct. 2012. Web.
  9 Nov. 2012. http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm
The first example shows how to create and use a k-Nearest Neighbor algorithm to classify
a set of numeric vectors in a multi-class decision problem involving 3 classes. It also shows
how to compute class decisions for a new sample and how to measure the performance of a classifier.
The second example shows how to use a different distance metric when computing k-NN:
The k-Nearest neighbor algorithm implementation in the framework can also be used with any instance
data type. For such cases, the framework offers a generic version of the classifier. The third example
shows how to use the generic kNN classifier to perform the direct classification of actual text samples:
Creates a new .
Creates a new .
Creates a new .
Computes a numerical score measuring the association between
the given vector and each class.
The input vector.
An array where the result will be stored,
avoiding unnecessary memory allocations.
System.Double[].
Gets the top points that are the closest
to a given reference point.
The query point whose neighbors will be found.
The label for each neighboring point.
An array containing the top points that are
at the closest possible distance to .
Creates a new algorithm from an existing
. The tree must have been created using the input
points and the point's class labels as the associated node information.
The containing the input points and their integer labels.
The number of nearest neighbors to be used in the decision.
The number of classes in the classification problem.
The input data points.
The associated labels for the input points.
A algorithm initialized from the tree.
Learns a model that can map the given inputs to the given outputs.
The model inputs.
The desired outputs associated with each inputs.
The weight of importance for each input-output pair (if supported by the learning algorithm).
A model that has learned how to produce given .
Creates a new .
The number of nearest neighbors to be used in the decision.
The input data points.
The associated labels for the input points.
Creates a new .
The number of nearest neighbors to be used in the decision.
The number of classes in the classification problem.
The input data points.
The associated labels for the input points.
Creates a new .
The number of nearest neighbors to be used in the decision.
The number of classes in the classification problem.
The input data points.
The associated labels for the input points.
The distance measure to use.
Gets the number of class labels
handled by this classifier.
Computes the most likely label of a new given point.
A point to be classified.
The most likely label for the given point.
Computes the most likely label of a new given point.
A point to be classified.
A value between 0 and 1 giving
the strength of the classification in relation to the
other classes.
The most likely label for the given point.
Computes the most likely label of a new given point.
A point to be classified.
The distance score for each possible class.
The most likely label for the given point.
Obsolete. Please refer to instead.
Creates a new split-set validation algorithm.
The total number of available samples.
The desired proportion of samples in the training
set in comparison with the testing set.
Creates a new split-set validation algorithm.
The total number of available samples.
The desired proportion of samples in the training
set in comparison with the testing set.
The output labels to be balanced between the sets.
Summary statistics for a Split-set validation trial.
Gets the model created with the
Gets the values acquired during the cross-validation.
Most often those will be the errors for each folding.
Gets the variance for each value acquired during the cross-validation.
Most often those will be the error variance for each folding.
Gets the number of samples used to compute the variance
of the values acquired during the validation.
Gets the standard deviation of the performance statistic.
Gets or sets a tag for user-defined information.
Create a new split-set statistics class.
The generated model.
The number of samples used to compute the statistic.
The performance statistic gathered during the run.
The variance of the performance statistic during the run.
Obsolete. Please refer to instead.
Create a new split-set statistics class.
The generated model.
The number of samples used to compute the statistic.
The performance statistic gathered during the run.
The variance of the performance statistic during the run.
Create a new split-set statistics class.
The generated model.
The number of samples used to compute the statistic.
The performance statistic gathered during the run.
The variance of the performance statistic during the run.
Obsolete. Please refer to instead.
Gets the
object used to generate this result.
Gets the performance statistics for the training set.
Gets the performance statistics for the validation set.
Gets or sets a tag for user-defined information.
Initializes a new instance of the class.
The that is creating this result.
The training set statistics.
The testing set statistics.
Obsolete. Please use instead.
Obsolete. Please use instead.
Obsolete. Please use instead.
Gets the group labels assigned to each of the data samples.
Gets the desired proportion of cases in
the training set in comparison to the
testing set.
Gets or sets a value indicating whether the prevalence of
an output label should be balanced between training and
testing sets.
true if this instance is stratified; otherwise, false.
Gets the indices of elements in the validation set.
Gets the indices of elements in the training set.
Gets or sets the model fitting function.
Gets or sets the performance estimation function.
Creates a new split-set validation algorithm.
The total number of available samples.
The desired proportion of samples in the training
set in comparison with the testing set.
Creates a new split-set validation algorithm.
The total number of available samples.
The desired proportion of samples in the training
set in comparison with the testing set.
The output labels to be balanced between the sets.
Computes the split-set validation algorithm.
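How a split-set procedure divides sample indices by a training proportion can be sketched as follows. This is only an illustration under stated assumptions: SplitSetSketch and Split are hypothetical names, the split is a plain random shuffle with no stratification, and the rounding rule is a choice made for the example.

```csharp
using System;
using System.Linq;

public static class SplitSetSketch
{
    // Shuffles the sample indices 0..size-1 and assigns the first
    // round(size * proportion) of them to the training set.
    public static void Split(int size, double proportion, int seed,
        out int[] training, out int[] validation)
    {
        var random = new Random(seed);
        int[] indices = Enumerable.Range(0, size).ToArray();

        // Fisher-Yates shuffle
        for (int i = size - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }

        int trainCount = (int)Math.Round(size * proportion);
        training = indices.Take(trainCount).ToArray();
        validation = indices.Skip(trainCount).ToArray();
    }
}
```

With 10 samples and a proportion of 0.8, the sketch yields 8 training indices and 2 validation indices, and every index appears in exactly one of the two sets.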
Delegate for grid search fitting functions.
The type of the model to fit.
The collection of parameters to be used in the fitting process.
The error (or any other performance measure) returned by the model.
The model fitted to the data using the given parameters.
Grid search procedure for automatic parameter tuning.
Grid search looks for the combination of parameter values, across a
range of candidate values for each parameter, that produces the
best-fitting model. If there are two parameters, each with 10 possible
values, grid search exhaustively evaluates the model on every combination
of values, resulting in 100 model fits.
The type of the model to be tuned.
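The exhaustive enumeration described above can be sketched independently of any model type. The code below is a minimal illustration, not the GridSearch class itself: GridSketch, Combinations and Search are hypothetical names, parameters are reduced to arrays of doubles, and every range is assumed to contain at least one value.

```csharp
using System;
using System.Collections.Generic;

public static class GridSketch
{
    // Enumerates every combination that takes one value from each
    // range (the Cartesian product), in odometer order.
    public static IEnumerable<double[]> Combinations(double[][] ranges)
    {
        int[] index = new int[ranges.Length];
        while (true)
        {
            double[] current = new double[ranges.Length];
            for (int i = 0; i < ranges.Length; i++)
                current[i] = ranges[i][index[i]];
            yield return current;

            // Advance the rightmost index, carrying over to the left
            int p = ranges.Length - 1;
            while (p >= 0 && ++index[p] == ranges[p].Length)
                index[p--] = 0;
            if (p < 0)
                yield break;
        }
    }

    // Evaluates the error function on each combination and keeps the lowest.
    public static double[] Search(double[][] ranges, Func<double[], double> error, out double minError)
    {
        double[] best = null;
        minError = double.PositiveInfinity;
        foreach (double[] candidate in Combinations(ranges))
        {
            double e = error(candidate);
            if (e < minError)
            {
                minError = e;
                best = candidate;
            }
        }
        return best;
    }
}
```

Two ranges of 10 values each produce 100 candidates, which is why the cost of a grid search grows multiplicatively with each added parameter.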
How to fit a Kernel Support Vector Machine using Grid Search.
// Example binary data
double[][] inputs =
{
new double[] { -1, -1 },
new double[] { -1, 1 },
new double[] { 1, -1 },
new double[] { 1, 1 }
};
int[] xor = // xor labels
{
-1, 1, 1, -1
};
// Declare the parameters and ranges to be searched
GridSearchRange[] ranges =
{
new GridSearchRange("complexity", new double[] { 0.00000001, 5.20, 0.30, 0.50 } ),
new GridSearchRange("degree", new double[] { 1, 10, 2, 3, 4, 5 } ),
new GridSearchRange("constant", new double[] { 0, 1, 2 } )
};
// Instantiate a new Grid Search algorithm for Kernel Support Vector Machines
var gridsearch = new GridSearch<KernelSupportVectorMachine>(ranges);
// Set the fitting function for the algorithm
gridsearch.Fitting = delegate(GridSearchParameterCollection parameters, out double error)
{
// The parameters to be tried will be passed as a function parameter.
int degree = (int)parameters["degree"].Value;
double constant = parameters["constant"].Value;
double complexity = parameters["complexity"].Value;
// Use the parameters to build the SVM model
Polynomial kernel = new Polynomial(degree, constant);
KernelSupportVectorMachine ksvm = new KernelSupportVectorMachine(kernel, 2);
// Create a new learning algorithm for SVMs
SequentialMinimalOptimization smo = new SequentialMinimalOptimization(ksvm, inputs, xor);
smo.Complexity = complexity;
// Measure the model performance to return as an out parameter
error = smo.Run();
return ksvm; // Return the current model
};
// Declare some out variables to pass to the grid search algorithm
GridSearchParameterCollection bestParameters; double minError;
// Compute the grid search to find the best Support Vector Machine
KernelSupportVectorMachine bestModel = gridsearch.Compute(out bestParameters, out minError);
Gets or sets the parallelization options for this algorithm.
Gets or sets a cancellation token that can be used
to cancel the algorithm while it is running.
Constructs a new Grid search algorithm.
The range of parameters to search.
A function that fits a model using the given parameters.
The range of parameters to consider during search.
Searches for the best combination of parameters that results in the most accurate model.
The best combination of parameters found by the grid search.
The minimum error of the best model found by the grid search.
The best model found during the grid search.
Searches for the best combination of parameters that results in the most accurate model.
The results found during the grid search.
Contains results from the grid-search procedure.
The type of the model to be tuned.
Gets all combinations of parameters tried.
Gets all models created during the search.
Gets the error for each of the created models.
Gets the index of the best found model
in the collection.
Gets the best model found.
Gets the best parameter combination found.
Gets the minimum error found.
Gets the size of the grid used in the grid-search.
Initializes a new instance of the class.
Initializes a new instance of the class.
Initialization schemes for clustering algorithms.
Do not perform initialization.
Randomly sample points to become centroids.
Use the kmeans++ seeding algorithm for generating initial centroids.
Use the PAM BUILD algorithm for generating initial centroids.
Lloyd's k-Means clustering algorithm.
In statistics and machine learning, k-means clustering is a method
of cluster analysis which aims to partition n observations into k
clusters in which each observation belongs to the cluster with the
nearest mean.
It is similar to the expectation-maximization algorithm for mixtures
of Gaussians in that they both attempt to find the centers of natural
clusters in the data as well as in the iterative refinement approach
employed by both algorithms.
The algorithm is composed of the following steps:
1. Place K points into the space represented by the objects that are
being clustered. These points represent the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions
of the K centroids.
4. Repeat steps 2 and 3 until the centroids no longer move. This
produces a separation of the objects into groups from which the
metric to be minimized can be calculated.
This particular implementation uses the squared Euclidean distance
as a similarity measure in order to form clusters.
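The steps listed above can be sketched in a few lines of self-contained code. This is an illustration of Lloyd's iteration only, not the KMeans class: KMeansSketch and Cluster are hypothetical names, the initial centroids are supplied by the caller (step 1), and empty clusters are simply left where they are.

```csharp
using System;

public static class KMeansSketch
{
    static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return sum;
    }

    // Runs Lloyd's iterations from the given initial centroids (updated in
    // place) and returns the cluster index assigned to each point.
    public static int[] Cluster(double[][] points, double[][] centroids, int maxIterations)
    {
        int k = centroids.Length, dim = points[0].Length;
        int[] labels = new int[points.Length];

        for (int iter = 0; iter < maxIterations; iter++)
        {
            // Step 2: assign each point to the closest centroid
            for (int i = 0; i < points.Length; i++)
            {
                int best = 0;
                for (int c = 1; c < k; c++)
                    if (SquaredDistance(points[i], centroids[c]) <
                        SquaredDistance(points[i], centroids[best]))
                        best = c;
                labels[i] = best;
            }

            // Step 3: recompute each centroid as the mean of its points
            double[][] sums = new double[k][];
            int[] counts = new int[k];
            for (int c = 0; c < k; c++)
                sums[c] = new double[dim];
            for (int i = 0; i < points.Length; i++)
            {
                counts[labels[i]]++;
                for (int d = 0; d < dim; d++)
                    sums[labels[i]][d] += points[i][d];
            }

            // Step 4: stop once no centroid moves
            bool moved = false;
            for (int c = 0; c < k; c++)
            {
                if (counts[c] == 0) continue; // leave empty clusters untouched
                for (int d = 0; d < dim; d++)
                {
                    double mean = sums[c][d] / counts[c];
                    if (mean != centroids[c][d]) moved = true;
                    centroids[c][d] = mean;
                }
            }
            if (!moved) break;
        }
        return labels;
    }
}
```

The sketch stops on exact equality of centroids; a practical implementation would typically stop on a relative tolerance instead.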
References:
- Wikipedia, The Free Encyclopedia. K-means clustering. Available at: http://en.wikipedia.org/wiki/K-means_clustering
- Matteo Matteucci. A Tutorial on Clustering Algorithms. Available at: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html
How to perform clustering with K-Means.
How to perform clustering with K-Means applying different weights to different columns (dimensions) in the data.
How to perform clustering with K-Means with mixed discrete, continuous and categorical data.
The following example demonstrates how to use the K-Means algorithm for color clustering. It is the same code which can be
found in the
color clustering sample application.
The original image is shown below:
The resulting image will be:
Gets the clusters found by K-means.
Gets or sets the cluster centroids.
Gets the number of clusters.
Gets the dimensionality of the data space.
Gets or sets the distance function used
as a distance metric between data points.
Gets or sets whether covariance matrices for the clusters should
be computed at the end of an iteration. Default is true.
Gets or sets whether the clustering distortion error (the
average distance between all data points and the cluster
centroids) should be computed at the end of the algorithm.
The result will be stored in the Error property. Default is true.
Gets or sets the maximum number of iterations to
be performed by the method. If set to zero, no
iteration limit will be imposed. Default is 0.
Gets or sets the relative convergence threshold
for stopping the algorithm. Default is 1e-5.
Gets the number of iterations performed in the
last call to this class' Compute methods.
Gets the cluster distortion error after the
last call to this class' Compute methods.
Gets or sets the strategy used to initialize the
centroids of the clustering algorithm. Default is
KMeansPlusPlus.
Initializes a new instance of the KMeans algorithm.
The number of clusters to divide the input data into.
The distance function to use. Default is to
use the squared Euclidean distance.
Initializes a new instance of the K-Means algorithm
The number of clusters to divide the input data into.
Initializes a new instance of the KMeans algorithm
The number of clusters to divide the input data into.
The distance function to use. Default is to use the
squared Euclidean distance.
Randomizes the clusters inside a dataset.
The data used to randomize the initial clusters.
Learns a model that can map the given inputs to the desired outputs.
The model inputs.
The weight of importance for each input sample.
A model that has learned how to produce suitable outputs
given the input data.
Divides the input data into K clusters.
The data on which to run the algorithm.
Divides the input data into K clusters.
The data on which to run the algorithm.
The weight associated with each data point.
Computes the information about each cluster (covariance, proportions and error).
The data points.
Computes the information about each cluster (covariance, proportions and error).
The data points.
The assigned labels.
Divides the input data into K clusters.
The data on which to run the algorithm.
The weight to consider for each data sample. This is used in weighted K-Means.
The total sum of the weights.
Determines if the algorithm has converged by comparing the
centroids between two consecutive iterations.
The previous centroids.
The new centroids.
Returns true if all centroids had a percentage change
less than the convergence threshold. Returns false otherwise.
Divides the input data into K clusters.
The data on which to run the algorithm.
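A convergence test of this kind can be sketched as a comparison of each centroid coordinate between two iterations. The code below is an assumption-laden illustration rather than the library's routine: ConvergenceSketch and Converged are hypothetical names, and the fallback to an absolute change when the previous coordinate is zero is a choice made here to avoid dividing by zero.

```csharp
using System;

public static class ConvergenceSketch
{
    // Returns true when every coordinate of every centroid changed by
    // less than the given relative tolerance between two iterations.
    public static bool Converged(double[][] previous, double[][] current, double tolerance)
    {
        for (int i = 0; i < previous.Length; i++)
        {
            for (int j = 0; j < previous[i].Length; j++)
            {
                double change = Math.Abs(current[i][j] - previous[i][j]);

                // Relative change; falls back to the absolute change
                // when the previous coordinate is exactly zero
                if (previous[i][j] != 0)
                    change /= Math.Abs(previous[i][j]);

                if (change >= tolerance)
                    return false;
            }
        }
        return true;
    }
}
```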
The average square distance from the
data points to the clusters' centroids.
QLearning learning algorithm.
The class provides an implementation of the Q-Learning algorithm, known as
off-policy Temporal Difference control.
The following example shows how to learn a model using reinforcement learning through the
Q-learning algorithm. The following code has been inherited from the AForge.NET Framework
and has not been modified since. If you have ideas on how to improve its
interface, please share them in the project's issue tracker at
https://github.com/accord-net/framework/issues.
If your ideas are feasible and encouraging enough, you can be named an
official contributor of the project. If you would like, you could opt to "inherit" the reinforcement learning
portion of the project, so that you would be free to commit to, modify and, more importantly, author
those modules directly from your own GitHub account without having to wait for Pull Request approvals.
You can be listed as an official author of the Accord.NET Framework, making it possible to list the
creation or shared authorship of the reinforcement learning project in your CV.
Number of possible states.
Number of possible actions.
Exploration policy.
Policy, which is used to select actions.
Learning rate, [0, 1].
The value determines how strongly the Q-function is updated
during learning. The greater the value, the larger the updates the function receives.
The lower the value, the smaller the updates it receives.
Discount factor, [0, 1].
Discount factor for the expected cumulative reward. The value serves as a
multiplier for the expected reward. If the value is set to 1,
the expected cumulative reward is not discounted. As the value gets
smaller, a smaller portion of the expected reward is used to update the
action estimates.
Initializes a new instance of the class.
Number of possible states.
Number of possible actions.
Exploration policy.
Action estimates are randomized when this constructor
is used.
Initializes a new instance of the class.
Number of possible states.
Number of possible actions.
Exploration policy.
Whether to randomize the action estimates.
The randomize parameter specifies whether the initial action estimates should be randomized
with small values. Randomization of action values may be useful when greedy exploration
policies are used. In this case, randomization ensures that the same actions are not always chosen.
Gets the next action from the specified state.
Current state to get an action for.
Returns the action for the state.
The method returns an action according to the current
exploration policy.
Updates the Q-function's value for the previous state-action pair.
Previous state.
Action which leads from the previous state to the next state.
Reward value received by taking the specified action from the previous state.
Next state.
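The roles of the learning rate and the discount factor are easiest to see in the tabular update rule itself. The sketch below shows the standard Q-learning update on a plain table. It is not the QLearning class: QTableSketch, Estimate and Update are hypothetical names, and the default values of 0.5 and 0.9 are illustrative only.

```csharp
using System;
using System.Linq;

public sealed class QTableSketch
{
    private readonly double[][] q;        // q[state][action] estimates, start at zero
    public double LearningRate = 0.5;     // illustrative value in [0, 1]
    public double DiscountFactor = 0.9;   // illustrative value in [0, 1]

    public QTableSketch(int states, int actions)
    {
        q = new double[states][];
        for (int s = 0; s < states; s++)
            q[s] = new double[actions];
    }

    public double Estimate(int state, int action)
    {
        return q[state][action];
    }

    // Off-policy update: the target uses the best estimate available in
    // the next state, regardless of which action will actually be taken.
    public void Update(int previousState, int action, double reward, int nextState)
    {
        double target = reward + DiscountFactor * q[nextState].Max();
        q[previousState][action] += LearningRate * (target - q[previousState][action]);
    }
}
```

A learning rate of 0 leaves the estimate unchanged, while a learning rate of 1 replaces it with the new target. A discount factor of 0 makes the target depend on the immediate reward alone.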
Multipurpose RANSAC algorithm.
The model type to be trained by RANSAC.
RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative
method to estimate parameters of a mathematical model from a set of observed
data which contains outliers. It is a non-deterministic algorithm in the sense
that it produces a reasonable result only with a certain probability, with this
probability increasing as more iterations are allowed.
References:
- P. D. Kovesi. MATLAB and Octave Functions for Computer Vision and Image Processing. School of Computer Science and Software Engineering, The University of Western Australia. Available at: http://www.csse.uwa.edu.au/~pk/research/matlabfns
- Wikipedia, The Free Encyclopedia. RANSAC. Available at: http://en.wikipedia.org/wiki/RANSAC
Model fitting function.
Degenerate set detection function.
Distance function.
Gets or sets the maximum distance between a data point and
the model for the point to be considered an inlier.
Gets or sets the minimum number of samples from the data
required by the fitting function to fit a model.
Maximum number of attempts to select a
non-degenerate data set. Default is 100.
Maximum number of trials. Default is 1000.
Gets the current estimate of trials needed.
Gets the current number of trials performed.
Gets or sets the probability of obtaining a random
sample of the input points that contains no outliers.
Default is 0.99.
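The relation between this probability and the estimated number of trials can be illustrated with the usual RANSAC formula, N = log(1 - p) / log(1 - w^s), where p is the desired probability, w is the fraction of inliers and s is the number of samples per trial. The sketch below is an illustration only: RansacSketch and TrialsNeeded are hypothetical names, and the clamping constant is an assumption made to keep the logarithm finite.

```csharp
using System;

public static class RansacSketch
{
    // Estimates how many random trials are needed so that, with the given
    // probability, at least one sample set contains no outliers.
    public static double TrialsNeeded(double probability, double inlierRatio, int samples)
    {
        // Probability that a single sample set contains at least one outlier
        double pBad = 1.0 - Math.Pow(inlierRatio, samples);

        // Keep the value strictly inside (0, 1) so the logarithm stays finite
        pBad = Math.Min(1.0 - 1e-12, Math.Max(1e-12, pBad));

        return Math.Log(1.0 - probability) / Math.Log(pBad);
    }
}
```

For example, with p = 0.99, half of the points being inliers and 2 samples per trial, the estimate is roughly 16 trials. The estimate grows quickly as the inlier fraction drops or the sample size increases.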
Constructs a new RANSAC algorithm.
The minimum number of samples from the data
required by the fitting function to fit a model.
Constructs a new RANSAC algorithm.
The minimum number of samples from the data
required by the fitting function to fit a model.
The maximum distance between a data point and
the model for the point to be considered
an inlier.
Constructs a new RANSAC algorithm.
The minimum number of samples from the data
required by the fitting function to fit a model.
The maximum distance between a data point and
the model for the point to be considered
an inlier.
The probability of obtaining a random sample of
the input points that contains no outliers.
Computes the model using the RANSAC algorithm.
The total number of points in the data set.
Computes the model using the RANSAC algorithm.
The total number of points in the data set.
The indexes of the inlier points in the data set.
Sarsa learning algorithm.
The class provides an implementation of the Sarsa algorithm, known as
on-policy Temporal Difference control.
The following example shows how to learn a model using reinforcement learning through the
Sarsa algorithm. The following code has been inherited from the AForge.NET Framework
and has not been modified since. If you have ideas on how to improve its
interface, please share them in the project's issue tracker at
https://github.com/accord-net/framework/issues.
If your ideas are feasible and encouraging enough, you can be named an
official contributor of the project. If you would like, you could opt to "inherit" the reinforcement learning
portion of the project, so that you would be free to commit to, modify and, more importantly, author
those modules directly from your own GitHub account without having to wait for Pull Request approvals.
You can be listed as an official author of the Accord.NET Framework, making it possible to list the
creation or shared authorship of the reinforcement learning project in your CV.
Number of possible states.
Number of possible actions.
Exploration policy.
Policy, which is used to select actions.
Learning rate, [0, 1].
The value determines how strongly the Q-function is updated
during learning. The greater the value, the larger the updates the function receives.
The lower the value, the smaller the updates it receives.
Discount factor, [0, 1].
Discount factor for the expected cumulative reward. The value serves as a
multiplier for the expected reward. If the value is set to 1,
the expected cumulative reward is not discounted. As the value gets
smaller, a smaller portion of the expected reward is used to update the
action estimates.
Initializes a new instance of the class.
Number of possible states.
Number of possible actions.
Exploration policy.
Action estimates are randomized when this constructor
is used.
Initializes a new instance of the class.
Number of possible states.
Number of possible actions.
Exploration policy.
Whether to randomize the action estimates.
The randomize parameter specifies whether the initial action estimates should be randomized
with small values. Randomization of action values may be useful when greedy exploration
policies are used. In this case, randomization ensures that the same actions are not always chosen.
Gets the next action from the specified state.
Current state to get an action for.
Returns the action for the state.
The method returns an action according to the current
exploration policy.
Updates the Q-function's value for the previous state-action pair.
Current state.
Action which leads from the previous state to the next state.
Reward value received by taking the specified action from the previous state.
Next state.
Next action.
Updates the Q-function's value for the previous state-action pair
when the next state is non-terminal.
Updates the Q-function's value for the previous state-action pair.
Current state.
Action which leads from the previous state to the next state.
Reward value received by taking the specified action from the previous state.
Updates the Q-function's value for the previous state-action pair
when the next state is terminal.
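The difference between the terminal and non-terminal updates can be shown with the standard Sarsa rule on a plain table. The sketch is not the Sarsa class: SarsaTableSketch, Estimate and Update are hypothetical names, and the default values of 0.5 and 0.9 are illustrative only.

```csharp
using System;

public sealed class SarsaTableSketch
{
    private readonly double[][] q;        // q[state][action] estimates, start at zero
    public double LearningRate = 0.5;     // illustrative value in [0, 1]
    public double DiscountFactor = 0.9;   // illustrative value in [0, 1]

    public SarsaTableSketch(int states, int actions)
    {
        q = new double[states][];
        for (int s = 0; s < states; s++)
            q[s] = new double[actions];
    }

    public double Estimate(int state, int action)
    {
        return q[state][action];
    }

    // Non-terminal case (on-policy): the target uses the estimate of the
    // action that was actually selected in the next state.
    public void Update(int previousState, int previousAction, double reward,
        int nextState, int nextAction)
    {
        double target = reward + DiscountFactor * q[nextState][nextAction];
        q[previousState][previousAction] += LearningRate * (target - q[previousState][previousAction]);
    }

    // Terminal case: there is no next state, so the target is the reward alone.
    public void Update(int previousState, int previousAction, double reward)
    {
        q[previousState][previousAction] += LearningRate * (reward - q[previousState][previousAction]);
    }
}
```

Unlike Q-learning, which bootstraps from the best action in the next state, this update depends on the next action chosen by the exploration policy, which is what makes Sarsa on-policy.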
Set of machine learning tools.
Splits the given text into individual atomic words,
irrespective of punctuation and other marks.
Splits the given text into individual atomic words,
irrespective of punctuation and other marks.
Estimates the number of columns (dimensions) in a set of data.
The type of the input.
The input data.
The number of columns (data dimensions) in the data.
Generates a from a set of cross-validation results.
The type of the model being evaluated.
The type of the inputs accepted by the model.
The cross-validation result.
The inputs fed to the cross-validation object.
The outputs fed to the cross-validation object.
A that captures the performance of the model across all validation folds.
Generates a from a set of cross-validation results.
The type of the model being evaluated.
The type of the inputs accepted by the model.
The cross-validation result.
The inputs fed to the cross-validation object.
The outputs fed to the cross-validation object.
A that captures the performance of the model across all validation folds.
Vantage-Point Tree.
The type for the position vector of each node.
Initializes a new instance of the class.
The distance to use when comparing points.
Creates a new vantage-point tree from the given points.
The points to be added to the tree.
The distance function to use.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A populated with the given data points.
Vantage-Point Tree.
The type for the position vector of each node.
The type for the value stored at each node.
Initializes a new instance of the class.
The distance to use when comparing points.
Creates a new vantage-point tree from the given points.
The points to be added to the tree.
The corresponding values at each data point.
The distance function to use.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A populated with the given data points.
Function that (recursively) fills the tree
Base class for K-dimensional trees.
The class type for the nodes of the tree.
Gets the number of dimensions expected
by the input points of this tree.
Gets or sets the distance function used to
measure distances among points in this tree.
Gets the number of elements contained in this
tree. This is also the number of tree nodes.
Gets the number of leaves contained in this
tree. This can be used to calibrate approximate
nearest-neighbor searches.
Creates a new .
The number of dimensions in the tree.
Creates a new .
The number of dimensions in the tree.
The Root node, if already existent.
Creates a new .
The number of dimensions in the tree.
The Root node, if already existent.
The number of elements in the Root node.
The number of leaves linked through the Root node.
Retrieves the nearest points to a given point within a given radius.
The queried point.
The search radius.
The maximum number of neighbors to retrieve.
A list of neighbor points, ordered by distance.
Retrieves the nearest points to a given point within a given radius.
The queried point.
The search radius.
A list of neighbor points, ordered by distance.
Retrieves a fixed number of nearest points to a given point.
The queried point.
The number of neighbors to retrieve.
A list of neighbor points, ordered by distance.
Retrieves the nearest point to a given point.
The queried point.
A list of neighbor points, ordered by distance.
Retrieves the nearest point to a given point.
The queried point.
The distance from the
to its nearest neighbor found in the tree.
A list of neighbor points, ordered by distance.
Retrieves a fixed percentage of nearest points to a given point.
The queried point.
The number of neighbors to retrieve.
The maximum percentage of leaf nodes that
can be visited before the search finishes with an approximate answer.
A list of neighbor points, ordered by distance.
Retrieves a percentage of nearest points to a given point.
The queried point.
The maximum percentage of leaf nodes that
can be visited before the search finishes with an approximate answer.
The distance between the query point and its nearest neighbor.
A list of neighbor points, ordered by distance.
Retrieves a percentage of nearest points to a given point.
The queried point.
The maximum percentage of leaf nodes that
can be visited before the search finishes with an approximate answer.
A list of neighbor points, ordered by distance.
Retrieves a fixed number of nearest points to a given point.
The queried point.
The number of neighbors to retrieve.
The maximum number of leaf nodes that can
be visited before the search finishes with an approximate answer.
A list of neighbor points, ordered by distance.
Retrieves a fixed number of nearest points to a given point.
The queried point.
The maximum number of leaf nodes that can
be visited before the search finishes with an approximate answer.
A list of neighbor points, ordered by distance.
Retrieves a list of all points inside a given region.
The region.
A list of all nodes contained in the region.
Creates the Root node for a new given
a set of data points and their respective stored values.
The data points to be inserted in the tree.
Returns the number of leaves in the Root subtree.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
The Root node for a new
contained the given .
Radius search.
k-nearest neighbors search.
Inserts a value into the tree at the desired position.
A double-vector with the same number of elements as dimensions in the tree.
Removes all nodes from this tree.
Copies the entire tree to a compatible one-dimensional , starting
at the specified index of the
target array.
The one-dimensional that is the destination of the
elements copied from tree. The must have zero-based indexing.
The zero-based index in at which copying begins.
List of k-dimensional tree nodes.
The type of the value being stored.
This class is used to store neighbor nodes when running one of the
search algorithms for k-dimensional trees.
Initializes a new instance of the
class that is empty.
Initializes a new instance of the
class that is empty and has the specified capacity.
Region of space in a Space-Partitioning Tree. Represents an axis-aligned
bounding box stored as a center with half-dimensions to represent the boundaries
of this quad tree.
Gets the dimensions of the space delimited
by this spatial cell.
Gets or sets the starting point of this spatial cell.
Gets or sets the width of this spatial cell.
Initializes a new instance of the class.
The number of dimensions of the space.
Initializes a new instance of the class.
The starting point of this spatial cell.
The widths of this spatial cell.
Determines whether a point lies inside this cell.
The point.
True if the point is contained inside this cell; otherwise, false.
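A containment test for an axis-aligned cell can be sketched as a per-dimension range check. The code assumes the representation described in the class summary above, a center plus half-widths. CellSketch, Center and HalfWidth are hypothetical names and do not reflect the library's members.

```csharp
using System;

public sealed class CellSketch
{
    public readonly double[] Center;     // center of the box in each dimension
    public readonly double[] HalfWidth;  // half of the box extent in each dimension

    public CellSketch(double[] center, double[] halfWidth)
    {
        Center = center;
        HalfWidth = halfWidth;
    }

    // A point is inside the cell when, in every dimension, it lies
    // within [center - halfWidth, center + halfWidth].
    public bool Contains(double[] point)
    {
        for (int d = 0; d < Center.Length; d++)
        {
            if (point[d] < Center[d] - HalfWidth[d] || point[d] > Center[d] + HalfWidth[d])
                return false;
        }
        return true;
    }
}
```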
Node for a Space-Partitioning Tree.
Initializes a new instance of the class.
The tree that this node belongs to.
The parent node for this node. Can be null if this node is the root.
The starting point of the spatial cell.
The widths of the spatial cell.
The index of this node in the children collection of its parent node.
Gets the position associated with this node.
Gets the center of mass of this node.
Gets or sets the value associated with this node.
Gets or sets the space region delimited by this node.
Gets whether this node is empty and does
not contain any points or children.
Inserts a point in the Space-Partitioning tree.
Checks whether the current tree is correct.
Gets the current depth of this node (distance from the root).
Compute non-edge forces using Barnes-Hut algorithm.
Returns a that represents this instance.
A that represents this instance.
Node-distance pair.
The class type for the nodes of the tree.
Gets the node in this pair.
Gets the distance of the node from the query point.
Creates a new .
The node value.
The distance value.
Determines whether the specified
is equal to this instance.
The to compare
with this instance.
true if the specified is
equal to this instance; otherwise, false.
Returns a hash code for this instance.
A hash code for this instance, suitable for use in hashing
algorithms and data structures like a hash table.
Implements the equality operator.
Implements the inequality operator.
Implements the less-than operator.
Implements the greater than operator.
Determines whether the specified
is equal to this instance.
The to compare
with this instance.
true if the specified is
equal to this instance; otherwise, false.
Compares this instance to another node, returning an integer
indicating whether this instance has a distance that is less
than, equal to, or greater than the other node's distance.
Compares this instance to another node, returning an integer
indicating whether this instance has a distance that is less
than, equal to, or greater than the other node's distance.
Returns a that represents this instance.
A that represents this instance.
K-dimensional tree.
The type of the value being stored.
A k-d tree (short for k-dimensional tree) is a space-partitioning data structure
for organizing points in a k-dimensional space. k-d trees are a useful data structure
for several applications, such as searches involving a multidimensional search key
(e.g. range searches and nearest neighbor searches). k-d trees are a special case
of binary space partitioning trees.
The k-d tree is a binary tree in which every node is a k-dimensional point. Every non-
leaf node can be thought of as implicitly generating a splitting hyperplane that divides
the space into two parts, known as half-spaces. Points to the left of this hyperplane
represent the left subtree of that node and points right of the hyperplane are represented
by the right subtree. The hyperplane direction is chosen in the following way: every node
in the tree is associated with one of the k-dimensions, with the hyperplane perpendicular
to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen,
all points in the subtree with a smaller "x" value than the node will appear in the left
subtree and all points with larger "x" value will be in the right subtree. In such a case,
the hyperplane would be set by the x-value of the point, and its normal would be the unit
x-axis.
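The construction described above can be sketched with a recursive median split that cycles through the axes. This is a minimal illustration and not the KDTree class: KdNodeSketch and Build are hypothetical names, and re-sorting the points at every level is a simplification chosen for clarity rather than speed.

```csharp
using System;
using System.Linq;

public sealed class KdNodeSketch
{
    public double[] Position;   // the point stored at this node
    public int Axis;            // the dimension this node splits on
    public KdNodeSketch Left;   // points with smaller values on Axis
    public KdNodeSketch Right;  // points with larger (or equal) values on Axis

    // Builds a tree by sorting along the current axis, storing the
    // median point at the node and recursing on both halves.
    public static KdNodeSketch Build(double[][] points, int depth = 0)
    {
        if (points.Length == 0)
            return null;

        int axis = depth % points[0].Length;
        double[][] sorted = points.OrderBy(p => p[axis]).ToArray();
        int median = sorted.Length / 2;

        return new KdNodeSketch
        {
            Position = sorted[median],
            Axis = axis,
            Left = Build(sorted.Take(median).ToArray(), depth + 1),
            Right = Build(sorted.Skip(median + 1).ToArray(), depth + 1)
        };
    }
}
```

For the six points (2,3), (5,4), (9,6), (4,7), (8,1) and (7,2), the sketch places (7,2) at the root, (5,4) and (9,6) as its children, and (4,7) as the right child of (5,4).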
References:
- Wikipedia, The Free Encyclopedia. K-d tree. Available at: http://en.wikipedia.org/wiki/K-d_tree
- Moore, Andrew W. "An introductory tutorial on kd-trees." (1991). Available at: http://www.autonlab.org/autonweb/14665/version/2/part/5/data/moore-tutorial.pdf
// This is the same example found in Wikipedia page on
// k-d trees: http://en.wikipedia.org/wiki/K-d_tree
// Suppose we have the following set of points:
double[][] points =
{
new double[] { 2, 3 },
new double[] { 5, 4 },
new double[] { 9, 6 },
new double[] { 4, 7 },
new double[] { 8, 1 },
new double[] { 7, 2 },
};
// To create a tree from a set of points, we use
KDTree<int> tree = KDTree.FromData<int>(points);
// Now we can manually navigate the tree
KDTreeNode<int> node = tree.Root.Left.Right;
// Or traverse it automatically
foreach (KDTreeNode<int> n in tree)
{
double[] location = n.Position;
Assert.AreEqual(2, location.Length);
}
// Given a query point, we can also query for other
// points which are near this point within a radius
double[] query = new double[] { 5, 3 };
// Locate all nearby points within a Euclidean distance of 1.5
// (answer should be a single point located at position (5,4))
KDTreeNodeCollection<int> result = tree.Nearest(query, radius: 1.5);
// We can also use alternate distance functions
tree.Distance = Accord.Math.Distance.Manhattan;
// And also query for a fixed number of neighbor points
// (answer should be the points at (5,4), (7,2), (2,3))
KDTreeNodeCollection<int> neighbors = tree.Nearest(query, neighbors: 3);
' This is the same example found in Wikipedia page on
' k-d trees: http://en.wikipedia.org/wiki/K-d_tree
' Suppose we have the following set of points:
Dim points =
{
New Double() { 2, 3 },
New Double() { 5, 4 },
New Double() { 9, 6 },
New Double() { 4, 7 },
New Double() { 8, 1 },
New Double() { 7, 2 }
}
' To create a tree from a set of points, we use
Dim tree = KDTree.FromData(Of Integer)(points)
' Now we can manually navigate the tree
Dim node = tree.Root.Left.Right
' Or traverse it automatically
For Each n As KDTreeNode(Of Integer) In tree
Dim location = n.Position
Console.WriteLine(location.Length)
Next
' Given a query point, we can also query for other
' points which are near this point within a radius
'
Dim query = New Double() {5, 3}
' Locate all nearby points within a Euclidean distance of 1.5
' (answer should be a single point located at position (5,4))
'
Dim result = tree.Nearest(query, radius:=1.5)
' We can also use alternate distance functions
tree.Distance = Function(a, b) Accord.Math.Distance.Manhattan(a, b)
' And also query for a fixed number of neighbor points
' (answer should be the points at (5,4), (7,2), (2,3))
'
Dim neighbors = tree.Nearest(query, neighbors:=3)
Creates a new k-dimensional tree.
The number of dimensions in the tree.
Creates a new k-dimensional tree.
The number of dimensions in the tree.
The root node, if one already exists.
Creates a new k-dimensional tree.
The number of dimensions in the tree.
The root node, if one already exists.
The number of elements in the root node.
The number of leaves linked through the root node.
Inserts a value in the tree at the desired position.
A double-vector with the same number of elements as dimensions in the tree.
The value to be inserted.
Creates the root node for a new k-dimensional tree given
a set of data points and their respective stored values.
The data points to be inserted in the tree.
The values associated with each point.
Returns the number of leaves in the root subtree.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
The root node for a new k-dimensional tree
containing the given points.
Convenience class for k-dimensional tree static methods. To
create a new KDTree, specify the generic parameter as in
KDTree<T>.
Please check the documentation page for
KDTree<T> for examples, usage and actual remarks about kd-trees.
Creates a new k-dimensional tree.
The number of dimensions in the tree.
Creates a new k-dimensional tree.
The number of dimensions in the tree.
The root node, if one already exists.
Creates a new k-dimensional tree.
The number of dimensions in the tree.
The root node, if one already exists.
The number of elements in the root node.
The number of leaves linked through the root node.
Adds a new point to this tree.
A double-vector with the same number of elements as dimensions in the tree.
Creates a new k-dimensional tree from the given points.
The type of the value to be stored.
The points to be added to the tree.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
Creates a new k-dimensional tree from the given points.
The points to be added to the tree.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
Creates a new k-dimensional tree from the given points.
The type of the value to be stored.
The points to be added to the tree.
The corresponding values at each data point.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
Creates a new k-dimensional tree from the given points.
The points to be added to the tree.
The distance function to use.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
Creates a new k-dimensional tree from the given points.
The type of the value to be stored.
The points to be added to the tree.
The corresponding values at each data point.
The distance function to use.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
Creates a new k-dimensional tree from the given points.
The type of the value to be stored.
The points to be added to the tree.
The distance function to use.
Whether the given vector
can be ordered in place. Passing true will change the original order of
the vector. If set to false, all operations will be performed on an extra
copy of the vector.
A k-dimensional tree populated with the given data points.
K-dimensional tree node.
K-dimensional tree node.
Gets or sets the value being stored at this node.
Base class for K-dimensional tree nodes.
The class type for the nodes of the tree.
Gets or sets the position of
the node in spatial coordinates.
Gets or sets the dimension index of the split. This value is an
index into the position vector and as such should be greater than
or equal to zero and less than the number of elements in the position vector.
Returns a string that represents this instance.
A string that represents this instance.
Compares the current object with another object of the same type.
An object to compare with this object.
A value that indicates the relative order of the objects being compared. The return value has the following meanings: less than zero, this object is less than the other object; zero, this object is equal to the other object; greater than zero, this object is greater than the other object.
Indicates whether the current object is equal to another object of the same type.
An object to compare with this object.
true if the current object is equal to the parameter; otherwise, false.
Collection of k-dimensional tree nodes.
The class type for the nodes of the tree.
This class is used to store neighbor nodes when running one of the
search algorithms for k-dimensional trees.
Gets or sets the maximum number of elements on this
collection, if specified. A value of zero indicates
this instance has no upper limit of elements.
Gets the minimum distance between a node
in this collection and the query point.
Gets the maximum distance between a node
in this collection and the query point.
Gets the farthest node in the collection (with greatest distance).
Gets the nearest node in the collection (with smallest distance).
Creates a new collection with a maximum size.
The maximum number of elements allowed in this collection.
Attempts to add a value to the collection. If the list is full
and the value is more distant than the farthest node in the
collection, the value will not be added.
The node to be added.
The node distance.
Returns true if the node has been added; false otherwise.
Attempts to add a value to the collection. If the list is full
and the value is more distant than the farthest node in the
collection, the value will not be added.
The node to be added.
The node distance.
Returns true if the node has been added; false otherwise.
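The bounded-insertion rule described above can be illustrated with a small self-contained sketch. The `BoundedNeighbors` type and its members are hypothetical names chosen for this example, and a plain sorted list is assumed as the backing store; the library's collection may be implemented differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bounded list of (distance, item) pairs, kept sorted by distance.
public sealed class BoundedNeighbors<T>
{
    private readonly List<KeyValuePair<double, T>> items =
        new List<KeyValuePair<double, T>>();

    // A capacity of zero means there is no upper limit on the number of elements.
    public int Capacity { get; }
    public int Count => items.Count;

    // Both properties are only meaningful when Count > 0.
    public double Minimum => items[0].Key;
    public double Maximum => items[items.Count - 1].Key;

    public BoundedNeighbors(int capacity) { Capacity = capacity; }

    // Returns true if the item was stored, false if it was rejected.
    public bool Add(T item, double distance)
    {
        bool full = Capacity > 0 && items.Count >= Capacity;

        // A full list rejects anything more distant than its farthest element.
        if (full && distance > Maximum)
            return false;

        // Otherwise the farthest element is evicted to make room.
        if (full)
            items.RemoveAt(items.Count - 1);

        // Insert while keeping the list ordered by increasing distance.
        int index = 0;
        while (index < items.Count && items[index].Key <= distance)
            index++;
        items.Insert(index, new KeyValuePair<double, T>(distance, item));
        return true;
    }
}
```

Keeping the list sorted makes the nearest and farthest elements available at the two ends, which is what a k-nearest-neighbor search needs when deciding whether a candidate is worth keeping.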
Adds the specified item to the collection.
The distance from the node to the query point.
The item to be added.
Removes all elements from this collection.
Gets the node
at the specified index. Note: this method will iterate over the entire collection
until the given position is found.
Gets the number of elements in this collection.
Gets a value indicating whether this instance is read only.
For this collection, always returns false.
true if this instance is read only; otherwise, false.
Returns an enumerator that iterates through this collection.
An object
that can be used to iterate through the collection.
Returns an enumerator that iterates through this collection.
An object that
can be used to iterate through the collection.
Determines whether this instance contains the specified item.
The object to locate in the collection.
The value can be null for reference types.
true if the item is found in the collection; otherwise, false.
Copies the entire collection to a compatible one-dimensional array, starting
at the specified index of the target
array.
The one-dimensional array that is the destination of the
elements copied from the tree. The array must have zero-based indexing.
The zero-based index in the array at which copying begins.
Adds the specified item to this collection.
The item.
Not supported.
Removes the farthest tree node from this collection.
Removes the nearest tree node from this collection.
Space-Partitioning Tree.
Gets the dimension of the space covered by this tree.
Initializes a new instance of the class.
The dimensions of the space partitioned by the tree.
Creates a new space-partitioning tree from the given points.
The points to be added to the tree.
A space-partitioning tree populated with the given data points.
Inserts a point in the Space-Partitioning tree.
Computes non-edge forces using the Barnes-Hut algorithm.
Computes edge forces.
Vantage-Point Tree.
Creates a new vantage-point tree from the given points.
The points to be added to the tree.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Creates a new vantage-point tree from the given points.
The points to be added to the tree.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Creates a new vantage-point tree from the given points.
The points to be added to the tree.
The distance function to use.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Creates a new vantage-point tree from the given points.
The type for the position vectors.
The points to be added to the tree.
The distance function to use.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Creates a new vantage-point tree from the given points.
The type of the value to be stored.
The points to be added to the tree.
The corresponding values at each data point.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Creates a new vantage-point tree from the given points.
The type for the position vectors.
The type of the value to be stored.
The points to be added to the tree.
The corresponding values at each data point.
The distance function to use.
Whether to perform operations in place, altering the
original array of points instead of creating an extra copy.
A vantage-point tree populated with the given data points.
Initializes a new instance of the class.
The distance to use when comparing points. Default is
.
Initializes a new instance of the class.
Base class for Vantage-Point Trees.
The type for the position vector of each node.
The class type for the nodes of the tree.
Gets or sets the distance function used to
measure distances among points in this tree.
Gets or sets the radius of the nodes in the tree.
Initializes a new instance of the class.
The distance to use when comparing points.
Retrieves a fixed number of nearest points to a given point.
The queried point.
The number of neighbors to retrieve.
A list of neighbor points, ordered by distance.
Retrieves a fixed number of nearest points to a given point.
The queried point.
The number of neighbors to retrieve.
The list where to store results.
A list of neighbor points, ordered by distance.
Node of a vantage-point tree.
The type for the position vector (e.g. double[]).
The type for the value stored at the node.
Gets or sets a value associated with this node.
Returns a string that represents this instance.
A string that represents this instance.
Base class for vantage-point tree nodes.
The type for the position vector (e.g. double[]).
The class type for the nodes of the tree.
Gets or sets the current position for this Vantage-Point Tree Node.
Gets or sets the threshold radius for this node.
Indicates whether the current object is equal to another object of the same type.
An object to compare with this object.
true if the current object is equal to the parameter; otherwise, false.
Returns a string that represents this instance.
A string that represents this instance.
Node of a vantage-point tree.
The type for the position vector (e.g. double[]).
Solver types allowed in LibSVM/LIBLINEAR model files.
Unknown solver type.
L2-regularized logistic regression in the primal (-s 0, L2R_LR).
L2-regularized L2-loss support vector classification
in the dual (-s 1, L2R_L2LOSS_SVC_DUAL, the default).
L2-regularized L2-loss support vector classification
in the primal (-s 2, L2R_L2LOSS_SVC).
L2-regularized L1-loss support vector classification
in the dual (-s 3, L2R_L1LOSS_SVC_DUAL).
Support vector classification by
Crammer and Singer (-s 4, MCSVM_CS).
L1-regularized L2-loss support vector
classification (-s 5, L1R_L2LOSS_SVC).
L1-regularized logistic regression (-s 6, L1R_LR).
L2-regularized logistic regression in the dual (-s 7, L2R_LR_DUAL).
L2-regularized L2-loss support vector regression
in the primal (-s 11, L2R_L2LOSS_SVR).
L2-regularized L2-loss support vector regression
in the dual (-s 12, L2R_L2LOSS_SVR_DUAL).
L2-regularized L1-loss support vector regression
in the dual (-s 13, L2R_L1LOSS_SVR_DUAL).
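To make the numbering above concrete, the following sketch maps the `-s` codes to the identifiers quoted in the descriptions. The `SolverCode` enum and `SolverCodes` helper are hypothetical names used only for illustration, and the value -1 for the unknown solver is an assumption; the library's own enumeration may use different member names and values.

```csharp
using System;

// Hypothetical enum mirroring the "-s" codes listed in the descriptions.
public enum SolverCode
{
    Unknown = -1,               // assumed value; not stated in the documentation
    L2R_LR = 0,
    L2R_L2LOSS_SVC_DUAL = 1,
    L2R_L2LOSS_SVC = 2,
    L2R_L1LOSS_SVC_DUAL = 3,
    MCSVM_CS = 4,
    L1R_L2LOSS_SVC = 5,
    L1R_LR = 6,
    L2R_LR_DUAL = 7,
    L2R_L2LOSS_SVR = 11,
    L2R_L2LOSS_SVR_DUAL = 12,
    L2R_L1LOSS_SVR_DUAL = 13
}

public static class SolverCodes
{
    // Converts a numeric "-s" argument into a solver code, falling back
    // to Unknown for numbers that are not listed (for example 8, 9 or 10).
    public static SolverCode Parse(int s)
    {
        return Enum.IsDefined(typeof(SolverCode), s)
            ? (SolverCode)s
            : SolverCode.Unknown;
    }

    // Codes 11 to 13 are the support vector regression solvers.
    public static bool IsRegression(SolverCode code)
    {
        return code == SolverCode.L2R_L2LOSS_SVR
            || code == SolverCode.L2R_L2LOSS_SVR_DUAL
            || code == SolverCode.L2R_L1LOSS_SVR_DUAL;
    }
}
```

Note the gap between 7 and 11 in the numbering: a parser should treat the values in between as unknown rather than assume the codes are contiguous.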
Reads support vector machines
created from LibSVM or LIBLINEAR. Not all solver types are supported.
This class can be used to import LibSVM or LIBLINEAR models into .NET
and use them to make predictions in .NET/C# applications.
If you are looking for ways to load and save SVM models in the Accord.NET
Framework without necessarily being compatible with LibSVM or LIBLINEAR,
please use the class instead.
Gets or sets the solver type used to create the model.
Gets or sets the number of classes that
this classification model can handle.
Obsolete. Please use NumberOfClasses instead.
Gets or sets an initial double value to be
prepended to every feature vector.
If set to a negative number, this functionality is
disabled. Default is 0.
Gets or sets the number of dimensions (features)
the classification or regression model can handle.
Obsolete. Please use NumberOfInputs instead.
Gets or sets the class label for each class
this classification model expects to handle.
Gets or sets the vector of linear weights used
by this model, if it is a compact model. If this
is not a compact model, this will be set to null.
Gets or sets the set of support vectors used
by this model. If the model is compact, this
will be set to null.
Creates a new object.
Creates a support vector machine that
meets the requirements specified in this model.
A support vector machine that represents this model.
Creates a support
vector machine learning algorithm that meets the
requirements specified in this model.
A learning algorithm that represents this model.
Saves this model to disk using LibSVM's model format.
The path where the file should be written.
Saves this model to disk using LibSVM's model format.
The stream where the file should be written.
Loads a model specified using LibSVM's model format from disk.
The file path from where the model should be loaded.
The model stored at the given path.
Loads a model specified using LibSVM's model format from a stream.
The stream from where the model should be loaded.
The model stored in the given stream.
Creates a LibSVM model from an existing support vector machine.
The vector machine from which a libSVM model definition should be created.
A class representing a support vector machine in LibSVM format.