Accord.Statistics

Base class for binary classifiers. The data type for the input data defaults to double[]. The constructor initializes a new instance of the class.

Decision: computes class-label decisions for the given input vectors. Takes the input vectors that should be classified into any of the possible classes and a location where to store the class labels, and returns the set of class labels that best describe the vectors according to this classifier.

Transformation (several overloads): applies the transformation to an input, producing an associated output. Takes the input data to which the transformation should be applied and a location to store the output, avoiding unnecessary memory allocations, and returns the output generated by applying this transformation to the given input.

Views: views this instance as a multi-class classifier, giving access to more advanced methods such as the prediction of integer labels, or as a multi-label classifier, giving access to the prediction of one-hot vectors.
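As an illustration of the decision API described above, here is a minimal sketch using a kernel support vector machine from Accord.NET. The exact namespaces, the default SequentialMinimalOptimization settings, and the name of the multi-class view method (ToMulticlass) are assumptions and may differ between framework versions.

```csharp
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

class BinaryDecideSketch
{
    static void Main()
    {
        // A tiny XOR-like problem: two features, two classes (false/true).
        double[][] inputs =
        {
            new double[] { 0, 0 }, new double[] { 1, 0 },
            new double[] { 0, 1 }, new double[] { 1, 1 }
        };
        bool[] outputs = { false, true, true, false };

        // Learn a Gaussian-kernel SVM, one of the classifiers built on this binary base.
        var smo = new SequentialMinimalOptimization<Gaussian>();
        SupportVectorMachine<Gaussian> svm = smo.Learn(inputs, outputs);

        // Class-label decisions for a single vector and for a whole batch.
        bool single = svm.Decide(inputs[0]);
        bool[] batch = svm.Decide(inputs);

        // View the binary machine as a multi-class classifier to obtain integer
        // labels (0/1) instead of booleans (method name assumed to be ToMulticlass).
        int label = svm.ToMulticlass().Decide(inputs[0]);

        System.Console.WriteLine($"{single} {batch.Length} {label}");
    }
}
```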
Base class for generative binary classifiers. The data type for the input data defaults to double[].

Scores: computes a numerical score measuring the association between the given vector and each class. Takes the input vector and an array where the result will be stored, avoiding unnecessary memory allocations, and returns a double[].

Log-likelihood of the predicted class (several overloads): predicts a class label for the given input vector, or for each vector in a set of input vectors, returning the log-likelihood that the input belongs to its predicted class. Overloads can report the class label predicted by the classifier (if the label array is passed as null, the classifier will create a new one) and can store the log-likelihoods in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double for a single vector and a double[] for a batch.

Log-likelihoods per class (several overloads): computes the log-likelihoods that the given input vector, or each vector in a set of input vectors, belongs to each of the possible classes. Overloads can report the labels predicted by the classifier and can store the results in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double[] for a single vector and a double[][] for a batch.
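A minimal sketch of the log-likelihood members just described, assuming a logistic regression model trained with iteratively reweighted least squares; the member names LogLikelihood and LogLikelihoods, and the exact Learn overloads, are assumptions based on the descriptions above.

```csharp
using System;
using Accord.Statistics.Models.Regression;
using Accord.Statistics.Models.Regression.Fitting;

class LogLikelihoodSketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { 1.0, 2.0 }, new double[] { 2.0, 1.0 },
            new double[] { 6.0, 5.0 }, new double[] { 7.0, 6.0 }
        };
        bool[] outputs = { false, false, true, true };

        // Fit a logistic regression, a probabilistic (generative-style) binary classifier.
        var irls = new IterativeReweightedLeastSquares<LogisticRegression>() { MaxIterations = 100 };
        LogisticRegression lr = irls.Learn(inputs, outputs);

        // Log-likelihood of the predicted class, also reporting the decision itself.
        bool decision;
        double ll = lr.LogLikelihood(inputs[0], out decision);

        // Log-likelihoods for each of the two classes (negative, positive).
        double[] perClass = lr.LogLikelihoods(inputs[0]);

        Console.WriteLine($"{decision}: {ll:F3}, {perClass.Length} classes");
    }
}
```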
Probability of the predicted class (several overloads): predicts a class label for the given input vector, or for each vector in a set of input vectors, returning the probability that the input belongs to its predicted class. Overloads can report the class label predicted by the classifier (if the label array is passed as null, the classifier will create a new one) and can store the probabilities in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double for a single vector and a double[] for a batch.

Probabilities per class (several overloads): computes the probabilities that the given input vector, or each vector in a set of input vectors, belongs to each of the possible classes. Overloads can report the labels predicted by the classifier and can store the results in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double[] for a single vector and a double[][] for a batch.

Batch transformation (several overloads): applies the transformation to a set of input vectors, producing an associated set of output vectors. Takes the input data to which the transformation should be applied and a location where to store the result, and returns the output generated by applying this transformation to the given input.

Views: views this instance as a multi-class generative classifier, giving access to more advanced methods such as the prediction of integer labels, or as a multi-label generative classifier, giving access to the prediction of one-hot vectors.
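A sketch of the probability members of this generative binary base, again using logistic regression; the names Probability and Probabilities and their overloads are assumptions drawn from the descriptions above.

```csharp
using System;
using Accord.Statistics.Models.Regression;
using Accord.Statistics.Models.Regression.Fitting;

class ProbabilitySketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { 1.0, 2.0 }, new double[] { 2.0, 1.5 },
            new double[] { 6.0, 5.0 }, new double[] { 7.0, 6.5 }
        };
        bool[] outputs = { false, false, true, true };

        var irls = new IterativeReweightedLeastSquares<LogisticRegression>() { MaxIterations = 100 };
        LogisticRegression lr = irls.Learn(inputs, outputs);

        // Probability that the first sample belongs to its predicted class,
        // together with that predicted class label.
        bool decision;
        double p = lr.Probability(inputs[0], out decision);

        // Per-class probabilities for one sample (they sum to 1),
        // and one probability vector per sample for the whole batch.
        double[] perClass = lr.Probabilities(inputs[0]);
        double[][] perSample = lr.Probabilities(inputs);

        Console.WriteLine($"{decision}: {p:F3}, sum={perClass[0] + perClass[1]:F3}, n={perSample.Length}");
    }
}
```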
Base class for score-based binary classifiers. The data type for the input data defaults to double[].

Scores: computes a numerical score measuring the association between the given vector and each class. Takes the input vector and an array where the result will be stored, avoiding unnecessary memory allocations.

Decision: computes a class-label decision for a given input vector. Takes the input vector that should be classified into one of the possible classes and, optionally, a location where to store the class label, and returns the class label that best describes the input according to this classifier.

Score of the most strongly associated class (several overloads): computes a numerical score measuring the association between the given vector and its most strongly associated class (as predicted by the classifier), or predicts a class label for the input vector, or for each vector in a set of input vectors, returning that score together with the predicted labels. Overloads can store the scores in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double for a single vector and a double[] for a batch.

Scores per class (several overloads): computes a numerical score measuring the association between the given vector and each class, or predicts a class label vector for the given input vector, or for each vector in a set of input vectors, returning the scores for each of the possible classes together with the predicted labels (if the label array is passed as null, the classifier will create a new one). Overloads can store the scores in a caller-supplied array, avoiding unnecessary memory allocations. Returns a double[] for a single vector and a double[][] for a batch.

Batch transformation (several overloads): applies the transformation to a set of input vectors, producing an associated set of output vectors. Takes the input data to which the transformation should be applied and a location where to store the result, and returns the output generated by applying this transformation to the given input.

Views: views this instance as a multi-class distance classifier, giving access to more advanced methods such as the prediction of integer labels, or as a multi-label distance classifier, giving access to the prediction of one-hot vectors.
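A sketch of the score-based members using a linear support vector machine; the Score overloads shown (single input, with an out decision, and batch) are assumptions based on the descriptions above.

```csharp
using System;
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

class ScoreSketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { -2, -1 }, new double[] { -1, -2 },
            new double[] {  1,  2 }, new double[] {  2,  1 }
        };
        bool[] outputs = { false, false, true, true };

        // Train a linear SVM; its score reflects the distance to the separating hyperplane.
        var smo = new SequentialMinimalOptimization<Linear>();
        SupportVectorMachine<Linear> svm = smo.Learn(inputs, outputs);

        // Score of the most strongly associated class for one sample,
        // optionally also retrieving the class-label decision.
        double s = svm.Score(inputs[0]);
        bool decision;
        double s2 = svm.Score(inputs[0], out decision);

        // Batch version: one score per input vector.
        double[] scores = svm.Score(inputs);

        Console.WriteLine($"{s:F3} {s2:F3} {decision} {scores.Length}");
    }
}
```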
Base class for multi-class classifiers.

Decision (several overloads): computes a class-label decision for a given input vector, or class-label decisions for a set of input vectors. Takes the input that should be classified into one of the possible classes and, optionally, a location where to store the class labels, and returns the class label, or set of class labels, that best describes the input according to this classifier.

Transformation (several overloads): applies the transformation to an input, producing an associated output. Takes the input data to which the transformation should be applied and, optionally, a location to store the output, avoiding unnecessary memory allocations, and returns the output generated by applying this transformation to the given input.
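As a concrete illustration of the multi-class decision methods, here is a sketch using a k-nearest-neighbors classifier; the constructor argument and the Learn/Decide overloads are assumptions following the usual Accord.NET pattern.

```csharp
using System;
using Accord.MachineLearning;

class MulticlassDecideSketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { 0.0, 0.1 }, new double[] { 0.2, 0.0 },  // class 0
            new double[] { 5.0, 5.1 }, new double[] { 5.2, 4.9 },  // class 1
            new double[] { 9.0, 0.2 }, new double[] { 9.1, 0.0 }   // class 2
        };
        int[] outputs = { 0, 0, 1, 1, 2, 2 };

        // k-NN is one of the classifiers built on the multi-class base described above
        // (the constructor argument is the number of neighbors, k = 3).
        var knn = new KNearestNeighbors(3);
        knn.Learn(inputs, outputs);

        // A single class-label decision and a batch of decisions.
        int label = knn.Decide(new double[] { 5.1, 5.0 });
        int[] labels = knn.Decide(inputs);

        Console.WriteLine($"{label}, {labels.Length} decisions");
    }
}
```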
Base class for multi-class classifiers. The data type for the input data defaults to double[].

Decision (several overloads): computes class-label decisions for the given input vectors, or a class-label decision for a single input vector. Takes the input that should be classified into one of the possible classes and a location where to store the class labels, and returns the class label, or set of class labels, that best describes the input according to this classifier.

Transformation (several overloads): applies the transformation to an input, producing an associated output. Takes the input data to which the transformation should be applied and a location to store the output, avoiding unnecessary memory allocations, and returns the output generated by applying this transformation to the given input.

View: views this instance as a multi-label classifier, giving access to more advanced methods, such as the prediction of one-hot vectors.
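The sketch below illustrates the transformation and multi-label view described above, again with a k-nearest-neighbors classifier. That Transform yields the same labels as the decision methods, that the multi-label view's decisions are one-hot boolean vectors, and the ToMultilabel method name itself are assumptions inferred from the descriptions.

```csharp
using System;
using Accord.MachineLearning;

class MultilabelViewSketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { 0.0, 0.1 }, new double[] { 5.0, 5.1 }, new double[] { 9.0, 0.2 }
        };
        int[] outputs = { 0, 1, 2 };

        var knn = new KNearestNeighbors(1);
        knn.Learn(inputs, outputs);

        // For classifiers, applying the "transformation" produces class labels
        // (assumed to match what the decision methods return).
        int[] transformed = knn.Transform(inputs);

        // The multi-label view reports each decision as a vector of booleans,
        // with 'true' at the index of the predicted class (a one-hot vector).
        bool[] oneHot = knn.ToMultilabel().Decide(inputs[0]);

        Console.WriteLine($"{transformed.Length} labels, one-hot length {oneHot.Length}");
    }
}
```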
Base class for generative multi-class classifiers. The data type for the input data defaults to double[].

Per-class scores (several overloads): computes a numerical score measuring the association between the given vector and a given class, the log-likelihood that the given input vector belongs to a specified class, or the probability that it belongs to a specified class. Takes the input vector and the index of the class whose score will be computed, optionally with an array where the results will be stored, avoiding unnecessary memory allocations. Returns a double.

Log-likelihoods per class (several overloads): computes the log-likelihood that the given input vector belongs to its most plausible class, or the log-likelihoods that the given input vector, or each vector in a set, belongs to each of the possible classes, optionally storing the results in a caller-supplied array, avoiding unnecessary memory allocations.

Log-likelihood predictions (several overloads): predicts a class label for the given input vector, or for each vector in a set of input vectors, returning the log-likelihood that the input belongs to its predicted class, or the log-likelihoods of the input belonging to each possible class. Overloads report the class labels predicted by the classifier (if the label array is passed as null, the classifier will create a new one) and can store the log-likelihoods in a caller-supplied array, avoiding unnecessary memory allocations.

Probabilities per class (several overloads): computes the probabilities that the given input vector belongs to each of the possible classes, optionally storing the results in a caller-supplied array, avoiding unnecessary memory allocations.

Probability predictions (several overloads): predicts a class label for the given input vector, or for each vector in a set of input vectors, returning the probability that the input belongs to its predicted class, or the probabilities of the input belonging to each possible class. Overloads report the class labels predicted by the classifier (if the label array is passed as null, the classifier will create a new one) and can store the probabilities in a caller-supplied array, avoiding unnecessary memory allocations.

Transformation (several overloads): applies the transformation to an input, producing an associated output. Takes the input data to which the transformation should be applied and a location to store the output, avoiding unnecessary memory allocations, and returns the output generated by applying this transformation to the given input.

Views: views this instance as a multi-class generative classifier, giving access to more advanced methods such as the prediction of integer labels, or as a multi-label generative classifier, giving access to the prediction of one-hot vectors.
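A sketch of the generative multi-class members using a Gaussian naive Bayes classifier; the member names (Probabilities, LogLikelihoods, LogLikelihood with a class index) and the NaiveBayesLearning setup are assumptions matching the descriptions above.

```csharp
using System;
using Accord.MachineLearning.Bayes;
using Accord.Statistics.Distributions.Univariate;

class GenerativeMulticlassSketch
{
    static void Main()
    {
        double[][] inputs =
        {
            new double[] { 0.1, 0.2 }, new double[] { 0.3, 0.1 },  // class 0
            new double[] { 5.0, 5.2 }, new double[] { 5.3, 4.8 },  // class 1
            new double[] { 9.1, 0.3 }, new double[] { 8.8, 0.1 }   // class 2
        };
        int[] outputs = { 0, 0, 1, 1, 2, 2 };

        // Gaussian naive Bayes: one normal distribution per feature and class.
        var teacher = new NaiveBayesLearning<NormalDistribution>();
        var bayes = teacher.Learn(inputs, outputs);

        // Per-class probabilities together with the predicted label.
        int decision;
        double[] probabilities = bayes.Probabilities(inputs[0], out decision);

        // Per-class log-likelihoods, and the log-likelihood for one specific class index.
        double[] logLikelihoods = bayes.LogLikelihoods(inputs[0]);
        double llForClass2 = bayes.LogLikelihood(inputs[0], 2);

        Console.WriteLine($"{decision}: {probabilities.Length} classes, ll(2)={llForClass2:F3}, {logLikelihoods.Length} values");
    }
}
```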
The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Views this instance as a multi-label generative classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Base class for score-based multi-class classifiers. The data type for the input data. Default is double[]. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the result will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best described according to this classifier. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and its most strongly associated class (as predicted by the classifier). The input vector. Computes a numerical score measuring the association between the given vector and its most strongly associated class (as predicted by the classifier). The input vector. Computes a numerical score measuring the association between the given vector and its most strongly associated class (as predicted by the classifier). The input vector. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and each class. The input vector. Computes a numerical score measuring the association between the given vector and each class. The input vector. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label for the input vector, returning a numerical score measuring the strength of association of the input vector to its most strongly related class. The input vector. 
The class label predicted by the classifier. Predicts a class label for the input vector, returning a numerical score measuring the strength of association of the input vector to its most strongly related class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. A set of input vectors. The class labels predicted for each input vector, as predicted by the classifier. Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. A set of input vectors. The class labels predicted for each input vector, as predicted by the classifier. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. A set of input vectors. The class labels predicted for each input vector, as predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. A set of input vectors. The class labels predicted for each input vector, as predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. 
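As a concrete illustration of the score-based multi-class members above (Decide, Score, Scores and their batch overloads), the following sketch uses the multi-class support vector machine API of Accord.NET 3.x. It is an assumption-laden example rather than a reference implementation: the Gaussian kernel, the learner configuration and the data are placeholders.

    using System;
    using Accord.MachineLearning.VectorMachines;
    using Accord.MachineLearning.VectorMachines.Learning;
    using Accord.Statistics.Kernels;

    class MulticlassScoreExample
    {
        static void Main()
        {
            double[][] inputs =
            {
                new[] { 0.0, 0.1 }, new[] { 0.1, 0.0 },   // class 0
                new[] { 5.0, 5.1 }, new[] { 5.2, 4.9 },   // class 1
                new[] { 0.0, 5.0 }, new[] { 0.1, 5.2 }    // class 2
            };
            int[] outputs = { 0, 0, 1, 1, 2, 2 };

            // One-vs-one multi-class SVM trained with SMO sub-learners.
            var teacher = new MulticlassSupportVectorLearning<Gaussian>()
            {
                Learner = (p) => new SequentialMinimalOptimization<Gaussian>()
                {
                    UseKernelEstimation = true
                }
            };
            var machine = teacher.Learn(inputs, outputs);

            double[] x = { 4.8, 5.0 };
            int label       = machine.Decide(x);      // most strongly associated class
            double best     = machine.Score(x);       // score of that winning class
            double[] scores = machine.Scores(x);      // one score per possible class
            int[] all       = machine.Decide(inputs); // batch overload

            Console.WriteLine("{0}: {1:F3} [{2}] ({3} predictions)",
                label, best, string.Join(", ", scores), all.Length);
        }
    }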
Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Views this instance as a multi-label distance classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Base class for multi-label classifiers. The data type for the input data. Default is double[]. Computes whether a class label applies to an input vector. The input vectors that should be classified as any of the possible classes. The class label index to be tested. A boolean value indicating whether the given class label applies to the input vector. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best describes the input vector according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. The location where to store the class-labels. A class-label that best describes the input vector according to this classifier. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. The location where to store the class-labels. A class-label that best describes the input vector according to this classifier. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied.
A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Base class for generative multi-label classifiers. The data type for the input data. Default is double[]. Computes a log-likelihood measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. The class label associated with the input vector, as predicted by the classifier. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the log-likelihood that the given input vector belongs to each of the possible classes. The input vector. Computes the log-likelihood that the given input vector belongs to each of the possible classes. The input vector. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Computes the log-likelihood that the given input vector belongs to each of the possible classes. The input vector. Computes the log-likelihood that the given input vector belongs to each of the possible classes. The input vector. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Computes the probability that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the probability that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the probability that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. An array where the probabilities will be stored, avoiding unnecessary memory allocations. 
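For the multi-label decision API described above (boolean class-label decisions and per-class scores), a multi-label support vector machine is one concrete option in Accord.NET 3.x. Treat the sketch below as a hedged illustration: the class and property names follow recent Accord versions and the data is made up. The per-class overload documented above (deciding whether one particular class index applies) is exposed by the same object.

    using System;
    using Accord.MachineLearning.VectorMachines;
    using Accord.MachineLearning.VectorMachines.Learning;
    using Accord.Statistics.Kernels;

    class MultilabelDecisionExample
    {
        static void Main()
        {
            double[][] inputs =
            {
                new[] { 0.0, 0.1 }, new[] { 0.1, 0.0 },
                new[] { 5.0, 5.1 }, new[] { 5.2, 4.9 },
                new[] { 0.0, 5.0 }, new[] { 0.1, 5.2 }
            };
            int[] outputs = { 0, 0, 1, 1, 2, 2 };   // class indices; one-hot vectors also work

            var teacher = new MultilabelSupportVectorLearning<Gaussian>()
            {
                Learner = (p) => new SequentialMinimalOptimization<Gaussian>()
                {
                    UseKernelEstimation = true
                }
            };
            var machine = teacher.Learn(inputs, outputs);

            double[] x = { 5.1, 5.0 };
            bool[] labels   = machine.Decide(x);   // one boolean decision per class
            double[] scores = machine.Scores(x);   // one association score per class

            Console.WriteLine("[{0}] [{1}]",
                string.Join(", ", labels), string.Join(", ", scores));
        }
    }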
Computes the probability that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the probability that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Computes the probabilities that the given input vector belongs to each of the possible classes. The input vector. Computes the probabilities that the given input vector belongs to each of the possible classes. The input vector. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Computes the probabilities that the given input vector belongs to each of the possible classes. The input vector. Computes the probabilities that the given input vector belongs to each of the possible classes. The input vector. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. 
An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. 
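The generative members above (per-class probabilities and log-likelihoods, with and without preallocated result arrays) can be exercised with a Naive Bayes classifier, which in Accord.NET 3.x derives from the generative classifier bases documented in this file. The sketch is illustrative only; the discrete feature encoding and the learner defaults are assumptions.

    using System;
    using Accord.MachineLearning.Bayes;

    class GenerativeExample
    {
        static void Main()
        {
            // Discrete features encoded as symbol indices (e.g. outlook, temperature).
            int[][] inputs =
            {
                new[] { 0, 0 }, new[] { 0, 1 },
                new[] { 1, 1 }, new[] { 1, 0 },
                new[] { 2, 1 }, new[] { 2, 0 }
            };
            int[] outputs = { 0, 0, 1, 1, 1, 0 };

            var learner = new NaiveBayesLearning();
            NaiveBayes nb = learner.Learn(inputs, outputs);

            int[] x = { 1, 1 };
            int label        = nb.Decide(x);          // predicted class label
            double[] probs   = nb.Probabilities(x);   // probability of each class
            double[] logLiks = nb.LogLikelihoods(x);  // log-likelihood of each class

            // Preallocated variant, mirroring the "avoiding unnecessary memory
            // allocations" overloads documented above.
            double[] buffer = new double[nb.NumberOfClasses];
            nb.Probabilities(x, buffer);

            Console.WriteLine("{0} [{1}] [{2}]",
                label, string.Join(", ", probs), string.Join(", ", logLiks));
        }
    }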
Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. The input vector. The class label predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. 
A set of input vectors. The labels predicted by the classifier. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. A set of input vectors. The labels predicted by the classifier. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Views this instance as a multi-class generative classifier. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Base class for score-based multi-label classifiers. The data type for the input data. Default is double[]. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. The class label associated with the input vector, as predicted by the classifier. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. 
An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and each class. The input vector. Computes a numerical score measuring the association between the given vector and each class. The input vector. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. 
The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. An array where the scores will be stored, avoiding unnecessary memory allocations. A class-label that best described according to this classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. An array where the scores will be stored, avoiding unnecessary memory allocations. 
A class-label that best describes the input vector according to this classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. The input vector. The class label predicted by the classifier. An array where the scores will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the scores will be stored, avoiding unnecessary memory allocations. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Views this instance as a multi-class generative classifier. This instance seen as an . Views this instance as a multi-class generative classifier. This instance seen as an . Base implementation for generative observation sequence taggers. A sequence tagger can predict the class label of each individual observation in an input sequence vector. The data type for the input data. Default is double[]. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger.
Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probabilities for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the probabilities for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the probabilities for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the probabilities for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Common base class for observation sequence taggers. Computes numerical scores measuring the association between each of the given vectors and each possible class.
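The observation-sequence tagger summaries above map naturally onto hidden Markov models, which are the main sequence models in Accord.NET. The sketch below is a loose, version-dependent illustration: it assumes the Accord 3.x Baum-Welch learner and the unified Decide/LogLikelihood members; in older releases the equivalent calls are Decode and Evaluate, and the learner is driven through a Run method instead.

    using System;
    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;

    class SequenceTaggerExample
    {
        static void Main()
        {
            // Discrete observation sequences (symbols 0..2).
            int[][] sequences =
            {
                new[] { 0, 1, 2, 2, 1 },
                new[] { 0, 0, 1, 2, 2 },
                new[] { 0, 1, 1, 2, 2 }
            };

            // Unsupervised Baum-Welch estimation of a small discrete HMM.
            var hmm = new HiddenMarkovModel(states: 2, symbols: 3);
            var teacher = new BaumWelchLearning(hmm)
            {
                Tolerance = 1e-4
            };
            teacher.Learn(sequences);

            int[] test = { 0, 1, 2, 2 };
            double logLikelihood = hmm.LogLikelihood(test); // how well the model explains the sequence
            int[] statePath      = hmm.Decide(test);        // one hidden-state label per observation

            Console.WriteLine("{0:F3} [{1}]", logLikelihood, string.Join(", ", statePath));
        }
    }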
Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Computes numerical scores measuring the association between each of the given vectors and each possible class. Base class for multi-class and multi-label classifiers. The data type for the input data. Default is double[]. Gets the number of classes expected and recognized by the classifier. The number of classes. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Base class for data transformation algorithms. The type for the input data that enters the model. Default is double[]. The type for the output data that exits from the model. Default is double[]. Gets the number of inputs accepted by the model. Gets the number of outputs generated by the model. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location where to store the result of this transformation. The output generated by applying this transformation to the given input. Base class for data transformation algorithms. The type for the input data that enters the model. Default is double[]. The type for the output data that exits from the model. Default is double[]. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input.
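The classifier base described just above exposes the batch decision overloads and the NumberOfClasses property shared by every classifier in this reference. The sketch below shows the pattern with a k-nearest-neighbors classifier from Accord.NET 3.x; the type name and data are assumptions, but the Decide/Scores calls mirror the summaries above, including the preallocated-result variant.

    using System;
    using Accord.MachineLearning;

    class BatchDecisionExample
    {
        static void Main()
        {
            double[][] inputs =
            {
                new[] { 0.0, 0.0 }, new[] { 0.5, 0.1 },
                new[] { 6.0, 6.2 }, new[] { 5.8, 6.1 }
            };
            int[] outputs = { 0, 0, 1, 1 };

            var knn = new KNearestNeighbors(3);
            knn.Learn(inputs, outputs);

            Console.WriteLine(knn.NumberOfClasses);      // classes recognized by the classifier

            // Batch decision into a caller-provided array (the "location where to
            // store the class-labels" overload documented above).
            int[] labels = new int[inputs.Length];
            knn.Decide(inputs, labels);

            double[] scores = knn.Scores(inputs[0]);     // association with each class
            Console.WriteLine("[{0}] [{1}]",
                string.Join(", ", labels), string.Join(", ", scores));
        }
    }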
Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location where to store the result of this transformation. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location where to store the result of this transformation. The output generated by applying this transformation to the given input. Base class for data transformation algorithms. The type for the output data that exits from the model. Default is double[]. Base class for data transformation algorithms. Common interface for unsupervised learning algorithms. The type for the model being learned. The type for the output data that originates from the model. Learns a model that can map the given inputs to the desired outputs. The model inputs. A model that has learned how to produce suitable outputs given the input data. Minimum (Mean) Distance Classifier. This is one of the simplest possible pattern recognition classifiers. It works by comparing a new input vector against the mean vector of each of the classes. The class whose mean is closest to this new input vector is considered the winner, and the vector will be classified as having the same label as this class. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the class means against which samples will be compared. Gets or sets the distance function to be used when comparing a sample to a class mean. Initializes a new instance of the class. Initializes a new instance of the class. The input points. The output labels associated with each input point. Initializes a new instance of the class. A distance function. Default is to use the distance. The input points. The output labels associated with each input point. Computes the label for the given input. The input value. The distances from the input to the class means. The output label assigned to this point. Computes the label for the given input. An input. The output label assigned to this point. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the result will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. System.Double. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Base class for multi-class and multi-label classifiers. The data type for the input data. Default is double[]. The data type for the classes. Default is int. Gets the number of classes expected and recognized by the classifier. The number of classes. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best describes the input vector according to this classifier.
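The Minimum (Mean) Distance Classifier described above is simple enough to show end to end. The sketch below sticks to the members listed in this reference (Learn, Means, Decide, Scores); it assumes the class lives in the Accord.MachineLearning namespace of Accord.NET 3.x, and the toy data is made up.

    using System;
    using Accord.MachineLearning;

    class MinimumMeanDistanceExample
    {
        static void Main()
        {
            double[][] inputs =
            {
                new[] { 0.0, 0.1 }, new[] { 0.2, 0.0 },   // class 0
                new[] { 4.0, 4.2 }, new[] { 4.1, 3.9 }    // class 1
            };
            int[] outputs = { 0, 0, 1, 1 };

            var classifier = new MinimumMeanDistanceClassifier();
            classifier.Learn(inputs, outputs);       // estimates one mean vector per class

            double[][] means = classifier.Means;     // the per-class means documented above

            double[] x = { 3.8, 4.0 };
            int label       = classifier.Decide(x);  // label of the closest class mean
            double[] scores = classifier.Scores(x);  // association of x with each class

            Console.WriteLine("{0} [{1}] ({2} means)",
                label, string.Join(", ", scores), means.Length);
        }
    }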
Computes class-label decisions for a given set of vectors. The input vectors that should be classified into one of the possible classes. The class-labels that best describe each vector according to this classifier. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. The location where to store the class-labels. A class-label that best describes the input vector according to this classifier. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location where to store the class-labels. The output generated by applying this transformation to the given input. Common interface for supervised learning algorithms for binary classifiers. The type for the model being learned. The type for the input data that enters the model. Common interface for supervised learning algorithms for binary classifiers. The type for the model being learned. Common interface for supervised learning algorithms. The type for the model being learned. The type for the input data that enters the model. The type for the output data that originates from the model. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Common interface for supervised learning algorithms for multi-class classifiers. The type for the model being learned. The type for the input data that enters the model. Common interface for supervised learning algorithms for multi-class classifiers. The type for the model being learned. Common interface for supervised learning algorithms for multi-label classifiers. The type for the model being learned. The type for the input data that enters the model. Common interface for supervised learning algorithms for multi-label classifiers. The type for the model being learned. Common interface for unsupervised learning algorithms. The type for the model being learned. The type for the output data that originates from the model. The type for the input data that enters the model. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data. Common base class for supervised learning algorithms for binary classifiers. The type for the model being learned. The type for the input data that enters the model. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the classifier being learned. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs.
The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Base class for multi-class learning algorithms. Gets or sets a cancellation token that can be used to cancel the algorithm while it is running. Gets or sets the model being learned. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Base class for multi-class learning algorithms. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Learns a model that can map the given inputs to the given outputs. The model inputs.
The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce the desired outputs given the inputs. Contains many statistical analyses, such as PCA, LDA, KPCA, KDA, PLS, ICA, Logistic Regression and Stepwise Logistic Regression Analyses. Also contains performance assessment analyses such as contingency tables and ROC curves. Base class for Principal Component Analyses (PCA and KPCA). Obsolete. Obsolete. Obsolete. Obsolete. Obsolete. Obsolete. Initializes a new instance of the class. Gets the column standard deviations of the source data given at method construction. Gets the column mean of the source data given at method construction. Gets or sets the method used by this analysis. Gets or sets whether calculations will be performed overwriting data in the original source matrix, using less memory. Gets or sets whether the transformation result should be whitened (have unit standard deviation) before it is returned. Gets or sets the number of outputs (dimensionality of the output vectors) that should be generated by this model. Gets the maximum number of outputs (dimensionality of the output vectors) that can be generated by this model. Gets or sets the amount of explained variance that should be generated by this model. This value will alter the number of outputs that can be generated by this model. Provides access to the Singular Values stored during the analysis. If a covariance method is chosen, then it will contain an empty vector. The singular values. Returns the original data supplied to the analysis. The original data matrix supplied to the analysis. Gets the resulting projection of the source data given at the creation of the analysis into the space spanned by the principal components. The resulting projection in principal component space. Gets a matrix whose columns contain the principal components. Also known as the Eigenvectors or loadings matrix. The matrix of principal components. Gets a matrix whose columns contain the principal components. Also known as the Eigenvectors or loadings matrix. The matrix of principal components. Provides access to the Eigenvalues stored during the analysis. The Eigenvalues. Gets or sets a cancellation token that can be used to cancel the algorithm while it is running. The respective role each component plays in the data set. The component proportions. The cumulative distribution of the component proportions. Also known as the cumulative energy of the principal components. The cumulative proportions. Gets the Principal Components in an object-oriented structure. The collection of principal components. Returns the minimal number of principal components required to represent a given percentile of the data. The percentile of the data requiring representation. The minimal number of components required. Creates additional information about principal components. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Obsolete. Projects a given matrix into principal component space. The matrix to be projected. The number of components to consider.
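The principal component analysis members above (Method, Whiten, NumberOfOutputs, the component proportions and the Transform overloads) combine as in the following sketch, which assumes the Accord.NET 3.x PrincipalComponentAnalysis class; the property names follow the summaries above but should be checked against your version, and the data is a toy placeholder.

    using System;
    using Accord.Statistics.Analysis;

    class PcaExample
    {
        static void Main()
        {
            double[][] data =
            {
                new[] { 2.5, 2.4 }, new[] { 0.5, 0.7 },
                new[] { 2.2, 2.9 }, new[] { 1.9, 2.2 },
                new[] { 3.1, 3.0 }, new[] { 2.3, 2.7 }
            };

            var pca = new PrincipalComponentAnalysis()
            {
                Method = PrincipalComponentMethod.Center, // operate on the covariance matrix
                Whiten = true                             // unit-variance outputs, as described above
            };

            pca.Learn(data);                  // estimates the components
            pca.NumberOfOutputs = 1;          // keep only the first component

            double[][] projection = pca.Transform(data);

            double[] proportions = pca.ComponentProportions;   // variance explained per component
            double[] cumulative  = pca.CumulativeProportions;  // cumulative explained variance

            Console.WriteLine("[{0}] [{1}] {2}",
                string.Join(", ", proportions), string.Join(", ", cumulative), projection.Length);
        }
    }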
Projects a given matrix into principal component space. The matrix to be projected. The number of components to consider. Projects a given matrix into principal component space. The matrix to be projected. The number of components to consider. Common interface for information components. Those are present in multivariate analysis, such as and . Gets the index for this component. Gets the proportion, or amount of information explained by this component. Gets the cumulative proportion of all discriminants up to this component. Determines the method to be used in a statistical analysis. By choosing Center, the method will be run on the mean-centered data. In Principal Component Analysis this means the method will operate on the Covariance matrix of the given data. By choosing Standardize, the method will be run on the mean-centered and standardized data. In Principal Component Analysis this means the method will operate on the Correlation matrix of the given data. One should always choose to standardize when dealing with different units of variables. Determines the method to be used in a statistical analysis. By choosing Center, the method will be run on the mean-centered data. In Principal Component Analysis this means the method will operate on the Covariance matrix of the given data. By choosing Standardize, the method will be run on the mean-centered and standardized data. In Principal Component Analysis this means the method will operate on the Correlation matrix of the given data. One should always choose to standardize when dealing with different units of variables. By choosing CorrelationMatrix, the method will interpret the given data as a correlation matrix. By choosing CovarianceMatrix, the method will interpret the given data as a covariance matrix. By choosing KernelMatrix, the method will interpret the given data as a Kernel (Gram) matrix. Common interface for statistical analysis. Computes the analysis using given source data and parameters. Common interface for descriptive measures, such as and . Gets the variable's index. Gets the variable's name. Gets the variable's total sum. Gets the variable's mean. Gets the variable's standard deviation. Gets the variable's median. Gets the variable's outer fences range. Gets the variable's inner fence range. Gets the variable's interquartile range. Gets the variable's mode. Gets the variable's variance. Gets the variable's skewness. Gets the variable's kurtosis. Gets the variable's standard error of the mean. Gets the variable's maximum value. Gets the variable's minimum value. Gets the variable's length. Gets the number of distinct values for the variable. Gets the number of samples for the variable. Gets the 95% confidence interval around the . Gets the 95% deviance interval around the . Gets the variable's observations. Gets a confidence interval for the within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Gets a deviance interval for the within the given confidence level percentage (i.e. uses the standard deviation rather than the standard error to compute the range interval for the variable). The confidence level. Default is 0.95. A confidence interval for the estimated value. Common interface for projective statistical analysis. Projects new data into latent space. Projects new data into latent space with given number of dimensions. Common interface for multivariate regression analysis. 
Regression analysis attempts to express many numerical dependent variables as combinations of other features or measurements. Gets the dependent variables' values for each of the source input points. Common interface for regression analysis. Regression analysis attempts to express one numerical dependent variable as a combination of other features or measurements. When the dependent variable is a category label, the class of analysis methods is known as discriminant analysis. Gets the dependent variable value for each of the source input points. Common interface for discriminant analysis. Discriminant analysis attempts to express one categorical dependent variable as a combination of other features or measurements. When the dependent variable is a numerical quantity, the class of analysis methods is known as regression analysis. Gets the classification labels (the dependent variable) for each of the source input points. Exponential contrast function. According to Hyvärinen, the Exponential contrast function may be used when the independent components are highly super-Gaussian or when robustness is very important. Initializes a new instance of the class. The exponential alpha constant. Default is 1. Gets the exponential alpha constant. Initializes a new instance of the class. Contrast function. The vector of observations. At method's return, this parameter should contain the evaluation of function over the vector of observations . At method's return, this parameter should contain the evaluation of function derivative over the vector of observations . Common interface for contrast functions. Contrast functions are used as objective functions in neg-entropy calculations. Contrast function. The vector of observations. At method's return, this parameter should contain the evaluation of function over the vector of observations . At method's return, this parameter should contain the evaluation of function derivative over the vector of observations . Kurtosis contrast function. According to Hyvärinen, the kurtosis contrast function is justified on statistical grounds only for estimating sub-Gaussian independent components when there are no outliers. Initializes a new instance of the class. Contrast function. The vector of observations. At method's return, this parameter should contain the evaluation of function over the vector of observations . At method's return, this parameter should contain the evaluation of function derivative over the vector of observations . Log-cosh (Hyperbolic Tangent) contrast function. According to Hyvärinen, the Logcosh contrast function is a good general-purpose contrast function. Initializes a new instance of the class. Initializes a new instance of the class. The log-cosh alpha constant. Default is 1. Gets the log-cosh alpha constant. Contrast function. The vector of observations. At method's return, this parameter should contain the evaluation of function over the vector of observations . At method's return, this parameter should contain the evaluation of function derivative over the vector of observations . Descriptive statistics analysis for circular data. Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). The names for the analyzed variables. Whether the analysis should conserve memory by doing operations over the original array. Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). 
Whether the analysis should conserve memory by doing operations over the original array. Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). Names for the analyzed variables. Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). Constructs the Circular Descriptive Analysis. The source data to perform analysis. The length of each circular variable (i.e. 24 for hours). Names for the analyzed variables. Constructs the Circular Descriptive Analysis. The length of each circular variable (i.e. 24 for hours). Names for the analyzed variables. Constructs the Circular Descriptive Analysis. The length of each circular variable (i.e. 24 for hours). Computes the analysis using given source data and parameters. Learns a model that can map the given inputs to the desired outputs. The model inputs. A model that has learned how to produce suitable outputs given the input data . Gets or sets whether all reported statistics should respect the circular interval. For example, setting this property to false would allow the , , and properties to report minimum and maximum values outside the variable's allowed circular range. Default is true. Gets the source matrix from which the analysis was run. Gets the source matrix from which the analysis was run. Gets or sets the method to be used when computing quantiles (median and quartiles). The quantile method. Gets or sets whether the properties of this class should be computed only when necessary. If set to true, a copy of the input data will be maintained inside an instance of this class, using more memory. Gets the source matrix from which the analysis was run. Gets the column names from the variables in the data. Gets a vector containing the length of the circular domain for each data column. Gets a vector containing the Mean of each data column. Gets a vector containing the Mode of each data column. Gets a vector containing the Standard Deviation of each data column. Gets a vector containing the Standard Error of the Mean of each data column. Gets the 95% confidence intervals for the . Gets the 95% deviance intervals for the . A deviance interval uses the standard deviation rather than the standard error to compute the range interval for a variable. Gets a vector containing the Median of each data column. Gets a vector containing the Variance of each data column. Gets a vector containing the number of distinct elements for each data column. Gets an array containing the Ranges of each data column. Gets an array containing the interquartile range of each data column. Gets an array containing the inner fences of each data column. Gets an array containing the outer fences of each data column. Gets an array containing the sum of each data column. If the analysis has been computed in place, this will contain the sum of the transformed angle values instead. Gets an array containing the sum of cosines for each data column. Gets an array containing the sum of sines for each data column. Gets an array containing the circular concentration for each data column. Gets an array containing the skewness of each data column. Gets an array containing the kurtosis of each data column. Gets the number of samples (or observations) in the data. 
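A minimal sketch of how the circular descriptive analysis described above might be used for a single hour-of-day column; the constructor overload shown (and whether the circular lengths are passed as doubles) is an assumption to be checked against the current API:

using Accord.Statistics.Analysis;

// A single circular variable: hour of the day, with a period of 24.
double[,] hours =
{
    { 23.5 },
    {  0.5 },
    {  1.0 },
    { 22.0 }
};

// The second argument gives the circular length of each column
// (argument type assumed here; check the constructor overloads).
var circular = new CircularDescriptiveAnalysis(hours, new double[] { 24 });
circular.Compute();

double meanHour = circular.Means[0];                // circular mean
double deviation = circular.StandardDeviations[0];  // circular standard deviation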
Gets the number of variables (or features) in the data. Gets a collection of DescriptiveMeasures objects that can be bound to a DataGridView. Gets a confidence interval for the within the given confidence level percentage. The confidence level. Default is 0.95. The index of the data column whose confidence interval should be calculated. A confidence interval for the estimated value. Gets a deviance interval for the within the given confidence level percentage (i.e. uses the standard deviation rather than the standard error to compute the range interval for the variable). The confidence level. Default is 0.95. The index of the data column whose confidence interval should be calculated. A confidence interval for the estimated value. Circular descriptive measures for a variable. Gets the circular analysis that originated this measure. Gets the variable's index. Gets the variable's name Gets the variable's total sum. Gets the variable's mean. Gets the variable's standard deviation. Gets the variable's median. Gets the variable's mode. Gets the variable's outer fences range. Gets the variable's inner fence range. Gets the variable's interquartile range. Gets the variable's variance. Gets the variable's maximum value. Gets the variable's minimum value. Gets the variable's length. Gets the number of distinct values for the variable. Gets the number of samples for the variable. Gets the sum of cosines for the variable. Gets the sum of sines for the variable. Gets the transformed variable's observations. Gets the variable's standard error of the mean. Gets the 95% confidence interval around the . Gets the 95% deviance interval around the . Gets the variable's observations. Gets the variable skewness. Gets the variable kurtosis. Gets a confidence interval for the within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Gets a deviance interval for the within the given confidence level percentage (i.e. uses the standard deviation rather than the standard error to compute the range interval for the variable). The confidence level. Default is 0.95. A confidence interval for the estimated value. Collection of descriptive measures. Gets the key for item. Distribution fitness analysis. The distribution analysis class can be used to perform a battery of distribution fitting tests in order to check from which distribution a sample is more likely to have come from. Gets the tested distribution names. The distribution names. Gets the estimated distributions. The estimated distributions. Gets or sets a mapping of fitting options that should be used when attempting to estimate each of the distributions in . Gets the Kolmogorov-Smirnov tests performed against each of the candidate distributions. Gets the Chi-Square tests performed against each of the candidate distributions. Gets the Anderson-Darling tests performed against each of the candidate distributions. Gets the rank of each distribution according to the Kolmogorov-Smirnov test statistic. A value of 0 means the distribution is the most likely. Gets the rank of each distribution according to the Chi-Square test statistic. A value of 0 means the distribution is the most likely. Gets the rank of each distribution according to the Anderson-Darling test statistic. A value of 0 means the distribution is the most likely. Gets the goodness of fit for each candidate distribution. Initializes a new instance of the class. The observations to be fitted against candidate distributions. 
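A minimal sketch of the distribution fitting analysis described above; the property names on the goodness-of-fit entries (Name, KolmogorovSmirnovRank) are assumptions based on the member descriptions and may differ slightly:

using Accord.Statistics.Analysis;

double[] observations = { 0.2, 1.3, 0.7, 2.1, 0.9, 1.8, 0.4, 1.1 };

// Fit the sample against the candidate distributions.
var analysis = new DistributionAnalysis();
analysis.Learn(observations);

// Inspect the goodness-of-fit results (rank 0 indicates the most likely candidate).
var fit = analysis.GoodnessOfFit;
string name = fit[0].Name;                 // name of the first candidate in the collection
int rank = fit[0].KolmogorovSmirnovRank;   // its rank under the Kolmogorov-Smirnov test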
Initializes a new instance of the class. Obsolete. Please use the method instead. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Gets all univariate distributions (types implementing ) loaded in the current domain. Gets all multivariate distributions (types implementing ) loaded in the current domain. Gets a distribution's name in a human-readable form. The distribution whose name must be obtained. Gets the index of the first distribution with the given name. Base class for Discriminant Analysis (LDA, QDA or KDA). Obsolete. Initializes common properties. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Returns the original supplied data to be analyzed. Gets or sets the minimum variance proportion needed to keep a discriminant component. If set to zero, all components will be kept. Default is 0.001 (all components which contribute less than 0.001 to the variance in the data will be discarded). Gets the resulting projection of the source data given on the creation of the analysis into discriminant space. Gets the original classifications (labels) of the source data given on the moment of creation of this analysis object. Gets the number of samples used to create the analysis. Gets the number of classes in the analysis. Gets the mean of the original data given at method construction. Gets the standard mean of the original data given at method construction. Gets the Within-Class Scatter Matrix for the data. Gets the Between-Class Scatter Matrix for the data. Gets the Total Scatter Matrix for the data. Gets the Eigenvectors obtained during the analysis, composing a basis for the discriminant factor space. Gets the Eigenvectors obtained during the analysis, composing a basis for the discriminant factor space. Gets the Eigenvalues found by the analysis associated with each vector of the ComponentMatrix matrix. Gets the level of importance each discriminant factor has in discriminant space. Also known as amount of variance explained. The cumulative distribution of the discriminants factors proportions. Also known as the cumulative energy of the first dimensions of the discriminant space or as the amount of variance explained by those dimensions. Gets the discriminant factors in a object-oriented fashion. Gets information about the distinct classes in the analyzed data. Gets the Scatter matrix for each class. Gets the Mean vector for each class. Gets the feature space mean of the projected data. Gets the Standard Deviation vector for each class. Gets the observation count for each class. Obsolete. Obsolete. Obsolete. Obsolete. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Returns the minimum number of discriminant space dimensions (discriminant factors) required to represent a given percentile of the data. The percentile of the data requiring representation. The minimal number of dimensions required. 
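A minimal sketch of how a concrete discriminant analysis derived from this base class might be used, assuming the LinearDiscriminantAnalysis class and the Learn/Decide/Transform pattern used elsewhere in this namespace:

using Accord.Statistics.Analysis;

double[][] inputs =
{
    new double[] { 4.2, 1.0 },
    new double[] { 3.9, 0.8 },
    new double[] { 1.1, 3.0 },
    new double[] { 0.9, 3.2 }
};
int[] outputs = { 0, 0, 1, 1 };   // class labels for each input vector

// Learn the discriminant analysis from the labelled data.
var lda = new LinearDiscriminantAnalysis();
var classifier = lda.Learn(inputs, outputs);

// Classify samples and project them into discriminant space.
int[] predicted = classifier.Decide(inputs);
double[][] projection = lda.Transform(inputs);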
Returns the number of discriminant space dimensions (discriminant factors) whose variance is greater than a given threshold. Classifies a new instance into one of the available classes. Classifies a new instance into one of the available classes. Classifies new instances into one of the available classes. Gets the output of the discriminant function for a given class. Creates additional information about principal components. Goodness-of-fit result for a given distribution. Gets the analysis that has produced this measure. Gets the variable's index. Gets the distribution name. Gets (a clone of) the measured distribution. The distribution associated with this goodness-of-fit measure. Gets the value of the Kolmogorov-Smirnov statistic. The Kolmogorov-Smirnov for the . Gets the rank of this distribution according to the Kolmogorov-Smirnov test. An integer value where 0 indicates most probable. Gets the value of the Chi-Square statistic. The Chi-Square for the . Gets the rank of this distribution according to the Chi-Square test. An integer value where 0 indicates most probable. Gets the value of the Anderson-Darling statistic. The Anderson-Darling for the . Gets the rank of this distribution according to the Anderson-Darling test. An integer value where 0 indicates most probable. Compares the current object with another object of the same type. An object to compare with this object. A value that indicates the relative order of the objects being compared. The return value has the following meanings: less than zero, this object is less than the parameter; zero, this object is equal to the parameter; greater than zero, this object is greater than the parameter. Compares the current instance with another object of the same type and returns an integer that indicates whether the current instance precedes, follows, or occurs in the same position in the sort order as the other object. An object to compare with this instance. A value that indicates the relative order of the objects being compared. The return value has these meanings: less than zero, this instance precedes the other object in the sort order; zero, this instance occurs in the same position in the sort order as the other object; greater than zero, this instance follows the other object in the sort order. Returns a that represents this instance. A that represents this instance. Returns a that represents this instance. The format to use, or a null reference (Nothing in Visual Basic) to use the default format defined for the type of the implementation. The provider to use to format the value, or a null reference (Nothing in Visual Basic) to obtain the numeric format information from the current locale setting of the operating system. A that represents this instance. Collection of goodness-of-fit measures. Gets the key for item. Multinomial Logistic Regression Analysis. In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes.[1] That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.). Multinomial logistic regression is known by a variety of other names, including multiclass LR, multinomial regression,[2] softmax regression, multinomial logit, maximum entropy (MaxEnt) classifier, and conditional maximum entropy model. References: Wikipedia contributors. 
"Multinomial logistic regression." Wikipedia, The Free Encyclopedia, 1st April, 2015. Available at: https://en.wikipedia.org/wiki/Multinomial_logistic_regression The first example shows how to reproduce a textbook example using categorical and categorical-with-baseline variables. Those variables can be transformed/factored to their respective representations using the class. However, please note that while this example uses features from the class, the use of this class is not required when learning a model. The second example shows how to learn a from the famous Fisher's Iris dataset. This example should demonstrate that filters are not required to successfully learn multinomial logistic regression analyses. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Source data used in the analysis. Source data used in the analysis. Gets the dependent variable value for each of the source input points. Gets the dependent variable value for each of the source input points. Gets the resulting values obtained by the regression model. Gets or sets the maximum number of iterations to be performed by the regression algorithm. Default is 50. Gets or sets the difference between two iterations of the regression algorithm when the algorithm should stop. The difference is calculated based on the largest absolute parameter change of the regression. Default is 1e-5. Gets the number of outputs in the regression problem. Gets the Standard Error for each coefficient found during the logistic regression. Gets the Regression model created and evaluated by this analysis. Gets the value of each coefficient. Gets the Log-Likelihood for the model. Gets the Chi-Square (Likelihood Ratio) Test for the model. Gets the Deviance of the model. Gets the Wald Tests for each coefficient. Obsolete. Please use instead. Gets or sets the name of the input variables for the model. Gets or sets the name of the output variable for the model. Gets the Confidence Intervals (C.I.) for each coefficient found in the regression. Gets the collection of coefficients of the model. Constructs a Multinomial Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. Constructs a Multinomial Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. Constructs a Multinomial Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. The names of the input variables. The names of the output variables. Constructs a Multinomial Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. The names of the input variables. The names of the output variables. Constructs a Multinomial Logistic Regression Analysis. The names of the input variables. The names of the output variables. Constructs a Multinomial Logistic Regression Analysis. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. 
The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Computes the Multinomial Logistic Regression Analysis. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Represents a Multinomial Logistic Regression coefficient found in the multinomial logistic regression analysis allowing it to be bound to controls like the DataGridView. This class cannot be instantiated. Creates a regression coefficient representation. The analysis to which this coefficient belongs. The coefficient's index. The coefficient's category. Gets the Index of this coefficient on the original analysis coefficient collection. Returns a reference to the parent analysis object. Gets the name of the category that this coefficient belongs to. Gets the name for the current coefficient. Gets the coefficient value. Gets the Standard Error for the current coefficient. Gets the confidence interval (C.I.) for the current coefficient. Gets the upper limit for the confidence interval. Gets the lower limit for the confidence interval. Returns a that represents this instance. A that represents this instance. Represents a Collection of Multinomial Logistic Regression Coefficients found in the . This class cannot be instantiated. Weighted confusion matrix for multi-class decision problems. References: R. G. Congalton. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data. Available on: http://uwf.edu/zhu/evr6930/2.pdf G. Banko. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data and of Methods Including Remote Sensing Data in Forest Inventory. Interim report. Available on: http://www.iiasa.ac.at/Admin/PUB/Documents/IR-98-081.pdf Gets the Weights matrix. Creates a new Confusion Matrix. Creates a new Confusion Matrix. Creates a new Confusion Matrix. Gets the row marginals (proportions). Gets the column marginals (proportions). Gets the Kappa coefficient of performance. Gets the standard error of the coefficient of performance. Gets the variance of the coefficient of performance. Gets the variance of the under the null hypothesis that the underlying Kappa value is 0. Gets the standard error of the under the null hypothesis that the underlying Kappa value is 0. Overall agreement. Chance agreement. The chance agreement tells how many samples were correctly classified by chance alone. Creates a new Weighted Confusion Matrix with linear weighting. Creates a new Weighted Confusion Matrix with linear weighting. Class to perform a Procrustes Analysis. Procrustes analysis is a form of statistical shape analysis used to analyze the distribution of a set of shapes. It makes it possible to compare shapes (datasets) that have different rotations, scales and positions. It defines a measure called Procrustes distance that indicates how different the shapes are. References: Wikipedia contributors. "Procrustes analysis" Wikipedia, The Free Encyclopedia, 21 Sept. 2015. Available at: https://en.wikipedia.org/wiki/Procrustes_analysis Amy Ross. "Procrustes Analysis". Available at:

// This example shows how to use the Procrustes Analysis on basic shapes.
// We're about to demonstrate that a diamond with four identical edges is also a square!

// Define a square
double[,] square = { { 100, 100 }, { 300, 100 }, { 300, 300 }, { 100, 300 } };

// Define a diamond with different orientation and scale
double[,] diamond = { { 170, 120 }, { 220, 170 }, { 270, 120 }, { 220, 70 } };

// Create the Procrustes analysis object
ProcrustesAnalysis pa = new ProcrustesAnalysis(square, diamond);

// Compute the analysis on the square and the diamond
pa.Compute();

// Assert that the diamond is a square
Debug.Assert(pa.ProcrustesDistances[0, 1].IsEqual(0.0, 1E-11));

// Transform the diamond to a square
double[,] diamond_to_a_square = pa.ProcrustedDatasets[1].Transform(pa.ProcrustedDatasets[0]);

Creates a Procrustes Analysis object using the given sample data. Data containing multiple datasets to run the analysis on. Creates an empty Procrustes Analysis object. Source data given to run the analysis. Applies the translation operator to translate the dataset to the zero coordinate. The dataset to translate. The translated dataset. Applies the translation operator to translate the dataset to the given coordinate. Dataset to translate. New center of the dataset. The translated dataset. Applies the scale operator to scale the data to the unitary scale. Dataset to scale. The scaled dataset. Calculates the scale of the given dataset. Dataset to find the scale. The scale of the dataset. Applies the scale operator to scale the data to the given scale. Dataset to scale. Final scale of the output dataset. Scaled dataset. Applies the rotation operator to the given dataset according to the reference dataset. Procrusted dataset to rotate. Reference procrusted dataset. The rotated dataset. Updates the Procrustes Distances according to a set of Procrusted samples. Procrusted samples. Calculates the Procrustes Distance between two sets of data. First data set. Second data set. The Procrustes distance. Procrustes Distances of the computed samples. Procrusted models produced from the computed sample data. Applies Procrustes translation and scale to the given dataset. Procrusted dataset to process and store the results to. The dataset itself. Computes the Procrustes analysis to extract Procrustes distances and models using the constructor parameters. Computes the Procrustes analysis to extract Procrustes distances and models. List of sample data sets to analyze. Procrustes distances of the analyzed samples. Computes the Procrustes analysis to extract Procrustes distances and models by specifying the reference dataset. Index of the reference dataset. If out of bounds of the sample array, the first dataset is used. List of sample data sets to analyze. Procrustes distances of the analyzed samples. Class to represent an original dataset, its Procrustes form and all necessary data (i.e. rotation, center, scale...). Original dataset. Procrustes dataset (i.e. original dataset after Procrustes analysis). Original dataset center. Original dataset scale. Original dataset rotation matrix. Transforms the dataset to match the given reference original dataset. Dataset to match. The transformed dataset matched to the reference. Cox's Proportional Hazards Survival Analysis. Proportional hazards models are a class of survival models in statistics. Survival models relate the time that passes before some event occurs to one or more covariates that may be associated with that quantity. In a proportional hazards model, the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate. 
For example, taking a drug may halve one's hazard rate for a stroke occurring, or, changing the material from which a manufactured component is constructed may double its hazard rate for failure. Other types of survival models such as accelerated failure time models do not exhibit proportional hazards. These models could describe a situation such as a drug that reduces a subject's immediate risk of having a stroke, but where there is no reduction in the hazard rate after one year for subjects who do not have a stroke in the first year of analysis. This class uses the to extract more detailed information about a given problem, such as confidence intervals, hypothesis tests and performance measures. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. The resulting table is shown below. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output data for the analysis. The right-censoring indicative values. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output data for the analysis. The right-censoring indicative values. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output, binary data for the analysis. The right-censoring indicative values. The names of the input variables. The name of the time variable. The name of the event indication variable. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output data for the analysis. The right-censoring indicative values. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output data for the analysis. The right-censoring indicative values. Constructs a new Cox's Proportional Hazards Analysis. The input data for the analysis. The output, binary data for the analysis. The right-censoring indicative values. The names of the input variables. The name of the time variable. The name of the event indication variable. Constructs a new Cox's Proportional Hazards Analysis. The names of the input variables. The name of the time variable. The name of the event indication variable. Constructs a new Cox's Proportional Hazards Analysis. Gets or sets the maximum number of iterations to be performed by the regression algorithm. Default is 50. Gets or sets the difference between two iterations of the regression algorithm when the algorithm should stop. The difference is calculated based on the largest absolute parameter change of the regression. Default is 1e-5. Source data used in the analysis. Gets the time passed until the event occurred or until the observation was censored. Gets whether the event of interest happened or not. Gets the dependent variable value for each of the source input points. Gets the resulting probabilities obtained by the logistic regression model. Gets the Proportional Hazards model created and evaluated by this analysis. Gets the collection of coefficients of the model. Gets the Log-Likelihood for the model. Gets the Chi-Square (Likelihood Ratio) Test for the model. Gets the Deviance of the model. Gets or sets the name of the input variables for the model. Gets or sets the name of the output variable for the model. Gets or sets the name of event occurrence variable in the model. 
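A minimal sketch of the proportional hazards analysis described above, using the constructor-plus-Compute style documented here; whether censoring is passed as integers or as a SurvivalOutcome value depends on the overload and should be verified:

using Accord.Statistics.Analysis;

double[][] inputs =
{
    new double[] { 50 },
    new double[] { 70 },
    new double[] { 45 },
    new double[] { 35 }
};
double[] times  = {  1, 35, 69, 130 };  // time observed for each subject
int[]    censor = {  0,  1,  1,   0 };  // 1 = event observed, 0 = right-censored

// Argument order follows the constructor description above: inputs, times, censoring.
var cox = new ProportionalHazardsAnalysis(inputs, times, censor);
cox.Compute();   // returns true if the model converged

double logLikelihood = cox.LogLikelihood;
// cox.Coefficients can be bound to a DataGridView for inspection.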
Gets the Hazard Ratio for each coefficient found during the proportional hazards. Gets the Standard Error for each coefficient found during the proportional hazards. Gets the Wald Tests for each coefficient. Gets the Likelihood-Ratio Tests for each coefficient. Gets the value of each coefficient. Gets the 95% Confidence Intervals (C.I.) for each coefficient found in the regression. Gets the Log-Likelihood Ratio between this model and another model. Another proportional hazards model. The Likelihood-Ratio between the two models. Computes the Proportional Hazards Analysis for an already computed regression. Computes the Proportional Hazards Analysis. True if the model converged, false otherwise. Computes the Proportional Hazards Analysis. The difference between two iterations of the regression algorithm when the algorithm should stop. If not specified, the value of 1e-4 will be used. The difference is calculated based on the largest absolute parameter change of the regression. The maximum number of iterations to be performed by the regression algorithm. True if the model converged, false otherwise. Learns a model that can map the given inputs to the given outputs. The model inputs. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given and . Learns a model that can map the given inputs to the given outputs. The model inputs. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given and . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Represents a Proportional Hazards Coefficient found in the Cox's Hazards model, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated outside the . Gets the name for the current coefficient. Gets the Odds ratio for the current coefficient. Gets the Standard Error for the current coefficient. Gets the 95% confidence interval (C.I.) for the current coefficient. Gets the upper limit for the 95% confidence interval. Gets the lower limit for the 95% confidence interval. Gets the coefficient value. Gets the Wald's test performed for this coefficient. Gets the Likelihood-Ratio test performed for this coefficient. Represents a collection of Hazard Coefficients found in the . This class cannot be instantiated. Multiple Linear Regression Analysis Linear regression is an approach to model the relationship between a single scalar dependent variable y and one or more explanatory variables x. This class uses a to extract information about a given problem, such as confidence intervals, hypothesis tests and performance measures. 
This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. References: Wikipedia contributors. "Linear regression." Wikipedia, The Free Encyclopedia, 4 Nov. 2012. Available at: http://en.wikipedia.org/wiki/Linear_regression The first example shows how to learn a multiple linear regression analysis from a dataset already given in matrix form (using jagged double[][] arrays).

// Now we can show a summary of analysis
// Accord.Controls.DataGridBox.Show(regression.Coefficients);

// We can also show a summary ANOVA
DataGridBox.Show(regression.Table);

The second example shows how to learn a multiple linear regression analysis using data given in the form of a System.Data.DataTable. This data is also heterogeneous, mixing both discrete (symbol) variables and continuous variables. This example is also available for . Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Source data used in the analysis. Source data used in the analysis. Gets the dependent variable value for each of the source input points. Gets the resulting values obtained by the linear regression model. Gets or sets the learning algorithm used to learn the . Gets the standard deviation of the errors. Gets the information matrix obtained during learning. Gets the coefficient of determination, also known as R². Gets the number of samples used to compute the analysis. Gets the adjusted coefficient of determination, also known as adjusted R². Gets an F-Test between the expected outputs and results. Gets a Z-Test between the expected outputs and the results. Gets a Chi-Square Test between the expected outputs and the results. Gets the Standard Error for each coefficient found during the linear regression. Gets the Regression model created and evaluated by this analysis. Gets the value of each coefficient. Gets or sets the name of the input variables for the model. Gets or sets the name of the output variable for the model. Gets the Confidence Intervals (C.I.) for each coefficient found in the regression. Gets the ANOVA table for the analysis. Gets the collection of coefficients of the model. Constructs a Multiple Linear Regression Analysis. True to use an intercept term, false otherwise. Default is false. Constructs a Multiple Linear Regression Analysis. The input data for the analysis. The output data for the analysis. True to use an intercept term, false otherwise. Default is false. Constructs a Multiple Linear Regression Analysis. The input data for the analysis. The output data for the analysis. True to use an intercept term, false otherwise. Default is false. The names of the input variables. The name of the output variable. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Computes the Multiple Linear Regression Analysis. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Gets the prediction interval for a given input. Gets the confidence interval for a given input. Represents a Linear Regression coefficient found in the Multiple Linear Regression Analysis allowing it to be bound to controls like the DataGridView. 
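A minimal sketch of the multiple linear regression analysis described above, learned from jagged arrays as in the first example; the RSquared property name is assumed from the member descriptions:

using Accord.Statistics.Analysis;

double[][] inputs =
{
    new double[] { 1, 1 },
    new double[] { 2, 1 },
    new double[] { 3, 2 },
    new double[] { 4, 3 }
};
double[] outputs = { 2.1, 3.2, 4.9, 6.8 };

// Use an intercept term so the regression can fit an offset.
var mlra = new MultipleLinearRegressionAnalysis(intercept: true);
var regression = mlra.Learn(inputs, outputs);

double[] predicted = regression.Transform(inputs);  // fitted values
double r2 = mlra.RSquared;                          // coefficient of determination
// mlra.Coefficients and mlra.Table can be bound to a DataGridView.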
This class cannot be instantiated. Creates a regression coefficient representation. The analysis to which this coefficient belongs. The coefficient index. Gets the Index of this coefficient on the original analysis coefficient collection. Returns a reference to the parent analysis object. Gets the name for the current coefficient. Gets a value indicating whether this coefficient is an intercept term. true if this coefficient is the intercept; otherwise, false. Gets the coefficient value. Gets the Standard Error for the current coefficient. Gets the T-test performed for this coefficient. Gets the F-test performed for this coefficient. Gets the confidence interval (C.I.) for the current coefficient. Gets the upper limit for the confidence interval. Gets the lower limit for the confidence interval. Represents a Collection of Linear Regression Coefficients found in the . This class cannot be instantiated. Gets or sets the size of the confidence intervals reported for the coefficients. Default is 0.95. Binary confusion matrix for binary decision problems. For multi-class decision problems, please see . References: R. G. Congalton. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data. Available on: http://uwf.edu/zhu/evr6930/2.pdf G. Banko. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data and of Methods Including Remote Sensing Data in Forest Inventory. Interim report. Available on: http://www.iiasa.ac.at/Admin/PUB/Documents/IR-98-081.pdf The first example shows how a confusion matrix can be constructed from a vector of expected (ground-truth) values and their associated predictions (as done by a test, procedure or machine learning classifier): The second example shows how to construct a binary confusion matrix directly from a classifier and a dataset: Constructs a new Confusion Matrix. Constructs a new Confusion Matrix. Constructs a new Confusion Matrix. The values predicted by the model. The actual, truth values from the data. Constructs a new Confusion Matrix. The values predicted by the model. The actual, truth values from the data. Constructs a new Confusion Matrix. The values predicted by the model. The actual, truth values from the data. The integer label which identifies a value as positive. Constructs a new Confusion Matrix. The values predicted by the model. The actual, truth values from the data. The integer label which identifies a value as positive. The integer label which identifies a value as negative. Gets the confusion matrix in count matrix form. The table is listed as true positives, false negatives on its first row, false positives and true negatives in its second row. Gets the marginal sums for table rows. Returns a vector with the sum of true positives and false negatives on its first position, and the sum of false positives and true negatives on the second. Gets the marginal sums for table columns. Returns a vector with the sum of true positives and false positives on its first position, and the sum of false negatives and true negatives on the second. Gets the number of observations for this matrix. Gets the number of classes in this decision problem. Gets the number of observations for this matrix Gets the number of actual positives. The number of positives cases can be computed by taking the sum of true positives and false negatives. Gets the number of actual negatives The number of negatives cases can be computed by taking the sum of true negatives and false positives. Gets the number of predicted positives. 
The number of cases predicted as positive by the test. This value can be computed by adding the true positives and false positives. Gets the number of predicted negatives. The number of cases predicted as negative by the test. This value can be computed by adding the true negatives and false negatives. Cases correctly identified by the system as positives. Cases correctly identified by the system as negatives. Cases incorrectly identified by the system as positives. Cases incorrectly identified by the system as negatives. Sensitivity, also known as True Positive Rate. The Sensitivity is calculated as TPR = TP / (TP + FN). Specificity, also known as True Negative Rate. The Specificity is calculated as TNR = TN / (FP + TN). It can also be calculated as: TNR = (1-False Positive Rate). Efficiency, the arithmetic mean of sensitivity and specificity. Gets the number of errors between the expected and predicted values. Gets the number of hits between the expected and predicted values. Accuracy, or raw performance of the system. The Accuracy is calculated as ACC = (TP + TN) / (P + N). Error rate, or 1 - accuracy. The Error rate is calculated as ERR = (FP + FN) / (P + N). Prevalence of outcome occurrence. Gets the diagonal of the confusion matrix. Positive Predictive Value, also known as Positive Precision. The Positive Predictive Value tells us how likely it is that a patient has a disease, given that the test for this disease is positive. The Positive Predictive Value is calculated as PPV = TP / (TP + FP). Negative Predictive Value, also known as Negative Precision. The Negative Predictive Value tells us how likely it is that the disease is NOT present for a patient, given that the patient's test for the disease is negative. The Negative Predictive Value is calculated as NPV = TN / (TN + FN). False Positive Rate, also known as false alarm rate. The rate of false alarms in a test. The False Positive Rate can be calculated as FPR = FP / (FP + TN) or FPR = (1-specificity). False Discovery Rate, or the expected false positive rate. The False Discovery Rate is actually the expected false positive rate. For example, if 1000 observations were experimentally predicted to be different, and a maximum FDR for these observations was 0.10, then 100 of these observations would be expected to be false positives. The False Discovery Rate is calculated as FDR = FP / (FP + TP). Matthews Correlation Coefficient, also known as Phi coefficient. A coefficient of +1 represents a perfect prediction, 0 an average random prediction and −1 an inverse prediction. Pearson's contingency coefficient C. Pearson's C measures the degree of association between the two variables. However, C suffers from the disadvantage that it does not reach a maximum of 1 or the minimum of -1; the highest it can reach in a 2 x 2 table is 0.707; the maximum it can reach in a 4 × 4 table is 0.870. It can reach values closer to 1 in contingency tables with more categories. It should, therefore, not be used to compare associations among tables with different numbers of categories. References: http://en.wikipedia.org/wiki/Contingency_table Geometric agreement. The geometric agreement is the geometric mean of the diagonal elements of the confusion matrix. Odds-ratio. References: http://www.iph.ufrgs.br/corpodocente/marques/cd/rd/presabs.htm Overall agreement. The overall agreement is the sum of the diagonal elements of the contingency table divided by the number of samples. Chance agreement. 
The chance agreement tells how many samples were correctly classified by chance alone. Kappa coefficient. References: http://www.iph.ufrgs.br/corpodocente/marques/cd/rd/presabs.htm Gets the standard error of the coefficient of performance. Gets the variance of the coefficient of performance. Gets the variance of the under the null hypothesis that the underlying Kappa value is 0. Gets the standard error of the under the null hypothesis that the underlying Kappa value is 0. Diagnostic power. Normalized Mutual Information. Precision, same as the . Recall, same as the . F-Score, computed as the harmonic mean of and . Expected values, or values that could have been generated just by chance. Gets the Chi-Square statistic for the contingency table. Returns a representing this confusion matrix. A representing this confusion matrix. Converts this matrix into a . A with the same contents as this matrix. Combines several confusion matrices into one single matrix. The matrices to combine. Estimates a directly from a classifier, a set of inputs and its expected outputs. The type of the inputs accepted by the classifier. The classifier. The input vectors. The expected outputs associated with each input vector. A capturing the performance of the classifier when trying to predict the outputs from the . Estimates a directly from a classifier, a set of inputs and its expected outputs. The type of the inputs accepted by the classifier. The classifier. The input vectors. The expected outputs associated with each input vector. A capturing the performance of the classifier when trying to predict the outputs from the . General confusion matrix for multi-class decision problems. For binary problems, please see . References: R. G. Congalton. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data. Available on: http://uwf.edu/zhu/evr6930/2.pdf G. Banko. A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data and of Methods Including Remote Sensing Data in Forest Inventory. Interim report. Available on: http://www.iiasa.ac.at/Admin/PUB/Documents/IR-98-081.pdf The first example shows how a confusion matrix can be constructed from a vector of expected (ground-truth) values and their associated predictions (as done by a test, procedure or machine learning classifier): The second example shows how to construct a general confusion matrix directly from a integer matrix with the class assignments: The third example shows how to construct a general confusion matrix directly from a classifier and a dataset: The next examples reproduce the results shown in various papers, with special attention to the multiple ways of computing the for the statistic: Gets or sets the title that ought be displayed on top of the columns of this . Default is "Expected (Ground-truth)". Gets or sets the title that ought be displayed on left side of this . Default is "Actual (Prediction)". Gets the confusion matrix, in which each element e_ij represents the number of elements from class i classified as belonging to class j. Gets the number of samples. Gets the number of classes. Obsolete. Please use instead. Obsolete. Please use instead. Creates a new Confusion Matrix. Creates a new Confusion Matrix. Creates a new Confusion Matrix. Creates a new Confusion Matrix. Gets the row totals. Gets the column totals. Gets the row errors. Gets the col errors. Gets the row marginals (proportions). Gets the column marginals (proportions). Gets the row precision. Gets the column recall. 
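A minimal sketch showing both the binary ConfusionMatrix described earlier and the GeneralConfusionMatrix described here; the constructor argument order (predicted before expected for the binary matrix, class count first for the general matrix) follows the descriptions above but should be verified:

using Accord.Statistics.Analysis;

int[] expected  = { 0, 1, 1, 0, 1, 0, 0, 1 };   // ground-truth labels
int[] predicted = { 0, 1, 0, 0, 1, 1, 0, 1 };   // labels produced by a classifier

// Binary case: the third argument marks label 1 as the positive class.
var binary = new ConfusionMatrix(predicted, expected, 1);
double accuracy    = binary.Accuracy;
double sensitivity = binary.Sensitivity;
double fScore      = binary.FScore;

// Multi-class case (two classes here for simplicity).
var general = new GeneralConfusionMatrix(2, expected, predicted);
double kappa   = general.Kappa;
double overall = general.OverallAgreement;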
Gets the diagonal of the confusion matrix. Gets the maximum number of correct matches (the maximum over the diagonal) Gets the minimum number of correct matches (the minimum over the diagonal) Gets the confusion matrix in terms of cell percentages. Gets the Kappa coefficient of performance. Gets the standard error of the coefficient of performance. Gets the variance of the coefficient of performance. Gets the variance of the coefficient of performance using Congalton's delta method. Gets the variance of the under the null hypothesis that the underlying Kappa value is 0. Gets the standard error of the under the null hypothesis that the underlying Kappa value is 0. Gets the Tau coefficient of performance. Tau-b statistic, unlike tau-a, makes adjustments for ties and is suitable for square tables. Values of tau-b range from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association. References: http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient LEVADA, Alexandre Luis Magalhães. Combinação de modelos de campos aleatórios markovianos para classificação contextual de imagens multiespectrais [online]. São Carlos : Instituto de Física de São Carlos, Universidade de São Paulo, 2010. Tese de Doutorado em Física Aplicada. Disponível em: http://www.teses.usp.br/teses/disponiveis/76/76132/tde-11052010-165642/. MA, Z.; REDMOND, R. L. Tau coefficients for accuracy assessment of classification of remote sensing data. Phi coefficient. The Pearson correlation coefficient (phi) ranges from −1 to +1, where a value of +1 indicates perfect agreement, a value of -1 indicates perfect disagreement and a value 0 indicates no agreement or relationship. References: http://en.wikipedia.org/wiki/Phi_coefficient, http://www.psychstat.missouristate.edu/introbook/sbk28m.htm Gets the Chi-Square statistic for the contingency table. Tschuprow's T association measure. Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). It is closely related to Cramér's V, coinciding with it for square contingency tables. References: http://en.wikipedia.org/wiki/Tschuprow's_T Pearson's contingency coefficient C. Pearson's C measures the degree of association between the two variables. However, C suffers from the disadvantage that it does not reach a maximum of 1 or the minimum of -1; the highest it can reach in a 2 x 2 table is .707; the maximum it can reach in a 4 × 4 table is 0.870. It can reach values closer to 1 in contingency tables with more categories. It should, therefore, not be used to compare associations among tables with different numbers of categories. For a improved version of C, see . References: http://en.wikipedia.org/wiki/Contingency_table Sakoda's contingency coefficient V. Sakoda's V is an adjusted version of Pearson's C so it reaches a maximum of 1 when there is complete association in a table of any number of rows and columns. References: http://en.wikipedia.org/wiki/Contingency_table Cramer's V association measure. Cramér's V varies from 0 (corresponding to no association between the variables) to 1 (complete association) and can reach 1 only when the two variables are equal to each other. In practice, a value of 0.1 already provides a good indication that there is substantive relationship between the two variables. 
References: http://en.wikipedia.org/wiki/Cram%C3%A9r%27s_V, http://www.acastat.com/Statbook/chisqassoc.htm Overall agreement. The overall agreement is the sum of the diagonal elements of the contingency table divided by the number of samples. Accuracy. This is the same value as . The accuracy, or . Error. This is the same value as 1.0 - . The average error, or 1.0 - . Geometric agreement. The geometric agreement is the geometric mean of the diagonal elements of the confusion matrix. Chance agreement. The chance agreement tells how many samples were correctly classified by chance alone. Expected values, or values that could have been generated just by chance. Gets binary confusion matrices for each class in the multi-class classification problem. You can use this property to obtain recall, precision and other metrics for each of the classes. Combines several confusion matrices into one single matrix. The matrices to combine. Estimates a directly from a classifier, a set of inputs and its expected outputs. The type of the inputs accepted by the classifier. The classifier. The input vectors. The expected outputs associated with each input vector. A capturing the performance of the classifier when trying to predict the outputs from the . Estimates a directly from a classifier, a set of inputs and its expected outputs. The type of the inputs accepted by the classifier. The classifier. The input vectors. The expected outputs associated with each input vector. A capturing the performance of the classifier when trying to predict the outputs from the . Descriptive statistics analysis. Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. References: Wikipedia, The Free Encyclopedia. Descriptive Statistics. Available on: http://en.wikipedia.org/wiki/Descriptive_statistics Constructs the Descriptive Analysis. Constructs the Descriptive Analysis. Names for the analyzed variables. Constructs the Descriptive Analysis. The source data to perform analysis. Constructs the Descriptive Analysis. The source data to perform analysis. Constructs the Descriptive Analysis. The source data to perform analysis. Names for the analyzed variables. Constructs the Descriptive Analysis. The source data to perform analysis. Constructs the Descriptive Analysis. The source data to perform analysis. Names for the analyzed variables. Computes the analysis using given source data and parameters. Learns a model that can map the given inputs to the desired outputs. The model inputs. A model that has learned how to produce suitable outputs given the input data . Gets or sets whether the properties of this class should be computed only when necessary. If set to true, a copy of the input data will be maintained inside an instance of this class, using more memory. Gets the source matrix from which the analysis was run. Gets the source matrix from which the analysis was run. Gets or sets the method to be used when computing quantiles (median and quartiles). The quantile method. Gets the column names from the variables in the data. Gets the mean subtracted data. Gets the mean subtracted and deviation divided data. Also known as Z-Scores. 
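A minimal sketch of the descriptive statistics analysis described here; the vector property names (Means, Medians, StandardDeviations) are assumed from the member descriptions above:

using Accord.Statistics.Analysis;

double[][] data =
{
    new double[] { 1.0, 10.0 },
    new double[] { 2.0, 12.5 },
    new double[] { 3.0,  9.8 },
    new double[] { 4.0, 11.1 }
};

// Learn the descriptive statistics for every column of the data.
var analysis = new DescriptiveAnalysis();
analysis.Learn(data);

double[] means   = analysis.Means;
double[] medians = analysis.Medians;
double[] stdDevs = analysis.StandardDeviations;
// analysis.Measures can be bound to a DataGridView for display.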
Gets the Covariance Matrix. Gets the Correlation Matrix. Gets a vector containing the Mean of each data column. Gets a vector containing the Standard Deviation of each data column. Gets a vector containing the Standard Error of the Mean of each data column. Gets the 95% confidence intervals for the . Gets the 95% deviance intervals for the . A deviance interval uses the standard deviation rather than the standard error to compute the range interval for a variable. Gets a vector containing the Mode of each data column. Gets a vector containing the Median of each data column. Gets a vector containing the Variance of each data column. Gets a vector containing the number of distinct elements for each data column. Gets an array containing the Ranges of each data column. Gets an array containing the interquartile range of each data column. Gets an array containing the inner fences of each data column. Gets an array containing the outer fences of each data column. Gets an array containing the sum of each data column. Gets an array containing the skewness of each data column. Gets an array containing the kurtosis of each data column. Gets the number of samples (or observations) in the data. Gets the number of variables (or features) in the data. Gets a collection of DescriptiveMeasures objects that can be bound to a DataGridView. Gets a confidence interval for the within the given confidence level percentage. The confidence level. Default is 0.95. The index of the data column whose confidence interval should be calculated. A confidence interval for the estimated value. Gets a deviance interval for the within the given confidence level percentage (i.e. uses the standard deviation rather than the standard error to compute the range interval for the variable). The confidence level. Default is 0.95. The index of the data column whose confidence interval should be calculated. A confidence interval for the estimated value. Descriptive measures for a variable. Gets the descriptive analysis that originated this measure. Gets the variable's index. Gets the variable's name. Gets the variable's total sum. Gets the variable's mean. Gets the variable's standard deviation. Gets the variable's median. Gets the variable's outer fences range. Gets the variable's inner fence range. Gets the variable's interquartile range. Gets the variable's mode. Gets the variable's variance. Gets the variable's skewness. Gets the variable's kurtosis. Gets the variable's standard error of the mean. Gets the variable's maximum value. Gets the variable's minimum value. Gets the variable's length. Gets the number of distinct values for the variable. Gets the number of samples for the variable. Gets the 95% confidence interval around the . Gets the 95% deviance interval around the . Gets the variable's observations. Gets a confidence interval for the within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Gets a deviance interval for the within the given confidence level percentage (i.e. uses the standard deviation rather than the standard error to compute the range interval for the variable). The confidence level. Default is 0.95. A confidence interval for the estimated value. Collection of descriptive measures. Gets the key for item. FastICA's algorithms to be used in Independent Component Analysis. Deflation algorithm. In the deflation algorithm, components are found one at a time through a series of sequential operations. 
It is particularly useful when only a small number of components should be computed from the input data set. Symmetric parallel algorithm (default). In the parallel (symmetric) algorithm, all components are computed at once. This is the default algorithm for Independent Component Analysis. Independent Component Analysis (ICA). Independent Component Analysis is a computational method for separating a multivariate signal (or mixture) into its additive subcomponents, supposing the mutual statistical independence of the non-Gaussian source signals. When the independence assumption is correct, blind ICA separation of a mixed signal gives very good results. It is also applied, for analysis purposes, to signals that are not supposed to have been generated by a mixing process. A simple application of ICA is the "cocktail party problem", where the underlying speech signals are separated from sample data consisting of people talking simultaneously in a room. Usually the problem is simplified by assuming no time delays or echoes. An important note to consider is that if N sources are present, at least N observations (e.g. microphones) are needed to get the original signals. References: Hyvärinen, A. (1999). Fast and Robust Fixed-Point Algorithms for Independent Component Analysis. IEEE Transactions on Neural Networks, 10(3), 626-634. Available on: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.4731 E. Bingham and A. Hyvärinen. A fast fixed-point algorithm for independent component analysis of complex-valued signals. Int. J. of Neural Systems, 10(1):1-8, 2000. FastICA: FastICA Algorithms to perform ICA and Projection Pursuit. Available on: http://cran.r-project.org/web/packages/fastICA/index.html Wikipedia, The Free Encyclopedia. Independent component analysis. Available on: http://en.wikipedia.org/wiki/Independent_component_analysis Constructs a new Independent Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. Constructs a new Independent Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The FastICA algorithm to be used in the analysis. Default is . Constructs a new Independent Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The analysis method to perform. Default is . Constructs a new Independent Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The analysis method to perform. Default is . The FastICA algorithm to be used in the analysis. Default is . Constructs a new Independent Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The analysis method to perform. Default is . The FastICA algorithm to be used in the analysis. Default is . Constructs a new Independent Component Analysis. Gets or sets the parallelization options for this algorithm. Gets or sets a cancellation token that can be used to cancel the algorithm while it is running. Source data used in the analysis. Gets or sets the maximum number of iterations to perform. If zero, the method will run until convergence. The iterations. Gets or sets the maximum absolute change in parameters between iterations that determines convergence. 
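A minimal, hedged sketch of the analysis described above is given below. It assumes the Learn and Separate members and the IndependentComponentAlgorithm options summarized in this section; the mixture values are made up for illustration.

// Each row is one observation of two mixed channels (fictional values).
double[][] mixture =
{
    new double[] { 0.30, 0.52 },
    new double[] { 0.72, 0.91 },
    new double[] { 0.12, 0.15 },
    new double[] { 0.85, 0.73 },
};

var ica = new IndependentComponentAnalysis()
{
    // symmetric parallel estimation (the default); Deflation is the alternative
    Algorithm = IndependentComponentAlgorithm.Parallel
};

// Learn the demixing transformation from the observed mixture
ica.Learn(mixture);

// Recover the estimated independent source signals (demixing)
double[][] sources = ica.Separate(mixture);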
Gets the resulting projection of the source data given at the creation of the analysis into the space spanned by the independent components. The resulting projection in independent component space. Gets a matrix containing the mixing coefficients for the original source data being analyzed. Each column corresponds to an independent component. Gets a matrix containing the demixing coefficients for the original source data being analyzed. Each column corresponds to an independent component. Gets the whitening matrix used to transform the original data to have unit variance. Gets the Independent Components in an object-oriented structure. The collection of independent components. Gets or sets whether calculations will be performed overwriting data in the original source matrix, using less memory. Gets or sets the FastICA algorithm used by the analysis. Gets or sets the normalization method used for this analysis. Gets or sets the Contrast function to be used by the analysis. Gets the column means of the original data. Gets the column standard deviations of the original data. Computes the Independent Component Analysis algorithm. Computes the Independent Component Analysis algorithm. Separates a mixture into its components (demixing). Separates a mixture into its components (demixing). Combines components into a single mixture (mixing). Combines components into a single mixture (mixing). Deflation iterative algorithm. Returns a matrix in which each row contains the mixing coefficients for each component. Parallel (symmetric) iterative algorithm. Returns a matrix in which each row contains the mixing coefficients for each component. Computes the maximum absolute change between two members of a matrix. Computes the maximum absolute change between two members of a vector. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Obsolete. Obsolete. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Represents an Independent Component found in the Independent Component Analysis, allowing it to be directly bound to controls like the DataGridView. Creates an independent component representation. The analysis to which this component belongs. The component index. Gets the Index of this component on the original component collection. Returns a reference to the parent analysis object. Gets the mixing vector for the current independent component. Gets the demixing vector for the current independent component. Gets the whitening factor for the current independent component. Represents a Collection of Independent Components found in the Independent Component Analysis. This class cannot be instantiated. Kernel (Fisher) Discriminant Analysis. Kernel (Fisher) discriminant analysis (kernel FDA) is a non-linear generalization of linear discriminant analysis (LDA) using techniques of kernel methods. Using a kernel, the originally linear operations of LDA are done in a reproducing kernel Hilbert space with a non-linear mapping. The algorithm used is a multi-class generalization of the original algorithm by Mika et al. in Fisher discriminant analysis with kernels (1999). This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. 
References: Mika et al, Fisher discriminant analysis with kernels (1999). Available on http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.9904 Gets a classification pipeline that can be used to classify new samples into one of the learned in this discriminant analysis. This pipeline is only available after a call to the method. Gets or sets the matrix of original values used to create this analysis. Those values are required to build kernel (Gram) matrices when classifying new samples. Constructs a new Kernel Discriminant Analysis object. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The labels for each observation row in the input matrix. The kernel to be used in the analysis. Constructs a new Kernel Discriminant Analysis object. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The labels for each observation row in the input matrix. The kernel to be used in the analysis. Constructs a new Kernel Discriminant Analysis object. Constructs a new Kernel Discriminant Analysis object. Gets or sets the Kernel used in the analysis. Gets or sets the regularization parameter to avoid non-singularities at the solution. Computes the Multi-Class Kernel Discriminant Analysis algorithm. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Classifies a new instance into one of the available classes. Classifies a new instance into one of the available classes. Classifies new instances into one of the available classes. Gets the output of the discriminant function for a given class. Standard regression and classification pipeline for . Gets or sets the first step in the pipeline. Gets or sets the second step in the pipeline. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the result will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. System.Double. Kernel Principal Component Analysis. Kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non-linear mapping. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. References: Heiko Hoffmann, Unsupervised Learning of Visuomotor Associations (Kernel PCA topic). PhD thesis. 2005. Available on: http://www.heikohoffmann.de/htmlthesis/hoffmann_diss.html James T. Kwok, Ivor W. Tsang. The Pre-Image Problem in Kernel Methods. 2003. Available on: http://www.hpl.hp.com/conferences/icml2003/papers/345.pdf The example below shows a typical usage of the analysis. 
We will be replicating the exact same example which can be found on the documentation page. However, while we will be using a kernel, any other kernel function could have been used. It is also possible to create a KPCA from a kernel matrix that already exists. The example below shows how this could be accomplished. Constructs the Kernel Principal Component Analysis. Constructs the Kernel Principal Component Analysis. The kernel to be used in the analysis. The analysis method to perform. True to center the data in feature space, false otherwise. Default is true. The maximum number of components that the analysis will be able to project data into. Whether to whiten the results or not. If set to true, the generated output will be normalized to have unit standard deviation. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The kernel to be used in the analysis. The analysis method to perform. True to center the data in feature space, false otherwise. Default is true. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The kernel to be used in the analysis. The analysis method to perform. True to center the data in feature space, false otherwise. Default is true. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The kernel to be used in the analysis. The analysis method to perform. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The kernel to be used in the analysis. The analysis method to perform. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The kernel to be used in the analysis. Constructs the Kernel Principal Component Analysis. The source data to perform analysis. The kernel to be used in the analysis. Gets or sets the Kernel used in the analysis. Gets or sets whether the points should be centered in feature space. Gets or sets the minimum variance proportion needed to keep a principal component. If set to zero, all components will be kept. Default is 0.001 (all components which contribute less than 0.001 to the variance in the data will be discarded). Gets or sets a boolean value indicating whether this analysis should store enough information to allow the reversion of the transformation to be computed. Set this to false in case you would like to store the analysis object to disk and you do not need to reverse a transformation after it has been computed. Computes the Kernel Principal Component Analysis algorithm. Learns a model that can map the given inputs to the desired outputs. The model inputs. A model that has learned how to produce suitable outputs given the input data . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Obsolete. Projects a given matrix into principal component space. The matrix to be projected. The matrix in which to store the results. Reverts a set of projected data into its original form. 
Complete reverse transformation is not always possible and is not even guaranteed to exist. This method works using a closed-form MDS approach as suggested by Kwok and Tsang. It is currently a direct implementation of the algorithm without any kind of optimization. Reference: - http://cmp.felk.cvut.cz/cmp/software/stprtool/manual/kernels/preimage/list/rbfpreimg3.html The KPCA-transformed data. Reverts a set of projected data into its original form. Complete reverse transformation is not always possible and is not even guaranteed to exist. This method works using a closed-form MDS approach as suggested by Kwok and Tsang. It is currently a direct implementation of the algorithm without any kind of optimization. Reference: - http://cmp.felk.cvut.cz/cmp/software/stprtool/manual/kernels/preimage/list/rbfpreimg3.html The KPCA-transformed data. The number of nearest neighbors to use while constructing the pre-image. Reverts a set of projected data into its original form. Complete reverse transformation is not always possible and is not even guaranteed to exist. This method works using a closed-form MDS approach as suggested by Kwok and Tsang. It is currently a direct implementation of the algorithm without any kind of optimization. Reference: - http://cmp.felk.cvut.cz/cmp/software/stprtool/manual/kernels/preimage/list/rbfpreimg3.html The KPCA-transformed data. The number of nearest neighbors to use while constructing the pre-image. Linear Discriminant Analysis (LDA). Linear Discriminant Analysis (LDA) is a method of finding the linear combination of variables that best separates two or more classes. In itself LDA is not a classification algorithm, although it makes use of class labels. However, the LDA result is mostly used as part of a linear classifier. The other alternative use is to perform dimensionality reduction before applying nonlinear classification algorithms. It should be noted that several similar techniques (differing in their requirements on the sample) go together under the general name of Linear Discriminant Analysis. Described below is one of these techniques with only two requirements: The sample size shall exceed the number of variables, and Classes may overlap, but their centers shall be distant from each other. Moreover, LDA requires the following assumptions to be true: independent subjects; normality; and homoscedasticity: the variance-covariance matrix of the predictors is the same in all groups. If the latter assumption is violated, it is common to use quadratic discriminant analysis in the same manner as linear discriminant analysis instead. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. References: R. Gutierrez-Osuna, Linear Discriminant Analysis. Available on: http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf Gets a classification pipeline that can be used to classify new samples into one of the learned in this discriminant analysis. This pipeline is only available after a call to the method. Constructs a new Linear Discriminant Analysis object. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The labels for each observation row in the input matrix. Constructs a new Linear Discriminant Analysis object. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The labels for each observation row in the input matrix. 
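A minimal, hedged sketch of the discriminant analysis just described: it assumes the Learn method returns the classification pipeline summarized above (with Decide for class decisions) and that Transform projects data onto the discriminant space; the data is fictional.

// Two small, well-separated classes (fictional data).
double[][] inputs =
{
    new double[] { 4.2, 1.0 },
    new double[] { 3.9, 1.2 },
    new double[] { 7.8, 6.1 },
    new double[] { 8.1, 5.9 },
};

int[] outputs = { 0, 0, 1, 1 }; // class label for each row

var lda = new LinearDiscriminantAnalysis();

// Learn the discriminants and the associated classification pipeline
var classifier = lda.Learn(inputs, outputs);

// Project the data onto the discriminant space
double[][] projection = lda.Transform(inputs);

// Classify the (training) samples using the learned pipeline
int[] predicted = classifier.Decide(inputs);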
Constructs a new Linear Discriminant Analysis object. Computes the Multi-Class Linear Discriminant Analysis algorithm. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. A location to store the output, avoiding unnecessary memory allocations. The output generated by applying this transformation to the given input. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Transform Classifies a new instance into one of the available classes. Classifies a new instance into one of the available classes. Classifies new instances into one of the available classes. Gets the output of the discriminant function for a given class. Standard regression and classification pipeline for . Gets or sets the first step in the pipeline. Gets or sets the second step in the pipeline. Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the result will be stored, avoiding unnecessary memory allocations. Computes a numerical score measuring the association between the given vector and a given . The input vector. The index of the class whose score will be computed. System.Double. Represents a class found during Discriminant Analysis, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated. Creates a new Class representation Gets the Index of this class on the original analysis collection. Gets the number labeling this class. Gets the prevalence of the class on the original data set. Gets the class' mean vector. Gets the feature-space means of the projected data. Gets the class' standard deviation vector. Gets the Scatter matrix for this class. Gets the indices of the rows in the original data which belong to this class. Gets the subset of the original data spawned by this class. Gets the number of observations inside this class. Discriminant function for the class. Represents a discriminant factor found during Discriminant Analysis, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated. Creates a new discriminant factor representation. Gets the index of this discriminant factor on the original analysis collection. Gets the Eigenvector for this discriminant factor. Gets the Eigenvalue for this discriminant factor. Gets the proportion, or amount of information explained by this discriminant factor. Gets the cumulative proportion of all discriminant factors until this factor. Represents a collection of Discriminants factors found in the Discriminant Analysis. This class cannot be instantiated. Represents a collection of classes found in the Discriminant Analysis. This class cannot be instantiated. Logistic Regression Analysis. The Logistic Regression Analysis tries to extract useful information about a logistic regression model. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. References: E. F. Connor. Logistic Regression. Available on: http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm C. Shalizi. Logistic Regression and Newton's Method. Lecture notes. Available on: http://www.stat.cmu.edu/~cshalizi/350/lectures/26/lecture-26.pdf A. Storkey. 
Learning from Data: Learning Logistic Regressors. Available on: http://www.inf.ed.ac.uk/teaching/courses/lfd/lectures/logisticlearn-print.pdf The following example shows how to create a logistic regression analysis using a full dataset composed of input vectors and a binary output vector. Each input vector has an associated label (1 or 0) in the output vector, where 1 represents a positive label (yes, or true) and 0 represents a negative label (no, or false). The resulting table is shown below. The analysis can also be created from data given in a summary form. Instead of having one input vector associated with one positive or negative label, each input vector is associated with the proportion of positive to negative labels in the original dataset. The last example shows how to learn a logistic regression analysis using data given in the form of a System.Data.DataTable. This data is also heterogeneous, mixing both discrete (symbol) variables and continuous variables. This example is also available for . Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Constructs a Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. Constructs a Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. The weights associated with each input vector. Constructs a Logistic Regression Analysis. The input data for the analysis. The output, binary data for the analysis. The names of the input variables. The name of the output variable. Constructs a Logistic Regression Analysis. The input data for the analysis. The output, binary data for the analysis. The names of the input variables. The name of the output variable. The weights associated with each input vector. Constructs a Logistic Regression Analysis. Gets or sets the maximum number of iterations to be performed by the regression algorithm. Default is 50. Gets or sets the difference between two iterations of the regression algorithm when the algorithm should stop. The difference is calculated based on the largest absolute parameter change of the regression. Default is 1e-5. Gets or sets the regularization value to be added in the objective function. Default is 1e-10. Gets or sets whether nested models should be computed in order to calculate the likelihood-ratio test of each of the coefficients. Default is false. Gets the source matrix from which the analysis was run. Gets the source matrix from which the analysis was run. Gets the dependent variable value for each of the source input points. Gets the resulting probabilities obtained by the logistic regression model. Gets the sample weight associated with each input vector. Gets the Logistic Regression model created and evaluated by this analysis. Gets the collection of coefficients of the model. Gets the Log-Likelihood for the model. Gets the Chi-Square (Likelihood Ratio) Test for the model. Gets the Deviance of the model. Gets or sets the name of the input variables for the model. Gets or sets the name of the output variable for the model. Gets the Odds Ratio for each coefficient found during the logistic regression. Gets the Standard Error for each coefficient found during the logistic regression. Gets the Wald Tests for each coefficient. Gets the Likelihood-Ratio Tests for each coefficient. Since this operation might be potentially time-consuming, the likelihood-ratio tests will be computed the first time this property is accessed. Gets the value of each coefficient. 
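A minimal, hedged sketch of the analysis described above: it assumes the Learn method returns the fitted logistic regression and that the odds-ratio and standard-error members summarized in this section are exposed as arrays; member names may differ between versions, and the data is fictional.

// Fictional data: two input variables and a binary outcome.
double[][] inputs =
{
    new double[] { 25, 0 },
    new double[] { 52, 1 },
    new double[] { 64, 1 },
    new double[] { 31, 0 },
};

double[] outputs = { 0, 1, 1, 0 };

var lra = new LogisticRegressionAnalysis();

// Learn the logistic regression model from the data
LogisticRegression regression = lra.Learn(inputs, outputs);

// Inspect the analysis results described above (assumed member names)
double logLikelihood = lra.LogLikelihood;     // model log-likelihood
double[] oddsRatios = lra.OddsRatios;         // odds ratio per coefficient
double[] standardErrors = lra.StandardErrors; // standard error per coefficient

// Estimate the probability of a positive outcome for a new case
double p = regression.Probability(new double[] { 47, 1 });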
Gets the 95% Confidence Intervals (C.I.) for each coefficient found in the regression. Gets the number of samples used to compute the analysis. Gets the information matrix obtained during learning. Gets the Log-Likelihood Ratio between this model and another model. Another logistic regression model. The Likelihood-Ratio between the two models. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Computes the Logistic Regression Analysis. The likelihood surface for logistic regression learning is concave, so there will be only one peak. Any local maximum will also be a global maximum. True if the model converged, false otherwise. Computes the Logistic Regression Analysis for an already computed regression. Computes the Logistic Regression Analysis. The likelihood surface for logistic regression learning is concave, so there will be only one peak. Any local maximum will also be a global maximum. The difference between two iterations of the regression algorithm when the algorithm should stop. If not specified, the value of 1e-5 will be used. The difference is calculated based on the largest absolute parameter change of the regression. The maximum number of iterations to be performed by the regression algorithm. True if the model converged, false otherwise. Creates a new from summarized data. In summary data, instead of having a set of inputs and their associated outputs, we have the number of times an input vector had a positive label in the data set and how many times it had a negative label. The input data. The number of positive labels for each input vector. The number of negative labels for each input vector. A created from the given summary data. Learns a model that can map the given inputs to the given outputs. The model inputs. The number of positive labels for each input vector. The number of negative labels for each input vector. A created from the given summary data. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Gets the confidence interval for a given input. Gets the prediction interval for a given input. Represents a Logistic Regression Coefficient found in the Logistic Regression, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated outside the . Gets the name for the current coefficient. Gets the Odds ratio for the current coefficient. Gets the Standard Error for the current coefficient. Gets the 95% confidence interval (C.I.) for the current coefficient. Gets the upper limit for the 95% confidence interval. Gets the lower limit for the 95% confidence interval. Gets the coefficient value. Gets the Wald's test performed for this coefficient. Gets the Likelihood-Ratio test performed for this coefficient. Since this operation might be potentially time-consuming, the likelihood-ratio tests will be computed the first time this property is accessed. Returns a that represents this instance. 
A that represents this instance. Represents a collection of Logistic Coefficients found in the . This class cannot be instantiated. The PLS algorithm to use in the Partial Least Squares Analysis. Sijmen de Jong's SIMPLS algorithm. The SIMPLS algorithm is considerably faster than NIPALS, especially when the number of input variables increases; but gives slightly different results in the case of multiple outputs. Traditional NIPALS algorithm. Partial Least Squares Regression/Analysis (a.k.a Projection To Latent Structures) Partial least squares regression (PLS-regression) is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a linear regression model by projecting the predicted variables and the observable variables to a new space. Because both the X and Y data are projected to new spaces, the PLS family of methods are known as bilinear factor models. References: Abdi, H. (2010). Partial least square regression, projection on latent structure regression, PLS-Regression. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 97-106. Available in: http://www.utdallas.edu/~herve/abdi-wireCS-PLS2010.pdf Abdi, H. (2007). Partial least square regression (PLS regression). In N.J. Salkind (Ed.): Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage. pp. 740-744. Resource available online in: http://www.utdallas.edu/~herve/Abdi-PLS-pretty.pdf Martin Anderson, "A comparison of nine PLS1 algorithms". Available on: http://onlinelibrary.wiley.com/doi/10.1002/cem.1248/pdf Mevik, B-H. Wehrens, R. (2007). The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, Volume 18, Issue 2. Resource available online in: http://www.jstatsoft.org/v18/i02/paper Garson, D. Partial Least Squares Regression (PLS). http://faculty.chass.ncsu.edu/garson/PA765/pls.htm De Jong, S. (1993). SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 18: 251–263. http://dx.doi.org/10.1016/0169-7439(93)85002-X Rosipal, Roman and Nicole Kramer. (2006). Overview and Recent Advances in Partial Least Squares, in Subspace, Latent Structure and Feature Selection Techniques, pp 34–51. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.7735 Yi Cao. (2008). Partial Least-Squares and Discriminant Analysis: A tutorial and tool using PLS for discriminant analysis. Wikipedia contributors. Partial least squares regression. Wikipedia, The Free Encyclopedia; 2009. Available from: http://en.wikipedia.org/wiki/Partial_least_squares_regression. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Constructs a new Partial Least Squares Analysis. The input source data to perform analysis. The output source data to perform analysis. Constructs a new Partial Least Squares Analysis. The input source data to perform analysis. The output source data to perform analysis. The PLS algorithm to use in the analysis. Default is . Constructs a new Partial Least Squares Analysis. The input source data to perform analysis. The output source data to perform analysis. The analysis method to perform. Default is . The PLS algorithm to use in the analysis. Default is . Constructs a new Partial Least Squares Analysis. The analysis method to perform. Default is . The PLS algorithm to use in the analysis. Default is . 
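A minimal, hedged sketch of the analysis described above: it assumes the Learn and CreateRegression members summarized in this section and the SIMPLS option of the algorithm enumeration; the data is made up and member names may differ slightly between versions.

// Fictional predictors (x) and responses (y).
double[][] x =
{
    new double[] { 2.5, 2.4 },
    new double[] { 0.5, 0.7 },
    new double[] { 2.2, 2.9 },
    new double[] { 1.9, 2.2 },
};

double[][] y =
{
    new double[] { 1.0 },
    new double[] { 0.0 },
    new double[] { 1.0 },
    new double[] { 1.0 },
};

var pls = new PartialLeastSquaresAnalysis()
{
    Algorithm = PartialLeastSquaresAlgorithm.SIMPLS // or NIPALS
};

// Learn the latent factors relating x to y
pls.Learn(x, y);

// Create an explicit regression from the PLS coefficients and predict
var regression = pls.CreateRegression();
double[][] predictions = regression.Transform(x);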
Source data used in the analysis. Gets the dependent variables' values for each of the source input points. Gets information about independent (input) variables. Gets information about dependent (output) variables. Gets the Weight matrix obtained during the analysis. For the NIPALS algorithm this is the W matrix. For the SIMPLS algorithm this is the R matrix. Gets information about the factors discovered during the analysis in a object-oriented structure which can be data-bound directly to many controls. Gets or sets the PLS algorithm used by the analysis. Gets or sets the method used by this analysis. Gets the Variable Importance in Projection (VIP). This method has been implemented considering only PLS models fitted using the NIPALS algorithm containing a single response (output) variable. Gets or sets whether calculations will be performed overwriting data in the original source matrix, using less memory. Gets or sets the number of latent factors that can be considered in this model. Gets the number of inputs accepted by the model. The number of inputs. This property is read-only. Gets the number of outputs generated by the model. The number of outputs. This property is read-only. Gets the maximum number of latent factors that can be considered in this model. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Computes the Partial Least Squares Analysis. Computes the Partial Least Squares Analysis. The number of factors to compute. The number of factors should be a value between 1 and min(rows-1,cols) where rows and columns are the number of observations and variables in the input source data matrix. Projects a given set of inputs into latent space. Projects a given set of inputs into latent space. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Projects a given set of outputs into latent space. Projects a given set of outputs into latent space. Projects a given set of outputs into latent space. Projects a given set of outputs into latent space. Creates a Multivariate Linear Regression model using coefficients obtained by the Partial Least Squares. Creates a Multivariate Linear Regression model using coefficients obtained by the Partial Least Squares. Computes PLS parameters using NIPALS algorithm. The number of factors to compute. The mean-centered input values X. The mean-centered output values Y. The tolerance for convergence. The algorithm implementation follows the original paper by Hervé Abdi, with overall structure as suggested in Yi Cao's tutorial. References: Abdi, H. (2010). Partial least square regression, projection on latent structure regression, PLS-Regression. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 97-106. Available in: http://www.utdallas.edu/~herve/abdi-wireCS-PLS2010.pdf Yi Cao. (2008). Partial Least-Squares and Discriminant Analysis: A tutorial and tool using PLS for discriminant analysis. Computes PLS parameters using SIMPLS algorithm. The number of factors to compute. The mean-centered input values X. The mean-centered output values Y. 
The algorithm implementation is based on the appendix code by Martin Anderson, with modifications for multiple output variables as given in the sources listed below. References: Martin Anderson, "A comparison of nine PLS1 algorithms". Available on: http://onlinelibrary.wiley.com/doi/10.1002/cem.1248/pdf Abdi, H. (2010). Partial least square regression, projection on latent structure regression, PLS-Regression. Wiley Interdisciplinary Reviews: Computational Statistics, 2, 97-106. Available from: http://www.utdallas.edu/~herve/abdi-wireCS-PLS2010.pdf StatSoft, Inc. (2012). Electronic Statistics Textbook: Partial Least Squares (PLS). Tulsa, OK: StatSoft. Available from: http://www.statsoft.com/textbook/partial-least-squares/#SIMPLS De Jong, S. (1993). SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 18: 251–263. http://dx.doi.org/10.1016/0169-7439(93)85002-X Adjusts a data matrix, centering and standardizing its values using the already computed column's means and standard deviations. Adjusts a data matrix, centering and standardizing its values using the already computed column's means and standard deviations. Returns the index for the column with largest squared sum. Computes the variable importance in projection (VIP). A predictor factors matrix in which each row represents the importance of the variable in a projection considering the number of factors indicated by the column's index. References: Il-Gyo Chong, Chi-Hyuck Jun, Performance of some variable selection methods when multicollinearity is present, Chemometrics and Intelligent Laboratory Systems, Volume 78, Issues 1-2, 28 July 2005, Pages 103-112, ISSN 0169-7439, DOI: 10.1016/j.chemolab.2004.12.011. Represents a Partial Least Squares Factor found in the Partial Least Squares Analysis, allowing it to be directly bound to controls like the DataGridView. Creates a partial least squares factor representation. The analysis to which this component belongs. The component index. Gets the Index of this component on the original factor collection. Returns a reference to the parent analysis object. Gets the proportion of prediction variables variance explained by this factor. Gets the cumulative proportion of dependent variables variance explained by this factor. Gets the proportion of dependent variable variance explained by this factor. Gets the cumulative proportion of dependent variable variance explained by this factor. Gets the input variable's latent vectors for this factor. Gets the output variable's latent vectors for this factor. Gets the importance of each variable for the given component. Gets the proportion, or amount of information explained by this component. Gets the cumulative proportion of all discriminants until this component. Represents a Collection of Partial Least Squares Factors found in the Partial Least Squares Analysis. This class cannot be instantiated. Represents source variables used in Partial Least Squares Analysis. Can represent either input variables (predictor variables) or output variables (independent variables or regressors). Source data used in the analysis. Can be either input data or output data depending if the variables chosen are predictor variables or dependent variables, respectively. Gets the resulting projection (scores) of the source data into latent space. Can be either from input data or output data depending if the variables chosen are predictor variables or dependent variables, respectively. 
Gets the column means of the source data. Can be either from input data or output data, depending if the variables chosen are predictor variables or dependent variables, respectively. Gets the column standard deviations of the source data. Can be either from input data or output data, depending if the variables chosen are predictor variables or dependent variables, respectively. Gets the loadings (a.k.a. factors or components) for the variables obtained during the analysis. Can be either from input data or output data, depending if the variables chosen are predictor variables or dependent variables, respectively. Gets the amount of variance explained by each latent factor. Can be either by input variables' latent factors or output variables' latent factors, depending if the variables chosen are predictor variables or dependent variables, respectively. Gets the cumulative variance explained by each latent factor. Can be either by input variables' latent factors or output variables' latent factors, depending if the variables chosen are predictor variables or dependent variables, respectively. Projects a given dataset into latent space. Can be either input variable's latent space or output variable's latent space, depending if the variables chosen are predictor variables or dependent variables, respectively. Projects a given dataset into latent space. Can be either input variable's latent space or output variable's latent space, depending if the variables chosen are predictor variables or dependent variables, respectively. Principal component analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis. Principal Components Analysis or the Karhunen-Loève expansion is a classical method for dimensionality reduction or exploratory data analysis. Mathematically, PCA is a process that decomposes the covariance matrix of a matrix into two parts: eigenvalues and column eigenvectors, whereas Singular Value Decomposition (SVD) decomposes a matrix per se into three parts: singular values, column eigenvectors, and row eigenvectors. The relationship between PCA and SVD is that the eigenvalues are the squares of the singular values and the column vectors are the same for both. This class uses SVD on the data set, which generally gives better numerical accuracy. This class can also be bound to standard controls such as the DataGridView by setting their DataSource property to the analysis' property. The example below shows a typical usage of the analysis. However, users often ask why the framework produces different values than other packages such as STATA or MATLAB. After the simple introductory example below, we will be exploring why those results are often different. A question often asked by users is "why my matrices have inverted signs" or "why my results differ from [another software]". In short, despite any differences, the results are most likely correct (unless you firmly believe you have found a bug; in this case, please file a bug report). The example below explores, in the same steps given in Lindsay's tutorial, anything that would cause any discrepancies between the results given by Accord.NET and results given by other software packages. Some users would like to analyze huge amounts of data. In this case, computing the SVD directly on the data could result in memory exceptions or excessive computing times. If your data's number of dimensions is much less than the number of observations (i.e. 
your matrix has fewer columns than rows), then it would be a better idea to summarize your data in the form of a covariance or correlation matrix and compute PCA using the EVD. The example below shows how to compute the analysis with covariance matrices only. Constructs a new Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The analysis method to perform. Default is . Constructs a new Principal Component Analysis. The source data to perform analysis. The matrix should contain variables as columns and observations of each variable as rows. The analysis method to perform. Default is . Constructs a new Principal Component Analysis. The analysis method to perform. Default is . Whether to whiten the results or not. If set to true, the generated output will be normalized to have unit standard deviation. The maximum number of components that the analysis will be able to project data into. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Computes the Principal Component Analysis algorithm. Projects a given matrix into principal component space. The matrix to be projected. The array in which to store the results. Reverts a set of projected data into its original form. Complete reverse transformation is only possible if all components are present, and, if the data has been standardized, the original standard deviation and means of the original matrix are known. The PCA-transformed data. Reverts a set of projected data into its original form. Complete reverse transformation is only possible if all components are present, and, if the data has been standardized, the original standard deviation and means of the original matrix are known. The PCA-transformed data. Adjusts a data matrix, centering and standardizing its values using the already computed columns' means and standard deviations. Adjusts a data matrix, centering and standardizing its values using the already computed columns' means and standard deviations. Constructs a new Principal Component Analysis from a Covariance matrix. This method may be more suitable to high dimensional problems in which the original data matrix may not fit in memory but the covariance matrix will. The mean vector for the source data. The covariance matrix of the data. Constructs a new Principal Component Analysis from a Correlation matrix. This method may be more suitable to high dimensional problems in which the original data matrix may not fit in memory but the correlation matrix will. The mean vector for the source data. The standard deviation vectors for the source data. The correlation matrix of the data. Constructs a new Principal Component Analysis from a Kernel (Gram) matrix. This method may be more suitable to high dimensional problems in which the original data matrix may not fit in memory but the kernel matrix will. The mean vector for the source data. The standard deviation vectors for the source data. The kernel matrix for the data. Reduces the dimensionality of a given matrix to the given number of . The matrix that should have its dimensionality reduced. The number of dimensions for the reduced matrix. Represents a Principal Component found in the Principal Component Analysis, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated. 
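A minimal, hedged sketch of the principal component analysis described above: it assumes the parameterless constructor with the Method and Whiten options, plus the Learn and Transform members; the data is fictional, and the covariance-matrix route mentioned in the final comment refers to the factory constructor summarized above (its exact name is an assumption here).

// Fictional data with two correlated columns.
double[][] data =
{
    new double[] { 2.5, 2.4 },
    new double[] { 0.5, 0.7 },
    new double[] { 2.2, 2.9 },
    new double[] { 1.9, 2.2 },
    new double[] { 3.1, 3.0 },
};

var pca = new PrincipalComponentAnalysis()
{
    Method = PrincipalComponentMethod.Center, // center, but do not standardize
    Whiten = false
};

// Learn the principal components from the data
pca.Learn(data);

// Project the data onto the principal component space
double[][] projected = pca.Transform(data);

// For very large problems, the analysis can instead be created from a
// covariance matrix, as described above (assumed static factory name):
// var pca2 = PrincipalComponentAnalysis.FromCovarianceMatrix(mean, covariance);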
Creates a principal component representation. The analysis to which this component belongs. The component index. Gets the Index of this component on the original analysis principal component collection. Returns a reference to the parent analysis object. Gets the proportion of data this component represents. Gets the cumulative proportion of data this component represents. If available, gets the Singular Value of this component found during the Analysis. Gets the Eigenvalue of this component found during the analysis. Gets the Eigenvector of this component. Represents a Collection of Principal Components found in the . This class cannot be instantiated. Methods for computing the area under Receiver-Operating Characteristic (ROC) curves (also known as the ROC AUC). Method of DeLong, E. R., D. M. DeLong, and D. L. Clarke-Pearson. 1988. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845. Method of Hanley, J.A. and McNeil, B.J. 1983. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843. Receiver Operating Characteristic (ROC) Curve. In signal detection theory, a receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot of the sensitivity vs. (1 − specificity) for a binary classifier system as its discrimination threshold is varied. This package does not attempt to fit a curve to the obtained points. It just computes the area under the ROC curve directly using the trapezoidal rule. Also note that the curve construction algorithm uses the convention that a higher test value represents a positive for a condition while computing sensitivity and specificity values. References: Wikipedia, The Free Encyclopedia. Receiver Operating Characteristic. Available on: http://en.wikipedia.org/wiki/Receiver_operating_characteristic Anaesthesist. The magnificent ROC. Available on: http://www.anaesthetist.com/mnm/stats/roc/Findex.htm The following example shows how to measure the accuracy of a binary classifier using a ROC curve. 
// For this example, we will be creating a Support Vector Machine
// trained on the following training instances:
double[][] inputs =
{
    // Those are from class -1
    new double[] { 2, 4, 0 },
    new double[] { 5, 5, 1 },
    new double[] { 4, 5, 0 },
    new double[] { 2, 5, 5 },
    new double[] { 4, 5, 1 },
    new double[] { 4, 5, 0 },
    new double[] { 6, 2, 0 },
    new double[] { 4, 1, 0 },

    // Those are from class +1
    new double[] { 1, 4, 5 },
    new double[] { 7, 5, 1 },
    new double[] { 2, 6, 0 },
    new double[] { 7, 4, 7 },
    new double[] { 4, 5, 0 },
    new double[] { 6, 2, 9 },
    new double[] { 4, 1, 6 },
    new double[] { 7, 2, 9 },
};

int[] outputs =
{
    -1, -1, -1, -1, -1, -1, -1, -1, // first eight from class -1
    +1, +1, +1, +1, +1, +1, +1, +1  // last eight from class +1
};

// Next, we create a linear Support Vector Machine with 3 inputs
SupportVectorMachine machine = new SupportVectorMachine(inputs: 3);

// Create the sequential minimal optimization learning algorithm
var smo = new SequentialMinimalOptimization(machine, inputs, outputs);

// We learn the machine
double error = smo.Run();

// And then extract its predicted labels
double[] predicted = new double[inputs.Length];
for (int i = 0; i < predicted.Length; i++)
    predicted[i] = machine.Compute(inputs[i]);

// At this point, the output vector contains the labels which
// should have been assigned by the machine, and the predicted
// vector contains the labels which have been actually assigned.

// Create a new ROC curve to assess the performance of the model
var roc = new ReceiverOperatingCharacteristic(outputs, predicted);
roc.Compute(100); // Compute a ROC curve with 100 cut-off points

// Generate a connected scatter plot for the ROC curve and show it on-screen
ScatterplotBox.Show(roc.GetScatterplot(includeRandom: true), nonBlocking: true)
    .SetSymbolSize(0)      // do not display data points
    .SetLinesVisible(true) // show lines connecting points
    .SetScaleTight(true)   // tighten the scale to points
    .WaitForClose();

The resulting graph is shown below. Constructs a new Receiver Operating Characteristic model. An array of binary values. Typically represented as 0 and 1, or -1 and 1, indicating negative and positive cases, respectively. The maximum value will be treated as the positive case, and the lowest as the negative. An array of continuous values trying to approximate the measurement array. Constructs a new Receiver Operating Characteristic model. An array of binary values. Typically represented as 0 and 1, or -1 and 1, indicating negative and positive cases, respectively. The maximum value will be treated as the positive case, and the lowest as the negative. An array of continuous values trying to approximate the measurement array. Constructs a new Receiver Operating Characteristic model. An array of binary values. Typically represented as 0 and 1, or -1 and 1, indicating negative and positive cases, respectively. The maximum value will be treated as the positive case, and the lowest as the negative. An array of continuous values trying to approximate the measurement array. Gets the points of the curve. Gets the number of actual positive cases. Gets the number of actual negative cases. Gets the number of cases (observations) being analyzed. Gets the area under this curve (AUC). Gets the standard error for the . Gets the variance of the curve's . Gets the ground truth values, or the values which should have been given by the test if it was perfect. Gets the actual values given by the test. Gets the actual test results for subjects which should have been labeled as positive. 
Gets the actual test results for subjects which should have been labeled as negative. Gets DeLong's pseudoaccuracies for the positive subjects. Gets DeLong's pseudoaccuracies for the negative subjects. Computes an n-point ROC curve. Each point in the ROC curve will have a threshold increase of 1/npoints over the previous point, starting at zero. The number of points for the curve. Computes a ROC curve with 1/increment points. The increment over the previous point for each point in the curve. Computes a ROC curve with 1/increment points. The increment over the previous point for each point in the curve. True to force the inclusion of the (0,0) point, false otherwise. Default is false. Computes a ROC curve with the given increment points. Computes a single point of a ROC curve using the given cutoff value. Generates a representing the ROC curve. True to include a plot of the random curve (a diagonal line going from lower left to upper right); false otherwise. Returns a that represents this curve. A that represents this curve. Calculates the area under the ROC curve using the trapezium method. The area under a ROC curve can never be less than 0.50. If the area is first calculated as less than 0.50, the definition of abnormal will be reversed from a higher test value to a lower test value. Saves the curve to a stream. The stream to which the curve is to be serialized. Loads a curve from a stream. The stream from which the curve is to be deserialized. The deserialized curve. Loads a curve from a file. The path to the file from which the curve is to be deserialized. The deserialized curve. Saves the curve to a file. The path to the file to which the curve is to be serialized. Object to hold information about a Receiver Operating Characteristic Curve Point. Constructs a new Receiver Operating Characteristic point. Gets the cutoff value (discrimination threshold) for this point. Returns a System.String that represents the current ReceiverOperatingCharacteristicPoint. Represents a Collection of Receiver Operating Characteristic (ROC) Curve points. This class cannot be instantiated. Gets the (1-specificity, sensitivity) values as (x,y) coordinates. A jagged double array where each element is a double[] vector with two positions; the first is the value for 1-specificity (x) and the second the value for sensitivity (y). Gets an array containing (1-specificity) values for each point in the curve. Gets an array containing (sensitivity) values for each point in the curve. Returns a that represents this instance. A that represents this instance. Backward Stepwise Logistic Regression Analysis. The Backward Stepwise regression is an exploratory analysis procedure, where the analysis begins with a full (saturated) model and at each step variables are eliminated from the model in an iterative fashion. Significance tests are performed after each removal to check which of the variables can be discarded safely without degrading the model. When no more variables can be removed from the model without causing a significant loss in the model likelihood, the method can stop. 
double[][] inputs =
{
    //             Age  Smoking
    new double[] { 55,  0 },  // 1
    new double[] { 28,  0 },  // 2
    new double[] { 65,  1 },  // 3
    new double[] { 46,  0 },  // 4
    new double[] { 86,  1 },  // 5
    new double[] { 56,  1 },  // 6
    new double[] { 85,  0 },  // 7
    new double[] { 33,  0 },  // 8
    new double[] { 21,  1 },  // 9
    new double[] { 42,  1 },  // 10
    new double[] { 33,  0 },  // 11
    new double[] { 20,  1 },  // 12
    new double[] { 43,  1 },  // 13
    new double[] { 31,  1 },  // 14
    new double[] { 22,  1 },  // 15
    new double[] { 43,  1 },  // 16
    new double[] { 46,  0 },  // 17
    new double[] { 86,  1 },  // 18
    new double[] { 56,  1 },  // 19
    new double[] { 55,  0 },  // 20
};

// Additionally, we also have information about whether or not
// those patients had lung cancer. The array below gives 0 for
// those who did not, and 1 for those who did.
double[] output =
{
    0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
    0, 1, 1, 1, 1, 1, 0, 1, 1, 0
};

// Create a Stepwise Logistic Regression analysis
var regression = new StepwiseLogisticRegressionAnalysis(inputs, output,
    new[] { "Age", "Smoking" }, "Cancer");

regression.Compute(); // compute the analysis.

// The full model will be stored in the Complete property:
StepwiseLogisticRegressionModel full = regression.Complete;

// The best model will be stored in the Current property:
StepwiseLogisticRegressionModel best = regression.Current;

// Let's check the full model results
DataGridBox.Show(full.Coefficients);

// We can see only the Smoking variable is statistically significant.
// This is an indication the Age variable could be discarded from
// the model.

// And check the best inner model result
DataGridBox.Show(best.Coefficients);

// This is the best nested model found. This model only has the
// Smoking variable, which is still significant. Since no other
// variables can be dropped, this is the best final model.

// The variables used in the current best model are
string[] inputVariableNames = best.Inputs; // Smoking

// The best model likelihood ratio p-value is
ChiSquareTest test = best.ChiSquare; // {0.816990081334823}

// so the model is distinguishable from a null model. We can also
// query the other nested models by checking the Nested property:
DataGridBox.Show(regression.Nested);

// Finally, we can also use the analysis to classify a new patient
double y = regression.Current.Regression.Compute(new double[] { 1 });

// For a smoking person, the answer probability is approximately 83%.

Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Constructs a Stepwise Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. Constructs a Stepwise Logistic Regression Analysis. The input data for the analysis. The output data for the analysis. The names for the input variables. The name for the output variable. Gets or sets the maximum number of iterations to be performed by the regression algorithm. Default is 50. Gets or sets the difference between two iterations of the regression algorithm when the algorithm should stop. The difference is calculated based on the largest absolute parameter change of the regression. Default is 1e-5. Source data used in the analysis. Gets the dependent variable value for each of the source input points. Gets the resulting probabilities obtained by the most likely logistic regression model. Gets the current best nested model. Gets the full model. Gets the collection of nested models obtained after a step of the backward stepwise procedure. Gets or sets the name of the input variables.
Gets or sets the name of the output variables. Gets or sets the significance threshold used to determine if a nested model is significant or not. Gets the final set of input variable indices as selected by the stepwise procedure. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Computes the Stepwise Logistic Regression. Computes one step of the Stepwise Logistic Regression Analysis. Returns the index of the variable discarded in the step or -1 in case no variable could be discarded. Fits a logistic regression model to data until convergence. Stepwise Logistic Regression Nested Model. Gets information about the regression model coefficients in an object-oriented structure. Gets the Stepwise Logistic Regression Analysis to which this model belongs. Gets the regression model. Gets the subset of the original variables used by the model. Gets the name of the variables used in this model combined as a single string. Gets the Chi-Square Likelihood Ratio test for the model. Gets the subset of the original variables used by the model. Gets the Odds Ratio for each coefficient found during the logistic regression. Gets the Standard Error for each coefficient found during the logistic regression. Gets the Wald Tests for each coefficient. Gets the value of each coefficient. Gets the 95% Confidence Intervals (C.I.) for each coefficient found in the regression. Gets the Likelihood-Ratio Tests for each coefficient. Constructs a new Logistic regression model. Stepwise Logistic Regression Nested Model collection. This class cannot be instantiated. Represents a Logistic Regression Coefficient found in the Logistic Regression, allowing it to be bound to controls like the DataGridView. This class cannot be instantiated outside the . Gets the name for the current coefficient. Gets the Odds ratio for the current coefficient. Gets the Standard Error for the current coefficient. Gets the 95% confidence interval (C.I.) for the current coefficient. Gets the upper limit for the 95% confidence interval. Gets the lower limit for the 95% confidence interval. Gets the coefficient value. Gets the Wald's test performed for this coefficient. Gets the Likelihood-Ratio test performed for this coefficient. Represents a collection of Logistic Coefficients found in the . This class cannot be instantiated. Common interface for multivariate statistical analysis. Source data used in the analysis. Set of statistics functions operating over a circular space. This class represents a collection of common functions used in statistics. The values are handled as belonging to a distribution defined over a circle, such as the . Transforms circular data into angles (normalizes the data to be between -PI and PI). The samples to be transformed. The maximum possible sample value (such as 24 for hour data). Whether to perform the transformation in place. A double array containing the same data in , but normalized between -PI and PI. Transforms circular data into angles (normalizes the data to be between -PI and PI). The sample to be transformed. The maximum possible sample value (such as 24 for hour data). The normalized to be between -PI and PI. Transforms angular data back into circular data (reverts the transformation). The angle to be reconverted into the original unit.
The maximum possible sample value (such as 24 for hour data). Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The original before being converted. Computes the sum of cosines and sines for the given angles. A double array containing the angles in radians. The sum of cosines, returned as an out parameter. The sum of sines, returned as an out parameter. Computes the Mean direction of the given angles. A double array containing the angles in radians. The mean direction of the given angles. Computes the circular Mean direction of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The circular Mean direction of the given samples. Computes the Mean direction of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The mean direction of the given angles. Computes the mean resultant vector length (r) of the given angles. A double array containing the angles in radians. The mean resultant vector length of the given angles. Computes the resultant vector length (r) of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The mean resultant vector length of the given samples. Computes the mean resultant vector length (r) of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The mean resultant vector length of the given angles. Computes the circular variance of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The circular variance of the given samples. Computes the Variance of the given angles. A double array containing the angles in radians. The variance of the given angles. Computes the Variance of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The variance of the angles. Computes the circular standard deviation of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The circular standard deviation of the given samples. Computes the Standard Deviation of the given angles. A double array containing the angles in radians. The standard deviation of the given angles. Computes the Standard Deviation of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The standard deviation of the angles. Computes the circular angular deviation of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The circular angular deviation of the given samples. Computes the Angular Deviation of the given angles. A double array containing the angles in radians. The angular deviation of the given angles. 
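As a quick illustration of the descriptive measures listed above, the sketch below computes the circular mean, variance, standard deviation and angular deviation of a small set of angles, and also uses the documented sample-based overloads for data measured on a 24-hour circle. This is a minimal sketch based only on the method descriptions in this section; the static class name Circular and the exact overload signatures are assumptions, and the data is hypothetical.

using Accord.Statistics;

// Hypothetical angular measurements, in radians, already in the [-PI, +PI) range.
// Note that the values cluster around +/- PI, so an ordinary arithmetic mean
// (close to zero) would be misleading; the circular mean is not.
double[] angles = { 3.05, -3.10, 2.95, -3.00, 3.12, -2.98 };

double mean = Circular.Mean(angles);                   // circular mean direction
double variance = Circular.Variance(angles);           // circular variance
double stdDev = Circular.StandardDeviation(angles);    // circular standard deviation
double deviation = Circular.AngularDeviation(angles);  // angular deviation

// The overloads documented above also accept raw circular samples together
// with the maximum possible sample value (e.g. 24 for hour-of-day data):
double[] hours = { 23.5, 0.2, 1.1, 22.8, 23.9 };
double hourMean = Circular.Mean(hours, 24);
double hourVariance = Circular.Variance(hours, 24);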
Computes the Angular Deviation of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The angular deviation of the angles. Computes the circular standard error of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The confidence level. Default is 0.05. The circular standard error of the given samples. Computes the standard error of the given angles. A double array containing the angles in radians. The confidence level. Default is 0.05. The standard error of the given angles. Computes the standard error of the given angles. The number of samples. The sum of the cosines of the samples. The sum of the sines of the samples. The confidence level. Default is 0.05. The standard error of the angles. Computes the angular distance between two angles. The first angle. The second angle. The distance between the two angles. Computes the distance between two circular samples. The first sample. The second sample. The maximum possible value of the samples. The distance between the two angles. Computes the angular distance between two angles. The cosine of the first sample. The sin of the first sample. The cosine of the second sample. The sin of the second sample. The distance between the two angles. Computes the circular Median of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The circular Median of the given samples. Computes the circular Median direction of the given angles. A double array containing the angles in radians. The circular Median of the given angles. Computes the circular quartiles of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The first quartile, as an out parameter. The third quartile, as an out parameter. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given samples. Computes the circular quartiles of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The first quartile, as an out parameter. The third quartile, as an out parameter. The median value of the , if already known. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given samples. Computes the circular quartiles of the given circular angles. A double array containing the angles in radians. The first quartile, as an out parameter. The third quartile, as an out parameter. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given angles. 
Computes the circular quartiles of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The sample quartiles, as an out parameter. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given samples. Computes the circular quartiles of the given circular samples. The minimum possible value for a sample must be zero and the maximum must be indicated in the parameter . A double array containing the circular samples. The maximum possible value of the samples. The sample quartiles, as an out parameter. The median value of the , if already known. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given samples. Computes the circular quartiles of the given circular angles. A double array containing the angles in radians. The sample quartiles, as an out parameter. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given angles. Computes the circular quartiles of the given circular angles. A double array containing the angles in radians. The sample quartiles, as an out parameter. The angular median, if already known. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given angles. Computes the circular quartiles of the given circular angles. A double array containing the angles in radians. The first quartile, as an out parameter. The third quartile, as an out parameter. The angular median, if already known. Whether range values should be wrapped to be contained in the circle. If set to false, range values could be returned outside the [+pi;-pi] range. The quartile definition that should be used. See for datails. The median of the given angles. Computes the concentration (kappa) of the given angles. A double array containing the angles in radians. The concentration (kappa) parameter of the for the given data. Computes the concentration (kappa) of the given angles. A double array containing the angles in radians. The mean of the angles, if already known. The concentration (kappa) parameter of the for the given data. Computes the Weighted Mean of the given angles. A double array containing the angles in radians. An unit vector containing the importance of each angle in . The sum of this array elements should add up to 1. The mean of the given angles. Computes the Weighted Concentration of the given angles. A double array containing the angles in radians. An unit vector containing the importance of each angle in . The sum of this array elements should add up to 1. The mean of the given angles. Computes the Weighted Concentration of the given angles. A double array containing the angles in radians. An unit vector containing the importance of each angle in . The sum of this array elements should add up to 1. 
The mean of the angles, if already known. The mean of the given angles. Computes the maximum likelihood estimate of kappa given by Best and Fisher (1981). This method implements the approximation to the Maximum Likelihood Estimative of the kappa concentration parameter as suggested by Best and Fisher (1981), cited by Zheng Sun (2006) and Hussin and Mohamed (2008). Other useful approximations are given by Suvrit Sra (2009). References: A.G. Hussin and I.B. Mohamed, 2008. Efficient Approximation for the von Mises Concentration Parameter. Asian Journal of Mathematics & Statistics, 1: 165-169. Suvrit Sra, "A short note on parameter approximation for von Mises-Fisher distributions: and a fast implementation of $I_s(x)$". (revision of Apr. 2009). Computational Statistics (2011). Available on: http://www.kyb.mpg.de/publications/attachments/vmfnote_7045%5B0%5D.pdf Zheng Sun. M.Sc. Comparing measures of fit for circular distributions. Master thesis, 2006. Available on: https://dspace.library.uvic.ca:8443/bitstream/handle/1828/2698/zhengsun_master_thesis.pdf Computes the circular skewness of the given circular angles. A double array containing the angles in radians. The circular skewness for the given . Computes the circular kurtosis of the given circular angles. A double array containing the angles in radians. The circular kurtosis for the given . Computes the complex circular central moments of the given circular angles. Computes the complex circular non-central moments of the given circular angles. Common interface for divergence measures (between probability distributions). The type of the first distribution to be compared. The type of the second distribution to be compared. Common interface for divergence measures (between probability distributions). The type of the distributions to be compared. Contains more than 40 statistical distributions, with support for most probability distribution measures and estimation methods. This namespace contains a huge collection of probability distributions, ranging the from the common and simple Normal (Gaussian) and Poisson distributions to Inverse-Wishart and multivariate mixture distributions, including many specialized univariate distributions used in statistical hypothesis testing. Some of those distributions include the , , , and many others. For a complete list of all univariate probability distributions, check the namespace. For a complete list of all multivariate distributions, please see the namespace. A list of density kernels such as the Gaussian kernel and the Epanechnikov kernel are available in the namespace. The namespace class diagram is shown below. The namespace class diagram for univariate distributions is shown below. The namespace class diagram for multivariate distributions is shown below. Contains density estimation kernels which can be used in combination with empirical distributions and multivariate empirical distributions. Common interface for density estimation kernels. Those kernels are different from kernel functions. Density estimation kernels are required to obey normalization rules in order to fulfill integrability and behavioral properties. Moreover, they are defined over a single input vector, the point estimate of a random variable. Computes the kernel density function. The input point. A density estimate around . Epanechnikov density kernel. References: Comaniciu, Dorin, and Peter Meer. "Mean shift: A robust approach toward feature space analysis." 
Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.5 (2002): 603-619. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1000236 Dan Styer, Oberlin College Department of Physics and Astronomy; Volume of a d-dimensional sphere. Last updated 30 August 2007. Available at: http://www.oberlin.edu/physics/dstyer/StatMech/VolumeDSphere.pdf David W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, Wiley, Aug 31, 1992 The following example shows how to fit a using Epanechnikov kernels. Gets or sets the kernel's normalization constant. Initializes a new instance of the class. Initializes a new instance of the class. The constant by which the kernel formula is multiplied at the end. Default is to consider the area of a unit-sphere of dimension 1. Initializes a new instance of the class. The desired dimension d. Computes the kernel density function. The input point. A density estimate around . Computes the kernel profile function. The point estimate x. The value of the profile function at point . Computes the derivative of the kernel profile function. The point estimate x. The value of the derivative profile function at point . Gaussian density kernel. This class provides a Gaussian density kernel (not to be confused with a kernel function) to be used in density estimation models (i.e. ) and clustering algorithms (i.e. MeanShift. References: Comaniciu, Dorin, and Peter Meer. "Mean shift: A robust approach toward feature space analysis." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.5 (2002): 603-619. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1000236 Dan Styer, Oberlin College Department of Physics and Astronomy; Volume of a d-dimensional sphere. Last updated 30 August 2007. Available at: http://www.oberlin.edu/physics/dstyer/StatMech/VolumeDSphere.pdf David W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, Wiley, Aug 31, 1992 The following example shows how to fit a using Gaussian kernels: Gets or sets the kernel's normalization constant. Initializes a new instance of the class. The desired dimension d. Initializes a new instance of the class. The normalization constant to use. Computes the kernel density function. The input point. A density estimate around . Computes the kernel profile function. The squared point estimate . The value of the profile function at point ². Computes the derivative of the kernel profile function. The point estimate x. The value of the derivative profile function at point . Common interface for radially symmetric kernels. Computes the kernel profile function. The point estimate x. The value of the profile function at point . Computes the derivative of the kernel profile function. The point estimate x. The value of the derivative profile function at point . Uniform density kernel. References: Comaniciu, Dorin, and Peter Meer. "Mean shift: A robust approach toward feature space analysis." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.5 (2002): 603-619. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1000236 Dan Styer, Oberlin College Department of Physics and Astronomy; Volume of a d-dimensional sphere. Last updated 30 August 2007. Available at: http://www.oberlin.edu/physics/dstyer/StatMech/VolumeDSphere.pdf David W. 
Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, Wiley, Aug 31, 1992 The following example demonstrates how to use the Mean Shift algorithm with a uniform kernel to solve a clustering task: Gets or sets the kernel's normalization constant. Initializes a new instance of the class. Initializes a new instance of the class. The normalization constant c. Computes the kernel density function. The input point. A density estimate around . Computes the kernel profile function. The point estimate x. The value of the profile function at point . Computes the derivative of the kernel profile function. The point estimate x. The value of the derivative profile function at point . Contains special options which can be used in distribution fitting (estimation) methods. BetaPERT's distribution estimation method. Estimates the mode using the classic method. Estimates the mode using the Vose method. Estimation options for Beta PERT distributions. Gets or sets the index of the minimum observed value, if already known. Default is -1. Gets or sets the index of the maximum observed value, if already known. Default is -1. Gets or sets which estimation method should be used by the fitting algorithm. Default is . Gets or sets a value indicating whether the observations are already sorted. Set to true if the observations are sorted; otherwise, false. Gets or sets a value indicating whether the maximum value should be treated as fixed and not be estimated. Default is true. Gets or sets a value indicating whether the minimum value should be treated as fixed and not be estimated. Default is true. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation methods for Beta distributions. Method-of-moments estimation. Maximum Likelihood estimation. Estimation options for Beta distributions. Gets or sets which estimation method should be used by the fitting algorithm. Default is . Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for Gamma distributions. Gets or sets the relative tolerance when iteratively estimating the distribution. Default is 1e-8. The relative tolerance value. Gets or sets the maximum number of iterations to attempt when estimating the Gamma distribution. Default is 1000. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Common interface for fitting options that support sharing parameters between multiple components of a compound, mixture distribution. Gets or sets a post processing step can be called after all component distributions have been fitted (or their .Fit() method has been called). Triangular distribution's mode estimation method. Estimates the mode using the mean-maximum-minimum method. Estimates the mode using the standard algorithm. Estimates the mode using the bisection algorithm. Estimation options for Triangular distributions. Gets or sets the index of the minimum observed value, if already known. Default is -1. Gets or sets the index of the maximum observed value, if already known. Default is -1. Gets or sets a value indicating whether the observations are already sorted. Set to true if the observations are sorted; otherwise, false. Gets or sets the mode estimation method to use. Default is . 
Gets or sets a value indicating whether the maximum value should be treated as fixed and not be estimated. Default is true. Gets or sets a value indicating whether the minimum value should be treated as fixed and not be estimated. Default is true. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Expectation Maximization algorithm for mixture model fitting in the log domain. The type of the observations being fitted. This class implements a generic version of the Expectation-Maximization algorithm which can be used with both univariate or multivariate distribution types. Gets or sets the fitting options to be used when any of the component distributions need to be estimated from the data. Gets or sets convergence properties for the expectation-maximization algorithm. Gets the current coefficient values. Gets the current component distribution values. Gets the responsibility of each input vector when estimating each of the component distributions, in the last iteration. Creates a new algorithm. The initial coefficient values. The initial component distributions. Estimates a mixture distribution for the given observations using the Expectation-Maximization algorithm. The observations from the mixture distribution. The log-likelihood of the estimated mixture model. Computes the log-likelihood of the distribution for a given set of observations. Smoothing rule function definition for Empirical distributions. The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. The number of times each sample should be repeated. An estimative of the smoothing parameter. Estimation options for Multivariate Empirical distributions. Gets or sets the smoothing rule used to compute the smoothing parameter in the . Default is to use Silverman's rule. Gets or sets whether the empirical distribution should be take the observation and weight vectors directly instead of making a copy beforehand. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Smoothing rule function definition for Empirical distributions. The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. The number of times each sample should be repeated. An estimative of the smoothing parameter. Estimation options for Empirical distributions. Gets or sets the smoothing rule used to compute the smoothing parameter in the . Default is to use the normal distribution bandwidth approximation. Gets or sets whether the empirical distribution should be take the observation and weight vectors directly instead of making a copy beforehand. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Expectation Maximization algorithm for mixture model fitting. The type of the observations being fitted. This class implements a generic version of the Expectation-Maximization algorithm which can be used with both univariate or multivariate distribution types. Gets or sets the fitting options to be used when any of the component distributions need to be estimated from the data. Gets or sets convergence properties for the expectation-maximization algorithm. Gets the current coefficient values. Gets the current component distribution values. 
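The Expectation-Maximization classes described here are usually driven indirectly, by fitting a mixture distribution and passing a MixtureOptions object (described further below) to its Fit method. The sketch below shows that route on hypothetical one-dimensional data; the Mixture, NormalDistribution and MixtureOptions signatures used here are assumed from the descriptions in this section.

// Hypothetical one-dimensional observations drawn from two well-separated groups
double[] samples = { 0.1, 0.3, 0.2, 0.4, 0.0, 3.9, 4.1, 4.0, 4.2, 3.8 };

// Start from two rough Normal components; the EM iterations will refine
// both the component parameters and the mixing coefficients
var mixture = new Mixture<NormalDistribution>(
    new NormalDistribution(0, 1),
    new NormalDistribution(4, 1));

// Fit the mixture; EM stops once the log-likelihood change
// falls below the given convergence threshold
mixture.Fit(samples, new MixtureOptions(1e-5));

// Query the fitted model
double density = mixture.ProbabilityDensityFunction(3.9);
double[] weights = mixture.Coefficients; // estimated mixing coefficients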
Gets the responsibility of each input vector when estimating each of the component distributions, in the last iteration. Creates a new algorithm. The initial coefficient values. The initial component distributions. Estimates a mixture distribution for the given observations using the Expectation-Maximization algorithm. The observations from the mixture distribution. The log-likelihood of the estimated mixture model. Estimates a mixture distribution for the given observations using the Expectation-Maximization algorithm, assuming different weights for each observation. The observations from the mixture distribution. The weight of each observation. The log-likelihood of the estimated mixture model. Computes the log-likelihood of the distribution for a given set of observations. Computes the log-likelihood of the distribution for a given set of observations. Computes the log-likelihood of the distribution for a given set of observations. Computes the log-likelihood of the distribution for a given set of observations. Estimation options for multivariate independent distributions. Gets or sets the fitting options for the inner independent components in the joint distribution. Gets or sets the fitting options for specific inner independent components in the joint distribution. Gets or sets whether the data to be fitted has already been transposed. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for multivariate independent distributions. Gets or sets the fitting options for the inner independent components in the joint distribution. Gets or sets the fitting options for specific inner independent components in the joint distribution. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Fitting options for hidden Markov model distributions. Gets or sets the learning function for the hidden Markov model. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Options for Survival distributions. Default survival estimation method. Returns . Gets or sets the values for the right-censoring variable. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Options for Empirical Hazard distributions. Default hazard estimator. Returns . Default tie handling method. Returns . Gets or sets the estimator to be used. Default is . Gets or sets the tie handling method to be used. Default is . Initializes a new instance of the class. Initializes a new instance of the class. Initializes a new instance of the class. Initializes a new instance of the class. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for Cauchy distributions. Gets or sets a value indicating whether the distribution parameters should be estimated using maximum likelihood. Default is true. The Cauchy distribution parameters can be estimated in many ways. One approach is to use order statistics to derive approximations to the location and scale parameters by analysis the interquartile range of the data. The other approach is to use Maximum Likelihood to estimate the parameters. The MLE does not exists in simple algebraic form, so it has to be estimated using numeric optimization. 
true if the parameters should be estimated by ML; otherwise, false. Gets or sets a value indicating whether the scale parameter should be estimated. Default is true. true if the scale parameter should be estimated; otherwise, false. Gets or sets a value indicating whether the location parameter should be estimated. Default is true. true if the location parameter should be estimated; otherwise, false. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimable parameters of Hypergeometric distributions. Population size parameter N. Successes in population parameter m. Estimation options for Hypergeometric distributions. Gets or sets which parameter of the Hypergeometric distribution should be estimated. The hypergeometric parameters to estimate. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for general discrete (categorical) distributions. Gets or sets the minimum allowed probability in the frequency tables specifying the discrete distribution. Default is 1e-10. Gets ors sets whether to use Laplace's rule of succession to avoid zero probabilities. Default is false. Gets or sets how much percent of the previous value for the distribution should be kept in its updated value. Default is 0. Gets or sets whether current frequency values in the distribution should be considered as priors during the next time the distribution is estimated. Default is false. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for Von-Mises distributions. Gets or sets a value indicating whether to use bias correction when estimating the concentration parameter of the von-Mises distribution. true to use bias correction; otherwise, false. For more information, see: Best, D. and Fisher N. (1981). The bias of the maximum likelihood estimators of the von Mises-Fisher concentration parameters. Communications in Statistics - Simulation and Computation, B10(5), 493-502. Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Common interface for distribution fitting option objects. Estimation options for univariate and multivariate mixture distributions. Gets or sets the convergence criterion for the Expectation-Maximization algorithm. Default is 1e-3. The convergence threshold. Gets or sets the maximum number of iterations to be performed by the Expectation-Maximization algorithm. Default is zero (iterate until convergence). Gets or sets the maximum number of iterations to be performed by the Expectation-Maximization algorithm. Default is zero (iterate until convergence). Gets or sets the parallelization options to be used when fitting. The parallel options. Gets or sets the fitting options for the inner component distributions of the mixture density. The fitting options for inner distributions. Gets or sets whether to make computations using the log -domain. This might improve accuracy on large datasets. Initializes a new instance of the class. Initializes a new instance of the class. The convergence criterion for the Expectation-Maximization algorithm. Default is 1e-3. Initializes a new instance of the class. The convergence criterion for the Expectation-Maximization algorithm. Default is 1e-3. 
The fitting options for the inner component distributions of the mixture density. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimation options for Normal distributions. Gets or sets the regularization step to avoid singular or non-positive definite covariance matrices. Default is 0. Setting this property to a small constant like 1e-6 is more efficient than setting to true. The regularization step. Gets or sets a value indicating whether the covariance matrix to be estimated should be assumed to be diagonal. true to estimate a diagonal covariance matrix; otherwise, false. Gets or sets whether the estimation function should allow non-positive definite covariance matrices by using the Singular Value Decomposition Function. Enabling this property can significantly increase learning times. Gets or sets whether the normal distributions should have only a single, shared covariance matrix among all components in a mixture. Setting this property only has effect if the distributions are part of a or Gets or sets a post processing step can be called after all component distributions have been fitted (or their .Fit() method has been called). Initializes a new instance of the class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Common interface for distributions which can be estimated from data. The type of the observations, such as . The type of the options specifying object. Common interface for distributions which can be estimated from data. The type of the observations, such as . The type of the options specifying object. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Common interface for distributions which can be estimated from data. The type of the observations, such as . Common interface for distributions which can be estimated from data. Common interface for distributions which can be estimated from data. The type of the observations, such as . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Common interface for sampleable distributions (i.e. distributions that allow the generation of new samples through the method. The type of the observations, such as . Generates a random observation from the current distribution. The location where to store the sample. A random observation drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. 
The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Discovers the parameters of a probability distribution and helps determine their range and whether then can be constructed automatically from their indicated parameter ranges. Gets the reflection constructor information. Gets a value indicating whether it is possible to discover enough information about this constructor such that the distribution can be constructed using reflection. true if this instance is buildable; otherwise, false. Gets the parameters of the constructor. Initializes a new instance of the class. The distribution's constructor. Discovers the parameters of a probability distribution and helps determine their range and whether then can be constructed automatically from their indicated parameter ranges. Gets the distribution's type information. Gets the name of this distribution in a more human-readable form. Gets a value indicating whether it is possible to discover enough information about this constructor such that the distribution can be constructed using reflection. true if this instance is buildable; otherwise, false. Gets a value indicating whether the distribution modeled by is a discrete-valued distribution. Discrete distributions are assumed to inherit from or . Gets a value indicating whether the distribution modeled by is a continuous-valued distribution. Discrete distributions are assumed to inherit from or . Gets a value indicating whether the distribution modeled by is univariate. A distribution is assumed to be univariate if it implements the interface. Gets a value indicating whether the distribution modeled by is multivariate. A distribution is assumed to be univariate if it implements the interface. Initializes a new instance of the class. The type for the distribution. Gets the public constructors of the distribution. Gets the fitting options object that are expected by the distribution, if any. An Accord.NET distribution object can be fitted to a set of observed values. However, some distributions have different settings on how this fitting can be done. This function creates an object that contains those possible settings that can be configured for a given distribution type. Gets the name of the distribution modeled by a given Accord.NET type. The name is returned in a normalized form (i.e. given a type whose name is NormalDistribution, the function would return "Normal"). Gets the fitting options object that are expected by one distribution, if any. An Accord.NET distribution object can be fit to a set of observed values. However, some distributions have different settings on how this fitting can be done. This function creates an object that contains those possible settings that can be configured for a given distribution type. Gets the fitting options object that are expected by one distribution, if any. An Accord.NET distribution object can be fit to a set of observed values. However, some distributions have different settings on how this fitting can be done. 
This function creates an object that contains those possible settings that can be configured for a given distribution type. Returns a that represents this instance. A that represents this instance. Discovers the parameters of a univariate probability distribution and helps determine their range and whether then can be constructed automatically from their indicated parameter ranges. Initializes a new instance of the class. The type for the distribution. Gets an array containing all univariate distributions by inspecting Accord.NET assemblies using reflection. Creates a new instance of the distribution using the given arguments. The arguments to be passed to the distribution's constructor. Creates a new instance of the distribution using default arguments, if the distribution declared them using parameter attributes. Discovers the parameters of a probability distribution and helps determine their range and whether then can be constructed automatically from their indicated parameter ranges. Gets the reflection parameter information. Gets the name of this parameter. Gets the position of this parameter in the declaration of the constructor. Gets the range of valid values for this parameter (i.e. in a , the standard deviation parameter cannot be negative). Gets the default value for this parameter (i.e. in a , the default value for the mean is 0). Gets a value indicating whether it is possible to discover enough information about this constructor such that the distribution can be constructed using reflection. true if this instance is buildable; otherwise, false. Initializes a new instance of the class. The parameter information. Tries to get the valid range of a distribution's parameter. Tries to get the default value of a distribution's parameter. Returns a that represents this instance. A that represents this instance. Contains a multivariate distributions such as the multivariate Normal, Multinomial, Independent, Joint and Mixture distributions. The namespace class diagram is shown below. Abstract class for Matrix Probability Distributions. A probability distribution identifies either the probability of each value of an unidentified random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. The function describing the probability that a given value will occur is called the probability function (or probability density function, abbreviated PDF), and the function describing the cumulative probability that a given value or any value smaller than it will occur is called the distribution function (or cumulative distribution function, abbreviated CDF). References: Wikipedia, The Free Encyclopedia. Probability distribution. Available on: http://en.wikipedia.org/wiki/Probability_distribution Weisstein, Eric W. "Statistical Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalDistribution.html Constructs a new MultivariateDistribution class. The number of rows for matrices modeled by the distribution. The number of rows for matrices modeled by the distribution. Gets the number of variables for this distribution. Gets the number of rows that matrices from this distribution should have. Gets the number of columns that matrices from this distribution should have. 
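The members that follow (Mean, Variance, Covariance, the probability functions, Fit and Generate) are shared by the multivariate distribution base classes and are easiest to see on a concrete class. The sketch below uses MultivariateNormalDistribution together with the NormalOptions regularization setting described earlier; it is a minimal illustrative example and the small data set is hypothetical.

// Hypothetical two-dimensional observations
double[][] observations =
{
    new double[] { 1.0, 2.0 },
    new double[] { 1.2, 2.1 },
    new double[] { 0.9, 1.9 },
    new double[] { 1.1, 2.2 },
    new double[] { 1.3, 2.0 },
};

// Create a distribution of the right dimensionality and fit it to the data,
// adding a small regularization term so the estimated covariance matrix
// remains positive-definite even for small or degenerate samples
var normal = new MultivariateNormalDistribution(2);
normal.Fit(observations, new NormalOptions() { Regularization = 1e-6 });

double[] mean = normal.Mean;        // estimated mean vector
double[,] cov = normal.Covariance;  // estimated covariance matrix

// Evaluate the density at a point and draw new samples from the fit
double pdf = normal.ProbabilityDensityFunction(new double[] { 1.0, 2.0 });
double[][] generated = normal.Generate(100);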
Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Gets the mode for this distribution. A vector containing the mode values for the distribution. Gets the median for this distribution. A vector containing the median values for the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Fits the underlying distribution to a given set of observations. 
The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. 
A random observation drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. The x. System.Double. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The x. System.Double. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Multivariate empirical distribution. Empirical distributions are based solely on the data. This class uses the empirical distribution function and the Gaussian kernel density estimation to provide an univariate continuous distribution implementation which depends only on sampled data. References: Wikipedia, The Free Encyclopedia. Empirical Distribution Function. Available on: http://en.wikipedia.org/wiki/Empirical_distribution_function PlanetMath. Empirical Distribution Function. Available on: http://planetmath.org/encyclopedia/EmpiricalDistributionFunction.html Wikipedia, The Free Encyclopedia. Kernel Density Estimation. Available on: http://en.wikipedia.org/wiki/Kernel_density_estimation Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. Buch-Kromann, T.; Nonparametric Density Estimation (Multidimension), 2007. Available in http://www.buch-kromann.dk/tine/nonpar/Nonparametric_Density_Estimation_multidim.pdf W. Härdle, M. Müller, S. Sperlich, A. Werwatz; Nonparametric and Semiparametric Models, 2004. Available in http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/ebooks/html/spm/spmhtmlnode18.html The first example shows how to fit a using Gaussian kernels: The second example shows how to the same as above, but using Epanechnikov kernels instead. Creates a new Empirical Distribution from the data samples. The data samples. Creates a new Empirical Distribution from the data samples. The data samples. The fractional weights to use for the samples. The weights must sum up to one. Creates a new Empirical Distribution from the data samples. The data samples. 
The number of repetition counts for each sample. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples forming the distribution. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples forming the distribution. The fractional weights to use for the samples. The weights must sum up to one. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples forming the distribution. The number of repetition counts for each sample. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples. The number of repetition counts for each sample. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The kernel density function to use. Default is to use the . The data samples. The fractional weights to use for the samples. The weights must sum up to one. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The data samples. The number of repetition counts for each sample. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The data samples. The fractional weights to use for the samples. The weights must sum up to one. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Gets the kernel density function used in this distribution. Gets the samples giving this empirical distribution. Gets the fractional weights associated with each sample. Note that changing values on this array will not result in any effect in this distribution. The distribution must be computed from scratch with new values in case new weights need to be used. Gets the repetition counts associated with each sample. Note that changing values on this array will not result in any effect in this distribution. The distribution must be computed from scratch with new values in case new counts need to be used. Gets the total number of samples in this distribution. Gets the bandwidth smoothing parameter used in the kernel density estimation. Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur.
Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets Silverman's rule estimate of the smoothing parameter. This is the default smoothing rule applied when estimating empirical distributions. This method is described on Wikipedia, at http://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation The observations for the empirical distribution. An estimate of the smoothing parameter. Gets Silverman's rule estimate of the smoothing parameter. This is the default smoothing rule applied when estimating empirical distributions. This method is described on Wikipedia, at http://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. An estimate of the smoothing parameter. Gets Silverman's rule estimate of the smoothing parameter. This is the default smoothing rule applied when estimating empirical distributions. This method is described on Wikipedia, at http://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation The observations for the empirical distribution. The number of times each sample should be repeated. An estimate of the smoothing parameter. Gets Silverman's rule estimate of the smoothing parameter. This is the default smoothing rule applied when estimating empirical distributions. This method is described on Wikipedia, at http://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. The number of times each sample should be repeated. An estimate of the smoothing parameter.
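The example code referenced above did not survive extraction, so the following is only a minimal sketch of the Gaussian-kernel case. The class, kernel, and Silverman's-rule member names used here (MultivariateEmpiricalDistribution, GaussianKernel, EpanechnikovKernel, SilvermanRule) follow the summaries in this section, but their exact signatures are assumptions and should be checked against the library.

using Accord.Statistics.Distributions.DensityKernels;
using Accord.Statistics.Distributions.Multivariate;

// A few two-dimensional observations:
double[][] samples =
{
    new double[] { 1.2, 0.7 },
    new double[] { 0.9, 1.1 },
    new double[] { 1.5, 0.8 },
    new double[] { 2.0, 1.6 },
};

// Simplest form: the default kernel and the default normal-distribution
// (Silverman) bandwidth approximation are used automatically.
var dist = new MultivariateEmpiricalDistribution(samples);

// Equivalent explicit form, passing a Gaussian kernel for two dimensions
// (an EpanechnikovKernel could be passed instead, as mentioned above):
var gaussian = new MultivariateEmpiricalDistribution(new GaussianKernel(2), samples);

// Evaluate the estimated density at a new point:
double pdf = dist.ProbabilityDensityFunction(new double[] { 1.0, 1.0 });

// The default smoothing matrix can also be computed directly through the
// Silverman's-rule member documented above (method name assumed):
double[,] smoothing = MultivariateEmpiricalDistribution.SilvermanRule(samples);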
Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Returns a that represents this instance. The format. The format provider. A that represents this instance. Inverse Wishart Distribution. The inverse Wishart distribution, also called the inverted Wishart distribution, is a probability distribution defined on real-valued positive-definite matrices. In Bayesian statistics it is used as the conjugate prior for the covariance matrix of a multivariate normal distribution. References: Wikipedia, The Free Encyclopedia. Inverse Wishart distribution. Available from: http://en.wikipedia.org/wiki/Inverse-Wishart_distribution // Create a Inverse Wishart with the parameters var invWishart = new InverseWishartDistribution( // Degrees of freedom degreesOfFreedom: 4, // Scale parameter inverseScale: new double[,] { { 1.7, -0.2 }, { -0.2, 5.3 }, } ); // Common measures double[] var = invWishart.Variance; // { -3.4, -10.6 } double[,] cov = invWishart.Covariance; // see below double[,] mmean = invWishart.MeanMatrix; // see below // cov mean // -5.78 -4.56 1.7 -0.2 // -4.56 -56.18 -0.2 5.3 // (the above matrix representations have been transcribed to text using) string scov = cov.ToString(DefaultMatrixFormatProvider.InvariantCulture); string smean = mmean.ToString(DefaultMatrixFormatProvider.InvariantCulture); // For compatibility reasons, .Mean stores a flattened mean matrix double[] mean = invWishart.Mean; // { 1.7, -0.2, -0.2, 5.3 } // Probability density functions double pdf = invWishart.ProbabilityDensityFunction(new double[,] { { 5.2, 0.2 }, // 0.000029806281690351203 { 0.2, 4.2 }, }); double lpdf = invWishart.LogProbabilityDensityFunction(new double[,] { { 5.2, 0.2 }, // -10.420791391688828 { 0.2, 4.2 }, }); Creates a new Inverse Wishart distribution. The degrees of freedom v. The inverse scale matrix Ψ (psi). Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Not supported. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a matrix distribution, such as the Wishart's, this should be a positive-definite matrix or a matrix written in flat vector form. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a matrix distribution, such as the Wishart's, this should be a positive-definite matrix or a matrix written in flat vector form. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Uniform distribution inside a n-dimensional ball. 
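No usage example accompanies this class in the text, so the snippet below is only an illustrative sketch. The class name is not stated here and is assumed to be UniformBallDistribution; the members used (the mean/radius constructor, ProbabilityDensityFunction and Generate) are taken from the summaries that follow, and all of these names should be verified against the actual assembly.

// Illustrative sketch only -- class and parameter names assumed (see note above).
// A uniform distribution inside a 3-dimensional ball of radius 2 centered at the origin:
var ball = new UniformBallDistribution(new double[] { 0, 0, 0 }, 2.0);

// For a uniform distribution the density is constant inside the ball
// (equal to 1 / Volume) and zero outside of it:
double inside = ball.ProbabilityDensityFunction(new double[] { 0.5, 0.5, 0.5 });
double outside = ball.ProbabilityDensityFunction(new double[] { 5.0, 0.0, 0.0 });

// Points can also be drawn uniformly from the interior of the ball:
double[][] points = ball.Generate(100);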
Initializes a new instance of the class. The number of dimensions in the n-dimensional sphere. Initializes a new instance of the class. The sphere's mean. The sphere's radius. Gets the sphere radius. Gets the sphere volume. Gets the sphere center (mean) vector. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Not implemented. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Returns a that represents this instance. The format. The format provider. A that represents this instance. Uniform distribution inside a n-dimensional ball. Initializes a new instance of the class. The number of dimensions in the n-dimensional sphere. Initializes a new instance of the class. The sphere's mean. The sphere's radius. Gets the sphere radius. Gets the sphere volume. Gets the sphere center (mean) vector. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Not implemented. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . 
A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The sphere's mean. The sphere's radius. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The sphere's mean. The sphere's radius. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The number of dimensions in the n-dimensional sphere. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The number of dimensions in the n-dimensional sphere. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The number of dimensions in the n-dimensional sphere. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The number of dimensions in the n-dimensional sphere. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Returns a that represents this instance. The format. The format provider. A that represents this instance. Von Mises–Fisher distribution. In directional statistics, the von Mises–Fisher distribution is a probability distribution on the (p-1)-dimensional sphere in R^p. If p = 2 the distribution reduces to the von Mises distribution on the circle. References: Wikipedia, The Free Encyclopedia. Von Mises-Fisher Distribution. Available on: https://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution Constructs a von Mises–Fisher distribution with unit mean. The number of dimensions in the distribution. The concentration value κ (kappa). Constructs a von Mises–Fisher distribution with the given mean direction. The mean direction vector (with unit length). The concentration value κ (kappa). Gets the mean for this distribution. A vector containing the mean values for the distribution. Not supported. Not supported. Not supported. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The vector should have the same dimension as the distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Generates a random vector of observations from the current distribution. The number of samples to generate.
The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Wishart Distribution. The Wishart distribution is a generalization to multiple dimensions of the Chi-Squared distribution, or, in the case of non-integer degrees of freedom, of the Gamma distribution . References: Wikipedia, The Free Encyclopedia. Wishart distribution. Available from: http://en.wikipedia.org/wiki/Wishart_distribution // Create a Wishart distribution with the parameters: WishartDistribution wishart = new WishartDistribution( // Degrees of freedom degreesOfFreedom: 7, // Scale parameter scale: new double[,] { { 4, 1, 1 }, { 1, 2, 2 }, // (must be symmetric and positive definite) { 1, 2, 6 }, } ); // Common measures double[] var = wishart.Variance; // { 224, 56, 504 } double[,] cov = wishart.Covariance; // see below double[,] meanm = wishart.MeanMatrix; // see below // 224 63 175 28 7 7 // cov = 63 56 112 mean = 7 14 14 // 175 112 504 7 14 42 // (the above matrix representations have been transcribed to text using) string scov = cov.ToString(DefaultMatrixFormatProvider.InvariantCulture); string smean = meanm.ToString(DefaultMatrixFormatProvider.InvariantCulture); // For compatibility reasons, .Mean stores a flattened mean matrix double[] mean = wishart.Mean; // { 28, 7, 7, 7, 14, 14, 7, 14, 42 } // Probability density functions double pdf = wishart.ProbabilityDensityFunction(new double[,] { { 8, 3, 1 }, { 3, 7, 1 }, // 0.000000011082455043473361 { 1, 1, 8 }, }); double lpdf = wishart.LogProbabilityDensityFunction(new double[,] { { 8, 3, 1 }, { 3, 7, 1 }, // -18.317902605850534 { 1, 1, 8 }, }); Creates a new Wishart distribution. The number of rows in the covariance matrices. The degrees of freedom n. Creates a new Wishart distribution. The degrees of freedom n. The positive-definite matrix scale matrix V. Gets the degrees of freedom for this Wishart distribution. Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a matrix distribution, such as the Wishart's, this should be a positive-definite matrix or a matrix written in flat vector form. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a matrix distribution, such as the Wishart's, this should be a positive-definite matrix or a matrix written in flat vector form. The logarithm of the probability of x occurring in the current distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Unsupported. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. 
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Generates a random vector of observations from the current distribution. The degrees of freedom n. The positive-definite scale matrix V. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The degrees of freedom n. The positive-definite scale matrix V. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The degrees of freedom n. The positive-definite scale matrix V. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The degrees of freedom n. The positive-definite scale matrix V. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The degrees of freedom n. The positive-definite scale matrix V. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Common interface for components of mixture distributions. The type of the component distribution. Gets the component distribution. Gets the weight associated with this component. Joint distribution assuming independence between vector components. In probability and statistics, given at least two random variables X, Y, ..., that are defined on a probability space, the joint probability distribution for X, Y, ... is a probability distribution that gives the probability that each of X, Y, ... falls in any particular range or discrete set of values specified for that variable. In the case of only two random variables, this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution. This class is also available in a generic version, allowing for any choice of component distribution. References: Wikipedia, The Free Encyclopedia. Joint probability distribution. Available from: http://en.wikipedia.org/wiki/Joint_probability_distribution The following example shows how to declare and initialize an Independent Joint Gaussian Distribution using known means and variances for each component. // Declare two normal distributions NormalDistribution pa = new NormalDistribution(4.2, 1); // p(a) NormalDistribution pb = new NormalDistribution(7.0, 2); // p(b) // Now, create a joint distribution combining these two: var joint = new Independent(pa, pb); // This distribution assumes the distributions of the two components are independent, // i.e. if we have 2D input vectors of the form {a, b}, then p({a,b}) = p(a) * p(b). // Let's check a simple example.
Consider a 2D input vector x = { 4.2, 7.0 } as // double[] x = new double[] { 4.2, 7.0 }; // Those two should be completely equivalent: double p1 = joint.ProbabilityDensityFunction(x); double p2 = pa.ProbabilityDensityFunction(x[0]) * pb.ProbabilityDensityFunction(x[1]); bool equal = p1 == p2; // at this point, equal should be true. Initializes a new instance of the class. The components. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Joint distribution assuming independence between vector components. The type of the underlying distributions. The type for the observations being modeled by the distribution (i.e. double). The options for fitting the distribution to the observations. In probability and statistics, given at least two random variables X, Y, ..., that are defined on a probability space, the joint probability distribution for X, Y, ... is a probability distribution that gives the probability that each of X, Y, ... falls in any particular range or discrete set of values specified for that variable. In the case of only two random variables, this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution. References: Wikipedia, The Free Encyclopedia. Beta distribution. Available from: http://en.wikipedia.org/wiki/Joint_probability_distribution The following example shows how to declare and initialize an Independent Joint Gaussian Distribution using known means and variances for each component. // Declare two normal distributions NormalDistribution pa = new NormalDistribution(4.2, 1); // p(a) NormalDistribution pb = new NormalDistribution(7.0, 2); // p(b) // Now, create a joint distribution combining these two: var joint = new Independent<NormalDistribution>(pa, pb); // This distribution assumes the distributions of the two components are independent, // i.e. if we have 2D input vectors on the form {a, b}, then p({a,b}) = p(a) * p(b). // Lets check a simple example. Consider a 2D input vector x = { 4.2, 7.0 } as // double[] x = new double[] { 4.2, 7.0 }; // Those two should be completely equivalent: double p1 = joint.ProbabilityDensityFunction(x); double p2 = pa.ProbabilityDensityFunction(x[0]) * pb.ProbabilityDensityFunction(x[1]); bool equal = p1 == p2; // at this point, equal should be true. The following example shows how to fit a distribution (estimate its parameters) from a given dataset. // Let's consider an input dataset of 2D vectors. We would // like to estimate an Independent<NormalDistribution> // which best models this data. double[][] data = { // x, y new double[] { 1, 8 }, new double[] { 2, 6 }, new double[] { 5, 7 }, new double[] { 3, 9 }, }; // We start by declaring some initial guesses for the // distributions of each random variable (x, and y): // var distX = new NormalDistribution(0, 1); var distY = new NormalDistribution(0, 1); // Next, we declare our initial guess Independent distribution var joint = new Independent<NormalDistribution>(distX, distY); // We can now fit the distribution to our data, // producing an estimate of its parameters: // joint.Fit(data); // At this point, we have estimated our distribution. 
double[] mean = joint.Mean; // should be { 2.75, 7.50 } double[] var = joint.Variance; // should be { 2.917, 1.667 } // | 2.917, 0.000 | double[,] cov = joint.Covariance; // Cov = | | // | 0.000, 1.667 | // The covariance matrix is diagonal, as would be expected // if it is assumed there are no interactions between components. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A base component which will be cloned to all dimensions. Initializes a new instance of the class. The components. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. For an example of how to fit an independent joint distribution, please take a look at the examples section for this class. Joint distribution assuming independence between vector components. The type of the underlying distributions. The type for the observations being modeled by the distribution (i.e. double). In probability and statistics, given at least two random variables X, Y, ..., that are defined on a probability space, the joint probability distribution for X, Y, ... is a probability distribution that gives the probability that each of X, Y, ... falls in any particular range or discrete set of values specified for that variable. In the case of only two random variables, this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution. References: Wikipedia, The Free Encyclopedia. Joint probability distribution. Available from: http://en.wikipedia.org/wiki/Joint_probability_distribution The following example shows how to declare and initialize an Independent Joint Gaussian Distribution using known means and variances for each component. // Declare two normal distributions NormalDistribution pa = new NormalDistribution(4.2, 1); // p(a) NormalDistribution pb = new NormalDistribution(7.0, 2); // p(b) // Now, create a joint distribution combining these two: var joint = new Independent<NormalDistribution>(pa, pb); // This distribution assumes the distributions of the two components are independent, // i.e. if we have 2D input vectors of the form {a, b}, then p({a,b}) = p(a) * p(b). // Let's check a simple example. Consider a 2D input vector x = { 4.2, 7.0 } as // double[] x = new double[] { 4.2, 7.0 }; // Those two should be completely equivalent: double p1 = joint.ProbabilityDensityFunction(x); double p2 = pa.ProbabilityDensityFunction(x[0]) * pb.ProbabilityDensityFunction(x[1]); bool equal = p1 == p2; // at this point, equal should be true. The following example shows how to fit a distribution (estimate its parameters) from a given dataset. // Let's consider an input dataset of 2D vectors. We would // like to estimate an Independent<NormalDistribution> // which best models this data.
double[][] data = { // x, y new double[] { 1, 8 }, new double[] { 2, 6 }, new double[] { 5, 7 }, new double[] { 3, 9 }, }; // We start by declaring some initial guesses for the // distributions of each random variable (x, and y): // var distX = new NormalDistribution(0, 1); var distY = new NormalDistribution(0, 1); // Next, we declare our initial guess Independent distribution var joint = new Independent<NormalDistribution>(distX, distY); // We can now fit the distribution to our data, // producing an estimate of its parameters: // joint.Fit(data); // At this point, we have estimated our distribution. double[] mean = joint.Mean; // should be { 2.75, 7.50 } double[] var = joint.Variance; // should be { 2.917, 1.667 } // | 2.917, 0.000 | double[,] cov = joint.Covariance; // Cov = | | // | 0.000, 1.667 | // The covariance matrix is diagonal, as it would be expected // if is assumed there are no interactions between components. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A base component which will be cloned to all dimensions. Initializes a new instance of the class. The components. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). For an example on how to fit an independent joint distribution, please take a look at the examples section for . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. For an example on how to fit an independent joint distribution, please take a look at the examples section for . Fits the underlying distribution to a given set of observations. 
The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. For an example of how to fit an independent joint distribution, please take a look at the examples section for this class. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random observation from the current distribution. The location where to store the sample. A random observation drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Abstract class for multivariate discrete probability distributions. A probability distribution identifies either the probability of each value of an unidentified random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. The function describing the probability that a given discrete value will occur is called the probability function (or probability mass function, abbreviated PMF), and the function describing the cumulative probability that a given value or any value smaller than it will occur is called the distribution function (or cumulative distribution function, abbreviated CDF). References: Wikipedia, The Free Encyclopedia. Probability distribution. Available on: http://en.wikipedia.org/wiki/Probability_distribution Weisstein, Eric W. "Statistical Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalDistribution.html (A short sketch using one of the concrete implementations of this class is given a little further below.) Constructs a new MultivariateDiscreteDistribution class. Gets the number of variables for this distribution. Gets the mean for this distribution. An array of double-precision values containing the mean values for this distribution. Gets the variance for this distribution. An array of double-precision values containing the variance values for this distribution. Gets the variance-covariance matrix for this distribution. A multidimensional array of double-precision values containing the covariance values for this distribution. Gets the mode for this distribution. An array of double-precision values containing the mode values for this distribution. Gets the median for this distribution. An array of double-precision values containing the median values for this distribution. Gets the support interval for this distribution. A range containing the support interval for this distribution.
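Because this class is abstract, the short sketch below exercises the members listed above through MultinomialDistribution (documented later in this section), which derives from it. The calls mirror the multinomial example given further below; the only addition is the consistency check between the probability mass function and its logarithm.

using System;
using Accord.Statistics.Distributions.Multivariate;

// MultinomialDistribution derives from MultivariateDiscreteDistribution, so it
// exposes the members described above (Mean, Variance, Covariance, PMF, log-PMF, ...):
var dist = new MultinomialDistribution(5, new[] { 0.25, 0.75 });

double[] mean = dist.Mean;       // per-component mean values
double[,] cov = dist.Covariance; // variance-covariance matrix

// The probability mass function and its logarithm are consistent with each other:
double pmf = dist.ProbabilityMassFunction(new[] { 2, 3 });
double lpmf = dist.LogProbabilityMassFunction(new[] { 2, 3 });
bool consistent = Math.Abs(Math.Exp(lpmf) - pmf) < 1e-10; // expected to be true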
Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Not supported. Not supported. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the marginal distribution of a given variable. The variable index. Gets the marginal distribution of a given variable evaluated at a given value. The variable index. The variable value. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. 
The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. 
The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Joint distribution assuming independence between vector components. The type of the underlying distributions. In probability and statistics, given at least two random variables X, Y, ..., that are defined on a probability space, the joint probability distribution for X, Y, ... is a probability distribution that gives the probability that each of X, Y, ... falls in any particular range or discrete set of values specified for that variable. In the case of only two random variables, this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution. References: Wikipedia, The Free Encyclopedia. Beta distribution. Available from: http://en.wikipedia.org/wiki/Joint_probability_distribution The following example shows how to declare and initialize an Independent Joint Gaussian Distribution using known means and variances for each component. // Declare two normal distributions NormalDistribution pa = new NormalDistribution(4.2, 1); // p(a) NormalDistribution pb = new NormalDistribution(7.0, 2); // p(b) // Now, create a joint distribution combining these two: var joint = new Independent<NormalDistribution>(pa, pb); // This distribution assumes the distributions of the two components are independent, // i.e. if we have 2D input vectors on the form {a, b}, then p({a,b}) = p(a) * p(b). // Lets check a simple example. Consider a 2D input vector x = { 4.2, 7.0 } as // double[] x = new double[] { 4.2, 7.0 }; // Those two should be completely equivalent: double p1 = joint.ProbabilityDensityFunction(x); double p2 = pa.ProbabilityDensityFunction(x[0]) * pb.ProbabilityDensityFunction(x[1]); bool equal = p1 == p2; // at this point, equal should be true. The following example shows how to fit a distribution (estimate its parameters) from a given dataset. // Let's consider an input dataset of 2D vectors. We would // like to estimate an Independent<NormalDistribution> // which best models this data. double[][] data = { // x, y new double[] { 1, 8 }, new double[] { 2, 6 }, new double[] { 5, 7 }, new double[] { 3, 9 }, }; // We start by declaring some initial guesses for the // distributions of each random variable (x, and y): // var distX = new NormalDistribution(0, 1); var distY = new NormalDistribution(0, 1); // Next, we declare our initial guess Independent distribution var joint = new Independent<NormalDistribution>(distX, distY); // We can now fit the distribution to our data, // producing an estimate of its parameters: // joint.Fit(data); // At this point, we have estimated our distribution. 
double[] mean = joint.Mean; // should be { 2.75, 7.50 } double[] var = joint.Variance; // should be { 2.917, 1.667 } // | 2.917, 0.000 | double[,] cov = joint.Covariance; // Cov = | | // | 0.000, 1.667 | // The covariance matrix is diagonal, as would be expected // if it is assumed there are no interactions between components. Initializes a new instance of the class. The number of independent component distributions. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A function that creates a new distribution for each component index. Initializes a new instance of the class. The number of independent component distributions. A base component which will be cloned to all dimensions. Initializes a new instance of the class. The components. Gets or sets the components of this joint distribution. Gets the components of this joint distribution. Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. For an independent distribution, this matrix will always be diagonal. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact on performance. For an example of how to fit an independent joint distribution, please take a look at the examples section for this class. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against.
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. For an example on how to fit an independent joint distribution, please take a look at the examples section for . Resets cached values (should be called after re-estimation). Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Represents one component distribution in a mixture distribution. The distribution type. Gets the weight associated with this component. Gets the component distribution. Initializes a new instance of the struct. The mixture distribution. The component index. Mixture of multivariate probability distributions. A mixture density is a probability density function which is expressed as a convex combination (i.e. a weighted sum, with non-negative weights that sum to 1) of other probability density functions. The individual density functions that are combined to make the mixture density are called the mixture components, and the weights associated with each component are called the mixture weights. References: Wikipedia, The Free Encyclopedia. Mixture density. Available on: http://en.wikipedia.org/wiki/Mixture_density The type of the multivariate component distributions. // Randomly initialize some mixture components MultivariateNormalDistribution[] components = new MultivariateNormalDistribution[2]; components[0] = new MultivariateNormalDistribution(new double[] { 2 }, new double[,] { { 1 } }); components[1] = new MultivariateNormalDistribution(new double[] { 5 }, new double[,] { { 1 } }); // Create an initial mixture var mixture = new MultivariateMixture<MultivariateNormalDistribution>(components); // Now, suppose we have a weighted data // set. Those will be the input points: double[][] points = new double[] { 0, 3, 1, 7, 3, 5, 1, 2, -1, 2, 7, 6, 8, 6 } // (14 points) .ToArray(); // And those are their respective unnormalized weights: double[] weights = { 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 2, 3, 1, 1 }; // (14 weights) // Let's normalize the weights so they sum up to one: weights = weights.Divide(weights.Sum()); // Now we can fit our model to the data: mixture.Fit(points, weights); // done! // Our model will be: double mean1 = mixture.Components[0].Mean[0]; // 1.41126 double mean2 = mixture.Components[1].Mean[0]; // 6.53301 // With mixture coefficients double pi1 = mixture.Coefficients[0]; // 0.51408489193241225 double pi2 = mixture.Coefficients[1]; // 0.48591510806758775 // If we need the GaussianMixtureModel functionality, we can // use the estimated mixture to initialize a new model: GaussianMixtureModel gmm = new GaussianMixtureModel(mixture); mean1 = gmm.Gaussians[0].Mean[0]; // 1.41126 (same) mean2 = gmm.Gaussians[1].Mean[0]; // 6.53301 (same) p1 = gmm.Gaussians[0].Proportion; // 0.51408 (same) p2 = gmm.Gaussians[1].Proportion; // 0.48591 (same) Initializes a new instance of the class. The mixture distribution components. Initializes a new instance of the class. 
The mixture weight coefficients. The mixture distribution components. Gets the mixture components. Gets the weight coefficients. Gets the probability density function (pdf) for one of the component distributions evaluated at point x. The index of the desired component distribution. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for one of the component distributions evaluated at point x. The index of the desired component distribution. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for one of the component distributions evaluated at point x. The component distribution's index. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Computes the log-likelihood of the distribution for a given set of observations. Computes the log-likelihood of the distribution for a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets the mean for this distribution. Gets the variance-covariance matrix for this distribution. Gets the variance vector for this distribution. Estimates a new mixture model from a given set of observations. A set of observations. The initial components of the mixture model. 
Returns a new Mixture fitted to the given observations. Estimates a new mixture model from a given set of observations. A set of observations. The initial components of the mixture model. The initial mixture coefficients. Returns a new Mixture fitted to the given observations. Estimates a new mixture model from a given set of observations. A set of observations. The initial components of the mixture model. The initial mixture coefficients. The convergence threshold for the Expectation-Maximization estimation. Returns a new Mixture fitted to the given observations. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Returns a that represents this instance. A that represents this instance. Multinomial probability distribution. The multinomial distribution is a generalization of the binomial distribution. The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. In a multinomial distribution, the analog of the Bernoulli distribution is the categorical distribution, where each trial results in exactly one of some fixed finite number k of possible outcomes, with probabilities p1, ..., pk and there are n independent trials. References: Wikipedia, The Free Encyclopedia. Multinomial distribution. Available on: http://en.wikipedia.org/wiki/Multinomial_distribution // distribution parameters int numberOfTrials = 5; double[] probabilities = { 0.25, 0.75 }; // Create a new Multinomial distribution with 5 trials for 2 symbols var dist = new MultinomialDistribution(numberOfTrials, probabilities); int dimensions = dist.Dimension; // 2 double[] mean = dist.Mean; // { 1.25, 3.75 } double[] median = dist.Median; // { 1.25, 3.75 } double[] var = dist.Variance; // { -0.9375, -0.9375 } double pdf1 = dist.ProbabilityMassFunction(new[] { 2, 3 }); // 0.26367187499999994 double pdf2 = dist.ProbabilityMassFunction(new[] { 1, 4 }); // 0.3955078125 double pdf3 = dist.ProbabilityMassFunction(new[] { 5, 0 }); // 0.0009765625 double lpdf = dist.LogProbabilityMassFunction(new[] { 1, 4 }); // -0.9275847384929139 // output is "Multinomial(x; n = 5, p = { 0.25, 0.75 })" string str = dist.ToString(CultureInfo.InvariantCulture); Initializes a new instance of the class. The total number of trials N. A vector containing the probabilities of seeing each of possible outcomes. Gets the event probabilities associated with the trials. Gets the number of Bernoulli trials N. Gets the mean for this distribution. Gets the variance vector for this distribution. Gets the variance-covariance matrix for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Not supported. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. 
The Probability Mass Function (PMF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Dirichlet distribution. The Dirichlet distribution, often denoted Dir(α), is a family of continuous multivariate probability distributions parameterized by a vector α of positive real numbers. It is the multivariate generalization of the beta distribution. Dirichlet distributions are very often used as prior distributions in Bayesian statistics, and in fact the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution. That is, its probability density function returns the belief that the probabilities of K rival events are xi given that each event has been observed αi−1 times. References: Wikipedia, The Free Encyclopedia. Dirichlet distribution. Available from: http://en.wikipedia.org/wiki/Dirichlet_distribution // Create a Dirichlet with the following concentrations var dirich = new DirichletDistribution(0.42, 0.57, 1.2); // Common measures double[] mean = dirich.Mean; // { 0.19, 0.26, 0.54 } double[] median = dirich.Median; // { 0.19, 0.26, 0.54 } double[] var = dirich.Variance; // { 0.048, 0.060, 0.077 } double[,] cov = dirich.Covariance; // see below // 0.0115297440926238 0.0156475098399895 0.0329421259789253 // cov = 0.0156475098399895 0.0212359062114143 0.0447071709713986 // 0.0329421259789253 0.0447071709713986 0.0941203599397865 // (the above matrix representation has been transcribed to text using) string str = cov.ToString(DefaultMatrixFormatProvider.InvariantCulture); // Probability mass functions double pdf1 = dirich.ProbabilityDensityFunction(new double[] { 2, 5 }); // 0.12121671541846207 double pdf2 = dirich.ProbabilityDensityFunction(new double[] { 4, 2 }); // 0.12024840322466089 double pdf3 = dirich.ProbabilityDensityFunction(new double[] { 3, 7 }); // 0.082907634905068528 double lpdf = dirich.LogProbabilityDensityFunction(new double[] { 3, 7 }); // -2.4900281233124044 Creates a new symmetric Dirichlet distribution. The number k of categories. The common concentration parameter α (alpha). Creates a new Dirichlet distribution. The concentration parameters αi (alpha_i). Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Not supported. 
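Because the Dirichlet is the conjugate prior of the categorical and multinomial distributions, updating a Dirichlet prior with observed category counts simply adds the counts to the concentration parameters. The following sketch performs this update by hand; it relies only on the concentration-vector constructor and the Mean property documented above, and the variable names and comment values are illustrative.

// Prior belief about the probabilities of 3 rival events: a symmetric Dir(1, 1, 1)
double[] alpha = { 1, 1, 1 };

// Suppose the events were observed 5, 2 and 3 times, respectively:
double[] counts = { 5, 2, 3 };

// Conjugacy: the posterior is again a Dirichlet with concentrations α_i + n_i
double[] posteriorAlpha = new double[alpha.Length];
for (int i = 0; i < alpha.Length; i++)
    posteriorAlpha[i] = alpha[i] + counts[i];

// (assuming the constructor accepts a concentration vector, as in the example above)
var posterior = new DirichletDistribution(posteriorAlpha);

// The posterior mean is the updated estimate of each event's probability:
double[] estimate = posterior.Mean; // approximately { 0.46, 0.23, 0.31 }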
Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Hidden Markov Model probability distribution. Initializes a new instance of the class. The model. Gets the mean for this distribution. An array of double-precision values containing the mean values for this distribution. Gets the mean for this distribution. An array of double-precision values containing the variance values for this distribution. Gets the variance for this distribution. An multidimensional array of double-precision values containing the covariance values for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Abstract class for Multivariate Probability Distributions. 
A probability distribution identifies either the probability of each value of an unidentified random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. The function describing the probability that a given value will occur is called the probability function (or probability density function, abbreviated PDF), and the function describing the cumulative probability that a given value or any value smaller than it will occur is called the distribution function (or cumulative distribution function, abbreviated CDF). References: Wikipedia, The Free Encyclopedia. Probability distribution. Available on: http://en.wikipedia.org/wiki/Probability_distribution Weisstein, Eric W. "Statistical Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalDistribution.html Constructs a new MultivariateDistribution class. The number of dimensions in the distribution. Gets the number of variables for this distribution. Gets the mean for this distribution. A vector containing the mean values for the distribution. Gets the variance for this distribution. A vector containing the variance values for the distribution. Gets the variance-covariance matrix for this distribution. A matrix containing the covariance values for the distribution. Gets the mode for this distribution. A vector containing the mode values for the distribution. Gets the median for this distribution. A vector containing the median values for the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. 
A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. 
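The following sketch illustrates the sampling routines summarized in this section using a concrete distribution, in the same snippet style as the other examples. The overload that accepts an explicit random source is assumed to behave as documented (useful for reproducible runs), and the variable names are illustrative.

// Any concrete multivariate distribution exposes the Generate methods;
// here, a two-dimensional Gaussian is used as the source distribution:
var normal = new MultivariateNormalDistribution(
    mean: new double[] { 0, 0 },
    covariance: new double[,] { { 1.0, 0.3 }, { 0.3, 1.0 } });

// Draw 100 observations; each observation is a double[] with 2 elements:
double[][] samples = normal.Generate(100);

// Supplying an explicit source of randomness makes the draws reproducible:
double[][] reproducible = normal.Generate(100, new Random(0));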
Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the observations. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the current distribution. The location where to store the sample. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the observations. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Joint distribution of multiple discrete univariate distributions. This class builds a (potentially huge) lookup table for discrete symbol distributions. For example, given a discrete variable A which may take symbols a, b, c; and a discrete variable B which may assume values x, y, z, this class will build the probability table: x y z a p(a,x) p(a,y) p(a,z) b p(b,x) p(b,y) p(b,z) c p(c,x) p(c,y) p(c,z) Thus comprising the probabilities for all possible simple combination. This distribution is a generalization of the for multivariate discrete observations. The following example should demonstrate how to estimate a joint distribution of two discrete variables. The first variable can take up to three distinct values, whereas the second can assume up to five. 
// Let's create a joint distribution for two discrete variables: // the first of which can assume 3 distinct symbol values: 0, 1, 2 // the second of which can assume 5 distinct symbol values: 0, 1, 2, 3, 4 int[] symbols = { 3, 5 }; // specify the symbol counts // Create the joint distribution for the above variables JointDistribution joint = new JointDistribution(symbols); // Now, suppose we would like to fit the distribution (estimate // its parameters) from the following multivariate observations: // double[][] observations = { new double[] { 0, 0 }, new double[] { 1, 1 }, new double[] { 2, 1 }, new double[] { 0, 0 }, }; // Estimate parameters joint.Fit(observations); // At this point, we can query the distribution for observations: double p1 = joint.ProbabilityMassFunction(new[] { 0, 0 }); // should be 0.50 double p2 = joint.ProbabilityMassFunction(new[] { 1, 1 }); // should be 0.25 double p3 = joint.ProbabilityMassFunction(new[] { 2, 1 }); // should be 0.25 // As can be seen, {0,0} indeed appeared twice in the data, // while {1,1} and {2,1} each appeared in one fourth of the data. Gets the frequency of observation of each discrete variable. Gets the integer value where the discrete distribution starts. Gets the integer value where the discrete distribution ends. Gets the number of symbols in the distribution. Gets the number of symbols for each discrete variable. Gets the support interval for this distribution. A containing the support interval for this distribution. Constructs a new joint discrete distribution. Constructs a new joint discrete distribution. Constructs a new joint discrete distribution. Constructs a new joint discrete distribution. Constructs a new joint discrete distribution. Gets or sets the probability value attached to the given index. Constructs a new multidimensional uniform discrete distribution. The number of dimensions in the joint distribution. The integer value where the distribution starts, also known as a. Default value is 0. The integer value where the distribution ends, also known as b. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the mean for this distribution. An array of double-precision values containing the mean values for this distribution. Gets the variance for this distribution. An array of double-precision values containing the variance values for this distribution. Gets the variance-covariance matrix for this distribution. A multidimensional array of double-precision values containing the covariance values for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). 
The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. This method does not accept fitting options. weights;The weight vector should have the same size as the observations Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Estimates a new from a given set of observations. Please see . Multivariate Normal (Gaussian) distribution. The Gaussian is the most widely used distribution for continuous variables. In the case of many variables, it is governed by two parameters, the mean vector and the variance-covariance matrix. When a covariance matrix given to the class constructor is not positive definite, the distribution is degenerate and this may be an indication indication that it may be entirely contained in a r-dimensional subspace. Applying a rotation to an orthogonal basis to recover a non-degenerate r-dimensional distribution may help in this case. References: Ai Access. Glossary of Data Modeling. Positive definite matrix. Available on: http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_positive_definite_matrix.htm The following example shows how to create a Multivariate Gaussian distribution with known parameters mean vector and covariance matrix // Create a multivariate Gaussian distribution var dist = new MultivariateNormalDistribution( // mean vector mu mean: new double[] { 4, 2 }, // covariance matrix sigma covariance: new double[,] { { 0.3, 0.1 }, { 0.1, 0.7 } } ); // Common measures double[] mean = dist.Mean; // { 4, 2 } double[] median = dist.Median; // { 4, 2 } double[] var = dist.Variance; // { 0.3, 0.7 } (diagonal from cov) double[,] cov = dist.Covariance; // { { 0.3, 0.1 }, { 0.1, 0.7 } } // Probability mass functions double pdf1 = dist.ProbabilityDensityFunction(new double[] { 2, 5 }); // 0.000000018917884164743237 double pdf2 = dist.ProbabilityDensityFunction(new double[] { 4, 2 }); // 0.35588127170858852 double pdf3 = dist.ProbabilityDensityFunction(new double[] { 3, 7 }); // 0.000000000036520107734505265 double lpdf = dist.LogProbabilityDensityFunction(new double[] { 3, 7 }); // -24.033158110192296 // Cumulative distribution function (for up to two dimensions) double cdf = dist.DistributionFunction(new double[] { 3, 5 }); // 0.033944035782101548 // Generate samples from this distribution: double[][] sample = dist.Generate(1000000); The following example demonstrates how to fit a multivariate Gaussian to a set of observations. 
Since those observations would lead to numerical difficulties, the example also demonstrates how to specify a regularization constant to avoid getting a . double[][] observations = { new double[] { 1, 2 }, new double[] { 1, 2 }, new double[] { 1, 2 }, new double[] { 1, 2 } }; // Create a multivariate Gaussian for 2 dimensions var normal = new MultivariateNormalDistribution(2); // Specify a regularization constant in the fitting options NormalOptions options = new NormalOptions() { Regularization = double.Epsilon }; // Fit the distribution to the data normal.Fit(observations, options); // Check distribution parameters double[] mean = normal.Mean; // { 1, 2 } double[] var = normal.Variance; // { 4.9E-324, 4.9E-324 } (almost machine zero) The next example shows how to estimate a Gaussian distribution from data available inside a Microsoft Excel spreadsheet using the ExcelReader class. // Create a new Excel reader to read data from a spreadsheet ExcelReader reader = new ExcelReader(@"test.xls", hasHeaders: false); // Extract the "Data" worksheet from the xls DataTable table = reader.GetWorksheet("Data"); // Convert the data table to a jagged matrix double[][] observations = table.ToArray(); // Estimate a new Multivariate Normal Distribution from the observations var dist = MultivariateNormalDistribution.Estimate(observations, new NormalOptions() { Regularization = 1e-10 // this value will be added to the diagonal until it becomes positive-definite }); Constructs a multivariate Gaussian distribution with zero mean vector and identity covariance matrix. The number of dimensions in the distribution. Constructs a multivariate Gaussian distribution with given mean vector and covariance matrix. The mean vector μ (mu) for the distribution. The covariance matrix Σ (sigma) for the distribution. Constructs a multivariate Gaussian distribution with given mean vector and covariance matrix. The mean vector μ (mu) for the distribution. Constructs a multivariate Gaussian distribution with given mean vector and covariance matrix. The mean vector μ (mu) for the distribution. The covariance matrix Σ (sigma) for the distribution. Gets the Mean vector μ (mu) for the Gaussian distribution. Gets the Variance vector diag(Σ), the diagonal of the sigma matrix, for the Gaussian distribution. Gets the variance-covariance matrix Σ (sigma) for the Gaussian distribution. Computes the cumulative distribution function for distributions up to two dimensions. For more than two dimensions, this method is not supported. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. 
The logarithm of the probability of x occurring in the current distribution. Gets the Mahalanobis distance between a sample and this distribution. A point in the distribution space. The Mahalanobis distance between the point and this distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Please see . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Please see . Estimates a new Normal distribution from a given set of observations. Estimates a new Normal distribution from a given set of observations. Please see . Estimates a new Normal distribution from a given set of observations. Please see . Estimates a new Normal distribution from a given set of observations. Please see . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Converts this multivariate normal distribution into a joint distribution of independent normal distributions. A independent joint distribution of normal distributions. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Creates a new univariate Normal distribution. The mean value for the distribution. The standard deviation for the distribution. A object that actually represents a . Creates a new bivariate Normal distribution. The mean value for the first variate in the distribution. The mean value for the second variate in the distribution. The standard deviation for the first variate. The standard deviation for the second variate. The correlation coefficient between the two distributions. A bi-dimensional . Returns a that represents this instance. The format. The format provider. A that represents this instance. Generates a random vector of observations from a distribution with the given parameters. The number of samples to generate. The mean vector μ (mu) for the distribution. The covariance matrix Σ (sigma) for the distribution. A random vector of observations drawn from this distribution. Common interface for multivariate probability distributions. This interface is implemented by both multivariate Discrete Distributions and Continuous Distributions. For Univariate distributions, see . Gets the number of variables for the distribution. Gets the Mean vector for the distribution. An array of double-precision values containing the mean values for this distribution. Gets the Median vector for the distribution. An array of double-precision values containing the median values for this distribution. Gets the Mode vector for the distribution. An array of double-precision values containing the mode values for this distribution. 
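As a rough sketch of the helper constructors and the Mahalanobis distance documented above: the member names Univariate, Bivariate and Mahalanobis are inferred from the summaries rather than verified signatures, so treat this as an illustration of the intended usage, not a definitive API reference.

// Bivariate Gaussian built from two means, two standard deviations and
// the correlation coefficient, in the order listed above
// (method name assumed to be Bivariate):
var bi = MultivariateNormalDistribution.Bivariate(4, 2, 1.0, 0.5, 0.3);

// Mahalanobis distance between a point and the distribution
// (method name assumed to be Mahalanobis); it is zero at the mean:
double d = bi.Mahalanobis(new double[] { 4, 2 }); // 0

// One-dimensional Gaussian wrapped in the multivariate class
// (method name assumed to be Univariate):
var uni = MultivariateNormalDistribution.Univariate(0, 1);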
Gets the Variance vector for the distribution. An array of double-precision values containing the variance values for this distribution. Gets the Variance-Covariance matrix for the distribution. An multidimensional array of double-precision values containing the covariance values for this distribution. Common interface for multivariate probability distributions. This interface is implemented by both multivariate Discrete Distributions and Continuous Distributions. However, unlike , this interface has a generic parameter that allows to define the type of the distribution values (i.e. ). For Univariate distributions, see . Gets the number of variables for the distribution. Gets the Mean vector for the distribution. An array of double-precision values containing the mean values for this distribution. Gets the Median vector for the distribution. An array of double-precision values containing the median values for this distribution. Gets the Mode vector for the distribution. An array of double-precision values containing the mode values for this distribution. Gets the Variance vector for the distribution. An array of double-precision values containing the variance values for this distribution. Gets the Variance-Covariance matrix for the distribution. An multidimensional array of double-precision values containing the covariance values for this distribution. Metropolis-Hasting sampling algorithm. References: Wikipedia, The Free Encyclopedia. Metropolis-Hastings algorithm. Available on: https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm Darren Wilkinson, Metropolis Hastings MCMC when the proposal and target have differing support. Available on: https://darrenjw.wordpress.com/2012/06/04/metropolis-hastings-mcmc-when-the-proposal-and-target-have-differing-support/ Gets the target distribution whose samples must be generated. Initializes a new instance of the class. The target distribution whose samples should be generated. The proposal distribution that is used to generate new parameter samples to be explored. Metropolis-Hasting sampling algorithm. References: Wikipedia, The Free Encyclopedia. Metropolis-Hastings algorithm. Available on: https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm Darren Wilkinson, Metropolis Hastings MCMC when the proposal and target have differing support. Available on: https://darrenjw.wordpress.com/2012/06/04/metropolis-hastings-mcmc-when-the-proposal-and-target-have-differing-support/ Gets or sets the move proposal distribution. Initializes a new instance of the algorithm. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. The proposal distribution that is used to generate new parameter samples to be explored. Initializes a new instance of the class. Initializes the algorithm. Metropolis-Hasting sampling algorithm. References: Wikipedia, The Free Encyclopedia. Metropolis-Hastings algorithm. Available on: https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm Darren Wilkinson, Metropolis Hastings MCMC when the proposal and target have differing support. Available on: https://darrenjw.wordpress.com/2012/06/04/metropolis-hastings-mcmc-when-the-proposal-and-target-have-differing-support/ Gets the last successfully generated observation. Gets or sets a factory method to create random number generators used in this instance. Gets the log-probability of the last successfully generated sample. 
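To make the sampler summaries above more concrete, the sketch below draws observations from a two-dimensional target given only its (unnormalized) log-density. The factory name MetropolisHastings.Continuous, the Generate signature and the AcceptanceRate property are assumptions inferred from the summaries in this section, not verified signatures; the warm-up (burn-in) samples are discarded automatically according to the documentation.

// Log-density of the target, up to an additive constant -- MCMC does not
// require the normalizing term (here, a standard bivariate Gaussian):
Func<double[], double> logDensity = x => -0.5 * (x[0] * x[0] + x[1] * x[1]);

// Create a sampler whose proposals come from independent Normal
// distributions (factory name assumed):
var sampler = MetropolisHastings.Continuous(2, logDensity);

// Draw observations from the target; the initial warm-up samples
// are generated and discarded on the first call:
double[][] chain = sampler.Generate(5000);

// Fraction of proposed moves that were accepted (property name assumed):
double rate = sampler.AcceptanceRate;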
Gets the log-probability density function of the target distribution. Gets or sets the move proposal distribution. Gets the acceptance rate for the proposals generated by the proposal distribution. Gets the number of dimensions in each observation. Gets or sets how many initial samples will get discarded as part of the initial thermalization (warm-up, initialization) process. Initializes a new instance of the algorithm. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. The proposal distribution that is used to generate new parameter samples to be explored. Initializes a new instance of the class. Initializes the algorithm. Attempts to generate a new observation from the target distribution, storing its value in the property. A new observation, if the method has succeed; otherwise, null. True if the sample was successfully generated; otherwise, returns false. Attempts to generate a new observation from the target distribution, storing its value in the property. True if the sample was successfully generated; otherwise, false. Thermalizes the sample generation process, generating up to samples and discarding them. This step is done automatically upon the first call to any of the functions. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observation drawn from this distribution. Metropolis-Hasting sampling algorithm. References: Wikipedia, The Free Encyclopedia. Metropolis-Hastings algorithm. Available on: https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm Darren Wilkinson, Metropolis Hastings MCMC when the proposal and target have differing support. Available on: https://darrenjw.wordpress.com/2012/06/04/metropolis-hastings-mcmc-when-the-proposal-and-target-have-differing-support/ Initializes a new instance of the algorithm. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. The proposal distribution that is used to generate new parameter samples to be explored. Initializes a new instance of the algorithm. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. Creates a new sampler using independent Normal distributions as the parameter proposal generation priors. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. A sampling algorithm that can generate samples from the target distribution. Creates a new sampler using independent Normal distributions as the parameter proposal generation priors. The number of dimensions in each observation. The target distribution whose samples should be generated. A sampling algorithm that can generate samples from the target distribution. Creates a new sampler using symmetric geometric distributions as the parameter proposal generation priors. The number of dimensions in each observation. A function specifying the log probability density of the distribution to be sampled. 
A sampling algorithm that can generate samples from the target distribution. Creates a new sampler using symmetric geometric distributions as the parameter proposal generation priors. The number of dimensions in each observation. The target distribution whose samples should be generated. A sampling algorithm that can generate samples from the target distribution. Contains univariate distributions such as Normal, Cauchy, Hypergeometric, Poisson, Bernoulli, and specialized distributions such as the Kolmogorov-Smirnov, Nakagami, Weibull, and Von-Mises distributions. The namespace class diagram is shown below. Beta Distribution (of the first kind). The beta distribution is a family of continuous probability distributions defined on the interval (0, 1) parameterized by two positive shape parameters, typically denoted by α and β. The beta distribution can be suited to the statistical modeling of proportions in applications where values of proportions equal to 0 or 1 do not occur. One theoretical case where the beta distribution arises is as the distribution of the ratio formed by one random variable having a Gamma distribution divided by the sum of it and another independent random variable also having a Gamma distribution with the same scale parameter (but possibly different shape parameter). References: Wikipedia, The Free Encyclopedia. Beta distribution. Available from: http://en.wikipedia.org/wiki/Beta_distribution Note: More advanced examples, including distribution estimation and random number generation are also available at the page. The following example shows how to instantiate and use a Beta distribution given its alpha and beta parameters: double alpha = 0.42; double beta = 1.57; // Create a new Beta distribution with α = 0.42 and β = 1.57 BetaDistribution distribution = new BetaDistribution(alpha, beta); // Common measures double mean = distribution.Mean; // 0.21105527638190955 double median = distribution.Median; // 0.11577711097114812 double var = distribution.Variance; // 0.055689279830523512 // Cumulative distribution functions double cdf = distribution.DistributionFunction(x: 0.27); // 0.69358638272337991 double ccdf = distribution.ComplementaryDistributionFunction(x: 0.27); // 0.30641361727662009 double icdf = distribution.InverseDistributionFunction(p: cdf); // 0.26999999068687469 // Probability density functions double pdf = distribution.ProbabilityDensityFunction(x: 0.27); // 0.94644031936694828 double lpdf = distribution.LogProbabilityDensityFunction(x: 0.27); // -0.055047364344046057 // Hazard (failure rate) functions double hf = distribution.HazardFunction(x: 0.27); // 3.0887671630877072 double chf = distribution.CumulativeHazardFunction(x: 0.27); // 1.1828193992944409 // String representation string str = distribution.ToString(); // B(x; α = 0.42, β = 1.57) The following example shows to create a Beta distribution given a discrete number of trials and the number of successes within those trials. 
It also shows how to compute the 2.5 and 97.5 percentiles of the distribution: int trials = 100; int successes = 78; BetaDistribution distribution = new BetaDistribution(successes, trials); double mean = distribution.Mean; // 0.77450980392156865 double median = distribution.Median; // 0.77630912598534851 double p025 = distribution.InverseDistributionFunction(p: 0.025); // 0.68899653915764347 double p975 = distribution.InverseDistributionFunction(p: 0.975); // 0.84983461640764513 The next example shows how to generate 1000 new samples from a Beta distribution: // Using the distribution's parameters double[] samples = GeneralizedBetaDistribution .Random(alpha: 2, beta: 3, min: 0, max: 1, samples: 1000); // Using an existing distribution var b = new GeneralizedBetaDistribution(alpha: 1, beta: 2); double[] new_samples = b.Generate(1000); And finally, how to estimate the parameters of a Beta distribution from a set of observations, using either the Method-of-moments or the Maximum Likelihood Estimate. // Draw 100000 observations from a Beta with α = 2, β = 3: double[] samples = GeneralizedBetaDistribution .Random(alpha: 2, beta: 3, samples: 100000); // Estimate a distribution from the data var B = BetaDistribution.Estimate(samples); // Explicitly using Method-of-moments estimation var mm = BetaDistribution.Estimate(samples, new BetaOptions { Method = BetaEstimationMethod.Moments }); // Explicitly using Maximum Likelihood estimation var mle = BetaDistribution.Estimate(samples, new BetaOptions { Method = BetaEstimationMethod.MaximumLikelihood }); Creates a new Beta distribution. Creates a new Beta distribution. The number of success r. Default is 0. The number of trials n. Default is 1. Creates a new Beta distribution. The shape parameter α (alpha). The shape parameter β (beta). Gets the shape parameter α (alpha) Gets the shape parameter β (beta). Gets the number of successes r. Gets the number of trials n. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The distribution's mean value. The Beta's mean is computed as μ = a / (a + b). Gets the variance for this distribution. The distribution's variance. The Beta's variance is computed as σ² = (a * b) / ((a + b)² * (a + b + 1)). Gets the entropy for this distribution. The distribution's entropy. Gets the mode for this distribution. The beta distribution's mode is given by (a - 1) / (a + b - 2). The distribution's mode value. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The Beta's CDF is computed using the Incomplete (regularized) Beta function I_x(a,b) as CDF(x) = I_x(a,b) Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. 
The Beta's PDF is computed as pdf(x) = c * x^(a - 1) * (1 - x)^(b - 1), where the constant c is given by c = 1.0 / Beta.Function(a, b). Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Computes the Gradient of the Log-Likelihood function for estimating Beta distributions. The observed values. The current alpha value. The current beta value. A bi-dimensional value containing the gradient w.r.t. alpha in its first position, and the gradient w.r.t. beta in its second position. Computes the Gradient of the Log-Likelihood function for estimating Beta distributions. The sum of log(y), where y refers to all observed values. The sum of log(1 - y), where y refers to all observed values. The total number of observed values. The current alpha value. The current beta value. A bi-dimensional vector to store the gradient. A bi-dimensional vector containing the gradient w.r.t. alpha in its first position, and the gradient w.r.t. beta in its second position. Computes the Log-Likelihood function for estimating Beta distributions. The observed values. The current alpha value. The current beta value. The log-likelihood value for the given observations and given Beta parameters. Computes the Log-Likelihood function for estimating Beta distributions. The sum of log(y), where y refers to all observed values. The sum of log(1 - y), where y refers to all observed values. The total number of observed values. The current alpha value. The current beta value. The log-likelihood value for the given observations and given Beta parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. 
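The log-likelihood and gradient helpers documented above correspond to the standard Beta log-likelihood, LL(α, β) = (α - 1) Σ log(y_i) + (β - 1) Σ log(1 - y_i) - N log B(α, β). The sketch below evaluates this expression directly using the Beta.Function routine mentioned earlier; the static helper call shown in the last comment is an assumed name inferred from the summaries above, and the sample values are illustrative.

double[] y = { 0.1, 0.5, 0.3, 0.8, 0.6, 0.7, 0.2, 0.9 };
double alpha = 2, beta = 3;

// Accumulate the sufficient statistics Σ log(y) and Σ log(1 - y):
double sumLog = 0, sumLog1m = 0;
foreach (double v in y)
{
    sumLog += Math.Log(v);
    sumLog1m += Math.Log(1 - v);
}

// Log-likelihood of the sample under Beta(α, β):
double logB = Math.Log(Beta.Function(alpha, beta));
double ll = (alpha - 1) * sumLog + (beta - 1) * sumLog1m - y.Length * logB;

// The static helper documented above should produce the same value
// (method name assumed): BetaDistribution.LogLikelihood(y, alpha, beta);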
Returns a that represents this instance. A that represents this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The number of samples to generate. An array of double values sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Beta distribution. Generates a random observation from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). A random double value sampled from the specified Beta distribution. Generates a random observation from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Beta distribution. Estimates a new Beta distribution from a set of observations. Estimates a new Beta distribution from a set of weighted observations. Estimates a new Beta distribution from a set of weighted observations. Estimates a new Beta distribution from a set of observations. Beta prime distribution. In probability theory and statistics, the beta prime distribution (also known as inverted beta distribution or beta distribution of the second kind) is an absolutely continuous probability distribution defined for x > 0 with two parameters α and β, having the probability density function f(x) = x^(α-1) (1+x)^(-α-β) / B(α,β), where B is the Beta function. While the related beta distribution is the conjugate prior distribution of the parameter of a Bernoulli distribution expressed as a probability, the beta prime distribution is the conjugate prior distribution of the parameter of a Bernoulli distribution expressed in odds. The distribution is a Pearson type VI distribution. References: Wikipedia, The Free Encyclopedia. Beta Prime distribution. 
Available on: http://en.wikipedia.org/wiki/Beta_prime_distribution The following example shows how to create and test the main characteristics of an Beta prime distribution given its two non-negative shape parameters: // Create a new Beta-Prime distribution with shape (4,2) var betaPrime = new BetaPrimeDistribution(alpha: 4, beta: 2); double mean = betaPrime.Mean; // 4.0 double median = betaPrime.Median; // 2.1866398762435981 double mode = betaPrime.Mode; // 1.0 double var = betaPrime.Variance; // +inf double cdf = betaPrime.DistributionFunction(x: 0.4); // 0.02570357589099781 double pdf = betaPrime.ProbabilityDensityFunction(x: 0.4); // 0.16999719504628183 double lpdf = betaPrime.LogProbabilityDensityFunction(x: 0.4); // -1.7719733417957513 double ccdf = betaPrime.ComplementaryDistributionFunction(x: 0.4); // 0.97429642410900219 double icdf = betaPrime.InverseDistributionFunction(p: cdf); // 0.39999982363709291 double hf = betaPrime.HazardFunction(x: 0.4); // 0.17448200654307533 double chf = betaPrime.CumulativeHazardFunction(x: 0.4); // 0.026039684773113869 string str = betaPrime.ToString(CultureInfo.InvariantCulture); // BetaPrime(x; α = 4, β = 2) Constructs a new Beta-Prime distribution with the given two non-negative shape parameters a and b. The distribution's non-negative shape parameter a. The distribution's non-negative shape parameter b. Gets the distribution's non-negative shape parameter a. Gets the distribution's non-negative shape parameter b. Gets the mean for this distribution. The distribution's mean value. Gets the mode for this distribution. The distribution's mode value. Gets the variance for this distribution. The distribution's variance. Not supported. Gets the support interval for this distribution, which for the Beta- Prime distribution ranges from 0 to all positive numbers. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Cauchy-Lorentz distribution. The Cauchy distribution, named after Augustin Cauchy, is a continuous probability distribution. It is also known, especially among physicists, as the Lorentz distribution (after Hendrik Lorentz), Cauchy–Lorentz distribution, Lorentz(ian) function, or Breit–Wigner distribution. The simplest Cauchy distribution is called the standard Cauchy distribution. It has the distribution of a random variable that is the ratio of two independent standard normal random variables. References: Wikipedia, The Free Encyclopedia. Cauchy distribution. 
Available from: http://en.wikipedia.org/wiki/Cauchy_distribution The following example demonstrates how to instantiate a Cauchy distribution with a given location parameter x0 and scale parameter γ (gamma), calculating its main properties and characteristics: double location = 0.42; double scale = 1.57; // Create a new Cauchy distribution with x0 = 0.42 and γ = 1.57 CauchyDistribution cauchy = new CauchyDistribution(location, scale); // Common measures double mean = cauchy.Mean; // NaN - Cauchy's mean is undefined. double var = cauchy.Variance; // NaN - Cauchy's variance is undefined. double median = cauchy.Median; // 0.42 // Cumulative distribution functions double cdf = cauchy.DistributionFunction(x: 0.27); // 0.46968025841608563 double ccdf = cauchy.ComplementaryDistributionFunction(x: 0.27); // 0.53031974158391437 double icdf = cauchy.InverseDistributionFunction(p: 0.69358638272337991); // 1.5130304686978195 // Probability density functions double pdf = cauchy.ProbabilityDensityFunction(x: 0.27); // 0.2009112009763413 double lpdf = cauchy.LogProbabilityDensityFunction(x: 0.27); // -1.6048922547266871 // Hazard (failure rate) functions double hf = cauchy.HazardFunction(x: 0.27); // 0.3788491832800277 double chf = cauchy.CumulativeHazardFunction(x: 0.27); // 0.63427516833243092 // String representation string str = cauchy.ToString(CultureInfo.InvariantCulture); // "Cauchy(x; x0 = 0.42, γ = 1.57)" The following example shows how to fit a Cauchy distribution (estimate its location and scale parameters) given a set of observation values. // Create an initial distribution CauchyDistribution cauchy = new CauchyDistribution(); // Consider a vector of univariate observations double[] observations = { 0.25, 0.12, 0.72, 0.21, 0.62, 0.12, 0.62, 0.12 }; // Fit to the observations cauchy.Fit(observations); // Check estimated values double location = cauchy.Location; // 0.18383 double gamma = cauchy.Scale; // -0.10530 It is also possible to estimate only some of the Cauchy parameters at a time. For this, you can specify a CauchyOptions object and pass it alongside the observations: // Create options to estimate location only CauchyOptions options = new CauchyOptions() { EstimateLocation = true, EstimateScale = false }; // Create an initial distribution with a pre-defined scale CauchyDistribution cauchy = new CauchyDistribution(location: 0, scale: 4.2); // Fit to the observations cauchy.Fit(observations, options); // Check estimated values double location = cauchy.Location; // 0.3471218110202 double gamma = cauchy.Scale; // 4.2 (unchanged) Constructs a Cauchy-Lorentz distribution with location parameter 0 and scale 1. Constructs a Cauchy-Lorentz distribution with given location and scale parameters. The location parameter x0. The scale parameter gamma (γ). Gets the distribution's location parameter x0. Gets the distribution's scale parameter gamma. Gets the median for this distribution. The distribution's median value. The Cauchy's median is the location parameter x0. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mode for this distribution. The distribution's mode value. The Cauchy's mode is the location parameter x0. Cauchy's mean is undefined. Undefined. Cauchy's variance is undefined. Undefined. Gets the entropy for this distribution. The distribution's entropy. The Cauchy's entropy is defined as log(scale) + log(4*π). Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. 
A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The Cauchy's CDF is defined as CDF(x) = 1/π * atan2(x-location, scale) + 0.5. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. The Cauchy's PDF is defined as PDF(x) = c / (1.0 + ((x-location)/scale)²) where the constant c is given by c = 1.0 / (π * scale); Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact on performance. See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact on performance. See . Gets the Standard Cauchy Distribution, with zero location and unit scale. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random observation from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. A random double value sampled from the specified Cauchy distribution. Generates a random vector of observations from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. The number of samples to generate. An array of double values sampled from the specified Cauchy distribution. Generates a random vector of observations from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. The number of samples to generate. 
The location where to store the samples. An array of double values sampled from the specified Cauchy distribution. Generates a random observation from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Cauchy distribution. Generates a random vector of observations from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Cauchy distribution. Generates a random vector of observations from the Cauchy distribution with the given parameters. The location parameter x0. The scale parameter gamma. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Cauchy distribution. Returns a that represents this instance. A that represents this instance. Birnbaum-Saunders (Fatigue Life) distribution. The Birnbaum–Saunders distribution, also known as the fatigue life distribution, is a probability distribution used extensively in reliability applications to model failure times. There are several alternative formulations of this distribution in the literature. It is named after Z. W. Birnbaum and S. C. Saunders. References: Wikipedia, The Free Encyclopedia. Birnbaum–Saunders distribution. Available from: http://en.wikipedia.org/wiki/Birnbaum%E2%80%93Saunders_distribution NIST/SEMATECH e-Handbook of Statistical Methods, Birnbaum-Saunders (Fatigue Life) Distribution Available from: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366a.htm This example shows how to create a Birnbaum-Saunders distribution and compute some of its properties. // Creates a new Birnbaum-Saunders distribution var bs = new BirnbaumSaundersDistribution(shape: 0.42); double mean = bs.Mean; // 1.0882000000000001 double median = bs.Median; // 1.0 double var = bs.Variance; // 0.21529619999999997 double cdf = bs.DistributionFunction(x: 1.4); // 0.78956384911580346 double pdf = bs.ProbabilityDensityFunction(x: 1.4); // 1.3618433601225426 double lpdf = bs.LogProbabilityDensityFunction(x: 1.4); // 0.30883919386130815 double ccdf = bs.ComplementaryDistributionFunction(x: 1.4); // 0.21043615088419654 double icdf = bs.InverseDistributionFunction(p: cdf); // 2.0618330099769064 double hf = bs.HazardFunction(x: 1.4); // 6.4715276077824093 double chf = bs.CumulativeHazardFunction(x: 1.4); // 1.5585729930861034 string str = bs.ToString(CultureInfo.InvariantCulture); // BirnbaumSaunders(x; μ = 0, β = 1, γ = 0.42) Constructs a Birnbaum-Saunders distribution with location parameter 0, scale 1, and shape 1. Constructs a Birnbaum-Saunders distribution with location parameter 0, scale 1, and the given shape. The shape parameter gamma (γ). Default is 1. Constructs a Birnbaum-Saunders distribution with given location, shape and scale parameters. The location parameter μ. Default is 0. The scale parameter beta (β). Default is 1. The shape parameter gamma (γ). Default is 1. Gets the distribution's location parameter μ. Gets the distribution's scale parameter β. Gets the distribution's shape parameter γ. Gets the support interval for this distribution. 
A containing the support interval for this distribution. Gets the mean for this distribution. The Birnbaum-Saunders mean is defined as 1 + 0.5γ². The distribution's mean value. Gets the variance for this distribution. The Birnbaum-Saunders variance is defined as γ² (1 + (5/4)γ²). The distribution's variance. This method is not supported. This method is not supported. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Distribution types supported by the Anderson-Darling distribution. The statistic should reflect p-values for an Anderson-Darling comparison against a Uniform distribution. The statistic should reflect p-values for an Anderson-Darling comparison against a Normal distribution. Anderson-Darling (A²) distribution. // Create a new Anderson Darling distribution (A²) for comparing against a Gaussian var a2 = new AndersonDarlingDistribution(AndersonDarlingDistributionType.Normal, 30); double median = a2.Median; // 0.33089957635450062 double chf = a2.CumulativeHazardFunction(x: 0.27); // 0.42618068373640966 double cdf = a2.DistributionFunction(x: 0.27); // 0.34700165471995292 double ccdf = a2.ComplementaryDistributionFunction(x: 0.27); // 0.65299834528004708 double icdf = a2.InverseDistributionFunction(p: cdf); // 0.27000000012207787 string str = a2.ToString(CultureInfo.InvariantCulture); // "A²(x; n = 30)" Gets the type of the distribution that the Anderson-Darling is being performed against. Gets the number of samples distribution parameter. Creates a new Anderson-Darling distribution. The type of the compared distribution. The number of samples. Gets the support interval for this distribution. A containing the support interval for this distribution. Not supported. Not supported. Not supported. Not supported. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Not supported. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. 
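The example above uses the Normal comparison type; the Uniform type listed at the beginning of this section can be used in the same way. The snippet below is a minimal sketch rather than an excerpt from the library documentation: it assumes that the AndersonDarlingDistributionType enumeration exposes a Uniform member, as the type list above suggests, and that the constructor takes the comparison type followed by the number of samples, exactly as in the Normal example. The statistic value and the variable names are illustrative only. 
// Hedged sketch: A² distribution for a comparison against a Uniform distribution 
var a2u = new AndersonDarlingDistribution(AndersonDarlingDistributionType.Uniform, 100); 
// Suppose an A² statistic of 0.57 was observed for those 100 samples (illustrative value): 
double statistic = 0.57; 
// The right-tail p-value is the complement of the CDF, i.e. the survival function: 
double pValue = a2u.ComplementaryDistributionFunction(statistic); 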
Grubb's statistic distribution. Gets the number of samples for the distribution. Not supported. Not supported. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Initializes a new instance of the class. The number of samples. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. System.Double. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Kumaraswamy distribution. In probability and statistics, the Kumaraswamy's double bounded distribution is a family of continuous probability distributions defined on the interval [0,1] differing in the values of their two non-negative shape parameters, a and b. It is similar to the Beta distribution, but much simpler to use, especially in simulation studies, due to the simple closed form of both its probability density function and cumulative distribution function. This distribution was originally proposed by Poondi Kumaraswamy for variables that are lower and upper bounded. A good example of the use of the Kumaraswamy distribution is the storage volume of a reservoir of capacity zmax whose upper bound is zmax and lower bound is 0 (Fletcher and Ponnambalam, 1996). References: Wikipedia, The Free Encyclopedia. Kumaraswamy distribution. Available on: http://en.wikipedia.org/wiki/Kumaraswamy_distribution The following example shows how to create and test the main characteristics of a Kumaraswamy distribution given its two non-negative shape parameters: // Create a new Kumaraswamy distribution with shape (4,2) var kumaraswamy = new KumaraswamyDistribution(a: 4, b: 2); double mean = kumaraswamy.Mean; // 0.71111111111111114 double median = kumaraswamy.Median; // 0.73566031573423674 double mode = kumaraswamy.Mode; // 0.80910671157022118 double var = kumaraswamy.Variance; // 0.027654320987654302 double cdf = kumaraswamy.DistributionFunction(x: 0.4); // 0.050544639999999919 double pdf = kumaraswamy.ProbabilityDensityFunction(x: 0.4); // 0.49889280000000014 double lpdf = kumaraswamy.LogProbabilityDensityFunction(x: 0.4); // -0.69536403596913343 double ccdf = kumaraswamy.ComplementaryDistributionFunction(x: 0.4); // 0.94945536000000008 double icdf = kumaraswamy.InverseDistributionFunction(p: cdf); // 0.40000011480618253 double hf = kumaraswamy.HazardFunction(x: 0.4); // 0.52545155993431869 double chf = kumaraswamy.CumulativeHazardFunction(x: 0.4); // 0.051866764053008864 string str = kumaraswamy.ToString(CultureInfo.InvariantCulture); // Kumaraswamy(x; a = 4, b = 2) Constructs a new Kumaraswamy's double bounded distribution with the given two non-negative shape parameters a and b. The distribution's non-negative shape parameter a. 
The distribution's non-negative shape parameter b. Gets the distribution's non-negative shape parameter a. Gets the distribution's non-negative shape parameter b. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the median for this distribution. The distribution's median value. Gets the mode for this distribution. The distribution's mode value. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Rademacher distribution. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the median for this distribution. The distribution's median value. Gets the entropy for this distribution. The distribution's entropy. Returns NaN. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets P(X ≤ k), the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Returns a that represents this instance. The format. The format provider. A that represents this instance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Shapiro-Wilk distribution. The Shapiro-Wilk distribution models the distribution of Shapiro-Wilk's test statistic. References: Royston, P. "Algorithm AS 181: The W test for Normality", Applied Statistics (1982), Vol. 31, pp. 176–180. Royston, P. "Remark AS R94", Applied Statistics (1995), Vol. 44, No. 4, pp. 547-551. Available at http://lib.stat.cmu.edu/apstat/R94 Royston, P. "Approximating the Shapiro-Wilk W-test for non-normality", Statistics and Computing (1992), Vol. 2, pp. 117-119. Royston, P. "An Extension of Shapiro and Wilk's W Test for Normality to Large Samples", Journal of the Royal Statistical Society Series C (1982a), Vol. 31, No. 2, pp. 115-124. 
// Create a new Shapiro-Wilk's W for 5 samples var sw = new ShapiroWilkDistribution(samples: 5); double mean = sw.Mean; // 0.81248567196628929 double median = sw.Median; // 0.81248567196628929 double mode = sw.Mode; // 0.81248567196628929 double cdf = sw.DistributionFunction(x: 0.84); // 0.83507812080728383 double pdf = sw.ProbabilityDensityFunction(x: 0.84); // 0.82021062372326459 double lpdf = sw.LogProbabilityDensityFunction(x: 0.84); // -0.1981941135071546 double ccdf = sw.ComplementaryDistributionFunction(x: 0.84); // 0.16492187919271617 double icdf = sw.InverseDistributionFunction(p: cdf); // 0.84000000194587177 double hf = sw.HazardFunction(x: 0.84); // 4.9733281462602292 double chf = sw.CumulativeHazardFunction(x: 0.84); // 1.8022833766369502 string str = sw.ToString(CultureInfo.InvariantCulture); // W(x; n = 5) Gets the number of samples distribution parameter. Creates a new Shapiro-Wilk distribution. The number of samples. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The distribution's mean value. Gets the mode for this distribution. The distribution's mode value. Not supported. Gets the median for this distribution. The distribution's median value. Not supported. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Log-Logistic distribution. In probability and statistics, the log-logistic distribution (known as the Fisk distribution in economics) is a continuous probability distribution for a non-negative random variable. It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, for example, the mortality rate from cancer following diagnosis or treatment. It has also been used in hydrology to model stream flow and precipitation, and in economics as a simple model of the distribution of wealth or income. The log-logistic distribution is the probability distribution of a random variable whose logarithm has a logistic distribution. It is similar in shape to the log-normal distribution but has heavier tails. 
Its cumulative distribution function can be written in closed form, unlike that of the log-normal. References: Wikipedia, The Free Encyclopedia. Log-logistic distribution. Available on: http://en.wikipedia.org/wiki/Log-logistic_distribution This example shows how to create a Log-Logistic distribution and compute some of its properties and characteristics. // Create a LLD2 distribution with scale = 0.42, shape = 2.2 var log = new LogLogisticDistribution(scale: 0.42, shape: 2.2); double mean = log.Mean; // 0.60592605102976937 double median = log.Median; // 0.42 double mode = log.Mode; // 0.26892249963239817 double var = log.Variance; // 1.4357858982592435 double cdf = log.DistributionFunction(x: 1.4); // 0.93393329906725353 double pdf = log.ProbabilityDensityFunction(x: 1.4); // 0.096960115938100763 double lpdf = log.LogProbabilityDensityFunction(x: 1.4); // -2.3334555609306102 double ccdf = log.ComplementaryDistributionFunction(x: 1.4); // 0.066066700932746525 double icdf = log.InverseDistributionFunction(p: cdf); // 1.4000000000000006 double hf = log.HazardFunction(x: 1.4); // 1.4676094699628273 double chf = log.CumulativeHazardFunction(x: 1.4); // 2.7170904270953637 string str = log.ToString(CultureInfo.InvariantCulture); // LogLogistic(x; α = 0.42, β = 2.2) Constructs a Log-Logistic distribution with unit scale and unit shape. Constructs a Log-Logistic distribution with the given scale and unit shape. The distribution's scale value α (alpha). Default is 1. Constructs a Log-Logistic distribution with the given scale and shape parameters. The distribution's scale value α (alpha). Default is 1. The distribution's shape value β (beta). Default is 1. Gets the distribution's scale value (α). Gets the distribution's shape value (β). Gets the mean for this distribution. The distribution's mean value. Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. The distribution's mode value. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. 
The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Creates a new using the location-shape parametrization. In this parametrization, is taken as 1 / . The location parameter μ (mu) [taken as μ = α]. The distribution's shape value σ (sigma) [taken as σ = β]. A with α = μ and β = 1/σ. Inverse chi-Square (χ²) probability distribution. In probability and statistics, the inverse-chi-squared distribution (or inverted-chi-square distribution) is a continuous probability distribution of a positive-valued random variable. It is closely related to the chi-squared distribution and its specific importance is that it arises in the application of Bayesian inference to the normal distribution, where it can be used as the prior and posterior distribution for an unknown variance. The inverse-chi-squared distribution (or inverted-chi-square distribution) is the probability distribution of a random variable whose multiplicative inverse (reciprocal) has a chi-squared distribution. It is also often defined as the distribution of a random variable whose reciprocal divided by its degrees of freedom is a chi-squared distribution. That is, if X has the chi-squared distribution with v degrees of freedom, then according to the first definition, 1/X has the inverse-chi-squared distribution with v degrees of freedom; while according to the second definition, vX has the inverse-chi-squared distribution with v degrees of freedom. Only the first definition is covered by this class. References: Wikipedia, The Free Encyclopedia. Inverse-chi-square distribution. Available on: http://en.wikipedia.org/wiki/Inverse-chi-squared_distribution The following example demonstrates how to create a new inverse χ² distribution with the given degrees of freedom. // Create a new inverse χ² distribution with 7 d.f. var invchisq = new InverseChiSquareDistribution(degreesOfFreedom: 7); double mean = invchisq.Mean; // 0.2 double median = invchisq.Median; // 6.345811068141737 double var = invchisq.Variance; // 75 double cdf = invchisq.DistributionFunction(x: 6.27); // 0.50860033566176044 double pdf = invchisq.ProbabilityDensityFunction(x: 6.27); // 0.0000063457380298844403 double lpdf = invchisq.LogProbabilityDensityFunction(x: 6.27); // -11.967727146795536 double ccdf = invchisq.ComplementaryDistributionFunction(x: 6.27); // 0.49139966433823956 double icdf = invchisq.InverseDistributionFunction(p: cdf); // 6.2699998329362963 double hf = invchisq.HazardFunction(x: 6.27); // 0.000012913598625327002 double chf = invchisq.CumulativeHazardFunction(x: 6.27); // 0.71049750196765715 string str = invchisq.ToString(); // "Inv-χ²(x; df = 7)" Constructs a new Inverse Chi-Square distribution with the given degrees of freedom. The degrees of freedom for the distribution. Gets the Degrees of Freedom for this distribution. Gets the probability density function (pdf) for the inverse χ² distribution evaluated at point x. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the cumulative distribution function (cdf) for the inverse χ² distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. 
Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. Gets the variance for this distribution. Gets the mode for this distribution. The distribution's mode value. Gets the entropy for this distribution. This method is not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Hyperbolic Secant distribution. In probability theory and statistics, the hyperbolic secant distribution is a continuous probability distribution whose probability density function and characteristic function are proportional to the hyperbolic secant function. The hyperbolic secant function is the reciprocal of the hyperbolic cosine, which is why this distribution is also called the inverse-cosh distribution. References: Wikipedia, The Free Encyclopedia. Hyperbolic secant distribution. Available on: http://en.wikipedia.org/wiki/Sech_distribution This example shows how to create a Sech distribution, compute some of its properties and generate a number of random samples from it. // Create a new hyperbolic secant distribution var sech = new HyperbolicSecantDistribution(); double mean = sech.Mean; // 0.0 double median = sech.Median; // 0.0 double mode = sech.Mode; // 0.0 double var = sech.Variance; // 1.0 double cdf = sech.DistributionFunction(x: 1.4); // 0.92968538268895873 double pdf = sech.ProbabilityDensityFunction(x: 1.4); // 0.10955386512899701 double lpdf = sech.LogProbabilityDensityFunction(x: 1.4); // -2.2113389316917877 double ccdf = sech.ComplementaryDistributionFunction(x: 1.4); // 0.070314617311041272 double icdf = sech.InverseDistributionFunction(p: cdf); // 1.40 double hf = sech.HazardFunction(x: 1.4); // 1.5580524977385339 string str = sech.ToString(); // Sech(x) Constructs a Hyperbolic Secant (Sech) distribution. Gets the mean for this distribution (always zero). The distribution's mean value. Gets the median for this distribution (always zero). The distribution's median value. Gets the variance for this distribution (always one). The distribution's variance. Gets the Standard Deviation (the square root of the variance) for the current distribution. The distribution's standard deviation. Gets the mode for this distribution (always zero). The distribution's mode value. Gets the support interval for this distribution (-inf, +inf). A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Logistic distribution. In probability theory and statistics, the logistic distribution is a continuous probability distribution. Its cumulative distribution function is the logistic function, which appears in logistic regression and feedforward neural networks. It resembles the normal distribution in shape but has heavier tails (higher kurtosis). 
The Tukey lambda distribution can be considered a generalization of the logistic distribution since it adds a shape parameter, λ (the Tukey distribution becomes logistic when λ is zero). References: Wikipedia, The Free Encyclopedia. Logistic distribution. Available on: http://en.wikipedia.org/wiki/Logistic_distribution This example shows how to create a Logistic distribution, compute some of its properties and generate a number of random samples from it. // Create a logistic distribution with μ = 0.42 and scale = 1.2 var log = new LogisticDistribution(location: 0.42, scale: 1.2); double mean = log.Mean; // 0.42 double median = log.Median; // 0.42 double mode = log.Mode; // 0.42 double var = log.Variance; // 4.737410112522892 double cdf = log.DistributionFunction(x: 1.4); // 0.693528308197921 double pdf = log.ProbabilityDensityFunction(x: 1.4); // 0.17712232827170876 double lpdf = log.LogProbabilityDensityFunction(x: 1.4); // -1.7309146649427332 double ccdf = log.ComplementaryDistributionFunction(x: 1.4); // 0.306471691802079 double icdf = log.InverseDistributionFunction(p: cdf); // 1.3999999999999997 double hf = log.HazardFunction(x: 1.4); // 0.57794025683160088 double chf = log.CumulativeHazardFunction(x: 1.4); // 1.1826298874077226 string str = log.ToString(CultureInfo.InvariantCulture); // Logistic(x; μ = 0.42, scale = 1.2) Constructs a Logistic distribution with zero location and unit scale. Constructs a Logistic distribution with given location and unit scale. The distribution's location value μ (mu). Constructs a Logistic distribution with given location and scale parameters. The distribution's location value μ (mu). The distribution's scale value s. Gets the location value μ (mu). Gets the location value μ (mu). The distribution's mean value. Gets the distribution's scale value (s). Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. In the logistic distribution, the mode is equal to the distribution's location value μ. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. In the logistic distribution, the entropy is equal to ln(s) + 2. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. General continuous distribution. 
The general continuous distribution provides the automatic calculation for a variety of distribution functions and measures given only definitions for the Probability Density Function (PDF) or the Cumulative Distribution Function (CDF). Values such as the Expected value, Variance, Entropy and others are computed through numeric integration. // Let's suppose we have a formula that defines a probability distribution // but we don't know much else about it. We don't know the form of its cumulative // distribution function, for example. We would then like to know more about // it, such as the underlying distribution's moments, characteristics, and // properties. // Let's suppose the formula we have is this one: double mu = 5; double sigma = 4.2; Func<double, double> df = x => 1.0 / (sigma * Math.Sqrt(2 * Math.PI)) * Math.Exp(-Math.Pow(x - mu, 2) / (2 * sigma * sigma)); // And for the moment, let's also pretend we don't know it is actually the // p.d.f. of a Gaussian distribution with mean 5 and std. deviation of 4.2. // So, let's create a distribution based _solely_ on the formula we have: var distribution = GeneralContinuousDistribution.FromDensityFunction(df); // Now, we can check everything that we can know about it: double mean = distribution.Mean; // 5 (note that all of those have been double median = distribution.Median; // 5 detected automatically simply from double var = distribution.Variance; // 17.64 the given density formula through double mode = distribution.Mode; // 5 numerical methods) double cdf = distribution.DistributionFunction(x: 1.4); // 0.19568296915377595 double pdf = distribution.ProbabilityDensityFunction(x: 1.4); // 0.065784567984404935 double lpdf = distribution.LogProbabilityDensityFunction(x: 1.4); // -2.7213699972695058 double ccdf = distribution.ComplementaryDistributionFunction(x: 1.4); // 0.80431703084622408 double icdf = distribution.InverseDistributionFunction(p: cdf); // 1.3999999997024655 double hf = distribution.HazardFunction(x: 1.4); // 0.081789351041333558 double chf = distribution.CumulativeHazardFunction(x: 1.4); // 0.21776177055276186 Creates a new with the given PDF and CDF functions. The distribution's support over the real line. A probability density function. A cumulative distribution function. Creates a new with the given PDF and CDF functions. A distribution whose properties will be numerically estimated. Creates a new from an existing continuous distribution. The distribution. A representing the same but whose measures and functions are computed using numerical integration and differentiation. Creates a new using only a probability density function definition. A probability density function. A created from the whose measures and functions are computed using numerical integration and differentiation. Creates a new using only a probability density function definition. The distribution's support over the real line. A probability density function. A created from the whose measures and functions are computed using numerical integration and differentiation. Creates a new using only a cumulative distribution function definition. A cumulative distribution function. A created from the whose measures and functions are computed using numerical integration and differentiation. Creates a new using only a cumulative distribution function definition. The distribution's support over the real line. A cumulative distribution function. A created from the whose measures and functions are computed using numerical integration and differentiation. 
Creates a new using only a probability density function definition. The distribution's support over the real line. A probability density function. The integration method to use for numerical computations. A created from the whose measures and functions are computed using numerical integration and differentiation. Creates a new using only a cumulative distribution function definition. The distribution's support over the real line. A cumulative distribution function. The integration method to use for numerical computations. A created from the whose measures and functions are computed using numerical integration and differentiation. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the mode for this distribution. The distribution's mode value. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Lévy distribution. In probability theory and statistics, the Lévy distribution, named after Paul Lévy, is a continuous probability distribution for a non-negative random variable. In spectroscopy, this distribution, with frequency as the dependent variable, is known as a van der Waals profile. It is a special case of the inverse-gamma distribution. It is one of the few distributions that are stable and that have probability density functions that can be expressed analytically, the others being the normal distribution and the Cauchy distribution. All three are special cases of the stable distributions, which do not generally have a probability density function which can be expressed analytically. References: Wikipedia, The Free Encyclopedia. Lévy distribution. Available on: https://en.wikipedia.org/wiki/L%C3%A9vy_distribution This examples shows how to create a Lévy distribution and how to compute some of its measures and properties. // Create a new Lévy distribution on 1 with scale 4.2: var levy = new LevyDistribution(location: 1, scale: 4.2); double mean = levy.Mean; // +inf double median = levy.Median; // 10.232059220934481 double mode = levy.Mode; // NaN double var = levy.Variance; // +inf double cdf = levy.DistributionFunction(x: 1.4); // 0.0011937454448720029 double pdf = levy.ProbabilityDensityFunction(x: 1.4); // 0.016958939623898304 double lpdf = levy.LogProbabilityDensityFunction(x: 1.4); // -4.0769601727487803 double ccdf = levy.ComplementaryDistributionFunction(x: 1.4); // 0.99880625455512795 double icdf = levy.InverseDistributionFunction(p: cdf); // 1.3999999 double hf = levy.HazardFunction(x: 1.4); // 0.016979208476674869 double chf = levy.CumulativeHazardFunction(x: 1.4); // 0.0011944585265140923 string str = levy.ToString(CultureInfo.InvariantCulture); // Lévy(x; μ = 1, c = 4.2) Constructs a new with zero location and unit scale. Constructs a new in the given and with unit scale. The distribution's location. Constructs a new in the given and . 
The distribution's location. The distribution's scale. Gets the location μ (mu) for this distribution. Gets the scale c for this distribution. Gets the mean for this distribution, which for the Levy distribution is always positive infinity. This property always returns Double.PositiveInfinity. Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution, which for the Levy distribution is always positive infinity. This property always returns Double.PositiveInfinity. Gets the mode for this distribution. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Folded Normal (Gaussian) distribution. The folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable X with mean μ and variance σ², the random variable Y = |X| has a folded normal distribution. Such a case may be encountered if only the magnitude of some variable is recorded, but not its sign. The distribution is called Folded because probability mass to the left of x = 0 is "folded" over by taking the absolute value. The Half-Normal (Gaussian) distribution is a special case of this distribution and can be created using a named constructor. References: Wikipedia, The Free Encyclopedia. Folded Normal distribution. Available on: https://en.wikipedia.org/wiki/Folded_normal_distribution This example shows how to create a Folded Normal distribution and how to compute some of its properties and measures. 
// Creates a new Folded Normal distribution based on a Normal // distribution with mean value 4 and standard deviation 4.2: // var fn = new FoldedNormalDistribution(mean: 4, stdDev: 4.2); double mean = fn.Mean; // 4.765653108337438 double median = fn.Median; // 4.2593565881862734 double mode = fn.Mode; // 2.0806531871308014 double var = fn.Variance; // 10.928550450993715 double cdf = fn.DistributionFunction(x: 1.4); // 0.16867109769018807 double pdf = fn.ProbabilityDensityFunction(x: 1.4); // 0.11998602818182187 double lpdf = fn.LogProbabilityDensityFunction(x: 1.4); // -2.1203799747969523 double ccdf = fn.ComplementaryDistributionFunction(x: 1.4); // 0.83132890230981193 double icdf = fn.InverseDistributionFunction(p: cdf); // 1.4 double hf = fn.HazardFunction(x: 1.4); // 0.14433039420191671 double chf = fn.CumulativeHazardFunction(x: 1.4); // 0.18472977144474392 string str = fn.ToString(CultureInfo.InvariantCulture); // FN(x; μ = 4, σ² = 17.64) Creates a new with zero mean and unit standard deviation. Creates a new with the given and unit standard deviation. The mean of the original normal distribution that should be folded. Creates a new with the given and standard deviation. The mean of the original normal distribution that should be folded. The standard deviation of the original normal distribution that should be folded. Creates a new Half-normal distribution with the given standard deviation. The half-normal distribution is a special case of the when location is zero. The standard deviation of the original normal distribution that should be folded. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the support interval for this distribution. A containing the support interval for this distribution. This method is not supported. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Shifted Log-Logistic distribution. The shifted log-logistic distribution is a probability distribution also known as the generalized log-logistic or the three-parameter log-logistic distribution. It has also been called the generalized logistic distribution, but this conflicts with other uses of the term: see generalized logistic distribution. References: Wikipedia, The Free Encyclopedia. Shifted log-logistic distribution. Available on: http://en.wikipedia.org/wiki/Shifted_log-logistic_distribution This example shows how to create a Shifted Log-Logistic distribution, compute some of its properties and generate a number of random samples from it. 
// Create a LLD3 distribution with μ = 0.0, scale = 0.42, and shape = 0.1 var log = new ShiftedLogLogisticDistribution(location: 0, scale: 0.42, shape: 0.1); double mean = log.Mean; // 0.069891101544818923 double median = log.Median; // 0.0 double mode = log.Mode; // -0.083441677069328604 double var = log.Variance; // 0.62447259946747213 double cdf = log.DistributionFunction(x: 1.4); // 0.94668863559417671 double pdf = log.ProbabilityDensityFunction(x: 1.4); // 0.090123683626808615 double lpdf = log.LogProbabilityDensityFunction(x: 1.4); // -2.4065722895662613 double ccdf = log.ComplementaryDistributionFunction(x: 1.4); // 0.053311364405823292 double icdf = log.InverseDistributionFunction(p: cdf); // 1.4000000037735139 double hf = log.HazardFunction(x: 1.4); // 1.6905154207038875 double chf = log.CumulativeHazardFunction(x: 1.4); // 2.9316057546685061 string str = log.ToString(CultureInfo.InvariantCulture); // LLD3(x; μ = 0, σ = 0.42, ξ = 0.1) Constructs a Shifted Log-Logistic distribution with zero location, unit scale, and zero shape. Constructs a Shifted Log-Logistic distribution with the given location, unit scale and zero shape. The distribution's location value μ (mu). Constructs a Shifted Log-Logistic distribution with the given location and scale and zero shape. The distribution's location value μ (mu). The distribution's scale value σ (sigma). Constructs a Shifted Log-Logistic distribution with the given location, scale and shape. The distribution's location value μ (mu). The distribution's scale value σ (sigma). The distribution's shape value ξ (ksi). Gets the mean for this distribution. The distribution's mean value. Gets the distribution's location value μ (mu). Gets the distribution's scale value (σ). Gets the distribution's shape value (ξ). Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. The distribution's mode value. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Skew Normal distribution. In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness. References: Wikipedia, The Free Encyclopedia. Skew normal distribution. Available on: https://en.wikipedia.org/wiki/Skew_normal_distribution This example shows how to create a Skew normal distribution and compute some of its properties and derived measures. 
// Create a Skew normal distribution with location 2, scale 3 and shape 4.2 var skewNormal = new SkewNormalDistribution(location: 2, scale: 3, shape: 4.2); double mean = skewNormal.Mean; // 4.3285611780515953 double median = skewNormal.Median; // 4.0230040653062265 double var = skewNormal.Variance; // 3.5778028400709641 double mode = skewNormal.Mode; // 3.220622226764422 double cdf = skewNormal.DistributionFunction(x: 1.4); // 0.020166854942526125 double pdf = skewNormal.ProbabilityDensityFunction(x: 1.4); // 0.052257431834162059 double lpdf = skewNormal.LogProbabilityDensityFunction(x: 1.4); // -2.9515731621912877 double ccdf = skewNormal.ComplementaryDistributionFunction(x: 1.4); // 0.97983314505747388 double icdf = skewNormal.InverseDistributionFunction(p: cdf); // 1.3999998597203041 double hf = skewNormal.HazardFunction(x: 1.4); // 0.053332990517581239 double chf = skewNormal.CumulativeHazardFunction(x: 1.4); // 0.020372981958858238 string str = skewNormal.ToString(CultureInfo.InvariantCulture); // Sn(x; ξ = 2, ω = 3, α = 4.2) Constructs a Skew normal distribution with zero location, unit scale and zero shape. Constructs a Skew normal distribution with given location, unit scale and zero skewness. The distribution's location value ξ (ksi). Constructs a Skew normal distribution with given location and scale and zero skewness. The distribution's location value ξ (ksi). The distribution's scale value ω (omega). Constructs a Skew normal distribution with given location, scale and shape. The distribution's location value ξ (ksi). The distribution's scale value ω (omega). The distribution's shape value α (alpha). Gets the skew-normal distribution's location value ξ (ksi). Gets the skew-normal distribution's scale value ω (omega). Gets the skew-normal distribution's shape value α (alpha). Not supported. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the skewness for this distribution. Gets the excess kurtosis for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Create a new that corresponds to a with the given mean and standard deviation. The distribution's mean value μ (mu). The distribution's standard deviation σ (sigma). A representing a with the given parameters. Trapezoidal distribution. 
Trapezoidal distributions have been used in many areas and studied under varying scopes, such as the work of van Dorp and Kotz (2003), risk analysis (Pouliquen, 1970; Powell and Wilson, 1997), fuzzy set theory (Chen and Hwang, 1992), applied physics, and biomedical applications (Flehinger and Kimmel, 1987). Trapezoidal distributions are appropriate for modeling events that are composed of three different stages: a growth stage, where probability grows until a plateau is reached; a stability stage, where probability stays more or less the same; and a decline stage, where probability decreases to zero (van Dorp and Kotz, 2003). References: J. René van Dorp, Samuel Kotz, Trapezoidal distribution. Available on: http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf Powell MR, Wilson JD (1997). Risk Assessment for National Natural Resource Conservation Programs, Discussion Paper 97-49. Resources for the Future, Washington D.C. Chen SJ, Hwang CL (1992). Fuzzy Multiple Attribute Decision-Making: Methods and Applications, Springer-Verlag, Berlin, New York. Flehinger BJ, Kimmel M (1987). The natural history of lung cancer in periodically screened population. Biometrics 1987, 43, 127-144. The following example shows how to create and test the main characteristics of a Trapezoidal distribution given its parameters: // Create a new trapezoidal distribution with linear growth between // 0 and 2, stability between 2 and 8, and decrease between 8 and 10. // // // +-----------+ // /| |\ // / | | \ // / | | \ // -------+---+-----------+---+------- // ... 0 2 4 6 8 10 ... // var trapz = new TrapezoidalDistribution(a: 0, b: 2, c: 8, d: 10, n1: 1, n3: 1); double mean = trapz.Mean; // 2.25 double median = trapz.Median; // 3.0 double mode = trapz.Mode; // 3.1353457616424696 double var = trapz.Variance; // 17.986666666666665 double cdf = trapz.DistributionFunction(x: 1.4); // 0.13999999999999999 double pdf = trapz.ProbabilityDensityFunction(x: 1.4); // 0.10000000000000001 double lpdf = trapz.LogProbabilityDensityFunction(x: 1.4); // -2.3025850929940455 double ccdf = trapz.ComplementaryDistributionFunction(x: 1.4); // 0.85999999999999999 double icdf = trapz.InverseDistributionFunction(p: cdf); // 1.3999999999999997 double hf = trapz.HazardFunction(x: 1.4); // 0.11627906976744187 double chf = trapz.CumulativeHazardFunction(x: 1.4); // 0.15082288973458366 string str = trapz.ToString(CultureInfo.InvariantCulture); // Trapezoidal(x; a=0, b=2, c=8, d=10, n1=1, n3=1, α = 1) Creates a new trapezoidal distribution. The minimum value a. The beginning of the stability region b. The end of the stability region c. The maximum value d. Creates a new trapezoidal distribution. The minimum value a. The beginning of the stability region b. The end of the stability region c. The maximum value d. The growth slope between points a and b. Default is 2. The decay slope between points c and d. Default is 2. Creates a new trapezoidal distribution. The minimum value a. The beginning of the stability region b. The end of the stability region c. The maximum value d. The growth slope between points a and b. Default is 2. The decay slope between points c and d. Default is 2. The boundary ratio α. Default is 1. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. 
Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. The 4-parameter Beta distribution. The generalized beta distribution is a family of continuous probability distributions defined on any interval (min, max) parameterized by two positive shape parameters and two real location parameters, typically denoted by α, β, a and b. The beta distribution can be suited to the statistical modeling of proportions in applications where values of proportions equal to 0 or 1 do not occur. One theoretical case where the beta distribution arises is as the distribution of the ratio formed by one random variable having a Gamma distribution divided by the sum of it and another independent random variable also having a Gamma distribution with the same scale parameter (but possibly different shape parameter). References: Wikipedia, The Free Encyclopedia. Beta distribution. Available from: http://en.wikipedia.org/wiki/Beta_distribution Wikipedia, The Free Encyclopedia. Three-point estimation. Available from: https://en.wikipedia.org/wiki/Three-point_estimation Broadleaf Capital International Pty Ltd. Beta PERT origins. Available from: http://broadleaf.com.au/resource-material/beta-pert-origins/ Malcolm, D. G., Roseboom J. H., Clark C.E., and Fazar, W. Application of a technique of research and development program evaluation, Operations Research, 7, 646-669, 1959. Available from: http://mech.vub.ac.be/teaching/info/Ontwerpmethodologie/Appendix%20les%202%20PERT.pdf Clark, C. E. The PERT model for the distribution of an activity time, Operations Research, 10, 405-406, 1962. Available from: http://connection.ebscohost.com/c/articles/18246172/pert-model-distribution-activity-time Note: Simpler examples are also available at the page. The following example shows how to create a simpler 2-parameter Beta distribution and compute some of its properties and measures. The following example shows how to create a 4-parameter (Generalized) Beta distribution and compute some of its properties and measures. The following example shows how to create a 4-parameter Beta distribution with a three-point estimate using PERT. The following example shows how to create a 4-parameter Beta distribution with a three-point estimate using Vose's modification for PERT. The next example shows how to generate 1000 new samples from a Beta distribution: // Using the distribution's parameters double[] samples = GeneralizedBetaDistribution.Random(alpha: 2, beta: 3, min: 0, max: 1, samples: 1000); // Using an existing distribution var b = new GeneralizedBetaDistribution(alpha: 1, beta: 2); double[] new_samples = b.Generate(1000); And finally, how to estimate the parameters of a Beta distribution from a set of observations, using either the Method-of-moments or the Maximum Likelihood Estimate. 
// First we will be drawing 100000 observations from a 4-parameter // Beta distribution with α = 2, β = 3, min = 10 and max = 15: double[] samples = GeneralizedBetaDistribution.Random(alpha: 2, beta: 3, min: 10, max: 15, samples: 100000); // We can estimate a distribution with the known max and min var B = GeneralizedBetaDistribution.Estimate(samples, 10, 15); // We can explicitly ask for a Method-of-moments estimation var mm = GeneralizedBetaDistribution.Estimate(samples, 10, 15, new GeneralizedBetaOptions { Method = BetaEstimationMethod.Moments }); // or explicitly ask for the Maximum Likelihood estimation var mle = GeneralizedBetaDistribution.Estimate(samples, 10, 15, new GeneralizedBetaOptions { Method = BetaEstimationMethod.MaximumLikelihood }); Constructs a Beta distribution defined in the interval (0,1) with the given parameters α and β. The shape parameter α (alpha). The shape parameter β (beta). Constructs a Beta distribution defined in the interval (a, b) with parameters α, β, a and b. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. Constructs a BetaPERT distribution defined in the interval (a, b) using Vose's PERT estimation for the parameters a, b, mode and λ. The minimum possible value a. The maximum possible value b. The most likely value m. A Beta distribution initialized using Vose's PERT method. Constructs a BetaPERT distribution defined in the interval (a, b) using Vose's PERT estimation for the parameters a, b, mode and λ. The minimum possible value a. The maximum possible value b. The most likely value m. The scale parameter λ (lambda). Default is 4. A Beta distribution initialized using Vose's PERT method. Constructs a BetaPERT distribution defined in the interval (a, b) using the usual PERT estimation for the parameters a, b, mode and λ. The minimum possible value a. The maximum possible value b. The most likely value m. A Beta distribution initialized using the PERT method. Constructs a BetaPERT distribution defined in the interval (a, b) using the usual PERT estimation for the parameters a, b, mode and λ. The minimum possible value a. The maximum possible value b. The most likely value m. The scale parameter λ (lambda). Default is 4. A Beta distribution initialized using the PERT method. Constructs a BetaPERT distribution defined in the interval (a, b) using the Golenko-Ginzburg observation that the mode is often at 2/3 of the guessed interval. The minimum possible value a. The maximum possible value b. A Beta distribution initialized using Golenko-Ginzburg's method. Constructs a standard Beta distribution defined in the interval (0, 1) based on the number of successes and trials for an experiment. The number of successes r. Default is 0. The number of trials n. Default is 1. A standard Beta distribution initialized using the given parameters. Gets the minimum value A. Gets the maximum value B. Gets the shape parameter α (alpha). Gets the shape parameter β (beta). Gets the mean for this distribution, defined as (a + 4 * m + b) / 6. The distribution's mean value. Gets the variance for this distribution, defined as ((b - a) / (k + 2))². The distribution's variance. Gets the mode for this distribution. The beta distribution's mode is given by (α - 1) / (α + β - 2). The distribution's mode value. Gets the distribution support, defined as (, ). Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. 
A single point in the distribution range. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. The number of samples to generate. An array of double values sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. 
The maximum possible value b. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Beta distribution. Generates a random observation from a Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. A random double value sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Beta distribution. Generates a random vector of observations from the Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Beta distribution. Generates a random observation from a Beta distribution with the given parameters. The shape parameter α (alpha). The shape parameter β (beta). The minimum possible value a. The maximum possible value b. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Beta distribution. Estimates a new Beta distribution from a set of observations. Estimates a new Beta distribution from a set of weighted observations. Estimates a new Beta distribution from a set of weighted observations. Estimates a new Beta distribution from a set of observations. Triangular distribution. In probability theory and statistics, the triangular distribution is a continuous probability distribution with lower limit a, upper limit b and mode c, where a < b and a ≤ c ≤ b. References: Wikipedia, The Free Encyclopedia. Triangular distribution. Available on: https://en.wikipedia.org/wiki/Triangular_distribution This example shows how to create a Triangular distribution with minimum 1, maximum 6, and most common value 3. // Create a new Triangular distribution (1, 3, 6). var trig = new TriangularDistribution(a: 1, b: 6, c: 3); double mean = trig.Mean; // 3.3333333333333335 double median = trig.Median; // 3.2613872124741694 double mode = trig.Mode; // 3.0 double var = trig.Variance; // 1.0555555555555556 double cdf = trig.DistributionFunction(x: 2); // 0.10000000000000001 double pdf = trig.ProbabilityDensityFunction(x: 2); // 0.20000000000000001 double lpdf = trig.LogProbabilityDensityFunction(x: 2); // -1.6094379124341003 double ccdf = trig.ComplementaryDistributionFunction(x: 2); // 0.90000000000000002 double icdf = trig.InverseDistributionFunction(p: cdf); // 2.0000000655718773 double hf = trig.HazardFunction(x: 2); // 0.22222222222222224 double chf = trig.CumulativeHazardFunction(x: 2); // 0.10536051565782628 string str = trig.ToString(CultureInfo.InvariantCulture); // Triangular(x; a = 1, b = 6, c = 3) Constructs a Triangular distribution with the given parameters a, b and c. The minimum possible value in the distribution (a). The maximum possible value in the distribution (b). The most common value in the distribution (c). 
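The Generate and Fit members documented below can also be combined to re-estimate the parameters of a Triangular distribution from data. A minimal sketch, assuming the plain Generate(int) and Fit(double[]) overloads suggested by the surrounding entries (not spelled out in this extract): 
// Draw observations from the distribution created above, then fit a fresh instance to them 
double[] samples = trig.Generate(1000); 
var fitted = new TriangularDistribution(a: 0, b: 10, c: 5); // rough initial guess 
fitted.Fit(samples); // re-estimates a, b and c from the observations 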
Gets the triangular parameter A (the minimum value). Gets the triangular parameter B (the maximum value). Gets the mean for this distribution, defined as (a + b + c) / 3. The distribution's mean value. Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution, defined as (a² + b² + c² - ab - ac - bc) / 18. The distribution's variance. Gets the mode for this distribution, also known as the triangular's c. The distribution's mode value. Gets the distribution support, defined as (, ). Gets the entropy for this distribution, defined as 0.5 + log((max-min)/2)). The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Gets the minimum value in a set of weighted observations. Gets the maximum value in a set of weighted observations. Finds the index of the last largest value in a set of observations. Finds the index of the first smallest value in a set of observations. Finds the index of the first smallest value in a set of weighted observations. 
Finds the index of the last largest value in a set of weighted observations. Gumbel distribution (also known as the Extreme Value Type I distribution). In probability theory and statistics, the Gumbel distribution is used to model the distribution of the maximum (or the minimum) of a number of samples of various distributions. Such a distribution might be used to represent the distribution of the maximum level of a river in a particular year if there was a list of maximum values for the past ten years. It is useful in predicting the chance that an extreme earthquake, flood or other natural disaster will occur. The potential applicability of the Gumbel distribution to represent the distribution of maxima relates to extreme value theory, which indicates that it is likely to be useful if the distribution of the underlying sample data is of the normal or exponential type. The Gumbel distribution is a particular case of the generalized extreme value distribution (also known as the Fisher-Tippett distribution). It is also known as the log-Weibull distribution and the double exponential distribution (a term that is alternatively sometimes used to refer to the Laplace distribution). It is related to the Gompertz distribution: when its density is first reflected about the origin and then restricted to the positive half line, a Gompertz function is obtained. In the latent variable formulation of the multinomial logit model (common in discrete choice theory), the errors of the latent variables follow a Gumbel distribution. This is useful because the difference of two Gumbel-distributed random variables has a logistic distribution. The Gumbel distribution is named after Emil Julius Gumbel (1891–1966), based on his original papers describing the distribution. References: Wikipedia, The Free Encyclopedia. Gumbel distribution. Available on: http://en.wikipedia.org/wiki/Gumbel_distribution The following example shows how to create and test the main characteristics of a Gumbel distribution given its location and scale parameters: var gumbel = new GumbelDistribution(location: 4.795, scale: 1 / 0.392); double mean = gumbel.Mean; // 6.2674889410753387 double median = gumbel.Median; // 5.7299819402593481 double mode = gumbel.Mode; // 4.7949999999999999 double var = gumbel.Variance; // 10.704745853604138 double cdf = gumbel.DistributionFunction(x: 3.4); // 0.17767760424788051 double pdf = gumbel.ProbabilityDensityFunction(x: 3.4); // 0.12033954114322486 double lpdf = gumbel.LogProbabilityDensityFunction(x: 3.4); // -2.1174380222001519 double ccdf = gumbel.ComplementaryDistributionFunction(x: 3.4); // 0.82232239575211952 double icdf = gumbel.InverseDistributionFunction(p: cdf); // 3.3999999904866245 double hf = gumbel.HazardFunction(x: 1.4); // 0.03449691276402958 double chf = gumbel.CumulativeHazardFunction(x: 1.4); // 0.022988793482259906 string str = gumbel.ToString(CultureInfo.InvariantCulture); // Gumbel(x; μ = 4.795, β = 2.55) Creates a new Gumbel distribution with location zero and unit scale. Creates a new Gumbel distribution with the given location and scale. The location parameter μ (mu). Default is 0. The scale parameter β (beta). Default is 1. Gets the distribution's location parameter mu (μ). Gets the distribution's scale parameter beta (β). Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the support interval for this distribution. A containing the support interval for this distribution. 
Gets the median for this distribution. The distribution's median value. Gets the mode for this distribution. The distribution's mode value. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Tukey-Lambda distribution. Formalized by John Tukey, the Tukey lambda distribution is a continuous probability distribution defined in terms of its quantile function. It is typically used to identify an appropriate distribution and not used in statistical models directly. The Tukey lambda distribution has a single shape parameter λ. As with other probability distributions, the Tukey lambda distribution can be transformed with a location parameter, μ, and a scale parameter, σ. Since the general form of the probability distribution can be expressed in terms of the standard distribution, the subsequent formulas are given for the standard form of the function. References: Wikipedia, The Free Encyclopedia. Tukey-Lambda distribution. Available on: http://en.wikipedia.org/wiki/Tukey_lambda_distribution This example shows how to create a Tukey-Lambda distribution and compute some of its properties. var tukey = new TukeyLambdaDistribution(lambda: 0.14); double mean = tukey.Mean; // 0.0 double median = tukey.Median; // 0.0 double mode = tukey.Mode; // 0.0 double var = tukey.Variance; // 2.1102970222144855 double stdDev = tukey.StandardDeviation; // 1.4526861402982014 double cdf = tukey.DistributionFunction(x: 1.4); // 0.83252947230217966 double pdf = tukey.ProbabilityDensityFunction(x: 1.4); // 0.17181242109370659 double lpdf = tukey.LogProbabilityDensityFunction(x: 1.4); // -1.7613519723149427 double ccdf = tukey.ComplementaryDistributionFunction(x: 1.4); // 0.16747052769782034 double icdf = tukey.InverseDistributionFunction(p: cdf); // 1.4000000000000004 double hf = tukey.HazardFunction(x: 1.4); // 1.0219566231014163 double chf = tukey.CumulativeHazardFunction(x: 1.4); // 1.7842102556452939 string str = tukey.ToString(CultureInfo.InvariantCulture); // Tukey(x; λ = 0.14) Constructs a Tukey-Lambda distribution with the given lambda (shape) parameter. Gets the distribution shape parameter lambda (λ). Gets the mean for this distribution (always zero). The distribution's mean value. Gets the median for this distribution (always zero). The distribution's median value. Gets the mode for this distribution (always zero). The distribution's mode value. Gets the entropy for this distribution. The distribution's entropy. Gets the variance for this distribution. The distribution's variance. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Gets the log of the quantile density function, which in turn is the first derivative of the inverse distribution function (icdf), evaluated at probability p. 
A probability value between 0 and 1. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Power Lognormal distribution. References: NIST/SEMATECH e-Handbook of Statistical Methods. Power Lognormal distribution. Available on: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366e.htm This example shows how to create a Power Lognormal distribution and compute some of its properties. // Create a Power-Lognormal distribution with p = 4.2 and s = 1.2 var plog = new PowerLognormalDistribution(power: 4.2, shape: 1.2); double cdf = plog.DistributionFunction(x: 1.4); // 0.98092157745191766 double pdf = plog.ProbabilityDensityFunction(x: 1.4); // 0.046958580233533977 double lpdf = plog.LogProbabilityDensityFunction(x: 1.4); // -3.0584893374471496 double ccdf = plog.ComplementaryDistributionFunction(x: 1.4); // 0.019078422548082351 double icdf = plog.InverseDistributionFunction(p: cdf); // 1.4 double hf = plog.HazardFunction(x: 1.4); // 10.337649063164642 double chf = plog.CumulativeHazardFunction(x: 1.4); // 3.9591972920568446 string str = plog.ToString(CultureInfo.InvariantCulture); // PLD(x; p = 4.2, σ = 1.2) Constructs a Power Lognormal distribution with the given power and shape parameters. The distribution's power p. The distribution's shape σ. Gets the distribution's power parameter (p). Gets the distribution's shape parameter sigma (σ). Not supported. Not supported. Not supported. Not supported. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. 
Generalized Normal distribution (also known as Exponential Power distribution). The generalized normal distribution or generalized Gaussian distribution (GGD) is either of two families of parametric continuous probability distributions on the real line. Both families add a shape parameter to the normal distribution. To distinguish the two families, they are referred to below as "version 1" and "version 2". However, this is not a standard nomenclature. Known also as the exponential power distribution, or the generalized error distribution, this is a parametric family of symmetric distributions. It includes all normal and Laplace distributions, and as limiting cases it includes all continuous uniform distributions on bounded intervals of the real line. References: Wikipedia, The Free Encyclopedia. Generalized normal distribution. Available on: https://en.wikipedia.org/wiki/Generalized_normal_distribution This example shows how to create a Generalized normal distribution and compute some of its properties. // Creates a new generalized normal distribution with the given parameters var normal = new GeneralizedNormalDistribution(location: 1, scale: 5, shape: 0.42); double mean = normal.Mean; // 1 double median = normal.Median; // 1 double mode = normal.Mode; // 1 double var = normal.Variance; // 19200.781700666659 double cdf = normal.DistributionFunction(x: 1.4); // 0.51076148867681703 double pdf = normal.ProbabilityDensityFunction(x: 1.4); // 0.024215092283124507 double lpdf = normal.LogProbabilityDensityFunction(x: 1.4); // -3.7207791921441378 double ccdf = normal.ComplementaryDistributionFunction(x: 1.4); // 0.48923851132318297 double icdf = normal.InverseDistributionFunction(p: cdf); // 1.4000000149740108 double hf = normal.HazardFunction(x: 1.4); // 0.049495474543966168 double chf = normal.CumulativeHazardFunction(x: 1.4); // 0.7149051552030572 string str = normal.ToString(CultureInfo.InvariantCulture); // GGD(x; μ = 1, α = 5, β = 0.42) Constructs a Generalized Normal distribution with the given parameters. The location parameter μ. The scale parameter α. The shape parameter β. Creates a distribution using the Laplace specialization. The Laplace's location parameter μ (mu). The Laplace's scale parameter b. A that provides a . Creates a distribution using the Normal specialization. The Normal's mean parameter μ (mu). The Normal's standard deviation σ (sigma). A that provides a distribution. Gets the location value μ (mu) for the distribution. Gets the median for this distribution. The distribution's median value. Gets the mode for this distribution. In the Generalized Normal distribution, the mode is equal to the distribution's location value μ. The distribution's mode value. Gets the variance for this distribution. The distribution's variance. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. Gets the cumulative distribution function (cdf) for the Generalized Normal distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for the Generalized Normal distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. See . 
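The Laplace and Normal specializations described above can be exercised as in the hedged sketch below; the static factory names Normal and Laplace are assumed from the parameter lists quoted above and are not confirmed by this extract: 
// A Generalized Normal with shape β = 2 coincides with a Normal distribution, 
// and with shape β = 1 it coincides with a Laplace distribution: 
var asNormal = GeneralizedNormalDistribution.Normal(0, 1); // (mean, stdDev) 
var asLaplace = GeneralizedNormalDistribution.Laplace(0, 1); // (location, scale) 
double pNormal = asNormal.ProbabilityDensityFunction(0); // should match the standard Normal density at zero, ≈ 0.3989 
double pLaplace = asLaplace.ProbabilityDensityFunction(0); // should match the Laplace(0, 1) density at zero, 0.5 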
Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Power Normal distribution. References: NIST/SEMATECH e-Handbook of Statistical Methods. Power Normal distribution. Available on: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366d.htm This example shows how to create a Power Normal distribution and compute some of its properties. // Create a new Power-Normal distribution with p = 4.2 var pnormal = new PowerNormalDistribution(power: 4.2); double cdf = pnormal.DistributionFunction(x: 1.4); // 0.99997428721920678 double pdf = pnormal.ProbabilityDensityFunction(x: 1.4); // 0.00020022645890003279 double lpdf = pnormal.LogProbabilityDensityFunction(x: 1.4); // -0.20543269836728234 double ccdf = pnormal.ComplementaryDistributionFunction(x: 1.4); // 0.000025712780793218926 double icdf = pnormal.InverseDistributionFunction(p: cdf); // 1.3999999999998953 double hf = pnormal.HazardFunction(x: 1.4); // 7.7870402470368854 double chf = pnormal.CumulativeHazardFunction(x: 1.4); // 10.568522382550167 string str = pnormal.ToString(); // PND(x; p = 4.2) Constructs a Power Normal distribution with given power (shape) parameter. The distribution's power p. Gets the distribution shape (power) parameter. Not supported. Not supported. Not supported. Not supported. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. U-quadratic distribution. In probability theory and statistics, the U-quadratic distribution is a continuous probability distribution defined by a unique quadratic function with lower limit a and upper limit b. This distribution is a useful model for symmetric bimodal processes. Other continuous distributions allow more flexibility, in terms of relaxing the symmetry and the quadratic shape of the density function, which are enforced in the U-quadratic distribution - e.g., Beta distribution, Gamma distribution, etc. 
References: Wikipedia, The Free Encyclopedia. U-quadratic distribution. Available on: http://en.wikipedia.org/wiki/U-quadratic_distribution The following example shows how to create and test the main characteristics of an U-quadratic distribution given its two parameters: // Create a new U-quadratic distribution with values var u2 = new UQuadraticDistribution(a: 0.42, b: 4.2); double mean = u2.Mean; // 2.3100000000000001 double median = u2.Median; // 2.3100000000000001 double mode = u2.Mode; // 0.8099060089153145 double var = u2.Variance; // 2.1432600000000002 double cdf = u2.DistributionFunction(x: 1.4); // 0.44419041812731797 double pdf = u2.ProbabilityDensityFunction(x: 1.4); // 0.18398763254730335 double lpdf = u2.LogProbabilityDensityFunction(x: 1.4); // -1.6928867380489712 double ccdf = u2.ComplementaryDistributionFunction(x: 1.4); // 0.55580958187268203 double icdf = u2.InverseDistributionFunction(p: cdf); // 1.3999998213768274 double hf = u2.HazardFunction(x: 1.4); // 0.3310263776442936 double chf = u2.CumulativeHazardFunction(x: 1.4); // 0.58732952203701494 string str = u2.ToString(CultureInfo.InvariantCulture); // "UQuadratic(x; a = 0.42, b = 4.2)" Constructs a new U-quadratic distribution. Parameter a. Parameter b. Gets the mean for this distribution. The distribution's mean value. Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution. The distribution's variance. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Wrapped Cauchy Distribution. In probability theory and directional statistics, a wrapped Cauchy distribution is a wrapped probability distribution that results from the "wrapping" of the Cauchy distribution around the unit circle. The Cauchy distribution is sometimes known as a Lorentzian distribution, and the wrapped Cauchy distribution may sometimes be referred to as a wrapped Lorentzian distribution. The wrapped Cauchy distribution is often found in the field of spectroscopy where it is used to analyze diffraction patterns (e.g. see Fabry–Pérot interferometer). References: Wikipedia, The Free Encyclopedia. Directional statistics. Available on: http://en.wikipedia.org/wiki/Directional_statistics Wikipedia, The Free Encyclopedia. Wrapped Cauchy distribution. 
Available on: http://en.wikipedia.org/wiki/Wrapped_Cauchy_distribution // Create a Wrapped Cauchy distribution with μ = 0.42, γ = 3 var dist = new WrappedCauchyDistribution(mu: 0.42, gamma: 3); // Common measures double mean = dist.Mean; // 0.42 double var = dist.Variance; // 0.950212931632136 // Probability density functions double pdf = dist.ProbabilityDensityFunction(x: 0.42); // 0.1758330112785475 double lpdf = dist.LogProbabilityDensityFunction(x: 0.42); // -1.7382205338929015 // String representation string str = dist.ToString(); // "WrappedCauchy(x; μ = 0,42, γ = 3)" Initializes a new instance of the class. The mean resultant parameter μ. The gamma parameter γ. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Not supported. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Not supported. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Returns a that represents this instance. A that represents this instance. Inverse Gamma Distribution. The inverse gamma distribution is a two-parameter family of continuous probability distributions on the positive real line, which is the distribution of the reciprocal of a variable distributed according to the gamma distribution. Perhaps the chief use of the inverse gamma distribution is in Bayesian statistics, where it serves as the conjugate prior of the variance of a normal distribution. However, it is common among Bayesians to consider an alternative parameterization of the normal distribution in terms of the precision, defined as the reciprocal of the variance, which allows the gamma distribution to be used directly as a conjugate prior. References: Wikipedia, The Free Encyclopedia. Inverse Gamma Distribution. Available from: http://en.wikipedia.org/wiki/Inverse-gamma_distribution John D. Cook. (2008). The Inverse Gamma Distribution. 
// Create a new inverse Gamma distribution with α = 0.42 and β = 0.5 var invGamma = new InverseGammaDistribution(shape: 0.42, scale: 0.5); // Common measures double mean = invGamma.Mean; // -0.86206896551724133 double median = invGamma.Median; // 3.1072323347401709 double var = invGamma.Variance; // -0.47035626665061164 // Cumulative distribution functions double cdf = invGamma.DistributionFunction(x: 0.27); // 0.042243552114989695 double ccdf = invGamma.ComplementaryDistributionFunction(x: 0.27); // 0.95775644788501035 double icdf = invGamma.InverseDistributionFunction(p: cdf); // 0.26999994629410995 // Probability density functions double pdf = invGamma.ProbabilityDensityFunction(x: 0.27); // 0.35679850067181362 double lpdf = invGamma.LogProbabilityDensityFunction(x: 0.27); // -1.0305840804381006 // Hazard (failure rate) functions double hf = invGamma.HazardFunction(x: 0.27); // 0.3725357333377633 double chf = invGamma.CumulativeHazardFunction(x: 0.27); // 0.043161763098266373 // String representation string str = invGamma.ToString(); // Γ^(-1)(x; α = 0.42, β = 0.5) Creates a new Inverse Gamma Distribution. The shape parameter α (alpha). The scale parameter β (beta). Gets the mean for this distribution. The distribution's mean value. In the Inverse Gamma distribution, the Mean is given as b / (a - 1), for a > 1. Gets the mode for this distribution. The distribution's mode value. Gets the variance for this distribution. The distribution's variance. In the Inverse Gamma distribution, the Variance is given as b² / ((a - 1)² * (a - 2)), for a > 2. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. In the Inverse Gamma distribution, the CDF is computed in terms of the upper incomplete regularized Gamma function Q as CDF(x) = Q(a, b / x). See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Laplace's Distribution (also known as the double exponential distribution). In probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the double exponential distribution. The difference between two independent identically distributed exponential random variables is governed by a Laplace distribution, as is a Brownian motion evaluated at an exponentially distributed random time. Increments of Laplace motion or a variance gamma process evaluated over the time scale also have a Laplace distribution. 
The probability density function of the Laplace distribution is also reminiscent of the normal distribution; however, whereas the normal distribution is expressed in terms of the squared difference from the mean μ, the Laplace density is expressed in terms of the absolute difference from the mean. Consequently the Laplace distribution has fatter tails than the normal distribution. References: Wikipedia, The Free Encyclopedia. Laplace distribution. Available from: http://en.wikipedia.org/wiki/Laplace_distribution // Create a new Laplace distribution with μ = 4 and b = 2 var laplace = new LaplaceDistribution(location: 4, scale: 2); // Common measures double mean = laplace.Mean; // 4.0 double median = laplace.Median; // 4.0 double var = laplace.Variance; // 8.0 // Cumulative distribution functions double cdf = laplace.DistributionFunction(x: 0.27); // 0.077448104942453522 double ccdf = laplace.ComplementaryDistributionFunction(x: 0.27); // 0.92255189505754642 double icdf = laplace.InverseDistributionFunction(p: cdf); // 0.27 // Probability density functions double pdf = laplace.ProbabilityDensityFunction(x: 0.27); // 0.038724052471226761 double lpdf = laplace.LogProbabilityDensityFunction(x: 0.27); // -3.2512943611198906 // Hazard (failure rate) functions double hf = laplace.HazardFunction(x: 0.27); // 0.041974931360160776 double chf = laplace.CumulativeHazardFunction(x: 0.27); // 0.080611649844768624 // String representation string str = laplace.ToString(CultureInfo.InvariantCulture); // Laplace(x; μ = 4, b = 2) Creates a new Laplace distribution. The location parameter μ (mu). The scale parameter b. Gets the mean for this distribution. The distribution's mean value. The Laplace's distribution mean has the same value as the location parameter μ. Gets the mode for this distribution (μ). The Laplace's distribution mode has the same value as the location parameter μ. Gets the median for this distribution. The distribution's median value. The Laplace's distribution median has the same value as the location parameter μ. Gets the variance for this distribution. The distribution's variance. The Laplace's variance is computed as 2*b². Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. The Laplace's entropy is defined as ln(2*e*b), in which e is the Euler constant. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. 
Continuity correction to be used when approximating discrete values through a continuous distribution. No correction for continuity should be applied. The correction for continuity is -0.5 when the statistic is greater than the mean and +0.5 when it is less than the mean. This correction is used/described in http://vassarstats.net/textbook/ch12a.html. The correction for continuity will be -0.5 when computing values at the right (upper) tail of the distribution, and +0.5 when computing at the left (lower) tail. Mann-Whitney's U statistic distribution. This is the distribution for Mann-Whitney's U statistic used in . This distribution is based on sample statistics. This is the distribution for the first sample statistic, U1. Some textbooks (and statistical packages) use alternate definitions for U, which should be compared with the appropriate statistic tables or alternate distributions. // Consider the following rank statistics double[] ranks = { 1, 2, 3, 4, 5 }; // Create a new Mann-Whitney U distribution with n1 = 2 and n2 = 3 var mannWhitney = new MannWhitneyDistribution(ranks, n1: 2, n2: 3); // Common measures double mean = mannWhitney.Mean; // 2.7870954605658511 double median = mannWhitney.Median; // 1.5219615583481305 double var = mannWhitney.Variance; // 18.28163603621158 // Cumulative distribution functions double cdf = mannWhitney.DistributionFunction(x: 4); // 0.6 double ccdf = mannWhitney.ComplementaryDistributionFunction(x: 4); // 0.4 double icdf = mannWhitney.InverseDistributionFunction(p: cdf); // 3.6666666666666661 // Probability density functions double pdf = mannWhitney.ProbabilityDensityFunction(x: 4); // 0.2 double lpdf = mannWhitney.LogProbabilityDensityFunction(x: 4); // -1.6094379124341005 // Hazard (failure rate) functions double hf = mannWhitney.HazardFunction(x: 4); // 0.5 double chf = mannWhitney.CumulativeHazardFunction(x: 4); // 0.916290731874155 // String representation string str = mannWhitney.ToString(); // MannWhitney(u; n1 = 2, n2 = 3) Gets the number of observations in the first sample. Gets the number of observations in the second sample. Gets or sets the continuity correction to be applied when using the Normal approximation to this distribution. Gets whether this distribution computes the exact probabilities (by searching all possible rank combinations) or gives fast approximations. true if this distribution is exact; otherwise, false. Gets the statistic values for all possible combinations of ranks. This is used to compute the exact distribution. Constructs a Mann-Whitney's U-statistic distribution. The number of observations in the first sample. The number of observations in the second sample. Constructs a Mann-Whitney's U-statistic distribution. The rank statistics. The number of observations in the first sample. The number of observations in the second sample. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 30). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. Constructs a Mann-Whitney's U-statistic distribution. The global rank statistics for the first sample. The global rank statistics for the second sample. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 30). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. 
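The two-sample constructor described above can be called directly with the ranks of each group. A minimal sketch, in which the argument order is taken from the summary above and the exact parameter names are an assumption: 
// Global ranks of the two samples after ranking the pooled data 
double[] ranks1 = { 1, 3, 5 }; 
double[] ranks2 = { 2, 4 }; 
var mw = new MannWhitneyDistribution(ranks1, ranks2); // exact vs. approximate decided automatically 
double p = mw.DistributionFunction(x: 4); // cumulative probability of observing U1 <= 4 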
Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. System.Double. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the Mann-Whitney's U statistic for the first sample. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets the mean for this distribution. The distribution's mean value. The mean of Mann-Whitney's U distribution is defined as (n1 * n2) / 2. Gets the variance for this distribution. The distribution's variance. The variance of Mann-Whitney's U distribution is defined as (n1 * n2 * (n1 + n2 + 1)) / 12. This method is not supported. This method is not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the probability density function (pdf) for this distribution evaluated at point u. A single point in the distribution range. The probability of u occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value u will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of u occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value u will occur. See . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could have originated the given probability value when applied to the . Returns a that represents this instance. A that represents this instance. Noncentral t-distribution. As with other noncentrality parameters, the noncentral t-distribution generalizes a probability distribution – Student's t-distribution – using a noncentrality parameter. Whereas the central distribution describes how a test statistic is distributed when the difference tested is null, the noncentral distribution also describes how t is distributed when the null is false. This leads to its use in statistics, especially calculating statistical power. The noncentral t-distribution is also known as the singly noncentral t-distribution, and in addition to its primary use in statistical inference, is also used in robust modeling of data. References: Wikipedia, The Free Encyclopedia. Noncentral t-distribution.
Available on: http://en.wikipedia.org/wiki/Noncentral_t-distribution var distribution = new NoncentralTDistribution( degreesOfFreedom: 4, noncentrality: 2.42); double mean = distribution.Mean; // 3.0330202123035104 double median = distribution.Median; // 2.6034842414893795 double var = distribution.Variance; // 4.5135883917583683 double cdf = distribution.DistributionFunction(x: 1.4); // 0.15955740661144721 double pdf = distribution.ProbabilityDensityFunction(x: 1.4); // 0.23552141805184526 double lpdf = distribution.LogProbabilityDensityFunction(x: 1.4); // -1.4459534225195116 double ccdf = distribution.ComplementaryDistributionFunction(x: 1.4); // 0.84044259338855276 double icdf = distribution.InverseDistributionFunction(p: cdf); // 1.4000000000123853 double hf = distribution.HazardFunction(x: 1.4); // 0.28023498559521387 double chf = distribution.CumulativeHazardFunction(x: 1.4); // 0.17382662901507062 string str = distribution.ToString(CultureInfo.InvariantCulture); // T(x; df = 4, μ = 2.42) Gets the degrees of freedom (v) for the distribution. Gets the noncentrality parameter μ (mu) for the distribution. Initializes a new instance of the class. The degrees of freedom v. The noncentrality parameter μ (mu). Gets the mean for this distribution. The noncentral t-distribution's mean is defined in terms of the Gamma function Γ(x) as μ * sqrt(v/2) * Γ((v - 1) / 2) / Γ(v / 2) for v > 1. Gets the variance for this distribution. The noncentral t-distribution's variance is defined in terms of the Gamma function Γ(x) as a - b * c² in which a = v*(1+μ²) / (v-2), b = (μ² * v) / 2 and c = Γ((v - 1) / 2) / Γ(v / 2) for v > 2. Gets the mode for this distribution. The distribution's mode value. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Computes the cumulative probability at t of the non-central T-distribution with DF degrees of freedom and non-centrality parameter. This function is based on the original work done by Russell Lenth and John Burkardt, shared under the LGPL license. Original FORTRAN code can be found at: http://people.sc.fsu.edu/~jburkardt/f77_src/asa243/asa243.html Exponential distribution. In probability theory and statistics, the exponential distribution (a.k.a. negative exponential distribution) is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate. It is the continuous analogue of the geometric distribution.
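Since the mean summary above is stated in terms of the Gamma function, a quick numeric cross-check makes the formula concrete for the example values (v = 4, μ = 2.42). The sketch assumes Accord.Math.Gamma.Function evaluates Γ(x); treat it as an illustration of the formula rather than verified library usage.

// Sketch: mean of the noncentral t as μ * sqrt(v/2) * Γ((v-1)/2) / Γ(v/2).
// Accord.Math.Gamma.Function is assumed to compute Γ(x).
double v = 4, mu = 2.42;

double gammaRatio = Accord.Math.Gamma.Function((v - 1) / 2) / Accord.Math.Gamma.Function(v / 2);
double mean = mu * Math.Sqrt(v / 2) * gammaRatio;  // ≈ 3.0330, matching distribution.Mean above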
Note that the exponential distribution is not the same as the class of exponential families of distributions, which is a large class of probability distributions that includes the exponential distribution as one of its members, but also includes the normal distribution, binomial distribution, gamma distribution, Poisson, and many others. References: Wikipedia, The Free Encyclopedia. Exponential distribution. Available on: http://en.wikipedia.org/wiki/Exponential_distribution The following example shows how to create and test the main characteristics of an Exponential distribution given a lambda (λ) rate of 0.42: // Create an Exponential distribution with λ = 0.42 var exp = new ExponentialDistribution(rate: 0.42); // Common measures double mean = exp.Mean; // 2.3809523809523809 double median = exp.Median; // 1.6503504299046317 double var = exp.Variance; // 5.6689342403628125 // Cumulative distribution functions double cdf = exp.DistributionFunction(x: 0.27); // 0.10720652870550407 double ccdf = exp.ComplementaryDistributionFunction(x: 0.27); // 0.89279347129449593 double icdf = exp.InverseDistributionFunction(p: cdf); // 0.27 // Probability density functions double pdf = exp.ProbabilityDensityFunction(x: 0.27); // 0.3749732579436883 double lpdf = exp.LogProbabilityDensityFunction(x: 0.27); // -0.98090056770472311 // Hazard (failure rate) functions double hf = exp.HazardFunction(x: 0.27); // 0.42 double chf = exp.CumulativeHazardFunction(x: 0.27); // 0.1134 // String representation string str = exp.ToString(CultureInfo.InvariantCulture); // Exp(x; λ = 0.42) The following example shows how to generate random samples drawn from an Exponential distribution and later how to re-estimate a distribution from the generated samples. // Create an Exponential distribution with λ = 2.5 var exp = new ExponentialDistribution(rate: 2.5); // Generate a million samples from this distribution: double[] samples = exp.Generate(1000000); // Create a default exponential distribution var newExp = new ExponentialDistribution(); // Fit the samples newExp.Fit(samples); // Check the estimated parameters double rate = newExp.Rate; // 2.5 Creates a new Exponential distribution with the given rate. Creates a new Exponential distribution with the given rate. The rate parameter lambda (λ). Default is 1. Gets the distribution's rate parameter lambda (λ). Gets the mean for this distribution. The distribution's mean value. In the Exponential distribution, the mean is defined as 1/λ. Gets the variance for this distribution. The distribution's variance. In the Exponential distribution, the variance is defined as 1/(λ²). Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the median for this distribution. The distribution's median value. In the Exponential distribution, the median is defined as ln(2) / λ. Gets the mode for this distribution. The distribution's mode value. In the Exponential distribution, the mode is defined as 0. Gets the entropy for this distribution. The distribution's entropy. In the Exponential distribution, the entropy is defined as 1 - ln(λ). Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The Exponential CDF is defined as CDF(x) = 1 - exp(-λ*x).
Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. The Exponential PDF is defined as PDF(x) = λ * exp(-λ*x). Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. The Exponential ICDF is defined as ICDF(p) = -ln(1-p)/λ. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Please see . Estimates a new Exponential distribution from a given set of observations. Estimates a new Exponential distribution from a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the Exponential distribution with the given parameters. The rate parameter lambda. The number of samples to generate. An array of double values sampled from the specified Exponential distribution. Generates a random vector of observations from the Exponential distribution with the given parameters. The rate parameter lambda. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Exponential distribution. Generates a random vector of observations from the Exponential distribution with the given parameters. The rate parameter lambda. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Exponential distribution. Generates a random vector of observations from the Exponential distribution with the given parameters. The rate parameter lambda. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. 
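The quoted ICDF, ICDF(p) = -ln(1-p)/λ, is also the inverse-transform recipe for drawing samples, which is one way to see what a sampler for this distribution can do under the hood. The sketch below is illustrative only and reuses the ExponentialDistribution calls already shown above.

// Sketch: inverse-transform sampling for Exp(λ), checked against the library ICDF.
var rng = new Random();
double lambda = 0.42;
var exp = new ExponentialDistribution(rate: lambda);

double u = rng.NextDouble();                   // uniform draw in [0, 1)
double manual = -Math.Log(1 - u) / lambda;     // ICDF(u) = -ln(1 - u) / λ
double library = exp.InverseDistributionFunction(p: u);
// manual and library agree up to floating-point error.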
Default is to use . An array of double values sampled from the specified Exponential distribution. Generates a random observation from the Exponential distribution with the given parameters. The rate parameter lambda. A random double value sampled from the specified Exponential distribution. Generates a random observation from the Exponential distribution with the given parameters. The rate parameter lambda. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Exponential distribution. Returns a that represents this instance. A that represents this instance. Gamma distribution. The gamma distribution is a two-parameter family of continuous probability distributions. There are three different parameterizations in common use: with a shape parameter k and a scale parameter θ; with a shape parameter α = k and an inverse scale parameter β = 1/θ, called a rate parameter; and with a shape parameter k and a mean parameter μ = k/β. In each of these three forms, both parameters are positive real numbers. The parameterization with k and θ appears to be more common in econometrics and certain other applied fields, where e.g. the gamma distribution is frequently used to model waiting times. For instance, in life testing, the waiting time until death is a random variable that is frequently modeled with a gamma distribution. This is the default construction method for this class. The parameterization with α and β is more common in Bayesian statistics, where the gamma distribution is used as a conjugate prior distribution for various types of inverse scale (aka rate) parameters, such as the λ of an exponential distribution or a Poisson distribution – or for that matter, the β of the gamma distribution itself. (The closely related inverse gamma distribution is used as a conjugate prior for scale parameters, such as the variance of a normal distribution.) In order to create a Gamma distribution using the Bayesian parameterization, you can use . If k is an integer, then the distribution represents an Erlang distribution; i.e., the sum of k independent exponentially distributed random variables, each of which has a mean of θ (which is equivalent to a rate parameter of 1/θ). The gamma distribution is the maximum entropy probability distribution for a random variable X for which E[X] = kθ = α/β is fixed and greater than zero, and E[ln(X)] = ψ(k) + ln(θ) = ψ(α) − ln(β) is fixed (ψ is the digamma function). References: Wikipedia, The Free Encyclopedia. Gamma distribution. Available on: http://en.wikipedia.org/wiki/Gamma_distribution The following example shows how to create, test and compute the main functions of a Gamma distribution given parameters θ = 4 and k = 2 (see the sketch following the member summaries below): Constructs a Gamma distribution. Constructs a Gamma distribution. The scale parameter θ (theta). Default is 1. The shape parameter k. Default is 1. Constructs a Gamma distribution using α and β parameterization. The shape parameter α = k. The inverse scale parameter β = 1/θ. A Gamma distribution constructed with the given parameterization. Constructs a Gamma distribution using k and μ parameterization. The shape parameter α = k. The mean parameter μ = k/β. A Gamma distribution constructed with the given parameterization. Gets the distribution's scale parameter θ (theta). Gets the distribution's shape parameter k. Gets the inverse scale parameter β = 1/θ. Gets the mean for this distribution. The distribution's mean value. In the Gamma distribution, the mean is computed as k*θ.
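The example announced above (θ = 4, k = 2) does not appear in this extract, so the following sketch reconstructs one in the same style as the other distributions on this page. The parameter names theta and k follow the constructor summary above; only the commented moments are asserted, since they follow from the standard moments mean = k*θ = 8 and variance = k*θ² = 32.

// Reconstructed sketch of the promised Gamma example (θ = 4, k = 2).
var gamma = new GammaDistribution(theta: 4, k: 2);

double mean = gamma.Mean;         // k*θ  = 8.0
double variance = gamma.Variance; // k*θ² = 32.0

// Distribution functions at an arbitrary point
double cdf = gamma.DistributionFunction(x: 0.27);
double pdf = gamma.ProbabilityDensityFunction(x: 0.27);
double icdf = gamma.InverseDistributionFunction(p: cdf);   // ≈ 0.27

string str = gamma.ToString(CultureInfo.InvariantCulture);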
Gets the variance for this distribution. The distribution's variance. In the Gamma distribution, the variance is computed as k*θ². Gets the mode for this distribution. The distribution's mode value. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The Gamma's CDF is computed in terms of the Lower Incomplete Regularized Gamma Function P as CDF(x) = P(shape, x / scale). See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Estimates a new Gamma distribution from a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. 
Generates a random vector of observations from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). The number of samples to generate. An array of double values sampled from the specified Gamma distribution. Generates a random vector of observations from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Gamma distribution. Generates a random vector of observations from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Gamma distribution. Generates a random vector of observations from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Gamma distribution. Generates a random observation from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). A random double value sampled from the specified Gamma distribution. Generates a random observation from the Gamma distribution with the given parameters. The scale parameter theta (or inverse beta). The shape parameter k (or alpha). The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Gamma distribution. Random Gamma-distribution number generation based on Marsaglia's Simple Method (2000). Random Gamma-distribution number generation based on Marsaglia's Simple Method (2000). Returns a that represents this instance. A that represents this instance. Gets the standard Gamma distribution, with scale θ = 1 and shape k = 1. Kolmogorov-Smirnov distribution. This class is based on the excellent paper and original Java code by Simard and L'Ecuyer (2010). Includes additional modifications for increased performance and readability, shared under the LGPL under permission of original authors. L'Ecuyer and Simard partitioned the problem of evaluating the CDF using multiple approximation and asymptotic methods in order to achieve a best compromise between speed and precision. The distribution function of this class follows the same partitioning scheme as described by L'Ecuyer and Simard, which is described in the table below. For n <= 140 and: 1/n > x >= 1-1/n: uses the Ruben-Gambino formula. 1/n < nx² < 0.754693: uses the Durbin matrix algorithm. 0.754693 <= nx² < 4: uses the Pomeranz algorithm. 4 <= nx² < 18: uses the complementary distribution function. nx² >= 18: returns the constant 1. For 140 < n <= 10^5: nx² >= 18: returns the constant 1. nx^(3/2) < 1.4: uses the Durbin matrix algorithm. nx^(3/2) > 1.4: uses the Pelz-Good asymptotic series. For n > 10^5: nx² >= 18: returns the constant 1. nx² < 18: uses the Pelz-Good asymptotic series. References: R. Simard and P. L'Ecuyer. (2011). "Computing the Two-Sided Kolmogorov-Smirnov Distribution", Journal of Statistical Software, Vol. 39, Issue 11, Mar 2011.
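The two sampler entries above reference Marsaglia's simple method (2000). For readers unfamiliar with it, here is a compact, generic sketch of the Marsaglia-Tsang rejection scheme for shape k >= 1 with the scale applied at the end; it illustrates the referenced algorithm and is not the library's internal code.

// Generic sketch of Marsaglia-Tsang (2000) sampling for Gamma(k, θ), valid for k >= 1.
static double SampleGamma(Random rng, double k, double theta)
{
    double d = k - 1.0 / 3.0;
    double c = 1.0 / Math.Sqrt(9.0 * d);

    while (true)
    {
        // Standard normal draw via Box-Muller
        double u1 = 1.0 - rng.NextDouble();
        double u2 = rng.NextDouble();
        double x = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);

        double v = 1.0 + c * x;
        if (v <= 0) continue;
        v = v * v * v;

        double u = rng.NextDouble();
        if (Math.Log(u) < 0.5 * x * x + d - d * v + d * Math.Log(v))
            return d * v * theta;   // accept, then apply the scale θ
    }
}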
Available on: http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf Marsaglia, G., Tsang, W. W., Wang, J. (2003) "Evaluating Kolmogorov's Distribution", Journal of Statistical Software, 8 (18), 1–4. jstor. Available on: http://www.jstatsoft.org/v08/i18/paper Durbin, J. (1972). Distribution Theory for Tests Based on The Sample Distribution Function, Society for Industrial & Applied Mathematics, Philadelphia. The following example shows how to build a Kolmogorov-Smirnov distribution for 42 samples and compute its main functions and characteristics: // Create a Kolmogorov-Smirnov distribution with n = 42 var ks = new KolmogorovSmirnovDistribution(samples: 42); // Common measures double mean = ks.Mean; // 0.13404812830261556 double median = ks.Median; // 0.12393613519421857 double var = ks.Variance; // 0.019154717445778062 // Cumulative distribution functions double cdf = ks.DistributionFunction(x: 0.27); // 0.99659863602996079 double ccdf = ks.ComplementaryDistributionFunction(x: 0.27); // 0.0034013639700392062 double icdf = ks.InverseDistributionFunction(p: cdf); // 0.26999997446092017 // Hazard (failure rate) functions double chf = ks.CumulativeHazardFunction(x: 0.27); // 5.6835787601476619 // String representation string str = ks.ToString(); // "KS(x; n = 42)" Gets the number of samples distribution parameter. Creates a new Kolmogorov-Smirnov distribution. The number of samples. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The mean of the K-S distribution for n samples is computed as Mean = sqrt(π/2) * ln(2) / sqrt(n). See . Not supported. Gets the variance for this distribution. The variance of the K-S distribution for n samples is computed as Var = (π² / 12 - mean²) / n, in which mean is the K-S distribution . See . Gets the entropy for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Not supported. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. See . Computes the Upper Tail of the P[Dn >= x] distribution. This function approximates the upper tail of the P[Dn >= x] distribution using the one-sided Kolmogorov-Smirnov statistic. Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Computes the Cumulative Distribution Function (CDF) for the Kolmogorov-Smirnov statistic's distribution. The sample size. The Kolmogorov-Smirnov statistic. Returns the cumulative probability of the statistic under a sample size . This function computes the cumulative probability P[Dn <= x] of the Kolmogorov-Smirnov distribution using multiple methods as suggested by Richard Simard (2010). Simard partitioned the problem of evaluating the CDF using multiple approximation and asymptotic methods in order to achieve a best compromise between speed and precision. This function follows the same partitioning as Simard, which is described in the table below. 
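The mean and variance formulas quoted above can be checked directly against the example values for n = 42; the sketch below is plain arithmetic and relies on nothing beyond System.Math.

// Cross-check of the quoted Kolmogorov-Smirnov moment formulas for n = 42.
int n = 42;
double mean = Math.Sqrt(Math.PI / 2) * Math.Log(2) / Math.Sqrt(n);
// ≈ 0.134048, matching ks.Mean in the example above

double variance = (Math.PI * Math.PI / 12 - mean * mean) / n;
// ≈ 0.019155, matching ks.Variance in the example above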
For n <= 140 and: 1/n > x >= 1-1/n: uses the Ruben-Gambino formula. 1/n < nx² < 0.754693: uses the Durbin matrix algorithm. 0.754693 <= nx² < 4: uses the Pomeranz algorithm. 4 <= nx² < 18: uses the complementary distribution function. nx² >= 18: returns the constant 1. For 140 < n <= 10^5: nx² >= 18: returns the constant 1. nx^(3/2) < 1.4: uses the Durbin matrix algorithm. nx^(3/2) > 1.4: uses the Pelz-Good asymptotic series. For n > 10^5: nx² >= 18: returns the constant 1. nx² < 18: uses the Pelz-Good asymptotic series. Computes the Complementary Cumulative Distribution Function (1-CDF) for the Kolmogorov-Smirnov statistic's distribution. The sample size. The Kolmogorov-Smirnov statistic. Returns the complementary cumulative probability of the statistic under a sample size . Pelz-Good algorithm for computing lower-tail areas of the Kolmogorov-Smirnov distribution. As stated in Simard's paper, Pelz and Good (1976) generalized Kolmogorov's approximation to an asymptotic series in 1/sqrt(n). References: Wolfgang Pelz and I. J. Good, "Approximating the Lower Tail-Areas of the Kolmogorov-Smirnov One-Sample Statistic", Journal of the Royal Statistical Society, Series B. Vol. 38, No. 2 (1976), pp. 152-156 Computes the Upper Tail of the P[Dn >= x] distribution. This function approximates the upper tail of the P[Dn >= x] distribution using the one-sided Kolmogorov-Smirnov statistic. Pomeranz algorithm. Durbin's algorithm for computing P[Dn < d]. The method presented by Marsaglia (2003), as stated in the paper, is based on a succession of developments starting with Kolmogorov and culminating in a masterful treatment by Durbin (1972). Durbin's monograph summarized and extended many previous works published in the years 1933-73. This function implements the small C procedure provided by Marsaglia in his paper with corrections made by Simard (2010). Further optimizations have also been performed. References: - Marsaglia, G., Tsang, W. W., Wang, J. (2003) "Evaluating Kolmogorov's Distribution", Journal of Statistical Software, 8 (18), 1–4. jstor. Available on: http://www.jstatsoft.org/v08/i18/paper - Durbin, J. (1972) Distribution Theory for Tests Based on The Sample Distribution Function, Society for Industrial & Applied Mathematics, Philadelphia. Computes matrix power. Used in the Durbin algorithm. Initializes the Pomeranz algorithm. Creates matrix A of the Pomeranz algorithm. Computes matrix H of the Pomeranz algorithm. Bernoulli probability distribution. The Bernoulli distribution is a distribution for a single binary variable x ∈ {0,1}, representing, for example, the flipping of a coin. It is governed by a single continuous parameter representing the probability of an observation to be equal to 1. References: Wikipedia, The Free Encyclopedia. Bernoulli distribution. Available on: http://en.wikipedia.org/wiki/Bernoulli_distribution C. Bishop. “Pattern Recognition and Machine Learning”. Springer. 2006.
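The partition rules listed above decide which algorithm evaluates the CDF for a given pair (n, x). The helper below simply restates those rules as code to make them easier to scan; it is a reading aid rather than library code, and it reads the first row's condition as covering the two boundary regions x <= 1/n and x >= 1 - 1/n, which is an interpretation rather than something stated verbatim above.

// Reading aid: maps (n, x) to the algorithm named by the partition table above.
// The Ruben-Gambino boundary handling is an assumption; the library may differ.
static string KolmogorovSmirnovRegime(int n, double x)
{
    double nxx = n * x * x;
    if (nxx >= 18) return "Constant 1";

    if (n <= 140)
    {
        if (x <= 1.0 / n || x >= 1.0 - 1.0 / n) return "Ruben-Gambino formula";
        if (nxx < 0.754693) return "Durbin matrix algorithm";
        if (nxx < 4) return "Pomeranz algorithm";
        return "Complementary distribution function";  // 4 <= nx² < 18
    }

    if (n <= 100000)
        return n * Math.Pow(x, 1.5) < 1.4
            ? "Durbin matrix algorithm"
            : "Pelz-Good asymptotic series";

    return "Pelz-Good asymptotic series";              // n > 10^5, nx² < 18
}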
// Create a distribution with probability 0.42 var bern = new BernoulliDistribution(mean: 0.42); // Common measures double mean = bern.Mean; // 0.42 double median = bern.Median; // 0.0 double var = bern.Variance; // 0.2436 double mode = bern.Mode; // 0.0 // Probability mass functions double pdf = bern.ProbabilityMassFunction(k: 1); // 0.42 double lpdf = bern.LogProbabilityMassFunction(k: 0); // -0.54472717544167193 // Cumulative distribution functions double cdf = bern.DistributionFunction(k: 0); // 0.58 double ccdf = bern.ComplementaryDistributionFunction(k: 0); // 0.42 // Quantile functions int icdf0 = bern.InverseDistributionFunction(p: 0.57); // 0 int icdf1 = bern.InverseDistributionFunction(p: 0.59); // 1 // Hazard / failure rate functions double hf = bern.HazardFunction(x: 0); // 1.3809523809523814 double chf = bern.CumulativeHazardFunction(x: 0); // 0.86750056770472328 // String representation string str = bern.ToString(CultureInfo.InvariantCulture); // "Bernoulli(x; p = 0.42, q = 0.58)" Creates a new Bernoulli distribution. Creates a new Bernoulli distribution. The probability of an observation being equal to 1. Default is 0.5 Gets the mean for this distribution. Gets the median for this distribution. The distribution's median value. Gets the mode for this distribution. The distribution's mode value. Gets the variance for this distribution. Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets P(X > k) the complementary cumulative distribution function (ccdf) for this distribution evaluated at point k. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Returns a that represents this instance. A that represents this instance. Binomial probability distribution. The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when n = 1, the binomial distribution is a Bernoulli distribution. References: Wikipedia, The Free Encyclopedia. Binomial distribution. Available on: http://en.wikipedia.org/wiki/Binomial_distribution C. Bishop. “Pattern Recognition and Machine Learning”. Springer. 2006. // Creates a distribution with n = 16 and success probability 0.12 var bin = new BinomialDistribution(trials: 16, probability: 0.12); // Common measures double mean = bin.Mean; // 1.92 double median = bin.Median; // 2 double var = bin.Variance; // 1.6896 double mode = bin.Mode; // 2 // Probability mass functions double pdf = bin.ProbabilityMassFunction(k: 1); // 0.28218979948821621 double lpdf = bin.LogProbabilityMassFunction(k: 0); // -2.0453339441581582 // Cumulative distribution functions double cdf = bin.DistributionFunction(k: 0); // 0.12933699143209909 double ccdf = bin.ComplementaryDistributionFunction(k: 0); // 0.87066300856790091 // Quantile functions int icdf0 = bin.InverseDistributionFunction(p: 0.37); // 1 int icdf1 = bin.InverseDistributionFunction(p: 0.50); // 2 int icdf2 = bin.InverseDistributionFunction(p: 0.99); // 5 int icdf3 = bin.InverseDistributionFunction(p: 0.999); // 7 // Hazard (failure rate) functions double hf = bin.HazardFunction(x: 0); // ≈ 0.14855 (pmf(0) / ccdf(0)) double chf = bin.CumulativeHazardFunction(x: 0); // ≈ 0.13850 (-ln of ccdf(0)) // String representation string str = bin.ToString(CultureInfo.InvariantCulture); // "Binomial(x; n = 16, p = 0.12)" Gets the number of trials n for the distribution. Gets the success probability p for the distribution. Constructs a new binomial distribution. Constructs a new binomial distribution. The number of trials n. Constructs a new binomial distribution. The number of trials n. The success probability p in each trial. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. The distribution's mode value. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range.
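The definition above (the number of successes in n independent trials with success probability p) yields the familiar mass function P(X = k) = C(n, k) * p^k * (1-p)^(n-k). The sketch below evaluates it by hand for the example's n = 16, p = 0.12 and k = 1, using nothing beyond System.Math.

// Sketch: binomial pmf computed by hand, compared with the example value for k = 1.
static double Choose(int n, int k)
{
    double result = 1;
    for (int i = 1; i <= k; i++)
        result = result * (n - (k - i)) / i;   // running product keeps intermediate values small
    return result;
}

int trials = 16, k = 1;
double p = 0.12;

double pmf = Choose(trials, k) * Math.Pow(p, k) * Math.Pow(1 - p, trials - k);
// ≈ 0.28219, matching bin.ProbabilityMassFunction(k: 1) above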
The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Chi-Square (χ²) probability distribution In probability theory and statistics, the chi-square distribution (also chi-squared or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. It is one of the most widely used probability distributions in inferential statistics, e.g. in hypothesis testing, or in construction of confidence intervals. References: Wikipedia, The Free Encyclopedia. Chi-square distribution. 
Available on: http://en.wikipedia.org/wiki/Chi-square_distribution The following example demonstrates how to create a new χ² distribution with the given degrees of freedom. // Create a new χ² distribution with 7 d.f. var chisq = new ChiSquareDistribution(degreesOfFreedom: 7); // Common measures double mean = chisq.Mean; // 7 double median = chisq.Median; // 6.345811195595612 double var = chisq.Variance; // 14 // Cumulative distribution functions double cdf = chisq.DistributionFunction(x: 6.27); // 0.49139966433823956 double ccdf = chisq.ComplementaryDistributionFunction(x: 6.27); // 0.50860033566176044 double icdf = chisq.InverseDistributionFunction(p: cdf); // 6.2700000000852318 // Probability density functions double pdf = chisq.ProbabilityDensityFunction(x: 6.27); // 0.11388708001184455 double lpdf = chisq.LogProbabilityDensityFunction(x: 6.27); // -2.1725478476948092 // Hazard (failure rate) functions double hf = chisq.HazardFunction(x: 6.27); // 0.22392254197721179 double chf = chisq.CumulativeHazardFunction(x: 6.27); // 0.67609276602233315 // String representation string str = chisq.ToString(); // "χ²(x; df = 7)" Constructs a new Chi-Square distribution with given degrees of freedom. Constructs a new Chi-Square distribution with given degrees of freedom. The degrees of freedom for the distribution. Default is 1. Gets the Degrees of Freedom for this distribution. Gets the probability density function (pdf) for the χ² distribution evaluated at point x. The Probability Density Function (PDF) describes the probability that a given value x will occur. References: Wikipedia, the free encyclopedia. Chi-square distribution. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the cumulative distribution function (cdf) for the χ² distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The χ² distribution function is defined in terms of the Incomplete Gamma Function Γ(a, x) as CDF(x; df) = Γ(df/2, x/2). Gets the complementary cumulative distribution function (ccdf) for the χ² distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. The χ² complementary distribution function is defined in terms of the Complemented Incomplete Gamma Function Γc(a, x) as CCDF(x; df) = Γc(df/2, x/2). Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could have originated the given probability value when applied to the . Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The χ² distribution mean is the number of degrees of freedom. Gets the variance for this distribution. The χ² distribution variance is twice its degrees of freedom. Gets the mode for this distribution. The χ² distribution mode is max(degrees of freedom - 2, 0). The distribution's mode value.
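The CDF summary above reduces to the regularized lower incomplete gamma function: CDF(x; df) = P(df/2, x/2). The sketch below assumes Accord.Math.Gamma.LowerIncomplete computes that regularized P(a, x); if it does not, any implementation of P(a, x) can be substituted.

// Sketch: χ² CDF via the regularized lower incomplete gamma function P(a, x).
// Accord.Math.Gamma.LowerIncomplete(a, x) is assumed to compute P(a, x).
double df = 7, x = 6.27;

double viaGamma = Accord.Math.Gamma.LowerIncomplete(df / 2, x / 2);
double viaClass = new ChiSquareDistribution(degreesOfFreedom: 7)
    .DistributionFunction(x: 6.27);   // 0.49139966... in the example above
// Both values agree up to floating-point error.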
Gets the entropy for this distribution. This method is not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the Chi-Square distribution with the given parameters. An array of double values sampled from the specified Chi-Square distribution. Generates a random observation from the Chi-Square distribution with the given parameters. The degrees of freedom for the distribution. A random double value sampled from the specified Chi-Square distribution. Generates a random vector of observations from the Chi-Square distribution with the given parameters. An array of double values sampled from the specified Chi-Square distribution. Generates a random vector of observations from the Chi-Square distribution with the given parameters. An array of double values sampled from the specified Chi-Square distribution. Generates a random observation from the Chi-Square distribution with the given parameters. The degrees of freedom for the distribution. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Chi-Square distribution. Returns a that represents this instance. A that represents this instance. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. The degrees of freedom of the Chi-Square distribution. A sample which could original the given probability value when applied in the . Abstract class for univariate discrete probability distributions. A probability distribution identifies either the probability of each value of an unidentified random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. The function describing the probability that a given discrete value will occur is called the probability function (or probability mass function, abbreviated PMF), and the function describing the cumulative probability that a given value or any value smaller than it will occur is called the distribution function (or cumulative distribution function, abbreviated CDF). References: Wikipedia, The Free Encyclopedia. Probability distribution. Available on: http://en.wikipedia.org/wiki/Probability_distribution Weisstein, Eric W. "Statistical Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalDistribution.html Constructs a new UnivariateDistribution class. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. 
Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mode for this distribution. The distribution's mode value. Gets the median for this distribution. The distribution's median value. Gets the Standard Deviation (the square root of the variance) for the current distribution. The distribution's standard deviation. Gets the Quartiles for this distribution. A object containing the first quartile (Q1) as its minimum value, and the third quartile (Q3) as the maximum. Gets the distribution range within a given percentile. If 0.25 is passed as the argument, this function returns the same as the function. The percentile at which the distribution ranges will be returned. A object containing the minimum and maximum values covering the given percentile range of the distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the distribution range within a given percentile. If 0.25 is passed as the argument, this function returns the same as the function. The percentile at which the distribution ranges will be returned. A object containing the minimum and maximum values covering the given percentile range of the distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets P(X ≤ k), the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets P(X ≤ k), the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets P(X ≤ k) or P(X < k), the cumulative distribution function (cdf) for this distribution evaluated at point k, depending on the value of the parameter. A single point in the distribution range. True to return P(X ≤ x), false to return P(X < x). Default is true. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. // Compute P(X = k) double equal = dist.ProbabilityMassFunction(k: 1); // Compute P(X < k) double less = dist.DistributionFunction(k: 1, inclusive: false); // Compute P(X ≤ k) double lessThanOrEqual = dist.DistributionFunction(k: 1, inclusive: true); // Compute P(X > k) double greater = dist.ComplementaryDistributionFunction(k: 1); // Compute P(X ≥ k) double greaterThanOrEqual = dist.ComplementaryDistributionFunction(k: 1, inclusive: true); Gets the cumulative distribution function (cdf) for this distribution in the semi-closed interval (a; b] given as P(a < X ≤ b). The start of the semi-closed interval (a; b]. The end of the semi-closed interval (a; b].
The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p using a numerical approximation based on binary search. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Computes the cumulative distribution function by summing the outputs of the for all elements in the distribution domain. Note that this method should not be used in case there is a more efficient formula for computing the CDF of a distribution. A single point in the distribution range. Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point k. This function is also known as the Survival function. A single point in the distribution range. True to return P(X >= x), false to return P(X > x). Default is false. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. // Compute P(X = k) double equal = dist.ProbabilityMassFunction(k: 1); // Compute P(X < k) double less = dist.DistributionFunction(k: 1, inclusive: false); // Compute P(X ≤ k) double lessThanOrEqual = dist.DistributionFunction(k: 1, inclusive: true); // Compute P(X > k) double greater = dist.ComplementaryDistributionFunction(k: 1); // Compute P(X ≥ k) double greaterThanOrEqual = dist.ComplementaryDistributionFunction(k: 1, inclusive: true); Gets P(X > k) the complementary cumulative distribution function (ccdf) for this distribution evaluated at point k. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets P(X > k) the complementary cumulative distribution function (ccdf) for this distribution evaluated at point k. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. 
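Two of the base-class helpers above are easy to picture with a tiny standalone model: the CDF obtained by summing the PMF over the domain, and the ICDF obtained by searching for the smallest k whose CDF reaches p (the class summary mentions a binary search; a linear scan is enough for an illustration). The sketch below uses a hand-rolled Binomial(3, 0.5) pmf, so it assumes nothing about the library beyond the behavior those summaries describe.

// Standalone illustration of the two strategies described above:
// CDF by summing the pmf, and ICDF by finding the smallest k with CDF(k) >= p.
static double Pmf(int k)            // toy discrete model: Binomial(3, 0.5)
{
    int[] counts = { 1, 3, 3, 1 };  // C(3, k)
    return counts[k] / 8.0;
}

static double CdfBySummation(int k)
{
    double sum = 0;
    for (int i = 0; i <= k; i++)    // sum the pmf over the domain up to k
        sum += Pmf(i);
    return sum;
}

static int IcdfBySearch(double p)
{
    for (int k = 0; k <= 3; k++)    // smallest k whose cumulative probability reaches p
        if (CdfBySummation(k) >= p)
            return k;
    return 3;
}

// CdfBySummation(1) == 0.5 and IcdfBySearch(0.75) == 2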
Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of k occurring in the current distribution. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of k occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value k will occur. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value k will occur. The logarithm of the probability of x occurring in the current distribution. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the log-cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the cumulative hazard function H(x) evaluated at x in the current distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
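The hazard summaries above come down to two short expressions once a mass function and a survival function are available: h(x) = f(x) / S(x), and a cumulative hazard consistent with H(x) = -ln S(x); the Bernoulli example earlier on this page quotes values that match exactly these expressions. The sketch below recomputes both for that example using only calls already shown there.

// Sketch: hazard and cumulative hazard for the Bernoulli example (p = 0.42), by hand.
var bern = new BernoulliDistribution(mean: 0.42);

double f0 = bern.ProbabilityMassFunction(k: 0);            // 0.58
double s0 = bern.ComplementaryDistributionFunction(k: 0);  // P(X > 0) = 0.42

double hazard = f0 / s0;                  // 1.380952..., equals bern.HazardFunction(x: 0)
double cumulativeHazard = -Math.Log(s0);  // 0.867500..., equals bern.CumulativeHazardFunction(x: 0)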
Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. 
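The Fit and Generate members described above are commonly used together: draw samples from a distribution with known parameters, then fit a fresh instance of the same family to the data and compare. The sketch below is a minimal illustration of that pattern (not from the original reference); it assumes Generate(n) on a continuous distribution returns a double[], as the entries above describe, and uses the Rayleigh distribution documented later in this reference.

using System;
using Accord.Statistics.Distributions.Univariate;

class FitAndGenerate
{
    static void Main()
    {
        // Source distribution with a known scale parameter.
        var source = new RayleighDistribution(sigma: 0.42);

        // Generate a vector of observations from the source distribution.
        double[] samples = source.Generate(10000);

        // Fit another instance of the same family to the generated data.
        var estimate = new RayleighDistribution(sigma: 1.0);
        estimate.Fit(samples);

        // The fitted distribution's mean should be close to the source's mean.
        Console.WriteLine(source.Mean);
        Console.WriteLine(estimate.Mean);
    }
}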
Wilcoxon's W statistic distribution. This is the distribution for the positive side statistic W+ of the Wilcoxon test. Some textbooks (and statistical packages) use alternate definitions for W, which should be compared with the appropriate statistic tables or alternate distributions. The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ (i.e. it is a paired difference test). It can be used as an alternative to the paired Student's t-test, t-test for matched pairs, or the t-test for dependent samples when the population cannot be assumed to be normally distributed. References: Wikipedia, The Free Encyclopedia. Wilcoxon signed-rank test. Available on: http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test // Compute some rank statistics (see other examples below) double[] ranks = { 1, 2, 3, 4, 5.5, 5.5, 7, 8, 9, 10, 11, 12 }; // Create a new Wilcoxon's W distribution WilcoxonDistribution W = new WilcoxonDistribution(ranks); // Common measures double mean = W.Mean; // 39.0 double median = W.Median; // 38.5 double var = W.Variance; // 162.5 // Probability density functions double pdf = W.ProbabilityDensityFunction(w: 42); // 0.38418508862319295 double lpdf = W.LogProbabilityDensityFunction(w: 42); // ≈ -0.9566 (the log of the PDF above) // Cumulative distribution functions double cdf = W.DistributionFunction(w: 42); // 0.60817384423279575 double ccdf = W.ComplementaryDistributionFunction(x: 42); // 0.39182615576720425 // Quantile function double icdf = W.InverseDistributionFunction(p: cdf); // 42 // Hazard (failure rate) functions double hf = W.HazardFunction(x: 42); // 0.98049883339449373 double chf = W.CumulativeHazardFunction(x: 42); // 0.936937017743799 // String representation string str = W.ToString(); // "W+(x; R)" The following example shows how to compute the W+ statistic given a sample. The statistic is given as the sum of all positive signed ranks in a sample. // Suppose we have computed a vector of differences between // samples and a hypothesized value (as in Wilcoxon's test). double[] differences = ... // differences between samples and a hypothesized median // Compute the ranks of the absolute differences and their sign double[] ranks = Measures.Rank(differences.Abs()); int[] signs = Accord.Math.Matrix.Sign(differences).ToInt32(); // Compute the W+ statistic from the signed ranks double W = WilcoxonDistribution.WPositive(signs, ranks); Gets the number of effective samples. Gets whether this distribution computes the exact probabilities (by searching all possible sign combinations) or gives fast approximations. true if this distribution is exact; otherwise, false. Gets the statistic values for all possible combinations of ranks. This is used to compute the exact distribution. Gets or sets the continuity correction to be applied when using the Normal approximation to this distribution. Creates a new Wilcoxon's W+ distribution. The number of observations. Creates a new Wilcoxon's W+ distribution. The rank statistics for the samples. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 12). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. Computes the Wilcoxon's W+ statistic. The W+ statistic is computed as the sum of all positive signed ranks. Computes the Wilcoxon's W- statistic. 
The W- statistic is computed as the sum of all negative signed ranks. Computes the Wilcoxon's W statistic (equivalent to Mann-Whitney U when used in two-sample tests). The W statistic is computed as the minimum of the W+ and W- statistics. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. The distribution's mode value. In the current implementation, returns the same as the . Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. System.Double. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the probability density function (pdf) for this distribution evaluated at point w. A single point in the distribution range. The probability of w occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point w. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Returns a that represents this instance. A that represents this instance. Degenerate distribution. In mathematics, a degenerate distribution or deterministic distribution is the probability distribution of a random variable which only takes a single value. Examples include a two-headed coin and rolling a die whose sides all show the same number. While this distribution does not appear random in the everyday sense of the word, it does satisfy the definition of random variable. The degenerate distribution is localized at a point k0 on the real line. The probability mass function is a Delta function at k0. References: Wikipedia, The Free Encyclopedia. Degenerate distribution. Available on: http://en.wikipedia.org/wiki/Degenerate_distribution This example shows how to create a Degenerate distribution and compute some of its properties. 
var dist = new DegenerateDistribution(value: 2); double mean = dist.Mean; // 2 double median = dist.Median; // 2 double mode = dist.Mode; // 2 double var = dist.Variance; // 1 double cdf1 = dist.DistributionFunction(k: 1); // 0 double cdf2 = dist.DistributionFunction(k: 2); // 1 double pdf1 = dist.ProbabilityMassFunction(k: 1); // 0 double pdf2 = dist.ProbabilityMassFunction(k: 2); // 1 double pdf3 = dist.ProbabilityMassFunction(k: 3); // 0 double lpdf = dist.LogProbabilityMassFunction(k: 2); // 0 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.0 int icdf1 = dist.InverseDistributionFunction(p: 0.0); // 1 int icdf2 = dist.InverseDistributionFunction(p: 0.5); // 3 int icdf3 = dist.InverseDistributionFunction(p: 1.0); // 2 double hf = dist.HazardFunction(x: 0); // 0.0 double chf = dist.CumulativeHazardFunction(x: 0); // 0.0 string str = dist.ToString(CultureInfo.InvariantCulture); // Degenerate(x; k0 = 2) Gets the unique value whose probability is different from zero. Initializes a new instance of the class. Initializes a new instance of the class. The only value whose probability is different from zero. Default is zero. Gets the mean for this distribution. In the Degenerate distribution, the mean is equal to the unique value within its domain. The distribution's mean value, which should equal . Gets the median for this distribution, which should equal . In the Degenerate distribution, the mean is equal to the unique value within its domain. The distribution's median value. Gets the variance for this distribution, which should equal 0. In the Degenerate distribution, the variance equals 0. The distribution's variance. Gets the mode for this distribution, which should equal . In the Degenerate distribution, the mean is equal to the unique value within its domain. The distribution's mode value. Gets the entropy for this distribution, which is zero. The distribution's entropy. Gets the support interval for this distribution. The degenerate distribution's support is given only on the point interval (, ). A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. The does not support fitting. Symmetric Geometric Distribution. The Symmetric Geometric Distribution can be seen as a discrete case of the . Gets the success probability for the distribution. Creates a new symmetric geometric distribution. The success probability. Gets the support interval for this distribution, which in the case of the Symmetric Geometric is [-inf, +inf]. 
A containing the support interval for this distribution. Gets the mean for this distribution, which in the case of the Symmetric Geometric is zero. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Not supported. Not supported. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Negative Binomial distribution. The negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified (non-random) number of failures (denoted r) occur. For example, if one throws a die repeatedly until the third time “1” appears, then the probability distribution of the number of non-“1”s that had appeared will be negative binomial. References: Wikipedia, The Free Encyclopedia. Negative Binomial distribution. 
Available from: http://en.wikipedia.org/wiki/Negative_binomial_distribution // Create a new Negative Binomial distribution with r = 7 and p = 0.42 var dist = new NegativeBinomialDistribution(failures: 7, probability: 0.42); // Common measures double mean = dist.Mean; // 5.068965517241379 double median = dist.Median; // 5.0 double var = dist.Variance; // 8.7395957193816862 // Cumulative distribution functions double cdf = dist.DistributionFunction(k: 2); // 0.19605133020527743 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.80394866979472257 // Probability mass functions double pmf1 = dist.ProbabilityMassFunction(k: 4); // 0.054786846293416853 double pmf2 = dist.ProbabilityMassFunction(k: 5); // 0.069908015870399909 double pmf3 = dist.ProbabilityMassFunction(k: 6); // 0.0810932984096639 double lpmf = dist.LogProbabilityMassFunction(k: 2); // -2.3927801721315989 // Quantile function int icdf1 = dist.InverseDistributionFunction(p: 0.17); // 2 int icdf2 = dist.InverseDistributionFunction(p: 0.46); // 4 int icdf3 = dist.InverseDistributionFunction(p: 0.87); // 8 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 4); // 0.10490438293398294 double chf = dist.CumulativeHazardFunction(x: 4); // 0.64959916255036043 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "NegativeBinomial(x; r = 7, p = 0.42)" Creates a new Negative Binomial distribution. Number of failures r. Success probability in each experiment. Gets the number of failures. The number of failures. Gets the probability of success. The probability of success in each experiment. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets P( X<= k), the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Returns a that represents this instance. A that represents this instance. Pareto's Distribution. 
The Pareto distribution, named after the Italian economist Vilfredo Pareto, is a power law probability distribution that coincides with social, scientific, geophysical, actuarial, and many other types of observable phenomena. Outside the field of economics it is sometimes referred to as the Bradford distribution. References: Wikipedia, The Free Encyclopedia. Pareto distribution. Available from: http://en.wikipedia.org/wiki/Pareto_distribution // Creates a new Pareto distribution with xm = 0.42, α = 3 var pareto = new ParetoDistribution(scale: 0.42, shape: 3); // Common measures double mean = pareto.Mean; // 0.63 double median = pareto.Median; // 0.52916684095584676 double var = pareto.Variance; // 0.13229999999999997 // Cumulative distribution functions double cdf = pareto.DistributionFunction(x: 1.4); // 0.973 double ccdf = pareto.ComplementaryDistributionFunction(x: 1.4); // 0.027000000000000024 double icdf = pareto.InverseDistributionFunction(p: cdf); // 1.4000000446580794 // Probability density functions double pdf = pareto.ProbabilityDensityFunction(x: 1.4); // 0.057857142857142857 double lpdf = pareto.LogProbabilityDensityFunction(x: 1.4); // -2.8497783609309111 // Hazard (failure rate) functions double hf = pareto.HazardFunction(x: 1.4); // 2.142857142857141 double chf = pareto.CumulativeHazardFunction(x: 1.4); // 3.6119184129778072 // String representation string str = pareto.ToString(CultureInfo.InvariantCulture); // Pareto(x; xm = 0.42, α = 3) Creates a new Pareto distribution. Creates a new Pareto distribution. The scale parameter xm. Default is 1. The shape parameter α (alpha). Default is 1. Gets the scale parameter xm for this distribution. Gets the shape parameter α (alpha) for this distribution. Gets the mean for this distribution. The distribution's mean value. The Pareto distribution's mean is defined as α * xm / (α - 1). Gets the variance for this distribution. The distribution's variance. The Pareto distribution's variance is defined as α * xm² / ((α - 1)² * (α - 2)). Gets the entropy for this distribution. The distribution's entropy. The Pareto distribution's Entropy is defined as ln(xm / α) + 1 / α + 1. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mode for this distribution. The distribution's mode value. Gets the median for this distribution. The distribution's median value. The Pareto distribution's median is defined as xm * 2^(1 / α). Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. 
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Discrete uniform distribution. In probability theory and statistics, the discrete uniform distribution is a symmetric probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n. Another way of saying "discrete uniform distribution" would be "a known, finite number of outcomes equally likely to happen". A simple example of the discrete uniform distribution is throwing a fair die. The possible values are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of a given score is 1/6. If two dice are thrown and their values added, the resulting distribution is no longer uniform since not all sums have equal probability. The discrete uniform distribution itself is inherently non-parametric. It is convenient, however, to represent its values generally by an integer interval [a,b], so that a,b become the main parameters of the distribution (often one simply considers the interval [1,n] with the single parameter n). References: Wikipedia, The Free Encyclopedia. Uniform distribution (discrete). Available on: http://en.wikipedia.org/wiki/Uniform_distribution_(discrete) // Create an uniform (discrete) distribution in [2, 6] var dist = new UniformDiscreteDistribution(a: 2, b: 6); // Common measures double mean = dist.Mean; // 4.0 double median = dist.Median; // 4.0 double var = dist.Variance; // 1.3333333333333333 // Cumulative distribution functions double cdf = dist.DistributionFunction(k: 2); // 0.2 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.8 // Probability mass functions double pmf1 = dist.ProbabilityMassFunction(k: 4); // 0.2 double pmf2 = dist.ProbabilityMassFunction(k: 5); // 0.2 double pmf3 = dist.ProbabilityMassFunction(k: 6); // 0.2 double lpmf = dist.LogProbabilityMassFunction(k: 2); // -1.6094379124341003 // Quantile function int icdf1 = dist.InverseDistributionFunction(p: 0.17); // 2 int icdf2 = dist.InverseDistributionFunction(p: 0.46); // 4 int icdf3 = dist.InverseDistributionFunction(p: 0.87); // 6 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 4); // 0.5 double chf = dist.CumulativeHazardFunction(x: 4); // 0.916290731874155 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "U(x; a = 2, b = 6)" Creates a discrete uniform distribution defined in the interval [a;b]. The starting (minimum) value a. The ending (maximum) value b. Gets the minimum value of the distribution (a). Gets the maximum value of the distribution (b). Gets the length of the distribution (b - a + 1). Gets the mean for this distribution. Gets the variance for this distribution. 
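To make the fair-die example above concrete, the sketch below (illustrative only, not part of the original reference) prints the probability mass and cumulative probability for each face of a die modeled as a discrete uniform distribution on [1, 6]; every face has mass 1/6 and the CDF grows in steps of 1/6.

using System;
using Accord.Statistics.Distributions.Univariate;

class FairDie
{
    static void Main()
    {
        var die = new UniformDiscreteDistribution(a: 1, b: 6);

        for (int k = 1; k <= 6; k++)
        {
            double pmf = die.ProbabilityMassFunction(k); // 1/6 for every face
            double cdf = die.DistributionFunction(k);    // k/6
            Console.WriteLine($"P(X = {k}) = {pmf:F4},  P(X <= {k}) = {cdf:F4}");
        }
    }
}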
Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. The number of samples to generate. An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution with the given parameters. The starting number a. The ending number b. A random double value sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. 
The starting number a. The ending number b. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval 0 and MAXVALUE. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution with the given parameters. The starting number a. The ending number b. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Uniform distribution. Returns a that represents this instance. A that represents this instance. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. (Shifted) Geometric Distribution. This class represents the shifted version of the Geometric distribution with support on { 0, 1, 2, 3, ... }. This is the probability distribution of the number Y = X − 1 of failures before the first success, supported on the set { 0, 1, 2, 3, ... }. References: Wikipedia, The Free Encyclopedia. Geometric distribution. 
Available on: http://en.wikipedia.org/wiki/Geometric_distribution // Create a Geometric distribution with 42% success probability var dist = new GeometricDistribution(probabilityOfSuccess: 0.42); // Common measures double mean = dist.Mean; // 1.3809523809523812 double median = dist.Median; // 1.0 double var = dist.Variance; // 3.2879818594104315 double mode = dist.Mode; // 0.0 // Cumulative distribution functions double cdf = dist.DistributionFunction(k: 2); // 0.80488799999999994 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.19511200000000006 // Probability mass functions double pdf1 = dist.ProbabilityMassFunction(k: 0); // 0.42 double pdf2 = dist.ProbabilityMassFunction(k: 1); // 0.2436 double pdf3 = dist.ProbabilityMassFunction(k: 2); // 0.141288 double lpdf = dist.LogProbabilityMassFunction(k: 2); // -1.956954918588067 // Quantile functions int icdf1 = dist.InverseDistributionFunction(p: 0.17); // 0 int icdf2 = dist.InverseDistributionFunction(p: 0.46); // 1 int icdf3 = dist.InverseDistributionFunction(p: 0.87); // 3 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 0); // 0.72413793103448265 double chf = dist.CumulativeHazardFunction(x: 0); // 0.54472717544167193 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "Geometric(x; p = 0.42)" Gets the success probability for the distribution. Creates a new (shifted) geometric distribution. The success probability. Gets the mean for this distribution. The distribution's mean value. Gets the mode for this distribution. The distribution's mode value. Gets the median for this distribution. The distribution's median value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The probability of success. A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The probability of success. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The probability of success. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The probability of success. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The probability of success. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The probability of success. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Hypergeometric probability distribution. The hypergeometric distribution is a discrete probability distribution that describes the probability of k successes in n draws from a finite population without replacement. This is in contrast to the binomial distribution, which describes the probability of k successes in n draws with replacement. References: Wikipedia, The Free Encyclopedia. Hypergeometric distribution. 
Available on: http://en.wikipedia.org/wiki/Hypergeometric_distribution // Distribution parameters int populationSize = 15; // population size N int success = 7; // number of successes in the sample int samples = 8; // number of samples drawn from N // Create a new Hypergeometric distribution with N = 15, n = 8, and s = 7 var dist = new HypergeometricDistribution(populationSize, success, samples); // Common measures double mean = dist.Mean; // 1.3809523809523812 double median = dist.Median; // 4.0 double var = dist.Variance; // 3.2879818594104315 double mode = dist.Mode; // 4.0 // Cumulative distribution functions double cdf = dist.DistributionFunction(k: 2); // 0.80488799999999994 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.19511200000000006 // Probability mass functions double pdf1 = dist.ProbabilityMassFunction(k: 4); // 0.38073038073038074 double pdf2 = dist.ProbabilityMassFunction(k: 5); // 0.18275058275058276 double pdf3 = dist.ProbabilityMassFunction(k: 6); // 0.030458430458430458 double lpdf = dist.LogProbabilityMassFunction(k: 2); // -2.3927801721315989 // Quantile function int icdf1 = dist.InverseDistributionFunction(p: 0.17); // 3 int icdf2 = dist.InverseDistributionFunction(p: 0.46); // 4 int icdf3 = dist.InverseDistributionFunction(p: 0.87); // 5 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 4); // 1.7753623188405792 double chf = dist.CumulativeHazardFunction(x: 4); // 1.5396683418789763 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "HyperGeometric(x; N = 15, m = 7, n = 8)" Gets the size N of the population for this distribution. Gets the size n of the sample drawn from N. Gets the count of success trials in the population for this distribution. This is often referred as m. Constructs a new Hypergeometric distribution. To build from the number of failures and succcesses like in R, please see . Size N of the population. The number m of successes in the population. The number n of samples drawn from the population. Creates a new from the number of successes, failures, and the number of samples drawn. This is the same parameterization used in R. The number m of successes in the population. The number m of failures in the population. The number n of samples drawn from the population. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the mode for this distribution. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. 
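A hypothetical worked example of the sampling-without-replacement idea above (a sketch, not part of the original reference): drawing 5 cards from a standard 52-card deck, with the 13 hearts counted as successes, the probability of getting at least two hearts is one minus the CDF evaluated at one. The constructor arguments follow the same order as the example above (population size N, successes m, samples n).

using System;
using Accord.Statistics.Distributions.Univariate;

class AtLeastTwoHearts
{
    static void Main()
    {
        // Population N = 52, successes in the population m = 13, draws n = 5.
        var hearts = new HypergeometricDistribution(52, 13, 5);

        double atMostOne = hearts.DistributionFunction(k: 1); // P(X <= 1)
        double atLeastTwo = 1.0 - atMostOne;                  // P(X >= 2)

        Console.WriteLine(atLeastTwo);
    }
}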
Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Inverse Gaussian (Normal) Distribution, also known as the Wald distribution. The Inverse Gaussian distribution is a two-parameter family of continuous probability distributions with support on (0,∞). As λ tends to infinity, the inverse Gaussian distribution becomes more like a normal (Gaussian) distribution. The inverse Gaussian distribution has several properties analogous to a Gaussian distribution. The name can be misleading: it is an "inverse" only in that, while the Gaussian describes a Brownian Motion's level at a fixed time, the inverse Gaussian describes the distribution of the time a Brownian Motion with positive drift takes to reach a fixed positive level. References: Wikipedia, The Free Encyclopedia. Inverse Gaussian distribution. Available on: http://en.wikipedia.org/wiki/Inverse_Gaussian_distribution // Create a new inverse Gaussian distribution with μ = 0.42 and λ = 1.2 var invGaussian = new InverseGaussianDistribution(mean: 0.42, shape: 1.2); // Common measures double mean = invGaussian.Mean; // 0.42 double median = invGaussian.Median; // 0.35856861093990083 double var = invGaussian.Variance; // 0.061739999999999989 // Cumulative distribution functions double cdf = invGaussian.DistributionFunction(x: 0.27); // 0.30658791274125458 double ccdf = invGaussian.ComplementaryDistributionFunction(x: 0.27); // 0.69341208725874548 double icdf = invGaussian.InverseDistributionFunction(p: cdf); // 0.26999999957543408 // Probability density functions double pdf = invGaussian.ProbabilityDensityFunction(x: 0.27); // 2.3461495925760354 double lpdf = invGaussian.LogProbabilityDensityFunction(x: 0.27); // 0.85277551314980737 // Hazard (failure rate) functions double hf = invGaussian.HazardFunction(x: 0.27); // 3.383485283406336 double chf = invGaussian.CumulativeHazardFunction(x: 0.27); // 0.36613081401302111 // String representation string str = invGaussian.ToString(CultureInfo.InvariantCulture); // "N^-1(x; μ = 0.42, λ = 1.2)" Constructs a new Inverse Gaussian distribution. The mean parameter mu. The shape parameter lambda. Gets the shape parameter (lambda) for this distribution. The distribution's lambda value. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the mode for this distribution. The distribution's mode value. 
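A brief aside (not from the original reference) relating the quantile function to the median reported in the example above: evaluating the inverse distribution function at p = 0.5 should reproduce the Median property, up to the numerical tolerance of the ICDF approximation.

using System;
using Accord.Statistics.Distributions.Univariate;

class MedianAsQuantile
{
    static void Main()
    {
        var invGaussian = new InverseGaussianDistribution(mean: 0.42, shape: 1.2);

        double median = invGaussian.Median;                                // ≈ 0.3586 (see example above)
        double quantile = invGaussian.InverseDistributionFunction(p: 0.5);

        // Both values should agree up to numerical tolerance.
        Console.WriteLine(median);
        Console.WriteLine(quantile);
    }
}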
Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Estimates a new Normal distribution from a given set of observations. Estimates a new Normal distribution from a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random observation from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Uniform distribution. Generates a random vector of observations from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. The number of samples to generate. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. 
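The overloads above that accept an explicit random number generator make sampling reproducible. The sketch below assumes the (samples, output buffer, source) parameter order listed above; the exact signature should be confirmed against the actual API. Passing two identically seeded System.Random instances should yield identical sample vectors.

using System;
using Accord.Statistics.Distributions.Univariate;

class ReproducibleSampling
{
    static void Main()
    {
        var invGaussian = new InverseGaussianDistribution(mean: 0.42, shape: 1.2);

        // Two draws using identically seeded generators...
        double[] a = invGaussian.Generate(5, new double[5], new Random(123));
        double[] b = invGaussian.Generate(5, new double[5], new Random(123));

        // ...should produce the same observations.
        for (int i = 0; i < a.Length; i++)
            Console.WriteLine(a[i] == b[i]); // expected: True
    }
}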
Generates a random vector of observations from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. The number of samples to generate. The location where to store the samples. An array of double values sampled from the inverse Gaussian distribution. Generates a random vector of observations from the Inverse Gaussian distribution with the given parameters. The mean parameter mu. The shape parameter lambda. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the inverse Gaussian distribution. Returns a that represents this instance. The format. The format provider. A that represents this instance. Nakagami distribution. The Nakagami distribution has been used in the modeling of wireless signal attenuation while traversing multiple paths. References: Wikipedia, The Free Encyclopedia. Nakagami distribution. Available on: http://en.wikipedia.org/wiki/Nakagami_distribution Laurenson, Dave (1994). "Nakagami Distribution". Indoor Radio Channel Propagation Modeling by Ray Tracing Techniques. R. Kolar, R. Jirik, J. Jan (2004) "Estimator Comparison of the Nakagami-m Parameter and Its Application in Echocardiography", Radioengineering, 13 (1), 8–12 var nakagami = new NakagamiDistribution(shape: 2.4, spread: 4.2); double mean = nakagami.Mean; // 1.946082119049118 double median = nakagami.Median; // 1.9061151110206338 double var = nakagami.Variance; // 0.41276438591729486 double cdf = nakagami.DistributionFunction(x: 1.4); // 0.20603416752368109 double pdf = nakagami.ProbabilityDensityFunction(x: 1.4); // 0.49253215371343023 double lpdf = nakagami.LogProbabilityDensityFunction(x: 1.4); // -0.708195533773302 double ccdf = nakagami.ComplementaryDistributionFunction(x: 1.4); // 0.79396583247631891 double icdf = nakagami.InverseDistributionFunction(p: cdf); // 1.400000000131993 double hf = nakagami.HazardFunction(x: 1.4); // 0.62034426869133652 double chf = nakagami.CumulativeHazardFunction(x: 1.4); // 0.23071485080660473 string str = nakagami.ToString(CultureInfo.InvariantCulture); // "Nakagami(x; μ = 2.4, ω = 4.2)" Initializes a new instance of the class. The shape parameter μ (mu). The spread parameter ω (omega). Gets the distribution's shape parameter μ (mu). The shape parameter μ (mu). Gets the distribution's spread parameter ω (omega). The spread parameter ω (omega). Gets the mean for this distribution. The distribution's mean value. Nakagami's mean is defined in terms of the Gamma function Γ(x) as (Γ(μ + 0.5) / Γ(μ)) * sqrt(ω / μ). Gets the mode for this distribution. The distribution's mode value. Gets the variance for this distribution. The distribution's variance. Nakagami's variance is defined in terms of the Gamma function Γ(x) as ω * (1 - (1 / μ) * (Γ(μ + 0.5) / Γ(μ))²). This method is not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The Nakagami distribution's CDF is defined in terms of the Lower incomplete regularized Gamma function P(a, x) as CDF(x) = P(μ, (μ / ω) * x²). See . 
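The closed-form mean quoted above, mean = (Γ(μ + 0.5) / Γ(μ)) * sqrt(ω / μ), can be checked against the Mean property. The sketch below assumes the Gamma special function lives at Accord.Math.Gamma.Function; if the helper is located elsewhere in the library, adjust the call accordingly.

using System;
using Accord.Math;
using Accord.Statistics.Distributions.Univariate;

class NakagamiMeanCheck
{
    static void Main()
    {
        double mu = 2.4, omega = 4.2;
        var nakagami = new NakagamiDistribution(shape: mu, spread: omega);

        double byFormula = (Gamma.Function(mu + 0.5) / Gamma.Function(mu))
                         * Math.Sqrt(omega / mu);

        Console.WriteLine(nakagami.Mean); // ≈ 1.9461 (see example above)
        Console.WriteLine(byFormula);     // should agree
    }
}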
Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Nakagami's PDF is defined as PDF(x) = c * x^(2 * μ - 1) * exp(-(μ / ω) * x²) in which c = 2 * μ ^ μ / (Γ(μ) * ω ^ μ)) See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Nakagami's PDF is defined as PDF(x) = c * x^(2 * μ - 1) * exp(-(μ / ω) * x²) in which c = 2 * μ ^ μ / (Γ(μ) * ω ^ μ)) See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Estimates a new Nakagami distribution from a given set of observations. Estimates a new Nakagami distribution from a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. The number of samples to generate. An array of double values sampled from the specified Nakagami distribution. Generates a random vector of observations from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Nakagami distribution. Generates a random observation from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. A random double value sampled from the specified Nakagami distribution. Generates a random vector of observations from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Nakagami distribution. Generates a random vector of observations from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. The number of samples to generate. The location where to store the samples. 
The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Nakagami distribution. Generates a random observation from the Nakagami distribution with the given parameters. The shape parameter μ. The spread parameter ω. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Nakagami distribution. Returns a that represents this instance. The format. The format provider. A that represents this instance. Rayleigh distribution. In probability theory and statistics, the Rayleigh distribution is a continuous probability distribution. A Rayleigh distribution is often observed when the overall magnitude of a vector is related to its directional components. One example where the Rayleigh distribution naturally arises is when wind speed is analyzed into its orthogonal 2-dimensional vector components. Assuming that the magnitude of each component is uncorrelated and normally distributed with equal variance, then the overall wind speed (vector magnitude) will be characterized by a Rayleigh distribution. References: Wikipedia, The Free Encyclopedia. Rayleigh distribution. Available on: http://en.wikipedia.org/wiki/Rayleigh_distribution // Create a new Rayleigh distribution with σ = 0.42 var rayleigh = new RayleighDistribution(sigma: 0.42); // Common measures double mean = rayleigh.Mean; // 0.52639193767251 double median = rayleigh.Median; // 0.49451220943852386 double var = rayleigh.Variance; // 0.075711527953380237 // Cumulative distribution functions double cdf = rayleigh.DistributionFunction(x: 1.4); // 0.99613407986052716 double ccdf = rayleigh.ComplementaryDistributionFunction(x: 1.4); // 0.0038659201394728449 double icdf = rayleigh.InverseDistributionFunction(p: cdf); // 1.4000000080222026 // Probability density functions double pdf = rayleigh.ProbabilityDensityFunction(x: 1.4); // 0.030681905868831811 double lpdf = rayleigh.LogProbabilityDensityFunction(x: 1.4); // -3.4840821835248961 // Hazard (failure rate) functions double hf = rayleigh.HazardFunction(x: 1.4); // 7.9365079365078612 double chf = rayleigh.CumulativeHazardFunction(x: 1.4); // 5.5555555555555456 // String representation string str = rayleigh.ToString(CultureInfo.InvariantCulture); // Rayleigh(x; σ = 0.42) Creates a new Rayleigh distribution. The scale parameter σ (sigma). Gets the mean for this distribution. The distribution's mean value. The Rayleigh distribution's mean is defined as mean = σ * sqrt(π / 2). Gets the Rayleigh distribution's scale parameter σ (sigma). Gets the variance for this distribution. The distribution's variance. The Rayleigh distribution's variance is defined as var = ((4 - π) / 2) * σ². Gets the mode for this distribution. In the Rayleigh distribution, the mode equals σ (sigma). The distribution's mode value. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. 
The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact on performance. Estimates a new Rayleigh distribution from a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. The number of samples to generate. An array of double values sampled from the specified Rayleigh distribution. Generates a random vector of observations from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Rayleigh distribution. Generates a random vector of observations from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Rayleigh distribution. Generates a random vector of observations from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Rayleigh distribution. Generates a random observation from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. A random double value sampled from the specified Rayleigh distribution. Generates a random observation from the Rayleigh distribution with the given parameters. The Rayleigh distribution's sigma. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Rayleigh distribution. Returns a that represents this instance. A that represents this instance. Student's t-distribution. 
In probability and statistics, Student's t-distribution (or simply the t-distribution) is a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown. It plays a role in a number of widely used statistical analyses, including the Student's t-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and in linear regression analysis. The Student's t-distribution also arises in the Bayesian analysis of data from a normal family. If we take k samples from a normal distribution with fixed unknown mean and variance, and if we compute the sample mean and sample variance for these k samples, then the t-distribution (for k) can be defined as the distribution of the location of the true mean, relative to the sample mean and divided by the sample standard deviation, after multiplying by the normalizing term sqrt(n), where n is the sample size. In this way the t-distribution can be used to estimate how likely it is that the true mean lies in any given range. The t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails, meaning that it is more prone to producing values that fall far from its mean. This makes it useful for understanding the statistical behavior of certain types of ratios of random quantities, in which variation in the denominator is amplified and may produce outlying values when the denominator of the ratio falls close to zero. The Student's t-distribution is a special case of the generalized hyperbolic distribution. References: Wikipedia, The Free Encyclopedia. Student's t-distribution. Available on: http://en.wikipedia.org/wiki/Student's_t-distribution // Create a new Student's T distribution with d.f = 4.2 TDistribution t = new TDistribution(degreesOfFreedom: 4.2); // Common measures double mean = t.Mean; // 0.0 double median = t.Median; // 0.0 double var = t.Variance; // 1.9090909090909089 // Cumulative distribution functions double cdf = t.DistributionFunction(x: 1.4); // 0.88456136730659074 double pdf = t.ProbabilityDensityFunction(x: 1.4); // 0.13894002185341031 double lpdf = t.LogProbabilityDensityFunction(x: 1.4); // -1.9737129364307417 // Probability density functions double ccdf = t.ComplementaryDistributionFunction(x: 1.4); // 0.11543863269340926 double icdf = t.InverseDistributionFunction(p: cdf); // 1.4000000000000012 // Hazard (failure rate) functions double hf = t.HazardFunction(x: 1.4); // 1.2035833984833988 double chf = t.CumulativeHazardFunction(x: 1.4); // 2.1590162088918525 // String representation string str = t.ToString(CultureInfo.InvariantCulture); // T(x; df = 4.2) Gets the degrees of freedom for the distribution. Initializes a new instance of the class. The degrees of freedom. Gets the mean for this distribution. The distribution's mean value. In the T Distribution, the mean is zero if the number of degrees of freedom is higher than 1. Otherwise, it is undefined. Gets the mode for this distribution (always zero). The distribution's mode value (zero). Gets the variance for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Not supported. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. 
The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. See . Not supported. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Gets the inverse of the cumulative distribution function (icdf) for the left tail T-distribution evaluated at probability p. Based on the stdtril function from the Cephes Math Library Release 2.8, adapted with permission of Stephen L. Moshier. Continuous Uniform Distribution. The continuous uniform distribution or rectangular distribution is a family of symmetric probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by the two parameters, a and b, which are its minimum and maximum values. The distribution is often abbreviated U(a,b). It is the maximum entropy probability distribution for a random variate X under no constraint other than that it is contained in the distribution's support. References: Wikipedia, The Free Encyclopedia. Uniform Distribution (continuous). Available on: http://en.wikipedia.org/wiki/Uniform_distribution_(continuous) The following example demonstrates how to create an uniform distribution defined over the interval [0.42, 1.1]. // Create a new uniform continuous distribution from 0.42 to 1.1 var uniform = new UniformContinuousDistribution(a: 0.42, b: 1.1); // Common measures double mean = uniform.Mean; // 0.76 double median = uniform.Median; // 0.76 double var = uniform.Variance; // 0.03853333333333335 // Cumulative distribution functions double cdf = uniform.DistributionFunction(x: 0.9); // 0.70588235294117641 double ccdf = uniform.ComplementaryDistributionFunction(x: 0.9); // 0.29411764705882359 double icdf = uniform.InverseDistributionFunction(p: cdf); // 0.9 // Probability density functions double pdf = uniform.ProbabilityDensityFunction(x: 0.9); // 1.4705882352941173 double lpdf = uniform.LogProbabilityDensityFunction(x: 0.9); // 0.38566248081198445 // Hazard (failure rate) functions double hf = uniform.HazardFunction(x: 0.9); // 4.9999999999999973 double chf = uniform.CumulativeHazardFunction(x: 0.9); // 1.2237754316221154 // String representation string str = uniform.ToString(CultureInfo.InvariantCulture); // "U(x; a = 0.42, b = 1.1)" Creates a new uniform distribution defined in the interval [0;1]. Creates a new uniform distribution defined in the interval [min;max]. 
The output range for the uniform distribution. Creates a new uniform distribution defined in the interval [a;b]. The starting number a. The ending number b. Gets the minimum value of the distribution (a). Gets the maximum value of the distribution (b). Gets the length of the distribution (b-a). Gets the mean for this distribution. Gets the variance for this distribution. Gets the mode for this distribution. The mode of the uniform distribution is any value contained in the interval of the distribution. The framework returns the same value as the . The distribution's mode value. Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact on performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets the Standard Uniform Distribution, starting at zero and ending at one (a=0, b=1). Estimates a new uniform distribution from a given set of observations. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. 
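The static Random overloads summarized above and below can also be used with a pre-allocated buffer to avoid repeated allocations. The following is a minimal sketch, not part of the original documentation; it assumes the overloads behave exactly as summarized here, i.e. Random(a, b, samples) returns a newly allocated array while Random(a, b, samples, result) fills the given one.
// Draw 1000 samples from U(0.42, 1.1) into a newly allocated array:
double[] samples = UniformContinuousDistribution.Random(0.42, 1.1, 1000);
// Reuse a pre-allocated buffer to avoid unnecessary memory allocations:
double[] buffer = new double[1000];
UniformContinuousDistribution.Random(0.42, 1.1, 1000, buffer);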
Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution with the given parameters. The starting number a. The ending number b. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution defined in the interval [0;1]. The number of samples to generate. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution defined in the interval [0;1]. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Uniform distribution. Generates a random vector of observations from the Uniform distribution defined in the interval [0;1]. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval [0;1]. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution defined in the interval [0;1]. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution with the given parameters. The starting number a. The ending number b. A random double value sampled from the specified Uniform distribution. Generates a random observation from the Uniform distribution with the given parameters. The starting number a. The ending number b. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Uniform distribution. Returns a that represents this instance. A that represents this instance. Log-Normal (Galton) distribution. The log-normal distribution is a probability distribution of a random variable whose logarithm is normally distributed. References: Wikipedia, The Free Encyclopedia. Log-normal distribution. Available on: http://en.wikipedia.org/wiki/Log-normal_distribution NIST/SEMATECH e-Handbook of Statistical Methods. Lognormal Distribution. Available on: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3669.htm Weisstein, Eric W. "Normal Distribution Function." From MathWorld--A Wolfram Web Resource. 
http://mathworld.wolfram.com/NormalDistributionFunction.html // Create a new Log-normal distribution with μ = 2.79 and σ = 1.10 var log = new LognormalDistribution(location: 0.42, shape: 1.1); // Common measures double mean = log.Mean; // 2.7870954605658511 double median = log.Median; // 1.5219615583481305 double var = log.Variance; // 18.28163603621158 // Cumulative distribution functions double cdf = log.DistributionFunction(x: 0.27); // 0.057961222885664958 double ccdf = log.ComplementaryDistributionFunction(x: 0.27); // 0.942038777114335 double icdf = log.InverseDistributionFunction(p: cdf); // 0.26999997937815973 // Probability density functions double pdf = log.ProbabilityDensityFunction(x: 0.27); // 0.39035530085982068 double lpdf = log.LogProbabilityDensityFunction(x: 0.27); // -0.94069792674674835 // Hazard (failure rate) functions double hf = log.HazardFunction(x: 0.27); // 0.41437285846720867 double chf = log.CumulativeHazardFunction(x: 0.27); // 0.059708840588116374 // String representation string str = log.ToString("N2", CultureInfo.InvariantCulture); // Lognormal(x; μ = 2.79, σ = 1.10) Constructs a Log-Normal (Galton) distribution with zero location and unit shape. Constructs a Log-Normal (Galton) distribution with given location and unit shape. The distribution's location value μ (mu). Constructs a Log-Normal (Galton) distribution with given location and shape. The distribution's location value μ (mu). The distribution's shape parameter σ (sigma). Shape parameter σ (sigma) of the log-normal distribution. Squared shape parameter σ² (sigma-squared) of the log-normal distribution. Location parameter μ (mu) of the log-normal distribution. Gets the Mean for this Log-Normal distribution. The Lognormal distribution's mean is defined as exp(μ + σ²/2). Gets the Variance (the square of the standard deviation) for this Log-Normal distribution. The Lognormal distribution's variance is defined as (exp(σ²) - 1) * exp(2*μ + σ²). Gets the mode for this distribution. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the Entropy for this Log-Normal distribution. Gets the cumulative distribution function (cdf) for this Log-Normal distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The calculation is computed through the relationship to the error function as erfc(-z/sqrt(2)) / 2. See [Weisstein] for more details. References: Weisstein, Eric W. "Normal Distribution Function." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/NormalDistributionFunction.html See . Gets the probability density function (pdf) for this Log-Normal distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . 
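Because a log-normal variable is, by definition, one whose logarithm is normally distributed, the CDF described above can be cross-checked against a plain Normal CDF evaluated at log(x). The sketch below is illustrative only and is not part of the original documentation; it assumes the NormalDistribution class shown elsewhere in this document.
// X ~ Lognormal(location μ = 0.42, shape σ = 1.1) implies log(X) ~ N(0.42, 1.1)
var log = new LognormalDistribution(location: 0.42, shape: 1.1);
var normal = new NormalDistribution(0.42, 1.1);
double a = log.DistributionFunction(x: 0.27);            // P(X <= 0.27)
double b = normal.DistributionFunction(Math.Log(0.27));  // P(log X <= log 0.27)
// a and b should agree up to numerical precision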
Gets the Standard Log-Normal Distribution, with location set to zero and unit shape. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimates a new Log-Normal distribution from a given set of observations. Estimates a new Log-Normal distribution from a given set of observations. Estimates a new Log-Normal distribution from a given set of observations. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random observation from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. A random double value sampled from the specified Lognormal distribution. Generates a random vector of observations from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. The number of samples to generate. An array of double values sampled from the specified Lognormal distribution. Generates a random vector of observations from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Lognormal distribution. Generates a random observation from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Lognormal distribution. Generates a random vector of observations from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Lognormal distribution. Generates a random vector of observations from the Lognormal distribution with the given parameters. The distribution's location value. The distribution's shape deviation. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . 
An array of double values sampled from the specified Lognormal distribution. Returns a that represents this instance. A that represents this instance. Empirical distribution. Empirical distributions are based solely on the data. This class uses the empirical distribution function and the Gaussian kernel density estimation to provide an univariate continuous distribution implementation which depends only on sampled data. References: Wikipedia, The Free Encyclopedia. Empirical Distribution Function. Available on: http://en.wikipedia.org/wiki/Empirical_distribution_function PlanetMath. Empirical Distribution Function. Available on: http://planetmath.org/encyclopedia/EmpiricalDistributionFunction.html Wikipedia, The Free Encyclopedia. Kernel Density Estimation. Available on: http://en.wikipedia.org/wiki/Kernel_density_estimation Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. The following example shows how to build an empirical distribution directly from a sample: // Consider the following univariate samples double[] samples = { 5, 5, 1, 4, 1, 2, 2, 3, 3, 3, 4, 3, 3, 3, 4, 3, 2, 3 }; // Create a non-parametric, empirical distribution using those samples: EmpiricalDistribution distribution = new EmpiricalDistribution(samples); // Common measures double mean = distribution.Mean; // 3 double median = distribution.Median; // 2.9999993064186787 double var = distribution.Variance; // 1.2941176470588236 // Cumulative distribution function double cdf = distribution.DistributionFunction(x: 4.2); // 0.88888888888888884 double ccdf = distribution.ComplementaryDistributionFunction(x: 4.2); //0.11111111111111116 double icdf = distribution.InverseDistributionFunction(p: cdf); // 4.1999999999999993 // Probability density functions double pdf = distribution.ProbabilityDensityFunction(x: 4.2); // 0.15552784414141974 double lpdf = distribution.LogProbabilityDensityFunction(x: 4.2); // -1.8609305013898356 // Hazard (failure rate) functions double hf = distribution.HazardFunction(x: 4.2); // 1.3997505972727771 double chf = distribution.CumulativeHazardFunction(x: 4.2); // 2.1972245773362191 // Automatically estimated smooth parameter (gamma) double smoothing = distribution.Smoothing; // 1.9144923416414432 // String representation string str = distribution.ToString(CultureInfo.InvariantCulture); // Fn(x; S) Creates a new Empirical Distribution from the data samples. The data samples. Creates a new Empirical Distribution from the data samples. The data samples. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The data samples. The fractional weights to use for the samples. The weights must sum up to one. Creates a new Empirical Distribution from the data samples. The data samples. The number of repetition counts for each sample. Creates a new Empirical Distribution from the data samples. The data samples. The fractional weights to use for the samples. The weights must sum up to one. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. Creates a new Empirical Distribution from the data samples. The data samples. The kernel smoothing or bandwidth to be used in density estimation. By default, the normal distribution approximation will be used. The number of repetition counts for each sample. Gets the samples giving this empirical distribution. 
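The weighted and smoothed constructor overloads listed above can be combined as in the following sketch. This is illustrative only and not part of the original documentation; it assumes the (samples, weights, smoothing) overload behaves as summarized above, with fractional weights that sum up to one.
double[] samples = { 5, 5, 1, 4, 1, 2, 2, 3, 3, 3, 4, 3, 3, 3, 4, 3, 2, 3 };
// Give every sample the same fractional weight (the weights must sum up to one):
double[] weights = new double[samples.Length];
for (int i = 0; i < weights.Length; i++)
    weights[i] = 1.0 / samples.Length;
// Use an explicit kernel bandwidth instead of the automatic estimate:
var distribution = new EmpiricalDistribution(samples, weights, 0.5);
double pdf = distribution.ProbabilityDensityFunction(x: 3.0);
double smoothing = distribution.Smoothing; // 0.5, as specified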
Gets the fractional weights associated with each sample. Note that changing values on this array will not have any effect on this distribution. The distribution must be computed from scratch with new values in case new weights need to be used. Gets the repetition counts associated with each sample. Note that changing values on this array will not have any effect on this distribution. The distribution must be computed from scratch with new values in case new weights need to be used. Gets the total number of samples in this distribution. Gets the bandwidth smoothing parameter used in the kernel density estimation. Gets the mean for this distribution. See . Gets the mode for this distribution. The distribution's mode value. Gets the variance for this distribution. See . Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. See . Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Gets the default estimate of the smoothing parameter. This method is based on the practical estimation of the bandwidth as suggested in Wikipedia: http://en.wikipedia.org/wiki/Kernel_density_estimation The observations for the empirical distribution. An estimate of the smoothing parameter. Gets the default estimate of the smoothing parameter. This method is based on the practical estimation of the bandwidth as suggested in Wikipedia: http://en.wikipedia.org/wiki/Kernel_density_estimation The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. 
An estimate of the smoothing parameter. Gets the default estimate of the smoothing parameter. This method is based on the practical estimation of the bandwidth as suggested in Wikipedia: http://en.wikipedia.org/wiki/Kernel_density_estimation The observations for the empirical distribution. The number of times each sample should be repeated. An estimate of the smoothing parameter. Gets the default estimate of the smoothing parameter. This method is based on the practical estimation of the bandwidth as suggested in Wikipedia: http://en.wikipedia.org/wiki/Kernel_density_estimation The observations for the empirical distribution. The fractional importance for each sample. Those values must sum up to one. The number of times each sample should be repeated. An estimate of the smoothing parameter. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. F (Fisher-Snedecor) distribution. In probability theory and statistics, the F-distribution is a continuous probability distribution. It is also known as Snedecor's F distribution or the Fisher-Snedecor distribution (after R.A. Fisher and George W. Snedecor). The F-distribution arises frequently as the null distribution of a test statistic, most notably in the analysis of variance; see . References: Wikipedia, The Free Encyclopedia. F-distribution. Available on: http://en.wikipedia.org/wiki/F-distribution The following example shows how to construct a Fisher-Snedecor F-distribution with 8 and 5 degrees of freedom, respectively. // Create a Fisher-Snedecor's F distribution with 8 and 5 d.f. FDistribution F = new FDistribution(degrees1: 8, degrees2: 5); // Common measures double mean = F.Mean; // 1.6666666666666667 double median = F.Median; // 1.0545096252132447 double var = F.Variance; // 7.6388888888888893 // Cumulative distribution functions double cdf = F.DistributionFunction(x: 0.27); // 0.049463408057268315 double ccdf = F.ComplementaryDistributionFunction(x: 0.27); // 0.95053659194273166 double icdf = F.InverseDistributionFunction(p: cdf); // 0.27 // Probability density functions double pdf = F.ProbabilityDensityFunction(x: 0.27); // 0.45120469723580559 double lpdf = F.LogProbabilityDensityFunction(x: 0.27); // -0.79583416831212883 // Hazard (failure rate) functions double hf = F.HazardFunction(x: 0.27); // 0.47468419528555084 double chf = F.CumulativeHazardFunction(x: 0.27); // 0.050728620222091653 // String representation string str = F.ToString(CultureInfo.InvariantCulture); // F(x; df1 = 8, df2 = 5) Constructs an F-distribution with the given degrees of freedom. Constructs an F-distribution with the given degrees of freedom. The first degree of freedom. Default is 1. The second degree of freedom. Default is 1. Gets the first degree of freedom. Gets the second degree of freedom. Gets the mean for this distribution. Gets the variance for this distribution. Gets the mode for this distribution. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. 
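Since the F-distribution commonly appears as the null distribution of a test statistic, a typical use of the complementary CDF described below is to obtain an upper-tail p-value. The following is a small, hypothetical sketch, not part of the original documentation, reusing the same FDistribution shown in the example above.
// Upper-tail p-value of a hypothetical observed F statistic
// with (8, 5) degrees of freedom:
var F = new FDistribution(degrees1: 8, degrees2: 5);
double fStatistic = 3.2; // hypothetical observed value
double pValue = F.ComplementaryDistributionFunction(fStatistic); // P(F >= 3.2)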
Gets the cumulative distribution function (cdf) for the F-distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The F-distribution CDF is computed in terms of the Incomplete Beta function Ix(a,b) as CDF(x) = Iu(d1/2, d2/2) in which u is given as u = (d1 * x) / (d1 * x + d2). Gets the complementary cumulative distribution function evaluated at point x. The F-distribution complementary CDF is computed in terms of the Incomplete Beta function Ix(a,b) as CDFc(x) = Iu(d2/2, d1/2) in which u is given as u = (d2 * x) / (d2 * x + d1). Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Gets the probability density function (pdf) for the F-distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Not available. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Generates a random vector of observations from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. The number of samples to generate. An array of double values sampled from the specified F-distribution. Generates a random vector of observations from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified F-distribution. Generates a random observation from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. A random double value sampled from the specified F-distribution. Generates a random vector of observations from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified F-distribution. Generates a random vector of observations from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. 
Default is to use . An array of double values sampled from the specified F-distribution. Generates a random observation from the F-distribution with the given parameters. The first degree of freedom. The second degree of freedom. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified F-distribution. Returns a that represents this instance. A that represents this instance. Outcome status for survival methods. A sample can enter the experiment, exit the experiment while still alive, or exit the experiment due to failure. Observation started. The observation was left censored before the current time and has now entered the experiment. This is equivalent to R's censoring code -1. Failure happened. This is equivalent to R's censoring code 1. The sample was right-censored. This is equivalent to R's censoring code 0. Estimators for estimating parameters of Hazard distributions. Breslow-Nelson-Aalen estimator (default). Kaplan-Meier estimator. Methods for handling ties in hazard/survival estimation algorithms. Efron's method for ties (default). Breslow's method for ties. Estimators for Survival distribution functions. Fleming-Harrington estimator (default). Kaplan-Meier estimator. Empirical Hazard Distribution. The Empirical Hazard (or Survival) Distribution can be used as an estimate of the true survival function for a dataset, without relying on distribution or model assumptions about the data. The most direct use for this class is in Survival Analysis, such as when using or creating Cox's Proportional Hazards models. References: http://www.statsdirect.com/help/default.htm#survival_analysis/kaplan_meier.htm The following example shows how to construct an empirical hazards function from a set of hazard values at the given time instants. // Consider the following observations, occurring at the given time steps double[] times = { 11, 10, 9, 8, 6, 5, 4, 2 }; double[] values = { 0.22, 0.67, 1.00, 0.18, 1.00, 1.00, 1.00, 0.55 }; // Create a new empirical distribution function given the observations and event times EmpiricalHazardDistribution distribution = new EmpiricalHazardDistribution(times, values); // Common measures double mean = distribution.Mean; // 2.1994135014183138 double median = distribution.Median; // 3.9999999151458066 double var = distribution.Variance; // 4.2044065839577112 // Cumulative distribution functions double cdf = distribution.DistributionFunction(x: 4.2); // 0.7877520261732569 double ccdf = distribution.ComplementaryDistributionFunction(x: 4.2); // 0.21224797382674304 double icdf = distribution.InverseDistributionFunction(p: cdf); // 4.3304819115496436 // Probability density functions double pdf = distribution.ProbabilityDensityFunction(x: 4.2); // 0.21224797382674304 double lpdf = distribution.LogProbabilityDensityFunction(x: 4.2); // -1.55 // Hazard (failure rate) functions double hf = distribution.HazardFunction(x: 4.2); // 1.0 double chf = distribution.CumulativeHazardFunction(x: 4.2); // 1.55 // String representation string str = distribution.ToString(); // H(x; v, t) Gets the time steps of the hazard density values. Gets the hazard rate values at each time step. Gets the survival values at each time step. Gets the survival function estimator being used in this distribution. Initializes a new instance of the class. Initializes a new instance of the class. The time steps. The hazard rates at the time steps. Initializes a new instance of the class. The time steps. 
The hazard rates at the time steps. The survival function estimator to be used. Default is Initializes a new instance of the class. The survival function estimator to be used. Default is Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. This method is not supported. This method is not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. In the Empirical Hazard Distribution, this function is computed using the Fleming-Harrington estimator. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. In the Empirical Hazard Distribution, the PDF is defined as the product of the hazard function h(x) and survival function S(x), as PDF(x) = h(x) * S(x). Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Sorts time-censored events considering their time of occurrence and the type of event. Events are first sorted in decreased order of occurrence, and then with failures coming before censoring. 
The time of occurrence for the event. The outcome at the time of event (failure or censored). The indices of the new sorting. Sorts time-censored events considering their time of occurrence and the type of event. Events are first sorted in decreased order of occurrence, and then with failures coming before censoring. The time of occurrence for the event. The outcome at the time of event (failure or censored). The input vector associated with the event. The indices of the new sorting. Sorts time-censored events considering their time of occurrence and the type of event. Events are first sorted in decreased order of occurrence, and then with failures coming before censoring. The time of occurrence for the event. The outcome at the time of event (failure or censored). The weights associated with each event. The indices of the new sorting. Returns a that represents this instance. The format. The format provider. A that represents this instance. Estimates an Empirical Hazards distribution considering event times and the outcome of the observed sample at the time of event, plus additional parameters for the hazard estimation. The time of occurrence for the event. The weights associated with each event. The hazard estimator to use. Default is . The survival estimator to use. Default is . The method for handling event ties. Default is . The estimated from the given data. Estimates an Empirical Hazards distribution considering event times and the outcome of the observed sample at the time of event, plus additional parameters for the hazard estimation. The time of occurrence for the event. The outcome at the time of event (failure or censored). The weights associated with each event. The hazard estimator to use. Default is . The survival estimator to use. Default is . The method for handling event ties. Default is . The estimated from the given data. Estimates an Empirical Hazards distribution considering event times and the outcome of the observed sample at the time of event, plus additional parameters for the hazard estimation. The time of occurrence for the event. The outcome at the time of event (failure or censored). The weights associated with each event. The hazard estimator to use. Default is . The survival estimator to use. Default is . The method for handling event ties. Default is . The estimated from the given data. Gompertz distribution. The Gompertz distribution is a continuous probability distribution. The Gompertz distribution is often applied to describe the distribution of adult lifespans by demographers and actuaries. Related fields of science such as biology and gerontology also considered the Gompertz distribution for the analysis of survival. More recently, computer scientists have also started to model the failure rates of computer codes by the Gompertz distribution. In marketing science, it has been used as an individual-level model of customer lifetime. References: Wikipedia, The Free Encyclopedia. Gompertz distribution. Available on: http://en.wikipedia.org/wiki/Gompertz_distribution The following example shows how to construct a Gompertz distribution with η = 4.2 and b = 1.1. 
// Create a new Gompertz distribution with η = 4.2 and b = 1.1 GompertzDistribution dist = new GompertzDistribution(eta: 4.2, b: 1.1); // Common measures double median = dist.Median; // 0.13886469671401389 // Cumulative distribution functions double cdf = dist.DistributionFunction(x: 0.27); // 0.76599768199799145 double ccdf = dist.ComplementaryDistributionFunction(x: 0.27); // 0.23400231800200855 double icdf = dist.InverseDistributionFunction(p: cdf); // 0.26999999999766749 // Probability density functions double pdf = dist.ProbabilityDensityFunction(x: 0.27); // 1.4549484164912097 double lpdf = dist.LogProbabilityDensityFunction(x: 0.27); // 0.37497044741163688 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 0.27); // 6.2176666834502088 double chf = dist.CumulativeHazardFunction(x: 0.27); // 1.4524242576820101 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "Gompertz(x; η = 4.2, b = 1.1)" Initializes a new instance of the class. The shape parameter η. The scale parameter b. Not supported. Not supported. Gets the mode for this distribution. The distribution's mode value. Gets the median for this distribution. The distribution's median value. Not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Generalized Pareto distribution (three parameters). In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location μ, scale σ, and shape ξ. Sometimes it is specified by only scale and shape and sometimes only by its shape parameter. Some references give the shape parameter as κ = − ξ. References: Wikipedia, The Free Encyclopedia. Generalized Pareto distribution. Available from: https://en.wikipedia.org/wiki/Generalized_Pareto_distribution Initializes a new instance of the class. Initializes a new instance of the class. The location parameter μ (mu). Default is 0. The scale parameter σ (sigma). Must be > 0. Default is 1. The shape parameter ξ (Xi). Default is 2. Gets the scale parameter σ (sigma). Gets shape parameter ξ (Xi). Gets the location parameter μ (mu). Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mean for this distribution. The distribution's mean value. Gets the median for this distribution. The distribution's median value. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. 
A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. System.Double. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observation drawn from this distribution. Mixture of univariate probability distributions. A mixture density is a probability density function which is expressed as a convex combination (i.e. a weighted sum, with non-negative weights that sum to 1) of other probability density functions. The individual density functions that are combined to make the mixture density are called the mixture components, and the weights associated with each component are called the mixture weights. References: Wikipedia, The Free Encyclopedia. Mixture density. Available on: http://en.wikipedia.org/wiki/Mixture_density The type of the univariate component distributions. 
// Create a new mixture containing two Normal distributions Mixture<NormalDistribution> mix = new Mixture<NormalDistribution>( new NormalDistribution(2, 1), new NormalDistribution(5, 1)); // Common measures double mean = mix.Mean; // 3.5 double median = mix.Median; // 3.4999998506015895 double var = mix.Variance; // 3.25 // Cumulative distribution functions double cdf = mix.DistributionFunction(x: 4.2); // 0.59897597553494908 double ccdf = mix.ComplementaryDistributionFunction(x: 4.2); // 0.40102402446505092 // Probability density functions double pdf1 = mix.ProbabilityDensityFunction(x: 1.2); // 0.14499174984363708 double pdf2 = mix.ProbabilityDensityFunction(x: 2.3); // 0.19590437513747333 double pdf3 = mix.ProbabilityDensityFunction(x: 3.7); // 0.13270883471234715 double lpdf = mix.LogProbabilityDensityFunction(x: 4.2); // -1.8165661905848629 // Quantile function double icdf1 = mix.InverseDistributionFunction(p: 0.17); // 1.5866611690305095 double icdf2 = mix.InverseDistributionFunction(p: 0.46); // 3.1968506765456883 double icdf3 = mix.InverseDistributionFunction(p: 0.87); // 5.6437596300843076 // Hazard (failure rate) functions double hf = mix.HazardFunction(x: 4.2); // 0.40541978256972522 double chf = mix.CumulativeHazardFunction(x: 4.2); // 0.91373394208601633 // String representation: // Mixture(x; 0.5 * N(x; μ = 2, σ² = 1) + 0.5 * N(x; μ = 5, σ² = 1)) string str = mix.ToString(CultureInfo.InvariantCulture); The following example shows how to estimate (fit) a Mixture of Normal distributions from weighted data: // Randomly initialize some mixture components NormalDistribution[] components = new NormalDistribution[2]; components[0] = new NormalDistribution(2, 1); components[1] = new NormalDistribution(5, 1); // Create an initial mixture var mixture = new Mixture<NormalDistribution>(components); // Now, suppose we have a weighted data // set. Those will be the input points: double[] points = { 0, 3, 1, 7, 3, 5, 1, 2, -1, 2, 7, 6, 8, 6 }; // (14 points) // And those are their respective unnormalized weights: double[] weights = { 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 2, 3, 1, 1 }; // (14 weights) // Let's normalize the weights so they sum up to one: weights = weights.Divide(weights.Sum()); // Now we can fit our model to the data: mixture.Fit(points, weights); // done! // Our model will be: double mean1 = mixture.Components[0].Mean; // 1.41126 double mean2 = mixture.Components[1].Mean; // 6.53301 // With mixture weights double pi1 = mixture.Coefficients[0]; // 0.51408 double pi2 = mixture.Coefficients[1]; // 0.48591 // If we need the GaussianMixtureModel functionality, we can // use the estimated mixture to initialize a new model: GaussianMixtureModel gmm = new GaussianMixtureModel(mixture); mean1 = gmm.Gaussians[0].Mean[0]; // 1.41126 (same) mean2 = gmm.Gaussians[1].Mean[0]; // 6.53301 (same) pi1 = gmm.Gaussians[0].Proportion; // 0.51408 (same) pi2 = gmm.Gaussians[1].Proportion; // 0.48591 (same) Initializes a new instance of the class. The mixture distribution components. Initializes a new instance of the class. The mixture weight coefficients. The mixture distribution components. Gets the mixture components. Gets the weight coefficients. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. 
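As a convex combination, the mixture density evaluated by the method above equals the coefficient-weighted sum of the component densities. The sketch below is illustrative only and not part of the original documentation; it verifies this relation using the Components and Coefficients properties described here.
// Two-component Normal mixture with equal weights:
var mix = new Mixture<NormalDistribution>(
    new NormalDistribution(2, 1),
    new NormalDistribution(5, 1));
double x = 4.2;
double direct = mix.ProbabilityDensityFunction(x);
double manual = mix.Coefficients[0] * mix.Components[0].ProbabilityDensityFunction(x)
              + mix.Coefficients[1] * mix.Components[1].ProbabilityDensityFunction(x);
// direct and manual should agree up to numerical precision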
Gets the probability density function (pdf) for one of the component distributions evaluated at point x. The index of the desired component distribution. A single point in the distribution range. The probability of x occurring in the component distribution, computed as the PDF of the component distribution times its mixture coefficient. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for one of the component distributions evaluated at point x. The index of the desired component distribution. A single point in the distribution range. The logarithm of the probability of x occurring in the component distribution, computed as the PDF of the component distribution times its mixture coefficient. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for one component of this distribution evaluated at point x. The component distribution's index. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Computes the log-likelihood of the distribution for a given set of observations. Computes the log-likelihood of the distribution for a given set of observations. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gets the mean for this distribution. Gets the variance for this distribution. References: Lidija Trailovic and Lucy Y. Pao, Variance Estimation and Ranking of Gaussian Mixture Distributions in Target Tracking Applications, Department of Electrical and Computer Engineering This method is not supported. This method is not supported. Gets the support interval for this distribution. A containing the support interval for this distribution. Estimates a new mixture model from a given set of observations. A set of observations. The initial components of the mixture model. Returns a new Mixture fitted to the given observations. 
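As a rough sketch of the per-component overloads and log-likelihood members described above (the overload parameter order and the LogLikelihood method name are assumptions based on the descriptions in this section, not verified signatures):

using System;
using Accord.Statistics.Distributions.Univariate;

class MixtureComponentSketch
{
    static void Main()
    {
        var mix = new Mixture<NormalDistribution>(
            new NormalDistribution(2, 1),
            new NormalDistribution(5, 1));

        double x = 4.2;

        // Density contributed by the second component (index 1); as stated above,
        // this should equal the component's PDF times its mixture coefficient:
        double byIndex = mix.ProbabilityDensityFunction(1, x);
        double byHand = mix.Coefficients[1] * mix.Components[1].ProbabilityDensityFunction(x);
        Console.WriteLine("{0} ≈ {1}", byIndex, byHand);

        // The log-likelihood of an i.i.d. sample is the sum of the per-observation
        // log-densities, so the two quantities below should match:
        double[] data = { 1.0, 2.5, 4.0, 5.5 };
        double sum = 0;
        foreach (double v in data)
            sum += mix.LogProbabilityDensityFunction(v);
        Console.WriteLine("{0} ≈ {1}", mix.LogLikelihood(data), sum);
    }
}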
Estimates a new mixture model from a given set of observations. A set of observations. The initial mixture coefficients. The initial components of the mixture model. Returns a new Mixture fitted to the given observations. Estimates a new mixture model from a given set of observations. A set of observations. The initial mixture coefficients. The convergence threshold for the Expectation-Maximization estimation. The initial components of the mixture model. Returns a new Mixture fitted to the given observations. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observation drawn from this distribution. Returns a that represents this instance. A that represents this instance. Univariate general discrete distribution, also referred to as the Categorical distribution. A univariate categorical distribution is a statistical distribution whose variables can take on only discrete values. Each discrete value defined within the interval of the distribution has an associated probability value indicating its frequency of occurrence. The discrete uniform distribution is a special case of a generic discrete distribution whose probability values are constant. // Create a Categorical distribution for 3 symbols, in which // the first and second symbol have 25% chance of appearing, // and the third symbol has 50% chance of appearing. // 1st 2nd 3rd double[] probabilities = { 0.25, 0.25, 0.50 }; // Create the categorical with the given probabilities var dist = new GeneralDiscreteDistribution(probabilities); // Common measures double mean = dist.Mean; // 1.25 double median = dist.Median; // 1.00 double var = dist.Variance; // 0.6875 // Cumulative distribution functions double cdf = dist.DistributionFunction(k: 2); // 1.0 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.0 // Probability mass functions double pdf1 = dist.ProbabilityMassFunction(k: 0); // 0.25 double pdf2 = dist.ProbabilityMassFunction(k: 1); // 0.25 double pdf3 = dist.ProbabilityMassFunction(k: 2); // 0.50 double lpdf = dist.LogProbabilityMassFunction(k: 2); // -0.69314718055994529 // Quantile function int icdf1 = dist.InverseDistributionFunction(p: 0.17); // 0 int icdf2 = dist.InverseDistributionFunction(p: 0.39); // 1 int icdf3 = dist.InverseDistributionFunction(p: 0.56); // 2 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 0); // 0.33333333333333331 double chf = dist.CumulativeHazardFunction(x: 0); // 0.2876820724517809 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "Categorical(x; p = { 0.25, 0.25, 0.5 })" Constructs a new generic discrete distribution. The frequency of occurrence for each integer value in the distribution. The distribution is assumed to begin in the interval defined by start up to size of this vector. True if the distribution should be represented using logarithms; false otherwise. Constructs a new generic discrete distribution. The integer value where the distribution starts, also known as the offset value. Default value is 0. The frequency of occurrence for each integer value in the distribution. The distribution is assumed to begin in the interval defined by start up to size of this vector. Constructs a new uniform discrete distribution. 
The integer value where the distribution starts, also known as the offset value. Default value is 0. The number of discrete values within the distribution. The distribution is assumed to belong to the interval [start, start + symbols]. True if the distribution should be represented using logarithms; false otherwise. Constructs a new generic discrete distribution. The frequency of occurrence for each integer value in the distribution. The distribution is assumed to begin in the interval defined by start up to size of this vector. Constructs a new uniform discrete distribution. The number of discrete values within the distribution. The distribution is assumed to belong to the interval [start, start + symbols]. True if the distribution should be represented using logarithms; false otherwise. Constructs a new uniform discrete distribution. The integer value where the distribution starts, also known as a. Default value is 0. The integer value where the distribution ends, also known as b. Gets the probability value associated with the symbol . The symbol's index. The probability of the given symbol. Gets the integer value where the discrete distribution starts. Gets the integer value where the discrete distribution ends. Gets the number of symbols in the distribution. Gets the probabilities associated with each discrete variable value. Note: if the frequencies in this property are manually changed, the rest of the class properties (Mode, Mean, ...) will not be automatically updated to reflect the actual inserted values. Gets the mean for this distribution. Gets the variance for this distribution. Gets the mode for this distribution. Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value k will occur. The probability of k occurring in the current distribution. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value x will occur. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). 
The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a random sample within the given symbol probabilities. The probabilities for the discrete symbols. The number of samples to generate. A random sample within the given probabilities. Returns a random sample within the given symbol probabilities. The probabilities for the discrete symbols. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random sample within the given probabilities. Returns a random sample within the given symbol probabilities. The probabilities for the discrete symbols. The number of samples to generate. The location where to store the samples. Pass true if the vector contains log-probabilities instead of standard probabilities. A random sample within the given probabilities. Returns a random sample within the given symbol probabilities. The probabilities for the discrete symbols. The number of samples to generate. The location where to store the samples. Pass true if the vector contains log-probabilities instead of standard probabilities. The random number generator to use as a source of randomness. Default is to use . A random sample within the given probabilities. Returns a random symbol within the given symbol probabilities. The probabilities for the discrete symbols. Pass true if the vector contains log-probabilities instead of standard probabilities. A random symbol within the given probabilities. Returns a random symbol within the given symbol probabilities. The probabilities for the discrete symbols. Pass true if the vector contains log-probabilities instead of standard probabilities. The random number generator to use as a source of randomness. Default is to use . A random symbol within the given probabilities. Returns a that represents this instance. A that represents this instance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. 
Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Creates general discrete distributions given a matrix of symbol probabilities. Creates general discrete distributions given a matrix of symbol probabilities. Abstract class for univariate continuous probability distributions. A probability distribution identifies either the probability of each value of an unidentified random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. The function describing the probability that a given value will occur is called the probability function (or probability density function, abbreviated PDF), and the function describing the cumulative probability that a given value or any value smaller than it will occur is called the distribution function (or cumulative distribution function, abbreviated CDF). References: Wikipedia, The Free Encyclopedia. Probability distribution. Available on: http://en.wikipedia.org/wiki/Probability_distribution Weisstein, Eric W. "Statistical Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalDistribution.html Constructs a new UnivariateDistribution class. Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the entropy for this distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the mode for this distribution. The distribution's mode value. Gets the Quartiles for this distribution. A object containing the first quartile (Q1) as its minimum value, and the third quartile (Q3) as the maximum. Gets the distribution range within a given percentile. If 0.25 is passed as the argument, this function returns the same as the function. The percentile at which the distribution ranges will be returned. A object containing the minimum value for the distribution value, and the third quartile (Q3) as the maximum. Gets the median for this distribution. The distribution's median value. Gets the Standard Deviation (the square root of the variance) for the current distribution. The distribution's standard deviation. 
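A brief sketch (not part of the original documentation) of how the descriptive measures documented above might be queried on a concrete subclass, here the NormalDistribution covered later in this section. The DoubleRange return type and its Min/Max members are assumptions; the quartile values shown are the usual ±0.674 for a standard normal.

using System;
using Accord;                                       // DoubleRange (assumed)
using Accord.Statistics.Distributions.Univariate;

class UnivariateMeasuresSketch
{
    static void Main()
    {
        var d = new NormalDistribution(mean: 0, stdDev: 1);

        Console.WriteLine(d.Median);            // 0 for the standard normal
        Console.WriteLine(d.StandardDeviation); // 1, the square root of the variance

        // First (Q1) and third (Q3) quartiles:
        DoubleRange quartiles = d.Quartiles;
        Console.WriteLine("{0} .. {1}", quartiles.Min, quartiles.Max); // ≈ -0.674 .. 0.674

        // As noted above, GetRange(0.25) should return the same interval as Quartiles:
        DoubleRange range = d.GetRange(0.25);
        Console.WriteLine("{0} .. {1}", range.Min, range.Max);
    }
}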
Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. 
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for this distribution in the semi-closed interval (a; b] given as P(a < X ≤ b). The start of the semi-closed interval (a; b]. The end of the semi-closed interval (a; b]. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. 
A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could originate the given probability value when applied in the . Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Density Function (PDF) describes the probability that a given value x will occur. The logarithm of the probability of x occurring in the current distribution. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the log of the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the cumulative hazard function H(x) evaluated at x in the current distribution. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). 
Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). 
The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. The following example shows how to fit a using the method. However, any other kind of distribution could be fit in exactly the same way. Please consider the code below as an example only: If you would like further examples, please take a look at the documentation page for the distribution you would like to fit for more details and configuration options you would like or need to control. Generates a random vector of observations from the current distribution. The number of samples to generate. A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observation drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observation drawn from this distribution. Normal (Gaussian) distribution. In probability theory, the normal (or Gaussian) distribution is a very commonly occurring continuous probability distribution—a function that tells the probability that any real observation will fall between any two real limits or real numbers, as the curve approaches zero on either side. Normal distributions are extremely important in statistics and are often used in the natural and social sciences for real-valued random variables whose distributions are not known. 
The normal distribution is immensely useful because of the central limit theorem, which states that, under mild conditions, the mean of many random variables independently drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution: physical quantities that are expected to be the sum of many independent processes (such as measurement errors) often have a distribution very close to the normal. Moreover, many results and methods (such as propagation of uncertainty and least squares parameter fitting) can be derived analytically in explicit form when the relevant variables are normally distributed. The Gaussian distribution is sometimes informally called the bell curve. However, many other distributions are bell-shaped (such as Cauchy's, Student's, and logistic). The terms Gaussian function and Gaussian bell curve are also ambiguous because they sometimes refer to multiples of the normal distribution that cannot be directly interpreted in terms of probabilities. The Gaussian is the most widely used distribution for continuous variables. In the case of a single variable, it is governed by two parameters, the mean and the variance. References: Wikipedia, The Free Encyclopedia. Normal distribution. Available on: https://en.wikipedia.org/wiki/Normal_distribution This example shows how to create a Normal distribution, compute some of its properties and generate a number of random samples from it. // Create a normal distribution with mean 2 and sigma 3 var normal = new NormalDistribution(mean: 2, stdDev: 3); // In a normal distribution, the median and // the mode coincide with the mean, so double mean = normal.Mean; // 2 double mode = normal.Mode; // 2 double median = normal.Median; // 2 // The variance is the square of the standard deviation double variance = normal.Variance; // 3² = 9 // Let's check what is the cumulative probability of // a value less than 3 occurring in this distribution: double cdf = normal.DistributionFunction(3); // 0.63055 // Finally, let's generate 1000 samples from this distribution // and check if they have the specified mean and standard devs double[] samples = normal.Generate(1000); double sampleMean = samples.Mean(); // 1.92 double sampleDev = samples.StandardDeviation(); // 3.00 This example further demonstrates how to compute derived measures from a Normal distribution: var normal = new NormalDistribution(mean: 4, stdDev: 4.2); double mean = normal.Mean; // 4.0 double median = normal.Median; // 4.0 double mode = normal.Mode; // 4.0 double var = normal.Variance; // 17.64 double cdf = normal.DistributionFunction(x: 1.4); // 0.26794249453351904 double pdf = normal.ProbabilityDensityFunction(x: 1.4); // 0.078423391448155175 double lpdf = normal.LogProbabilityDensityFunction(x: 1.4); // -2.5456330358182586 double ccdf = normal.ComplementaryDistributionFunction(x: 1.4); // 0.732057505466481 double icdf = normal.InverseDistributionFunction(p: cdf); // 1.4 double hf = normal.HazardFunction(x: 1.4); // 0.10712736480747137 double chf = normal.CumulativeHazardFunction(x: 1.4); // 0.31189620872601354 string str = normal.ToString(CultureInfo.InvariantCulture); // N(x; μ = 4, σ² = 17.64) Constructs a Normal (Gaussian) distribution with zero mean and unit standard deviation. Constructs a Normal (Gaussian) distribution with given mean and unit standard deviation. The distribution's mean value μ (mu). Constructs a Normal (Gaussian) distribution with given mean and standard deviation. 
The distribution's mean value μ (mu). The distribution's standard deviation σ (sigma). Gets the Mean value μ (mu) for this Normal distribution. Gets the median for this distribution. The normal distribution's median value equals its mean value μ. The distribution's median value. Gets the Variance σ² (sigma-squared), which is the square of the standard deviation σ for this Normal distribution. Gets the Standard Deviation σ (sigma), which is the square root of the variance for this Normal distribution. Gets the mode for this distribution. The normal distribution's mode value equals its mean value μ. The distribution's mode value. Gets the skewness for this distribution. In the Normal distribution, this is always 0. Gets the excess kurtosis for this distribution. In the Normal distribution, this is always 0. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the Entropy for this Normal distribution. Gets the cumulative distribution function (cdf) for this Normal distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. The calculation is computed through the relationship to the error function as erfc(-z/sqrt(2)) / 2. References: Weisstein, Eric W. "Normal Distribution." From MathWorld--A Wolfram Web Resource. Available on: http://mathworld.wolfram.com/NormalDistribution.html Wikipedia, The Free Encyclopedia. Normal distribution. Available on: http://en.wikipedia.org/wiki/Normal_distribution#Cumulative_distribution_function See . Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. The Normal distribution's ICDF is defined in terms of the standard normal inverse cumulative distribution function I as ICDF(p) = μ + σ * I(p). See . Gets the probability density function (pdf) for the Normal distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. The Normal distribution's PDF is defined as PDF(x) = c * exp(-((x - μ) / σ)² / 2), where c = 1 / (σ√(2π)). See . Gets the probability density function (pdf) for the Normal distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. See . Gets the Z-Score for a given value. Gets the Standard Gaussian Distribution, with zero mean and unit variance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. 
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Estimates a new Normal distribution from a given set of observations. Estimates a new Normal distribution from a given set of observations. Estimates a new Normal distribution from a given set of observations. Converts this univariate distribution into a 1-dimensional multivariate distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . An observation drawn from this distribution. Generates a single random observation from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). A double value sampled from the specified Normal distribution. Generates a single random observation from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). The random number generator to use as a source of randomness. Default is to use . A double value sampled from the specified Normal distribution. Generates a random vector of observations from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). The number of samples to generate. An array of double values sampled from the specified Normal distribution. Generates a random vector of observations from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Normal distribution. Generates a random vector of observations from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Normal distribution. Generates a random vector of observations from the Normal distribution with the given parameters. The mean value μ (mu). The standard deviation σ (sigma). The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Normal distribution. 
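A short sketch (not part of the original documentation) of the static sampling routines described above, assuming an overload of the form Random(mean, standard deviation, number of samples); the Mean() and StandardDeviation() extension methods are the ones used in the examples earlier in this section.

using System;
using Accord.Statistics;                            // Mean() / StandardDeviation() extensions
using Accord.Statistics.Distributions.Univariate;

class NormalSamplingSketch
{
    static void Main()
    {
        // Draw 10,000 observations from N(μ = 2, σ = 3) without creating
        // a NormalDistribution instance first (overload assumed as documented above):
        double[] samples = NormalDistribution.Random(2, 3, 10000);

        // The empirical moments should be close to the requested parameters:
        Console.WriteLine(samples.Mean());              // ≈ 2
        Console.WriteLine(samples.StandardDeviation()); // ≈ 3
    }
}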
Generates a random vector of observations from the standard Normal distribution (zero mean and unit standard deviation). The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Normal distribution. Generates a random vector of observations from the standard Normal distribution (zero mean and unit standard deviation). The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Normal distribution. Generates a random value from a standard Normal distribution (zero mean and unit standard deviation). Generates a random value from a standard Normal distribution (zero mean and unit standard deviation). Poisson probability distribution. The Poisson distribution is a discrete probability distribution that expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event. References: Wikipedia, The Free Encyclopedia. Poisson distribution. Available on: http://en.wikipedia.org/wiki/Poisson_distribution The following example shows how to instantiate a new Poisson distribution with a given rate λ and how to compute its measures and associated functions. // Create a new Poisson distribution with λ = 4.2 var dist = new PoissonDistribution(lambda: 4.2); // Common measures double mean = dist.Mean; // 4.2 double median = dist.Median; // 4.0 double var = dist.Variance; // 4.2 // Cumulative distribution functions double cdf1 = dist.DistributionFunction(k: 2); // 0.21023798702309743 double cdf2 = dist.DistributionFunction(k: 4); // 0.58982702131057763 double cdf3 = dist.DistributionFunction(k: 7); // 0.93605666027257894 double ccdf = dist.ComplementaryDistributionFunction(k: 2); // 0.78976201297690252 // Probability mass functions double pmf1 = dist.ProbabilityMassFunction(k: 4); // 0.19442365170822165 double pmf2 = dist.ProbabilityMassFunction(k: 5); // 0.1633158674349062 double pmf3 = dist.ProbabilityMassFunction(k: 6); // 0.11432110720443435 double lpmf = dist.LogProbabilityMassFunction(k: 2); // -2.0229781299813 // Quantile function int icdf1 = dist.InverseDistributionFunction(p: cdf1); // 2 int icdf2 = dist.InverseDistributionFunction(p: cdf2); // 4 int icdf3 = dist.InverseDistributionFunction(p: cdf3); // 7 // Hazard (failure rate) functions double hf = dist.HazardFunction(x: 4); // 0.47400404660843515 double chf = dist.CumulativeHazardFunction(x: 4); // 0.89117630901575073 // String representation string str = dist.ToString(CultureInfo.InvariantCulture); // "Poisson(x; λ = 4.2)" This example shows how to call the distribution function to compute different types of probabilities. // Create a new Poisson distribution var dist = new PoissonDistribution(lambda: 4.2); // P(X = 1) = 0.0629814226460064 double equal = dist.ProbabilityMassFunction(k: 1); // P(X < 1) = 0.0149955768204777 double less = dist.DistributionFunction(k: 1, inclusive: false); // P(X ≤ 1) = 0.0779769994664841 double lessThanOrEqual = dist.DistributionFunction(k: 1, inclusive: true); // P(X > 1) = 0.922023000533516 double greater = dist.ComplementaryDistributionFunction(k: 1); // P(X ≥ 1) = 0.985004423179522 double greaterThanOrEqual = dist.ComplementaryDistributionFunction(k: 1, inclusive: true); Creates a new Poisson distribution with λ = 1. Creates a new Poisson distribution with the given λ (lambda). 
The Poisson's λ (lambda) parameter. Default is 1. Gets the Poisson's parameter λ (lambda). Gets the mean for this distribution. Gets the variance for this distribution. Gets the entropy for this distribution. A closed form expression for the entropy of a Poisson distribution is unknown. This property returns an approximation for large lambda. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point k. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability mass function (pmf) for this distribution evaluated at point x. A single point in the distribution range. The Probability Mass Function (PMF) describes the probability that a given value k will occur. The probability of x occurring in the current distribution. Gets the log-probability mass function (pmf) for this distribution evaluated at point k. A single point in the distribution range. The logarithm of the probability of k occurring in the current distribution. The Probability Mass Function (PMF) describes the probability that a given value k will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. A random observations drawn from this distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Gets the standard Poisson distribution, with lambda (rate) equal to 1. von-Mises (Circular Normal) distribution. The von Mises distribution (also known as the circular normal distribution or Tikhonov distribution) is a continuous probability distribution on the circle. It may be thought of as a close approximation to the wrapped normal distribution, which is the circular analogue of the normal distribution. The wrapped normal distribution describes the distribution of an angle that is the result of the addition of many small independent angular deviations, such as target sensing, or grain orientation in a granular material. 
The von Mises distribution is more mathematically tractable than the wrapped normal distribution and is the preferred distribution for many applications. References: Wikipedia, The Free Encyclopedia. Von-Mises distribution. Available on: http://en.wikipedia.org/wiki/Von_Mises_distribution Suvrit Sra, "A short note on parameter approximation for von Mises-Fisher distributions: and a fast implementation of $I_s(x)$". (revision of Apr. 2009). Computational Statistics (2011). Available on: http://www.kyb.mpg.de/publications/attachments/vmfnote_7045%5B0%5D.pdf Zheng Sun. M.Sc. Comparing measures of fit for circular distributions. Master thesis, 2006. Available on: https://dspace.library.uvic.ca:8443/bitstream/handle/1828/2698/zhengsun_master_thesis.pdf // Create a new von-Mises distribution with μ = 0.42 and κ = 1.2 var vonMises = new VonMisesDistribution(mean: 0.42, concentration: 1.2); // Common measures double mean = vonMises.Mean; // 0.42 double median = vonMises.Median; // 0.42 double var = vonMises.Variance; // 0.48721760532782921 // Cumulative distribution functions double cdf = vonMises.DistributionFunction(x: 1.4); // 0.81326928491589345 double ccdf = vonMises.ComplementaryDistributionFunction(x: 1.4); // 0.18673071508410655 double icdf = vonMises.InverseDistributionFunction(p: cdf); // 1.3999999637927665 // Probability density functions double pdf = vonMises.ProbabilityDensityFunction(x: 1.4); // 0.2228112141141676 double lpdf = vonMises.LogProbabilityDensityFunction(x: 1.4); // -1.5014304395467863 // Hazard (failure rate) functions double hf = vonMises.HazardFunction(x: 1.4); // 1.1932220899695576 double chf = vonMises.CumulativeHazardFunction(x: 1.4); // 1.6780877262500649 // String representation string str = vonMises.ToString(CultureInfo.InvariantCulture); // VonMises(x; μ = 0.42, κ = 1.2) Constructs a von-Mises distribution with zero mean and unit concentration. Constructs a von-Mises distribution with zero mean. The concentration value κ (kappa). Default is 1. Constructs a von-Mises distribution. The mean value μ (mu). Default is 0. The concentration value κ (kappa). Default is 1. Gets the mean value μ (mu) for this distribution. Gets the median value μ (mu) for this distribution. Gets the mode value μ (mu) for this distribution. Gets the concentration κ (kappa) for this distribution. Gets the variance for this distribution. The von-Mises Variance is defined in terms of the Bessel function of the first kind In(x) as var = 1 - I(1, κ) / I(0, κ) Gets the entropy for this distribution. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. 
The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new circular uniform distribution by creating a new with zero kappa. The mean value μ (mu). A with zero kappa, which is equivalent to creating an uniform circular distribution. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Returns a that represents this instance. A that represents this instance. Estimates a new von-Mises distribution from a given set of angles. Estimates a new von-Mises distribution from a given set of angles. Estimates a new von-Mises distribution from a given set of angles. von-Mises cumulative distribution function. This method implements the Von-Mises CDF calculation code as given by Geoffrey Hill on his original FORTRAN code and shared under the GNU LGPL license. References: Geoffrey Hill, ACM TOMS Algorithm 518, Incomplete Bessel Function I0: The von Mises Distribution, ACM Transactions on Mathematical Software, Volume 3, Number 3, September 1977, pages 279-284. The point where to calculate the CDF. The location parameter μ (mu). The concentration parameter κ (kappa). The value of the von-Mises CDF at point . Weibull distribution. In probability theory and statistics, the Weibull distribution is a continuous probability distribution. It is named after Waloddi Weibull, who described it in detail in 1951, although it was first identified by Fréchet (1927) and first applied by Rosin and Rammler (1933) to describe a particle size distribution. The Weibull distribution is related to a number of other probability distributions; in particular, it interpolates between the exponential distribution (for k = 1) and the Rayleigh distribution (when k = 2). If the quantity x is a "time-to-failure", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, k, is that power plus one, and so this parameter can be interpreted directly as follows: A value of k < 1 indicates that the failure rate decreases over time. This happens if there is significant "infant mortality", or defective items failing early and the failure rate decreasing over time as the defective items are weeded out of the population. A value of k = 1 indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure. A value of k > 1 indicates that the failure rate increases with time. 
This happens if there is an "aging" process, or parts that are more likely to fail as time goes on. In the field of materials science, the shape parameter k of a distribution of strengths is known as the Weibull modulus. References: Wikipedia, The Free Encyclopedia. Weibull distribution. Available on: http://en.wikipedia.org/wiki/Weibull_distribution // Create a new Weibull distribution with λ = 0.42 and k = 1.2 var weibull = new WeibullDistribution(scale: 0.42, shape: 1.2); // Common measures double mean = weibull.Mean; // 0.39507546046784414 double median = weibull.Median; // 0.30945951550913292 double var = weibull.Variance; // 0.10932249666369542 double mode = weibull.Mode; // 0.094360430821809421 // Cumulative distribution functions double cdf = weibull.DistributionFunction(x: 1.4); // 0.98560487188700052 double ccdf = weibull.ComplementaryDistributionFunction(x: 1.4); // 0.22369885565908001 double icdf = weibull.InverseDistributionFunction(p: cdf); // 1.400000001051205 // Probability density functions double pdf = weibull.ProbabilityDensityFunction(x: 1.4); // 0.052326687031379278 double lpdf = weibull.LogProbabilityDensityFunction(x: 1.4); // -2.9502487697674415 // Hazard (failure rate) functions double hf = weibull.HazardFunction(x: 1.4); // 1.1093328057258516 double chf = weibull.CumulativeHazardFunction(x: 1.4); // 1.4974545260150962 // String representation string str = weibull.ToString(CultureInfo.InvariantCulture); // Weibull(x; λ = 0.42, k = 1.2) Initializes a new instance of the class. The scale parameter λ (lambda). The shape parameter k. Gets the shape parameter k. The value for this distribution's shape parameter k. Gets the scale parameter λ (lambda). The value for this distribution's scale parameter λ (lambda). Gets the mean for this distribution. The distribution's mean value. Gets the variance for this distribution. The distribution's variance. Gets the median for this distribution. The distribution's median value. Gets the mode for this distribution. The distribution's mode value. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the entropy for this distribution. The distribution's entropy. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. A single point in the distribution range. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. The logarithm of the probability of x occurring in the current distribution. The Probability Density Function (PDF) describes the probability that a given value x will occur. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range.
The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. Gets the inverse of the . The inverse complementary distribution function is also known as the inverse survival Function. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generates a random vector of observations from the current distribution. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . A random vector of observations drawn from this distribution. Generates a random observation from the current distribution. The random number generator to use as a source of randomness. Default is to use . A random observations drawn from this distribution. Generates a random vector of observations from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. The number of samples to generate. An array of double values sampled from the specified Weibull distribution. Generates a random vector of observations from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. The number of samples to generate. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Weibull distribution. Generates a random vector of observations from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. The number of samples to generate. The location where to store the samples. An array of double values sampled from the specified Weibull distribution. Generates a random vector of observations from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. The number of samples to generate. The location where to store the samples. The random number generator to use as a source of randomness. Default is to use . An array of double values sampled from the specified Weibull distribution. Generates a random observation from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. A random double value sampled from the specified Weibull distribution. Generates a random observation from the Weibull distribution with the given parameters. The scale parameter lambda. The shape parameter k. The random number generator to use as a source of randomness. Default is to use . A random double value sampled from the specified Weibull distribution. Returns a that represents this instance. A that represents this instance. Common interface for univariate probability distributions. 
This interface is implemented by both univariate Discrete Distributions and Continuous Distributions. However, unlike , this interface has a generic parameter that allows one to define the type of the distribution values (i.e. ). For Multivariate distributions, see . Gets the mean value for the distribution. The distribution's mean. Gets the variance value for the distribution. The distribution's variance. Gets the median value for the distribution. The distribution's median. Gets the mode value for the distribution. The distribution's mode. Gets the entropy of the distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could yield the given probability value when applied to the . Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Common interface for univariate probability distributions. This interface is implemented by both univariate Discrete Distributions and Continuous Distributions. For Multivariate distributions, see . Gets the mean value for the distribution. The distribution's mean. Gets the variance value for the distribution. The distribution's variance. Gets the median value for the distribution. The distribution's median. Gets the mode value for the distribution. The distribution's mode. Gets the entropy of the distribution. The distribution's entropy. Gets the support interval for this distribution. A containing the support interval for this distribution. Gets the Quartiles for this distribution. A object containing the first quartile (Q1) as its minimum value, and the third quartile (Q3) as the maximum. Gets the distribution range within a given percentile. If 0.25 is passed as the argument, this function returns the same as the function. The percentile at which the distribution ranges will be returned. A object containing the lower bound of the range as its minimum value, and the upper bound as the maximum. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the cumulative distribution function (cdf) for this distribution in the semi-closed interval (a; b] given as P(a < X ≤ b). The start of the semi-closed interval (a; b]. The end of the semi-closed interval (a; b]. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x.
A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the inverse of the cumulative distribution function (icdf) for this distribution evaluated at probability p. This function is also known as the Quantile function. The Inverse Cumulative Distribution Function (ICDF) specifies, for a given probability, the value which the random variable will be at, or below, with that probability. A probability value between 0 and 1. A sample which could original the given probability value when applied in the . Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. A single point in the distribution range. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Gets the hazard function, also known as the failure rate or the conditional failure density function for this distribution evaluated at point x. The hazard function is the ratio of the probability density function f(x) to the survival function, S(x). A single point in the distribution range. The conditional failure density function h(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the cumulative hazard function for this distribution evaluated at point x. A single point in the distribution range. The cumulative hazard function H(x) evaluated at x in the current distribution. Gets the first derivative of the inverse distribution function (icdf) for this distribution evaluated at probability p. A probability value between 0 and 1. Common interface for mixture distributions. The type of the mixture distribution, if either univariate or multivariate. Gets the mixture coefficients (component weights). Gets the mixture components. Base class for statistical distribution implementations. Returns a that represents this instance. A that represents this instance. Returns a that represents this instance. A that represents this instance. Returns a that represents this instance. A that represents this instance. Returns a that represents this instance. The format. The format provider. A that represents this instance. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Common interface for probability distributions. This interface is implemented by all probability distributions in the framework, including s and s. This includes , , , and Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. 
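The hazard and quantile members described above hold for any concrete univariate distribution. The following is a minimal sketch, not part of the original reference, assuming Accord's NormalDistribution from Accord.Statistics.Distributions.Univariate; it checks numerically that the hazard function is the ratio of the density to the survival function, and that the inverse distribution function undoes the CDF.
// A minimal sketch assuming the NormalDistribution class from
// Accord.Statistics.Distributions.Univariate (not from the original text).
var normal = new NormalDistribution(mean: 0, stdDev: 1);
double x = 1.4;
double pdf = normal.ProbabilityDensityFunction(x);          // f(x)
double ccdf = normal.ComplementaryDistributionFunction(x);  // survival function S(x) = 1 - F(x)
double hazard = normal.HazardFunction(x);                   // h(x)
// h(x) should equal f(x) / S(x), up to floating-point error
bool sameRatio = Math.Abs(hazard - pdf / ccdf) < 1e-10;
// The quantile function (inverse cdf) recovers x from its cumulative probability
double p = normal.DistributionFunction(x);                  // F(x)
double xAgain = normal.InverseDistributionFunction(p);      // approximately 1.4
The same members are available on the other univariate distributions documented in this section, since they are defined by the interface itself.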
Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Fits the underlying distribution to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Although both double[] and double[][] arrays are supported, providing a double[] for a multivariate distribution or a double[][] for a univariate distribution may have a negative impact in performance. Common interface for probability distributions. This interface is implemented by all generic probability distributions in the framework, including s and s. Gets the cumulative distribution function (cdf) for this distribution evaluated at point x. The Cumulative Distribution Function (CDF) describes the cumulative probability that a given value or any value smaller than it will occur. Gets the probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The Probability Density Function (PDF) describes the probability that a given value x will occur. The probability of x occurring in the current distribution. Gets the log-probability density function (pdf) for this distribution evaluated at point x. A single point in the distribution range. For a univariate distribution, this should be a single double value. For a multivariate distribution, this should be a double array. The logarithm of the probability of x occurring in the current distribution. Gets the complementary cumulative distribution function (ccdf) for this distribution evaluated at point x. This function is also known as the Survival function. The Complementary Cumulative Distribution Function (CCDF) is the complement of the Cumulative Distribution Function, or 1 minus the CDF. Base abstract class for the Data Table preprocessing filters. The column options type. The filter type to whom these options should belong to. Gets or sets whether this filter is active. An inactive filter will repass the input table as output unchanged. Gets the collection of filter options. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. The token. Creates a new DataTable Filter Base. Applies the Filter to a . The source . The name of the columns that should be processed. The processed . Applies the Filter to a . The source . The processed . Processes the current filter. Gets options associated with a given variable (data column). The name of the variable. Gets options associated with a given variable (data column). The column's index for the variable. Gets the number of inputs accepted by the model. The number of inputs. Returns an enumerator that iterates through the collection. An enumerator that can be used to iterate through the collection. Returns an enumerator that iterates through a collection. An object that can be used to iterate through the collection. Add a new column options definition to the collection. Called when a new column options definition is being added. 
Can be used to validate or modify these options beforehand. The column options being added. Column option collection. The type of the filter that this collection belongs to. The type of the column options that will be used by the to determine how to process a particular column. Occurs when a new is being added to the collection. Handlers of this event can prevent a column options from being added by throwing an exception. Compatibility event args for the event. This is only required and used for the .NET 3.5 version of the framework. Extracts the key from the specified column options. Adds a new column options definition to the collection. The column options to be added. The added column options. Gets the associated options for the given column name. The name of the column whose options should be retrieved. The retrieved options. True if the options was contained in the collection; false otherwise. Column options for filter which have per-column settings. Gets or sets the filter to which these options belong to. The owner filter. Gets or sets the name of the column that the options will apply to. Gets or sets a user-determined object associated with this column. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. The token. Constructs the base class for Column Options. Column's name. Returns a that represents this instance. A that represents this instance. Data processing interface for in-place filters. Applies the filter to a , modifying the table in place. Source table to apply filter to. The method modifies the source table in place. Indicates that a column filter supports automatic initialization. Auto detects the column options by analyzing a given . The column to analyze. Indicates that a filter supports automatic initialization. Auto detects the filter options by analyzing a given . Branching filter. The branching filter allows for different filter sequences to be applied to different subsets of a data table. For instance, consider a data table whose first column, "IsStudent", is an indicator variable: a value of 1 indicates the row contains information about a student, and a value of 0 indicates the row contains information about someone who is not currently a student. Using the branching filter, it becomes possible to apply a different set of filters for the rows that represent students and different filters for rows that represent non-students. Suppose we have the following data table. In this table, each row represents a person, an indicator variable tell us whether this person is a smoker, and the last column indicates the age of each person. Let's say we would like to convert the age of smokers to a scale from -1 to 0, and the age of non-smokers to a scale from 0 to 1. object[,] data = { { "Id", "IsSmoker", "Age" }, { 0, 1, 10 }, { 1, 1, 15 }, { 2, 0, 40 }, { 3, 1, 20 }, { 4, 0, 70 }, { 5, 0, 55 }, }; // Create a DataTable from data DataTable input = data.ToTable(); // We will create two filters, one to operate on the smoking // branch of the data, and other in the non-smoking subjects. 
// var smoker = new LinearScaling(); var common = new LinearScaling(); // for the smokers, we will convert the age to [-1; 0] smoker.Columns.Add(new LinearScaling.Options("Age") { SourceRange = new DoubleRange(10, 20), OutputRange = new DoubleRange(-1, 0) }); // for non-smokers, we will convert the age to [0; +1] common.Columns.Add(new LinearScaling.Options("Age") { SourceRange = new DoubleRange(40, 70), OutputRange = new DoubleRange(0, 1) }); // We now configure and create the branch filter var settings = new Branching.Options("IsSmoker"); settings.Filters.Add(1, smoker); settings.Filters.Add(0, common); Branching branching = new Branching(settings); // Finally, we can process the input data: DataTable actual = branching.Apply(input); // As result, the generated table will // then contain the following entries: // { "Id", "IsSmoker", "Age" }, // { 0, 1, -1.0 }, // { 1, 1, -0.5 }, // { 2, 0, 0.0 }, // { 3, 1, 0.0 }, // { 4, 0, 1.0 }, // { 5, 0, 0.5 }, Initializes a new instance of the class. Initializes a new instance of the class. The columns to use as filters. Initializes a new instance of the class. The columns to use as filters. Processes the current filter. Column options for the branching filter. Gets the collection of filters associated with a given label value. Initializes a new instance of the class. The column name. Initializes a new instance of the class. Auto detects the column options by analyzing a given . The column to analyze. Codification Filter class. The object type that needs to be codified. Default is string. The codification filter performs an integer codification of classes in given in a string form. An unique integer identifier will be assigned for each of the string classes. Every Learn() method in the framework expects the class labels to be contiguous and zero-indexed, meaning that if there is a classification problem with n classes, all class labels must be numbers ranging from 0 to n-1. However, not every dataset might be in this format and sometimes we will have to pre-process the data to be in this format. The example below shows how to use the Codification class to perform such pre-processing. Most classifiers in the framework also expect the input data to be of the same nature, i.e. continuous. The codification filter can also be used to convert discrete, categorical, ordinal and baseline categorical variables into continuous vectors that can be fed to other machine learning algorithms, such as K-Means:. For more examples, please see the documentation page for the non-generic filter. Gets the number of outputs generated by the model. The number of outputs. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Gets or sets the default value to be used as a replacement for missing values. Default is to use System.DBNull.Value. Translates a value of a given variable into its integer (codeword) representation. The name of the variable's data column. The value to be translated. An integer which uniquely identifies the given value for the given variable. Translates an array of values into their integer representation, assuming values are given in original order of columns. The values to be translated. An array of integers in which each value uniquely identifies the given value for each of the variables. 
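As a quick, hedged illustration of the Translate and Revert methods documented here (the table layout and the "Category" column name below are invented for the example), a codebook can be built from a DataTable and used to convert a text label into its integer codeword and back:
// Illustrative only: a small table with a categorical "Category" column
DataTable table = new DataTable("Sample data");
table.Columns.Add("Category", typeof(string));
table.Rows.Add("child");
table.Rows.Add("adult");
table.Rows.Add("elder");
// Build the codebook from the table's distinct labels
var codebook = new Codification(table, "Category");
// Translate a text label into its integer codeword...
int code = codebook.Translate("Category", "adult");
// ...and revert the codeword back to the original label
string label = codebook.Revert("Category", code); // "adult"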
Translates an array of values into their integer representation, assuming values are given in original order of columns. A containing the values to be translated. The columns of the containing the values to be translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates an array of values into their integer representation, assuming values are given in original order of columns. A containing the values to be translated. The columns of the containing the values to be translated. The updated column names after the data has been translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates an array of values into their integer representation, assuming values are given in original order of columns. A containing the values to be translated. The columns of the containing the values to be translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates an array of values into their integer representation, assuming values are given in original order of columns. A containing the values to be translated. The columns of the containing the values to be translated. The updated column names after the data has been translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates a value of the given variables into their integer (codeword) representation. The names of the variable's data column. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The variable name. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The variable name. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The values to be translated. The location to where to store the result of this transformation. An array of integers in which each integer uniquely identifies the given value for the given variables. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Translates an integer (codeword) representation of the value of a given variable into its original value. The variable name. The codeword to be translated. The original meaning of the given codeword. Translates an integer (codeword) representation of the value of a given variable into its original value. The codewords to be translated. The original meaning of the given codeword. Translates an integer (codeword) representation of the value of a given variable into its original value. The name of the variable's data column. The codewords to be translated. 
The original meaning of the given codeword. Translates the integer (codeword) representations of the values of the given variables into their original values. The name of the variables' columns. The codewords to be translated. The original meaning of the given codewords. Processes the current filter. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Converts this instance into a transform that can generate double[]. Adds a new column options to this filter's collection, specifying how a particular column should be processed by the filter. The type of the variable to be added. Adds a new column options to this filter's collection, specifying how a particular column should be processed by the filter. The name of the variable to be added. The type of the variable to be added. Adds a new column options to this filter's collection, specifying how a particular column should be processed by the filter. The name of the variable to be added. The type of the variable to be added. The baseline value to be used in conjunction with . The baseline value will be treated as absolute zero in an otherwise one-hot representation. Adds a new column options to this filter's collection, specifying how a particular column should be processed by the filter. The name of the variable to be added. The type of the variable to be added. The order of the variables in the mapping. The first variable will be assigned to position (symbol) 1, the second to position 2, and so on. Called when a new column options definition is being added. Can be used to validate or modify these options beforehand. The column options being added. Options for processing a column. Gets or sets the label mapping for translating integer labels to the original string labels. Gets or sets whether the column can contain missing values. Gets or sets how missing values are represented in this column. Gets or sets a value to be used to replace missing values. Default is to replace missing values using System.DBNull.Value. Gets the number of symbols used to code this variable. Gets the number of symbols used to code this variable. See remarks for details. This method returns the following table of values: Number of elements in (number of distinct elements). Number of elements in (number of distinct elements). Number of elements in (number of distinct elements). 1 (there are no symbols to be encoded). 1 (there are no symbols to be encoded). Gets the codification type that should be used for this variable. Gets the number of inputs accepted by the model (value will be 1). The number of inputs. Gets the number of outputs generated by the model. See remarks for details. The number of outputs. This method returns the following table of values: Number of elements in (number of distinct elements). Number of elements in minus 1 (number of distinct elements, except the baseline, which is encoded as the absence of other values). 1 (there is just one single ordinal variable).
1 (there is just one single continuous variable). 1 (there is just one single discrete variable). Gets the values associated with each symbol, in the order of the symbols. Gets or sets the number of classes expected and recognized by the classifier. The number of classes. Determines whether the given object denotes a missing value. Forces the given key to have a specific symbol value. The key. The value that should be associated with this key. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Reverts the transformation to a set of output vectors, producing an associated set of input vectors. The input data to which the transformation should be reverted. Reverts the transformation to a set of output vectors, producing an associated set of input vectors. The input data to which the transformation should be reverted. Reverts the transformation to a set of output vectors, producing an associated set of input vectors. The input data to which the transformation should be reverted. The location to where to store the result of this transformation. The input generated by reverting this transformation to the given output. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best described according to this classifier. 
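The per-column codification types discussed above (ordinal, one-hot, one-hot with a baseline) are selected through the column options. The sketch below is an assumption-laden example: it presumes a CodificationVariable enumeration, the Add(name, type) overload described earlier, and a Learn overload accepting the data table; the column names are invented.
// Assumed API shape for configuring how each column is encoded
var codebook = new Codification();
codebook.Add("Category", CodificationVariable.Ordinal);                // symbols 0, 1, 2, ...
codebook.Add("Color", CodificationVariable.Categorical);               // 1-of-n (one-hot) columns
codebook.Add("Region", CodificationVariable.CategoricalWithBaseline);  // 1-of-(n-1), first symbol as the baseline
// Learn the symbol tables from a DataTable and transform it
codebook.Learn(table);
DataTable encoded = codebook.Apply(table);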
Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. The class-labels that best describe each vectors according to this classifier. Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. The location where to store the class-labels. The class-labels that best describe each vectors according to this classifier. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Constructs a new Options object. Constructs a new Options object for the given column. The name of the column to create this options for. Constructs a new Options object for the given column. The name of the column to create this options for. The type of the variable in the column. Constructs a new Options object for the given column. The name of the column to create this options for. The type of the variable in the column. The order of the variables in the mapping. The first variable will be assigned to position (symbol) 1, the second to position 2, and so on. Constructs a new Options object for the given column. The name of the column to create this options for. The type of the variable in the column. The baseline value to be used in conjunction with . The baseline value will be treated as absolute zero in a otherwise one-hot representation. Constructs a new Options object for the given column. The name of the column to create this options for. The initial mapping for this column. Value discretization preprocessing filter. This filter converts ranges of values into a different representation according to a set of rules. Please see the examples below to see how this filter can be used in practice. The discretization filter can be used to convert any range of values into another representation. For example, let's say we have a dataset where a column represents percentages using floating point numbers, but we would like to discretize those numbers into more descriptive labels: The discretization filter can also be used to process DataTable like the filter. It can also be used in combination with to process datasets for classification, as shown in the example below: Options for the discretization filter. Gets the map between matching rules and the output that should be produced/inserted when they match. Gets the number of symbols used to code this variable. Gets the number of inputs accepted by the model (value will be 1). The number of inputs. Gets the number of outputs generated by the model (value will be 1). The number of outputs. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. 
The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Weights are not supported and should be null. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best described according to this classifier. Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. The class-labels that best describe each vectors according to this classifier. Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. The location where to store the class-labels. The class-labels that best describe each vectors according to this classifier. Constructs a new Options object. Constructs a new Options object for the given column. The name of the column to create this options for. Constructs a new Options object for the given column. The name of the column to create this options for. The initial mapping for this column. Gets the number of outputs generated by the model. The number of outputs. Creates a new Discretization Filter. Creates a new Discretization Filter. Creates a new Discretization Filter. Creates a new Discretization Filter. Creates a new Discretization Filter. Creates a new Discretization Filter. Translates a value of a given variable into its codeword representation. The name of the variable's data column. The value to be translated. Translates an array of values into their codeword representation, assuming values are given in original order of columns. The values to be translated. Translates an array of values into their codeword representation, assuming values are given in original order of columns. The values to be translated. 
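To make the rule-based mapping concrete, here is a hedged sketch of turning a numeric column into descriptive labels. The three-argument Add(column, rule, output) call is inferred from the overload described further below, and the "Percentage" column and its thresholds are invented for the example.
// Assumed shape: map value ranges of a numeric column to string labels
var discretization = new Discretization<double, string>();
discretization.Add("Percentage", x => x >= 0.75, "High");
discretization.Add("Percentage", x => x >= 0.25 && x < 0.75, "Medium");
discretization.Add("Percentage", x => x < 0.25, "Low");
// Applying the filter would replace the numeric values with the matching labels
DataTable result = discretization.Apply(table);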
Translates an array of values into their codeword representation, assuming values are given in original order of columns. The values to be translated. Translates an array of values into their codeword representation, assuming values are given in original order of columns. A containing the values to be translated. The columns of the containing the values to be translated. Translates a value of the given variables into their codeword representation. The names of the variable's data column. The values to be translated. Translates a value of the given variables into their codeword representation. The variable name. The values to be translated. Translates a value of the given variables into their codeword representation. The variable name. The values to be translated. Translates a value of the given variables into their codeword representation. The values to be translated. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Processes the current filter. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Auto detects the filter options by analyzing a given . Adds the specified matching rule to a column. Name of the column. The rule. The output that should be generated whenever a data sample matches with the rule. Adds the specified matching rule to a column. Name of the column. The rule. The output that should be generated whenever a data sample matches with the rule. Identification filter. The identification filter adds a new column to the data containing an unique id for each of the samples (rows) in the data table (or matrix). Gets or sets the name of the column used to store row indices. Creates a new identification filter. Creates a new identification filter. Applies the filter to the DataTable. Randomization filter. Gets or sets the fixed random seed to be used in randomization, if any. The random seed, for fixed permutations; or null, for true random permutations. Initializes a new instance of the class. A fixed random seed value to generate fixed permutations. If not specified, generates true random permutations. Initializes a new instance of the class. Applies the filter to the current data. Strategies for missing value imputations. Uses a fixed-value to replace missing fields. Uses the mean value to replace missing fields. Uses the mode value to replace missing fields. Uses the median value to replace missing fields. Imputation filter for filling missing values. Creates a new Imputation filter. Creates a new Imputation filter. Creates a new Imputation filter. Creates a new Imputation filter. Imputation filter for filling missing values. Gets the number of outputs generated by the model. The number of outputs. Creates a new Imputation filter. 
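The imputation strategies listed above replace a missing field either with a fixed value or with a statistic (mean, mode, median) of the observed values in the same column. The plain C# fragment below, which does not use the filter's own API, shows mean imputation for a column whose missing entries are encoded as NaN (requires System.Linq):
// Mean imputation over one column; missing values are encoded as double.NaN
double[] column = { 1.0, double.NaN, 3.0, 4.0, double.NaN };
double mean = column.Where(v => !double.IsNaN(v)).Average();                 // (1 + 3 + 4) / 3 ≈ 2.67
double[] imputed = column.Select(v => double.IsNaN(v) ? mean : v).ToArray();
// imputed: { 1.0, 2.67, 3.0, 4.0, 2.67 } (approximately)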
Creates a new Imputation filter. Creates a new Imputation filter. Creates a new Imputation filter. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to a set of input vectors, producing an associated set of output vectors. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Processes the current filter. Auto detects the filter options by analyzing a given . Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . weights There are more predefined columns than columns in the data. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . weights There are more predefined columns than columns in the data. Options for the imputation filter. Gets or sets the imputation strategy to use with this column. Missing value indicator. Value to replace missing values with. Constructs a new column option for the Imputation filter. Constructs a new column option for the Imputation filter. Auto detects the column options by analyzing a given . The column to analyze. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . weights There are more predefined columns than columns in the data. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . weights There are more predefined columns than columns in the data. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . weights There are more predefined columns than columns in the data. Determines whether the given object denotes a missing value. Grouping filter. Gets or sets a value indicating whether the group labels are locked and should not be randomly re-selected. true to lock groups; otherwise, false. Gets or sets the group index labels. The group indices. Gets or sets the two-group proportions. Gets or sets the name of the indicator column which will be used to distinguish samples from either group. Creates a new Grouping filter with equal group proportions and default Group indicator column. Creates a new Grouping filter. Processes the current filter. Options for the grouping filter. Gets or sets the labels used for each class contained in the column. Constructs a new Options object for the given column. The name of the column to create this options for. Constructs a new Options object. Elimination filter. 
Creates an elimination filter to remove rows containing missing values. Creates an elimination filter to remove rows containing missing values in the specified columns. Processes the current filter. Auto detects the filter options by analyzing a given . Options for the elimination filter. Gets the value indicator of a missing field. Default is . Constructs a new column option for the Elimination filter. Constructs a new column option for the Elimination filter. Time-series windowing filter. This filter splits a time-series into overlapping time windows, with optional associated output values. This filter can be used to create time-window databases for time-series regression and latent-state identification. Gets or sets the length of the time-windows that should be extracted from the sequences. Gets or sets the step size that should be used when extracting windows. If set to the same number as the , windows will not overlap. Default is 1. Creates a new time segmentation filter. Creates a new time segmentation filter. The size of the time windows to be extracted. Creates a new time segmentation filter. The size of the time windows to be extracted. The number of elements between two taken windows. If set to the same number as the , the windows will not overlap. Default is 1. Processes the current filter. Applies the filter to a time series. The source time series. The time-windows extracted from the time-series. Applies the filter to a time series. The source time series. The output associated with each time-window. The time-windows extracted from the time-series. Options for segmenting a time-series contained inside a column. Constructs a new Options object. Class equalization filter. Currently this class only works for a single column and only for the binary case (two classes). Creates a new class equalization filter. Creates a new classes equalization filter. Creates a new classes equalization filter. Creates a new classes equalization filter. Processes the current filter. Options for the stratification filter. Gets or sets the labels used for each class contained in the column. Constructs a new Options object for the given column. The name of the column to create this options for. Constructs a new Options object. Auto detects the column options by analyzing a given . The column to analyze. Codification type. Returns . The variable should be codified as an ordinal variable, meaning its values will be translated to symbols 0, 1, 2, ..., n - 1, where n is the total number of distinct symbols this variable can assume. This is the default encoding in the filter. This variable should be codified as a 1-of-n vector by creating one column for each symbol this variable can assume, and marking the column corresponding to the current symbol as 1 and the rest as zero. This variable should be codified as a 1-of-(n-1) vector by creating one column for each symbol this variable can assume, except the first. This is the same as , but the first symbol is handled as a baseline (and should be indicated by a zero in every column). This variable is continuous and should not be codified. This variable is discrete and should not be codified. Codification Filter class. The codification filter performs an integer codification of classes given in string form. A unique integer identifier will be assigned to each of the string classes. When handling data tables, often there will be cases in which a single table contains both numerical variables and categorical data in the form of text labels.
Since most machine learning and statistics algorithms expect their data to be numeric, the codification filter can be used to create mappings between text labels and discrete symbols. // Show the start data DataGridBox.Show(table); // Create a new data projection (column) filter var filter = new Codification(table, "Category"); // Apply the filter and get the result DataTable result = filter.Apply(table); // Show it DataGridBox.Show(result); The following more elaborated examples show how to use the filter without necessarily handling System.Data.DataTables. After we have created the codebook, we can use it to feed data with categorical variables to method which would otherwise not know how to handle text labels data. Continuing with our example, the next code section shows how to convert an entire data table into a numerical matrix. Finally, by expressing our data in terms of a simple numerical matrix we will be able to feed it to any machine learning algorithm. The following code section shows how to create a linear multi-class Support Vector Machine to classify ages into any of the previously considered text labels ("child", "adult" or "elder"). Every Learn() method in the framework expects the class labels to be contiguous and zero-indexed, meaning that if there is a classification problem with n classes, all class labels must be numbers ranging from 0 to n-1. However, not every dataset might be in this format and sometimes we will have to pre-process the data to be in this format. The example below shows how to use the Codification class to perform such pre-processing. The codification filter can also work with missing values. The example below shows how a codification codebook can be created from a dataset that includes missing values and how to use this codebook to replace missing values by some other representation (in the case below, replacing null by NaN double numbers. The codification can also support more advanced scenarios where it is necessary to use different categorical representations for different variables, such as one-hot-vectors and categorical-with-baselines, as shown in the example below: Another examples of an advanced scenario where the source dataset contains both symbolic and discrete/continuous variables are shown below: Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Creates a new Codification Filter. Transforms a matrix of key-value pairs (where the first column denotes a key, and the second column a value) into their integer vector representation. A 2D matrix with two columns, where the first column contains the keys (i.e. "Date") and the second column the values (i.e. "14/05/1988"). A vector of integers where each element contains the translation of each respective row in the given matrix. Translates a value of a given variable into its integer (codeword) representation. The name of the variable's data column. The value to be translated. An integer which uniquely identifies the given value for the given variable. Translates an array of values into their integer representation, assuming values are given in original order of columns. The values to be translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates an array of values into their integer representation, assuming values are given in original order of columns. A containing the values to be translated. 
The columns of the containing the values to be translated. An array of integers in which each value uniquely identifies the given value for each of the variables. Translates a value of the given variables into their integer (codeword) representation. The names of the variable's data column. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The variable name. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates a value of the given variables into their integer (codeword) representation. The variable name. The values to be translated. An array of integers in which each integer uniquely identifies the given value for the given variables. Translates an integer (codeword) representation of the value of a given variable into its original value. The variable name. The codeword to be translated. The original meaning of the given codeword. Translates an integer (codeword) representation of the value of a given variable into its original value. The name of the variable's data column. The codewords to be translated. The original meaning of the given codeword. Translates the integer (codeword) representations of the values of the given variables into their original values. The name of the variables' columns. The codewords to be translated. The original meaning of the given codewords. Auto detects the filter options by analyzing a given . Auto detects the filter options by analyzing a given . Auto detects the filter options by analyzing a set of string labels. The variable name. A set of values that this variable can assume. Auto detects the filter options by analyzing a set of string labels. The variable name. A set of values that this variable can assume. Auto detects the filter options by analyzing a set of string labels. The variable names. A set of values that those variable can assume. The first element of the array is assumed to be related to the first column name parameter. Value discretization preprocessing filter. This filter converts double or decimal values with an fractional part to the nearest possible integer according to a given threshold and a rounding rule. // Show the start data DataGridBox.Show(table); // Create a new data projection (column) filter var filter = new Discretization("Cost (M)"); // Apply the filter and get the result DataTable result = filter.Apply(table); // Show it DataGridBox.Show(result); Creates a new Discretization filter. Creates a new Discretization filter. Processes the current filter. Auto detects the filter options by analyzing a given . Options for the discretization filter. Gets or sets the threshold for the discretization filter. Gets or sets whether the discretization threshold is symmetric. If a symmetric threshold of 0.4 is used, for example, a real value of 0.5 will be rounded to 1.0 and a real value of -0.5 will be rounded to -1.0. If a non-symmetric threshold of 0.4 is used, a real value of 0.5 will be rounded towards 1.0, but a real value of -0.5 will be rounded to 0.0 (because |-0.5| is higher than the threshold of 0.4). Constructs a new Options class for the discretization filter. Constructs a new Options object. Sequence of table processing filters. Initializes a new instance of the class. Initializes a new instance of the class. Sequence of filters to apply. 
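A minimal sketch of the threshold-based rounding rule described above for the discretization filter. The helper below is hypothetical and only illustrates one reading of the documented behaviour (symmetric thresholds round away from zero, non-symmetric thresholds round towards positive infinity); it is not the filter's actual implementation.

// Hypothetical helper illustrating the documented threshold behaviour; not the
// Discretization filter's actual implementation. Requires: using System;
static double Discretize(double value, double threshold, bool symmetric)
{
    if (symmetric)
    {
        // Round away from zero when the absolute fractional part exceeds the threshold.
        double integral = Math.Truncate(value);
        double fraction = Math.Abs(value - integral);
        return fraction > threshold ? integral + Math.Sign(value) : integral;
    }
    else
    {
        // Round towards +infinity when the fraction above the floor exceeds the threshold.
        double floor = Math.Floor(value);
        return (value - floor) > threshold ? floor + 1 : floor;
    }
}

// Reproducing the examples given above with a threshold of 0.4:
// Discretize( 0.5, 0.4, symmetric: true )  ->  1.0
// Discretize(-0.5, 0.4, symmetric: true )  -> -1.0
// Discretize( 0.5, 0.4, symmetric: false)  ->  1.0
// Discretize(-0.5, 0.4, symmetric: false)  ->  0.0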
Applies the sequence of filters to a given table. Sample processing filter interface. The interface defines the set of methods which should be provided by all table processing filters. Methods of this interface should keep the source table unchanged and return the result of the data processing filter as a new data table. Applies the filter to a . Source table to apply filter to. Returns the filter's result obtained by applying the filter to the source table. The method keeps the source table unchanged and returns the result of the table processing filter as a new data table. Data normalization preprocessing filter. The normalization filter is able to transform numerical data into Z-Scores, subtracting the mean of each variable and dividing by its standard deviation. The filter is able to distinguish numerical columns automatically, leaving other columns unaffected. It is also possible to control which columns should be processed by the filter. Suppose we have a data table relating the age of a person to a categorical classification, such as "child", "adult" or "elder". The normalization filter can be used to transform the "Age" column into Z-scores, as shown below: // Create the aforementioned sample table DataTable table = new DataTable("Sample data"); table.Columns.Add("Age", typeof(double)); table.Columns.Add("Label", typeof(string)); // age label table.Rows.Add(10, "child"); table.Rows.Add(07, "child"); table.Rows.Add(04, "child"); table.Rows.Add(21, "adult"); table.Rows.Add(27, "adult"); table.Rows.Add(12, "child"); table.Rows.Add(79, "elder"); table.Rows.Add(40, "adult"); table.Rows.Add(30, "adult"); // The filter will ignore non-real (continuous) data Normalization normalization = new Normalization(table); double mean = normalization["Age"].Mean; // 25.55 double sdev = normalization["Age"].StandardDeviation; // 23.29 // Now we can process another table at once: DataTable result = normalization.Apply(table); // The result will be a table with the same columns, but // in which any column named "Age" will have been normalized // using the previously detected mean and standard deviation: DataGridBox.Show(result); The resulting data is shown below: Creates a new data normalization filter. Creates a new data normalization filter. Creates a new data normalization filter. Processes the current filter. Applies the Filter to a . The source . The processed . Applies the Filter to a . The source . The processed . Auto detects the filter options by analyzing a given . Auto detects the filter options by analyzing a given matrix. Options for normalizing a column. Gets or sets the mean of the data contained in the column. Gets or sets the standard deviation of the data contained in the column. Gets or sets whether the column's data should be standardized to Z-Scores. Constructs a new Options object. Constructs a new Options object for the given column. The name of the column to create this options for. Constructs a new Options object for the given column. The name of the column to create this options for. The mean value for normalization. The standard deviation value for standardization. Principal component projection filter. Gets or sets the analysis associated with the filter. Creates a new Principal Component Projection filter. Creates a new data normalization filter. Processes the filter. The data. Auto detects the filter options by analyzing a given . Options for normalizing a column. Initializes a new instance of the class. Initializes a new instance of the class. Name of the column.
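The Z-score transformation applied by the normalization filter is simply z = (x - mean) / standard deviation. The short snippet below reproduces the mean (about 25.55) and sample standard deviation (about 23.29) quoted in the example above using plain LINQ, without the filter itself.

// Plain C# check of the numbers quoted in the example above (no filter involved).
using System;
using System.Linq;

double[] ages = { 10, 7, 4, 21, 27, 12, 79, 40, 30 };

double mean = ages.Average();                                   // 25.55...
double sdev = Math.Sqrt(ages.Sum(a => (a - mean) * (a - mean))
                        / (ages.Length - 1));                   // 23.29... (sample std. dev.)

// The filter replaces each value by its Z-score:
double[] zscores = ages.Select(a => (a - mean) / sdev).ToArray();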
Relational-algebra projection filter. This filter is able to selectively remove columns from tables, and keep only the columns of interest. // Show the start data DataGridBox.Show(table); // Create a new data projection (column) filter var filter = new Projection("Floors", "Finished"); // Apply the filter and get the result DataTable result = filter.Apply(table); // Show it DataGridBox.Show(result); List of columns to keep in the projection. Creates a new projection filter. Creates a new projection filter. Creates a new projection filter. Applies the filter to the DataTable. Linear Scaling Filter Creates a new Linear Scaling Filter. Creates a new Linear Scaling Filter. Creates a new Linear Scaling Filter. Creates a new Linear Scaling Filter. Applies the filter to the DataTable. Auto detects the filter options by analyzing a given . Auto detects the filter options by analyzing a given . Options for the Linear Scaling filter. Range of the input values Target range of the output values after scaling. Creates a new column options. Constructs a new Options object. Relational-algebra selection filter. Gets or sets the eSQL filter expression for the filter. Gets or sets the ordering to apply for the filter. Constructs a new Selection Filter. The filtering criteria. The desired sort order. Constructs a new Selection Filter. The filtering criteria. Constructs a new Selection Filter. Applies the filter to the current data. Additive combination of kernels. Gets the combination of kernels to use. Gets the weight array to use in the weighted kernel sum. Constructs a new additive kernel. Kernels to combine. Constructs a new additive kernel. Kernels to combine. Weight values for each of the kernels. Default is to assign equal weights. Additive Kernel Combination function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. ANOVA (ANalysis Of VAriance) Kernel. The ANOVA kernel is a graph kernel, which can be computed using dynamic programming tables. References: - http://www.cse.ohio-state.edu/mlss09/mlss09_talks/1.june-MON/jst_tutorial.pdf Constructs a new ANOVA Kernel. Length of the input vector. Length of the subsequences for the ANOVA decomposition. ANOVA Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Kernel function interface. In Machine Learning and statistics, a Kernel is a function that returns the value of the dot product between the images of the two arguments. k(x,y) = ‹S(x),S(y)› References: http://www.support-vector.net/icml-tutorial.pdf Elementwise addition of a and b, storing in result. The first vector to add. The second vector to add. An array to store the result. The same vector passed as result. Elementwise multiplication of scalar a and vector b, accumulating in result. The scalar to be multiplied. The vector to be multiplied. An array to store the result. Elementwise multiplication of vector a and vector b, accumulating in result. The vector to be multiplied. The vector to be multiplied. An array to store the result. Compress a set of support vectors and weights into a single parameter vector. The weights associated with each support vector. The support vectors. The constant (bias) value. A single parameter vector. The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Gets the number of parameters in the input vectors. Creates an input vector from the given double values. Converts the input vectors to a double-precision representation. 
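As a concrete illustration of the kernel function interface and of the support-vector compression described above, the sketch below implements a plain linear kernel, k(x, y) = ‹x, y›, and collapses a set of support vectors and their weights into a single weight vector. The IKernel name follows the framework's kernel namespace, but treat the snippet as a sketch rather than a reference implementation.

// Sketch of a kernel implementing the interface described above (assumed to be
// Accord.Statistics.Kernels.IKernel, exposing a single Function(x, y) member).
using Accord.Statistics.Kernels;

public class SimpleLinear : IKernel
{
    // k(x, y) = ‹x, y›: for a linear kernel the dot product itself is the
    // feature-space product.
    public double Function(double[] x, double[] y)
    {
        double sum = 0;
        for (int i = 0; i < x.Length; i++)
            sum += x[i] * y[i];
        return sum;
    }

    // The "compress" operation described above: collapse support vectors and
    // their weights into a single parameter vector w = sum_i weights[i] * sv[i];
    // the constant (bias) term is carried through unchanged.
    public static double[] Compress(double[] weights, double[][] supportVectors)
    {
        double[] w = new double[supportVectors[0].Length];
        for (int i = 0; i < supportVectors.Length; i++)
            for (int j = 0; j < w.Length; j++)
                w[j] += weights[i] * supportVectors[i][j];
        return w;
    }
}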
Kernel function interface. In Machine Learning and statistics, a Kernel is a function that returns the value of the dot product between the images of the two arguments. k(x,y) = ‹S(x),S(y)› References: http://www.support-vector.net/icml-tutorial.pdf Kernel function interface. In Machine Learning and statistics, a Kernel is a function that returns the value of the dot product between the images of the two arguments. k(x,y) = ‹S(x),S(y)› References: http://www.support-vector.net/icml-tutorial.pdf The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Interface for Radial Basis Function kernels. A radial basis function (RBF) is a real-valued function whose value depends only on the distance from the origin, so that ϕ(x) = ϕ(||x||); or alternatively on the distance from some other point c, called a center, so that ϕ(x,c) = ϕ(||x−c||). Any function ϕ that satisfies the property ϕ(x) = ϕ(||x||) is a radial function. The norm is usually Euclidean distance, although other distance functions are also possible. Examples of radial basis kernels include: References: Wikipedia, The Free Encyclopedia. Radial basis functions. Available on: https://en.wikipedia.org/wiki/Radial_basis_function The kernel function. Distance z between two vectors in input space. Dot product in feature (kernel) space. Interface for kernel functions with support for automatic parameter estimation. Interface for kernel functions with support for automatic parameter estimation. Estimates kernel parameters from the data. The input data. Common interface for kernel functions that can explicitly project input points into the kernel feature space. Common interface for kernel functions that can explicitly project input points into the kernel feature space. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Base class for kernel functions. This class provides automatic distance calculations for classes that do not provide optimized implementations. Base class for kernel functions. This class provides automatic distance calculations for classes that do not provide optimized implementations. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Extension methods for kernel functions. Creates the Gram matrix from the given vectors. The kernel function. The vectors. An optional matrix where the result should be stored in. A symmetric matrix containing the dot-products in feature (kernel) space between all vectors in . Creates the Gram matrix containing all dot products in feature (kernel) space between each vector in x and the ones in y. The first vectors. The second vectors. The kernel function. An optional matrix where the result should be stored in. A symmetric matrix containing the dot-products in feature (kernel) space between each vector in and the ones in . Creates the Gram matrix from the given vectors. The kernel function. The vectors. An optional matrix where the result should be stored in. 
A symmetric matrix containing the dot-products in feature (kernel) space between all vectors in . Creates the Gram matrix containing all dot products in feature (kernel) space between each vector in x and the ones in y. The kernel function. The first vectors. The second vectors. An optional matrix where the result should be stored in. A symmetric matrix containing the dot-products in feature (kernel) space between each vector in and the ones in . Estimates the complexity parameter C, present in many SVM algorithms, for a given kernel and a given data set by summing every element on the diagonal of the kernel matrix and using an heuristic based on it. The kernel function. The input samples. A suitable value for C. Estimates the complexity parameter C, present in many SVM algorithms, for a given kernel and an unbalanced data set by summing every element on the diagonal of the kernel matrix and using an heuristic based on it. The kernel function. The input samples. The output samples. A suitable value for positive C and negative C, respectively. Estimates the complexity parameter C, present in many SVM algorithms, for a given kernel and a given data set by summing every element on the diagonal of the kernel matrix and using an heuristic based on it. The input samples. A suitable value for C. Estimates the complexity parameter C, present in many SVM algorithms, for a given kernel and an unbalanced data set by summing every element on the diagonal of the kernel matrix and using an heuristic based on it. The input samples. The output samples. A suitable value for positive C and negative C, respectively. Computes the set of all distances between all points in a random subset of the data. The inner kernel. The inputs points. The number of samples. Computes the set of all distances between all points in a random subset of the data. The inner kernel. The inputs points. The number of samples. Centers the given kernel matrix K. The kernel matrix to be centered. The array where to store results. Centers the given kernel matrix K. The kernel matrix to be centered. The row-wise mean vector. The total mean (across all values in the matrix). The array where to store results. Centers the given kernel matrix K. The kernel matrix to be centered. The row-wise mean vector. The total mean (across all values in the matrix). The array where to store results. Centers the given kernel matrix K. The kernel matrix to be centered. The row-wise mean vector. The total mean (across all values in the matrix). The array where to store results. Bessel Kernel. The Bessel kernel is well known in the theory of function spaces of fractional smoothness. Gets or sets the order of the Bessel function. Gets or sets the sigma constant for this kernel. Constructs a new Bessel Kernel. The order for the Bessel function. The value for sigma. Bessel Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Bessel Kernel Function Distance z between two vectors in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. B-Spline Kernel. The B-Spline kernel is defined only in the interval [−1, 1]. It is also a member of the Radial Basis Functions family of kernels. References: Bart Hamers, Kernel Models for Large Scale Applications. Doctoral thesis. Available on: ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/hamers/PhD_bhamers.pdf Gets or sets the B-Spline order. Constructs a new B-Spline Kernel. 
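The Gram matrix and centering operations described above reduce to the plain loops sketched below; the framework's extension methods perform the same computation (with storage and caching options), so this is only a sketch to make the formulas explicit: K[i,j] = k(x_i, x_j) and Kc[i,j] = K[i,j] - rowMean[i] - rowMean[j] + grandMean.

// Plain-loop sketch of the Gram matrix and kernel centering described above.
// Requires: using System;  (for Func<,,>)
static double[,] Gram(Func<double[], double[], double> k, double[][] x)
{
    int n = x.Length;
    var K = new double[n, n];
    for (int i = 0; i < n; i++)
        for (int j = i; j < n; j++)
            K[i, j] = K[j, i] = k(x[i], x[j]);   // symmetric by construction
    return K;
}

static double[,] Center(double[,] K)
{
    int n = K.GetLength(0);
    var rowMean = new double[n];
    double grandMean = 0;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++)
            rowMean[i] += K[i, j];
        rowMean[i] /= n;
        grandMean += rowMean[i];
    }
    grandMean /= n;

    var Kc = new double[n, n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            Kc[i, j] = K[i, j] - rowMean[i] - rowMean[j] + grandMean;
    return Kc;
}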
B-Spline Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Cauchy Kernel. The Cauchy kernel comes from the Cauchy distribution (Basak, 2008). It is a long-tailed kernel and can be used to give long-range influence and sensitivity over the high dimension space. Gets or sets the kernel's sigma value. Constructs a new Cauchy Kernel. The value for sigma. Cauchy Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Cauchy Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Cauchy Kernel Function Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Chi-Square Kernel. The Chi-Square kernel comes from the Chi-Square distribution. Constructs a new Chi-Square kernel. Chi-Square Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Circular Kernel. The circular kernel comes from a statistics perspective. It is an example of an isotropic stationary kernel and is positive definite in R^2. Gets or sets the kernel's sigma value. Constructs a new Circular Kernel. Value for sigma. Circular Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Circular Kernel Function Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Radial Basis Function Dynamic Time Warping Sequence Kernel. The Dynamic Time Warping Sequence Kernel is a sequence kernel, accepting vector sequences of variable size as input. Despite the sequences being variable in size, the vectors contained in such sequences should have a fixed size, and this size should be informed at the construction of this kernel. The conversion of the DTW global distance to a dot product uses a combination of a technique known as spherical normalization and the polynomial kernel. The degree of the polynomial kernel and the alpha for the spherical normalization should be given at the construction of the kernel. For more information, please see the referenced papers shown below. The use of a cache is highly advisable when using this kernel. References: V. Wan, J. Carmichael; Polynomial Dynamic Time Warping Kernel Support Vector Machines for Dysarthric Speech Recognition with Sparse Training Data. Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisboa, 2005. The following example demonstrates how to create and learn a Support Vector Machine (SVM) to recognize sequences of univariate observations using the Dynamic Time Warping kernel. Now, instead of having univariate observations, the following example demonstrates how to create and learn an SVM to recognize sequences of multivariate (or n-dimensional) observations. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5/sigma^2).
Gets or sets the sigma² value for the kernel. When setting sigma², gamma gets updated accordingly (gamma = 0.5/sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5/sigma^2). Constructs a new Dynamic Time Warping kernel. The inner kernel function of the composite kernel. Constructs a new Dynamic Time Warping kernel. The inner kernel function of the composite kernel. The kernel's sigma parameter. Constructs a new Dynamic Time Warping kernel. The kernel's sigma parameter. Dynamic Time Warping kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Global distance D(X,Y) between two sequences of vectors. The current thread local storage. A sequence of vectors. A sequence of vectors. The global distance between X and Y. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Composite Gaussian Kernel. Constructs a new Composite Gaussian Kernel The inner kernel function of the composite kernel. Constructs a new Composite Gaussian Kernel The inner kernel function of the composite kernel. The kernel's sigma parameter. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5/sigma^2). Gets or sets the sigma² value for the kernel. When setting sigma², gamma gets updated accordingly (gamma = 0.5/sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5/sigma^2). Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Composite Gaussian Kernel. Constructs a new Composite Gaussian Kernel The inner kernel function of the composite kernel. Constructs a new Composite Gaussian Kernel The inner kernel function of the composite kernel. The kernel's sigma parameter. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5/sigma^2). Gets or sets the sigma² value for the kernel. When setting sigma², gamma gets updated accordingly (gamma = 0.5/sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5/sigma^2). Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Hellinger Kernel. Hellinger kernel is an euclidean norm of linear kernel. Reference: http://www.di.ens.fr/willow/events/cvml2011/materials/practical-classification/ Constructs a new Hellinger Kernel. Hellinger Function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Pearson VII universal kernel (PUK). Constructs a new Pearson VII universal kernel. The Pearson's omega parameter w. Default is 1. 
The Pearson's sigma parameter s. Default is 1. Constructs a new Pearson VII universal kernel. Gets or sets the kernel's parameter omega. Default is 1. Gets or sets the kernel's parameter sigma. Default is 1. Pearson Universal kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Pearson Universal function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Normalized Kernel. This kernel definition can be used to provide normalized versions of other kernel classes, such as the . A normalized kernel will always produce distances between -1 and 1. Gets or sets the inner kernel function whose results should be normalized. Constructs a new Normalized Kernel. The kernel function to be normalized. Normalized Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Inverse Multiquadric Kernel. The inverse multiquadric kernel is only conditionally positive definite. Gets or sets the kernel's constant value. Constructs a new Inverse Multiquadric Kernel. The constant term theta. Constructs a new Inverse Multiquadric Kernel. Inverse Multiquadric Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Inverse Multiquadric Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Normalized Polynomial Kernel. This class is equivalent to the Normalized<Polynomial> kernel but has a more efficient implementation. Constructs a new Normalized Polynomial kernel of a given degree. The polynomial degree for this kernel. The polynomial constant for this kernel. Default is 1. Constructs a new Normalized Polynomial kernel of a given degree. The polynomial degree for this kernel. Gets or sets the kernel's polynomial degree. Gets or sets the kernel's polynomial constant term. Normalized polynomial kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Value cache for kernel function evaluations. The total memory size occupied by the cache can fluctuate between and . This class works as a least-recently-used cache for elements computed from the kernel (Gram) matrix. Elements which have not been needed for some time are discarded from the cache, while elements which are constantly requested remain cached. The use of a cache may speed up learning by a large factor; however, the actual speedup may vary according to the choice of cache size. The total memory size occupied by the cache can fluctuate between and . Constructs a new . The kernel function. The input values. Constructs a new . The kernel function. The input values. The size for the cache, measured in number of rows from the set. Default is to use all rows. In order to know how many rows can fit under a given amount of memory, use . Gets the maximum number of rows that a cache can keep inside the given amount of bytes. This value can be used to initialize SequentialMinimalOptimization's CacheSize property, or be passed to the constructor.
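The normalization performed by the Normalized kernel described above is, in its standard form, a division by the geometric mean of the two self-similarities; by the Cauchy-Schwarz inequality the result then always lies between -1 and 1. A one-line sketch (not necessarily the class's exact implementation):

// Standard kernel normalization: k'(x, y) = k(x, y) / sqrt(k(x, x) * k(y, y)).
// Requires: using System;
static double Normalized(Func<double[], double[], double> k, double[] x, double[] y)
{
    return k(x, y) / Math.Sqrt(k(x, x) * k(y, y));
}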
Gets the maximum number of rows that a cache can keep inside the given amount of bytes. This value can be used to initialize SequentialMinimalOptimization's CacheSize property, or be passed to the constructor. Gets the maximum number of rows that a cache can keep inside the given amount of bytes. This value can be used to initialize SequentialMinimalOptimization's CacheSize property, or be passed to the constructor. Value cache for kernel function evaluations. The total memory size occupied by the cache can fluctuate between and . This class works as a least-recently-used cache for elements computed from the kernel (Gram) matrix. Elements which have not been needed for some time are discarded from the cache, while elements which are constantly requested remain cached. The use of a cache may speed up learning by a large factor; however, the actual speedup may vary according to the choice of cache size. The total memory size occupied by the cache can fluctuate between and . Gets the size of the cache, measured in number of rows. The size of this cache. Gets the current number of rows stored in this cache. Gets the maximum size of the cache, measured in bytes. Gets the minimum size of the cache, measured in bytes. Gets the total number of cache hits. Gets the total number of cache misses. Gets the percentage of the cache currently in use. Gets a value indicating whether the cache is enabled. If the value is false, it means the kernel function is being evaluated on-the-fly. Constructs a new . The kernel function. The input values. Constructs a new . The kernel function. The input values. The size for the cache, measured in number of rows from the set. Default is to use all rows. In order to know how many rows can fit under a given amount of memory, use . Attempts to retrieve the value of the kernel function from the diagonal of the kernel matrix. If the value is not available, it is immediately computed and inserted in the cache. Index of the point to compute. The result of the kernel function k(p[i], p[i]). Attempts to retrieve the kernel function evaluated between the points at indices i and j. If it is not cached, it will be computed and the cache will be updated. The index of the first point p to compute. The index of the second point p to compute. The result of the kernel function k(p[i], p[j]). Attempts to retrieve the value of the kernel function from the diagonal of the kernel matrix. If the value is not available, it is immediately computed and inserted in the cache. Index of the point to compute. The result of the kernel function k(p[i], p[i]). Attempts to retrieve the kernel function evaluated between the points at indices i and j. If it is not cached, it will be computed and the cache will be updated. The index of the first point p to compute. The index of the second point p to compute. The result of the kernel function k(p[i], p[j]). Clears the cache. Resets cache statistics. Gets the key from the given indices. The index i. The index j. The key associated with the given indices. Gets a copy of the data cache. A copy of the data cache. Gets a copy of the Least Recently Used (LRU) List of Kernel Matrix elements. Elements at the start of the list have been used most; elements at the end are about to be discarded from the cache. The Least Recently Used list of kernel matrix elements. Releases unmanaged and - optionally - managed resources. true to release both managed and unmanaged resources; false to release only unmanaged resources. Disposes this instance. Quadratic Kernel.
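Before the member listing below, a brief sketch of the quadratic kernel's form. It is written here under the assumption that the quadratic kernel is the degree-2 polynomial kernel, k(x, y) = (‹x, y› + c)^2, with c the constant term documented below; treat this as an illustration, not the class's reference implementation.

// Sketch assuming the quadratic kernel is the degree-2 polynomial kernel
// k(x, y) = (<x, y> + c)^2, with c the constant term documented below.
static double Quadratic(double[] x, double[] y, double constant = 1)
{
    double dot = 0;
    for (int i = 0; i < x.Length; i++)
        dot += x[i] * y[i];
    return (dot + constant) * (dot + constant);
}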
Constructs a new Quadratic kernel. The polynomial constant for this kernel. Default is 1. Constructs a new Quadratic kernel. Gets or sets the kernel's polynomial constant term. Quadratic kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Quadratic kernel function. Distance z in input space. Dot product in feature (kernel) space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Distance between x and y in input space. Computes the distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Distance between x and y in input space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Projects a set of input points into feature space. The input points to be projected into feature space. The feature space representation of the given points. Projects an input point into feature space. The input point to be projected into feature space. The parameter of the kernel. The feature space representation of the given point. Symmetric Triangle Kernel. References: Chaudhuri et al, A Comparative Study of Kernels for the Multi-class Support Vector Machine, 2008. Available on: http://www.computer.org/portal/web/csdl/doi/10.1109/ICNC.2008.803 Constructs a new Symmetric Triangle Kernel Gets or sets the gamma value for the kernel. Symmetric Triangle Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Symmetric Triangle Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Squared Sinc Kernel. References: Chaudhuri et al, A Comparative Study of Kernels for the Multi-class Support Vector Machine, 2008. Available on: http://www.computer.org/portal/web/csdl/doi/10.1109/ICNC.2008.803 Constructs a new Squared Sinc Kernel Gets or sets the gamma value for the kernel. Squared Sine Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Squared Sine Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Custom Kernel. Constructs a new Custom kernel. Custom kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Dirichlet Kernel. References: A Tutorial on Support Vector Machines (1998). Available on: http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf Constructs a new Dirichlet Kernel Gets or sets the dimension for the kernel. Dirichlet Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Dynamic Time Warping Sequence Kernel. The Dynamic Time Warping Sequence Kernel is a sequence kernel, accepting vector sequences of variable size as input. 
Despite the sequences being variable in size, the vectors contained in such sequences should have a fixed size, and this size should be informed at the construction of this kernel. The conversion of the DTW global distance to a dot product uses a combination of a technique known as spherical normalization and the polynomial kernel. The degree of the polynomial kernel and the alpha for the spherical normalization should be given at the construction of the kernel. For more information, please see the referenced papers shown below. The use of a cache is highly advisable when using this kernel. References: V. Wan, J. Carmichael; Polynomial Dynamic Time Warping Kernel Support Vector Machines for Dysarthric Speech Recognition with Sparse Training Data. Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisboa, 2005. The following example demonstrates how to create and learn a Support Vector Machine (SVM) to recognize sequences of univariate observations using the Dynamic Time Warping kernel. Now, instead of having univariate observations, the following example demonstrates how to create and learn an SVM to recognize sequences of multivariate (or n-dimensional) observations. Gets or sets the length for the feature vectors contained in each sequence used by the kernel. Gets or sets the hypersphere ratio. Default is 1. Gets or sets the polynomial degree for this kernel. Default is 1. Constructs a new Dynamic Time Warping kernel. The length of the feature vectors contained in each sequence. Constructs a new Dynamic Time Warping kernel. The length of the feature vectors contained in each sequence. The hypersphere ratio. Default value is 1. Constructs a new Dynamic Time Warping kernel. The length of the feature vectors contained in each sequence. The hypersphere ratio. Default value is 1. The degree of the kernel. Default value is 1 (linear kernel). Constructs a new Dynamic Time Warping kernel. The hypersphere ratio. Default value is 1. The degree of the kernel. Default value is 1 (linear kernel). Dynamic Time Warping kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Dynamic Time Warping kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Global distance D(X,Y) between two sequences of vectors. The current thread local storage. A sequence of vectors. A sequence of vectors. The global distance between X and Y. Global distance D(X,Y) between two sequences of vectors. The current thread local storage. A sequence of vectors. A sequence of vectors. The global distance between X and Y. Projects vectors from a sequence of vectors into a hypersphere, augmenting their size by one unit and normalizing them to be unit vectors. A sequence of vectors. A sequence of vector projections. Projects vectors from a sequence of vectors into a hypersphere, augmenting their size by one unit and normalizing them to be unit vectors. A sequence of vectors. A sequence of vector projections. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Gaussian Kernel.
The Gaussian kernel requires tuning for the proper value of σ. Different approaches to this problem includes the use of brute force (i.e. using a grid-search algorithm) or a gradient ascent optimization. References: P. F. Evangelista, M. J. Embrechts, and B. K. Szymanski. Some Properties of the Gaussian Kernel for One Class Learning. Available on: http://www.cs.rpi.edu/~szymansk/papers/icann07.pdf Constructs a new Gaussian Kernel with a given sigma value. To construct from a gamma value, use the named constructor instead. The kernel's sigma parameter. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5/sigma^2). Gets or sets the sigma² value for the kernel. When setting sigma², gamma gets updated accordingly (gamma = 0.5/sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5/sigma^2). Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Gaussian Kernel function. Distance z in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the distance in input space given a distance computed in feature space. Distance in feature space. Distance in input space. Constructs a new Gaussian Kernel with a given gamma value. To construct from a sigma value, use the constructor instead. The kernel's gamma parameter. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. 
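The Gaussian kernel function and the sigma/gamma relation quoted above come down to k(x, y) = exp(-gamma * ||x - y||^2) with gamma = 0.5/sigma^2. The sketch below writes this out, together with a simple median-of-pairwise-distances rule of thumb for choosing sigma; the rule of thumb is offered only as an illustrative stand-in for the Caputo et al. heuristic referenced in this section, not as a reproduction of it.

// Requires: using System; using System.Collections.Generic; using System.Linq;
static double Gaussian(double[] x, double[] y, double sigma)
{
    double gamma = 0.5 / (sigma * sigma);        // gamma = 0.5 / sigma^2
    double squaredNorm = 0;
    for (int i = 0; i < x.Length; i++)
    {
        double d = x[i] - y[i];
        squaredNorm += d * d;                    // squared Euclidean distance
    }
    return Math.Exp(-gamma * squaredNorm);       // k(x, y) = exp(-gamma * ||x - y||^2)
}

// Illustrative rule of thumb only (not the framework's heuristic): take sigma
// near the median of the pairwise distances in a sample of the data.
static double EstimateSigma(double[][] samples)
{
    var distances = new List<double>();
    for (int i = 0; i < samples.Length; i++)
        for (int j = i + 1; j < samples.Length; j++)
            distances.Add(Math.Sqrt(samples[i].Zip(samples[j], (a, b) => (a - b) * (a - b)).Sum()));
    distances.Sort();
    return distances[distances.Count / 2];
}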
The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The distance function to be used in the Gaussian kernel. Default is . A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The range of suitable values for sigma. The distance function to be used in the Gaussian kernel. Default is . A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The distance function to be used in the Gaussian kernel. Default is . A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. 
This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. The distance function to be used in the Gaussian kernel. Default is . A Gaussian kernel initialized with an appropriate sigma value. Computes the set of all distances between all points in a random subset of the data. The inputs points. The number of samples. Computes the set of all distances between all points in a random subset of the data. The inputs points. The number of samples. Computes the set of all distances between all points in a random subset of the data. The inputs points. The number of samples. The distance function to be used in the Gaussian kernel. Default is . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The inner kernel. The data set. A Gaussian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The inner kernel. The data set. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The inner kernel. The data set. The number of random samples to analyze. A Gaussian kernel initialized with an appropriate sigma value. Estimates appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The inner kernel. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Generalized Histogram Intersection Kernel. The Generalized Histogram Intersection kernel is built based on the Histogram Intersection Kernel for image classification but applies in a much larger variety of contexts (Boughorbel, 2005). Constructs a new Generalized Histogram Intersection Kernel. Generalized Histogram Intersection Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Hyperbolic Secant Kernel. References: Chaudhuri et al, A Comparative Study of Kernels for the Multi-class Support Vector Machine, 2008. 
Available on: http://www.computer.org/portal/web/csdl/doi/10.1109/ICNC.2008.803 Constructs a new Hyperbolic Secant Kernel Gets or sets the gamma value for the kernel. Hyperbolic Secant Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Hyperbolic Secant Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Input space distance interface for kernel functions. Kernels which implement this interface can be used to solve the pre-image problem in Kernel Principal Component Analysis and other methods based on Multi-Dimensional Scaling. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Kernel function interface. In Machine Learning and statistics, a Kernel is a function that returns the value of the dot product between the images of the two arguments. k(x,y) = ‹S(x),S(y)› References: http://www.support-vector.net/icml-tutorial.pdf Laplacian Kernel. Constructs a new Laplacian Kernel Constructs a new Laplacian Kernel The sigma slope value. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5/sigma^2). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5/sigma^2). Laplacian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Laplacian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Laplacian Kernel function. Distance z in input space. Dot product in feature (kernel) space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the distance in input space given a distance computed in feature space. Distance in feature space. Distance in input space. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. A Laplacian kernel initialized with an appropriate sigma value. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Laplacian kernel initialized with an appropriate sigma value. Creates a new object that is a copy of the current instance.
A new object that is a copy of this instance. Linear Kernel. Constructs a new Linear kernel. A constant intercept term. Default is 0. Gets or sets the kernel's intercept term. Default is 0. Linear kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Linear kernel function. Distance z in input space. Dot product in feature (kernel) space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Elementwise addition of a and b, storing in result. The first vector to add. The second vector to add. An array to store the result. The same vector passed as result. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Elementwise multiplication of scalar a and vector b, storing in result. The scalar to be multiplied. The vector to be multiplied. An array to store the result. Compress a set of support vectors and weights into a single parameter vector. The weights associated with each support vector. The support vectors. The constant (bias) value. A single parameter vector. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Projects a set of input points into feature space. The input points to be projected into feature space. The feature space representation of the given points. Projects an input point into feature space. The input point to be projected into feature space. The parameter of the kernel. The feature space representation of the given point. The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Elementwise addition of a and b, storing in result. The first vector to add. The second vector to add. An array to store the result. Elementwise multiplication of scalar a and vector b, storing in result. The scalar to be multiplied. The vector to be multiplied. An array to store the result. Compress a set of support vectors and weights into a single parameter vector. The weights associated with each support vector. The support vectors. The constant (bias) value. A single parameter vector. Gets the number of parameters in the input vectors. Gets the number of parameters in the input vectors. Creates an input vector from the given double values. Creates an input vector with the given dimensions. Elementwise multiplication of vector a and vector b, accumulating in result. The vector to be multiplied. The vector to be multiplied. An array to store the result. Elementwise multiplication of vector a and vector b, accumulating in result. The vector to be multiplied. The vector to be multiplied. An array to store the result. Converts the input vectors to a double-precision representation. Converts the input vectors to a double-precision representation. Logarithm Kernel. 
The Log kernel seems to be particularly interesting for images, but is only conditionally positive definite. Constructs a new Log Kernel The kernel's degree. Constructs a new Log Kernel The kernel's degree. Gets or sets the kernel's degree. Log Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Log Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Multiquadric Kernel. The multiquadric kernel is only conditionally positive-definite. Gets or sets the kernel's constant value. Constructs a new Multiquadric Kernel. The constant term theta. Constructs a new Multiquadric Kernel. Multiquadric Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Multiquadric Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Polynomial Kernel. Constructs a new Polynomial kernel of a given degree. The polynomial degree for this kernel. The polynomial constant for this kernel. Default is 1. Constructs a new Polynomial kernel of a given degree. The polynomial degree for this kernel. Gets or sets the kernel's polynomial degree. Gets or sets the kernel's polynomial constant term. Polynomial kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Polynomial kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Polynomial kernel function. Distance z in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Distance between x and y in input space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Projects a set of input points into feature space. The input points to be projected into feature space. The feature space representation of the given points. Projects an input point into feature space. The input point to be projected into feature space. The parameter of the kernel. The parameter of the kernel. The feature space representation of the given point. Power Kernel, also known as the (Unrectified) Triangular Kernel. The Power kernel is also known as the (unrectified) triangular kernel. It is an example of scale-invariant kernel (Sahbi and Fleuret, 2004) and is also only conditionally positive definite. Gets or sets the kernel's degree. Constructs a new Power Kernel. The kernel's degree. Constructs a new Power Kernel. The kernel's degree. Power Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Power Kernel Function Distance z in input space. 
Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Precomputed Gram Matrix Kernel. The following example shows how to learn a multi-class SVM using a precomputed kernel matrix, obtained from a Polynomial kernel. The following example shows how to learn a simple binary SVM using a precomputed kernel matrix obtained from a Gaussian kernel. Constructs a new Precomputed Matrix Kernel. Constructs a new Precomputed Matrix Kernel. Gets or sets the precomputed Gram matrix for this kernel. Gets or sets the precomputed Gram matrix for this kernel. Gets a vector of indices that can be fed as the inputs of a learning algorithm. The learning algorithm will then use the indices to refer to each element in the precomputed kernel matrix. Gets the dimension of the basis spanned by the initial training vectors. Gets the current number of training samples. Kernel function. An array containing a first element with the index for input vector x. An array containing a first element with the index for input vector y. Dot product in feature (kernel) space. The kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Rational Quadratic Kernel. The Rational Quadratic kernel is less computationally intensive than the Gaussian kernel and can be used as an alternative when using the Gaussian becomes too expensive. Gets or sets the kernel's constant term. Constructs a new Rational Quadratic Kernel. The constant term theta. Rational Quadratic Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Rational Quadratic Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Sigmoid Kernel. Sigmoid kernel of the form k(x,z) = tanh(a * x'z + c). Sigmoid kernels are only conditionally positive definite for some values of a and c, and therefore may not induce a reproducing kernel Hilbert space. However, they have been successfully used in practice (Schölkopf and Smola, 2002). Estimates suitable values for the sigmoid kernel by exploring the response area of the tanh function. An input data set. A Sigmoid kernel initialized with appropriate values. Estimates suitable values for the sigmoid kernel by exploring the response area of the tanh function. An input data set. The size of the subset to use in the estimation. The interquartile range for the data. A Sigmoid kernel initialized with appropriate values. Computes the set of all distances between all points in a random subset of the data. The input points. The number of samples. Constructs a Sigmoid kernel. Constructs a Sigmoid kernel. Alpha parameter. Typically should be set to a small positive value. Default is 0.01. Constant parameter. Typically should be set to a negative value. Default is -e (where e is Euler's number). Gets or sets the kernel's alpha parameter. In a sigmoid kernel, alpha is an inner product coefficient for the hyperbolic tangent function. Gets or sets the kernel's constant term. Sigmoid kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Sigmoid kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. 
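As a quick concrete check of the sigmoid kernel form k(x, z) = tanh(a * x'z + c) described above, the following is a minimal standalone sketch (not the library's own class) using the documented defaults a = 0.01 and c = -e:

using System;

static class SigmoidKernelSketch
{
    // Sketch of the sigmoid kernel k(x, z) = tanh(alpha * x'z + constant).
    // The defaults follow the values documented above; this is an illustration only.
    public static double Sigmoid(double[] x, double[] z, double alpha = 0.01, double constant = -Math.E)
    {
        double dot = 0;
        for (int i = 0; i < x.Length; i++)
            dot += x[i] * z[i];            // inner product x'z in input space

        return Math.Tanh(alpha * dot + constant);
    }
}

Note that for some values of a and c the resulting Gram matrix may not be positive definite, which is the conditional positive-definiteness caveat mentioned above.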
Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Sigmoid kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Sparse Cauchy Kernel. The Cauchy kernel comes from the Cauchy distribution (Basak, 2008). It is a long-tailed kernel and can be used to give long-range influence and sensitivity over the high-dimensional space. Gets or sets the kernel's sigma value. Constructs a new Sparse Cauchy Kernel. The value for sigma. Cauchy Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Sparse Gaussian Kernel. The Gaussian kernel requires tuning for the proper value of σ. Different approaches to this problem include the use of brute force (i.e. using a grid-search algorithm) or a gradient ascent optimization. For an example on how to create a sparse kernel, please see the page. References: P. F. Evangelista, M. J. Embrechts, and B. K. Szymanski. Some Properties of the Gaussian Kernel for One Class Learning. Available on: http://www.cs.rpi.edu/~szymansk/papers/icann07.pdf Constructs a new Sparse Gaussian Kernel. Constructs a new Sparse Gaussian Kernel. The standard deviation for the Gaussian distribution. Default is 1. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5 / sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5 / sigma²). Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Distance between x and y in input space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Gaussian kernel initialized with an appropriate sigma value. Sparse Laplacian Kernel. Constructs a new Laplacian Kernel. Constructs a new Laplacian Kernel. The sigma slope value. Gets or sets the sigma value for the kernel. When setting sigma, gamma gets updated accordingly (gamma = 0.5 / sigma²). Gets or sets the gamma value for the kernel. When setting gamma, sigma gets updated accordingly (gamma = 0.5 / sigma²). Laplacian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Distance between x and y in input space. Estimate appropriate values for sigma given a data set. This method uses a simple heuristic to obtain appropriate values for sigma in a radial basis function kernel. 
The heuristic is shown by Caputo, Sim, Furesjo and Smola, "Appearance-based object recognition using SVMs: which kernel should I use?", 2002. The data set. The number of random samples to analyze. The range of suitable values for sigma. A Laplacian kernel initialized with an appropriate sigma value. Sparse Linear Kernel. The Sparse Linear kernel accepts inputs in the libsvm sparse format. The following example shows how to teach a kernel support vector machine using the linear sparse kernel to perform the AND classification task using sparse vectors.

// Example AND problem
double[][] inputs =
{
    new double[] { },          // 0 and 0: 0 (label -1)
    new double[] { 2,1 },      // 0 and 1: 0 (label -1)
    new double[] { 1,1 },      // 1 and 0: 0 (label -1)
    new double[] { 1,1, 2,1 }  // 1 and 1: 1 (label +1)
};

// Dichotomy SVM outputs should be given as [-1;+1]
int[] labels =
{
    // 0,  0,  0, 1
      -1, -1, -1, 1
};

// Create a Support Vector Machine for the given inputs
// (sparse machines should use 0 as the number of inputs)
var machine = new KernelSupportVectorMachine(new SparseLinear(), inputs: 0);

// Instantiate a new learning algorithm for SVMs
var smo = new SequentialMinimalOptimization(machine, inputs, labels);

// Set up the learning algorithm
smo.Complexity = 100000.0;

// Run
double error = smo.Run(); // should be zero

double[] predicted = inputs.Apply(machine.Compute).Sign();

// Outputs should be -1, -1, -1, +1

Constructs a new Linear kernel. A constant intercept term. Default is 0. Constructs a new Linear Kernel. Gets or sets the kernel's intercept term. Sparse Linear kernel function. Sparse vector x in input space. Sparse vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the product of two vectors given in sparse representation. The first vector x. The second vector y. The inner product x * y between the given vectors. Computes the squared Euclidean distance of two vectors given in sparse representation. The first vector x. The second vector y. The squared Euclidean distance d² = |x - y|² between the given vectors. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Sparse Polynomial Kernel. Constructs a new Sparse Polynomial kernel of a given degree. The polynomial degree for this kernel. The polynomial constant for this kernel. Default is 1. Constructs a new Polynomial kernel of a given degree. The polynomial degree for this kernel. Gets or sets the kernel's polynomial degree. Gets or sets the kernel's polynomial constant term. Polynomial kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Sparse Sigmoid Kernel. Sigmoid kernels are not positive definite and therefore do not induce a reproducing kernel Hilbert space. However, they have been successfully used in practice (Schölkopf and Smola, 2002). Constructs a Sparse Sigmoid kernel. Alpha parameter. Constant parameter. Constructs a Sparse Sigmoid kernel. Gets or sets the kernel's gamma parameter. In a sigmoid kernel, gamma is an inner product coefficient for the hyperbolic tangent function. Gets or sets the kernel's constant term. Sigmoid kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Spherical Kernel. The spherical kernel comes from a statistics perspective. 
It is an example of an isotropic stationary kernel and is positive definite in R^3. Gets or sets the kernel's sigma value. Constructs a new Spherical Kernel. Value for sigma. Spherical Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Spherical Kernel Function Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Infinite Spline Kernel function. The Spline kernel is given as a piece-wise cubic polynomial, as derived in the works by Gunn (1998). Constructs a new Spline Kernel. Spline Kernel Function Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Taylor approximation for the explicit Gaussian kernel. References: Lin, Keng-Pei, and Ming-Syan Chen. "Efficient kernel approximation for large-scale support vector machine classification." Proceedings of the Eleventh SIAM International Conference on Data Mining. 2011. Available on: http://epubs.siam.org/doi/pdf/10.1137/1.9781611972818.19 Gets or sets the approximation degree for this kernel. Default is 1024. Gets or sets the Gaussian kernel being approximated by this Taylor expansion. Constructs a new kernel with the given sigma. The kernel's sigma parameter. The Gaussian approximation degree. Default is 1024. Constructs a new kernel with the given sigma. The original Gaussian kernel to be approximated. The Gaussian approximation degree. Default is 1024. Gaussian Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Projects an input point into feature space. The input point to be projected into feature space. The feature space representation of the given point. Projects a set of input points into feature space. The input points to be projected into feature space. The feature space representation of the given points. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Elementwise multiplication of scalar a and vector b, storing in result. The scalar to be multiplied. The vector to be multiplied. An array to store the result. Compress a set of support vectors and weights into a single parameter vector. The weights associated with each support vector. The support vectors. The constant (bias) value. A single parameter vector. Computes the squared distance in input space between two points given in feature space. Vector x in feature (kernel) space. Vector y in feature (kernel) space. Squared distance between x and y in input space. Computes the distance d(x,y) between points and . The first point x. The second point y. A double-precision value representing the distance d(x,y) between and according to the distance function implemented by this class. Elementwise addition of a and b, storing in result. The first vector to add. The second vector to add. An array to store the result. The same vector passed as result. Gets the number of parameters in the input vectors. Creates an input vector from the given double values. Elementwise multiplication of vector a and vector b, accumulating in result. The vector to be multiplied. The vector to be multiplied. An array to store the result. Converts the input vectors to a double-precision representation. Tensor Product combination of Kernels. 
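The Taylor approximation described above can be sketched without the library API: the Gaussian kernel k(x, y) = exp(-γ‖x − y‖²) factors as exp(-γ‖x‖²) · exp(-γ‖y‖²) · exp(2γ x·y), and the last factor can be truncated to a finite Taylor series of the chosen degree. The following standalone sketch (illustrative names, not the library class) compares the exact value with the truncated series:

using System;

static class TaylorGaussianSketch
{
    // Exact Gaussian kernel value, for comparison.
    public static double Exact(double[] x, double[] y, double gamma)
    {
        double d2 = 0;
        for (int i = 0; i < x.Length; i++)
            d2 += (x[i] - y[i]) * (x[i] - y[i]);
        return Math.Exp(-gamma * d2);
    }

    // Approximation: exp(-γ|x|²) · exp(-γ|y|²) · Σ_{k=0..degree} (2γ x·y)^k / k!
    public static double Approximate(double[] x, double[] y, double gamma, int degree)
    {
        double xx = 0, yy = 0, xy = 0;
        for (int i = 0; i < x.Length; i++)
        {
            xx += x[i] * x[i];
            yy += y[i] * y[i];
            xy += x[i] * y[i];
        }

        double series = 0, term = 1;        // term = (2γ x·y)^k / k!
        for (int k = 0; k <= degree; k++)
        {
            series += term;
            term *= 2 * gamma * xy / (k + 1);
        }

        return Math.Exp(-gamma * xx) * Math.Exp(-gamma * yy) * series;
    }
}

The explicit feature-space projection mentioned above expands the same series into a fixed-length vector, so that the kernel value becomes an ordinary dot product between projected points; the sketch only checks the numerical approximation.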
Gets or sets the inner kernels used in this tensor kernel. Constructs a new tensor product kernel. Kernels to combine. Tensor Product Kernel Combination function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Thin Spline Plate Kernel. Thin plate splines (TPS) are a spline-based technique for data interpolation and smoothing. Gets or sets the sigma constant for this kernel. Constructs a new ThinSplinePlate Kernel. The value for sigma. Thin Spline Plate Kernel Function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Generalized T-Student Kernel. The Generalized T-Student Kernel is a Mercer Kernel and thus forms a positive semi-definite Kernel matrix (Boughorbel, 2004). It has a similar form to the Inverse Multiquadric Kernel. Gets or sets the degree of this kernel. Constructs a new Generalized T-Student Kernel. The kernel's degree. Generalized T-Student Kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Generalized T-Student Kernel function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Wave Kernel. The Wave kernel is symmetric positive semi-definite (Huang, 2008). Constructs a new Wave Kernel. Value for sigma. Gets or sets the kernel's sigma value. Wave Kernel Function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Wave Kernel Function. Distance z in input space. Dot product in feature (kernel) space. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Wavelet Kernel. In Wavelet analysis theory, one of the common goals is to express or approximate a signal or function using a family of functions generated by dilations and translations of a function called the mother wavelet. The Wavelet kernel uses a mother wavelet function together with dilation and translation constants to produce such representations and build an inner product which can be used by kernel methods. The default wavelet used by this class is the mother function h(x) = cos(1.75x)*exp(-x²/2). References: Li Zhang, Weida Zhou, and Licheng Jiao; Wavelet Support Vector Machine. IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics, Vol. 34, No. 1, February 2004. Gets or sets the Mother wavelet for this kernel. Gets or sets the wavelet dilation for this kernel. Gets or sets the wavelet translation for this kernel. Gets or sets whether this is an invariant Wavelet kernel. Constructs a new Wavelet kernel. Constructs a new Wavelet kernel. Constructs a new Wavelet kernel. Constructs a new Wavelet kernel. Constructs a new Wavelet kernel. Wavelet kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Wavelet kernel function. Vector x in input space. Vector y in input space. Dot product in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. Computes the squared distance in feature space between two points given in input space. Vector x in input space. Vector y in input space. Squared distance between x and y in feature (kernel) space. 
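To make the wavelet kernel above concrete, the following standalone sketch (an illustration, not the library class) evaluates the translation-invariant form k(x, y) = Π h((xᵢ − yᵢ) / a) with the documented mother wavelet h(x) = cos(1.75x)·exp(−x²/2) and a dilation constant a:

using System;

static class WaveletKernelSketch
{
    // Mother wavelet documented above: h(x) = cos(1.75 x) * exp(-x² / 2).
    static double Mother(double x) => Math.Cos(1.75 * x) * Math.Exp(-x * x / 2);

    // Translation-invariant wavelet kernel: k(x, y) = prod_i h((x_i - y_i) / a).
    public static double Wavelet(double[] x, double[] y, double dilation = 1.0)
    {
        double product = 1.0;
        for (int i = 0; i < x.Length; i++)
            product *= Mother((x[i] - y[i]) / dilation);
        return product;
    }
}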
Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Absolute link function. Link functions can be used in many models, such as in and Support Vector Machines. Linear scaling coefficient b (slope). Creates a new Absolute link function. The beta value. Creates a new Absolute link function. The Absolute link function. An input value. The transformed input value. The absolute link function is given by f(x) = abs(x) / b. The mean function. A transformed value. The reverse transformed value. The inverse absolute link function is given by g(x) = B * abs(x). The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. The first derivative of the absolute link function is given by f'(x) = B. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. The first derivative of the absolute link function in terms of y = f(x) is given by f'(y) = B. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Cauchy link function. The Cauchy link function is associated with the Cauchy distribution. Link functions can be used in many models, such as in and Support Vector Machines. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Cauchit link function. The beta value. Default is 1/pi. The constant value. Default is 0.5. Creates a new Cauchit link function. The Cauchit link function. An input value. The transformed input value. The Cauchit link function is given by f(x) = tan((x - A) / B). The Cauchit mean (activation) function. A transformed value. The reverse transformed value. The inverse Cauchit link function is given by g(x) = tan(x) * B + A. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. The first derivative of the Cauchit link function is given by f'(x) = B / (x * x + 1). First derivative of the mean function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. The first derivative of the Cauchit link function in terms of y = f(x) is given by f'(y) = B / (tan((y - A) / B)² + 1). Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Threshold link function. Threshold coefficient b. Creates a new Threshold link function. The threshold value. Creates a new Threshold link function. The Threshold link function. An input value. The transformed input value. The mean function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Sin link function. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Sin link function. The beta value. The constant value. 
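Restating the Absolute link function's documented formulas as a standalone sketch can help when sanity-checking a custom link implementation; the class below is illustrative only and simply mirrors the formulas given above (f(x) = abs(x) / b, g(y) = b · abs(y), and the documented derivative b):

using System;

// Standalone sketch of the Absolute link function described above; not the library class.
sealed class AbsoluteLinkSketch
{
    public double B { get; }                                // linear scaling coefficient b (slope)

    public AbsoluteLinkSketch(double beta = 1.0) => B = beta;

    public double Function(double x) => Math.Abs(x) / B;    // link:  f(x) = |x| / b
    public double Inverse(double y) => B * Math.Abs(y);     // mean:  g(y) = b * |y|
    public double Derivative(double x) => B;                // first derivative, as documented above
}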
Creates a new Sin link function. The Sin link function. An input value. The transformed input value. The Sin mean (activation) function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Natural logarithm of natural logarithm link function. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Log-Log link function. The beta value. The constant value. Creates a new Log-Log link function. Creates a Complementary Log-Log link function. The Log-log link function. An input value. The transformed input value. The Log-log mean (activation) function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Natural logarithm link function. The natural logarithm link function is associated with the Poisson distribution. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Log link function. The beta value. Default is 1. The constant value. Default is 0. Creates a new Log link function. The link function. An input value. The transformed input value. The mean (activation) function. A transformed value. The reverse transformed value. The mean (activation) function. A transformed value. The reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Inverse squared link function. The inverse squared link function is associated with the Inverse Gaussian distribution. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Inverse squared Link function. The beta value. The constant value. Creates a new Inverse squared Link function. The Inverse Squared link function. An input value. The transformed input value. The Inverse Squared mean (activation) function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Probit link function. Creates a new Probit link function. The Probit link function. An input value. The transformed input value. 
The Probit link function is given by f(x) = Phi^-1(x), in which Phi^-1 is the inverse Normal (Gaussian) cumulative distribution function. The Probit mean (activation) function. A transformed value. The reverse transformed value. The inverse Probit link function is given by g(x) = Phi(x), in which Phi is the Normal (Gaussian) cumulative distribution function. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. The first derivative of the Probit link function is given by f'(x) = exp(c - (Phi^-1(x))² * 0.5) in which c = -log(sqrt(2*π)) and Phi^-1 is the inverse Normal (Gaussian) cumulative distribution function. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. The first derivative of the Probit link function in terms of y = f(x) is given by f'(y) = exp(c - x * x * 0.5) in which c = -log(sqrt(2*π)) and x = Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Link function interface. The link function provides the relationship between the linear predictor and the mean of the distribution function. There are many commonly used link functions, and their choice can be somewhat arbitrary. It can be convenient to match the domain of the link function to the range of the distribution function's mean. References: Wikipedia contributors. "Generalized linear model." Wikipedia, The Free Encyclopedia. The link function. An input value. The transformed input value. The inverse of the link function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Identity link function. The identity link function is associated with the Normal distribution. Link functions can be used in many models, such as in and Support Vector Machines. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Identity link function. The variance value. The mean value. Creates a new Identity link function. The Identity link function. An input value. The transformed input value. The Identity link function is given by f(x) = (x - A) / B. The mean function. A transformed value. The reverse transformed value. The inverse Identity link function is given by g(x) = B * x + A. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. The first derivative of the identity link function is given by f'(x) = B. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. The first derivative of the identity link function in terms of y = f(x) is given by f'(y) = B. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Logit link function. The Logit link function is associated with the Binomial and Multinomial distributions. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). 
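The interface description above — a link function ties the linear predictor to the mean of the distribution — can be illustrated with a tiny generalized-linear-model style prediction. The sketch below uses the logistic (inverse-logit) mean function and illustrative names only; it is not the library's implementation:

using System;

static class LinkFunctionSketch
{
    // Linear predictor: eta = w · x + intercept.
    static double LinearPredictor(double[] w, double[] x, double intercept)
    {
        double eta = intercept;
        for (int i = 0; i < w.Length; i++)
            eta += w[i] * x[i];
        return eta;
    }

    // The mean is obtained by applying the inverse link to the linear predictor.
    // Here the inverse-logit (logistic) function maps eta onto a probability in (0, 1).
    static double InverseLogit(double eta) => 1.0 / (1.0 + Math.Exp(-eta));

    public static double PredictMean(double[] w, double[] x, double intercept)
        => InverseLogit(LinearPredictor(w, x, intercept));
}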
Creates a new Logit link function. The beta value. Default is 1. The constant value. Default is 0. Initializes a new instance of the class. The Logit link function. An input value. The transformed input value. The Logit link function is given by f(x) = (Math.Log(x / (1.0 - x)) - A) / B. The Logit mean (activation) function. A transformed value. The reverse transformed value. The inverse Logit link function is given by g(x) = 1.0 / (1.0 + Math.Exp(-z)), in which z = B * x + A. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. The first derivative of the Logit link function is given by f'(x) = y * (1.0 - y) where y = f(x) is the Logit function. First derivative of the mean function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. The first derivative of the Logit link function in terms of y = f(x) is given by y * (1.0 - y). Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Inverse link function. The inverse link function is associated with the Exponential and Gamma distributions. Linear scaling coefficient a (intercept). Linear scaling coefficient b (slope). Creates a new Inverse link function. The alpha value. The constant value. Creates a new Inverse link function. The Inverse link function. An input value. The transformed input value. The Inverse mean (activation) function. A transformed value. The reverse transformed value. The logarithm of the inverse of the link function. A transformed value. The log of the reverse transformed value. First derivative of the function. The input value. The first derivative of the input value. First derivative of the function expressed in terms of its output. The reverse transformed value. The first derivative of the input value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Set of statistics functions. This class represents a collection of common functions used in statistics. Every Matrix function assumes data is organized in a table-like model, where columns represent variables and rows represent observations of each variable. Creates Tukey's box plot inner fence. Creates Tukey's box plot outer fence. Gets the rank of a sample, often used with order statistics. Gets the rank of a sample, often used with order statistics. Gets the number of ties and distinct elements in a rank vector. Gets the number of ties and distinct elements in a rank vector. Generates a random matrix. The size of the square matrix. The minimum value for a diagonal element. The maximum value for a diagonal element. A square, positive-definite matrix which can be interpreted as a covariance matrix. Computes the kernel distance for a kernel function even if it doesn't implement the interface. Can be used to check the proper implementation of the distance function. The kernel function whose distance needs to be evaluated. An input point x given in input space. An input point y given in input space. The distance between and in kernel (feature) space. Generates the Standard Scores, also known as Z-Scores, from the given data. A multi-dimensional array containing the matrix values. The Z-Scores for the matrix. Generates the Standard Scores, also known as Z-Scores, from the given data. A multi-dimensional array containing the matrix values. 
The mean value of the given values, if already known. The values' standard deviation vector, if already known. The Z-Scores for the matrix. Generates the Standard Scores, also known as Z-Scores, from the given data. A multi-dimensional array containing the matrix values. The Z-Scores for the matrix. Generates the Standard Scores, also known as Z-Scores, from the given data. A multi-dimensional array containing the matrix values. The mean value of the given values, if already known. The values' standard deviation vector, if already known. The Z-Scores for the matrix. Centers an observation, subtracting the empirical mean from each element in the observation vector. An array of double precision floating-point numbers. The destination array where the result of this operation should be stored. Centers an observation, subtracting the empirical mean from each element in the observation vector. An array of double precision floating-point numbers. The mean of the , if already known. The destination array where the result of this operation should be stored. Centers column data, subtracting the empirical mean from each variable. A matrix where each column represents a variable and each row represents an observation. True to perform the operation in place, altering the original input matrix. Centers column data, subtracting the empirical mean from each variable. A matrix where each column represents a variable and each row represents an observation. The mean value of the given values, if already known. True to perform the operation in place, altering the original input matrix. Centers column data, subtracting the empirical mean from each variable. A matrix where each column represents a variable and each row represents an observation. True to perform the operation in place, altering the original input matrix. Centers column data, subtracting the empirical mean from each variable. A matrix where each column represents a variable and each row represents an observation. The mean value of the given values, if already known. True to perform the operation in place, altering the original input matrix. Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. An array of double precision floating-point numbers. True to perform the operation in place, altering the original input matrix. Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. An array of double precision floating-point numbers. The standard deviation of the given , if already known. True to perform the operation in place, altering the original input matrix. Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. A matrix where each column represents a variable and each row represents an observation. True to perform the operation in place, altering the original input matrix. Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. A matrix where each column represents a variable and each row represents an observation. The values' standard deviation vector, if already known. The minimum value that will be taken as the standard deviation in case the deviation is too close to zero. True to perform the operation in place, altering the original input matrix. 
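A standalone sketch of the z-score computation described above, applied column-wise to a data matrix in which columns are variables and rows are observations (this mirrors the definition rather than the library code):

using System;

static class StandardizeSketch
{
    // Returns the z-scores z = (x - mean) / stdDev, computed per column.
    public static double[][] ZScores(double[][] data)
    {
        int rows = data.Length, cols = data[0].Length;
        var result = new double[rows][];
        for (int i = 0; i < rows; i++)
            result[i] = new double[cols];

        for (int j = 0; j < cols; j++)
        {
            double mean = 0;
            for (int i = 0; i < rows; i++)
                mean += data[i][j];
            mean /= rows;

            double variance = 0;
            for (int i = 0; i < rows; i++)
                variance += (data[i][j] - mean) * (data[i][j] - mean);
            double stdDev = Math.Sqrt(variance / (rows - 1));   // sample standard deviation

            for (int i = 0; i < rows; i++)
                result[i][j] = (data[i][j] - mean) / stdDev;    // center, then scale
        }

        return result;
    }
}

Centering alone subtracts the column mean; standardizing alone divides by the column standard deviation; the z-scores above do both.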
Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. A matrix where each column represents a variable and each row represents an observation. True to perform the operation in place, altering the original input matrix. Standardizes column data, removing the empirical standard deviation from each variable. This method does not remove the empirical mean prior to execution. A matrix where each column represents a variable and each row represents an observation. The minimum value that will be taken as the standard deviation in case the deviation is too close to zero. The values' standard deviation vector, if already known. True to perform the operation in place, altering the original input matrix. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Gets the coefficient of determination, also known as the R-Squared (R²). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R^2 coefficient of determination is a statistical measure of how well the regression approximates the real data points. An R^2 of 1.0 indicates that the regression perfectly fits the data. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Obsolete. Please use instead. Computes the whitening transform for the given data, making its covariance matrix equal to the identity matrix. A matrix where each column represents a variable and each row represents an observation. The base matrix used in the transformation. The transformed source data (which now has unit variance). Computes the whitening transform for the given data, making its covariance matrix equal to the identity matrix. A matrix where each column represents a variable and each row represents an observation. The base matrix used in the transformation. The transformed source data (which now has unit variance). Creates a new distribution that has been fit to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Creates a new distribution that has been fit to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Creates a new distribution that has been fit to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. 
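The distribution-fitting helpers described here follow the usual pattern of estimating a distribution's parameters from a set of samples. A minimal hedged sketch, assuming the univariate NormalDistribution class and its Fit method from the framework's distributions namespace behave as documented elsewhere:

using Accord.Statistics.Distributions.Univariate;

class FitSketch
{
    static void Main()
    {
        double[] observations = { 1.2, 0.7, 2.1, 1.9, 0.4, 1.5 };

        // Estimate the parameters of a Normal distribution from the samples.
        var normal = new NormalDistribution();
        normal.Fit(observations);

        System.Console.WriteLine(normal.Mean);
        System.Console.WriteLine(normal.StandardDeviation);
    }
}

The weighted overloads described above follow the same pattern, with an additional weight vector giving the importance of each sample.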
Creates a new distribution that has been fit to a given set of observations. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Creates a new distribution that has been fit to a given set of observations. The distribution whose parameters should be fitted to the samples. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Creates a new distribution that has been fit to a given set of observations. The distribution whose parameters should be fitted to the samples. The array of observations to fit the model against. The array elements can be either of type double (for univariate data) or type double[] (for multivariate data). The weight vector containing the weight for each of the samples. Optional arguments which may be used during fitting, such as regularization constants and additional parameters. Contains statistical models with direct applications in machine learning, such as Hidden Markov Models, Conditional Random Fields, Hidden Conditional Random Fields and linear and logistic regressions. The main algorithms and techniques available on this namespaces are certainly the hidden Markov models. The Accord.NET Framework contains one of the most popular and well-tested offerings for creating, training and validating Markov models using either discrete observations or any arbitrary discrete, continuous or mixed probability distributions to model the observations. This namespace also brings Conditional Random Fields, that alongside the Markov models can be used to build sequence classifiers, perform gesture recognition, and can even be combined with neural networks to create hybrid models. Other models include regression and survival models. The namespace class diagram is shown below. Please note that class diagrams for each of the inner namespaces are also available within their own documentation pages. Contains classes related to Conditional Random Fields, Hidden Conditional Random Fields and their learning algorithms. The namespace class diagram is shown below. Please note that class diagrams for each of the inner namespaces are also available within their own documentation pages. Utility methods to assist in the creating of s. Creates a new from the given . The classifier. A that implements exactly the same model as the given . Creates a new from the given . The classifier. A that implements exactly the same model as the given . Creates a new from the given . The classifier. A that implements exactly the same model as the given . Creates a new from the given . The classifier. A that implements exactly the same model as the given . Creates a new from the given . The classifier. A that implements exactly the same model as the given . Creates a new from the given . The classifier. A that implements exactly the same model as the given . Contains learning algorithms for CRFs and HCRFs, such as Conjugate Gradient, L-BFGS and RProp-based learning. The namespace class diagram is shown below. Abstract base class for hidden Conditional Random Fields algorithms. 
Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the potential function to be used if this learning algorithm needs to create a new . Gets or sets the model being trained. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Inheritors should implement the actual learning algorithm in this method. Base class for Hidden Conditional Random Fields learning algorithms based on gradient optimization algorithms. Gets the optimization algorithm being used. Gets or sets the amount of the parameter weights which should be included in the objective function. Default is 0 (do not include regularization). Gets or sets the tolerance value used to determine whether the algorithm has converged. The tolerance. Gets or sets the maximum number of iterations performed by the learning algorithm. The maximum iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets or sets the parallelization options for this algorithm. The parallel options. Constructs a new L-BFGS learning algorithm. Inheritors of this class should create the optimization algorithm in this method, using the current and settings. Runs the learning algorithm. Online learning is not supported. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. Online learning is not supported. Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources. Releases unmanaged resources and performs other cleanup operations before the is reclaimed by garbage collection. Releases unmanaged and - optionally - managed resources true to release both managed and unmanaged resources; false to release only unmanaged resources. Linear Gradient calculator class for Hidden Conditional Random Fields. The type of the observations being modeled. Gets or sets the inputs to be used in the next call to the Objective or Gradient functions. Gets or sets the outputs to be used in the next call to the Objective or Gradient functions. Gets or sets the current parameter vector for the model being learned. Gets the error computed in the last call to the gradient or objective functions. Gets or sets the amount of the parameter weights which should be included in the objective function. Default is 0 (do not include regularization). Gets the model being trained. Initializes a new instance of the class. Initializes a new instance of the class. The model to be trained. Computes the gradient (vector of derivatives) vector for the cost function, which may be used to guide optimization. The parameter vector lambda to use in the model. The inputs to compute the cost function. The respective outputs to compute the cost function. The value of the gradient vector for the given parameters. Computes the gradient (vector of derivatives) vector for the cost function, which may be used to guide optimization. The parameter vector lambda to use in the model. The inputs to compute the cost function. The respective outputs to compute the cost function. 
The value of the gradient vector for the given parameters. Computes the gradient using the input/outputs stored in this object. This method is not thread-safe. The parameter vector lambda to use in the model. The value of the gradient vector for the given parameters. Computes the gradient using the input/outputs stored in this object. This method is thread-safe. The value of the gradient vector for the given parameters. Computes the objective (cost) function for the Hidden Conditional Random Field (negative log-likelihood). The parameter vector lambda to use in the model. The inputs to compute the cost function. The respective outputs to compute the cost function. The value of the objective function for the given parameters. Computes the objective (cost) function for the Hidden Conditional Random Field (negative log-likelihood) using the input/outputs stored in this object. The parameter vector lambda to use in the model. Computes the objective (cost) function for the Hidden Conditional Random Field (negative log-likelihood) using the input/outputs stored in this object. Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources. Releases unmanaged resources and performs other cleanup operations before the is reclaimed by garbage collection. Releases unmanaged and - optionally - managed resources. true to release both managed and unmanaged resources; false to release only unmanaged resources. Common interface for Hidden Conditional Random Fields learning algorithms. For an example on how to learn Hidden Conditional Random Fields, please see the Hidden Resilient Gradient Learning page. All learning algorithms can be utilized in a similar manner. Runs one iteration of the learning algorithm with the specified input training observation and corresponding output label. The training observations. The observation labels. The error in the last iteration. Runs one iteration of the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observations' labels. The error in the last iteration. Runs the learning algorithm with the specified input training observation and corresponding output label until convergence. The training observations. The observations' labels. The error in the last iteration. Common interface for Conditional Random Fields learning algorithms. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. Conjugate Gradient learning algorithm for Hidden Conditional Random Fields. For an example on how to learn Hidden Conditional Random Fields, please see the Hidden Resilient Gradient Learning page. All learning algorithms can be utilized in a similar manner. Please use HasConverged instead. Gets the current iteration number. The current iteration. Please use MaxIterations instead. Occurs when the current learning progress has changed. Constructs a new Conjugate Gradient learning algorithm. Inheritors of this class should create the optimization algorithm in this method, using the current and settings. ConjugateGradient. Quasi-Newton (L-BFGS) learning algorithm for Hidden Conditional Random Fields. The type of the observations. The next example shows how to use the learning algorithms in a real-world dataset, including training and testing in separate sets and evaluating its performance: Gets the current iteration number. The current iteration. 
Constructs a new L-BFGS learning algorithm. Constructs a new L-BFGS learning algorithm. Inheritors of this class should create the optimization algorithm in this method, using the current and settings. BoundedBroydenFletcherGoldfarbShanno. Quasi-Newton (L-BFGS) learning algorithm for Conditional Random Fields. Gets or sets the model being trained. Gets or sets the potential function to use if this learning algorithm needs to create a new . Gets or sets the tolerance value used to determine whether the algorithm has converged. The tolerance. Gets or sets the maximum number of iterations performed by the learning algorithm. The maximum iterations. Gets the current iteration number. The current iteration. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Constructs a new L-BFGS learning algorithm. Constructs a new L-BFGS learning algorithm. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Stochastic Gradient Descent learning algorithm for Hidden Conditional Random Fields. Gets or sets the learning rate to use as the gradient descent step size. Default value is 1e-1. Gets or sets the maximum change in the average log-likelihood after an iteration of the algorithm used to detect convergence. Gets or sets the maximum number of iterations performed by the learning algorithm. Please use MaxIterations instead. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets or sets a value indicating whether this should use stochastic gradient updates. true for stochastic updates; otherwise, false. Gets or sets the amount of the parameter weights which should be included in the objective function. Default is 0 (do not include regularization). Gets or sets the parallelization options for this algorithm. The parallel options. Occurs when the current learning progress has changed. Initializes a new instance of the class. Initializes a new instance of the class. The model to be trained. Resets the step size. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. The error in the last iteration. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. The error in the last iteration. Runs the learning algorithm. Runs one iteration of the learning algorithm with the specified input training observation and corresponding output label. The training observations. The observation's labels. The error in the last iteration. Raises the event. The ProgressChangedEventArgs instance containing the event data. Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources. Releases unmanaged resources and performs other cleanup operations before the is reclaimed by garbage collection. Releases unmanaged and - optionally - managed resources. true to release both managed and unmanaged resources; false to release only unmanaged resources. 
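Putting the learning classes above together, training a hidden conditional random field over discrete sequences might look roughly as follows. This is a hedged sketch: the type names, namespaces and constructor arguments (HiddenQuasiNewtonLearning, MarkovDiscreteFunction, Decide) are assumptions based on this documentation rather than verified signatures:

using Accord.Statistics.Models.Fields.Functions;
using Accord.Statistics.Models.Fields.Learning;

class HiddenCrfSketch
{
    static void Main()
    {
        // Toy sequences of discrete symbols (0 or 1) and their class labels.
        int[][] sequences =
        {
            new[] { 0, 0, 1, 1 },
            new[] { 1, 1, 0, 0 },
        };
        int[] labels = { 0, 1 };

        // Potential function over a discrete alphabet, as described above:
        // number of hidden states, number of symbols, number of output classes.
        var function = new MarkovDiscreteFunction(states: 2, symbols: 2, outputClasses: 2);

        // Quasi-Newton (L-BFGS) learning for the hidden conditional random field.
        var teacher = new HiddenQuasiNewtonLearning<int>()
        {
            Function = function
        };

        var model = teacher.Learn(sequences, labels);

        // Predict the class label of the first training sequence.
        int predicted = model.Decide(sequences[0]);
    }
}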
Resilient Gradient Learning. The type of the observations being modeled. The next example shows how to use the learning algorithms in a real-world dataset, including training and testing in separate sets and evaluating its performance: Gets or sets a value indicating whether this should use stochastic gradient updates. Default is true. true for stochastic updates; otherwise, false. Gets or sets the amount of the parameter weights which should be included in the objective function. Default is 0 (do not include regularization). Occurs when the current learning progress has changed. Gets or sets the maximum possible update step, also referred to as delta max. Default is 50. Gets or sets the minimum possible update step, also referred to as delta min. Default is 1e-6. Gets the decrease parameter, also referred to as eta minus. Default is 0.5. Gets the increase parameter, also referred to as eta plus. Default is 1.2. Gets or sets the maximum change in the average log-likelihood after an iteration of the algorithm used to detect convergence. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets or sets the parallelization options for this algorithm. The parallel options. Initializes a new instance of the class. Initializes a new instance of the class. Model to teach. Runs one iteration of the learning algorithm with the specified input training observation and corresponding output label. The training observations. The observation's labels. The error in the last iteration. Runs the learning algorithm. Runs the learning algorithm with the specified input training observations and corresponding output labels. The training observations. The observation's labels. The error in the last iteration. Runs one iteration of the learning algorithm with the specified input training observation and corresponding output label. The training observations. The observation's labels. The error in the last iteration. Raises the event. The ProgressChangedEventArgs instance containing the event data. Resets the current update steps using the given learning rate. Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources. Releases unmanaged resources and performs other cleanup operations before the is reclaimed by garbage collection. Releases unmanaged and - optionally - managed resources. true to release both managed and unmanaged resources; false to release only unmanaged resources. 
The index of the observation in the current state of the sequence. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Normal-density Markov Factor Potential (Clique Potential) function. Creates a new factor (clique) potential function. The owner . The number of states in this clique potential. The index of this factor potential in the . The index of the first class label feature in the 's parameter vector. The number of class label features in this factor. The index of the first edge feature in the 's parameter vector. The number of edge features in this factor. The index of the first state feature in the 's parameter vector. The number of state features in this factor. Computes the factor potential function for the given parameters. The previous state in a given sequence of states. The current state in a given sequence of states. The observation vector. The index of the observation in the current state of the sequence. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Multivariate Normal Markov Model Factor Potential (Clique Potential) function. Creates a new factor (clique) potential function. The owner . The number of states in this clique potential. The index of this factor potential in the . The number of dimensions for the multivariate observations. The index of the first class label feature in the 's parameter vector. The number of class label features in this factor. The index of the first edge feature in the 's parameter vector. The number of edge features in this factor. The index of the first state feature in the 's parameter vector. The number of state features in this factor. Computes the factor potential function for the given parameters. The previous state in a given sequence of states. The current state in a given sequence of states. The observation vector. The index of the observation in the current state of the sequence. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Discrete-density Markov Factor Potential (Clique Potential) function. Gets the number of symbols in the discrete alphabet used by this Markov model factor. Creates a new factor (clique) potential function. The owner . The number of states in this clique potential. The index of this factor potential in the . The number of symbols in the discrete alphabet. The index of the first class label feature in the 's parameter vector. The number of class label features in this factor. The index of the first edge feature in the 's parameter vector. The number of edge features in this factor. The index of the first state feature in the 's parameter vector. The number of state features in this factor. Computes the factor potential function for the given parameters. The previous state in a given sequence of states. The current state in a given sequence of states. The observation vector. The index of the observation in the current state of the sequence. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Potential function modeling Hidden Markov Classifiers. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. Creates a new object that is a copy of the current instance. 
A new object that is a copy of this instance. Potential function modeling Hidden Markov Models. Gets the total number of dimensions for this multivariate potential function. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A normal density hidden Markov. Constructs a new potential function modeling Hidden Markov Models. A normal density hidden Markov. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. A hidden Markov sequence classifier. True to include class features (priors), false otherwise. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Potential function modeling Hidden Markov Models. Gets the number of symbols assumed by this function. Constructs a new potential function modeling Hidden Markov Models. The number of states. The number of symbols. The number of output classes. Constructs a new potential function modeling Hidden Markov Models. The classifier model. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. The classifier model. True to include class features (priors), false otherwise. Constructs a new potential function modeling Hidden Markov Models. The number of states. The number of symbols. The random number generator to use when initializing weights. Constructs a new potential function modeling Hidden Markov Models. The hidden Markov model. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Factor Potential (Clique Potential) function. The type of the observations being modeled. Gets the to which this factor potential belongs. Gets the number of model states assumed by this function. Gets the index of this factor in the potential function. Gets the segment of the parameter vector which contains parameters respective to all features from this factor. Gets the segment of the parameter vector which contains parameters respective to the edge features. Gets the segment of the parameter vector which contains parameters respective to the state features. Gets the segment of the parameter vector which contains parameters respective to the output features. Creates a new factor (clique) potential function. The owner . The number of states in this clique potential. The index of this factor potential in the . The index of the first edge feature in the 's parameter vector. The number of edge features in this factor. The index of the first state feature in the 's parameter vector. 
The number of state features in this factor. The index of the first class feature in the 's parameter vector. The number of class features in this factor. Creates a new factor (clique) potential function. The owner . The number of states in this clique potential. The index of this factor potential in the . Computes the factor potential function for the given parameters. A state sequence. A sequence of observations. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Computes the factor potential function for the given parameters. A state sequence. A sequence of observations. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Computes the factor potential function for the given parameters. The previous state in a given sequence of states. The current state in a given sequence of states. The observation vector. The index of the observation in the current state of the sequence. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Returns an enumerator that iterates through all features in this factor potential function. An object that can be used to iterate through the collection. Returns an enumerator that iterates through all features in this factor potential function. An object that can be used to iterate through the collection. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Common interface for CRF potential functions. Gets the factor potential (also known as clique potential) functions composing this potential function. Gets the number of output classes assumed by this function. Gets or sets the set of weights for each feature function. The weights for each of the feature functions. Gets the feature functions composing this potential function. Gets the feature vector for a given input and sequence of states. Base implementation for potential functions. The type of the observations modeled. Gets the factor potential (also known as clique potential) functions composing this potential function. Gets the number of output classes assumed by this function. Gets or sets the set of weights for each feature function. The weights for each of the feature functions. Gets the feature functions composing this potential function. Computes the factor potential function for the given parameters. A state sequence. A sequence of observations. The output class label for the sequence. The value of the factor potential function evaluated for the given parameters. Linear-Chain Conditional Random Field (CRF). A conditional random field (CRF) is a type of discriminative undirected probabilistic graphical model. It is most often used for labeling or parsing sequential data, such as natural language text or biological sequences, and in computer vision. This implementation is currently experimental. Gets the number of states in this linear-chain Conditional Random Field. Gets the potential function encompassing all feature functions for this model. Initializes a new instance of the class. The number of states for the model. The potential function to be used by the model. Computes the partition function, also known as Z(x), for the specified observations. Computes the log of the partition function.
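As an illustration of the linear-chain CRF members described here (the constructor taking a state count and a potential function, the decoding methods and the log-likelihood), the sketch below builds a CRF whose potential function mirrors a small discrete hidden Markov model. All numbers are invented, the namespaces are assumed from the library's usual layout, and the exact decoding method name (Decide here) may vary between library versions.

    using Accord.Statistics.Models.Fields;
    using Accord.Statistics.Models.Fields.Functions;
    using Accord.Statistics.Models.Markov;

    // A small discrete HMM whose parameters seed the CRF's potential function.
    double[,] transitions = { { 0.7, 0.3 }, { 0.4, 0.6 } };
    double[,] emissions   = { { 0.1, 0.4, 0.5 }, { 0.6, 0.3, 0.1 } };
    double[]  initial     = { 0.6, 0.4 };
    var hmm = new HiddenMarkovModel(transitions, emissions, initial);

    // Potential function constructed from the hidden Markov model (see the
    // discrete-density potential function constructors described earlier).
    var function = new MarkovDiscreteFunction(hmm);
    var crf = new ConditionalRandomField<int>(2, function);

    int[] observations = { 0, 1, 2, 1 };
    int[] labels = crf.Decide(observations);                       // most likely state labels
    double logLikelihood = crf.LogLikelihood(observations, labels); // log-likelihood along those labels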
Computes the log-likelihood of the model for the given observations. This method is equivalent to the HiddenMarkovModel.LogLikelihood(TObservation[], int[]) method. Computes the most likely state labels for the given observations, returning the overall sequence probability for this model. Computes the most likely state labels for the given observations, returning the overall sequence log-likelihood for this model. Saves the random field to a stream. The stream to which the random field is to be serialized. Saves the random field to a stream. The stream to which the random field is to be serialized. Loads a random field from a stream. The stream from which the random field is to be deserialized. The deserialized random field. Loads a random field from a file. The path to the file from which the random field is to be deserialized. The deserialized random field. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Common interface for Conditional Random Fields feature functions. The type of the observations being modeled. Gets the potential function containing this feature. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the feature for the given parameters. The sequence of states. The sequence of observations. The output class label for the sequence. The result of the feature. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Base implementation for Conditional Random Fields feature functions. The type of the observations being modeled. Gets the potential function containing this feature. Gets the potential factor to which this feature belongs. Creates a new feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. Computes the feature for the given parameters. The sequence of states. The sequence of observations. The output class label for the sequence. The result of the feature. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities.
The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. State feature for Hidden Markov Model symbol emission probabilities. Constructs a new symbol emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. The emission symbol. The observation dimension this emission feature applies to. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State occupancy function for modeling continuous-density Hidden Markov Model state emission features. Constructs a state occupancy feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The current state. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for second moment Gaussian emission probabilities. Constructs a new second moment emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. The dimension of the multidimensional observation this feature should respond to. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities.
The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for first moment multivariate Gaussian emission probabilities. Constructs a new first moment emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. The multivariate dimension to consider in the computation. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for second moment Gaussian emission probabilities. Constructs a new second moment emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for first moment Gaussian emission probabilities. Constructs a new first moment emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. 
The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Edge feature for Hidden Markov Model state transition probabilities. Constructs an initial state transition feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The destination state. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for Hidden Markov Model output class symbol probabilities. Constructs a new output class symbol feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The emission symbol. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. State feature for Hidden Markov Model symbol emission probabilities. Constructs a new symbol emission feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The state for the emission. The emission symbol. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature.
Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Edge feature for Hidden Markov Model state transition probabilities. Constructs a state transition feature. The potential function to which this feature belongs. The index of the potential factor to which this feature belongs. The originating state. The destination state. Computes the feature for the given parameters. The previous state. The current state. The observations. The index of the current observation. The output class label for the sequence. Computes the probability of occurrence of this feature given a sequence of observations. The matrix of forward state probabilities. The matrix of backward state probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Computes the log-probability of occurrence of this feature given a sequence of observations. The matrix of forward state log-probabilities. The matrix of backward state log-probabilities. The observation sequence. The output class label for the sequence. The probability of occurrence of this feature. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Forward-Backward algorithms for Conditional Random Fields. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations (no scaling). Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Forward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations. Computes Backward probabilities for a given potential function and a set of observations (no scaling). Common interface for gradient evaluators for Hidden Conditional Random Fields. Computes the gradient using the input/outputs stored in this object. The value of the gradient vector for the given parameters. Computes the objective (cost) function for the Hidden Conditional Random Field (negative log-likelihood) using the input/outputs stored in this object. Hidden Conditional Random Field (HCRF). Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning, where they are used for structured prediction.
Whereas an ordinary classifier predicts a label for a single sample without regard to "neighboring" samples, a CRF can take context into account; e.g., the linear chain CRF popular in natural language processing predicts sequences of labels for sequences of input samples. While Conditional Random Fields can be seen as a generalization of Markov models, Hidden Conditional Random Fields can be seen as a generalization of Hidden Markov Model Classifiers. The (linear-chain) Conditional Random Field is the discriminative counterpart of the Markov model. An observable Markov Model assumes the sequences of states y to be visible, rather than hidden. Thus it can be applied to a different set of problems than hidden Markov models. Those models are often used for sequence component labeling, also known as part-of-sequence tagging. After a model has been trained, they are mostly used to tag parts of a sequence using the Viterbi algorithm. This is handy for performing, for example, the classification of parts of a speech utterance, such as classifying phonemes inside an audio signal. References: C. Souza, Sequence Classifiers in C# - Part II: Hidden Conditional Random Fields. CodeProject. Available at: http://www.codeproject.com/Articles/559535/Sequence-Classifiers-in-Csharp-Part-II-Hidden-Cond Chan, Tony F.; Golub, Gene H.; LeVeque, Randall J. (1983). Algorithms for Computing the Sample Variance: Analysis and Recommendations. The American Statistician 37, 242-247. Examples show how to use the learning algorithms on a real-world dataset, including training and testing on separate sets and evaluating performance; a sketch of this workflow appears under the resilient gradient learning description above. The type of the observations modeled by the field. Gets the number of outputs assumed by the model. Gets the potential function encompassing all feature functions for this model. Initializes a new instance of the class. Initializes a new instance of the class. The potential function to be used by the model. Computes the most likely output for the given observations. Computes the most likely output for the given observations. Computes the most likely output for the given observations. Computes the most likely state labels for the given observations, returning the overall sequence probability for this model. Computes the most likely state labels for the given observations, returning the overall sequence probability for this model. Computes the log-likelihood that the given observations belong to the desired output. Computes the log-likelihood that the given observations belong to the desired output. Computes the log-likelihood that the given observations belong to the desired outputs. Computes the log-likelihood that the given observations belong to the desired outputs. Computes the partition function Z(x,y). Computes the log-partition function ln Z(x,y). Computes the partition function Z(x). Computes the log-partition function ln Z(x). Computes the log-likelihood of the observations given this model. Computes the log-likelihood of the observations given this model. Saves the random field to a stream. The stream to which the random field is to be serialized. Saves the random field to a stream. The stream to which the random field is to be serialized. Loads a random field from a stream. The stream from which the random field is to be deserialized. The deserialized random field. Loads a random field from a file. The path to the file from which the random field is to be deserialized. The deserialized random field.
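To make the "generalization of Hidden Markov Model Classifiers" remark above concrete, the sketch below wraps a (here untrained) discrete hidden Markov classifier into a hidden conditional random field through the classifier-based potential function constructor documented earlier. Type and namespace names follow the usual Accord.NET layout, all numeric values are invented, and the Decide member name is assumed from the decision methods described in these docs.

    using Accord.Statistics.Models.Fields;
    using Accord.Statistics.Models.Fields.Functions;
    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Topology;

    // A discrete hidden Markov classifier: 2 classes, forward-only models with
    // 3 states each, over a 4-symbol alphabet (normally it would be trained first).
    var classifier = new HiddenMarkovClassifier(2, new Forward(3), 4);

    // Potential function built from the classifier, including the class priors.
    var function = new MarkovDiscreteFunction(classifier, true);
    var hcrf = new HiddenConditionalRandomField<int>(function);

    int[] sequence = { 0, 3, 1, 2 };
    int label = hcrf.Decide(sequence);                 // most likely class
    double ll = hcrf.LogLikelihood(sequence, label);   // log-likelihood of that decision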
Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best describes the given input vector according to this classifier. Base class for (HMM) Sequence Classifiers. This class cannot be instantiated. Initializes a new instance of the class. The number of classes in the classification problem. Initializes a new instance of the class. The models specializing in each of the classes of the classification problem. Gets or sets the threshold model. For gesture spotting, Lee and Kim introduced a threshold model which is composed of parts of the models in a hidden Markov sequence classifier. The threshold model acts as a baseline for decision rejection. If none of the classifiers is able to produce a higher likelihood than the threshold model, the decision is rejected. In the original Lee and Kim publication, the threshold model is constructed by removing all outgoing transitions of the states in all gesture models and fully connecting those states into an ergodic model. References: H. Lee, J. Kim, An HMM-based threshold model approach for gesture recognition, IEEE Trans. Pattern Anal. Mach. Intell. 21 (10) (1999) 961–973. Gets or sets a value governing the rejection given by a threshold model (if present). Increasing this value will result in higher rejection rates. Default is 1. Gets the collection of models specialized in each class of the sequence classification problem. Gets the Hidden Markov Model implementation responsible for recognizing each of the classes given the desired class label. The class label of the model to get. Gets the number of classes which can be recognized by this classifier. Gets the prior distribution assumed for the classes. Computes the log-likelihood that the given input vector belongs to its decided class. Computes the likelihood that the given input vector belongs to its decided class. Computes the log-likelihood that the given input vector belongs to its decided class. Computes the probability that the given input vector belongs to its decided class. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. Computes the probabilities that the given input vector belongs to each of the possible classes. The input vector. The decided class for the input. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The decided class for the input. An array where the probabilities will be stored, avoiding unnecessary memory allocations. Computes a class-label decision for a given . The input vector that should be classified into one of the possible classes. A class-label that best describes the given input vector according to this classifier. Returns an enumerator that iterates through the models in the classifier. A that can be used to iterate through the collection. Returns an enumerator that iterates through the models in the classifier. A that can be used to iterate through the collection. Forward-Backward algorithms for Hidden Markov Models. Forward-Backward algorithms for Hidden Markov Models. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations.
Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations (no scaling). Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations (no scaling). Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations (no scaling). Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations (no scaling). Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Forward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations (no scaling). Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations. Computes Backward probabilities for a given hidden Markov model and a set of observations (no scaling). Arbitrary-density Hidden Markov Model Set for Sequence Classification. 
This class uses a set of arbitrary-density hidden Markov models to classify sequences of real (double-precision floating point) numbers or arrays of those numbers. Each model will try to learn and recognize each of the different output classes. For examples and details on how to learn such models, please take a look at the documentation for . For the discrete version of this classifier, please see its non-generic counterpart . Examples are available at the respective learning algorithm pages. For example, see . Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition.
The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. The class labels for each of the models. Creates a new Sequence Classifier with the given number of classes. The models specializing in each of the classes of the classification problem. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for good performance. The class labels for each of the models. Algorithms for solving -related problems, such as sequence decoding and likelihood evaluation. Uses the Viterbi algorithm (max-sum) to find the hidden states of a sequence of observations and to evaluate its likelihood. The likelihood will be computed along the Viterbi path. Uses the forward algorithm (sum-prod) to compute the likelihood of a sequence. The likelihood will be computed considering every possible path in the model (default). When set, calling LogLikelihoods will give the model's posterior distribution. Hidden Markov Model for any kind of observations (not only discrete). Hidden Markov Models (HMM) are stochastic methods to model temporal and sequence data. They are especially known for their application in temporal pattern recognition such as speech, handwriting, gesture recognition, part-of-speech tagging, musical score following, partial discharges and bioinformatics. This page refers to the arbitrary-density (continuous emission distributions) version of the model. For discrete distributions, please see . A dynamical system of a discrete nature that is assumed to be governed by a Markov chain emits a sequence of observable outputs. Under the Markov assumption, it is also assumed that the latest output depends only on the current state of the system. Such states are often not known to the observer when only the output values are observable. Hidden Markov Models attempt to model such systems and allow, among other things: inferring the most likely sequence of states that produced a given output sequence; inferring which will be the most likely next state (and thus predicting the next output); and calculating the probability that a given sequence of outputs originated from the system (allowing the use of hidden Markov models for sequence classification). The "hidden" in Hidden Markov Models comes from the fact that the observer does not know which state the system is in, but has only probabilistic insight into where it should be. The arbitrary-density Hidden Markov Model uses any probability density function (such as a Gaussian mixture model) for computing the state probability.
In other words, in a continuous HMM the matrix of emission probabilities B is replaced by an array of either discrete or continuous probability density functions. If a general discrete distribution is used as the underlying probability density function, the model becomes equivalent to the discrete Hidden Markov Model. For a more thorough explanation of the fundamentals of how Hidden Markov Models work, please see the documentation page. To learn a Markov model, you can find a list of both supervised and unsupervised learning algorithms in the namespace. References: Wikipedia contributors. "Hidden Markov model." Wikipedia, the Free Encyclopedia. Available at: http://en.wikipedia.org/wiki/Hidden_Markov_model Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. The example below reproduces the same example given in the Wikipedia entry for the Viterbi algorithm (http://en.wikipedia.org/wiki/Viterbi_algorithm). As an arbitrary-density model, it can be used with any available probability distribution, including discrete distributions. In the following example, the generic model is used with a to reproduce the same example given in . Below, the model's parameters are initialized manually. However, it is possible to learn those automatically using . Examples of how to learn hidden Markov models can be found on the documentation pages of the respective learning algorithms: , , . The simplest of such examples can be seen below: Markov models can also be trained without having, in fact, "hidden" parts. The following example shows how hidden Markov models trained using Maximum Likelihood Learning can be used in the context of fraud analysis, in which we actually know in advance the class labels for each state in the sequences we are trying to learn: Where the transform function is defined as: Hidden Markov Models can also be used to predict the next observation in a sequence. This can be done by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Baum-Welch, one of the most famous learning algorithms for Hidden Markov Models. Discrete-density Hidden Markov Model. Constructs a new Hidden Markov Model. Constructs a new Hidden Markov Model. Gets or sets the algorithm that should be used to compute solutions to this model's LogLikelihood(T[] input) evaluation, Decide(T[] input) decoding and LogLikelihoods(T[] input) posterior problems. Gets the number of states of this model. Gets the number of states of this model. Gets the log-initial probabilities log(pi) for this model. Gets the log-transition matrix log(A) for this model. Gets or sets a user-defined tag associated with this model. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. An object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The initial emission probability distribution to be used by each of the states. This initial probability distribution will be cloned across all states. Constructs a new Hidden Markov Model with arbitrary-density state probabilities.
An object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The initial emission probability distribution to be used by each of the states. This initial probability distribution will be cloned across all states. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. An object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The initial emission probability distributions for each state. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The transition matrix A for this model. The emission matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The transition matrix A for this model. The emission matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The number of states for the model. An initial distribution to be copied to all states in the model. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The number of states for the model. An initial distribution to be copied to all states in the model. Gets the Emission matrix (B) for this model. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The sequence of states that most likely produced the sequence. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The log-likelihood along the most likely sequence. The sequence of states that most likely produced the sequence. Calculates the probability of each hidden state for each observation in the observation vector. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one.
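The Wikipedia Viterbi example mentioned earlier (Healthy/Fever states observed through normal/cold/dizzy symptoms) can be reconstructed roughly as below, using the non-generic discrete model and the Decide/LogLikelihood members described in this entry. Treat it as a hedged sketch rather than the library's own sample; the matrices are taken from the Wikipedia article, not from these docs.

    using Accord.Statistics.Models.Markov;

    // States: 0 = Healthy, 1 = Fever. Symbols: 0 = normal, 1 = cold, 2 = dizzy.
    double[]  pi = { 0.6, 0.4 };
    double[,] A  =
    {
        { 0.7, 0.3 },   // Healthy -> Healthy, Fever
        { 0.4, 0.6 }    // Fever   -> Healthy, Fever
    };
    double[,] B  =
    {
        { 0.5, 0.4, 0.1 },   // Healthy: normal, cold, dizzy
        { 0.1, 0.3, 0.6 }    // Fever:   normal, cold, dizzy
    };

    var hmm = new HiddenMarkovModel(A, B, pi);

    int[] observations = { 0, 1, 2 };        // normal, cold, dizzy
    int[] path = hmm.Decide(observations);   // Viterbi path; should be { 0, 0, 1 } (Healthy, Healthy, Fever)
    double logLikelihood = hmm.LogLikelihood(observations);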
Calculates the probability of each hidden state for each observation in the observation vector, and uses those probabilities to decode the most likely sequence of states for each observation in the sequence using the posterior decoding method. See remarks for details. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. The sequence of states most likely associated with each observation, estimated using the posterior decoding method. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Calculates the likelihood that this model has generated the given sequence. Evaluation problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the probability that model M has generated sequence O. This can be computed efficiently using either the Viterbi or the Forward algorithm. A sequence of observations. The log-likelihood that the given sequence has been generated by this model. Calculates the log-likelihood that this model has generated the given observation sequence along the given state path. A sequence of observations. A sequence of states. The log-likelihood that the given sequence of observations has been generated by this model along the given sequence of states. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence.
The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observations occurring after a given observation sequence. A sequence of observations. 
Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Predicts the next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. This method works by inspecting the forward matrix of probabilities for the sequence and checking which would be the most likely state after the current one. Then, it returns the most likely value (the mode) for the distribution associated with that state. This limits the applicability of this model to only very short-term predictions (i.e. most likely, only the most immediate next observation). Generates a random vector of observations from the model. The number of samples to generate. A random vector of observations drawn from the model. Generates a random vector of observations from the model. The number of samples to generate. The log-likelihood of the generated observation sequence. The Viterbi path of the generated observation sequence. A random vector of observations drawn from the model. Predicts the next observation occurring after a given observation sequence. Predicts the next observation occurring after a given observation sequence. Predicts the next observation occurring after a given observation sequence. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger along the given path of hidden states. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger along the given path of hidden states. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger along the given path of hidden states. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the probability that the sequence vector has been generated by this log-likelihood tagger. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Predicts the log-likelihood for each of the observations in the sequence vector assuming each of the possible states in the tagger model. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels. A set of class-labels that best describe the vectors according to this classifier. Computes class-label decisions for the given . The input vectors that should be classified as any of the possible classes. The location where to store the class-labels.
A set of class-labels that best describe the vectors according to this classifier. Base class for implementations of the Baum-Welch learning algorithm. This class cannot be instantiated. This class uses a template method pattern so specialized classes can be written for each kind of hidden Markov model emission density (either discrete or continuous). The methods , and should be overridden by inheriting classes to specify how those probabilities should be computed for the density being modeled. For the actual Baum-Welch classes, please refer to or . For other kinds of algorithms, please see and and their generic counter-parts. Initializes a new instance of the class. Gets or sets the maximum change in the average log-likelihood after an iteration of the algorithm used to detect convergence. This is the likelihood convergence limit L between two iterations of the algorithm. The algorithm will stop when the change in the likelihood for two consecutive iterations has not changed by more than L percent of the likelihood. If left as zero, the algorithm will ignore this parameter and iterate over a number of fixed iterations specified by the previous parameter. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. This is the maximum number of iterations to be performed by the learning algorithm. If specified as zero, the algorithm will learn until convergence of the model average likelihood respecting the desired limit. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets the Ksi matrix of log probabilities created during the last iteration of the Baum-Welch learning algorithm. Gets the Gamma matrix of log probabilities created during the last iteration of the Baum-Welch learning algorithm. Gets the sample weights in the last iteration of the Baum-Welch learning algorithm. Runs the Baum-Welch learning algorithm for hidden Markov models. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. The sequences of univariate or multivariate observations used to train the model. Can be either of type double[] (for the univariate case) or double[][] for the multivariate case. The average log-likelihood for the observations after the model has been trained. Runs the Baum-Welch learning algorithm for hidden Markov models. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. The sequences of univariate or multivariate observations used to train the model. Can be either of type double[] (for the univariate case) or double[][] for the multivariate case. The weight associated with each sequence. The average log-likelihood for the observations after the model has been trained. Computes the forward and backward probabilities matrices for a given observation referenced by its index in the input training data. The index of the observation in the input training data. Returns the computed forward probabilities matrix. Returns the computed backward probabilities matrix. Computes the ksi matrix of probabilities for a given observation referenced by its index in the input training data. 
The index of the observation in the input training data. The matrix of forward probabilities for the observation. The matrix of backward probabilities for the observation. Updates the emission probability matrix. Implementations of this method should use the observations in the training data and the Gamma probability matrix to update the probability distributions of symbol emissions. Abstract base class for hidden Markov model learning algorithms. Gets or sets the parallelization options for this algorithm. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets the classifier being trained by this instance. The classifier being trained by this instance. Obsolete. Gets or sets the configuration function specifying which training algorithm should be used for each of the models in the hidden Markov model set. Gets or sets a value indicating whether a threshold model should be created or updated after training to support rejection. true to update the threshold model after training; otherwise, false. Gets or sets a value indicating whether the class priors should be estimated from the data, as in an empirical Bayes method. Gets the log-likelihood at the end of the training. Occurs when the learning of a class model has started. Occurs when the learning of a class model has finished. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Trains each model to recognize each of the output labels. The sum log-likelihood for all models after training. Creates a new threshold model for the current set of Markov models in this sequence classifier. A threshold Markov model. Creates the state transition topology for the threshold model. This method can be used to help in the implementation of the abstract method which has to be defined for implementers of this class. Raises the event. The instance containing the event data. Raises the event. The instance containing the event data. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Base class for implementations of the Baum-Welch learning algorithm. This class cannot be instantiated. Gets or sets the number of states to be used when this learning algorithm needs to create new models. The number of states. Gets or sets the state transition topology to be used when this learning algorithm needs to create new models. Default is . The topology to be used when this learning algorithm needs to create a new model. Gets or sets the model being trained. Initializes a new instance of the class. The model to be learned. Initializes a new instance of the class. Creates an instance of the model to be learned. 
Inheritors of this abstract class must define this method so new models can be created from the training data. Base class for implementations of the Viterbi learning algorithm. This class cannot be instantiated. This class uses a template method pattern so specialized classes can be written for each kind of hidden Markov model emission density (either discrete or continuous). For the actual Viterbi classes, please refer to or . For other kinds of algorithms, please see and and their generic counter-parts. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the maximum change in the average log-likelihood after an iteration of the algorithm used to detect convergence. This is the likelihood convergence limit L between two iterations of the algorithm. The algorithm will stop when the change in the likelihood for two consecutive iterations has not changed by more than L percent of the likelihood. If left as zero, the algorithm will ignore this parameter and iterate over a number of fixed iterations specified by the previous parameter. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. This is the maximum number of iterations to be performed by the learning algorithm. If specified as zero, the algorithm will learn until convergence of the model average likelihood respecting the desired limit. Gets the current iteration. The current iteration. Gets a value indicating whether this instance has converged. true if this instance has converged; otherwise, false. Gets or sets the number of batches into which the learning data should be divided during learning. Batches are used to adequately estimate the first models so they can better compute the Viterbi paths for subsequent passes of the algorithm. Default is 1. Creates a new instance of the Viterbi learning algorithm. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Computes the log-likelihood for the current model for the given observations. The observation vectors. The log-likelihood of the observations belonging to the model. Runs one single epoch (iteration) of the learning algorithm. The observation sequences. A vector to be populated with the decoded Viterbi sequences. Base class for implementations of the Baum-Welch learning algorithm. This class cannot be instantiated. Creates a new instance of the Baum-Welch learning algorithm. Creates a new instance of the Baum-Welch learning algorithm. Fits one emission distribution. This method can be overridden in a derived class in order to implement special fitting options. Base class for implementations of the Baum-Welch learning algorithm. This class cannot be instantiated. Gets all observations as a single vector. Gets or sets convergence parameters. The convergence parameters. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Gets the log-likelihood of the model at the last iteration. Gets or sets the function that initializes the emission distributions in the hidden Markov Models. Creates a new instance of the Baum-Welch learning algorithm. Creates a new instance of the Baum-Welch learning algorithm. 
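To make the convergence members above concrete, the following is a minimal sketch of running Baum-Welch on a discrete model. It assumes the classic Accord.NET usage pattern (HiddenMarkovModel, BaumWelchLearning, the Tolerance and MaxIterations properties, and the Run and Evaluate methods); the training sequences are hypothetical, and newer releases expose an equivalent Learn method in place of Run.

    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;

    // Hypothetical two-symbol training sequences.
    int[][] sequences =
    {
        new[] { 0, 1, 1, 1, 1, 1 },
        new[] { 0, 1, 1, 1, 0, 1 },
        new[] { 0, 1, 1, 1, 1, 1 }
    };

    // A discrete model with 3 hidden states and 2 output symbols.
    var hmm = new HiddenMarkovModel(3, 2);

    var teacher = new BaumWelchLearning(hmm)
    {
        Tolerance = 0.0001,   // stop when the log-likelihood change falls below this limit
        MaxIterations = 0     // zero: iterate until convergence
    };

    double averageLogLikelihood = teacher.Run(sequences);

    // Log-likelihood of a new sequence under the trained model.
    double ll = hmm.Evaluate(new[] { 0, 1, 1, 1 });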
Gets or sets the maximum change in the average log-likelihood after an iteration of the algorithm used to detect convergence. This is the likelihood convergence limit L between two iterations of the algorithm. The algorithm will stop when the change in the likelihood for two consecutive iterations has not changed by more than L percent of the likelihood. If left as zero, the algorithm will ignore this parameter and iterate over a number of fixed iterations specified by the previous parameter. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. This is the maximum number of iterations to be performed by the learning algorithm. If specified as zero, the algorithm will learn until convergence of the model average likelihood respecting the desired limit. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets the Ksi matrix of log probabilities created during the last iteration of the Baum-Welch learning algorithm. Gets the Gamma matrix of log probabilities created during the last iteration of the Baum-Welch learning algorithm. Gets the sample weights in the last iteration of the Baum-Welch learning algorithm. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Computes the ksi matrix of probabilities for a given observation referenced by its index in the input training data. The index of the observation in the input training data. The matrix of forward probabilities for the observation. The matrix of backward probabilities for the observation. Updates the emission probability matrix. Implementations of this method should use the observations in the training data and the Gamma probability matrix to update the probability distributions of symbol emissions. Fits one emission distribution. This method can be overridden in a derived class in order to implement special fitting options. Computes the forward and backward probabilities matrices for a given observation referenced by its index in the input training data. The index of the observation in the input training data. Returns the computed forward probabilities matrix. Returns the computed backward probabilities matrix. Baum-Welch learning algorithms for learning Hidden Markov Models. The type of the emission distributions in the model. The type of the observations (i.e. int for a discrete model). The type of fitting options accepted by this distribution. Please see the documentation page for the actual documentation of this class, including examples. Initializes a new instance of the class. Initializes a new instance of the class. The model to be learned. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Baum-Welch learning algorithm for arbitrary-density (generic) Hidden Markov Models. The Baum-Welch algorithm is an unsupervised algorithm used to learn a single hidden Markov model object from a set of observation sequences. It works by using a variant of the Expectation-Maximization algorithm to search a set of model parameters (i.e. 
the matrix of transition probabilities A, the vector of state probability distributions B, and the initial probability vector π) that would result in a model having a high likelihood of being able to generate a set of training sequences given to this algorithm. For increased accuracy, this class performs all computations using log-probabilities. For a more thorough explanation on hidden Markov models with practical examples on gesture recognition, please see Sequence Classifiers in C#, Part I: Hidden Markov Models [1]. [1]: http://www.codeproject.com/Articles/541428/Sequence-Classifiers-in-Csharp-Part-I-Hidden-Marko In the following example, we will create a Continuous Hidden Markov Model using a univariate Normal distribution to properly model continuous sequences. In the following example, we will create a Discrete Hidden Markov Model using a Generic Discrete Probability Distribution to reproduce the same code example given in the documentation. The next example shows how to create a multivariate model using a multivariate normal distribution. In this example, sequences contain vector-valued observations, such as in the case of (x,y) pairs. The following example shows how to create a hidden Markov model that considers each feature to be independent of each other. This is the same as following Bayes' assumption of independence for each feature in the feature vector. Finally, the last example shows how to fit a mixture-density hidden Markov model. When using Normal distributions, it is often the case that we might find problems which are difficult to solve. Some problems may include constant variables or other numerical difficulties preventing the proper estimation of a Normal distribution from the data. A sign of those difficulties arises when the learning algorithm throws the exception "Variance is zero. Try specifying a regularization constant in the fitting options" for univariate distributions, or an exception informing that the "Covariance matrix is not positive definite. Try specifying a regularization constant in the fitting options" for multivariate distributions like the . In both cases, this is an indication that the variables being learned cannot be suitably modeled by Normal distributions. To avoid numerical difficulties when estimating those probabilities, a small regularization constant can be added to the variances or to the covariance matrices until they become greater than zero or positive definite. To specify a regularization constant as given in the above message, we can indicate a fitting options object for the model distribution, as shown in the sketch below. Typically, any small value would suffice as a regularization constant, though smaller values may lead to longer fitting times. Too high values, on the other hand, would lead to decreased accuracy. The type of the emission distributions in the model. The type of the observations (i.e. int for a discrete model). Initializes a new instance of the class. The model to be learned. Initializes a new instance of the class. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Learning algorithm for arbitrary-density generative hidden Markov sequence classifiers. This class acts as a teacher for classifiers based on arbitrary-density hidden Markov models. The learning algorithm uses a generative approach. It works by training each model in the generative classifier separately. This can teach models that use any probability distribution. 
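As a concrete illustration of the regularization remark above, the sketch below configures a fitting-options object for a univariate Normal emission density. NormalOptions, its Regularization property and the generic BaumWelchLearning teacher are the names used in the library's own examples; treat the exact signatures as assumptions that may vary between versions, and the training data as hypothetical.

    using Accord.Statistics.Distributions.Fitting;
    using Accord.Statistics.Distributions.Univariate;
    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;

    // Hypothetical univariate training sequences.
    double[][] sequences =
    {
        new[] { 0.1, 0.2, 0.3, 0.4 },
        new[] { 0.1, 0.2, 0.25, 0.4 }
    };

    // A 3-state model whose states emit values from Normal densities.
    var hmm = new HiddenMarkovModel<NormalDistribution>(3, new NormalDistribution());

    var teacher = new BaumWelchLearning<NormalDistribution>(hmm)
    {
        Tolerance = 0.0001,
        MaxIterations = 0,

        // A small constant added to the estimated variances avoids the
        // "Variance is zero" / non-positive-definite problems described above.
        FittingOptions = new NormalOptions() { Regularization = 1e-5 }
    };

    double logLikelihood = teacher.Run(sequences);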
Such arbitrary-density models can be used for any kind of observation values or vectors. Discrete models can be used whenever the sequence of observations is discrete or can be represented by discrete symbols, such as class labels, integers, and so on. If you need to classify sequences of other entities, such as real numbers or vectors (i.e. multivariate observations), then you can use generic-density hidden Markov models. Those models can be modeled after any kind of probability distribution implementing the interface. For a more thorough explanation on hidden Markov models with practical examples on gesture recognition, please see Sequence Classifiers in C#, Part I: Hidden Markov Models [1]. [1]: http://www.codeproject.com/Articles/541428/Sequence-Classifiers-in-Csharp-Part-I-Hidden-Marko The following example creates a continuous-density hidden Markov model sequence classifier to recognize two classes of univariate observation sequences. The following example creates a continuous-density hidden Markov model sequence classifier to recognize two classes of multivariate sequences of observations. This example uses multivariate Normal distributions as emission densities. When there is insufficient training data, or one of the variables is constant, the Normal distribution estimation may fail with a "Covariance matrix is not positive-definite" error. In this case, it is possible to sidestep this issue by specifying a small regularization constant to be added to the diagonal elements of the covariance matrix. The next example shows how to use the learning algorithms on a real-world dataset, including training and testing on separate sets and evaluating its performance: Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier. Creates a new instance of the learning algorithm for a given Markov sequence classifier. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Creates a new threshold model for the current set of Markov models in this sequence classifier. A threshold Markov model. Base class for observable Markov model learning algorithms. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets the model being trained. Gets or sets whether the emission fitting algorithm should present weighted samples or simply the clustered samples to the density estimation methods. Gets or sets whether to use Laplace's rule of succession to avoid zero probabilities. Gets or sets the function that initializes the emission distributions in the hidden Markov Models. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Creates a new instance of the Maximum Likelihood learning algorithm. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Maximum Likelihood learning algorithm for discrete-density Hidden Markov Models. 
The maximum likelihood estimate is a supervised learning algorithm. It considers that both the sequence of observations and the sequence of states in the Markov model are visible, and thus known, during training. Often, the Maximum Likelihood Estimate can be used to give a starting point to an unsupervised algorithm, making it possible to use semi-supervised techniques with HMMs. It is possible, for example, to use MLE to guess initial values for an HMM given a small set of manually labeled data, and then further estimate this model using the Viterbi learning algorithm. The following example comes from Prof. Yechiam Yemini's slides on Hidden Markov Models, available at http://www.cs.columbia.edu/4761/notes07/chapter4.3-HMM.pdf. In this example, we will be specifying both the sequence of observations and the sequence of states assigned to each observation in each sequence to learn our Markov model. The following example shows how hidden Markov models trained using Maximum Likelihood Learning can be used in the context of fraud analysis. Where the transform function is defined as: Creates a new instance of the Maximum Likelihood learning algorithm. Creates a new instance of the Maximum Likelihood learning algorithm. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Viterbi learning algorithm. The Viterbi learning algorithm is an alternative learning algorithm for hidden Markov models. It works by obtaining the Viterbi path for the set of training observation sequences and then computing the maximum likelihood estimates for the model parameters. Those operations are repeated iteratively until model convergence. The Viterbi learning algorithm is also known as the Segmental K-Means algorithm. Gets the model being trained. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Gets or sets whether to use Laplace's rule of succession to avoid zero probabilities. When this property is set, it will only affect the estimation of the transition and initial state probabilities. To control the estimation of the emission probabilities, please use the corresponding property. Gets or sets a cancellation token that can be used to cancel the algorithm while it is running. Creates a new instance of the Viterbi learning algorithm. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Runs one single epoch (iteration) of the learning algorithm. The observation sequences. A vector to be populated with the decoded Viterbi sequences. Computes the log-likelihood for the current model for the given observations. The observation vectors. The log-likelihood of the observations belonging to the model. Obsolete. Please use instead. Gets the model being trained. Gets or sets whether the emission fitting algorithm should present weighted samples or simply the clustered samples to the density estimation methods. Gets or sets whether to use Laplace's rule of succession to avoid zero probabilities. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Creates a new instance of the Maximum Likelihood learning algorithm. Runs the Maximum Likelihood learning algorithm for hidden Markov models. 
An array of observation sequences to be used to train the model. An array of state labels associated with each observation sequence. The average log-likelihood for the observations after the model has been trained. Supervised learning problem. Given some training observation sequences O = {o1, o2, ..., oK}, known training state paths H = {h1, h2, ..., hK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Runs the Maximum Likelihood learning algorithm for hidden Markov models. An array of observation sequences to be used to train the model. An array of state labels associated with each observation sequence. The average log-likelihood for the observations after the model has been trained. Supervised learning problem. Given some training observation sequences O = {o1, o2, ..., oK}, known training state paths H = {h1, h2, ..., hK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Converts a univariate or multivariate array of observations into a two-dimensional jagged array. Maximum Likelihood learning algorithm for discrete-density Hidden Markov Models. The maximum likelihood estimate is a supervised learning algorithm. It considers that both the sequence of observations and the sequence of states in the Markov model are visible, and thus known, during training. Often, the Maximum Likelihood Estimate can be used to give a starting point to an unsupervised algorithm, making it possible to use semi-supervised techniques with HMMs. It is possible, for example, to use MLE to guess initial values for an HMM given a small set of manually labeled data, and then further estimate this model using the Viterbi learning algorithm. The following example comes from Prof. Yechiam Yemini's slides on Hidden Markov Models, available at http://www.cs.columbia.edu/4761/notes07/chapter4.3-HMM.pdf. In this example, we will be specifying both the sequence of observations and the sequence of states assigned to each observation in each sequence to learn our Markov model. Creates a new instance of the Maximum Likelihood learning algorithm. Creates a new instance of the Maximum Likelihood learning algorithm. Runs the Maximum Likelihood learning algorithm for hidden Markov models. An array of observation sequences to be used to train the model. An array of state labels associated with each observation sequence. The average log-likelihood for the observations after the model has been trained. Supervised learning problem. Given some training observation sequences O = {o1, o2, ..., oK}, known training state paths H = {h1, h2, ..., hK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Runs the Maximum Likelihood learning algorithm for hidden Markov models. An array of observation sequences to be used to train the model. An array of state labels associated with each observation sequence. The average log-likelihood for the observations after the model has been trained. Supervised learning problem. Given some training observation sequences O = {o1, o2, ..., oK}, known training state paths H = {h1, h2, ..., hK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Creates an instance of the model to be learned. 
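A minimal sketch of the supervised Maximum Likelihood workflow just described, with hypothetical observation sequences and their known state paths. MaximumLikelihoodLearning, UseLaplaceRule and Run correspond to the members summarized in this section; the exact signatures are assumed from the classic API.

    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;

    // Hypothetical observation sequences together with their known state paths.
    int[][] observations =
    {
        new[] { 0, 0, 1, 2 },
        new[] { 0, 1, 1, 2 }
    };
    int[][] paths =
    {
        new[] { 0, 0, 1, 1 },
        new[] { 0, 0, 1, 1 }
    };

    // 2 hidden states, 3 observation symbols.
    var model = new HiddenMarkovModel(2, 3);

    var teacher = new MaximumLikelihoodLearning(model)
    {
        // Laplace's rule of succession avoids zero probabilities for
        // transitions or emissions that never occur in the training data.
        UseLaplaceRule = true
    };

    // Supervised learning: both the observations and the state paths are given.
    double averageLogLikelihood = teacher.Run(observations, paths);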
Inheritors of this abstract class must define this method so new models can be created from the training data. Obsolete. Please use ViterbiLearning<TDistribution, TObservation> instead. Gets the model being trained. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Gets or sets whether to use Laplace's rule of succession to avoid zero probabilities. When this property is set, it will only affect the estimation of the transition and initial state probabilities. To control the estimation of the emission probabilities, please use the corresponding property. Creates a new instance of the Viterbi learning algorithm. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Runs one single epoch (iteration) of the learning algorithm. The observation sequences. A vector to be populated with the decoded Viterbi sequences. Computes the log-likelihood for the current model for the given observations. The observation vectors. The log-likelihood of the observations belonging to the model. Converts a univariate or multivariate array of observations into a two-dimensional jagged array. Viterbi learning algorithm. The Viterbi learning algorithm is an alternative learning algorithm for hidden Markov models. It works by obtaining the Viterbi path for the set of training observation sequences and then computing the maximum likelihood estimates for the model parameters. Those operations are repeated iteratively until model convergence. The Viterbi learning algorithm is also known as the Segmental K-Means algorithm. Gets the model being trained. Gets or sets whether to use Laplace's rule of succession to avoid zero probabilities. Creates a new instance of the Viterbi learning algorithm. Learns a model that can map the given inputs to the desired outputs. The model inputs. The weight of importance for each input sample. A model that has learned how to produce suitable outputs given the input data . Runs one single epoch (iteration) of the learning algorithm. The observation sequences. A vector to be populated with the decoded Viterbi sequences. Computes the log-likelihood for the current model for the given observations. The observation vectors. The log-likelihood of the observations belonging to the model. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Learning algorithm for discrete-density generative hidden Markov sequence classifiers. This class acts as a teacher for classifiers based on discrete hidden Markov models. The learning algorithm uses a generative approach. It works by training each model in the generative classifier separately. This class implements discrete classifiers only. Discrete classifiers can be used whenever the sequence of observations is discrete or can be represented by discrete symbols, such as class labels, integers, and so on. If you need to classify sequences of other entities, such as real numbers or vectors (i.e. multivariate observations), then you can use generic-density hidden Markov models. Those models can be modeled after any kind of probability distribution implementing the interface. 
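Before moving on to the classifier-learning material below, here is a minimal sketch of the Viterbi (Segmental K-Means) procedure summarized above. It assumes the classic ViterbiLearning teacher with the Tolerance, MaxIterations and UseLaplaceRule members listed in this section, and hypothetical training data.

    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;

    int[][] sequences =
    {
        new[] { 0, 1, 2, 2, 2 },
        new[] { 0, 0, 1, 2, 2 },
        new[] { 0, 1, 1, 2, 2 }
    };

    var hmm = new HiddenMarkovModel(3, 3);

    var teacher = new ViterbiLearning(hmm)
    {
        Tolerance = 0.001,
        MaxIterations = 0,      // iterate until the likelihood stabilizes
        UseLaplaceRule = true   // avoid zero transition/initial probabilities
    };

    // Each iteration decodes the Viterbi paths for the training sequences and
    // re-estimates the model parameters by maximum likelihood, until convergence.
    double logLikelihood = teacher.Run(sequences);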
For a more thorough explanation on hidden Markov models with practical examples on gesture recognition, please see Sequence Classifiers in C#, Part I: Hidden Markov Models [1]. [1]: http://www.codeproject.com/Articles/541428/Sequence-Classifiers-in-Csharp-Part-I-Hidden-Marko The following example shows how to create a hidden Markov model sequence classifier to classify discrete sequences into two disjoint labels: labels for class 0 and labels for class 1. The training data is separated into inputs and outputs. The inputs are the sequences we are trying to learn, and the outputs are the labels associated with each input sequence. In this example we will be using the Baum-Welch algorithm to learn each model in our generative classifier; however, any other unsupervised learning algorithm could be used. It is also possible to learn a hidden Markov classifier with support for rejection. When a classifier is configured to use rejection, it will be able to detect when a sample does not belong to any of the classes that it has previously seen. Gets or sets the smoothing kernel's sigma for the threshold model. The smoothing kernel's sigma. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Trains each model to recognize each of the output labels. The sum log-likelihood for all models after training. Computes the model error for a given data set. The input points. The output points. The percent of misclassification errors for the data. Creates a new threshold model for the current set of Markov models in this sequence classifier. A threshold Markov model. Configuration function delegate for Sequence Classifier Learning algorithms. Submodel learning event arguments. Gets the generative class model to which this event refers. Gets the total number of models to be learned. Initializes a new instance of the class. The class label. The total number of classes. Abstract base class for Sequence Classifier learning algorithms. Gets the classifier being trained by this instance. The classifier being trained by this instance. Gets or sets the configuration function specifying which training algorithm should be used for each of the models in the hidden Markov model set. Gets or sets a value indicating whether a threshold model should be created or updated after training to support rejection. true to update the threshold model after training; otherwise, false. Gets or sets a value indicating whether the class priors should be estimated from the data, as in an empirical Bayes method. Occurs when the learning of a class model has started. Occurs when the learning of a class model has finished. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Creates a new instance of the learning algorithm for a given Markov sequence classifier. Trains each model to recognize each of the output labels. The sum log-likelihood for all models after training. Creates a new threshold model for the current set of Markov models in this sequence classifier. A threshold Markov model. Creates the state transition topology for the threshold model. 
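A minimal sketch of the classifier-learning workflow described above, including the rejection option. The configuration-function pattern, the Models collection, the Rejection property and the Run and Compute methods follow the classic examples; the training data is hypothetical, and the property names should be treated as assumptions where they differ between versions.

    using Accord.Statistics.Models.Markov;
    using Accord.Statistics.Models.Markov.Learning;
    using Accord.Statistics.Models.Markov.Topology;

    // Hypothetical training sequences and their class labels.
    int[][] inputs =
    {
        new[] { 0, 1, 1, 0 },   // class 0
        new[] { 0, 0, 1, 0 },   // class 0
        new[] { 1, 2, 2, 2 },   // class 1
        new[] { 1, 1, 2, 2 }    // class 1
    };
    int[] outputs = { 0, 0, 1, 1 };

    // One hidden Markov model per class: 2 classes, 2 states each, 3 symbols.
    var classifier = new HiddenMarkovClassifier(2, new Forward(2), 3);

    // The configuration function creates one Baum-Welch teacher per class model.
    var teacher = new HiddenMarkovClassifierLearning(classifier,
        modelIndex => new BaumWelchLearning(classifier.Models[modelIndex])
        {
            Tolerance = 0.001,
            MaxIterations = 0
        })
    {
        Rejection = true   // also build a threshold model to support rejection
    };

    double logLikelihood = teacher.Run(inputs, outputs);

    // A decision of -1 means the sequence was rejected by the threshold model.
    int decision = classifier.Compute(new[] { 0, 1, 1, 0 });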
This method can be used to help in the implementation of the abstract method which has to be defined for implementers of this class. Raises the event. The instance containing the event data. Raises the event. The instance containing the event data. Common interface for supervised learning algorithms for hidden Markov models such as the Maximum Likelihood (MLE) learning algorithm. In the context of hidden Markov models, supervised algorithms are algorithms which consider that both the sequence of observations and the sequence of states are visible (or known) during training. This is in contrast with unsupervised learning algorithms such as the Baum-Welch, which consider that the sequence of states is hidden. Runs the learning algorithm. Supervised learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and sequence of hidden states H = {h1, h2, ..., hK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Common interface for unsupervised learning algorithms for hidden Markov models such as the Baum-Welch learning and the Viterbi learning algorithms. In the context of hidden Markov models, unsupervised algorithms are algorithms which consider that the sequence of states in a system is hidden, and just the system's outputs can be seen (or are known) during training. This is in contrast with supervised learning algorithms such as the Maximum Likelihood (MLE), which consider that both the sequence of observations and the sequence of states are observable during training. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. The observations. Common interface for unsupervised learning algorithms for hidden Markov models such as the Baum-Welch learning and the Viterbi learning algorithms. In the context of hidden Markov models, unsupervised algorithms are algorithms which consider that the sequence of states in a system is hidden, and just the system's outputs can be seen (or are known) during training. This is in contrast with supervised learning algorithms such as the Maximum Likelihood (MLE), which consider that both the sequence of observations and the sequence of states are observable during training. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. The observations. Common interface for unsupervised learning algorithms for hidden Markov models with support for weighted training samples. Runs the learning algorithm. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. Baum-Welch learning algorithm for discrete-density Hidden Markov Models. The Baum-Welch algorithm is an unsupervised algorithm used to learn a single hidden Markov model object from a set of observation sequences. It works by using a variant of the Expectation-Maximization algorithm to search a set of model parameters (i.e. 
the matrix of transition probabilities A, the matrix of emission probabilities B, and the initial probability vector π) that would result in a model having a high likelihood of being able to generate a set of training sequences given to this algorithm. For increased accuracy, this class performs all computations using log-probabilities. For a more thorough explanation on hidden Markov models with practical examples on gesture recognition, please see Sequence Classifiers in C#, Part I: Hidden Markov Models [1]. [1]: http://www.codeproject.com/Articles/541428/Sequence-Classifiers-in-Csharp-Part-I-Hidden-Marko Gets or sets the number of symbols that should be used whenever this learning algorithm needs to create a new model. This property must be set before learning. The number of symbols. Creates a new instance of the Baum-Welch learning algorithm. Creates a new instance of the Baum-Welch learning algorithm. Obsolete. Obsolete. Creates a Baum-Welch learning algorithm with default configurations for hidden Markov models with normal mixture densities. Creates a Baum-Welch learning algorithm with default configurations for hidden Markov models with normal mixture densities. Creates an instance of the model to be learned. Inheritors of this abstract class must define this method so new models can be created from the training data. Obsolete. Please use instead. Gets the model being trained. Gets or sets the distribution fitting options to use when estimating distribution densities during learning. The distribution fitting options. Creates a new instance of the Baum-Welch learning algorithm. Runs the Baum-Welch learning algorithm for hidden Markov models. Learning problem. Given some training observation sequences O = {o1, o2, ..., oK} and general structure of HMM (numbers of hidden and visible states), determine HMM parameters M = (A, B, pi) that best fit training data. The sequences of univariate or multivariate observations used to train the model. Can be either of type double[] (for the univariate case) or double[][] for the multivariate case. The average log-likelihood for the observations after the model has been trained. Computes the ksi matrix of probabilities for a given observation referenced by its index in the input training data. The index of the observation in the input training data. The matrix of forward probabilities for the observation. The matrix of backward probabilities for the observation. Updates the emission probability matrix. Implementations of this method should use the observations in the training data and the Gamma probability matrix to update the probability distributions of symbol emissions. Computes the forward and backward probabilities matrices for a given observation referenced by its index in the input training data. The index of the observation in the input training data. Returns the computed forward probabilities matrix. Returns the computed backward probabilities matrix. Obsolete. Please use instead. Creates a new instance of the learning algorithm for a given Markov sequence classifier using the specified configuration function. Trains each model to recognize each of the output labels. The sum log-likelihood for all models after training. Computes the model error for a given data set. The input points. The output points. The percent of misclassification errors for the data. Creates a new threshold model for the current set of Markov models in this sequence classifier. A threshold Markov model. Common interface for Hybrid Hidden Markov Models. 
Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The state optimized probability. The sequence of states that most likely produced the sequence. Calculates the probability that this model has generated the given sequence. Evaluation problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the probability that model M has generated sequence O. This can be computed efficiently using the Forward algorithm. A sequence of observations. The probability that the given sequence has been generated by this model. Gets the expected number of dimensions in each observation. Gets the number of states of this model. Gets or sets a user-defined tag. Hybrid Markov classifier for arbitrary state-observation functions. Gets the Markov models for each sequence class. Gets the number of dimensions of the observations handled by this classifier. Creates a new Sequence Classifier with the given number of classes. The models specializing in each of the classes of the classification problem. Computes the most likely class for a given sequence. The sequence of observations. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The probability of the assigned class. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The class responsibilities (or the probability of the sequence to belong to each class). When using threshold models, the sum of the probabilities will not equal one, and the remaining amount is the threshold probability. If a threshold model is not being used, the array should sum to one. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. General Markov function for arbitrary state-emission density definitions. The previous state index. The observation at the current state. An array containing the values for the observations in each next possible state. Hybrid Markov model for arbitrary state-observation functions. This class can be used to implement HMM hybrids such as ANN-HMM or SVM-HMMs through the specification of a custom . Gets the Markov function, which takes the previous state, the next state, and an observation, and produces a probability value. Gets the number of states in the model. Gets the number of dimensions of the observations handled by this model. Gets or sets a user-defined object associated with this model. Initializes a new instance of the class. A function specifying a probability for a transition-emission pair. The number of states in the model. The number of dimensions in the model. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The state optimized probability. 
The sequence of states that most likely produced the sequence. Calculates the probability that this model has generated the given sequence. Evaluation problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the probability that model M has generated sequence O. This can be computed efficiently using the Forward algorithm. A sequence of observations. The probability that the given sequence has been generated by this model. Internal methods for validation and other shared functions. Converts a univariate or multivariate array of observations into a two-dimensional jagged array. Converts a univariate or multivariate array of observations into a two-dimensional jagged array. Base class for Hidden Markov Models. This class cannot be instantiated. Constructs a new Hidden Markov Model. Gets the number of states of this model. Gets the log-initial probabilities log(pi) for this model. Gets the log-initial probabilities log(pi) for this model. Gets the log-transition matrix log(A) for this model. Gets the log-transition matrix log(A) for this model. Gets or sets a user-defined tag associated with this model. Base class for (HMM) Sequence Classifiers. This class cannot be instantiated. Initializes a new instance of the class. The number of classes in the classification problem. Initializes a new instance of the class. The models specializing in each of the classes of the classification problem. Gets or sets the threshold model. For gesture spotting, Lee and Kim introduced a threshold model which is composed of parts of the models in a hidden Markov sequence classifier. The threshold model acts as a baseline for decision rejection. If none of the classifiers is able to produce a higher likelihood than the threshold model, the decision is rejected. In the original Lee and Kim publication, the threshold model is constructed by creating a fully connected ergodic model by removing all outgoing transitions of states in all gesture models and fully connecting those states. References: H. Lee, J. Kim, An HMM-based threshold model approach for gesture recognition, IEEE Trans. Pattern Anal. Mach. Intell. 21 (10) (1999) 961–973. Gets or sets a value governing the rejection given by a threshold model (if present). Increasing this value will result in higher rejection rates. Default is 1. Gets the collection of models specialized in each class of the sequence classification problem. Gets the Hidden Markov Model implementation responsible for recognizing each of the classes given the desired class label. The class label of the model to get. Gets the number of classes which can be recognized by this classifier. Gets the prior distribution assumed for the classes. Computes the most likely class for a given sequence. The sequence of observations. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The probability of the assigned class. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The probabilities for each class. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the log-likelihood that a sequence belongs to a given class according to this classifier. The sequence of observations. The output class label. The log-likelihood of the sequence belonging to the given class. 
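To ground the threshold-model and rejection members above, a small usage sketch is shown below. It assumes an already-trained classifier (such as the ones built in the learning sketches earlier) and uses Compute with a responsibilities output; Sensitivity is assumed to be the name of the rejection-governing value described above, so treat it as such.

    using Accord.Statistics.Models.Markov;

    static void ClassifyWithRejection(HiddenMarkovClassifier classifier, int[] sequence)
    {
        // Increasing this value results in higher rejection rates (default is 1).
        classifier.Sensitivity = 2;

        double[] responsibilities;
        int label = classifier.Compute(sequence, out responsibilities);

        if (label == -1)
        {
            // None of the class models produced a higher likelihood than the
            // threshold model, so the sequence is rejected.
        }
        else
        {
            // responsibilities[label] is the probability of the decided class.
            // With a threshold model the responsibilities do not sum to one;
            // the missing mass is the threshold (rejection) probability.
        }
    }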
Computes the log-likelihood that a sequence belongs to any of the classes in the classifier. The sequence of observations. The log-likelihood of the sequence belonging to the classifier. Computes the log-likelihood of a set of sequences belonging to their given respective classes according to this classifier. A set of sequences of observations. The output class label for each sequence. The log-likelihood of the sequences belonging to the given classes. Returns an enumerator that iterates through the models in the classifier. A that can be used to iterate through the collection. Returns an enumerator that iterates through the models in the classifier. A that can be used to iterate through the collection. Common interface for sequence classifiers using hidden Markov models. Computes the most likely class for a given sequence. The sequence of observations. The class responsibilities (or the probability of the sequence to belong to each class). When using threshold models, the sum of the probabilities will not equal one, and the remaining amount is the threshold probability. If a threshold model is not being used, the array should sum to one. Return the label of the given sequence, or -1 if it has been rejected. Gets the number of classes which can be recognized by this classifier. Obsolete. Please use instead. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. An object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The initial emission probability distribution to be used by each of the states. This initial probability distribution will be cloned across all states. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. An object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The initial emission probability distributions for each state. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The transitions matrix A for this model. The emissions matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model with arbitrary-density state probabilities. The number of states for the model. An initial distribution to be copied to all states in the model. Gets the number of dimensions in the probability distributions for the states. Gets the Emission matrix (B) for this model. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The sequence of states that most likely produced the sequence. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The log-likelihood along the most likely sequence. 
The sequence of states that most likely produced the sequence. Calculates the probability of each hidden state for each observation in the observation vector. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Calculates the probability of each hidden state for each observation in the observation vector, and uses those probabilities to decode the most likely sequence of states for each observation in the sequence using the posterior decoding method. See remarks for details. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. The sequence of states most likely associated with each observation, estimated using the posterior decoding method. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Calculates the likelihood that this model has generated the given sequence. Evaluation problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the probability that model M has generated sequence O. This can be computed efficiently using either the Viterbi or the Forward algorithm. A sequence of observations. The log-likelihood that the given sequence has been generated by this model. Calculates the log-likelihood that this model has generated the given observation sequence along the given state path. A sequence of observations. A sequence of states. The log-likelihood that the given sequence of observations has been generated by this model along the given sequence of states. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observation that should be coming after the last observation in this sequence. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. 
The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. Predicts the next observation occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. The continuous probability distribution describing the next observations that are likely to be generated. Taking the mode of this distribution might give the most likely next value in the observed sequence. Predicts the next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. The log-likelihood of the given sequence, plus the predicted next observation. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. Predicts the next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. The log-likelihood of the given sequence, plus the predicted next observations. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. 
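The decoding, evaluation and prediction members above can be combined as in the following sketch for a univariate Normal model. Decode, Evaluate and Predict are the classic method names corresponding to these summaries; the simplest Predict overload is shown, and the model parameters are illustrative only (in practice the model would come from one of the learning algorithms).

    using Accord.Statistics.Distributions.Univariate;
    using Accord.Statistics.Models.Markov;

    // A 2-state model with Normal emission densities (untrained, for illustration).
    var hmm = new HiddenMarkovModel<NormalDistribution>(2, new NormalDistribution());

    double[] sequence = { 0.1, 0.2, 0.1, 0.3 };

    // Decoding problem: the most likely hidden state path (Viterbi).
    double viterbiLogLikelihood;
    int[] path = hmm.Decode(sequence, out viterbiLogLikelihood);

    // Evaluation problem: log-likelihood of the sequence under the model.
    double logLikelihood = hmm.Evaluate(sequence);

    // Prediction: the most immediate next observation after the sequence.
    double next = hmm.Predict(sequence);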
Generates a random vector of observations from the model. The number of samples to generate. A random vector of observations drawn from the model. Generates a random vector of observations from the model. The number of samples to generate. The log-likelihood of the generated observation sequence. The Viterbi path of the generated observation sequence. A random vector of observations drawn from the model. Predicts the next observation occurring after a given observation sequence. Predicts the next observation occurring after a given observation sequence. Predicts the next observation occurring after a given observation sequence. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Saves the hidden Markov model to a stream. The stream to which the model is to be serialized. Saves the hidden Markov model to a stream. The stream to which the model is to be serialized. Loads a hidden Markov model from a stream. The stream from which the model is to be deserialized. The deserialized model. Loads a hidden Markov model from a file. The path to the file from which the model is to be deserialized. The deserialized model. Obsolete. Please use instead. Gets the number of dimensions of the observations handled by this classifier. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, and Ergodic topology will be used. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated to sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated to sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated to sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated to sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. 
Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. The class labels for each of the models. Creates a new Sequence Classifier with the given number of classes. The models specializing in each of the classes of the classification problem. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The initial probability distributions for the hidden states. For multivariate continuous density distributions, such as Normal mixtures, the choice of initial values is crucial for a good performance. The class labels for each of the models. Computes the most likely class for a given sequence. The sequence of observations. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The probability of the assigned class. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The class responsibilities (or the probability of the sequence belonging to each class). When using threshold models, the sum of the probabilities will not equal one, and the amount left is the threshold probability. If a threshold model is not being used, the array should sum to one. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the log-likelihood of a sequence belonging to a given class according to this classifier. The sequence of observations. The output class label. The log-likelihood of the sequence belonging to the given class. Computes the log-likelihood that a sequence belongs to any of the classes in the classifier. The sequence of observations. The log-likelihood of the sequence belonging to the classifier. Computes the log-likelihood of a set of sequences belonging to their given respective classes according to this classifier. A set of sequences of observations. The output class label for each sequence. The log-likelihood of the sequences belonging to the given classes. Saves the classifier to a stream. The stream to which the classifier is to be serialized. Saves the classifier to a stream. The stream to which the classifier is to be serialized. Loads a classifier from a stream. The stream from which the classifier is to be deserialized. The deserialized classifier. Loads a classifier from a file. The path to the file from which the classifier is to be deserialized. The deserialized classifier. Discrete-density Hidden Markov Model. Hidden Markov Models (HMM) are stochastic methods to model temporal and sequence data. They are especially known for their application in temporal pattern recognition such as speech, handwriting, gesture recognition, part-of-speech tagging, musical score following, partial discharges and bioinformatics. This page refers to the discrete-density version of the model. 
For arbitrary density (probability distribution) definitions, please see . Dynamical systems of a discrete nature that are assumed to be governed by a Markov chain emit a sequence of observable outputs. Under the Markov assumption, it is also assumed that the latest output depends only on the current state of the system. Such states are often not known to the observer when only the output values are observable. Assuming the Markov property, the probability of any sequence of observations occurring when following a given sequence of states can be stated as
p(x1, ..., xT, y1, ..., yT) = p(y1) p(x1|y1) p(y2|y1) p(x2|y2) ... p(yT|yT-1) p(xT|yT)
in which the probabilities p(yt|yt-1) can be read as the probability of being currently in state yt given that we were in state yt-1 at the previous instant t-1, and the probability p(xt|yt) can be understood as the probability of observing xt at instant t given that we are currently in state yt. To compute those probabilities, we simply use two matrices A and B. The matrix A is the matrix of state transition probabilities: it gives the probabilities p(yt|yt-1) of jumping from one state to the other, and the matrix B is the matrix of observation probabilities, which gives the distribution density p(xt|yt) associated with a given state yt. In the discrete case, B is really a matrix. In the continuous case, B is a vector of probability distributions. The overall model definition can then be stated by the tuple
M = (n, A, B, p)
in which n is an integer representing the total number of states in the system, A is a matrix of transition probabilities, B is either a matrix of observation probabilities (in the discrete case) or a vector of probability distributions (in the general case), and p is a vector of initial state probabilities determining the probability of starting in each of the possible states in the model. Hidden Markov Models attempt to model such systems and allow, among other things: inferring the most likely sequence of states that produced a given output sequence; inferring which will be the most likely next state (and thus predicting the next output); and calculating the probability that a given sequence of outputs originated from the system (allowing the use of hidden Markov models for sequence classification). The “hidden” in Hidden Markov Models comes from the fact that the observer does not know which state the system is in, but has only a probabilistic insight on where it should be. To learn a Markov model, you can find a list of both supervised and unsupervised learning algorithms in the namespace. References: Wikipedia contributors. "Hidden Markov model." Wikipedia, the Free Encyclopedia. Available at: http://en.wikipedia.org/wiki/Hidden_Markov_model Nikolai Shokhirev, Hidden Markov Models. Personal website. Available at: http://www.shokhirev.com/nikolai/abc/alg/hmm/hmm.html X. Huang, A. Acero, H. Hon. "Spoken Language Processing." pp. 396-397. Prentice Hall, 2001. Dawei Shen. Some mathematics for HMMs, 2008. Available at: http://courses.media.mit.edu/2010fall/mas622j/ProblemSets/ps4/tutorial.pdf
The example below reproduces the same example given in the Wikipedia entry for the Viterbi algorithm (http://en.wikipedia.org/wiki/Viterbi_algorithm). In this example, the model's parameters are initialized manually. However, it is possible to learn those automatically using . If you would like to learn a hidden Markov model straight from a dataset, you can use a learning algorithm such as Baum-Welch (a sketch is given below). Hidden Markov Models are generative models, and as such, can be used to generate new samples following the structure that they have learned from the data. Hidden Markov Models can also be used to predict the next observation in a sequence. This can be done by inspecting the forward matrix of probabilities for the sequence and inspecting all states and possible symbols to find which state-observation combination would be the most likely after the current ones. This limits the applicability of this model to only very short-term predictions (i.e., most likely, only the most immediate next observation). For more examples on how to learn discrete models, please see the documentation page. For continuous models (models that can model more than just integer labels), please see . Baum-Welch, one of the most famous learning algorithms for Hidden Markov Models. Arbitrary-density Hidden Markov Model.
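The code sample referenced above did not survive extraction. What follows is a minimal, hedged reconstruction of the Wikipedia Viterbi example, assuming the HiddenMarkovModel class with the (transitions, emissions, initial) constructor and the Decode and Evaluate members described elsewhere on this page; the Baum-Welch portion at the end uses the BaumWelchLearning class name only as an illustration of how the parameters could instead be learned from data.

using System;
using Accord.Statistics.Models.Markov;
using Accord.Statistics.Models.Markov.Learning;

class ViterbiSketch
{
    static void Main()
    {
        // States: 0 = Healthy, 1 = Fever. Symbols: 0 = normal, 1 = cold, 2 = dizzy.
        double[] initial = { 0.6, 0.4 };

        double[,] transitions =
        {
            { 0.7, 0.3 },  // Healthy -> Healthy, Healthy -> Fever
            { 0.4, 0.6 },  // Fever   -> Healthy, Fever   -> Fever
        };

        double[,] emissions =
        {
            { 0.5, 0.4, 0.1 },  // p(normal|Healthy), p(cold|Healthy), p(dizzy|Healthy)
            { 0.1, 0.3, 0.6 },  // p(normal|Fever),   p(cold|Fever),   p(dizzy|Fever)
        };

        // Parameters set manually, as in the Wikipedia entry.
        var model = new HiddenMarkovModel(transitions, emissions, initial);

        // Decoding: the most likely state path for { normal, cold, dizzy };
        // the Wikipedia entry reports Healthy, Healthy, Fever for this sequence.
        int[] observations = { 0, 1, 2 };
        int[] path = model.Decode(observations);

        // Evaluation: the log-likelihood that this model generated the sequence.
        double logLikelihood = model.Evaluate(observations);

        Console.WriteLine(string.Join(", ", path));
        Console.WriteLine(logLikelihood);

        // Learning the parameters from data instead of setting them by hand:
        // a Baum-Welch sketch (class and member names are assumptions, not verified
        // here; older releases expose an equivalent Run method instead of Learn).
        int[][] trainingSequences =
        {
            new[] { 0, 1, 2 },
            new[] { 0, 0, 1, 2 },
            new[] { 0, 1, 1, 2, 2 },
        };

        var learned = new HiddenMarkovModel(2, 3); // 2 states, 3 symbols, uniform start
        var teacher = new BaumWelchLearning(learned) { Tolerance = 0.001 };
        teacher.Learn(trainingSequences);
    }
}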
Please use instead. Please use instead. Please use instead. Gets the number of symbols in this model's alphabet. Please use instead. Gets the log-emission matrix log(B) for this model. Constructs a new Hidden Markov Model. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The emissions matrix B for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The emissions matrix B for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Constructs a new Hidden Markov Model. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Whether to initialize emissions with random probabilities or uniformly with 1 / number of symbols. Default is false (default is to use 1/symbols). Constructs a new Hidden Markov Model. The transitions matrix A for this model. The emissions matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model. The transitions matrix A for this model. The emissions matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model. The number of states for this model. The number of output symbols used for this model. Constructs a new Hidden Markov Model. The number of states for this model. The number of output symbols used for this model. Whether to initialize the model transitions and emissions with random probabilities or uniformly with 1 / number of states (for transitions) and 1 / number of symbols (for emissions). Default is false. Predicts next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. Predicts next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. The log-likelihood of the different symbols for each predicted next observations. In order to convert those values to probabilities, exponentiate the values in the vectors (using the Exp function) and divide each value by their vector's sum. Predicts the next observation occurring after a given observation sequence. A sequence of observations. 
Predictions will be made regarding the next observation that should be coming after the last observation in this sequence. The log-likelihood of the different symbols for the next observation. In order to convert those values to probabilities, exponentiate the values in the vector (using the Exp function) and divide each value by the vector sum. This will give the probability of each next possible symbol to be the next observation in the sequence. Predicts the next observations occurring after a given observation sequence. A sequence of observations. Predictions will be made regarding the next observations that should be coming after the last observation in this sequence. The number of observations to be predicted. Default is 1. The log-likelihood of the different symbols for each predicted next observations. In order to convert those values to probabilities, exponentiate the values in the vectors (using the Exp function) and divide each value by their vector's sum. The log-likelihood of the given sequence, plus the predicted next observations. Exponentiate this value (use the System.Math.Exp function) to obtain a likelihood value. Converts this Discrete density Hidden Markov Model into a arbitrary density model. Converts this Discrete density Hidden Markov Model into a arbitrary density model. Converts this Discrete density Hidden Markov Model to a Continuous density model. Constructs a new discrete-density Hidden Markov Model. The transitions matrix A for this model. The emissions matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Constructs a new Hidden Markov Model with discrete state probabilities. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Constructs a new Hidden Markov Model with discrete state probabilities. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Whether to initialize emissions with random probabilities or uniformly with 1 / number of symbols. Default is false (default is to use 1/symbols). Constructs a new Hidden Markov Model with discrete state probabilities. The number of states for this model. The number of output symbols used for this model. Constructs a new Hidden Markov Model with discrete state probabilities. The number of states for this model. The number of output symbols used for this model. Whether to initialize emissions with random probabilities or uniformly with 1 / number of symbols. Default is false (default is to use 1/symbols). Creates a discrete hidden Markov model using the generic interface. The transitions matrix A for this model. The emissions matrix B for this model. The initial state probabilities for this model. Set to true if the matrices are given with logarithms of the intended probabilities; set to false otherwise. Default is false. Creates a discrete hidden Markov model using the generic interface. A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Creates a discrete hidden Markov model using the generic interface. 
A object specifying the initial values of the matrix of transition probabilities A and initial state probabilities pi to be used by this model. The number of output symbols used for this model. Whether to initialize emissions with random probabilities or uniformly with 1 / number of symbols. Default is false (default is to use 1/symbols). Creates a discrete hidden Markov model using the generic interface. The number of states for this model. The number of output symbols used for this model. Creates a discrete hidden Markov model using the generic interface. The number of states for this model. The number of output symbols used for this model. Whether to initialize emissions with random probabilities or uniformly with 1 / number of symbols. Default is false (default is to use 1/symbols). Calculates the probability of each hidden state for each observation in the observation vector. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Calculates the probability of each hidden state for each observation in the observation vector, and uses those probabilities to decode the most likely sequence of states for each observation in the sequence using the posterior decoding method. See remarks for details. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. The sequence of states most likely associated with each observation, estimated using the posterior decoding method. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Saves the hidden Markov model to a stream. The stream to which the model is to be serialized. Saves the hidden Markov model to a stream. The stream to which the model is to be serialized. Loads a hidden Markov model from a stream. The stream from which the model is to be deserialized. The deserialized classifier. Loads a hidden Markov model from a file. The path to the file from which the model is to be deserialized. The deserialized model. Loads a hidden Markov model from a stream. The stream from which the model is to be deserialized. The deserialized model. Loads a hidden Markov model from a file. 
The path to the file from which the model is to be deserialized. The deserialized model. Common interface for Hidden Markov Models. Calculates the most likely sequence of hidden states that produced the given observation sequence. Decoding problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1,o2, ..., oK}, calculate the most likely sequence of hidden states Si that produced this observation sequence O. This can be computed efficiently using the Viterbi algorithm. A sequence of observations. The state optimized probability. The sequence of states that most likely produced the sequence. Calculates the probability that this model has generated the given sequence. Evaluation problem. Given the HMM M = (A, B, pi) and the observation sequence O = {o1, o2, ..., oK}, calculate the probability that model M has generated sequence O. This can be computed efficiently using the Forward algorithm. A sequence of observations. The probability that the given sequence has been generated by this model. Gets the number of states of this model. Gets the initial probabilities for this model. Gets the log of the initial probabilities (log(pi)) for this model. Gets the log of the transition matrix (log(A)) for this model. Gets the log of the transition matrix (log(A)) for this model. Gets or sets a user-defined tag. Calculates the probability of each hidden state for each observation in the observation vector. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Calculates the probability of each hidden state for each observation in the observation vector, and uses those probabilities to decode the most likely sequence of states for each observation in the sequence using the posterior decoding method. See remarks for details. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. By following those probabilities in order, we may decode those probabilities into a sequence of most likely states. However, the sequence of obtained states may not be valid in the model. A sequence of observations. The sequence of states most likely associated with each observation, estimated using the posterior decoding method. A vector of the same size as the observation vectors, containing the probabilities for each state in the model for the current observation. If there are 3 states in the model, and the array contains 5 elements, the resulting vector will contain 5 vectors of size 3 each. Each vector of size 3 will contain probability values that sum up to one. Discrete-density Hidden Markov Model Set for Sequence Classification. This class uses a set of discrete hidden Markov models to classify sequences of integer symbols. 
Each model will try to learn and recognize each of the different output classes. For examples and details on how to learn such models, please take a look at the documentation for . For other types of sequences, such as discrete sequences (not necessarily symbols) or even continuous and multivariate variables, please see the generic classifier counterpart. Examples are available at the respective learning algorithm pages. For example, see . Obsolete. Please use instead. Gets the number of symbols recognizable by the models. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The number of symbols in the models' discrete alphabet. The optional class names for each of the classifiers. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The number of symbols in the models' discrete alphabet. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The number of symbols in the models' discrete alphabet. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. The topology of the hidden states. A forward-only topology is indicated for sequence classification problems, such as speech recognition. The number of symbols in the models' discrete alphabet. The optional class names for each of the classifiers. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The number of symbols in the models' discrete alphabet. The optional class names for each of the classifiers. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The number of symbols in the models' discrete alphabet. Computes the most likely class for a given sequence. The sequence of observations. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The class responsibilities (or the probability of the sequence belonging to each class). When using threshold models, the sum of the probabilities will not equal one, and the amount left is the threshold probability. If a threshold model is not being used, the array should sum to one. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the most likely class for a given sequence. The sequence of observations. The class responsibilities (or the probability of the sequence belonging to each class). When using threshold models, the sum of the probabilities will not equal one, and the amount left is the threshold probability. 
If a threshold model is not being used, the array should sum to one. Return the label of the given sequence, or -1 if it has been rejected by the threshold model. Computes the log-likelihood of a set of sequences belonging to their given respective classes according to this classifier. A set of sequences of observations. The output class label for each sequence. The log-likelihood of the sequences belonging to the given classes. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The number of symbols in the models' discrete alphabet. Creates a new Sequence Classifier with the given number of classes. The number of classes in the classifier. An array specifying the number of hidden states for each of the classifiers. By default, an Ergodic topology will be used. The number of symbols in the models' discrete alphabet. Computes the most likely class for a given sequence. Saves the classifier to a stream. The stream to which the classifier is to be serialized. Saves the classifier to a stream. The stream to which the classifier is to be serialized. Loads a classifier from a stream. The stream from which the classifier is to be deserialized. The deserialized classifier. Loads a classifier from a file. The path to the file from which the classifier is to be deserialized. The deserialized classifier. Loads a classifier from a stream. The stream from which the classifier is to be deserialized. The deserialized classifier. Loads a classifier from a file. The path to the file from which the classifier is to be deserialized. The deserialized classifier. Custom Topology for Hidden Markov Model. A Hidden Markov Model Topology specifies how many states and which initial probabilities a Markov model should have. Two common topologies can be discussed in terms of transition state probabilities and are available for construction through the and classes implementing the interface. Topology specification is important with regard to both learning and performance: a model with too many states (and thus too many settable parameters) will require too much training data, while a model with an insufficient number of states will prohibit the HMM from capturing subtle statistical patterns. This custom implementation allows for the arbitrary specification of the state transition matrix and initial state probabilities for hidden Markov models. Creates a new custom topology with user-defined transition matrix and initial state probabilities. The initial probabilities for the model. The transition probabilities for the model. Creates a new custom topology with user-defined transition matrix and initial state probabilities. The initial probabilities for the model. The transition probabilities for the model. Set to true if the passed transitions are given in log-probabilities. Default is false (given values are probabilities). Creates a new custom topology with user-defined transition matrix and initial state probabilities. The initial probabilities for the model. The transition probabilities for the model. Creates a new custom topology with user-defined transition matrix and initial state probabilities. The initial probabilities for the model. The transition probabilities for the model. Set to true if the passed transitions are given in log-probabilities. Default is false (given values are probabilities). 
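Returning to the discrete sequence classifier documented above (before the topology classes), the sketch below shows one plausible end-to-end use under the classic Accord API: the HiddenMarkovClassifierLearning and BaumWelchLearning names, the Run member and the per-model lambda are assumptions based on that API and may differ in newer releases; the constructor arguments and the Compute member follow the descriptions given above.

using System;
using Accord.Statistics.Models.Markov;
using Accord.Statistics.Models.Markov.Learning;
using Accord.Statistics.Models.Markov.Topology;

class SequenceClassifierSketch
{
    static void Main()
    {
        // Two classes of integer sequences over an alphabet of 3 symbols.
        int[][] inputs =
        {
            new[] { 0, 1, 1, 0 },   // class 0
            new[] { 0, 0, 1, 0 },   // class 0
            new[] { 2, 2, 1, 2 },   // class 1
            new[] { 2, 1, 2, 2 },   // class 1
        };
        int[] outputs = { 0, 0, 1, 1 };

        // One forward-only model with 2 hidden states per class, 3 symbols
        // (constructor arguments as described above: classes, topology, symbols).
        var classifier = new HiddenMarkovClassifier(2, new Forward(2), 3);

        // Train each inner model with Baum-Welch (assumed learning API).
        var teacher = new HiddenMarkovClassifierLearning(classifier,
            modelIndex => new BaumWelchLearning(classifier.Models[modelIndex])
            {
                Tolerance = 0.001
            });

        teacher.Run(inputs, outputs);

        // Compute the most likely class for a new sequence; -1 would indicate
        // rejection by a threshold model, as described above.
        int decision = classifier.Compute(new[] { 0, 1, 0, 0 });
        Console.WriteLine(decision);
    }
}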
Gets the number of states in this topology. Gets the initial state probabilities. Gets the state-transitions matrix. Creates the state transitions matrix and the initial state probabilities for this topology. Ergodic (fully-connected) Topology for Hidden Markov Models. Ergodic models are commonly used to represent models in which a single (large) sequence of observations is used for training (such as when a training sequence does not have well defined starting and ending points and can potentially be infinitely long). Models starting with an ergodic transition-state topology typically have only a small number of states. References: Alexander Schliep, "Learning Hidden Markov Model Topology". Richard Hughey and Anders Krogh, "Hidden Markov models for sequence analysis: extension and analysis of the basic method", CABIOS 12(2):95-107, 1996. Available in: http://compbio.soe.ucsc.edu/html_format_papers/hughkrogh96/cabios.html In a second example, we will create an Ergodic (fully connected) discrete-density hidden Markov model with uniform probabilities. // Create a new Ergodic hidden Markov model with three // fully-connected states and four sequence symbols. var model = new HiddenMarkovModel(new Ergodic(3), 4); // After creation, the state transition matrix for the model // should be given by: // // { 0.33, 0.33, 0.33 } // { 0.33, 0.33, 0.33 } // { 0.33, 0.33, 0.33 } // // in which all state transitions are allowed. Gets the number of states in this topology. Gets or sets whether the transition matrix should be initialized with random probabilities or not. Default is false. Creates a new Ergodic topology for a given number of states. The number of states to be used in the model. Creates a new Ergodic topology for a given number of states. The number of states to be used in the model. Whether to initialize the model with random probabilities or uniformly with 1 / number of states. Default is false (default is to use 1/states). Creates the state transitions matrix and the initial state probabilities for this topology. Forward Topology for Hidden Markov Models. Forward topologies are commonly used to initialize models in which training sequences can be organized in samples, such as in the recognition of spoken words. In spoken word recognition, several examples of a single word can (and should) be used to train a single model, to achieve the most general model able to generalize over a great number of word samples. Forward models can typically have a large number of states. References: Alexander Schliep, "Learning Hidden Markov Model Topology". Richard Hughey and Anders Krogh, "Hidden Markov models for sequence analysis: extension and analysis of the basic method", CABIOS 12(2):95-107, 1996. Available in: http://compbio.soe.ucsc.edu/html_format_papers/hughkrogh96/cabios.html In the following example, we will create a Forward-only discrete-density hidden Markov model. // Create a new Forward-only hidden Markov model with // three forward-only states and four sequence symbols. var model = new HiddenMarkovModel(new Forward(3), 4); // After creation, the state transition matrix for the model // should be given by: // // { 0.33, 0.33, 0.33 } // { 0.00, 0.50, 0.50 } // { 0.00, 0.00, 1.00 } // // in which no backward transitions are allowed (have zero probability). Gets the number of states in this topology. Gets or sets the maximum deepness level allowed for the forward state transition chains. Gets or sets whether the transition matrix should be initialized with random probabilities or not. 
Default is false. Gets the initial state probabilities. Creates a new Forward topology for a given number of states. The number of states to be used in the model. Creates a new Forward topology for a given number of states. The number of states to be used in the model. The maximum number of forward transitions allowed for a state. Default is to use the same as the number of states (all forward connections are allowed). Creates a new Forward topology for a given number of states. The number of states to be used in the model. Whether to initialize the model with random probabilities or uniformly with 1 / number of states. Default is false (default is to use 1/states). Creates a new Forward topology for a given number of states. The number of states to be used in the model. The maximum number of forward transitions allowed for a state. Default is to use the same as the number of states (all forward connections are allowed). Whether to initialize the model with random probabilities or uniformly with 1 / number of states. Default is false (default is to use 1/states). Creates the state transitions matrix and the initial state probabilities for this topology. Hidden Markov model topology (architecture) specification. An Hidden Markov Model Topology specifies how many states and which initial probabilities a Markov model should have. Two common topologies can be discussed in terms of transition state probabilities and are available to construction through the and classes implementing this interface. Topology specification is important with regard to both learning and performance: A model with too many states (and thus too many settable parameters) will require too much training data while an model with an insufficient number of states will prohibit the HMM from capturing subtle statistical patterns. References: Alexander Schliep, "Learning Hidden Markov Model Topology". Richard Hughey and Anders Krogh, "Hidden Markov models for sequence analysis: extension and analysis of the basic method", CABIOS 12(2):95-107, 1996. Available in: http://compbio.soe.ucsc.edu/html_format_papers/hughkrogh96/cabios.html Gets the number of states in this topology. Creates the state transitions matrix and the initial state probabilities for this topology. Polynomial Least-Squares. In linear regression, the model specification is that the dependent variable, y is a linear combination of the parameters (but need not be linear in the independent variables). As the linear regression has a closed form solution, the regression coefficients can be efficiently computed using the Regress method of this class. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets whether to always use a robust Least-Squares estimate using the . Default is false. Gets or sets the polynomial degree to use in the polynomial regression. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Least Squares learning algorithm for linear regression models. Let's say we have some univariate, continuous sets of input data, and a corresponding univariate, continuous set of output data, such as a set of points in R². 
A simple linear regression is able to fit a line relating the input variables to the output variables in which the minimum-squared-error of the line and the actual output points is minimum. The following example shows how to fit a multiple linear regression model to model a plane as an equation in the form ax + by + c = z. The following example shows how to fit a multivariate linear regression model, producing multidimensional outputs for each input. Gets or sets whether to include an intercept term in the learned models. Default is true. Gets or sets whether to always use a robust Least-Squares estimate using the . Default is false. Initializes a new instance of the class. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Gets the information matrix used to update the regression weights in the last call to Common interface for Linear Regression Models. This interface specifies a common interface for querying a linear regression model. Since a closed-form solution exists for fitting most linear models, each of the models may also implement a Regress method for computing actual regression. Computes the model output for a given input. Multiple Linear Regression. In multiple linear regression, the model specification is that the dependent variable, denoted y_i, is a linear combination of the parameters (but need not be linear in the independent x_i variables). As the linear regression has a closed form solution, the regression coefficients can be computed by calling the method only once. The following example shows how to fit a multiple linear regression model to model a plane as an equation in the form ax + by + c = z. The next example shows how to fit a multiple linear regression model in conjunction with a discrete codebook to learn from discrete variables using one-hot encodings when applicable: The next example shows how to fit a multiple linear regression model with the additional constraint that none of its coefficients should be negative. For this we can use the learning algorithm instead of the used above. Creates a new Multiple Linear Regression. The number of inputs for the regression. Creates a new Multiple Linear Regression. The number of inputs for the regression. True to use an intercept term, false otherwise. Default is false. Creates a new Multiple Linear Regression. The number of inputs for the regression. True to use an intercept term, false otherwise. Default is false. Initializes a new instance of the class. Gets the coefficients used by the regression model. If the model contains an intercept term, it will be in the end of the vector. Gets the number of inputs accepted by the model. 
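The plane-fitting example mentioned in the remarks above did not survive extraction. The sketch below is a hedged reconstruction that fits ax + by + c = z with the OrdinaryLeastSquares learning class and the Learn, Transform, Weights and Intercept members described in this section; the class and member names follow recent Accord releases, and the data values are invented for illustration.

using System;
using Accord.Statistics.Models.Regression.Linear;

class PlaneRegressionSketch
{
    static void Main()
    {
        // Points (x, y) and the corresponding heights z, roughly following z = 2x + 3y + 1.
        double[][] inputs =
        {
            new[] { 1.0, 1.0 },
            new[] { 2.0, 1.0 },
            new[] { 1.0, 2.0 },
            new[] { 3.0, 2.0 },
            new[] { 2.0, 3.0 },
        };
        double[] outputs = { 6.0, 8.0, 9.0, 13.0, 14.0 };

        // Ordinary least squares with an intercept term (the c in ax + by + c = z).
        var ols = new OrdinaryLeastSquares() { UseIntercept = true };
        MultipleLinearRegression regression = ols.Learn(inputs, outputs);

        // The learned coefficients a, b and the intercept c.
        double a = regression.Weights[0];
        double b = regression.Weights[1];
        double c = regression.Intercept;

        // Predict z for a new point using the transformation members described above.
        double predicted = regression.Transform(new[] { 4.0, 4.0 });
        Console.WriteLine("{0} {1} {2} -> {3}", a, b, c, predicted);
    }
}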
Gets or sets the linear weights of the regression model. The intercept term is not stored in this vector, but is instead available through the property. Gets the number of parameters in this model (equals the NumberOfInputs + 1). Gets the number of inputs for the regression model. Gets whether this model has an additional intercept term. Gets or sets the intercept value for the regression. Performs the regression using the input vectors and output data, returning the sum of squared errors of the fit. The input vectors to be used in the regression. The output values for each input vector. Set to true to force the use of the . This will avoid any rank exceptions, but might be more computing intensive. The Sum-Of-Squares error of the regression. Performs the regression using the input vectors and output data, returning the sum of squared errors of the fit. The input vectors to be used in the regression. The output values for each input vector. The Sum-Of-Squares error of the regression. Performs the regression using the input vectors and output data, returning the sum of squared errors of the fit. The input vectors to be used in the regression. The output values for each input vector. Gets the Fisher's information matrix. Set to true to force the use of the . This will avoid any rank exceptions, but might be more computing intensive. The Sum-Of-Squares error of the regression. Gets the coefficient of determination, as known as R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Gets the overall regression standard error. The inputs used to train the model. The outputs used to train the model. Gets the degrees of freedom when fitting the regression. Gets the standard error for each coefficient. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). Gets the standard error of the fit for a particular input vector. The input vector where the standard error of the fit should be computed. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). The standard error of the fit at the given input point. Gets the standard error of the prediction for a particular input vector. The input vector where the standard error of the prediction should be computed. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). The standard error of the prediction given for the input point. Gets the confidence interval for an input point. The input vector. The overall regression standard error (can be computed from . The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). 
Gets the prediction interval for an input point. The input vector. The overall regression standard error (can be computed from . The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). Computes the Multiple Linear Regression for an input vector. The input vector. The calculated output. Computes the Multiple Linear Regression for input vectors. The input vector data. The calculated outputs. Returns a System.String representing the regression. Creates a new linear regression directly from data points. The input vectors x. The output vectors y. A linear regression f(x) that most approximates y. Creates a new linear regression from the regression coefficients. The linear coefficients. Whether to include an intercept (bias) term. A linear regression with the given coefficients. Returns a that represents this instance. The format to use.-or- A null reference (Nothing in Visual Basic) to use the default format defined for the type of the System.IFormattable implementation. The provider to use to format the value.-or- A null reference (Nothing in Visual Basic) to obtain the numeric format information from the current locale setting of the operating system. A that represents this instance. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Multivariate Linear Regression. Multivariate Linear Regression is a generalization of Multiple Linear Regression to allow for multiple outputs. Creates a new Multivariate Linear Regression. The number of inputs for the regression. The number of outputs for the regression. Creates a new Multivariate Linear Regression. The number of inputs for the regression. The number of outputs for the regression. True to use an intercept term, false otherwise. Default is false. Creates a new Multivariate Linear Regression. Creates a new Multivariate Linear Regression. Initializes a new instance of the class. Gets the coefficient matrix used by the regression model. Each column corresponds to the coefficient vector for each of the outputs. Gets the linear weights matrix. Gets the intercept vector (bias). Gets the number of parameters in the model (returns NumberOfInputs * NumberOfOutputs + NumberOfInputs). Gets the number of inputs in the model. Gets the number of outputs in the model. Performs the regression using the input vectors and output vectors, returning the sum of squared errors of the fit. The input vectors to be used in the regression. The output values for each input vector. The Sum-Of-Squares error of the regression. Gets the coefficient of determination, as known as R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. 
Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Gets the coefficient of determination, as known as R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Computes the Multiple Linear Regression output for a given input. A input vector. The computed output. Computes the Multiple Linear Regression output for a given input. An array of input vectors. The computed outputs. Creates a new linear regression directly from data points. The input vectors x. The output vectors y. A linear regression f(x) that most approximates y. Creates a new linear regression from the regression coefficients. The linear coefficients. The intercept (bias) values. A linear regression with the given coefficients. Creates the inverse regression, a regression that can recover the input data given the outputs of this current regression. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Gets the overall regression standard error. The inputs used to train the model. The outputs used to train the model. Gets the degrees of freedom when fitting the regression. Gets the standard error for each coefficient. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). Gets the standard error of the fit for a particular input vector. The input vector where the standard error of the fit should be computed. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). The standard error of the fit at the given input point. Gets the standard error of the prediction for a particular input vector. The input vector where the standard error of the prediction should be computed. The overall regression standard error (can be computed from . The information matrix obtained when training the model (see ). The standard error of the prediction given for the input point. Gets the confidence interval for an input point. The input vector. The overall regression standard error (can be computed from . The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). Gets the prediction interval for an input point. The input vector. The overall regression standard error (can be computed from . The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). Simple Linear Regression of the form y = Ax + B. 
In linear regression, the model specification is that the dependent variable, y is a linear combination of the parameters (but need not be linear in the independent variables). As the linear regression has a closed form solution, the regression coefficients can be efficiently computed using the Regress method of this class. Let's say we have some univariate, continuous sets of input data, and a corresponding univariate, continuous set of output data, such as a set of points in R². A simple linear regression is able to fit a line relating the input variables to the output variables in which the minimum-squared-error of the line and the actual output points is minimum. Now, let's say we would like to perform a regression using an intermediary transformation, such as for example logarithmic regression. In this case, all we have to do is to first transform the input variables into the desired domain, then apply the regression as normal: Creates a new Simple Linear Regression of the form y = Ax + B. Angular coefficient (Slope). Linear coefficient (Intercept). Gets the number of parameters in the model (returns 2). Performs the regression using the input and output data, returning the sum of squared errors of the fit. The input data. The output data. The regression Sum-of-Squares error. Computes the regression output for a given input. An array of input values. The array of calculated output values. Computes the regression for a single input. The input value. The calculated output. Gets the coefficient of determination, as known as R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Gets the coefficient of determination, or R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Returns a System.String representing the regression. Returns a System.String representing the regression. Returns a System.String representing the regression. Returns a System.String representing the regression. Creates a new linear regression directly from data points. The input vectors x. The output vectors y. A linear regression f(x) that most approximates y. 
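For the univariate case described above, the following hedged sketch fits a line y = Ax + B with the OrdinaryLeastSquares class (assumed, as in the multiple-regression sketch earlier) and then reuses the same pattern for the logarithmic-regression idea by transforming the inputs before fitting; data values are invented for illustration.

using System;
using Accord.Statistics.Models.Regression.Linear;

class SimpleRegressionSketch
{
    static void Main()
    {
        // Univariate inputs and outputs, roughly following y = 2x + 1.
        double[] x = { 1, 2, 3, 4, 5 };
        double[] y = { 3, 5, 7, 9, 11 };

        var ols = new OrdinaryLeastSquares();
        SimpleLinearRegression line = ols.Learn(x, y);

        Console.WriteLine("slope A = {0}, intercept B = {1}", line.Slope, line.Intercept);

        // Logarithmic regression as described above: first transform the inputs into
        // the desired domain, then fit the line as usual on the transformed values.
        double[] logX = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            logX[i] = Math.Log(x[i]);

        SimpleLinearRegression logLine = new OrdinaryLeastSquares().Learn(logX, y);
        double predictionAtSix = logLine.Transform(Math.Log(6));
        Console.WriteLine(predictionAtSix);
    }
}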
Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Gets the degrees of freedom when fitting the regression. Gets the overall regression standard error. The inputs used to train the model. The outputs used to train the model. Gets the standard error of the fit for a particular input point. The input vector where the standard error of the fit should be computed. The inputs used to train the model. The outputs used to train the model. The standard error of the fit at the given input point. Gets the standard error of the prediction for a particular input point. The input vector where the standard error of the prediction should be computed. The inputs used to train the model. The outputs used to train the model. The standard error of the prediction given for the input point. Gets the confidence interval for an input point. The input point. The inputs used to train the model. The outputs used to train the model. The prediction interval confidence (default is 95%). Gets the prediction interval for an input point. The input point. The inputs used to train the model. The outputs used to train the model. The prediction interval confidence (default is 95%). Polynomial Linear Regression. In linear regression, the model specification is that the dependent variable, y is a linear combination of the parameters (but need not be linear in the independent variables). As the linear regression has a closed form solution, the regression coefficients can be efficiently computed using the Regress method of this class. Creates a new Polynomial Linear Regression. The degree of the polynomial used by the model. Creates a new Polynomial Linear Regression. Creates a new Polynomial Linear Regression. Gets the degree of the polynomial used by the regression. Gets the coefficients of the polynomial regression, with the first being the higher-order term and the last the intercept term. Gets or sets the linear weights of the regression model. The intercept term is not stored in this vector, but is instead available through the property. Gets or sets the intercept value for the regression. Performs the regression using the input and output data, returning the sum of squared errors of the fit. The input data. The output data. The regression Sum-of-Squares error. Computes the regressed model output for the given inputs. The input data. The computed outputs. Computes the regressed model output for the given input. The input value. The computed output. Gets the coefficient of determination, as known as R² (r-squared). The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Gets the coefficient of determination, as known as R² (r-squared). 
The coefficient of determination is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. The R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data. This method uses the class to compute the R² coefficient. Please see the documentation for for more details, including usage examples. The R² (r-squared) coefficient for the given data. Returns a System.String representing the regression. Returns a System.String representing the regression. Returns a System.String representing the regression. Returns a System.String representing the regression. Creates a new polynomial regression directly from data points. The polynomial degree to use. The input vectors x. The output vectors y. A polynomial regression f(x) that most approximates y. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location to where to store the result of this transformation. The output generated by applying this transformation to the given input. Gradient optimization for Multinomial logistic regression fitting. The gradient optimization class allows multinomial logistic regression models to be learnt using any mathematical optimization algorithm that implements the interface. Using Conjugate Gradient: Using Gradient Descent: Using BFGS: Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the optimization method used to optimize the parameters (learn) the . Gets or sets the number of samples to be used as the mini-batch. If set to 0 (or a negative number) the total number of training samples will be used as the mini-batch. The size of the mini batch. Creates a new . Creates a new . The regression to estimate. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Non-negative Least Squares for optimization. 
References: Donghui Chen and Robert J.Plemmons, Nonnegativity Constraints in Numerical Analysis. Available on: http://users.wfu.edu/plemmons/papers/nonneg.pdf The following example shows how to fit a multiple linear regression model with the additional constraint that none of its coefficients should be negative. For this we can use the learning algorithm instead of the used above. Gets the coefficient vector being fitted. Gets or sets the maximum number of iterations to be performed. Gets or sets the tolerance for detecting convergence. Default is 0.001. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Initializes a new instance of the class. Initializes a new instance of the class. The regression to be fitted. Runs the fitting algorithm. The input training data. The output associated with each of the outputs. The sum of squared errors after the learning. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Non-linear Least Squares for optimization. The first example shows how to fit a non-linear least squares problem with . The second example shows how to fit a non-linear least squares problem with . Gets or sets a value indicating whether standard errors should be computed in the next iteration. true to compute standard errors; otherwise, false. Gets the Least-Squares optimization algorithm used to perform the actual learning. Gets the number of variables (free parameters) in the non-linear model specified in . The number of parameters of . Gets or sets the model function, mapping inputs to outputs given a suitable parameter vector. Gets or sets a function that computes the gradient of the in respect to the current parameters. Gets or sets the vector of initial values to be used at the beginning of the optimization. Setting a suitable set of initial values can be important to achieve good convergence or avoid poor local minimas. Initializes a new instance of the class. Initializes a new instance of the class. The regression model. Initializes a new instance of the class. The regression model. The least squares algorithm to be used to estimate the regression parameters. Default is to use a Levenberg-Marquardt algorithm. Runs the fitting algorithm. The input training data. The output associated with each of the outputs. The sum of squared errors after the learning. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Stochastic Gradient Descent learning for Logistic Regression fitting. Gets the previous values for the coefficients which were in place before the last learning iteration was performed. Gets the current values for the coefficients. Gets the Gradient vector computed in the last Newton-Raphson iteration. Gets the total number of parameters in the model. Gets or sets whether this algorithm should use stochastic updates or not. Default is false. Gets or sets the algorithm learning rate. Default is 0.1. Please use MaxIterations instead. 
Gets or sets the maximum number of iterations performed by the learning algorithm. Gets or sets the tolerance value used to determine whether the algorithm has converged. Gets the current iteration number. The current iteration. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Constructs a new Gradient Descent algorithm. Constructs a new Gradient Descent algorithm. The regression to estimate. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs a single pass of the gradient descent algorithm. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Computes the sum-of-squared error between the model outputs and the expected outputs. The input data set. The output values. The sum-of-squared errors. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Common interface for multiple regression fitting methods. Runs the fitting algorithm. The input training data. The output associated with each of the outputs. The error. Common interface for regression fitting methods. Runs the fitting algorithm. The input training data. The output associated with each of the outputs. The sum of squared errors after the learning. Common interface for regression fitting methods. Runs the fitting algorithm. The input training data. The time until the output happened. The indication variables used to signal if the event occurred or if it was censored. The error. Runs the fitting algorithm. The input training data. The time until the output happened. The indication variables used to signal if the event occurred or if it was censored. The error. Iterative Reweighted Least Squares for Logistic Regression fitting. The Iterative Reweighted Least Squares is an iterative technique based on the Newton-Raphson iterative optimization scheme. The IRLS method uses a local quadratic approximation to the log-likelihood function. By applying the Newton-Raphson optimization scheme to the cross-entropy error function (defined as the negative logarithm of the likelihood), one arises at a weighted formulation for the Hessian matrix. The Iterative Reweighted Least Squares algorithm can also be used to learn arbitrary generalized linear models. However, the use of this class to learn such models is currently experimental. References: Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. Amos Storkey. (2005). 
Learning from Data: Learning Logistic Regressors. School of Informatics. Available on: http://www.inf.ed.ac.uk/teaching/courses/lfd/lectures/logisticlearn-print.pdf Cosma Shalizi. (2009). Logistic Regression and Newton's Method. Available on: http://www.stat.cmu.edu/~cshalizi/350/lectures/26/lecture-26.pdf Edward F. Conor. Logistic Regression. Website. Available on: http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm Constructs a new Iterative Reweighted Least Squares. Constructs a new Iterative Reweighted Least Squares. The regression to estimate. Constructs a new Iterative Reweighted Least Squares. The regression to estimate. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The weights associated with each sample. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The weight associated with each sample. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs one iteration of the Reweighted Least Squares algorithm. The input data. The outputs associated with each input vector. An weight associated with each sample. The maximum relative change in the parameters after the iteration. Computes the sum-of-squared error between the model outputs and the expected outputs. The input data set. The output values. The sum-of-squared errors. Iterative Reweighted Least Squares for fitting Generalized Linear Models. Initializes this instance. Gets or sets the regression model being learned. Gets the previous values for the coefficients which were in place before the last learning iteration was performed. Gets the last parameter updates in the last iteration. Gets the current values for the coefficients. Gets the Hessian matrix computed in the last Newton-Raphson iteration. Gets the Gradient vector computed in the last Newton-Raphson iteration. Gets the total number of parameters in the model. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Please use MaxIterations instead. Gets or sets the tolerance value used to determine whether the algorithm has converged. Gets or sets the maximum number of iterations performed by the learning algorithm. The maximum iterations. Gets the current iteration number. The current iteration. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets or sets a value indicating whether standard errors should be computed in the next iteration. true to compute standard errors; otherwise, false. Gets or sets the regularization value to be added in the objective function. 
Default is 1e-10. Initializes a new instance of the class. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each input. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . outputs: The number of input vectors and their associated output values must have the same size. Gets the information matrix used to update the regression weights in the last call to . Lower-Bound Newton-Raphson for Multinomial logistic regression fitting. The Lower Bound principle consists of replacing the second derivative matrix by a global lower bound in the Loewner ordering [Böhning, 92]. In the case of multinomial logistic regression estimation, the Hessian of the negative log-likelihood function can be replaced by one of those lower bounds, leading to a monotonically converging sequence of iterates. Furthermore, [Krishnapuram, Carin, Figueiredo and Hartemink, 2005] have also shown that a lower bound can be achieved which does not depend on the coefficients for the current iteration. References: B. Krishnapuram, L. Carin, M.A.T. Figueiredo, A. Hartemink. Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds. 2005. Available on: http://www.lx.it.pt/~mtf/Krishnapuram_Carin_Figueiredo_Hartemink_2005.pdf D. Böhning. Multinomial logistic regression algorithm. Annals of the Institute of Statistical Mathematics, 44(9):197–200, 1992. Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. Gets the previous values for the coefficients which were in place before the last learning iteration was performed. Gets the current values for the coefficients. Gets or sets a value indicating whether the lower bound should be updated using new data. Default is true. true if the lower bound should be updated; otherwise, false. Gets the Lower-Bound matrix being used in place of the Hessian matrix in the Newton-Raphson iterations. Gets the Gradient vector computed in the last Newton-Raphson iteration. Gets the total number of parameters in the model. Gets or sets a value indicating whether standard errors should be computed in the next iteration. true to compute standard errors; otherwise, false. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. Gets or sets the tolerance value used to determine whether the algorithm has converged. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets the vector of parameter updates in the last iteration. How much each parameter changed after the last update. Gets the maximum parameter change in the last iteration. If this value is less than , the algorithm has converged. Creates a new . Creates a new . The regression to estimate. 
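As a rough usage sketch for this learning algorithm (the toy data is invented, and the Learn and Decide member names follow the generic learning and classification interfaces described elsewhere in this documentation rather than being quoted from a shipped example):

    using Accord.Statistics.Models.Regression;
    using Accord.Statistics.Models.Regression.Fitting;

    // Assumed toy data: two input variables, three possible categories (0, 1 or 2).
    double[][] inputs =
    {
        new double[] { 1, 1 }, new double[] { 2, 1 },
        new double[] { 5, 2 }, new double[] { 6, 1 },
        new double[] { 1, 7 }, new double[] { 2, 8 },
    };
    int[] outputs = { 0, 0, 1, 1, 2, 2 };

    // Create the Lower-Bound Newton-Raphson learning algorithm.
    var lbnr = new LowerBoundNewtonRaphson()
    {
        MaxIterations = 100,
        Tolerance = 1e-6
    };

    // Estimate a multinomial logistic regression from the data.
    MultinomialLogisticRegression mlr = lbnr.Learn(inputs, outputs);

    // Predict the most likely category of a new observation.
    int decision = mlr.Decide(new double[] { 5, 2 });
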
Runs one iteration of the Lower-Bound Newton-Raphson iteration. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Runs one iteration of the Lower-Bound Newton-Raphson iteration. The input data. The outputs associated with each input vector. The maximum relative change in the parameters after the iteration. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Newton-Raphson learning updates for Cox's Proportional Hazards models. Gets or sets the maximum absolute parameter change detectable after an iteration of the algorithm used to detect convergence. Default is 1e-5. Please use MaxIterations instead. Gets or sets the maximum number of iterations performed by the learning algorithm. Gets or sets the number of performed iterations. Gets or sets whether the algorithm has converged. true if this instance has converged; otherwise, false. Gets or sets the hazard estimator that should be used by the proportional hazards learning algorithm. Default is to use . Gets or sets the ties handling method to be used by the proportional hazards learning algorithm. Default is to use 's method. Gets the previous values for the coefficients which were in place before the last learning iteration was performed. Gets the current values for the coefficients. Gets the Hessian matrix computed in the last Newton-Raphson iteration. Gets the Gradient vector computed in the last Newton-Raphson iteration. Gets the total number of parameters in the model. Gets or sets the regression model being learned. Gets or sets a value indicating whether standard errors should be computed at the end of the next iterations. true to compute standard errors; otherwise, false. Gets or sets a value indicating whether an estimate of the baseline hazard function should be computed at the end of the next iterations. true to compute the baseline function; otherwise, false. Gets or sets a value indicating whether the Cox model should be computed using the mean-centered version of the covariates. Default is true. Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. Gets or sets the smoothing factor used to avoid numerical problems in the beginning of the training. Default is 0.1. Constructs a new Newton-Raphson learning algorithm for Cox's Proportional Hazards models. Constructs a new Newton-Raphson learning algorithm for Cox's Proportional Hazards models. The model to estimate. Runs the Newton-Raphson update for Cox's hazards learning until convergence. The input data. The time-to-event for the training samples. 
The maximum relative change in the parameters after the iteration. Runs the Newton-Raphson update for Cox's hazards learning until convergence. The input data. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The maximum relative change in the parameters after the iteration. Runs the Newton-Raphson update for Cox's hazards learning until convergence. The input data. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The maximum relative change in the parameters after the iteration. Runs the Newton-Raphson update for Cox's hazards learning until convergence. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The maximum relative change in the parameters after the iteration. Runs the Newton-Raphson update for Cox's hazards learning until convergence. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The maximum relative change in the parameters after the iteration. Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The desired outputs associated with each inputs. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given . Learns a model that can map the given inputs to the given outputs. The model inputs. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given and . Learns a model that can map the given inputs to the given outputs. The model inputs. The output (event) associated with each input vector. The time-to-event for the non-censored training samples. The weight of importance for each input-output pair (if supported by the learning algorithm). A model that has learned how to produce given and . Multivariate non-linear regression using Kernels. The kernel function. Gets or sets the kernel function. Gets or sets the original input data that is needed to compute the kernel (Gram) matrices for the regression. Gets or sets the linear weights of the regression model. The intercept term is not stored in this vector, but is instead available through the property. Gets or sets the intercept value for the regression. Gets or sets the mean values (to be subtracted from samples). Gets or sets the standard deviations (to be divided from samples). Gets or sets the means of the data in feature space (to center samples). Gets or sets the grand mean of the data in feature space (to center samples). Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The location where the output should be stored. 
The output generated by applying this transformation to the given input. Multivariate non-linear regression using Kernels. Generalized Linear Model Regression. References: Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. Amos Storkey. (2005). Learning from Data: Learning Logistic Regressors. School of Informatics. Available on: http://www.inf.ed.ac.uk/teaching/courses/lfd/lectures/logisticlearn-print.pdf Cosma Shalizi. (2009). Logistic Regression and Newton's Method. Available on: http://www.stat.cmu.edu/~cshalizi/350/lectures/26/lecture-26.pdf Edward F. Conor. Logistic Regression. Website. Available on: http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm Creates a new Generalized Linear Regression Model. The link function to use. Creates a new Generalized Linear Regression Model. The link function to use. The number of input variables for the model. Creates a new Generalized Linear Regression Model. The link function to use. The number of input variables for the model. The starting intercept value. Default is 0. Creates a new Generalized Linear Regression Model. Creates a new Generalized Linear Regression Model. The link function to use. The coefficient vector. The standard error vector. Gets the number of inputs accepted by the model. Obsolete. For quick compatibility fixes in the short term, use and . Gets the number of parameters in this model (equals the NumberOfInputs + 1). Gets or sets the linear weights of the regression model. The intercept term is not stored in this vector, but is instead available through the property. Gets the standard errors associated with each coefficient during the model estimation phase. Gets the number of inputs handled by this model. Gets the link function used by this generalized linear model. Gets the underlying linear regression. Gets or sets the intercept term. This is always the first value of the array. Gets a coefficient value, where 0 is the intercept term and the other coefficients are indexed starting at 1. Sets a coefficient value, where 0 is the intercept term and the other coefficients are indexed starting at 1. Computes the model output for the given input vector. The input vector. The output value. Computes the model output for each of the given input vectors. The array of input vectors. The array of output values. Gets the Wald Test for a given coefficient. The Wald statistical test is a test for a model parameter in which the estimated parameter θ is compared with another proposed parameter under the assumption that the difference between them will be approximately normal. There are several problems with the use of the Wald test. Please take a look on substitute tests based on the log-likelihood if possible. The coefficient's index. The first value (at zero index) is the intercept value. Gets the Log-Likelihood for the model. A set of input data. A set of output data. The Log-Likelihood (a measure of performance) of the model calculated over the given data sets. Gets the Log-Likelihood for the model. A set of input data. A set of output data. The weights associated with each input vector. The Log-Likelihood (a measure of performance) of the model calculated over the given data sets. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. A set of input data. A set of output data. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. 
A set of input data. A set of output data. The weights associated with each input vector. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Log-Likelihood Ratio between two models. The Log-Likelihood ratio is defined as 2*(LL - LL0). A set of input data. A set of output data. Another Logistic Regression model. The Log-Likelihood ratio (a measure of performance between two models) calculated over the given data sets. Gets the Log-Likelihood Ratio between two models. The Log-Likelihood ratio is defined as 2*(LL - LL0). A set of input data. A set of output data. The weights associated with each input vector. Another Logistic Regression model. The Log-Likelihood ratio (a measure of performance between two models) calculated over the given data sets. The likelihood ratio test of the overall model, also called the model chi-square test. A set of input data. A set of output data. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". The likelihood ratio test of the overall model, also called the model chi-square test. A set of input data. A set of output data. The weights associated with each input vector. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". Gets the degrees of freedom when fitting the regression. Gets the standard error for each coefficient. The information matrix obtained when training the model (see ). Gets the standard error of the fit for a particular input vector. The input vector where the standard error of the fit should be computed. The information matrix obtained when training the model (see ). The standard error of the fit at the given input point. Gets the standard error of the prediction for a particular input vector. The input vector where the standard error of the prediction should be computed. The information matrix obtained when training the model (see ). The standard error of the prediction given for the input point. Gets the confidence interval for an input point. The input vector. The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). Gets the prediction interval for an input point. The input vector. The number of training samples used to fit the model. The information matrix obtained when training the model (see ). The prediction interval confidence (default is 95%). Creates a new GeneralizedLinearRegression that is a copy of the current instance. Creates a GeneralizedLinearRegression from a object. A object. True to make a copy of the logistic regression values, false to use the actual values. 
If the actual values are used, changes done on one model will be reflected on the other model. A new which is a copy of the given . Computes a numerical score measuring the association between the given vector and each class. The input vector. An array where the result will be stored, avoiding unnecessary memory allocations. System.Double[]. Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. System.Double[]. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The input vector. An array where the probabilities will be stored, avoiding unnecessary memory allocations. System.Double[]. Regression function delegate. This delegate represents a parameterized function that, given a set of model coefficients and an input vector, produces an associated output value. The model coefficients, also known as parameters or coefficients. An input vector. The output value produced given the and vector. Gradient function delegate. This delegate represents the gradient of regression function. A regression function is a parameterized function that, given a set of model coefficients and an input vector, produces an associated output value. This function should compute the gradient vector in respect to the function . The model coefficients, also known as parameters or coefficients. An input vector. The resulting gradient vector (w.r.t to the coefficients). Nonlinear Regression. The first example shows how to fit a non-linear regression with . The second example shows how to fit a non-linear regression with . Gets the regression coefficients. Gets the standard errors for the regression coefficients. Gets the model function, mapping inputs to outputs given a suitable parameter vector. Gets or sets a function that computes the gradient of the in respect to the . Initializes a new instance of the class. The number of variables (free parameters) in the model. The regression function implementing the regression model. Initializes a new instance of the class. The number of variables (free parameters) in the model. The regression function implementing the regression model. The function that computes the gradient for (optional). Computes the model output for the given input vector. The input vector. The output value. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Applies the transformation to an input, producing an associated output. The input data to which the transformation should be applied. The output generated by applying this transformation to the given input. Cox's Proportional Hazards Model. Gets the mean vector used to center observations before computations. Gets or sets the intercept (bias) for the regression model. Gets the coefficient vector, in which the first value is always the intercept value. Gets the standard errors associated with each coefficient during the model estimation phase. Gets the baseline hazard function, if specified. Gets the number of inputs handled by this model. Creates a new Cox Proportional-Hazards Model. The number of input variables for the model. Creates a new Cox Proportional-Hazards Model. The number of input variables for the model. The initial baseline hazard distribution. Default is the . Creates a new Cox Proportional-Hazards Model. 
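As a rough sketch of how a Cox model of this kind is usually estimated and queried (the survival data below is invented; the SurvivalOutcome enumeration, the Learn overload taking times and outcomes, and the Coefficients property name are assumptions based on the member descriptions in this section, while Probability(input, time) is referenced in the documentation above):

    using Accord.Statistics;
    using Accord.Statistics.Models.Regression;
    using Accord.Statistics.Models.Regression.Fitting;

    // Assumed toy survival data: one covariate per subject, the time until the
    // event (or until censoring), and whether the event was actually observed.
    double[][] inputs = { new double[] { 1 }, new double[] { 5 }, new double[] { 2 }, new double[] { 8 } };
    double[] times = { 12, 3, 10, 2 };
    SurvivalOutcome[] events = { SurvivalOutcome.Censored, SurvivalOutcome.Failed, SurvivalOutcome.Failed, SurvivalOutcome.Failed };

    // Fit the model using the Newton-Raphson learning algorithm described earlier.
    var teacher = new ProportionalHazardsNewtonRaphson() { MaxIterations = 100 };
    ProportionalHazards cox = teacher.Learn(inputs, times, events);

    // Query the fitted model: the estimated coefficient vector, and the
    // probability of the event for a given subject at a given time.
    double[] coefficients = cox.Coefficients;
    double p = cox.Probability(new double[] { 3 }, 10);
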
Obsolete. Please use the Probability(input) method instead. Obsolete. Please use the Probability(input) method instead. Obsolete. Please use the Probability(input, time) method instead. Obsolete. Please use the Probability(input) method instead. Computes the model's baseline survival function. This method simply calls the of the function. The event time. The baseline survival function at the given time. Computes the model output for the given input vector. The input vector. The event times. The probabilities of the event occurring at the given times for the given observations. Gets the Log-Hazard Ratio between two observations. The first observation. The second observation. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. A set of input data. The time-to-event before the output occurs. The corresponding output data. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Partial Log-Likelihood for the model. A set of input data. The time-to-event before the output occurs. The corresponding output data. The Partial Log-Likelihood (a measure of performance) of the model calculated over the given data set. Gets the Partial Log-Likelihood for the model. A set of input data. The time-to-event before the output occurs. The corresponding output data. The Partial Log-Likelihood (a measure of performance) of the model calculated over the given data set. Gets the Partial Log-Likelihood for the model. The time-to-event before the output occurs. The corresponding output data. The Partial Log-Likelihood (a measure of performance) of the model calculated over the given data set. Gets the Partial Log-Likelihood for the model. The time-to-event before the output occurs. The corresponding output data. The Partial Log-Likelihood (a measure of performance) of the model calculated over the given data set. Gets the 95% confidence interval for the Hazard Ratio for a given coefficient. The coefficient's index. Gets the Wald Test for a given coefficient. The Wald statistical test is a test for a model parameter in which the estimated parameter θ is compared with another proposed parameter under the assumption that the difference between them will be approximately normal. There are several problems with the use of the Wald test. Please take a look on substitute tests based on the log-likelihood if possible. The coefficient's index. The first value (at zero index) is the intercept value. Gets the Log-Likelihood Ratio between two models. The Log-Likelihood ratio is defined as 2*(LL - LL0). A set of input data. The time-to-event before the output occurs. The corresponding output data. Another Cox Proportional Hazards model. The Log-Likelihood ratio (a measure of performance between two models) calculated over the given data sets. The likelihood ratio test of the overall model, also called the model chi-square test. A set of input data. The time-to-event before the output occurs. The corresponding output data. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". 
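Restating the definitions above in compact form (standard results, not library-specific): for a fitted model with log-likelihood LL and a simpler nested model with log-likelihood LL0,

    Deviance:                   D = -2 * LL
    Likelihood-ratio statistic: G = 2 * (LL - LL0) = D0 - D

Under the null hypothesis that the simpler model is adequate, G asymptotically follows a chi-square distribution with degrees of freedom equal to the number of extra parameters in the larger model; this is the "model chi-square" referred to in the tests described here.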
The likelihood ratio test of the overall model, also called the model chi-square test. A set of input data. The time-to-event before the output occurs. The corresponding output data. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". Creates a new Cox's Proportional Hazards that is a copy of the current instance. Gets the Hazard Ratio for a given coefficient. The hazard ratio can be computed raising Euler's number (e ~~ 2.71) to the power of the associated coefficient. The coefficient's index. The Hazard Ratio for the given coefficient. Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. An array where the log-likelihoods will be stored, avoiding unnecessary memory allocations. Computes class-label decisions for each vector in the given . The input vector. The event times. Computes class-label decisions for each vector in the given . The input vector. The event times. Computes a numerical score measuring the association between the given vector and each class. The input vector. The event times. Computes a numerical score measuring the association between the given vector and each class. The input vector. The event times. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The input vector. The event times. The probabilities of the event occurring at the given times for the given observations. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The input vector. The event times. The probabilities of the event occurring at the given times for the given observations. Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. The event times. Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. The event times. Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. Computes class-label decisions for each vector in the given . The input vectors that should be classified into one of the possible classes. Computes a numerical score measuring the association between the given vector and each class. The input vector. Computes a numerical score measuring the association between the given vector and each class. The input vector. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The input vector. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The input vector. Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. 
Predicts a class label vector for the given input vectors, returning the log-likelihood that the input vector belongs to its predicted class. The input vector. Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. The event times. Predicts a class label for the given input vector, returning the log-likelihood that the input vector belongs to its predicted class. The event times. Nominal Multinomial Logistic Regression. The default optimizer for is the class: Additionally, the class allows multinomial logistic regression models to be learnt using any mathematical optimization algorithm that implements the interface. Using Conjugate Gradient: Using Gradient Descent: Using BFGS: Creates a new Multinomial Logistic Regression Model. The number of input variables for the model. The number of categories for the model. Creates a new Multinomial Logistic Regression Model. The number of input variables for the model. The number of categories for the model. The initial values for the intercepts. Gets the coefficient vectors, in which the first column are the intercept values. Gets the total number of parameters in the multinomial logistic regression [(categories - 1) * (inputs + 1)]. Gets the standard errors associated with each coefficient during the model estimation phase. Gets the number of categories of the model. Gets the number of inputs of the model. Computes the model output for the given input vector. The first category is always considered the baseline category. The input vector. The output value. Computes the model outputs for the given input vectors. The first category is always considered the baseline category. The input vector. The output value. Computes the log-likelihood that the given input vector belongs to the specified . The input vector. The index of the class whose score will be computed. System.Double. Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. A set of input vectors. The class labels associated with each input vector, as predicted by the classifier. If passed as null, the classifier will create a new array. An array where the probabilities will be stored, avoiding unnecessary memory allocations. System.Double[]. The likelihood ratio test of the overall model, also called the model chi-square test. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". The likelihood ratio test of the overall model, also called the model chi-square test. The Chi-square test, also called the likelihood ratio test or the log-likelihood test is based on the deviance of the model (-2*log-likelihood). The log-likelihood ratio test indicates whether there is evidence of the need to move from a simpler model to a more complicated one (where the simpler model is nested within the complicated one). The difference between the log-likelihood ratios for the researcher's model and a simpler model is often called the "model chi-square". Gets the 95% confidence interval for the Odds Ratio for a given coefficient. 
The category's index. The coefficient's index. The first value (at zero index) is the intercept value. Gets the 95% confidence intervals for the Odds Ratios for all coefficients. The category's index. Gets the Odds Ratio for a given coefficient. The odds ratio can be computed by raising Euler's number (e ≈ 2.71) to the power of the associated coefficient. The category index. The coefficient's index. The first value (at zero index) is the intercept value. The Odds Ratio for the given coefficient. Gets the Odds Ratio for all coefficients. The odds ratio can be computed by raising Euler's number (e ≈ 2.71) to the power of the associated coefficient. The category index. The Odds Ratio for the given coefficient. Gets the Wald Test for a given coefficient. The Wald statistical test is a test for a model parameter in which the estimated parameter θ is compared with another proposed parameter under the assumption that the difference between them will be approximately normal. There are several problems with the use of the Wald test. Please take a look at substitute tests based on the log-likelihood if possible. The category index. The coefficient's index. The first value (at zero index) is the intercept value. Gets the Wald Test for all coefficients. The Wald statistical test is a test for a model parameter in which the estimated parameter θ is compared with another proposed parameter under the assumption that the difference between them will be approximately normal. There are several problems with the use of the Wald test. Please take a look at substitute tests based on the log-likelihood if possible. The category's index. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. A set of input data. A set of output data. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. A set of input data. A set of output data. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Deviance for the model. The deviance is defined as -2*Log-Likelihood. A set of input data. A set of output data. The deviance (a measure of performance) of the model calculated over the given data sets. Gets the Log-Likelihood for the model. A set of input data. A set of output data. The Log-Likelihood (a measure of performance) of the model calculated over the given data sets. Gets the Log-Likelihood Ratio between two models. The Log-Likelihood ratio is defined as 2*(LL - LL0). A set of input data. A set of output data. Another Logistic Regression model. The Log-Likelihood ratio (a measure of performance between two models) calculated over the given data sets. Creates a new MultinomialLogisticRegression that is a copy of the current instance. Binary Logistic Regression. In statistics, logistic regression (sometimes called the logistic model or logit model) is used to predict the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression. Like many forms of regression analysis, it makes use of several predictor variables that may be either numerical or categorical. For example, the probability that a person has a heart attack within a specified time period might be predicted from knowledge of the person's age, sex and body mass index. 
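As a rough sketch of learning such a model with the Iterative Reweighted Least Squares algorithm described earlier (the patient data is invented, and member names such as Learn, GetOddsRatio and Probability follow common Accord.NET usage rather than being quoted from this documentation):

    using Accord.Statistics.Models.Regression;
    using Accord.Statistics.Models.Regression.Fitting;

    // Assumed toy data: age and a smoking indicator for a few subjects, and
    // whether each one experienced the event of interest (1) or not (0).
    double[][] inputs =
    {
        new double[] { 55, 0 }, new double[] { 28, 0 },
        new double[] { 65, 1 }, new double[] { 46, 0 },
        new double[] { 86, 1 }, new double[] { 56, 1 },
        new double[] { 85, 0 }, new double[] { 33, 0 },
        new double[] { 21, 1 }, new double[] { 42, 1 },
    };
    double[] outputs = { 0, 0, 0, 1, 1, 1, 0, 0, 0, 1 };

    // Learn the model with Iterative Reweighted Least Squares.
    var irls = new IterativeReweightedLeastSquares<LogisticRegression>()
    {
        MaxIterations = 100,
        Tolerance = 1e-4
    };
    LogisticRegression regression = irls.Learn(inputs, outputs);

    // Odds ratio associated with the smoking indicator (index 0 is the intercept),
    // and the predicted probability of the event for a new subject.
    double smokingOdds = regression.GetOddsRatio(2);
    double probability = regression.Probability(new double[] { 50, 1 });
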
Logistic regression is used extensively in the medical and social sciences as well as marketing applications such as prediction of a customer's propensity to purchase a product or cease a subscription. References: Bishop, Christopher M.; Pattern Recognition and Machine Learning. Springer; 1st ed. 2006. Amos Storkey. (2005). Learning from Data: Learning Logistic Regressors. School of Informatics. Available on: http://www.inf.ed.ac.uk/teaching/courses/lfd/lectures/logisticlearn-print.pdf Cosma Shalizi. (2009). Logistic Regression and Newton's Method. Available on: http://www.stat.cmu.edu/~cshalizi/350/lectures/26/lecture-26.pdf Edward F. Conor. Logistic Regression. Website. Available on: http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm The following example shows how to learn a logistic regression using the standard algorithm. Please note that it is also possible to train logistic regression models using large-margin algorithms. With those algorithms, it is possible to train using different regularization options, such as L1 (with ProbabilisticCoordinateDescent) or L2 (with ProbabilisticDualCoordinateDescent). The following example shows how to obtain L1-regularized regression from a probabilistic linear Support Vector Machine: Creates a new Logistic Regression Model. Creates a new Logistic Regression Model. The number of input variables for the model. Creates a new Logistic Regression Model. The number of input variables for the model. The starting intercept value. Default is 0. Gets the 95% confidence interval for the Odds Ratio for a given coefficient. The coefficient's index. The first value (at zero index) is the intercept value. Gets the Odds Ratio for a given coefficient. The odds ratio can be computed raising Euler's number (e ~~ 2.71) to the power of the associated coefficient. The coefficient's index. The first value (at zero index) is the intercept value. The Odds Ratio for the given coefficient. Constructs a new from an array of weights (linear coefficients). The first weight is interpreted as the intercept value. An array of linear coefficients. A whose are the same as in the given array. Constructs a new from an array of weights (linear coefficients). The first weight is interpreted as the intercept value. An array of linear coefficients. The intercept term. A whose are the same as in the given array. Moving-window statistics. Gets the size of the window. The window's size. Gets the number of samples within the window. The number of samples within the window. Gets the minimum value in the window. Gets the maximum value in the window. Initializes a new instance of the class. The size of the moving window. Pushes a value into the window. Returns an enumerator that iterates through the collection. A that can be used to iterate through the collection. Returns an enumerator that iterates through a collection. An object that can be used to iterate through the collection. Removes all elements from the window and resets statistics. Common interface for moving-window statistics. Moving-window statistics such as moving average and moving variance, are a type of finite impulse response filters used to analyze a set of data points by creating a series of averages of different subsets of the full data set. Gets the size of the window. The window's size. Gets the number of samples within the window. The number of samples within the window. Common interface for moving-window statistics. 
Moving-window statistics such as moving average and moving variance, are a type of finite impulse response filters used to analyze a set of data points by creating a series of averages of different subsets of the full data set. Moving-window circular statistics. Gets the sum of the sines of the angles within the window. Gets the sum of the cosines of the angles within the window. Gets the size of the window. The window's size. Gets the number of samples within the window. The number of samples within the window. Gets the mean of the angles within the window. The mean. Gets the variance of the angles within the window. Gets the standard deviation of the angles within the window. Gets the current length of the sample mean resultant vector of the gathered values. Initializes a new instance of the class. The size of the moving window. Registers the occurrence of a value. The value to be registered. Clears all measures previously computed. Moving-window statistics. Provides statistics derived from successive segments of constant, overlapping size ('windowSize') of a series of values. Values are added one at a time to a MovingNormalStatistics instance through method and are actually kept inside the instance. Gets the sum the values within the window. The sum of values within the window. Gets the sum of squared values within the window. The sum of squared values. Gets the size of the window. The window's size. Gets the number of samples within the window. The number of samples within the window. Gets the mean of the values within the window. The mean of the values. Gets the variance of the values within the window. The variance of the values. Gets the standard deviation of the values within the window. The standard deviation of the values. Initializes a new instance of the class. The size of the moving window. Pushes a value into the window. Returns an enumerator that iterates through the collection. A that can be used to iterate through the collection. Returns an enumerator that iterates through a collection. An object that can be used to iterate through the collection. Removes all elements from the window and resets statistics. Common interface for running statistics. Running statistics are measures computed as data becomes available. When using running statistics, there is no need to know the number of samples a priori, such as in the case of the direct . Registers the occurrence of a value. The value to be registered. Clears all measures previously computed. Kalman filter for 2D coordinate systems. References: Student Dave's tutorial on Object Tracking in Images Using 2D Kalman Filters. Available on: http://studentdavestutorials.weebly.com/object-tracking-2d-kalman-filter.html Gets or sets the current X position of the object. Gets or sets the current Y position of the object. Gets or sets the current object's velocity in the X axis. Gets or sets the current object's velocity in the Y axis. Gets or sets the observational noise of the current object's in the X axis. Gets or sets the observational noise of the current object's in the Y axis. Initializes a new instance of the class. Initializes a new instance of the class. The sampling rate. The acceleration. The acceleration standard deviation. Registers the occurrence of a value. The value to be registered. Registers the occurrence of a value. The value to be registered. Registers the occurrence of a value. The x-coordinate of the value to be registered. The y-coordinate of the value to be registered. Clears all measures previously computed. 
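As a brief sketch of how such a running filter is typically used (the observations are invented, and the Push method and velocity property names are assumptions based on the member descriptions above):

    using Accord.Statistics.Running;

    // Track a moving object from a stream of noisy 2D observations.
    var filter = new KalmanFilter2D();

    filter.Push(1.0, 1.1);
    filter.Push(2.1, 1.9);
    filter.Push(2.9, 3.2);

    // Current filtered estimates of position and velocity.
    double x = filter.X;
    double y = filter.Y;
    double vx = filter.XAxisVelocity;
    double vy = filter.YAxisVelocity;
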
Common interface for running Markov filters. Gets whether the model has been initialized or not. Gets the current vector of probabilities of being in each state. Gets the current most likely state (in the Viterbi path). Gets the current Viterbi probability (along the most likely path). Gets the current Forward probability (along all possible paths). Clears all measures previously computed and indicate the sequence has ended. Base class for running hidden Markov filters. Initializes a new instance of the class. The Markov model. Gets whether the model has been initialized or not. Gets the current vector of probabilities of being in each state. Gets the current most likely state (in the Viterbi path). Gets the current Viterbi probability (along the most likely path). Gets the current Forward probability (along all possible paths). Clears all measures previously computed and indicate the sequence has ended. Clears all measures previously computed. Hidden Markov Classifier filter. Gets the used in this filter. Gets the class response probabilities measuring the likelihood of the current sequence belonging to each of the classes. Gets the current classification label for the sequence up to the current observation. Gets the current rejection threshold level generated by classifier's threshold model. Creates a new . The hidden Markov classifier model. Registers the occurrence of a value. The value to be registered. Checks the classification after the insertion of a new value without registering this value. The next log-likelihood if the occurrence of is registered. The value to be checked. Clears all measures previously computed. Hidden Markov Classifier filter for general state distributions. Gets the used in this filter. Gets the class response probabilities measuring the likelihood of the current sequence belonging to each of the classes. Gets the current classification label for the sequence up to the current observation. Gets the current rejection threshold level generated by classifier's threshold model. Creates a new . The hidden Markov classifier model. Registers the occurrence of a value. The value to be registered. Registers the occurrence of a value. The value to be registered. Checks the classification after the insertion of a new value without registering this value. The next log-likelihood if the occurrence of is registered. The value to be checked. Checks the classification after the insertion of a new value without registering this value. The next log-likelihood if the occurrence of is registered. The value to be checked. Clears all measures previously computed. Hidden Markov Model filter. Gets the used in this filter. Creates a new . The hidden Markov model to use in this filter. Registers the occurrence of a value. The value to be registered. Checks the classification after the insertion of a new value without registering this value. The value to be checked. Gets whether the model has been initialized or not. Gets the current vector of probabilities of being in each state. Gets the current most likely state (in the Viterbi path). Gets the current Viterbi probability (along the most likely path). Gets the current Forward probability (along all possible paths). Clears this instance. Common interface for running statistics. Running statistics are measures computed as data becomes available. When using running statistics, there is no need to know the number of samples a priori, such as in the case of the direct . Gets the current mean of the gathered values. The mean of the values. 
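The running statistics classes described below compute the mean and variance in a single pass using Welford's method; as a minimal, library-independent sketch of that update rule (not the library's actual implementation):

    // Welford's online update: n values seen so far, running mean m, and
    // running sum of squared deviations s.
    void Push(double x, ref long n, ref double m, ref double s)
    {
        n++;
        double delta = x - m;
        m += delta / n;          // updated running mean
        s += delta * (x - m);    // updated sum of squared deviations
        // variance = s / (n - 1) for n > 1; standard deviation = sqrt(variance)
    }
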
Gets the current variance of the gathered values. The variance of the values. Gets the current standard deviation of the gathered values. The standard deviation of the values. Hidden Markov Model filter. Gets the used in this filter. Creates a new . The hidden Markov model to use in this filter. Registers the occurrence of a value. The value to be registered. Checks the classification after the insertion of a new value without registering this value. The value to be checked. Gets whether the model has been initialized or not. Gets the current vector of probabilities of being in each state. Gets the current most likely state (in the Viterbi path). Gets the current Viterbi probability (along the most likely path). Gets the current Forward probability (along all possible paths). Clears this instance. Running circular statistics. This class computes the running variance using Welford’s method. Running statistics need only one pass over the data, and do not require all data to be available prior to computing. References: John D. Cook. Accurately computing running variance. Available on: http://www.johndcook.com/standard_deviation.html Chan, Tony F.; Golub, Gene H.; LeVeque, Randall J. (1983). Algorithms for Computing the Sample Variance: Analysis and Recommendations. The American Statistician 37, 242-247. Ling, Robert F. (1974). Comparison of Several Algorithms for Computing Sample Means and Variances. Journal of the American Statistical Association, Vol. 69, No. 348, 859-866. Gets the sum of the sines of the angles within the window. Gets the sum of the cosines of the angles within the window. Gets the current mean of the gathered values. The mean of the values. Gets the current variance of the gathered values. The variance of the values. Gets the current standard deviation of the gathered values. The standard deviation of the values. Gets the current length of the sample mean resultant vector of the gathered values. Gets the current count of values seen. The number of samples seen. Initializes a new instance of the class. Registers the occurrence of a value. The value to be registered. Clears all measures previously computed. Running (range) statistics. This class computes the running minimum and maximum values in a stream of values. Running statistics need only one pass over the data, and do not require all data to be available prior to computing. Gets the number of samples seen. Gets the minimum value seen. Gets the maximum value seen. Initializes a new instance of the class. Registers the occurrence of a value. The value to be registered. Clears all measures previously computed. Running (normal) statistics. This class computes the running variance using Welford’s method. Running statistics need only one pass over the data, and do not require all data to be available prior to computing. References: John D. Cook. Accurately computing running variance. Available on: http://www.johndcook.com/standard_deviation.html Chan, Tony F.; Golub, Gene H.; LeVeque, Randall J. (1983). Algorithms for Computing the Sample Variance: Analysis and Recommendations. The American Statistician 37, 242-247. Ling, Robert F. (1974). Comparison of Several Algorithms for Computing Sample Means and Variances. Journal of the American Statistical Association, Vol. 69, No. 348, 859-866. Gets the current mean of the gathered values. The mean of the values. Gets the current variance of the gathered values. The variance of the values. Gets the current standard deviation of the gathered values. 
The standard deviation of the values. Gets the current count of values seen. The number of samples seen. Initializes a new instance of the class. Registers the occurrence of a value. The value to be registered. Clears all measures previously computed. Contains 34+ statistical hypothesis tests, including one-way and two-way ANOVA tests, non-parametric tests such as the Kolmogorov-Smirnov test and the Sign Test for the Median, contingency table tests such as the Kappa test, including variations for multiple tables, as well as the Bhapkar and Bowker tests; and the more traditional Chi-Square, Z, F, T and Wald tests. This namespace contains a suite of parametric and non-parametric hypothesis tests. Every test in this library implements the interface, which defines a few key methods and properties to assert whether a statistical hypothesis can be supported or not. Every hypothesis test is associated with a test statistic distribution which can in turn be queried, inspected and computed as any other distribution in the namespace. By default, tests are created using a 0.05 significance level, which in the framework is referred to as the test's size. P-Values are also ready to be inspected by checking a test's P-Value property. Furthermore, several tests in this namespace also support power analysis. The power analysis of a test can be used to suggest an optimal number of samples which have to be obtained in order to achieve a more interpretable or useful result while doing hypothesis testing. Power analyses implement the interface, and analyses are available for the one-sample Z and T tests, as well as their two-sample versions. Some useful parametric tests are the , , , , , and . Useful non-parametric tests include the , , and the . Tests are also available for two or more samples. In this case, we can find two sample variants for the , , , , , , , as well as the for unpaired samples. For multiple samples we can find the and , as well as the and . Finally, the namespace also includes several tests for contingency tables. Those tests include the Kappa test for inter-rater agreement and its variants, such as the , and . Other tests include , , , , and the . The namespace class diagram is shown below. Please note that class diagrams for each of the inner namespaces are also available within their own documentation pages. Hypothesis test for a single ROC curve. Gets the ROC curve being tested. Creates a new . The curve to be tested. The hypothesized value for the ROC area. The alternative hypothesis (research hypothesis) to test. Calculates the standard error of an area calculation for a curve with the given number of positive and negative instances. Calculates the standard error of an area calculation for a curve with the given number of positive and negative instances. Kappa test for the average of two groups of contingency tables. The two-matrix Kappa test tries to assert whether the Kappa measure of two groups of contingency tables, each group created by a different rater or classification model and measured repeatedly, differs significantly. This is a two sample t-test kind of test. References: J. L. Fleiss. Statistical methods for rates and proportions. Wiley-Interscience; 3rd edition (September 5, 2003) Gets the variance for the first Kappa value. Gets the variance for the second Kappa value. Creates a new Two-Table Mean Kappa test. The average kappa value for the first group of contingency tables. The average kappa value for the second group of contingency tables.
The kappa's variance in the first group of tables. The kappa's variance in the second group of tables. The number of contingency tables averaged in the first group. The number of contingency tables averaged in the second group. True to assume equal variances, false otherwise. Default is true. The alternative hypothesis (research hypothesis) to test. The hypothesized difference between the two Kappa values. Creates a new Two-Table Mean Kappa test. The first group of contingency tables. The second group of contingency tables. True to assume equal variances, false otherwise. Default is true. The hypothesized difference between the two average Kappa values. The alternative hypothesis (research hypothesis) to test. Kappa Test for multiple contingency tables. The multiple-matrix Kappa test tries to assert whether the Kappa measure of many contingency tables, each of which was created by a different rater or classification model, differs significantly. The computations are based on pages 607 and 608 of (Fleiss, 2003). This is a Chi-square kind of test. References: J. L. Fleiss. Statistical methods for rates and proportions. Wiley-Interscience; 3rd edition (September 5, 2003) Gets the overall Kappa value for the many contingency tables. Gets the overall Kappa variance for the many contingency tables. Gets the variance for each kappa value. Gets the kappa for each contingency table. Creates a new multiple table Kappa test. The kappa values. The variance for each kappa value. Creates a new multiple table Kappa test. The contingency tables. Computes the multiple matrix Kappa test. Fisher's exact test for contingency tables. This test statistic distribution is the Hypergeometric. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Constructs a new Fisher's exact test. The matrix to be tested. The alternative hypothesis (research hypothesis) to test. Computes the Fisher's exact test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. One-sample Anderson-Darling (AD) test. Gets the theoretical, hypothesized distribution for the samples, which should have been stated before any measurements. Creates a new Anderson-Darling test. The sample we would like to test as belonging to the . A fully specified distribution. Gets the Anderson-Darling statistic for the samples and target distribution. The sorted samples. The target distribution. Not supported. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. One-sample Lilliefors' corrected Kolmogorov-Smirnov (KS) test. In statistics, the Lilliefors test, named after Hubert Lilliefors, professor of statistics at George Washington University, is a test based on the Kolmogorov–Smirnov test. It is used to test the null hypothesis that data come from a normally distributed population, when the null hypothesis does not specify which normal distribution; i.e., it does not specify the expected value and variance of the distribution. Contrary to the Kolmogorov-Smirnov test, this test can be used to assess the likelihood that a given sample could have been generated from a distribution that has been fitted from the data. References: Wikipedia, The Free Encyclopedia. Lilliefors Test. Available on: https://en.wikipedia.org/wiki/Lilliefors_test In this first example, suppose we got a new sample, and we would like to test whether this sample originated from a uniform continuous distribution. Unlike , we can actually use this test to check whether the data fit a distribution that has been estimated from the data itself. We can also check whether a Normal distribution fitted on the data is a good candidate model for the samples.
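The sketch below illustrates both checks. It is only a rough, illustrative example: the sample values and the distribution parameters are hypothetical, the class is assumed to be named LillieforsTest, and the two-argument constructor call relies on the optional parameters (alternative hypothesis, number of Monte-Carlo iterations, re-estimation flag) described further below having default values.

// A hypothetical sample we would like to test:
double[] sample = { 0.62, 0.50, 0.20, 0.47, 0.71, 0.58, 0.32, 0.48, 0.55, 0.38 };

// A uniform distribution whose bounds were taken from the data itself
// (here simply the observed minimum and maximum):
var uniform = new UniformContinuousDistribution(0.20, 0.71);

// Unlike the plain Kolmogorov-Smirnov test, the Lilliefors test can be
// used even though the candidate distribution was estimated from the data:
var lillie = new LillieforsTest(sample, uniform);

double statistic = lillie.Statistic;
double pvalue = lillie.PValue;
bool significant = lillie.Significant;

// The same idea applies to a Normal distribution fitted to the sample:
var normal = new NormalDistribution();
normal.Fit(sample);

var lillie2 = new LillieforsTest(sample, normal);
bool significant2 = lillie2.Significant;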
Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the hypothesized distribution for the samples. Gets the empirical distribution measured from the sample. Gets the number of observations in the sample being tested. Creates a new One-Sample Lilliefors' Kolmogorov-Smirnov test. The sample we would like to test as belonging to the . A fully specified distribution (which could have been estimated from the data). The alternative hypothesis (research hypothesis) to test. The number of Monte-Carlo iterations to perform. Default is 10,000. Whether the target distribution should be re-estimated from the sampled data at each Monte-Carlo iteration. Pass true in case has been estimated from the data. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Performs a Goodness-of-Fit test by automatically creating and fitting the chosen distribution to the samples and computing a against this fitted distribution. The type of the distribution. The samples used to fit the distribution. A Lilliefors test assessing whether it is likely that the samples could have been generated by the chosen distribution. Sources of variation in a two-way ANOVA experiment. Please see for examples. Gets information about the first factor (A). Gets information about the second factor (B) source. Gets information about the interaction factor (AxB) source. Gets information about the error (within-variance) source. Gets information about the grouped (cells) variance source. Gets information about the total source of variance. Shapiro-Wilk test for normality. The Shapiro–Wilk test is a test of normality in frequentist statistics. It was published in 1965 by Samuel Sanford Shapiro and Martin Wilk. The Shapiro–Wilk test tests the null hypothesis that a sample came from a normally distributed population. The null hypothesis of this test is that the population is normally distributed. Thus, if the p-value is less than the chosen alpha level, then the null hypothesis is rejected and there is evidence that the data tested are not from a normally distributed population; in other words, the data are not normal. On the contrary, if the p-value is greater than the chosen alpha level, then the null hypothesis that the data came from a normally distributed population cannot be rejected (e.g., for an alpha level of 0.05, a data set with a p-value of 0.02 rejects the null hypothesis that the data are from a normally distributed population). However, since the test is biased by sample size, it may report statistically significant departures from normality in almost any sufficiently large sample. Thus a Q–Q plot is required for verification in addition to the test. References: Wikipedia, The Free Encyclopedia. Shapiro-Wilk test. Available on: http://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test Creates a new Shapiro-Wilk test. The sample we would like to test. The sample must contain at least 4 observations.
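A minimal usage sketch follows (the sample values are hypothetical, and the class is assumed to be named ShapiroWilkTest; the Statistic, PValue and Significant properties are the standard hypothesis-test members described later in this section):

// A hypothetical sample with more than 4 observations:
double[] sample = { 5.2, 5.8, 6.1, 6.5, 7.0, 7.2, 7.9, 8.3, 9.1 };

// Test the null hypothesis that the sample came from a Normal population:
var sw = new ShapiroWilkTest(sample);

double W = sw.Statistic;       // the Shapiro-Wilk W statistic
double p = sw.PValue;
bool reject = sw.Significant;  // true would suggest the data are not normal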
Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Multinomial test (approximated). In statistics, the multinomial test is the test of the null hypothesis that the parameters of a multinomial distribution equal specified values. The test can be approximated using a chi-square distribution. References: Wikipedia, The Free Encyclopedia. Multinomial Test. Available on: http://en.wikipedia.org/wiki/Multinomial_test The following example is based on the example available on About.com Statistics, An Example of Chi-Square Test for a Multinomial Experiment By Courtney Taylor. In this example, we would like to test if a die is fair. For this, we will be rolling the die 600 times, annotating the result every time the die falls. In the end, we got a one 106 times, a two 90 times, a three 98 times, a four 102 times, a five 100 times and a six 104 times: int[] sample = { 106, 90, 98, 102, 100, 104 }; // If the die was fair, we should note that we would be expecting the // probabilities to be all equal to 1 / 6: double[] hypothesizedProportion = { // 1 2 3 4 5 6 1 / 6.0, 1 / 6.0, 1 / 6.0, 1 / 6.0, 1 / 6.0, 1 / 6.0, }; // Now, we create our test using the samples and the expected proportion MultinomialTest test = new MultinomialTest(sample, hypothesizedProportion); double chiSquare = test.Statistic; // 1.6 bool significant = test.Significant; // false Since the test didn't come up significant, it means that we don't have enough evidence to reject the null hypothesis that the die is fair. Gets the observed sample proportions. Gets the hypothesized population proportions. Creates a new Multinomial test. The proportions for each category in the sample. The number of observations in the sample. Creates a new Multinomial test. The number of occurrences for each category in the sample. Creates a new Multinomial test. The number of occurrences for each category in the sample. The hypothesized category proportions. Default is to assume uniformly equal proportions. Creates a new Multinomial test. The proportions for each category in the sample. The number of observations in the sample. The hypothesized category proportions. Default is to assume uniformly equal proportions. Creates a new Multinomial test. The categories for each observation in the sample. The number of possible categories. Creates a new Multinomial test. The categories for each observation in the sample. The number of possible categories. The hypothesized category proportions. Default is to assume uniformly equal proportions. Computes the Multinomial test. Bartlett's test for equality of variances. In statistics, Bartlett's test is used to test if k samples are from populations with equal variances. Equal variances across samples is called homoscedasticity or homogeneity of variances. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Bartlett test can be used to verify that assumption. Bartlett's test is sensitive to departures from normality. That is, if the samples come from non-normal distributions, then Bartlett's test may simply be testing for non-normality. Levene's test and the Brown–Forsythe test are alternatives to the Bartlett test that are less sensitive to departures from normality.
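As a rough usage sketch (the grouped sample values are hypothetical, and the constructor is assumed to accept the grouped samples directly, as suggested by the parameter description that follows):

// Three hypothetical groups of measurements:
double[][] samples =
{
    new double[] { 250, 260, 230, 270, 240 },
    new double[] { 310, 330, 280, 360, 300 },
    new double[] { 250, 230, 220, 260, 240 },
};

// Test the null hypothesis that all group variances are equal:
var bartlett = new BartlettTest(samples);

double statistic = bartlett.Statistic;
double pvalue = bartlett.PValue;
bool significant = bartlett.Significant; // true would indicate unequal variances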
References: Wikipedia, The Free Encyclopedia. Bartlett's test. Available on: http://en.wikipedia.org/wiki/Bartlett's_test Tests the null hypothesis that all group variances are equal. The grouped samples. Levene test computation methods. The test has been computed using the Mean. The test has been computed using the Median (which is known as the Brown-Forsythe test). The test has been computed using the trimmed mean. Levene's test for equality of variances. In statistics, Levene's test is an inferential statistic used to assess the equality of variances for a variable calculated for two or more groups. Some common statistical procedures assume that variances of the populations from which different samples are drawn are equal. Levene's test assesses this assumption. It tests the null hypothesis that the population variances are equal (called homogeneity of variance or homoscedasticity). If the resulting P-value of Levene's test is less than some significance level (typically 0.05), the obtained differences in sample variances are unlikely to have occurred based on random sampling from a population with equal variances. Thus, the null hypothesis of equal variances is rejected and it is concluded that there is a difference between the variances in the population. Some of the procedures typically assuming homoscedasticity, for which one can use Levene's tests, include analysis of variance and t-tests. Levene's test is often used before a comparison of means. When Levene's test shows significance, one should switch to generalized tests, free from homoscedasticity assumptions. Levene's test may also be used as a main test for answering a stand-alone question of whether two sub-samples in a given population have equal or different variances. References: Wikipedia, The Free Encyclopedia. Levene's test. Available on: http://en.wikipedia.org/wiki/Levene's_test Gets the method used to compute the Levene's test. Tests the null hypothesis that all group variances are equal. The grouped samples. True to use the median in the Levene calculation. False to use the mean. Default is false (use the mean). Tests the null hypothesis that all group variances are equal. The grouped samples. The percentage of observations to discard from the sample when computing the test with the truncated mean. Contains methods for power analysis of several related hypothesis tests, including support for automatic sample size estimation. The namespace class diagram is shown below. Please note that class diagrams for each of the inner namespaces are also available within their own documentation pages. Common interface for power analysis objects. The power of a statistical test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false. That is, power = P(reject null hypothesis | null hypothesis is false) It can be equivalently thought of as the probability of correctly accepting the alternative hypothesis when the alternative hypothesis is true - that is, the ability of a test to detect an effect, if the effect actually exists. The power is in general a function of the possible distributions, often determined by a parameter, under the alternative hypothesis. As the power increases, the chances of a Type II error occurring decrease. The probability of a Type II error occurring is referred to as the false negative rate (β) and the power is equal to 1−β. The power is also known as the sensitivity. 
Power analysis can be used to calculate the minimum sample size required so that one can be reasonably likely to detect an effect of a given size. Power analysis can also be used to calculate the minimum effect size that is likely to be detected in a study using a given sample size. In addition, the concept of power is used to make comparisons between different statistical testing procedures: for example, between a parametric and a nonparametric test of the same hypothesis. There is also the concept of a power function of a test, which is the probability of rejecting the null when the null is true. References: Wikipedia, The Free Encyclopedia. Statistical power. Available on: http://en.wikipedia.org/wiki/Statistical_power Gets the test type. Gets the power of the test, also known as the (1-Beta error rate) or the test's sensitivity. Gets the significance level for the test. Also known as alpha. Gets the number of samples considered in the test. Gets the effect size of the test. Common interface for two-sample power analysis objects. Gets the number of observations contained in the first sample. Gets the number of observations contained in the second sample. Base class for two sample power analysis methods. This class cannot be instantiated. Gets the test type. Gets or sets the power of the test, also known as the (1-Beta error rate). Gets or sets the significance level for the test. Also known as alpha. Gets or sets the number of observations in the first sample considered in the test. Gets or sets the number of observations in the second sample considered in the test. Gets the total number of observations in both samples considered in the test. Gets the total number of observations in both samples considered in the test. Gets or sets the effect size of the test. Constructs a new power analysis for a two-sample test. Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Computes the minimum detectable effect size for the test considering the power given in , the number of samples in and the significance level . The minimum detectable effect size for the test under the given conditions. Computes the minimum significance level for the test considering the power given in , the number of samples in and the effect size . The minimum detectable effect size for the test under the given conditions. Computes the recommended sample size for the test to attain the power indicated in considering values of and . Recommended sample size for attaining the given for size effect under the given . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Converts the numeric power of this test to its equivalent string representation. Converts the numeric power of this test to its equivalent string representation. Gets the minimum difference in the experiment units to which it is possible to detect a difference. The common standard deviation for the samples. The minimum difference in means which can be detected by the test. Gets the minimum difference in the experiment units to which it is possible to detect a difference. The variance for the first sample. The variance for the second sample. The minimum difference in means which can be detected by the test. Power analysis for two-sample Z-Tests. Please take a look at the example section. Creates a new . The hypothesis tested. Creates a new . The test to create the analysis for. 
Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Gets the recommended sample size for the test to attain the power indicating in considering values of and . Recommended sample size for attaining the given for size effect under the given . Computes the minimum detectable effect size for the test considering the power given in , the number of samples in and the significance level . The minimum detectable effect size for the test under the given conditions. Estimates the number of samples necessary to attain the required power level for the given effect size. The minimum detectable difference. The difference standard deviation. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. The proportion of observations in the second group when compared to the first group. A proportion of 2:1 results in twice more samples in the second group than in the first. Default is 1. The required number of samples. Estimates the number of samples necessary to attain the required power level for the given effect size. The number of observations in the first sample. The number of observations in the second sample. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. Power analysis for two-sample T-Tests. There are different ways a power analysis test can be conducted. // Let's say we have two samples, and we would like to know whether those // samples have the same mean. For this, we can perform a two sample T-Test: double[] A = { 5.0, 6.0, 7.9, 6.95, 5.3, 10.0, 7.48, 9.4, 7.6, 8.0, 6.22 }; double[] B = { 5.0, 1.6, 5.75, 5.80, 2.9, 8.88, 4.56, 2.4, 5.0, 10.0 }; // Perform the test, assuming the samples have unequal variances var test = new TwoSampleTTest(A, B, assumeEqualVariances: false); double df = test.DegreesOfFreedom; // d.f. = 14.351 double t = test.Statistic; // t = 2.14 double p = test.PValue; // p = 0.04999 bool significant = test.Significant; // true // The test gave us an indication that the samples may // indeed have come from different distributions (whose // mean value is actually distinct from each other). // Now, we would like to perform an _a posteriori_ analysis of the // test. When doing an a posteriori analysis, we can not change some // characteristics of the test (because it has been already done), // but we can measure some important features that may indicate // whether the test is trustworthy or not. // One of the first things would be to check for the test's power. // A test's power is 1 minus the probability of rejecting the null // hypothesis when the null hypothesis is actually false. It is // the other side of the coin when we consider that the P-value // is the probability of rejecting the null hypothesis when the // null hypothesis is actually true. // Ideally, this should be a high value: double power = test.Analysis.Power; // 0.5376260 // Check how much effect we are trying to detect double effect = test.Analysis.Effect; // 0.94566 // With this power, that is the minimal difference we can spot? double sigma = Math.Sqrt(test.Variance); double thres = test.Analysis.Effect * sigma; // 2.0700909090909 // This means that, using our test, the smallest difference that // we could detect with some confidence would be something around // 2 standard deviations. 
If we would like to say the samples are // different when they are less than 2 std. dev. apart, we would // need to do repeat our experiment differently. Another way to create the power analysis is to pass the hypothesis test to the t-test power analysis constructor. // Create an a posteriori analysis of the experiment var analysis = new TwoSampleTTestPowerAnalysis(test); // When creating a power analysis, we have three things we can // change. We can always freely configure two of those things // and then ask the analysis to give us the third. // Those are: double e = analysis.Effect; // the test's minimum detectable effect size (0.94566) double n = analysis.TotalSamples; // the number of samples in the test (21 or (11 + 10)) double b = analysis.Power; // the probability of committing a type-2 error (0.53) // Let's say we would like to create a test with 80% power. analysis.Power = 0.8; analysis.ComputeEffect(); // what effect could we detect? double detectableEffect = analysis.Effect; // we would detect a difference of 1.290514 However, to achieve this 80%, we would need to redo our experiment more carefully. Assuming we are going to redo our experiment, we will have more freedom about what we can change and what we can not. For better addressing those points, we will create an a priori analysis of the experiment: // We would like to know how many samples we would need to gather in // order to achieve a 80% power test which can detect an effect size // of one standard deviation: // analysis = TwoSampleTTestPowerAnalysis.GetSampleSize ( variance1: A.Variance(), variance2: B.Variance(), delta: 1.0, // the minimum detectable difference we want power: 0.8 // the test power that we want ); // How many samples would we need in order to see the effect we need? int n1 = (int)Math.Ceiling(analysis.Samples1); // 77 int n2 = (int)Math.Ceiling(analysis.Samples2); // 77 // According to our power analysis, we would need at least 77 // observations in each sample in order to see the effect we // need with the required 80% power. Creates a new . The hypothesis tested. Creates a new . The test to create the analysis for. Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Estimates the number of samples necessary to attain the required power level for the given effect size. The minimum detectable difference. The difference standard deviation. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The proportion of observations in the second group when compared to the first group. A proportion of 2:1 results in twice more samples in the second group than in the first. Default is 1. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. Estimates the number of samples necessary to attain the required power level for the given effect size. The minimum detectable difference. The first sample variance. The second sample variance. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The proportion of observations in the second group when compared to the first group. A proportion of 2:1 results in twice more samples in the second group than in the first. Default is 1. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. Base class for one sample power analysis methods. This class cannot be instantiated. Gets the test type. 
Gets or sets the power of the test, also known as the (1-Beta error rate). Gets or sets the significance level for the test. Also known as alpha. Gets or sets the number of samples considered in the test. Gets or sets the effect size of the test. Constructs a new power analysis for a one-sample test. Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Computes the minimum detectable effect size for the test considering the power given in , the number of samples in and the significance level . The minimum detectable effect size for the test under the given conditions. Computes recommended sample size for the test to attain the power indicated in considering values of and . Recommended sample size for attaining the given for size effect under the given . Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Converts the numeric power of this test to its equivalent string representation. Converts the numeric power of this test to its equivalent string representation. Gets the minimum difference in the experiment units to which it is possible to detect a difference. The standard deviation for the samples. The minimum difference in means which can be detected by the test. Power analysis for one-sample T-Tests. // When creating a power analysis, we have three things we can // change. We can always freely configure two of those things // and then ask the analysis to give us the third. var analysis = new TTestPowerAnalysis(OneSampleHypothesis.ValueIsDifferentFromHypothesis); // Those are: double e = analysis.Effect; // the test's minimum detectable effect size double n = analysis.Samples; // the number of samples in the test double p = analysis.Power; // the probability of committing a type-2 error // Let's set the desired effect size and the // number of samples so we can get the power analysis.Effect = 0.2; // we would like to detect at least 0.2 std. dev. apart analysis.Samples = 60; // we would like to use at most 60 samples analysis.ComputePower(); // what will be the power of this test? double power = analysis.Power; // The power is going to be 0.33 (or 33%) // Let's set the desired power and the number // of samples so we can get the effect size analysis.Power = 0.8; // we would like to create a test with 80% power analysis.Samples = 60; // we would like to use at most 60 samples analysis.ComputeEffect(); // what would be the minimum effect size we can detect? double effect = analysis.Effect; // The effect will be 0.36 standard deviations. // Let's set the desired power and the effect // size so we can get the number of samples analysis.Power = 0.8; // we would like to create a test with 80% power analysis.Effect = 0.2; // we would like to detect at least 0.2 std. dev. apart analysis.ComputeSamples(); double samples = analysis.Samples; // We would need around 199 samples. Creates a new . The hypothesis tested. Creates a new . The test to create the analysis for. Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Estimates the number of samples necessary to attain the required power level for the given effect size. The minimum detectable difference. The difference standard deviation. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. 
The required number of samples. Estimates the number of samples necessary to attain the required power level for the given effect size. The number of observations in the sample. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. Power analysis for one-sample Z-Tests. // When creating a power analysis, we have three things we can // change. We can always freely configure two of those things // and then ask the analysis to give us the third. var analysis = new ZTestPowerAnalysis(OneSampleHypothesis.ValueIsDifferentFromHypothesis); // Those are: double e = analysis.Effect; // the test's minimum detectable effect size double n = analysis.Samples; // the number of samples in the test double p = analysis.Power; // the probability of committing a type-2 error // Let's set the desired effect size and the // number of samples so we can get the power analysis.Effect = 0.2; // we would like to detect at least 0.2 std. dev. apart analysis.Samples = 60; // we would like to use at most 60 samples analysis.ComputePower(); // what will be the power of this test? double power = analysis.Power; // The power is going to be 0.34 (or 34%) // Let's set the desired power and the number // of samples so we can get the effect size analysis.Power = 0.8; // we would like to create a test with 80% power analysis.Samples = 60; // we would like to use at most 60 samples analysis.ComputeEffect(); // what would be the minimum effect size we can detect? double effect = analysis.Effect; // The effect will be 0.36 standard deviations. // Let's set the desired power and the effect // size so we can get the number of samples analysis.Power = 0.8; // we would like to create a test with 80% power analysis.Effect = 0.2; // we would like to detect at least 0.2 std. dev. apart analysis.ComputeSamples(); double samples = analysis.Samples; // We would need around 197 samples. Creates a new . The hypothesis tested. Creates a new . The test to create the analysis for. Computes the power for a test with givens values of effect size and number of samples under . The power for the test under the given conditions. Gets the recommended sample size for the test to attain the power indicating in considering values of and . Recommended sample size for attaining the given for size effect under the given . Computes the minimum detectable effect size for the test considering the power given in , the number of samples in and the significance level . The minimum detectable effect size for the test under the given conditions. Estimates the number of samples necessary to attain the required power level for the given effect size. The minimum detectable difference. The difference standard deviation. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. Estimates the number of samples necessary to attain the required power level for the given effect size. The number of observations in the sample. The desired power level. Default is 0.8. The desired significance level. Default is 0.05. The alternative hypothesis (research hypothesis) to be tested. The required number of samples. T-Test for two paired samples. 
The Paired T-test can be used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures) or when there are two samples that have been matched or "paired". This is an example of a paired difference test. References: Wikipedia, The Free Encyclopedia. Student's t-test. Available from: http://en.wikipedia.org/wiki/Student%27s_t-test#Dependent_t-test_for_paired_samples Suppose we would like to know the effect of a treatment (such as a new drug) in improving the well-being of 9 patients. The well-being is measured in a discrete scale, going from 0 to 10. // To do so, we need to register the initial state of each patient // and then register their state after a given time under treatment. double[,] patients = { // before after // treatment treatment /* Patient 1.*/ { 0, 1 }, /* Patient 2.*/ { 6, 5 }, /* Patient 3.*/ { 4, 9 }, /* Patient 4.*/ { 8, 6 }, /* Patient 5.*/ { 1, 6 }, /* Patient 6.*/ { 6, 7 }, /* Patient 7.*/ { 3, 4 }, /* Patient 8.*/ { 8, 7 }, /* Patient 9.*/ { 6, 5 }, }; // Extract the before and after columns double[] before = patients.GetColumn(0); double[] after = patients.GetColumn(1); // Create the paired-sample T-test. Our research hypothesis is // that the treatment does improve the patient's well-being. So // we will be testing the hypothesis that the well-being of the // "before" sample, the first sample, is "smaller" in comparison // to the "after" treatment group. PairedTTest test = new PairedTTest(before, after, TwoSampleHypothesis.FirstValueIsSmallerThanSecond); bool significant = test.Significant; // not significant double pvalue = test.PValue; // p-value = 0.1650 double tstat = test.Statistic; // t-stat = -1.0371 Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the first sample's mean. Gets the second sample's mean. Gets the observed mean difference between the two samples. Gets the standard error of the difference. Gets the size of a sample. Both samples have equal size. Gets the 95% confidence interval for the statistic. Gets a confidence interval for the statistic within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Creates a new paired t-test. The observations in the first sample. The observations in the second sample. The alternative hypothesis (research hypothesis) to test. Update event. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Z-Test for two sample proportions. Creates a new Z-Test for two sample proportions. The proportion of success observations in the first sample. The total number of observations in the first sample. The proportion of success observations in the second sample. The total number of observations in the second sample. The alternative hypothesis (research hypothesis) to test. Creates a new Z-Test for two sample proportions. The number of successes in the first sample. The total number of trials (observations) in the first sample. The number of successes in the second sample. The total number of trials (observations) in the second sample. The alternative hypothesis (research hypothesis) to test. Computes the Z-test for two sample proportions. Hypothesis test for two Receiver-Operating Characteristic (ROC) curve areas (ROC-AUC). 
First Receiver-Operating Characteristic curve. Second Receiver-Operating Characteristic curve. Gets the summed Kappa variance for the two contingency tables. Gets the variance for the first Kappa value. Gets the variance for the second Kappa value. Creates a new test for two ROC curves. The first ROC curve. The second ROC curve. The hypothesized difference between the two areas. The alternative hypothesis (research hypothesis) to test. Grubbs' Test for Outliers (for approximately Normal distributions). Grubbs' test (named after Frank E. Grubbs, who published the test in 1950), also known as the maximum normed residual test or extreme studentized deviate test, is a statistical test used to detect outliers in a univariate data set assumed to come from a normally distributed population. References: Wikipedia, The Free Encyclopedia. Grubbs' test for outliers. Available on: https://en.wikipedia.org/wiki/Grubbs%27_test_for_outliers Gets the number of observations in the sample. The number of observations in the sample. Gets the sample's mean. Gets the sample's standard deviation. Gets the difference between the minimum value and the mean. Gets the difference between the maximum value and the mean. Gets the maximum absolute difference between an observation in the sample and the mean. Gets the being tested. Constructs a Grubbs' test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Base class for Wilcoxon's W tests. This is a base class which doesn't need to be used directly. Instead, you may wish to call and . Gets the number of samples being tested. Gets the signs for each of the differences. Gets the differences between the samples. Gets the rank statistics for the differences. Gets whether the samples to be ranked contain zeros. Gets whether the samples to be ranked contain ties. Gets whether we are using an exact test. Creates a new Wilcoxon's W+ test. The signs for the sample differences. The differences between samples. The distribution tail to test. Creates a new Wilcoxon's W+ test. Computes the Wilcoxon Signed-Rank test. Computes the Wilcoxon Signed-Rank test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Mann-Whitney-Wilcoxon test for unpaired samples. The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a non-parametric test of the null hypothesis that two populations are the same against an alternative hypothesis, especially that a particular population tends to have larger values than the other. It has greater efficiency than the t-test on non-normal distributions, such as a mixture of normal distributions, and it is nearly as efficient as the t-test on normal distributions. The following example comes from Richard Lowry's page at http://vassarstats.net/textbook/ch11a.html. As stated by Richard, this example deals with persons seeking treatment for claustrophobia. Those persons are randomly divided into two groups, and each group receives a different treatment for the disorder. The hypothesis would be that treatment A would be more effective than B. To check this hypothesis, we can use Mann-Whitney's Test to compare the medians of both groups.
// Claustrophobia test scores for people treated with treatment A double[] sample1 = { 4.6, 4.7, 4.9, 5.1, 5.2, 5.5, 5.8, 6.1, 6.5, 6.5, 7.2 }; // Claustrophobia test scores for people treated with treatment B double[] sample2 = { 5.2, 5.3, 5.4, 5.6, 6.2, 6.3, 6.8, 7.7, 8.0, 8.1 }; // Create a new Mann-Whitney-Wilcoxon's test to compare the two samples MannWhitneyWilcoxonTest test = new MannWhitneyWilcoxonTest(sample1, sample2, TwoSampleHypothesis.FirstValueIsSmallerThanSecond); double sum1 = test.RankSum1; // 96.5 double sum2 = test.RankSum2; // 134.5 double statistic1 = test.Statistic1; // 79.5 double statistic2 = test.Statistic2; // 30.5 double pvalue = test.PValue; // 0.043834132843420748 // Check if the test was significant bool significant = test.Significant; // true Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the number of samples in the first sample. Gets the number of samples in the second sample. Gets the rank statistics for the first sample. Gets the rank statistics for the second sample. Gets the sum of ranks for the first sample. Often known as Ta. Gets the sum of ranks for the second sample. Often known as Tb. Gets the difference between the expected value for the observed value of and its expected value under the null hypothesis. Often known as U_a. Gets the difference between the expected value for the observed value of and its expected value under the null hypothesis. Often known as U_b. Gets a value indicating whether the provided samples have tied ranks. Gets whether we are using a exact test. Tests whether two samples comes from the same distribution without assuming normality. The first sample. The second sample. The alternative hypothesis (research hypothesis) to test. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 12). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. Default is null. Whether to account for ties when computing the rank statistics or not. Default is true. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Base class for Hypothesis Tests. A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. References: Wikipedia, The Free Encyclopedia. Statistical Hypothesis Testing. Initializes a new instance of the class. Gets the distribution associated with the test statistic. Gets the P-value associated with this test. In statistical hypothesis testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The lower the p-value, the less likely the result can be explained by chance alone, assuming the null hypothesis is true. Gets the test statistic. Gets the test type. Gets the significance level for the test. Default value is 0.05 (5%). Gets whether the null hypothesis should be rejected. 
A test result is said to be statistically significant when the result would be very unlikely to have occurred by chance alone. Gets the critical value for the current significance level. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Called whenever the test significance level changes. Converts the numeric P-Value of this test to its equivalent string representation. Converts the numeric P-Value of this test to its equivalent string representation. Common interface for Hypothesis tests depending on a statistical distribution. The test statistic distribution. Gets the distribution associated with the test statistic. Common interface for Hypothesis tests depending on a statistical distribution. Gets the test type. Gets whether the null hypothesis should be rejected. A test result is said to be statistically significant when the result would be very unlikely to have occurred by chance alone. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Bhapkar test of homogeneity for contingency tables. The Bhapkar test is a more powerful alternative to the Stuart-Maxwell test. This is a Chi-square kind of test. References: Bhapkar, V.P. (1966). A note on the equivalence of two test criteria for hypotheses in categorical data. Journal of the American Statistical Association, 61, 228-235. Gets the delta vector d used in the test calculations. Gets the covariance matrix S used in the test calculations. Gets the inverse covariance matrix S^-1 used in the calculations. Creates a new Bhapkar test. The contingency table to test. Bowker test of symmetry for contingency tables. This is a Chi-square kind of test. Creates a new Bowker test. The contingency table to test. Two-Sample (Goodness-of-fit) Chi-Square Test (Upper Tail) A chi-square test (also chi-squared or χ² test) is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or any in which this is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-square distribution as closely as desired by making the sample size large enough. The chi-square test is used whenever one would like to test whether the actual data differs from a random distribution. References: Wikipedia, The Free Encyclopedia. Chi-Square Test. Available on: http://en.wikipedia.org/wiki/Chi-square_test J. S. McLaughlin. Chi-Square Test. Available on: http://www2.lv.psu.edu/jxm57/irp/chisquar.html The following example has been based on the example section of the Pearson's chi-squared test article on Wikipedia. // Suppose we would like to test the hypothesis that a random sample of // 100 people has been drawn from a population in which men and women are // equal in frequency. // Under this hypothesis, the observed number of men and women would be // compared to the theoretical frequencies of 50 men and 50 women. 
So, // after drawing our sample, we found out that there were 44 men and 56 // women in the sample: // man woman double[] observed = { 44, 56 }; double[] expected = { 50, 50 }; // If the null hypothesis is true (i.e., men and women are chosen with // equal probability), the test statistic will be drawn from a chi-squared // distribution with one degree of freedom. If the male frequency is known, // then the female frequency is determined. // int degreesOfFreedom = 1; // So now we have: // var chi = new ChiSquareTest(expected, observed, degreesOfFreedom); // The chi-squared distribution for 1 degree of freedom shows that the // probability of observing this difference (or a more extreme difference // than this) if men and women are equally numerous in the population is // approximately 0.23. double pvalue = chi.PValue; // 0.23 // This probability is higher than conventional criteria for statistical // significance (0.001 or 0.05), so normally we would not reject the null // hypothesis that the number of men in the population is the same as the // number of women. bool significant = chi.Significant; // false Gets the degrees of freedom for the Chi-Square distribution. Constructs a Chi-Square Test. The test statistic. The chi-square distribution degrees of freedom. Constructs a Chi-Square Test. The expected variable values. The observed variable values. The chi-square distribution degrees of freedom. Constructs a Chi-Square Test. Constructs a Chi-Square Test. Constructs a Chi-Square Test. Constructs a Chi-Square Test. Computes the Chi-Square Test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Kappa Test for two contingency tables. The two-matrix Kappa test tries to assert whether the Kappa measure of two contingency tables, each of which was created by a different rater or classification model, differs significantly. This is a two sample z-test kind of test. References: J. L. Fleiss. Statistical methods for rates and proportions. Wiley-Interscience; 3rd edition (September 5, 2003) Ientilucci, Emmett (2006). "On Using and Computing the Kappa Statistic". Available on: http://www.cis.rit.edu/~ejipci/Reports/On_Using_and_Computing_the_Kappa_Statistic.pdf Gets the summed Kappa variance for the two contingency tables. Gets the variance for the first Kappa value. Gets the variance for the second Kappa value. Creates a new Two-Table Kappa test. The kappa value for the first contingency table to test. The kappa value for the second contingency table to test. The variance of the kappa value for the first contingency table to test. The variance of the kappa value for the second contingency table to test. The alternative hypothesis (research hypothesis) to test. The hypothesized difference between the two Kappa values. Creates a new Two-Table Kappa test. The first contingency table to test. The second contingency table to test. The hypothesized difference between the two Kappa values. The alternative hypothesis (research hypothesis) to test. Kappa Test for agreement in contingency tables. The Kappa test tries to assert whether the Kappa measure of a contingency table is significantly different from another hypothesized value. The computations used by the test are the same as those found in the 1969 paper by J. L. Fleiss, J. Cohen, B. S.
Everitt, in which they presented the final, corrected version of the Kappa variance formulae. This is in contrast to the computations traditionally found in the remote sensing literature. For those variance computations, see the method. This is a z-test kind of test. References: J. L. Fleiss. Statistical methods for rates and proportions. Wiley-Interscience; 3rd edition (September 5, 2003) J. L. Fleiss, J. Cohen, B. S. Everitt. Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, Volume: 72, Issue: 5. Washington, DC: American Psychological Association, Pages: 323-327, 1969. Gets the variance of the Kappa statistic. Creates a new Kappa test. The estimated Kappa statistic. The standard error of the kappa statistic. If the test is being used to assert independence between two raters (i.e. testing the null hypothesis that the underlying Kappa is zero), then the standard error should be computed with the null hypothesis parameter set to true. The alternative hypothesis (research hypothesis) to test. If the hypothesized kappa is left unspecified, a one-tailed test will be used. Otherwise, the default is to use a two-sided test. Creates a new Kappa test. The estimated Kappa statistic. The standard error of the kappa statistic. If the test is being used to assert independence between two raters (i.e. testing the null hypothesis that the underlying Kappa is zero), then the standard error should be computed with the null hypothesis parameter set to true. The hypothesized value for the Kappa statistic. The alternative hypothesis (research hypothesis) to test. If the hypothesized kappa is left unspecified, a one-tailed test will be used. Otherwise, the default is to use a two-sided test. Creates a new Kappa test. The contingency table to test. The alternative hypothesis (research hypothesis) to test. If the hypothesized kappa is left unspecified, a one-tailed test will be used. Otherwise, the default is to use a two-sided test. Creates a new Kappa test. The contingency table to test. The alternative hypothesis (research hypothesis) to test. If the hypothesized kappa is left unspecified, a one-tailed test will be used. Otherwise, the default is to use a two-sided test. The hypothesized value for the Kappa statistic. If the test is being used to assert independence between two raters (i.e. testing the null hypothesis that the underlying Kappa is zero), then the standard error will be computed with the null hypothesis parameter set to true. Creates a new Kappa test. The contingency table to test. The alternative hypothesis (research hypothesis) to test. If the hypothesized kappa is left unspecified, a one-tailed test will be used. Otherwise, the default is to use a two-sided test. The hypothesized value for the Kappa statistic. If the test is being used to assert independence between two raters (i.e. testing the null hypothesis that the underlying Kappa is zero), then the standard error will be computed with the null hypothesis parameter set to true. Computes Cohen's Kappa variance using the large sample approximation given by Congalton, which is common in the remote sensing literature. A representing the ratings. Kappa's variance. Computes Cohen's Kappa variance using the large sample approximation given by Congalton, which is common in the remote sensing literature. A representing the ratings. Kappa's standard deviation. Kappa's variance.
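As a rough sketch of how a Kappa test might be set up from two label vectors (the labels below are hypothetical; the construction of the contingency table through a GeneralConfusionMatrix, and the single-argument KappaTest constructor relying on the defaults described above, are assumptions rather than guarantees):

// Hypothetical labels assigned by two raters to the same 8 instances:
int[] rater1 = { 0, 0, 1, 1, 1, 0, 1, 0 };
int[] rater2 = { 0, 0, 1, 1, 0, 0, 1, 1 };

// Build the contingency (confusion) table from the two label vectors:
var matrix = new GeneralConfusionMatrix(2, rater1, rater2);

// Test the null hypothesis that the underlying Kappa is zero
// (i.e. that the two raters are independent):
var kappaTest = new KappaTest(matrix);

double kappa = matrix.Kappa;       // the estimated Kappa statistic
double pvalue = kappaTest.PValue;
bool significant = kappaTest.Significant;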
Computes the asymptotic variance for Fleiss's Kappa using the formulae by (Fleiss et al, 1969) when the underlying Kappa is assumed different from zero. A representing the ratings. Kappa's variance. Computes the asymptotic variance for Fleiss's Kappa using the formulae by (Fleiss et al, 1969). If the null hypothesis parameter is set to true, the method will return the variance under the null hypothesis. A representing the ratings. Kappa's standard deviation. True to compute Kappa's variance when the null hypothesis is true (i.e. that the underlying kappa is zero). False otherwise. Default is false. Kappa's variance. Computes the asymptotic variance for Fleiss's Kappa using the formulae by (Fleiss et al, 1969). If the null hypothesis parameter is set to true, the method will return the variance under the null hypothesis. A representing the ratings. Kappa's standard deviation. True to compute Kappa's variance when the null hypothesis is true (i.e. that the underlying kappa is zero). False otherwise. Default is false. Kappa's variance. Snedecor's F-Test. An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fit to a data set, in order to identify the model that best fits the population from which the data were sampled. References: Wikipedia, The Free Encyclopedia. F-Test. Available on: http://en.wikipedia.org/wiki/F-test // The following example has been based on the page "F-Test for Equality // of Two Variances", from NIST/SEMATECH e-Handbook of Statistical Methods: // // http://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm // // Consider a data set containing 480 ceramic strength // measurements for two batches of material. The summary // statistics for each batch are shown below: // Batch 1: int numberOfObservations1 = 240; // double mean1 = 688.9987; double stdDev1 = 65.54909; double var1 = stdDev1 * stdDev1; // Batch 2: int numberOfObservations2 = 240; // double mean2 = 611.1559; double stdDev2 = 61.85425; double var2 = stdDev2 * stdDev2; // Here, we will be testing the null hypothesis that // the variances for the two batches are equal. int degreesOfFreedom1 = numberOfObservations1 - 1; int degreesOfFreedom2 = numberOfObservations2 - 1; // Now we can create an F-Test to test the difference between variances var ftest = new FTest(var1, var2, degreesOfFreedom1, degreesOfFreedom2); double statistic = ftest.Statistic; // 1.123037 double pvalue = ftest.PValue; // 0.185191 bool significant = ftest.Significant; // false // The F test indicates that there is not enough evidence // to reject the null hypothesis that the two batch variances // are equal at the 0.05 significance level. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the degrees of freedom for the numerator in the test distribution. Gets the degrees of freedom for the denominator in the test distribution. Creates a new F-Test for a given statistic with given degrees of freedom. The variance of the first sample. The variance of the second sample. The degrees of freedom for the first sample. The degrees of freedom for the second sample. The alternative hypothesis (research hypothesis) to test. Creates a new F-Test for a given statistic with given degrees of freedom. The test statistic. The degrees of freedom for the numerator. The degrees of freedom for the denominator. The alternative hypothesis (research hypothesis) to test.
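Since the F statistic in the example above is simply the ratio of the two sample variances, the value reported by the test can be checked by hand from the summary statistics alone (a quick verification, not library code):

// F = s1^2 / s2^2, with (n1 - 1, n2 - 1) = (239, 239) degrees of freedom
double s1 = 65.54909, s2 = 61.85425;
double f = (s1 * s1) / (s2 * s2);   // ≈ 1.1230, matching ftest.Statistic
// The p-value reported above (≈ 0.1852) then follows from the F distribution
// with 239 and 239 degrees of freedom.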
Computes the F-test. Creates a new F-Test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. McNemar test of homogeneity for 2 x 2 contingency tables. McNemar's test is a non-parametric method used on nominal data. It is applied to 2 × 2 contingency tables with a dichotomous trait, with matched pairs of subjects, to determine whether the row and column marginal frequencies are equal, i.e. if the contingency table presents marginal homogeneity. This is a Chi-square kind of test. References: Wikipedia contributors, "McNemar's test," Wikipedia, The Free Encyclopedia, Available on: http://en.wikipedia.org/wiki/McNemar's_test. Creates a new McNemar test. The contingency table to test. True to use Yates's correction for continuity, false otherwise. Default is false. One-sample Kolmogorov-Smirnov (KS) test. The Kolmogorov-Smirnov test tries to determine if a sample differs significantly from an hypothesized theoretical probability distribution. The Kolmogorov-Smirnov test has an interesting advantage in that it does not require any assumptions about the data. The distribution of the K-S test statistic does not depend on which distribution is being tested. The K-S test also has the advantage of being an exact test (other tests, such as the chi-square goodness-of-fit test, depend on an adequate sample size). One disadvantage is that it requires a fully defined distribution which should not have been estimated from the data. If the parameters of the theoretical distribution have been estimated from the data, the critical region of the K-S test will no longer be valid. This class uses an efficient and high-accuracy algorithm based on work by Richard Simard (2010). Please see for more details. References: Wikipedia, The Free Encyclopedia. Kolmogorov-Smirnov Test. Available on: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test NIST/SEMATECH e-Handbook of Statistical Methods. Kolmogorov-Smirnov Goodness-of-Fit Test. Available on: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm Richard Simard, Pierre L’Ecuyer. Computing the Two-Sided Kolmogorov-Smirnov Distribution. Journal of Statistical Software. Volume VV, Issue II. Available on: http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf In this first example, suppose we have got a new sample, and we would like to test whether this sample originated from a uniform continuous distribution. double[] sample = { 0.621, 0.503, 0.203, 0.477, 0.710, 0.581, 0.329, 0.480, 0.554, 0.382 }; // First, we create the distribution we would like to test against: // var distribution = UniformContinuousDistribution.Standard; // Now we can define our hypothesis. The null hypothesis is that the sample // comes from a standard uniform distribution, while the alternate is that // the sample is not from a standard uniform distribution. // var kstest = new KolmogorovSmirnovTest(sample, distribution); double statistic = kstest.Statistic; // 0.29 double pvalue = kstest.PValue; // 0.3067 bool significant = kstest.Significant; // false Since the null hypothesis could not be rejected, the sample could perhaps have come from a uniform distribution. However, please note that this doesn't mean that the sample *is* from the uniform distribution; it only means that we could not rule out the possibility.
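For the uniform example above, the reported statistic can be reproduced by hand from the definition of the two-sided statistic, Dn = max(Dn+, Dn-), where Dn+ = max_i( i/n - F(x_(i)) ) and Dn- = max_i( F(x_(i)) - (i-1)/n ) over the sorted sample. The sketch below shows only this arithmetic; computing the p-value still requires the Kolmogorov-Smirnov distribution mentioned in the references:

// Reproduce the D statistic of the uniform example by hand.
// For the standard uniform distribution, F(x) = x on [0, 1].
double[] x = { 0.621, 0.503, 0.203, 0.477, 0.710, 0.581, 0.329, 0.480, 0.554, 0.382 };
Array.Sort(x);
int n = x.Length;

double dPlus = 0, dMinus = 0;
for (int i = 0; i < n; i++)
{
    double cdf = x[i]; // F(x) = x for the standard uniform distribution
    dPlus = Math.Max(dPlus, (i + 1.0) / n - cdf);
    dMinus = Math.Max(dMinus, cdf - (double)i / n);
}

double d = Math.Max(dPlus, dMinus); // 0.29, matching kstest.Statistic above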
In the previous example we could not rule out the possibility that the sample came from a uniform distribution, which means the sample was not very far from uniform. This suggests it should be far from what would be expected from a Normal distribution: // First, we create the distribution we would like to test against: // NormalDistribution distribution = NormalDistribution.Standard; // Now we can define our hypothesis. The null hypothesis is that the sample // comes from a standard Normal distribution, while the alternate is that // the sample is not from a standard Normal distribution. // var kstest = new KolmogorovSmirnovTest(sample, distribution); double statistic = kstest.Statistic; // 0.580432 double pvalue = kstest.PValue; // 0.000999 bool significant = kstest.Significant; // true Since the test says that the null hypothesis should be rejected, this can be regarded as a strong indication that the sample does not come from a Normal distribution, just as we expected. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the theoretical, hypothesized distribution for the samples, which should have been stated before any measurements. Gets the empirical distribution measured from the sample. Creates a new One-Sample Kolmogorov test. The sample we would like to test as belonging to the . A fully specified distribution (which must NOT have been estimated from the data). Creates a new One-Sample Kolmogorov test. The sample we would like to test as belonging to the . A fully specified distribution (which must NOT have been estimated from the data). The alternative hypothesis (research hypothesis) to test. Gets the appropriate Kolmogorov-Smirnov D statistic for the samples and target distribution. The sorted samples. The target distribution. The alternate hypothesis for the KS test. For , this is the two-sided Dn statistic; for this is the one-sided Dn+ statistic; and for this is the one-sided Dn- statistic. Gets the one-sided "Dn-" Kolmogorov-Smirnov statistic for the samples and target distribution. The sorted samples. The target distribution. Gets the one-sided "Dn+" Kolmogorov-Smirnov statistic for the samples and target distribution. The sorted samples. The target distribution. Gets the two-sided "Dn" Kolmogorov-Smirnov statistic for the samples and target distribution. The sorted samples. The target distribution. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. ANOVA's result table. This class represents the results obtained from an ANOVA experiment. Source of variation in an ANOVA experiment. Creates a new object representation of a variation source in an ANOVA experiment. The associated ANOVA analysis. The name of the variation source. Creates a new object representation of a variation source in an ANOVA experiment. The associated ANOVA analysis. The name of the variation source. The degrees of freedom for the source. The sum of squares of the source. Creates a new object representation of a variation source in an ANOVA experiment. The associated ANOVA analysis. The name of the variation source. The degrees of freedom for the source. The mean sum of squares of the source. The sum of squares of the source. Creates a new object representation of a variation source in an ANOVA experiment.
The associated ANOVA analysis. The name of the variation source. The degrees of freedom for the source. The sum of squares of the source. The F-Test containing the F-Statistic for the source. Creates a new object representation of a variation source in an ANOVA experiment. The associated ANOVA analysis. The name of the variation source. The degrees of freedom for the source. The sum of squares of the source. The mean sum of squares of the source. The F-Test containing the F-Statistic for the source. Gets the ANOVA associated with this source. Gets the name of the variation source. Gets the sum of squares associated with the variation source. Gets the degrees of freedom associated with the variation source. Gets the mean squares, or the variance, associated with the source. Gets the significance of the source. Gets the F-Statistic associated with the source's significance. Common interface for analyses of variance. Gets the ANOVA results in the form of a table. One-way Analysis of Variance (ANOVA). The one-way ANOVA is a way to test for the equality of three or more means at the same time by using variances. In its simplest form ANOVA provides a statistical test of whether or not the means of several groups are all equal, and therefore generalizes the t-test to more than two groups. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Wikipedia, The Free Encyclopedia. F-Test. Wikipedia, The Free Encyclopedia. One-way ANOVA. The following is the same example given in Wikipedia's page for the F-Test [1]. Suppose one would like to test the effect of three levels of a fertilizer on plant growth. To achieve this goal, an experimenter has divided a set of 18 plants into three groups of 6 plants each. Each group has received a different level of the fertilizer in question. After some months, the experimenter registers the growth for each plant: double[][] samples = { new double[] { 6, 8, 4, 5, 3, 4 }, // records for the first group new double[] { 8, 12, 9, 11, 6, 8 }, // records for the second group new double[] { 13, 9, 11, 8, 7, 12 }, // records for the third group }; Now, he would like to test whether the different fertilizer levels have indeed caused any effect on plant growth. In other words, he would like to test if the three groups are indeed significantly different. // To do it, he runs an ANOVA test: OneWayAnova anova = new OneWayAnova(samples); After the Anova object has been created, one can display its findings in the form of a standard ANOVA table by binding anova.Table to a DataGridView or any other display object supporting data binding. To illustrate, we could use Accord.NET's DataGridBox to inspect the table's contents. DataGridBox.Show(anova.Table); Results in: The p-level for the analysis is about 0.002, meaning the test is significant at the 5% significance level. The experimenter would thus reject the null hypothesis, concluding there is strong evidence that the three groups are indeed different. Assuming the experiment was correctly controlled, this would be an indication that the fertilizer does indeed affect plant growth. [1] http://en.wikipedia.org/wiki/F_test Gets the F-Test produced by this one-way ANOVA. Gets the ANOVA results in the form of a table. Creates a new one-way ANOVA test. The sampled values. The independent, nominal variables. Creates a new one-way ANOVA test. The grouped sampled values. Two-way ANOVA model types. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Fixed-effects model (Model 1).
The fixed-effects model of analysis of variance, also known as model 1, applies to situations in which the experimenter applies one or more treatments to the subjects of the experiment to see if the response variable values change. This allows the experimenter to estimate the ranges of response variable values that the treatment would generate in the population as a whole. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Random-effects model (Model 2). Random effects models are used when the treatments are not fixed. This occurs when the various factor levels are sampled from a larger population. Because the levels themselves are random variables, some assumptions and the method of contrasting the treatments differ from ANOVA model 1. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Mixed-effects models (Model 3). A mixed-effects model contains experimental factors of both fixed and random-effects types, with appropriately different interpretations and analysis for the two types. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Two-way Analysis of Variance. The two-way ANOVA is an extension of the one-way ANOVA for two independent variables. There are three classes of models which can also be used in the analysis, each of which determines the interpretation of the independent variables in the analysis. References: Wikipedia, The Free Encyclopedia. Analysis of variance. Carsten Dahl Mørch, ANOVA. Aalborg Universitet. Available on: http://www.smi.hst.aau.dk/~cdahl/BiostatPhD/ANOVA.pdf Gets the number of observations in the sample. Gets the number of samples presenting the first factor. Gets the number of samples presenting the second factor. Gets the number of replications of each factor. Gets or sets the variation sources obtained in the analysis. The variation sources for the data. Gets the ANOVA results in the form of a table. Gets or sets the type of the model. The type of the model. Constructs a new . The samples. The first factor labels. The second factor labels. The type of the analysis. Constructs a new . The samples in grouped form. The type of the analysis. Constructs a new . The samples in grouped form. The type of the analysis. Stuart-Maxwell test of homogeneity for K x K contingency tables. The Stuart-Maxwell test is a generalization of McNemar's test for multiple categories. This is a Chi-square kind of test. References: Uebersax, John (2006). "McNemar Tests of Marginal Homogeneity". Available on: http://www.john-uebersax.com/stat/mcnemar.htm Sun, Xuezheng; Yang, Zhao (2008). "Generalized McNemar's Test for Homogeneity of the Marginal Distributions". Available on: http://www2.sas.com/proceedings/forum2008/382-2008.pdf Gets the delta vector d used in the test calculations. Gets the covariance matrix S used in the test calculations. Gets the inverse covariance matrix S^-1 used in the calculations. Creates a new Stuart-Maxwell test. The contingency table to test. Sign test for the median. In statistics, the sign test can be used to test the hypothesis that the median difference between the continuous distributions of two random variables X and Y is zero, in situations where we can draw paired samples from X and Y. It is a non-parametric test which makes very few assumptions about the nature of the distributions under test - this means that it has very general applicability but may lack the statistical power of other tests such as the paired-samples t-test or the Wilcoxon signed-rank test.
References: Wikipedia, The Free Encyclopedia. Sign test. Available on: http://en.wikipedia.org/wiki/Sign_test // This example has been adapted from the Wikipedia's page about // the Z-Test, available from: http://en.wikipedia.org/wiki/Z-test // We would like to check whether a sample of 20 // students with a median score of 96 points ... double[] sample = { 106, 115, 96, 88, 91, 88, 81, 104, 99, 68, 104, 100, 77, 98, 96, 104, 82, 94, 72, 96 }; // ... could have happened just by chance inside a // population with an hypothesized median of 100 points. double hypothesizedMedian = 100; // So we start by creating the test: SignTest test = new SignTest(sample, hypothesizedMedian, OneSampleHypothesis.ValueIsSmallerThanHypothesis); // Now, we can check whether this result would be // unlikely under a standard significance level: bool significant = test.Significant; // false (so the event was likely) // We can also check the test statistic and its P-Value double statistic = test.Statistic; // 5 double pvalue = test.PValue; // 0.99039 Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Tests the null hypothesis that the sample median is equal to a hypothesized value. The number of positive samples. The total number of samples. The alternative hypothesis (research hypothesis) to test. Tests the null hypothesis that the sample median is equal to a hypothesized value. The data samples from which the test will be performed. The constant to be compared with the samples. The alternative hypothesis (research hypothesis) to test. Computes the one sample sign test. Wilcoxon signed-rank test for the median. The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ (i.e. it is a paired difference test). It can be used as an alternative to the paired Student's t-test, t-test for matched pairs, or the t-test for dependent samples when the population cannot be assumed to be normally distributed. The Wilcoxon signed-rank test is not the same as the Wilcoxon rank-sum test, although both are nonparametric and involve summation of ranks. This test uses the positive W statistic, as explained in https://onlinecourses.science.psu.edu/stat414/node/319 References: Wikipedia, The Free Encyclopedia. Wilcoxon signed-rank test. Available on: http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test // This example has been adapted from the Wikipedia's page about // the Z-Test, available from: http://en.wikipedia.org/wiki/Z-test // We would like to check whether a sample of 20 // students with a median score of 96 points ... double[] sample = { 106, 115, 96, 88, 91, 88, 81, 104, 99, 68, 104, 100, 77, 98, 96, 104, 82, 94, 72, 96 }; // ... could have happened just by chance inside a // population with an hypothesized median of 100 points. 
double hypothesizedMedian = 100; // So we start by creating the test: WilcoxonSignedRankTest test = new WilcoxonSignedRankTest(sample, hypothesizedMedian, OneSampleHypothesis.ValueIsSmallerThanHypothesis); // Now, we can check whether this result would be // unlikely under a standard significance level: bool significant = test.Significant; // false (so the event was likely) // We can also check the test statistic and its P-Value double statistic = test.Statistic; // 40.0 double pvalue = test.PValue; // 0.98585347446367344 Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Tests the null hypothesis that the sample median is equal to a hypothesized value. The data samples from which the test will be performed. The constant to be compared with the samples. The alternative hypothesis (research hypothesis) to test. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 12). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. Default is null. Whether to account for ties when computing the rank statistics or not. Default is true. Sign test for two paired samples. This is a Binomial kind of test. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Creates a new sign test for two samples. The number of positive samples (successes). The total number of samples (trials). The alternative hypothesis (research hypothesis) to test. Creates a new sign test for two samples. The first sample of observations. The second sample of observations. The alternative hypothesis (research hypothesis) to test. Computes the two sample sign test. Binomial test. In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories. The most common use of the binomial test is in the case where the null hypothesis is that two categories are equally likely to occur (such as a coin toss). When there are more than two categories, and an exact test is required, the multinomial test, based on the multinomial distribution, must be used instead of the binomial test. References: Wikipedia, The Free Encyclopedia. Binomial-Test. Available from: http://en.wikipedia.org/wiki/Binomial_test This is the second example from Wikipedia's page on hypothesis testing. In this example, a person is tested for clairvoyance (the ability of gaining information about something through extra-sensory perception; detecting something without using the known human senses). // A person is shown the reverse of a playing card 25 times and is // asked which of the four suits the card belongs to. Every time // the person correctly guesses the suit of the card, we count this // result as a correct answer. Let's suppose the person obtained 13 // correct answers out of the 25 cards. // Since each suit appears 1/4 of the time in the card deck, we // would assume the probability of producing a correct answer by // chance alone would be 1/4. // And finally, we must consider that we are interested in whether the // subject performs better than what would be expected by chance. // In other words, that the person's probability of predicting // a card is higher than the chance hypothesized value of 1/4.
BinomialTest test = new BinomialTest( successes: 13, trials: 25, hypothesizedProbability: 1.0 / 4.0, alternate: OneSampleHypothesis.ValueIsGreaterThanHypothesis); Console.WriteLine("Test p-Value: " + test.PValue); // ~ 0.003 Console.WriteLine("Significant? " + test.Significant); // True. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Tests the probability of two outcomes in a series of experiments. The experimental trials. The hypothesized occurrence probability. The alternative hypothesis (research hypothesis) to test. Tests the probability of two outcomes in a series of experiments. The number of successes in the trials. The total number of experimental trials. The hypothesized occurrence probability. The alternative hypothesis (research hypothesis) to test. Creates a Binomial test. Computes the Binomial test. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Computes the two-tail probability using the Wilson-Sterne rule, which defines the tail of the distribution based on an ordering of the null probabilities of X (Simonoff, 2003). References: Jeffrey S. Simonoff, Analyzing Categorical Data, Springer, 2003 (pg 64). Wilcoxon signed-rank test for paired samples. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Tests whether the medians of two paired samples are different. The first sample. The second sample. The alternative hypothesis (research hypothesis) to test. True to compute the exact distribution. May require a significant amount of processing power for large samples (n > 12). If left at null, whether to compute the exact or approximate distribution will depend on the number of samples. Default is null. Whether to account for ties when computing the rank statistics or not. Default is true. One-sample Student's T test. The one-sample t-test assesses whether the mean of a sample is statistically different from a hypothesized value. This test supports creating power analyses through its property. References: Wikipedia, The Free Encyclopedia. Student's T-Test. William M.K. Trochim. The T-Test. Research methods Knowledge Base, 2009. Available on: http://www.le.ac.uk/bl/gat/virtualfc/Stats/ttest.html Graeme D. Ruxton. The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test. Oxford Journals, Behavioral Ecology Volume 17, Issue 4, pp. 688-690. 2006. Available on: http://beheco.oxfordjournals.org/content/17/4/688.full // Consider a sample generated from a Gaussian // distribution with mean 0.5 and unit variance. double[] sample = { -0.849886940156521, 3.53492346633185, 1.22540422494611, 0.436945126810344, 1.21474290382610, 0.295033941700225, 0.375855651783688, 1.98969760778547, 1.90903448980048, 1.91719241342961 }; // One may raise the hypothesis that the mean of the sample is not // significantly different from zero. In other words, the fact that // this particular sample has mean 0.5 may be attributed to chance.
double hypothesizedMean = 0; // Create a T-Test to check this hypothesis TTest test = new TTest(sample, hypothesizedMean, OneSampleHypothesis.ValueIsDifferentFromHypothesis); // Check if the mean is significantly different (test.Significant should be true) // Now, we would like to test if the sample mean is // significantly greater than the hypothesized zero. // Create a T-Test to check this hypothesis TTest greater = new TTest(sample, hypothesizedMean, OneSampleHypothesis.ValueIsGreaterThanHypothesis); // Check if the mean is significantly larger (greater.Significant should be true) // Now, we would like to test if the sample mean is // significantly smaller than the hypothesized zero. // Create a T-Test to check this hypothesis TTest smaller = new TTest(sample, hypothesizedMean, OneSampleHypothesis.ValueIsSmallerThanHypothesis); // Check if the mean is significantly smaller (smaller.Significant should be false) Gets the power analysis for the test, if available. Gets the standard error of the estimated value. Gets the estimated parameter value, such as the sample's mean value. Gets the hypothesized parameter value. Gets the 95% confidence interval for the . Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets a confidence interval for the estimated value within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Tests the null hypothesis that the population mean is equal to a specified value. The test statistic. The degrees of freedom for the test distribution. The alternative hypothesis (research hypothesis) to test. Tests the null hypothesis that the population mean is equal to a specified value. The estimated value (θ). The standard error of the estimation (SE). The hypothesized value (θ'). The degrees of freedom for the test distribution. The alternative hypothesis (research hypothesis) to test. Tests the null hypothesis that the population mean is equal to a specified value. The data samples from which the test will be performed. The constant to be compared with the samples. The alternative hypothesis (research hypothesis) to test. Creates a T-Test. Tests the null hypothesis that the population mean is equal to a specified value. The sample's mean value. The standard deviation for the samples. The number of observations in the sample. The constant to be compared with the samples. The alternative hypothesis (research hypothesis) to test. Computes the T-Test. Computes the T-test. Computes the T-test. Update event. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Converts a given test statistic to a p-value. The value of the test statistic. The tail of the test distribution. The test distribution. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The tail of the test distribution. The test distribution. The test statistic which would generate the given p-value. Two-sample Kolmogorov-Smirnov (KS) test. The Kolmogorov-Smirnov test tries to determine if two samples have been drawn from the same probability distribution. The Kolmogorov-Smirnov test has an interesting advantage in that it does not require any assumptions about the data.
The distribution of the K-S test statistic does not depend on which distribution is being tested. The K-S test also has the advantage of being an exact test (other tests, such as the chi-square goodness-of-fit test, depend on an adequate sample size). One disadvantage is that it requires a fully defined distribution which should not have been estimated from the data. If the parameters of the theoretical distribution have been estimated from the data, the critical region of the K-S test will no longer be valid. The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples. This class uses an efficient and high-accuracy algorithm based on work by Richard Simard (2010). Please see for more details. References: Wikipedia, The Free Encyclopedia. Kolmogorov-Smirnov Test. Available at: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test NIST/SEMATECH e-Handbook of Statistical Methods. Kolmogorov-Smirnov Goodness-of-Fit Test. Available at: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm Richard Simard, Pierre L’Ecuyer. Computing the Two-Sided Kolmogorov-Smirnov Distribution. Journal of Statistical Software. Volume VV, Issue II. Available at: http://www.iro.umontreal.ca/~lecuyer/myftp/papers/ksdist.pdf Kirkman, T.W. (1996) Statistics to Use. Available at: http://www.physics.csbsju.edu/stats/ In the following example, we will be creating a K-S test to verify if two samples have been drawn from different populations. In this example, we will first generate a number of samples from two different distributions and then check if the K-S can indeed see the difference between them: // Generate 25 points from a Normal distribution with mean 5 and standard deviation 1 double[] sample1 = new NormalDistribution(mean: 5, stdDev: 1).Generate(25); // Generate 25 points from a uniform distribution from 0 to 10 double[] sample2 = new UniformContinuousDistribution(a: 0, b: 10).Generate(25); // Now we can create a K-S test and test the unequal hypothesis: var test = new TwoSampleKolmogorovSmirnovTest(sample1, sample2, TwoSampleKolmogorovSmirnovTestHypothesis.SamplesDistributionsAreUnequal); bool significant = test.Significant; // outputs true The following example comes from the stats page of the College of Saint Benedict and Saint John's University (Kirkman, 1996). It is a very interesting example as it shows a case in which a t-test fails to see a difference between the samples because of the non-normality of the samples' distributions. The Kolmogorov-Smirnov nonparametric test, on the other hand, succeeds. The example deals with the preference of bees between two nearby blooming trees in an empty field. The experimenter has collected data measuring how much time a bee spends near a particular tree. The time starts to be measured when a bee first touches the tree, and is stopped when the bee moves more than 1 meter away from it. The samples below represent the measured time, in seconds, of the observed bees for each of the trees.
double[] redwell = { 23.4, 30.9, 18.8, 23.0, 21.4, 1, 24.6, 23.8, 24.1, 18.7, 16.3, 20.3, 14.9, 35.4, 21.6, 21.2, 21.0, 15.0, 15.6, 24.0, 34.6, 40.9, 30.7, 24.5, 16.6, 1, 21.7, 1, 23.6, 1, 25.7, 19.3, 46.9, 23.3, 21.8, 33.3, 24.9, 24.4, 1, 19.8, 17.2, 21.5, 25.5, 23.3, 18.6, 22.0, 29.8, 33.3, 1, 21.3, 18.6, 26.8, 19.4, 21.1, 21.2, 20.5, 19.8, 26.3, 39.3, 21.4, 22.6, 1, 35.3, 7.0, 19.3, 21.3, 10.1, 20.2, 1, 36.2, 16.7, 21.1, 39.1, 19.9, 32.1, 23.1, 21.8, 30.4, 19.62, 15.5 }; double[] whitney = { 16.5, 1, 22.6, 25.3, 23.7, 1, 23.3, 23.9, 16.2, 23.0, 21.6, 10.8, 12.2, 23.6, 10.1, 24.4, 16.4, 11.7, 17.7, 34.3, 24.3, 18.7, 27.5, 25.8, 22.5, 14.2, 21.7, 1, 31.2, 13.8, 29.7, 23.1, 26.1, 25.1, 23.4, 21.7, 24.4, 13.2, 22.1, 26.7, 22.7, 1, 18.2, 28.7, 29.1, 27.4, 22.3, 13.2, 22.5, 25.0, 1, 6.6, 23.7, 23.5, 17.3, 24.6, 27.8, 29.7, 25.3, 19.9, 18.2, 26.2, 20.4, 23.3, 26.7, 26.0, 1, 25.1, 33.1, 35.0, 25.3, 23.6, 23.2, 20.2, 24.7, 22.6, 39.1, 26.5, 22.7 }; // Create a t-test as a first attempt. var t = new TwoSampleTTest(redwell, whitney); Console.WriteLine("T-Test"); Console.WriteLine("Test p-value: " + t.PValue); // ~0.837 Console.WriteLine("Significant? " + t.Significant); // false // Create a non-parametric Kolmogorov-Smirnov test var ks = new TwoSampleKolmogorovSmirnovTest(redwell, whitney); Console.WriteLine("KS-Test"); Console.WriteLine("Test p-value: " + ks.PValue); // ~0.038 Console.WriteLine("Significant? " + ks.Significant); // true Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the first empirical distribution being tested. Gets the second empirical distribution being tested. Creates a new Two-Sample Kolmogorov test. The first sample. The second sample. Creates a new Two-Sample Kolmogorov test. The first sample. The second sample. The alternative hypothesis (research hypothesis) to test. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Two-sample Student's T test. The two-sample t-test assesses whether the means of two groups are statistically different from each other. References: Wikipedia, The Free Encyclopedia. Student's T-Test. William M.K. Trochim. The T-Test. Research methods Knowledge Base, 2009. Available on: http://www.le.ac.uk/bl/gat/virtualfc/Stats/ttest.html Graeme D. Ruxton. The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test. Oxford Journals, Behavioral Ecology Volume 17, Issue 4, pp. 688-690. 2006. Available on: http://beheco.oxfordjournals.org/content/17/4/688.full Gets the power analysis for the test, if available. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets whether the test assumes equal sample variance. Gets the standard error for the difference. Gets the combined sample variance. Gets the estimated value for the first sample. Gets the estimated value for the second sample. Gets the hypothesized difference between the two estimated values. Gets the actual difference between the two estimated values. Gets the degrees of freedom for the test statistic. Gets the 95% confidence interval for the statistic. Gets a confidence interval for the statistic within the given confidence level percentage. The confidence level. Default is 0.95. 
A confidence interval for the estimated value. Tests whether the means of two samples are different. The first sample. The second sample. The hypothesized sample difference. True to assume equal variances, false otherwise. Default is true. The alternative hypothesis (research hypothesis) to test. Tests whether the means of two samples are different. The first sample's mean. The second sample's mean. The first sample's variance. The second sample's variance. The number of observations in the first sample. The number of observations in the second sample. True to assume equal variances, false otherwise. Default is true. The hypothesized sample difference. The alternative hypothesis (research hypothesis) to test. Creates a new two-sample T-Test. Computes the T-Test. Update event. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Two-sample Z-Test. References: Wikipedia, The Free Encyclopedia. Z-Test. Available on: http://en.wikipedia.org/wiki/Z-test Gets the power analysis for the test, if available. Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets the standard error for the difference. Gets the estimated value for the first sample. Gets the estimated value for the second sample. Gets the hypothesized difference between the two estimated values. Gets the actual difference between the two estimated values. Gets the 95% confidence interval for the statistic. Gets a confidence interval for the statistic within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Constructs a Z test. The first data sample. The second data sample. The hypothesized sample difference. The alternative hypothesis (research hypothesis) to test. Constructs a Z test. The first sample's mean. The second sample's mean. The first sample's variance. The second sample's variance. The number of observations in the first sample. The number of observations in the second sample. The hypothesized sample difference. The alternative hypothesis (research hypothesis) to test. Constructs a Z test. Computes the Z test. Computes the Z test. Computes the Z test. Update event. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic. Wald's Test using the Normal distribution. The Wald test is a parametric statistical test named after Abraham Wald with a great variety of uses. Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of the parameter based on the sample estimate. Under the Wald statistical test, the maximum likelihood estimate of the parameter(s) of interest θ is compared with the proposed value θ', with the assumption that the difference between the two will be approximately normal. References: Wikipedia, The Free Encyclopedia. Wald Test. Available on: http://en.wikipedia.org/wiki/Wald_test Constructs a Wald's test. The test statistic, as given by (θ-θ')/SE. Constructs a Wald's test. The estimated value (θ). The hypothesized value (θ'). The standard error of the estimation (SE).
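As described above, the Wald statistic is just the estimate minus the hypothesized value, divided by the standard error, and is then referred to the Normal distribution. A minimal sketch follows; the numeric values are purely illustrative, and the constructor call assumes the parameter order given in the documentation above (estimated value, hypothesized value, standard error):

// Illustrative values: an estimated model parameter, the value hypothesized
// under the null hypothesis, and the standard error of the estimate.
double estimate = 0.43;
double hypothesizedValue = 0.0;
double standardError = 0.17;

// The Wald statistic as documented above: (θ - θ') / SE ≈ 2.53
double z = (estimate - hypothesizedValue) / standardError;

// The same computation can be delegated to the test class, which also
// produces the corresponding p-value and significance flag:
var wald = new WaldTest(estimate, hypothesizedValue, standardError);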
One-sample Z-Test (location test). The term Z-test is often used to refer specifically to the one-sample location test comparing the mean of a set of measurements to a given constant. Due to the central limit theorem, many test statistics are approximately normally distributed for large samples. Therefore, many statistical tests can be performed as approximate Z-tests if the sample size is large. If the test is , the null hypothesis can be rejected in favor of the alternate hypothesis specified at the creation of the test. This test supports creating power analyses through its property. References: Wikipedia, The Free Encyclopedia. Z-Test. Available on: http://en.wikipedia.org/wiki/Z-test This example has been gathered from Wikipedia's page about the Z-Test, available from: http://en.wikipedia.org/wiki/Z-test Suppose there is a text comprehension test being run across a given demographic region. The mean score of the population from this entire region is around 100 points, with a standard deviation of 12 points. There is a local school, however, whose 55 students attained an average score in the test of only about 96 points. Would their scores be surprisingly low, or could this event have happened due to chance? // So we would like to check that a sample of // 55 students with a mean score of 96 points: int sampleSize = 55; double sampleMean = 96; // Was expected to have happened by chance in a population with // an hypothesized mean of 100 points and standard deviation of // about 12 points: double standardDeviation = 12; double hypothesizedMean = 100; // So we start by creating the test: ZTest test = new ZTest(sampleMean, standardDeviation, sampleSize, hypothesizedMean, OneSampleHypothesis.ValueIsSmallerThanHypothesis); // Now, we can check whether this result would be // unlikely under a standard significance level: bool significant = test.Significant; // We can also check the test statistic and its P-Value double statistic = test.Statistic; double pvalue = test.PValue; Gets the power analysis for the test, if available. Gets the standard error of the estimated value. Gets the estimated value, such as the mean estimated from a sample. Gets the hypothesized value. Gets the 95% confidence interval for the . Gets the alternative hypothesis under test. If the test is , the null hypothesis can be rejected in favor of this alternative hypothesis. Gets a confidence interval for the statistic within the given confidence level percentage. The confidence level. Default is 0.95. A confidence interval for the estimated value. Constructs a Z test. The data samples from which the test will be performed. The constant to be compared with the samples. The alternative hypothesis (research hypothesis) to test. Constructs a Z test. The sample's mean. The sample's standard error. The hypothesized value for the distribution's mean. The alternative hypothesis (research hypothesis) to test. Constructs a Z test. The sample's mean. The sample's standard deviation. The hypothesized value for the distribution's mean. The sample's size. The alternative hypothesis (research hypothesis) to test. Constructs a Z test. The test statistic, as given by (x-μ)/SE. The alternate hypothesis to test. Computes the Z test. Computes the Z test. Constructs a T-Test. Update event. Converts a given p-value to a test statistic. The p-value. The test statistic which would generate the given p-value. Converts a given test statistic to a p-value. The value of the test statistic. The p-value for the given statistic.
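Returning to the school example above, the two values left without comments in the snippet can be verified by hand from the summary numbers: the standard error of the mean is σ/√n and the statistic is the standardized difference between the sample mean and the hypothesized mean (a hand check, not output produced by the library):

// Standard error of the mean: σ / sqrt(n)
double standardError = 12.0 / Math.Sqrt(55);   // ≈ 1.618

// Z statistic: (sample mean - hypothesized mean) / standard error
double z = (96.0 - 100.0) / standardError;     // ≈ -2.47

// Under the lower-tailed alternative used above, the p-value is the
// probability that a standard Normal variable falls below z, roughly 0.007,
// so the result is significant at the usual 0.05 level.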
Converts a given test statistic to a p-value. The value of the test statistic. The tail of the test distribution. The p-value for the given statistic. Converts a given p-value to a test statistic. The p-value. The tail of the test distribution. The test statistic which would generate the given p-value. Hypothesis type The type of the hypothesis being made expresses the way in which a value of a parameter may deviate from that assumed in the null hypothesis. It can either state that a value is higher, lower or simply different than the one assumed under the null hypothesis. The test considers the two tails from a probability distribution. The two-tailed test is a statistical test in which a given statistical hypothesis, H0 (the null hypothesis), will be rejected when the value of the test statistic is either sufficiently small or sufficiently large. The test considers the upper (right) tail from a probability distribution. The one-tailed, upper tail test is a statistical test in which a given statistical hypothesis, H0 (the null hypothesis), will be rejected when the value of the test statistic is sufficiently large. The test considers the lower (left) tail from a probability distribution. The one-tailed, lower tail test is a statistical test in which a given statistical hypothesis, H0 (the null hypothesis), will be rejected when the value of the test statistic is sufficiently small. Common test Hypothesis for one sample tests, such as and . Tests if the mean (or the parameter under test) is significantly different from the hypothesized value, without considering the direction for this difference. Tests if the mean (or the parameter under test) is significantly greater (larger, bigger) than the hypothesized value. Tests if the mean (or the parameter under test) is significantly smaller (lesser) than the hypothesized value. Common test Hypothesis for two sample tests, such as and . Tests if the mean (or the parameter under test) of the first sample is different from the mean of the second sample, without considering any particular direction for the difference. Tests if the mean (or the parameter under test) of the first sample is greater (larger, bigger) than the mean of the second sample. Tests if the mean (or the parameter under test) of the first sample is smaller (lesser) than the mean of the second sample. Hypothesis for the one-sample Kolmogorov-Smirnov test. Tests whether the sample's distribution is different from the reference distribution. Tests whether the distribution of one sample is greater than the reference distribution, in a statistical sense. Tests whether the distribution of one sample is smaller than the reference distribution, in a statistical sense. Test hypothesis for the two-sample Kolmogorov-Smirnov tests. Tests whether samples have been drawn from significantly unequal distributions. Tests whether the distribution of one sample is greater than the other, in a statistical sense. Tests whether the distribution of one sample is smaller than the other, in a statistical sense. Hypothesis for the one-sample Grubb's test. Tests whether there is at least one outlier in the data. Tests whether the maximum value in the data is actually an outlier. Tests whether the minimum value in the data is actually an outlier. Scatter Plot class. Gets the integer label associated with this class. Gets the indices of all points of this class. Gets all X values of this class. Gets all Y values of this class. Gets or sets the class' text label. 
Returns an enumerator that iterates through a collection. An object that can be used to iterate through the collection. Returns an enumerator that iterates through a collection. An object that can be used to iterate through the collection. Collection of Histogram bins. This class cannot be instantiated. Searches for a bin containing the specified value. The value to search for. The histogram bin containing the searched value. Attempts to find the index of the bin containing the specified value. The value to search for. The index of the bin containing the specified value. Histogram Bin. A "bin" is a container, where each element stores the total number of observations of a sample whose values lie within a given range. A histogram of a sample consists of a list of such bins whose range does not overlap with each other; or in other words, bins that are mutually exclusive. Unless is true, the ranges of all bins i are defined as Edge[i] <= x < Edge[i+1]. Otherwise, the last bin will have an inclusive upper bound (i.e. it will be defined as Edge[i] <= x <= Edge[i+1]). Gets the actual range of data this bin represents. Gets the Width (range) for this histogram bin. Gets the Value (number of occurrences of a variable in a range) for this histogram bin. Gets whether the Histogram Bin contains the given value. Optimum histogram bin size adjustment rule. Does not attempt to automatically calculate an optimum bin width, and preserves the current histogram organization. Calculates the optimum bin width as 3.49σN^(-1/3), where σ is the sample standard deviation and N is the number of samples. Scott, D. 1979. On optimal and data-based histograms. Biometrika, 66:605-610. Calculates the optimum number of bins as ceiling( log2(N) + 1 ), where N is the number of samples. The rule implicitly bases the bin sizes on the range of the data, and can perform poorly if N < 30. Calculates the optimum number of bins as the square root of the number of samples. This is the same rule used by Microsoft Excel and many others. Histogram. In a more general mathematical sense, a histogram is a mapping Mi that counts the number of observations that fall into various disjoint categories (known as bins). This class represents a Histogram mapping of Discrete or Continuous data. To use it as a discrete mapping, pass a bin size (length) of 1. To use it as a continuous mapping, pass any real number instead. Currently, only a constant bin width is supported. Constructs an empty histogram. Constructs an empty histogram. The values to be binned in the histogram. Constructs an empty histogram. The values to be binned in the histogram. Initializes the histogram's bins. Sets the histogram's bin ranges (edges). Updates the statistical values of the histogram. The method recalculates statistical values of the histogram, like mean, standard deviation, etc., in case the histogram's values were changed directly. The method should be called only if the histogram's values were retrieved through the property and updated after that. Gets the Bin values of this Histogram. Bin index. The number of hits of the selected bin. Gets the Bin values for this Histogram. Gets the Range of the values in this Histogram. Gets the edges of each bin in this Histogram. Gets the collection of bins of this Histogram. Gets or sets whether this histogram represents a cumulative distribution. Gets or sets the bin size auto adjustment rule to be used when computing this histogram from new data. Default is . The bin size auto adjustment rule. Mean value.
The property allows retrieving the mean value of the histogram. Standard deviation. The property allows retrieving the standard deviation value of the histogram. Median value. The property allows retrieving the median value of the histogram. Minimum value. The property allows retrieving the minimum value of the histogram with a non-zero hits count. Maximum value. The property allows retrieving the maximum value of the histogram with a non-zero hits count. Total count of values. The property represents the total count of values contributed to the histogram, which is essentially the sum of the array. Gets or sets a value indicating whether the last bin should have an inclusive upper bound. Default is true. If set to false, the last bin's range will be defined as Edge[i] <= x < Edge[i+1]. If set to true, the last bin will have an inclusive upper bound and be defined as Edge[i] <= x <= Edge[i+1] instead. true if the last bin should have an inclusive upper bound; false otherwise. Gets the range around the median containing the specified percentage of values. The percentage of values around the median. Returns the range which contains the specified percentage of values. The method calculates the range of the stochastic variable whose cumulative probability comprises the specified percentage of the histogram's hits. Sample usage: // create histogram Histogram histogram = new Histogram( new int[10] { 0, 0, 1, 3, 6, 8, 11, 0, 0, 0 } ); // get 50% range IntRange range = histogram.GetRange( 0.5 ); // show the range ([4, 6]) Console.WriteLine( "50% range = [" + range.Min + ", " + range.Max + "]" ); Computes (populates) a Histogram mapping with values from a sample. The values to be binned in the histogram. The desired width for the histogram's bins. Computes (populates) a Histogram mapping with values from a sample. The values to be binned in the histogram. The desired number of histogram's bins. Computes (populates) a Histogram mapping with values from a sample. The values to be binned in the histogram. The desired number of histogram's bins. Whether to include an extra upper bin going to infinity. Computes (populates) a Histogram mapping with values from a sample. The values to be binned in the histogram. The desired number of histogram's bins. The desired width for the histogram's bins. Computes (populates) a Histogram mapping with values from a sample. The values to be binned in the histogram. Actually computes the histogram. Computes the optimum number of bins based on a . Integer array implicit conversion. Converts this histogram into an integer array representation. Creates a histogram of values from a sample. The values to be binned in the histogram. A histogram reflecting the distribution of values in the sample. Subtracts one histogram from the other, storing results in a new histogram, without changing the current instance. The histogram whose bin values will be subtracted. A new containing the result of this operation. Subtracts one histogram from the other, storing results in a new histogram, without changing the current instance. The histogram whose bin values will be subtracted. A new containing the result of this operation. Adds one histogram to the other, storing results in a new histogram, without changing the current instance. The histogram whose bin values will be added. A new containing the result of this operation. Adds one histogram to the other, storing results in a new histogram, without changing the current instance. The histogram whose bin values will be added. A new containing the result of this operation.
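A short usage sketch based on the members described above follows. The exact overload and member names used here (Compute, Bins, and the bin's Value property) are assumptions based on the descriptions in this section and may differ slightly from the actual API:

// Bin a small sample of continuous values into 5 bins and inspect the result.
// The member names below are assumptions based on the descriptions above.
double[] values = { 0.1, 0.4, 0.4, 0.7, 1.2, 1.3, 1.9, 2.2, 2.5, 3.1 };

Histogram hist = new Histogram();
hist.Compute(values, 5); // second argument: desired number of bins

int index = 0;
foreach (var bin in hist.Bins)
    Console.WriteLine("bin " + (index++) + ": " + bin.Value + " hits");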
Multiplies one histogram by the other, storing results in a new histogram, without changing the current instance. The histogram whose bin values will be multiplied. A new containing the result of this operation. Multiplies one histogram by the other, storing results in a new histogram, without changing the current instance. The value to be multiplied. A new containing the result of this operation. Adds a value to each histogram bin. The value to be added. A new containing the result of this operation. Subtracts a value from each histogram bin. The value to be subtracted. A new containing the result of this operation. Creates a new object that is a copy of the current instance. A new object that is a copy of this instance. Scatter Plot. Gets the title of the scatter plot. Gets the name of the X-axis. Gets the name of the Y-axis. Gets the name of the label axis. Gets the values associated with the X-axis. Gets the corresponding Y values associated with each X. Gets the label of each (x,y) pair. Gets an integer array containing the integer labels associated with each of the classes in the scatter plot. Gets the class labels for each of the classes in the plot. Gets a collection containing information about each of the classes presented in the scatter plot. Constructs an empty Scatter plot. Constructs an empty Scatter plot with given title. Scatter plot title. Constructs an empty scatter plot with given title and axis names. Scatter Plot title. Title for the x-axis. Title for the y-axis. Constructs an empty Scatter Plot with given title and axis names. Scatter Plot title. Title for the x-axis. Title for the y-axis. Title for the labels. Computes the scatter plot. Array of values. Computes the scatter plot. Array of X values. Array of corresponding Y values. Computes the scatter plot. Array of X values. Array of corresponding Y values. Array of integer labels defining a class for each (x,y) pair. Computes the scatter plot. Array of { x,y } values. Computes the scatter plot. Array of { x,y } values. Array of integer labels defining a class for each (x,y) pair. Computes the scatter plot. Array of { x,y } values. Computes the scatter plot. Array of { x,y } values. Array of integer labels defining a class for each (x,y) pair. Bhattacharyya distance. Initializes a new instance of the class. Bhattacharyya distance between two histograms. The first histogram. The second histogram. The Bhattacharyya distance between the two histograms. Bhattacharyya distance between two histograms. The first histogram. The second histogram. The Bhattacharyya distance between the two histograms. Bhattacharyya distance between two histograms. The first histogram. The second histogram. The Bhattacharyya distance between the two histograms. Bhattacharyya distance between two datasets, assuming their contents can be modelled by multivariate Gaussians. The first dataset. The second dataset. The Bhattacharyya distance between the two datasets. Bhattacharyya distance between two datasets, assuming their contents can be modelled by multivariate Gaussians. The first dataset. The second dataset. The Bhattacharyya distance between the two datasets. Bhattacharyya distance between two Gaussian distributions. Mean for the first distribution. Covariance matrix for the first distribution. Mean for the second distribution. Covariance matrix for the second distribution. The Bhattacharyya distance between the two distributions. Bhattacharyya distance between two Gaussian distributions. Mean for the first distribution. Covariance matrix for the first distribution.
Mean for the second distribution. Covariance matrix for the second distribution. The Bhattacharyya distance between the two distributions. Bhattacharyya distance between two Gaussian distributions. Mean for the first distribution. Covariance matrix for the first distribution. Mean for the second distribution. Covariance matrix for the second distribution. The logarithm of the determinant for the covariance matrix of the first distribution. The logarithm of the determinant for the covariance matrix of the second distribution. The Bhattacharyya distance between the two distributions. Bhattacharyya distance between two Gaussian distributions. Mean for the first distribution. Covariance matrix for the first distribution. Mean for the second distribution. Covariance matrix for the second distribution. The logarithm of the determinant for the covariance matrix of the first distribution. The logarithm of the determinant for the covariance matrix of the second distribution. The Bhattacharyya distance between the two distributions. Bhattacharyya distance between two Gaussian distributions. The first Normal distribution. The second Normal distribution. The Bhattacharyya distance between the two distributions. Log-likelihood distance between a sample and a statistical distribution. The type of the distribution. Initializes a new instance of the class. Computes the distance d(x,y) between points and . The first point x. The second point y. A double-precision value representing the distance d(x,y) between and according to the distance function implemented by this class. Computes the distance d(x,y) between points and . The first point x. The second point y. A double-precision value representing the distance d(x,y) between and according to the distance function implemented by this class.
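For two univariate Gaussians the Bhattacharyya distance described above reduces to a closed-form expression in the two means and variances. The sketch below computes that standard formula directly rather than through the library, simply to make the definition concrete:

// Bhattacharyya distance between two univariate Gaussians N(m1, s1^2) and N(m2, s2^2):
// DB = 1/4 * ln( 1/4 * (s1^2/s2^2 + s2^2/s1^2 + 2) ) + 1/4 * (m1 - m2)^2 / (s1^2 + s2^2)
static double Bhattacharyya(double m1, double s1, double m2, double s2)
{
    double v1 = s1 * s1, v2 = s2 * s2;
    double term1 = 0.25 * Math.Log(0.25 * (v1 / v2 + v2 / v1 + 2));
    double term2 = 0.25 * (m1 - m2) * (m1 - m2) / (v1 + v2);
    return term1 + term2;
}

// Identical distributions are at distance zero; the distance grows as the
// means separate or as the variances diverge:
double same = Bhattacharyya(0, 1, 0, 1);    // 0
double apart = Bhattacharyya(0, 1, 3, 1);   // 1.125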