Machine learning classes¶
- class o2sclpy.kde_sklearn¶
Use scikit-learn to generate a KDE.
This is an experimental interface to provide easier interaction with C++.
Todo
Fix the comparison between sklearn and scipy, making sure they both produce the same log_pdf() in the correct conditions. Ensure the integral is normalized when appropriate.
- get_bandwidth()¶
Return the bandwidth
- log_pdf(x)¶
Return the log likelihood
- pdf(x)¶
Return the likelihood
- sample(n_samples=1)¶
Sample the KDE
- set_data(in_data, bw_array, verbose=0, kernel='gaussian', metric='euclidean', outformat='numpy', transform='unit', bandwidth='none')¶
Fit the KDE to the specified input data, a numpy array of shape (n_samples, n_coordinates).
- set_data_str(in_data, bw_array, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
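A minimal usage sketch for kde_sklearn follows. The data values are illustrative, and treating bw_array as a list of candidate bandwidths (and passing a single point to log_pdf()) are assumptions based on the signatures above, not part of the documented interface.

    import numpy as np
    import o2sclpy

    # Example data: 1000 samples in 2 dimensions,
    # shape (n_samples, n_coordinates)
    data = np.random.default_rng(42).normal(size=(1000, 2))

    kde = o2sclpy.kde_sklearn()
    # Fit the KDE; bw_array is assumed to hold candidate bandwidths
    kde.set_data(data, [0.05, 0.1, 0.5, 1.0], verbose=1, kernel='gaussian')

    print('Selected bandwidth:', kde.get_bandwidth())
    print('log_pdf at origin:', kde.log_pdf([0.0, 0.0]))
    print('Three new samples:', kde.sample(n_samples=3))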
- class o2sclpy.kde_scipy¶
Use scipy to generate a KDE.
This is an experimental and very simplified interface, mostly to provide easier interaction with C++.
- get_bandwidth()¶
Return the bandwidth
- log_pdf(x)¶
Return the log likelihood
- pdf(x)¶
Return the likelihood
- sample(n_samples=1)¶
Sample the Gaussian mixture model
- set_data(in_data, verbose=0, weights=None, outformat='numpy', bw_method=None, transform='unit')¶
Fit the KDE to the specified input data, a numpy array of shape (n_samples, n_coordinates).
- set_data_str(in_data, weights, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
- string_to_dict(s)¶
Convert a string to a dictionary, converting strings to values when necessary.
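A corresponding sketch for kde_scipy, under the same caveats; the assumption that bw_method=None falls back to scipy's default bandwidth rule is based on the signature above.

    import numpy as np
    import o2sclpy

    # Example data, shape (n_samples, n_coordinates)
    data = np.random.default_rng(0).normal(size=(500, 2))

    kde = o2sclpy.kde_scipy()
    # bw_method=None is assumed to use scipy's default bandwidth rule
    kde.set_data(data, verbose=0, bw_method=None)

    print('Bandwidth:', kde.get_bandwidth())
    print('pdf at origin:', kde.pdf([0.0, 0.0]))
    print('One new sample:', kde.sample())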
- class o2sclpy.gmm_sklearn¶
Use scikit-learn to generate a Gaussian mixture model of a specified set of data.
This is an experimental interface to provide easier interaction with C++.
- components(v)¶
For a point (or set of points) specified in v, use the Gaussian mixture to compute the density (or densities) of each component as a contiguous numpy array. Each array has entries which sum to 1.
- get_data()¶
Return the properties of the Gaussian mixture model as contiguous numpy arrays. This function returns, in order, the weights, the means, the covariances, the precisions (the inverse of the covariances), and the Cholesky decomposition of the precisions.
- log_pdf(x)¶
Return the per-sample average log likelihood of the data as a single floating point value given the vector or vectors specified in x.
- o2graph_to_gmm(o2scl, amp, link, args)¶
The function providing the ‘to-gmm’ command for o2graph.
- predict(v)¶
Predict the labels (the index of the Gaussian) given a vector or vectors v and return them in a one-dimensional numpy array with data type int64.
- sample(n_samples=1)¶
Sample the Gaussian mixture model, returning a tuple with two components, the first being a 2D array of the coordinates of the new samples and the second being a 1D array of the labels for each new sample.
- score_samples(x)¶
Given a vector (or list of vectors) in x, return the log likelihood at each point as a numpy array.
- set_data(in_data, verbose=0, n_components=2, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1)¶
Fit the mixture model with the specified input data, a numpy array of shape (n_samples, n_coordinates).
- set_data_str(in_data, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
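A short, hedged sketch of the gmm_sklearn workflow; the clustered example data and the choice of two components are illustrative.

    import numpy as np
    import o2sclpy

    # Two well-separated clusters as example data
    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(-2.0, 0.5, size=(500, 2)),
                      rng.normal(+2.0, 0.5, size=(500, 2))])

    gm = o2sclpy.gmm_sklearn()
    gm.set_data(data, n_components=2)

    # Per-component membership probabilities at a point (sum to 1)
    print(gm.components([0.0, 0.0]))
    # Component labels for two points, then a few new samples
    print(gm.predict([[-2.0, -2.0], [2.0, 2.0]]))
    coords, labels = gm.sample(n_samples=5)
    print(coords, labels)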
- class o2sclpy.bgmm_sklearn¶
Use scikit-learn to generate a Bayesian Gaussian mixture model of a specified set of data.
This is an experimental interface to provide easier interaction with C++.
- components(v)¶
For a point (or set of points) specified in
v
, use the Gaussian mixture at to compute the density (or densities) of each component as a contiguous numpy array. Each array will have entries which sum to 1.
- get_data()¶
Return the properties of the Gaussian mixture model as contiguous numpy arrays. This function returns, in order, the weights, the means, the covariances, the precisions (the inverse of the covariances), and the Cholesky decomposition of the precisions.
- log_pdf(x)¶
Return the per-sample average log likelihood of the data as a single floating point value given the vector or vectors specified in x.
- o2graph_to_bgmm(o2scl, amp, link, args)¶
The function providing the ‘to-bgmm’ command for o2graph.
- predict(v)¶
Predict the labels (the index of the Gaussian) given a vector or vectors v and return them in a one-dimensional numpy array with data type int64.
- sample(n_samples=1)¶
Sample the Gaussian mixture model, returning a tuple with two components, the first being a 2D array of the coordinates of the new samples and the second being a 1D array of the labels for each new sample.
- score_samples(x)¶
Given a vector (or list of vectors) in x, return the log likelihood at each point as a numpy array.
- set_data(in_data, verbose=0, n_components=2, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1)¶
Fit the mixture model with the specified input data, a numpy array of shape (n_samples, n_coordinates).
- set_data_str(in_data, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
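The Bayesian variant follows the same pattern; a brief sketch with illustrative data, where the comment about n_components acting as an upper bound reflects the usual behavior of a Bayesian GMM rather than anything stated above.

    import numpy as np
    import o2sclpy

    rng = np.random.default_rng(2)
    data = np.vstack([rng.normal(-2.0, 0.5, size=(500, 2)),
                      rng.normal(+2.0, 0.5, size=(500, 2))])

    bgm = o2sclpy.bgmm_sklearn()
    # For a Bayesian GMM, n_components acts as an upper bound; the
    # fit can assign negligible weight to unneeded components
    bgm.set_data(data, n_components=4)

    weights, means, covars, precs, chol = bgm.get_data()
    print('Weights:', weights)
    print('Log likelihood at origin:', bgm.score_samples([[0.0, 0.0]]))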
- class o2sclpy.interpm_sklearn_gp¶
Interpolate one or many multidimensional data sets using a Gaussian process from scikit-learn.
- eval(v)¶
Evaluate the GP at point v.
- eval_unc(v)¶
Evaluate the GP and its uncertainty at point v. Note that o2scl::interpm_python.eval_unc() expects the return type to be a tuple of numpy arrays.
- set_data(in_data, out_data, kernel='1.0*RBF(1.0,(1e-2,1e2))', test_size=0.0, normalize_y=True, outformat='numpy', verbose=0)¶
Set the input and output data to train the interpolator
- set_data_str(in_data, out_data, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
- string_to_dict(s)¶
Convert a string to a dictionary, converting strings to values when necessary.
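A minimal sketch of training and evaluating the GP interpolator. The (n_samples, n_outputs) shape assumed for out_data and the example function are assumptions, not part of the documented interface above.

    import numpy as np
    import o2sclpy

    # 50 input points in 2D and one output value per point
    rng = np.random.default_rng(3)
    x = rng.uniform(0.0, 1.0, size=(50, 2))
    y = (np.sin(np.pi * x[:, 0]) * np.cos(np.pi * x[:, 1])).reshape((50, 1))

    gp = o2sclpy.interpm_sklearn_gp()
    gp.set_data(x, y, verbose=1)

    # Point evaluation, and evaluation with uncertainty
    print('Value at (0.5,0.5):', gp.eval([0.5, 0.5]))
    val, unc = gp.eval_unc([0.5, 0.5])
    print('Value and uncertainty:', val, unc)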
- class o2sclpy.interpm_tf_dnn¶
Interpolate one or many multidimensional data sets using a deep neural network from TensorFlow.
- eval(v)¶
Evaluate the NN at point v.
- set_data(in_data, out_data, outformat='numpy', verbose=0, activations=['relu', 'relu'], batch_size=None, epochs=100, transform='none', test_size=0.0, evaluate=False, hlayers=[8, 8], loss='mean_squared_error')¶
Set the input and output data to train the interpolator
Some activation functions and their ranges: relu [0,∞], sigmoid [0,1], tanh [-1,1].
Transformations: 'quantile' transforms to [0,1]; 'MinMaxScaler' transforms to [a,b].
- set_data_str(in_data, out_data, options)¶
Set the input and output data to train the interpolator, using a string to specify the keyword arguments.
- string_to_dict(s)¶
Convert a string to a dictionary, converting strings to values when necessary.
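A comparable sketch for the TensorFlow interpolator (TensorFlow must be installed). The out_data shape and the training settings shown are illustrative assumptions based on the set_data() signature above.

    import numpy as np
    import o2sclpy

    # 200 input points in 2D and one output value per point
    rng = np.random.default_rng(4)
    x = rng.uniform(0.0, 1.0, size=(200, 2))
    y = (x[:, 0]**2 + x[:, 1]**2).reshape((200, 1))

    dnn = o2sclpy.interpm_tf_dnn()
    # Two hidden layers of 16 units with relu activations and a
    # 10 percent validation split
    dnn.set_data(x, y, hlayers=[16, 16], activations=['relu', 'relu'],
                 epochs=200, test_size=0.1, verbose=1)

    print('Prediction at (0.5,0.5):', dnn.eval([0.5, 0.5]))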