Examples
Classification
Below is a complete example for binary classification on the Wisconsin breast cancer dataset:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from honeio.integrations.sklearn.qcmlsklearn import QCMLClassifier
# Load the breast cancer dataset
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Initialize the QCMLClassifier model
model = QCMLClassifier(hilbert_space_dim=4, lr=0.01, epochs=100)
# Train the model
model.fit(X_train, y_train)
# Generate predictions
label_forecasts = model.predict(X_test)
label_forecasts_prob = model.predict_proba(X_test)
# Display results
print(f"Label forecasts: \n {label_forecasts[:10]}")
print(f"Label probability forecasts: \n {label_forecasts_prob[:10]}")
Running the code above produces output similar to the following:
2025-08-07 11:16:36 [warning ]
You are using the community edition of honeio.
There are some limitations that can be lifted by purchasing a commercial license.
Please contact [email protected] for more information.
Label forecasts:
[1 0 0 1 1 0 0 0 1 1]
Label probability forecasts:
[[0.01897758 0.9810224 ]
[0.97935164 0.02064834]
[0.9840843 0.01591573]
[0.01646753 0.9835324 ]
[0.0158821 0.98411787]
[0.98512185 0.01487813]
[0.98319155 0.01680848]
[0.97874296 0.02125704]
[0.2479068 0.75209326]
[0.01639351 0.98360646]]
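As the output suggests, each row of the probability matrix sums to one, and the hard labels correspond to the higher-probability class. A quick sanity check, assuming QCMLClassifier follows the usual scikit-learn convention relating predict and predict_proba:

import numpy as np
# Each row of the probability matrix should sum to 1
assert np.allclose(label_forecasts_prob.sum(axis=1), 1.0)
# Hard labels should match the argmax of the predicted probabilities
assert (label_forecasts == label_forecasts_prob.argmax(axis=1)).all()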
The performance of the model can be verified using standard classification metrics such as accuracy, precision, recall, and ROC AUC score.
from sklearn.metrics import accuracy_score, classification_report
# Evaluate the model
accuracy = accuracy_score(y_test, label_forecasts)
report = classification_report(y_test, label_forecasts)
print(f"Accuracy: {accuracy}")
print(f"Classification Report: \n {report}")
Accuracy: 0.9736842105263158
Classification Report:
               precision    recall  f1-score   support

           0       0.95      0.98      0.97        43
           1       0.99      0.97      0.98        71

    accuracy                           0.97       114
   macro avg       0.97      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114
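ROC AUC, mentioned above, is computed from the predicted probability of the positive class rather than from the hard labels. A minimal sketch reusing the predict_proba output from earlier:

from sklearn.metrics import roc_auc_score
# Use the probability assigned to the positive class (column 1)
roc_auc = roc_auc_score(y_test, label_forecasts_prob[:, 1])
print(f"ROC AUC: {roc_auc}")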
Note
If a GPU is available, you can pass device='cuda' to run your models on a GPU for faster training and inference.
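For example (a sketch; this mirrors the device parameter passed to QCMLRegressor in the cross-validation example below):

import torch
# Use the GPU when one is available, otherwise fall back to the CPU
model = QCMLClassifier(hilbert_space_dim=4, lr=0.01, epochs=100, device="cuda" if torch.cuda.is_available() else "cpu")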
Note
For best results, we recommend standardizing continuous features and one-hot encoding categorical features, as sketched below.
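For datasets that mix continuous and categorical columns, scikit-learn's ColumnTransformer can apply both preprocessing steps in one object. A minimal sketch; the column indices here are illustrative and should be adapted to your dataset:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative: columns 0-2 are continuous, columns 3-4 are categorical
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), [0, 1, 2]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [3, 4]),
    ]
)
X_train_prepared = preprocessor.fit_transform(X_train)
X_test_prepared = preprocessor.transform(X_test)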
Regression with cross-validation
In this example we compare the performance of QCMLRegressor against standard regression models such as Linear Regression and Random Forest Regressor on the California housing dataset using 5-fold cross-validation.
from honeio.integrations.sklearn.qcmlsklearn import QCMLRegressor
import pandas as pd
import torch
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import root_mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
# Load California housing dataset and set up CV
SEED = 0
K_FOLDS = 5
max_obs = 1000 # use only first 1000 observations for community edition
X, y = datasets.fetch_california_housing(return_X_y=True)
X = X[:max_obs]
y = y[:max_obs]
kf = KFold(n_splits=K_FOLDS, shuffle=True, random_state=SEED)
print(f"X shape: {X.shape}")
print(f"y shape: {y.shape}")
X shape: (1000, 8)
y shape: (1000,)
Warning
The community edition of QCML has a limitation on the maximum number of training samples. For larger datasets, consider requesting a license for the enterprise edition at https://www.qognitive.io/api-request/
The following code block runs 5-fold cross-validation for QCMLRegressor, Linear Regression, and Random Forest Regressor, and stores various error metrics for each model on each fold.
# Initialize models and run 5-fold CV
model_list = [
    QCMLRegressor(
        device="cuda" if torch.cuda.is_available() else "cpu",
        dropout_rate=0.3,
    ),
    LinearRegression(),
    RandomForestRegressor(),
]
error_funcs = [
    mean_absolute_percentage_error,
    root_mean_squared_error,
    mean_absolute_error,
    r2_score,
]
error_stats = {}
for model in model_list:
    model_name = model.__class__.__name__
    print(f"Training {model_name}...")
    for fold, (train_index, test_index) in enumerate(kf.split(X, y)):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        # Standardize the features
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)
        # Train the model using the training set
        model.fit(X_train_scaled, y_train)
        # Make predictions using the testing set
        y_pred = model.predict(X_test_scaled)
        for error_func in error_funcs:
            error_stats.setdefault((model_name, fold), {})[error_func.__name__] = error_func(y_test, y_pred)
Note
In this example we added a dropout_rate=0.3 parameter to the QCMLRegressor to improve generalization and reduce overfitting.
Finally, we can compute summary statistics to evaluate each model's performance and store the results in a pandas DataFrame for easy visualization.
# Summarize results
error_stats_df = pd.DataFrame(error_stats).T
average_error_stats = error_stats_df.groupby(level=0).mean()
average_error_stats.sort_values('mean_absolute_percentage_error', ascending=True, inplace=True)
print(average_error_stats)
The cross-validation results show that QCMLRegressor outperforms both baseline regression models on every metric:
Model                 | MAPE   | RMSE   | MAE    | R² Score
----------------------|--------|--------|--------|---------
QCMLRegressor         | 0.1226 | 0.3918 | 0.2492 | 0.8034
RandomForestRegressor | 0.1438 | 0.4273 | 0.2728 | 0.7668
LinearRegression      | 0.2143 | 0.5439 | 0.3834 | 0.6219
Blog
Follow Qognitive’s blog for more examples and tutorials: https://www.qognitive.io/blog/