Scikit-learn Integration

The sklearn integration provides quantum-enhanced estimators that are fully compatible with the scikit-learn API. This allows you to use quantum machine learning models as drop-in replacements for traditional ML algorithms in your existing workflows.

Classes

The honeio.integrations.sklearn module exposes two estimators, QCMLRegressor and QCMLClassifier. Serialization hooks live in honeio.integrations.sklearn.hooks.

Core Estimators

QCMLRegressor

The QCMLRegressor class provides quantum-enhanced regression capabilities with scikit-learn compatibility.

QCMLClassifier

The QCMLClassifier class provides quantum-enhanced classification capabilities with scikit-learn compatibility.

Utilities and Hooks

The sklearn integration also provides utilities for model serialization and deserialization.

`save_states_pickle`	Save the states of the model to pickle files.
`load_states_pickle`	Load the states of the model from pickle files.

Serialization Functions

save_states_pickle(path, *, scaler_pkl_name="scaler.pkl", weighted_layer_pkl_name="weighted_layer.pkl", model_parameters_pkl_name="model_parameters.pkl"): Save the states of the model to pickle files.
load_states_pickle(path, *, scaler_pkl_name="scaler.pkl", weighted_layer_pkl_name="weighted_layer.pkl", model_parameters_pkl_name="model_parameters.pkl"): Load the states of the model from pickle files.
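
The default arguments suggest that each call works with three pickle files located under path. The sketch below only builds those paths with pathlib. That the file names are joined onto path is an assumption, and expected_state_files is a hypothetical helper for illustration, not part of honeio.

```python
from pathlib import Path

# Default file names taken from the signatures above.
DEFAULT_NAMES = {
    "scaler_pkl_name": "scaler.pkl",
    "weighted_layer_pkl_name": "weighted_layer.pkl",
    "model_parameters_pkl_name": "model_parameters.pkl",
}


def expected_state_files(path, **overrides):
    """Return the pickle paths a save/load call would presumably use.

    Hypothetical helper: assumes each file name is joined onto `path`.
    """
    unknown = set(overrides) - set(DEFAULT_NAMES)
    if unknown:
        raise TypeError(f"unexpected keyword(s): {sorted(unknown)}")
    names = {**DEFAULT_NAMES, **overrides}
    return {key: Path(path) / name for key, name in names.items()}


files = expected_state_files("saved_models/qcml_model", scaler_pkl_name="my_scaler.pkl")
for key, file_path in files.items():
    print(f"{key}: {file_path}")
```

Because the file names are keyword-only, overriding one of them leaves the other two at their defaults.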

Usage Examples

Basic Regression Example

from honeio.integrations.sklearn import QCMLRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate sample data
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the quantum-enhanced regressor
qcml_regressor = QCMLRegressor(
    hilbert_space_dim=16,
    epochs=100,
    lr=0.01,
    random_state=42
)

qcml_regressor.fit(X_train, y_train)

# Make predictions
y_pred = qcml_regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

Basic Classification Example

from honeio.integrations.sklearn import QCMLClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the quantum-enhanced classifier
qcml_classifier = QCMLClassifier(
    hilbert_space_dim=16,
    epochs=100,
    lr=0.01,
    random_state=42
)

qcml_classifier.fit(X_train, y_train)

# Make predictions
y_pred = qcml_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Model Persistence Example

from pathlib import Path
from honeio.integrations.sklearn import QCMLRegressor
from honeio.integrations.sklearn.hooks import save_states_pickle, load_states_pickle

# Train a model (X_train, y_train, and X_test come from the regression example above)
model = QCMLRegressor(hilbert_space_dim=8, epochs=50)
model.fit(X_train, y_train)

# Set up persistence hooks
save_path = Path("./saved_models/qcml_model")
model.save_model_fn = save_states_pickle(save_path)
model.load_states_fn = load_states_pickle(save_path)

# Save the model
model.save_model()

# Load the model in a new instance
new_model = QCMLRegressor()
new_model.load_states_fn = load_states_pickle(save_path)
new_model.load_states()

# Use the loaded model
predictions = new_model.predict(X_test)

Integration with Scikit-learn Pipelines

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from honeio.integrations.sklearn import QCMLRegressor

# Create a pipeline with preprocessing and quantum model
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('qcml', QCMLRegressor(hilbert_space_dim=16, epochs=100))
])

# Train the pipeline (X_train, y_train, and X_test come from the regression example above)
pipeline.fit(X_train, y_train)

# Make predictions
predictions = pipeline.predict(X_test)

Parameter Configuration

Model Parameters

The quantum-enhanced estimators support various parameters for customization:

hilbert_space_dim (int): Dimension of the quantum Hilbert space
epochs (int): Number of training epochs
lr (float): Learning rate for optimization
weights_lr (float): Learning rate specifically for weight parameters
batch_size (int, optional): Batch size for training
device (str): Computing device ("cpu" or "cuda")
dropout_rate (float): Dropout rate for regularization
groups (list of lists, optional): Groupings for layer organization
random_state (int): Random seed for reproducibility

Advanced Configuration

from honeio.integrations.sklearn import QCMLRegressor

# Advanced configuration example (device="cuda" requires a CUDA-capable GPU)
advanced_model = QCMLRegressor(
    hilbert_space_dim=32,
    epochs=200,
    lr=0.001,
    weights_lr=0.01,
    batch_size=64,
    device="cuda",
    dropout_rate=0.1,
    groups=[[0, 1], [2, 3], [4, 5]],
    random_state=42
)

Compatibility

The QCML sklearn integration is designed to be fully compatible with:

Scikit-learn pipelines
Cross-validation utilities
Grid search and hyperparameter optimization
Metric evaluation functions
Model selection utilities
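
As a rough illustration of the cross-validation and grid search entries above, the sketch below wraps cross_val_score and GridSearchCV in a hypothetical helper, evaluate_and_tune. Ridge is used only as a stand-in so the code runs without honeio installed. The commented call shows how a QCMLRegressor would presumably be passed instead, and its grid values are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score


def evaluate_and_tune(estimator, param_grid, X, y, cv=3):
    """Cross-validate `estimator`, then grid-search it over `param_grid`."""
    cv_scores = cross_val_score(estimator, X, y, cv=cv, scoring="r2")
    search = GridSearchCV(
        estimator, param_grid, cv=cv, scoring="neg_mean_squared_error"
    )
    search.fit(X, y)
    return cv_scores, search


X, y = make_regression(n_samples=120, n_features=5, noise=0.1, random_state=42)

# Ridge is only a stand-in so the sketch runs without honeio installed.
# With honeio available, the call would presumably be:
#   evaluate_and_tune(
#       QCMLRegressor(epochs=50, random_state=42),
#       {"hilbert_space_dim": [8, 16], "lr": [0.01, 0.001]},
#       X, y,
#   )
cv_scores, search = evaluate_and_tune(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, X, y)
print(f"Mean CV R^2: {cv_scores.mean():.3f}")
print(f"Best parameters: {search.best_params_}")
```

Both utilities clone the estimator for every fold, so they rely only on the standard get_params, set_params, fit, and predict methods.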

Performance Tips

Hilbert Space Dimension: Start with a small dimension (8 to 16) and increase it as problem complexity demands.
Batch Size: For large datasets, set batch_size to train in mini-batches, which improves training efficiency.
Device Selection: Set device="cuda" when a GPU is available for faster training.
Regularization: Set dropout_rate above zero to reduce overfitting on small datasets.
Hyperparameter Tuning: Use scikit-learn's GridSearchCV to search parameter combinations. Optuna has also been extensively tested with QCMLClassifier and QCMLRegressor.
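
One way to follow the "start small" advice is to enumerate candidate configurations so that the smallest Hilbert space dimensions are tried first. The sketch below uses only the standard library. The parameter values are illustrative assumptions rather than recommendations, and candidate_configs is a hypothetical helper.

```python
from itertools import product

# Illustrative search space; the values are assumptions, not recommendations.
search_space = {
    "hilbert_space_dim": [8, 16, 32],
    "lr": [0.01, 0.001],
    "dropout_rate": [0.0, 0.1],
}


def candidate_configs(space):
    """Yield one keyword dict per parameter combination, in list order."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(candidate_configs(search_space))
print(f"{len(configs)} candidate configurations")
print(configs[0])  # the smallest Hilbert space dimension comes first
```

Each dictionary can be unpacked into an estimator constructor, for example QCMLRegressor(**config), since the documented parameters are passed as keyword arguments.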

Troubleshooting

Common issues and solutions:

Memory Issues: Reduce batch_size or hilbert_space_dim if encountering out-of-memory errors.
Slow Training: Enable GPU acceleration by setting device="cuda" if CUDA is available.
Poor Performance: Try different hilbert_space_dim values or adjust learning rates.
Convergence Issues: Increase the number of epochs or adjust the learning rates (lr and weights_lr).
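
To avoid hard-coding device="cuda" on machines without a GPU, the device string can be chosen at runtime. The following is a minimal sketch that assumes the estimators run on a PyTorch backend, which this page does not state. pick_device is a hypothetical helper that falls back to "cpu" when PyTorch is missing or reports no usable GPU.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Assumption: the estimators run on a PyTorch backend.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The result can then be passed straight to an estimator, for example QCMLRegressor(device=pick_device()).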