How to Solve Python ModuleNotFoundError: No module named ‘sklearn.datasets.samples_generator’


Python’s scikit-learn library is a popular toolkit for machine learning, but sometimes you may encounter an error like:

ModuleNotFoundError: No module named 'sklearn.datasets.samples_generator'

This error occurs when Python cannot find the samples_generator module in the sklearn.datasets package, which happens because the module was deprecated and then removed in newer versions of scikit-learn. In this blog post, we will explore the root cause of the error and how to fix it using the current import path.

Why Does This Error Occur?

In older versions of scikit-learn (before version 0.24), the sklearn.datasets.samples_generator module could be imported to generate sample datasets. The module was deprecated in version 0.22 and removed in version 0.24. Its functions are still available, but they must be imported from sklearn.datasets directly.

If you’re using scikit-learn 0.24 or later and you try to import samples_generator, you’ll see the ModuleNotFoundError.
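As a quick way to see which situation applies to your environment, the following is a minimal sketch that asks Python whether each module path can be located. The helper name module_available is hypothetical (it is not part of scikit-learn); it relies only on importlib.util.find_spec from the standard library.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the module can be located.

    Parent packages are imported during the lookup; the module itself is not.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package cannot be imported,
        # for example when scikit-learn is not installed at all
        return False


for name in ("sklearn.datasets", "sklearn.datasets.samples_generator"):
    print(name, "->", module_available(name))
```

On an installation where the old module has been removed, the second lookup should report False while the first still reports True.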

Steps to Reproduce the Error

To reproduce the error, install a recent version of scikit-learn (>= 0.24) and run the following Python code:

from sklearn.datasets.samples_generator import make_blobs

# Attempting to generate synthetic data
X, y = make_blobs(n_samples=100, centers=3, n_features=2, random_state=42)
print(X, y)

If you’re using scikit-learn version 0.24 or newer, this code will raise the following error:

ModuleNotFoundError: No module named 'sklearn.datasets.samples_generator'

The version used to reproduce the error in this post:

In [1]: import sklearn

In [2]: print(sklearn.__version__)
1.3.0

How to Fix the Error

To fix this error, update your code to use the import path that replaces samples_generator. The public generator functions are available directly from sklearn.datasets. For instance, make_blobs can be imported from sklearn.datasets.

Here’s how you can rewrite the code:

from sklearn.datasets import make_blobs

# Generating synthetic data using the new import path
X, y = make_blobs(n_samples=100, centers=3, n_features=2, random_state=42)
print(X, y)

Importing make_blobs from sklearn.datasets resolves the issue. No further installation or library modifications are necessary.
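If the old import appears in many files, the same change can be applied mechanically. The following is a minimal sketch, assuming a plain text substitution is acceptable for your codebase; the names OLD_PATH and fix_imports are hypothetical helpers written for this example, not part of scikit-learn.

```python
import re

# Matches the old dotted module path wherever it appears in the source text
OLD_PATH = re.compile(r"\bsklearn\.datasets\.samples_generator\b")


def fix_imports(source):
    """Return the source text with the old module path replaced."""
    return OLD_PATH.sub("sklearn.datasets", source)


legacy = "from sklearn.datasets.samples_generator import make_blobs\n"
print(fix_imports(legacy), end="")
# prints: from sklearn.datasets import make_blobs
```

Because this is a text-level replacement, it also rewrites the path inside comments and strings, so review the result before committing it.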

Verify Your scikit-learn Version

To avoid confusion, make sure to check which version of scikit-learn you are using by running:

pip show scikit-learn

Alternatively, check your version inside a Python script or terminal:

import sklearn
print(sklearn.__version__)

The import from sklearn.datasets also works on older releases, so upgrading is not required for the fix above. If you want to move to the latest version of scikit-learn anyway, you can upgrade the package using:

pip install --upgrade scikit-learn
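The version check can also be done in code. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata) and that the old module was removed in release 0.24; the names REMOVED_IN, parse_major_minor, and old_import_path_removed are hypothetical helpers for this example.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Release in which sklearn.datasets.samples_generator was removed
REMOVED_IN = (0, 24)


def parse_major_minor(version_string):
    """Return (major, minor) as integers from a string such as '1.3.0'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def old_import_path_removed(version_string):
    """Return True if this version no longer ships samples_generator."""
    return parse_major_minor(version_string) >= REMOVED_IN


try:
    installed = version("scikit-learn")
except PackageNotFoundError:
    print("scikit-learn is not installed in this environment")
else:
    print(installed, "-> old import path removed:", old_import_path_removed(installed))
```

Comparing (major, minor) tuples avoids the pitfall of comparing version strings directly, where "0.9" would sort after "0.24".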

Summary

If you encounter the ModuleNotFoundError: No module named 'sklearn.datasets.samples_generator', it’s because the module was deprecated in scikit-learn 0.22 and removed in 0.24. You can fix this issue by updating your code to import the required functionality from the correct module, such as make_blobs from sklearn.datasets. This import path works on both older and newer releases, so the fix does not depend on which version you have installed.

By following these simple steps, you can resolve the error and continue working on your machine learning projects without disruptions.

If you found this guide helpful, feel free to share it with others who might face the same issue!

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.
