Pandas selecting rows with multiple conditions

April 20, 2022

Let's make a dataframe:

>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randn(5,3), columns=list('ABC'))
>>> df
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

I want to get all the rows where column 'C' is between 0 and 1 inclusive.

This code works:

>>> df[(df['C'] >= 0) & (df['C'] <= 1)]
          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863

But this code, which I feel should be equivalent, doesn't:

>>> df[(0 <= df['C'] <= 1)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\panda\anaconda3\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Do I really have to split every multi-condition Boolean expression into separate conditions in pandas? Is there a better way to accomplish this?

Solution

You can use Series.between. By default, both endpoints are inclusive.

out = df[df['C'].between(0,1)]
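
As for why the chained form fails: Python expands 0 <= x <= 1 into (0 <= x) and (x <= 1), and the `and` needs a single True/False from the first comparison, which a Series cannot provide. The sketch below rebuilds the question's dataframe and checks that between selects the same rows as the explicit & mask. The variable names (chained_failed, mask_rows, between_rows) are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list("ABC"))

# The chained comparison becomes `(0 <= df["C"]) and (df["C"] <= 1)`.
# `and` asks the first Series for one truth value, so pandas raises ValueError.
try:
    df[0 <= df["C"] <= 1]
    chained_failed = False
except ValueError:
    chained_failed = True

# Two element-wise ways to express the same inclusive range.
mask_rows = df[(df["C"] >= 0) & (df["C"] <= 1)]
between_rows = df[df["C"].between(0, 1)]

print(chained_failed)                  # True
print(between_rows.index.tolist())     # [0, 4]
print(mask_rows.equals(between_rows))  # True
```

So between is shorthand for the two-sided mask here, not a different filter.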

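If you want only one endpoint to be inclusive, you can choose that as well. For example, the following excludes the left endpoint and includes the right one (string values for inclusive require pandas 1.3 or later):

out = df[df['C'].between(0,1, inclusive='right')]

Output:

          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863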
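
No value in column C sits exactly on 0 or 1, so the example above cannot show how the endpoint handling differs. Here is a small sketch with a made-up Series (not from the question) that does hit both bounds. It assumes pandas 1.3 or later, where inclusive accepts the strings "both", "neither", "left" and "right".

```python
import pandas as pd

# Hypothetical values chosen to land exactly on both bounds.
s = pd.Series([0.0, 0.5, 1.0])

# Collect the values kept by each `inclusive` option.
results = {
    option: s[s.between(0, 1, inclusive=option)].tolist()
    for option in ("both", "neither", "left", "right")
}

for option, kept in results.items():
    print(option, kept)
# both [0.0, 0.5, 1.0]
# neither [0.5]
# left [0.0, 0.5]
# right [0.5, 1.0]
```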
