Pandas selecting rows with multiple conditions

Let's make a dataframe:

>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randn(5,3), columns=list('ABC'))
>>> df
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

I want to get all the rows where column 'C' is between 0 and 1 inclusive.

This code works:

>>> df[(df['C'] >= 0) & (df['C'] <= 1)]
          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863

But this code, which I feel is equivalent, doesn't:

>>> df[(0 <= df['C'] <= 1)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\panda\anaconda3\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Do I really have to split every multi-condition boolean expression into separate conditions in pandas? Is there a better way to accomplish this?


Solution

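The chained form fails because Python evaluates 0 <= df['C'] <= 1 as (0 <= df['C']) and (df['C'] <= 1), and the and operator needs a single truth value for the whole Series, which is ambiguous. You can use Series.between instead. By default, both endpoints are inclusive.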

out = df[df['C'].between(0,1)]
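
To check that between really matches the two-condition mask from the question, here is a minimal sketch that rebuilds the same seeded df. The df.query call is an extra alternative that the original answer does not mention; it is included only because query strings do accept chained comparisons.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list("ABC"))

# Python expands `0 <= s <= 1` to `(0 <= s) and (s <= 1)`; the `and`
# calls bool() on the first boolean Series, which raises ValueError.
try:
    df[0 <= df["C"] <= 1]
    chained_failed = False
except ValueError:
    chained_failed = True

# Element-wise spellings of the same closed interval [0, 1].
mask_explicit = (df["C"] >= 0) & (df["C"] <= 1)
mask_between = df["C"].between(0, 1)

# Alternative (not from the original answer): query() parses the
# chained comparison itself, so it works inside the string.
by_query = df.query("0 <= C <= 1")

print(chained_failed)
print(mask_explicit.equals(mask_between))
print(df[mask_between].index.tolist())
print(by_query.index.tolist())
```

All three spellings select the same rows; between is simply the shortest one when both bounds apply to a single column.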

To include only one endpoint, pass the inclusive argument. For example, the following includes only the right endpoint (0 < C <= 1):

out = df[df['C'].between(0,1, inclusive='right')]

Output:

          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863
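
The difference between the inclusive settings only shows up on values that sit exactly on a boundary, which the random data above does not contain. The following sketch uses a small hand-made Series (an assumption for illustration, not data from the question) and assumes pandas 1.3 or later, where inclusive takes the strings "both", "neither", "left", and "right".

```python
import pandas as pd

# Hypothetical values chosen so that 0.0 and 1.0 land exactly on the bounds.
s = pd.Series([0.0, 0.5, 1.0, 1.5])

both = s.between(0, 1, inclusive="both").tolist()        # 0 <= x <= 1 (default)
neither = s.between(0, 1, inclusive="neither").tolist()  # 0 <  x <  1
left = s.between(0, 1, inclusive="left").tolist()        # 0 <= x <  1
right = s.between(0, 1, inclusive="right").tolist()      # 0 <  x <= 1

print(both)     # [True, True, True, False]
print(neither)  # [False, True, False, False]
print(left)     # [True, True, False, False]
print(right)    # [False, True, True, False]
```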
