I am trying to check whether particular strings ("SPF", "UV", "SUN") exist in each value of the LDESC column. I use str.contains(), a for loop and if conditions for that. However, instead of showing whether the strings exist or not, the code raises a ValueError.

What I have tried:

Here's the code:
import pandas as pd
import numpy as np
import regex as re
df=pd.read_excel(r'D:\Practice Files\Anirudh Exercise Skin India.xlsx')
Nonsun= []
for value in df["LDESC"]:
    if df["LDESC"].str.contains("SPF|UV"):
        Nonsun.append("SPF")
    elif df["LDESC"].str.contains("SPF"):
            Nonsun.append("SPF")
    elif df["LDESC"].str.contains("UV"):
        Nonsun.append("UV")
    elif df["LDESC"].str.contains("SUN"):
        Nonsun.append("SUN")
    else:
        Nonsun.append("NON-SUN")

df["SPF/UV/SUN"] = Nonsun
print(df)


Here's the error:
Traceback (most recent call last):
  File "C:\Users\ani\PycharmProjects\pythonProject\main8.py", line 44, in <module>
    if df["LDESC"].str.contains("SPF|UV"):
  File "C:\Users\ani\PycharmProjects\pythonProject\venv\lib\site-packages\pandas\core\generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


How can I solve this error?
Updated 12-Dec-22 22:09pm

1 solution

I attach my sample code below.
Python
"""A simple pandas test"""
import pandas as pd

df=pd.read_excel(r'cp.xlsx')
print('Raw data as read from Excel')
print(df)
print('')

df['Basepacksize'] = df['Basepacksize'].astype(str).astype(float)
print('After converting \'Basepacksize\' items to float types')
print(df)
print('')

bins= [0,10,15,30,50,100]
df1=pd.cut(df['Basepacksize'], bins=bins)
print('After using \'cut\' into separate bins')
print(df1)
print('')

spfuv = df['LDESC'].str.contains('SPF|UV')
print('After using "str.contains(\'SPF|UV\')"')
print(spfuv)
print('')

print('Iterating the items in the \'str.contains\' result')
for item in spfuv:
    print(item)

My Excel file contains two columns, as shown in the raw data print. The content is based on guesses, since you have refused to show us a sample of your data.

The output from the above code is:
Raw data as read from Excel
  LDESC  Basepacksize
0   SPF            10
1    UV            15
2    UV            21
3   SUN            23
4   SPF            27
5    UV            44
6   SUN            55
7   SUN            57
8   SPF            88

After converting 'Basepacksize' items to float types
  LDESC  Basepacksize
0   SPF          10.0
1    UV          15.0
2    UV          21.0
3   SUN          23.0
4   SPF          27.0
5    UV          44.0
6   SUN          55.0
7   SUN          57.0
8   SPF          88.0

After using 'cut' into separate bins
0      (0, 10]
1     (10, 15]
2     (15, 30]
3     (15, 30]
4     (15, 30]
5     (30, 50]
6    (50, 100]
7    (50, 100]
8    (50, 100]
Name: Basepacksize, dtype: category
Categories (5, interval[int64, right]): [(0, 10] < (10, 15] < (15, 30] < (30, 50] < (50, 100]]

After using "str.contains('SPF|UV')"
0     True
1     True
2     True
3    False
4     True
5     True
6    False
7    False
8     True
Name: LDESC, dtype: bool

Iterating the items in the 'str.contains' result
True
True
True
False
True
True
False
False
True
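
The boolean Series printed above is also what the `if` statement in the question receives: one True/False per row, not a single value. A minimal sketch of why that raises the ValueError, assuming the same guessed sample values as above (the real data is unknown); the names `ldesc`, `mask`, `any_match` and `all_match` are only for illustration:

```python
import pandas as pd

# Sample values copied from the printed output above; the real spreadsheet is not shown.
ldesc = pd.Series(
    ["SPF", "UV", "UV", "SUN", "SPF", "UV", "SUN", "SUN", "SPF"], name="LDESC"
)

mask = ldesc.str.contains("SPF|UV")  # boolean Series, one value per row

try:
    if mask:  # pandas cannot reduce nine booleans to a single truth value
        pass
except ValueError as exc:
    error_text = str(exc)
    print(error_text)

# .any() and .all() do return a single bool, but they describe the whole
# column, so they are not a substitute for a per-row test.
any_match = bool(mask.any())
all_match = bool(mask.all())
print(any_match, all_match)
```

Note that the loop variable `value` in the question is never used; every branch tests the whole column again, which is why the error appears on the first iteration.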

See "Working with text data" in the pandas 1.5.2 documentation for how to use the contains method on a Series.
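
As a rough illustration of the options that page describes, the sketch below uses the `case`, `na` and `regex` parameters of `Series.str.contains`. The descriptions in `s` are hypothetical and are not taken from the question's spreadsheet:

```python
import pandas as pd

# Hypothetical descriptions, including a lowercase value and a missing cell.
s = pd.Series(["SPF 30 lotion", "uv shield", None, "Night cream"])

# Regex match on either keyword, ignoring case; missing cells count as "no match".
loose_match = s.str.contains("SPF|UV", case=False, na=False)

# With regex=False the pattern is a plain substring, so this looks for the
# literal text "SPF|UV" and finds nothing.
literal_match = s.str.contains("SPF|UV", regex=False, na=False)

print(loose_match.tolist())
print(literal_match.tolist())
```

Passing `na=False` matters when the column has empty cells, because otherwise the result can contain missing values instead of plain booleans.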

So you really need to study the pandas documentation to learn how these methods work and what results they produce, rather than guessing and hoping that your guesses will yield valid results.
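
For the labelling the question is after, one possible approach (a sketch, not the only way) is to build one boolean Series per keyword and let `numpy.select` pick a label row by row. It assumes the keywords are checked in the order SPF, UV, SUN and that the first match wins; the last two LDESC values are hypothetical and were added only to show substring matching and the default label:

```python
import numpy as np
import pandas as pd

# The first three values mirror the guessed sample data; the last two are
# hypothetical descriptions that are not part of the original post.
df = pd.DataFrame({"LDESC": ["SPF", "UV", "SUN", "DAY CREAM SPF 30", "NIGHT CREAM"]})

# One boolean Series per keyword; na=False treats missing cells as "no match".
conditions = [
    df["LDESC"].str.contains("SPF", na=False),
    df["LDESC"].str.contains("UV", na=False),
    df["LDESC"].str.contains("SUN", na=False),
]
labels = ["SPF", "UV", "SUN"]

# np.select picks, for each row, the label of the first condition that is True,
# and falls back to the default when none match.
df["SPF/UV/SUN"] = np.select(conditions, labels, default="NON-SUN")
print(df)
```

This avoids the explicit loop entirely, so no `if` statement ever sees a whole Series.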
 
Comments
CPallini 13-Dec-22 4:58am    
My 5.
Richard MacCutchan 13-Dec-22 5:06am    
Thanks Carlo.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


