I have a CSV file that contains the following: the date a tweet was published, the sentiment value (0 = negative, 1 = positive), and the tweet itself in the third column, all separated by commas.

I need to plot the date on the x-axis and the sentiment value on the y-axis to see how tweet sentiment changes over time. The file looks like this:

Python
Date,SA,Tweet,
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-11,0,  No Post Title,
 
2009-06-12,0,  No Post Title,
 
2009-06-12,0,  No Post Title,
 
2009-06-12,0,  No Post Title,
 
2009-06-12,0,  No Post Title,


What I have tried:

I have searched many sites and tried many code samples, but nothing seems to work.
I thought of using a time series, but I don't know the correct way to do it.
A sample of the code I used (which I am sure is wrong):


import pandas as pd
import plot_utils as pu
df = pd.read_csv('SA.csv', index_col=0)
pu.plot_timeseries(df['SA'], ylabel='Unit: %', title='Sentiment analysis variation')



I get this error:
ValueError: Unknown string format


and an empty graph

I followed this post:
https://github.com/jsh9/python-plot-utilities
Comments
V. 18-Jun-18 8:47am    
Have you tried checking for, and correcting, bad data? It might be that a date is malformed or that the SA column contains non-numeric values.
Also check the value of df.dtypes. The date column will probably have been read as a string (= object type in pandas). My guess is that you need to convert it to the datetime type (see the pandas documentation on how to do that).
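The check suggested above could be sketched roughly as follows. This is only an illustration: the inline sample is hypothetical (it mimics the layout of SA.csv and deliberately includes one bad date and one bad SA value), and `errors="coerce"` is used simply to flag values that do not parse, not to fix them.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout as SA.csv (note the trailing commas),
# with one malformed date and one non-numeric SA value added on purpose.
raw = io.StringIO(
    "Date,SA,Tweet,\n"
    "2009-06-11,0,  No Post Title,\n"
    "not a date,1,  No Post Title,\n"
    "2009-06-12,x,  No Post Title,\n"
)

# Read both columns as plain strings so nothing is converted silently.
df = pd.read_csv(raw, usecols=["Date", "SA"], dtype=str)
print(df.dtypes)  # neither column is a datetime or a number yet

# Values that cannot be parsed become NaT / NaN instead of raising an error.
dates = pd.to_datetime(df["Date"], format="%Y-%m-%d", errors="coerce")
values = pd.to_numeric(df["SA"], errors="coerce")

# Rows where either column failed to parse are the ones to inspect.
bad_rows = df[dates.isna() | values.isna()]
print(bad_rows)
```

With the real file, replacing `raw` with `'SA.csv'` should list any rows that would trip up a date parser.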
Richard MacCutchan 18-Jun-18 11:49am    
You need to parse the dates and sentiment strings to proper numerical values so that the plotter can calculate their plot positions. I suggest you check the documentation for plot_utils to see how to do it.
Member 13647869 19-Jun-18 1:03am    
The first two columns that I am plotting are numerical. Do you mean that I have to change the date from 2009-06-11 to another format? The second column is 1 and 0, so it's already in numerical form.
V. 19-Jun-18 2:01am    
No, what we're saying is that, although it might look like a date to you, pandas probably sees it as a string. Have you checked the dataframe's dtypes property after reading the CSV? I bet it says "object" for the first column, which basically means it is treated as a string. Convert it to datetime, which pandas also provides, and check the dtypes again. It should then read something like "datetime".
Check the pandas documentation: http://pandas.pydata.org/pandas-docs/stable/

PS: The same goes for the second column: VERIFY that it is numeric.
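A minimal sketch of the conversion and the dtype check described in this comment, assuming the dates are all in YYYY-MM-DD form as in the sample from the question. The inline data is a hypothetical stand-in for SA.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for SA.csv, same layout as the sample in the question.
raw = io.StringIO(
    "Date,SA,Tweet,\n"
    "2009-06-11,0,  No Post Title,\n"
    "2009-06-11,1,  No Post Title,\n"
    "2009-06-12,0,  No Post Title,\n"
)
df = pd.read_csv(raw, usecols=["Date", "SA"])

# Explicit conversions: both calls raise if a value cannot be parsed,
# which makes bad data visible instead of hiding it.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df["SA"] = pd.to_numeric(df["SA"])

# Check the dtypes again: Date should now be a datetime type, SA an integer type.
print(df.dtypes)
```

If either conversion raises on the real file, the error message should point at the offending value.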
Richard MacCutchan 19-Jun-18 3:54am    
Like I said, check the documentation to see what form the data needs to be in.
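Once both columns have proper types, the plot from the question could look something like the sketch below. Note the assumptions: it uses plain matplotlib rather than plot_utils (whose required input format I have not checked), it averages SA per day so the y-axis shows the share of positive tweets on each date, and the inline data is a hypothetical stand-in for SA.csv.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for SA.csv, same layout as the sample in the question.
raw = io.StringIO(
    "Date,SA,Tweet,\n"
    "2009-06-11,0,  No Post Title,\n"
    "2009-06-11,0,  No Post Title,\n"
    "2009-06-12,0,  No Post Title,\n"
    "2009-06-12,1,  No Post Title,\n"
    "2009-06-13,1,  No Post Title,\n"
    "2009-06-13,1,  No Post Title,\n"
)

# parse_dates asks pandas to convert the Date column while reading.
df = pd.read_csv(raw, usecols=["Date", "SA"], parse_dates=["Date"])

# Many tweets share a date, so average the 0/1 values per day.
daily = df.groupby("Date")["SA"].mean()

fig, ax = plt.subplots()
ax.plot(daily.index, daily.values, marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Share of positive tweets")
ax.set_title("Sentiment analysis variation")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
# To view the result, call fig.savefig(...) with a file name of your choice, or plt.show().
```

Plotting the raw 0/1 values directly would also work, but with several tweets per day the points would stack on top of each other, which is why the sketch averages them first.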

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)