Wrong date on early-morning rows scraped from a weather website (Python) + January data only?
Every time I run the code below, the output file shows the wrong date (the previous day) in the rightmost column for rows between 12:00 AM and 1:00 AM. Is there a way around this, such as a snippet I could insert into the code to prevent it? Thank you for your advice.
import pandas as pd
import datetime as dt

startDt = dt.datetime(2012,1,1)
endDt = dt.datetime.now()

#columns for dataframes
ListOfCol = ['TimeCET',
             'TemperatureC',
             'Dew PointC',
             'Humidity',
             'Sea Level PressurehPa',
             'VisibilityKm',
             'Wind Direction',
             'Wind SpeedKm/h',
             'Gust SpeedKm/h',
             'PrecipitationCm',
             'Events',
             'Conditions',
             'WindDirDegrees',
             'Day'
             ]

for year in range(startDt.year,endDt.year+1):
    for month in range(startDt.month,13):
        if year < endDt.year: #means any remaining (future) days and months in the current year aren't included
            url = 'http://www.wunderground.com/history/airport/LZIB/{:d}/{:d}/1/DailyHistory.html?format=1'.format(year,month)
        elif month <= endDt.month: #means any remaining (future) days and months in the current year aren't included
            url = 'http://www.wunderground.com/history/airport/LZIB/{:d}/{:d}/1/DailyHistory.html?format=1'.format(year,month)
        else: #if current year and past current month leave as is
            break
        if year == startDt.year and month == startDt.month: #if first date for LZIB Airport create dataframe
            BlavaDataFrame = pd.read_csv(url,comment='<',skiprows=1)
            BlavaDataFrame.columns = ListOfCol
        else: #if NOT first date for LZIB Airport append to dataframe to make long list of all data organized by date
            BlavaDataFrameTEMP = pd.read_csv(url,comment='<',skiprows=1)
            BlavaDataFrameTEMP.columns = ListOfCol
            BlavaDataFrame = BlavaDataFrame.append(BlavaDataFrameTEMP,ignore_index=True)

BlavaDataFrame.to_csv('./LZIBhrs.csv')
print('Finished writing ./LZIBhrs.csv to disk')
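
One possible explanation, offered as an assumption rather than something confirmed in the question: the last column that the script renames to 'Day' may actually hold UTC timestamps, while 'TimeCET' is local time. In winter CET is one hour ahead of UTC, so readings taken between 12:00 AM and 1:00 AM local time would still carry the previous day's UTC date. The sketch below uses a few made-up rows in place of the scraped data and a hypothetical helper, `add_local_columns`, to derive a local date from that column. The time zone name `Europe/Bratislava` is also an assumption based on the LZIB airport code.

```python
import pandas as pd

# Made-up rows standing in for the scraped data. The 'Day' values are
# ASSUMED to be UTC timestamps; check the raw CSV header to confirm.
sample = pd.DataFrame({
    'TimeCET': ['12:00 AM', '12:30 AM', '1:00 AM'],
    'Day': ['2012-01-01 23:00:00',
            '2012-01-01 23:30:00',
            '2012-01-02 00:00:00'],
})


def add_local_columns(frame, utc_col='Day', tz='Europe/Bratislava'):
    """Return a copy with a local date-time and local date derived from a UTC column.

    Hypothetical helper: the column name and time zone are assumptions.
    """
    out = frame.copy()
    # Parse the strings as timezone-aware UTC timestamps.
    utc = pd.to_datetime(out[utc_col], utc=True)
    # Shift to local wall-clock time (handles summer time as well).
    local = utc.dt.tz_convert(tz)
    out['LocalDateTime'] = local
    out['LocalDate'] = local.dt.strftime('%Y-%m-%d')
    return out


fixed = add_local_columns(sample)
print(fixed[['TimeCET', 'Day', 'LocalDate']])
```

With these sample rows, the first two 'Day' values fall on 2012-01-01 in UTC, but all three rows get a 'LocalDate' of 2012-01-02, which matches the local times in 'TimeCET'. If the assumption holds, calling the helper on `BlavaDataFrame` just before `to_csv` would add the corrected columns without altering the original ones.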