[Python] Fetching historical weather, temperature, wind direction, and wind force for cities across China
程序员文章站
2022-07-13 15:36:48
Input:
Years: target_year_list = ["2013", "2014", "2015", "2016", "2017"]
City information (city name, city pinyin): http://yinter.iteye.com/blog/575549
北京 BEIJING
上海 SHANGHAI
天津 TIANJIN
重庆 CHONGQING
阿克苏 AKESU
安宁 ANNING
安庆 ANQING
鞍山 ANSHAN
安顺 ANSHUN
安阳 ANYANG
白城 BAICHENG
白山 BAISHAN
白银 BAIYIN
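The script reads the list above into a dictionary that maps the lowercase pinyin (used in the URL) to the Chinese city name (used in the output file name). Below is a minimal sketch of that parsing step, run on an in-memory sample instead of a file. The helper name parse_city_lines is hypothetical and not part of the original script.

```python
def parse_city_lines(lines):
    """Map lowercase pinyin -> city name from 'name PINYIN' lines."""
    city_dict = {}
    for line in lines:
        parts = line.split()  # splits on any whitespace and drops the newline
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, pinyin = parts[0], parts[1]
        city_dict[pinyin.lower()] = name
    return city_dict


# A few lines in the same format as the city list
sample = ["北京 BEIJING\n", "上海 SHANGHAI\n", "\n", "阿克苏 AKESU\n"]
cities = parse_city_lines(sample)
print(cities["beijing"])
```

Calling split() with no argument removes the trailing newline along with the separator, so the pinyin key does not carry a stray "\n" into the URL.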
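Output: the historical data for each city.
The script fetches the historical data for every city in turn.
The data for each city is written to its own CSV file.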
Garbled text when writing a file from Python:
## Opening the file this way produced garbled Chinese content on a Mac
file = open('./weather_result/{}_weather.csv'.format(city_dict[city]), 'w')
## Creating the file with encoding='gb18030' fixes it
file = open('./weather_result/{}_weather.csv'.format(city_dict[city]), 'w', encoding='gb18030')
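Without an encoding argument, open() falls back to the platform default, so the bytes on disk may not match what the program reading the CSV expects. That is probably what happened here, although the post does not say which program displayed the file. The sketch below writes one row with an explicit encoding and checks that the bytes decode back correctly. The file name and the sample row are made up for illustration.

```python
import os
import tempfile

row = "2016-07-01,31,22,多云,南风,2级\n"  # made-up sample row

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "beijing_weather.csv")  # hypothetical file name
    # An explicit encoding makes the output independent of the platform default;
    # newline="" keeps "\n" from being translated on Windows
    with open(path, "w", encoding="gb18030", newline="") as f:
        f.write(row)
    with open(path, "rb") as f:
        raw = f.read()

# The bytes decode to the original text only with the matching encoding
decoded = raw.decode("gb18030")
print(decoded == row)
```

If the CSV is meant for a spreadsheet program, encoding='utf-8-sig' is another option worth trying, since it writes a byte order mark that such programs commonly use to detect UTF-8.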
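Some cities are missing "wind force" (风级) data; Beijing in July 2016 is one example.
A missing field makes the program raise an error. See the comments in the code for details.
_str += li.string + ','
## li.string must not be None, otherwise the following error is raised:
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'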
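In BeautifulSoup, .string returns None when a tag does not have exactly one string child, for example an empty <li></li>, and that is what makes the concatenation fail. One way to keep the CSV columns aligned is to write an empty field in place of None. The sketch below uses plain Python, with a list standing in for the li.string values of one row. The helper join_fields and the sample values are hypothetical.

```python
def join_fields(values):
    """Join one row of fields with commas, writing missing ones as empty."""
    return ",".join(value if value is not None else "" for value in values)


# Made-up row in which the wind-force field is missing (None)
fields = ["2016-07-01", "31", "22", "多云", "南风", None]
line = join_fields(fields)
print(line)
```

Writing an empty field, instead of skipping the value, keeps every row at the same number of columns, so later rows still line up with the header.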
Source code:
#encoding:utf-8
import requests
from bs4 import BeautifulSoup
target_year_list = ["2013", "2014", "2015", "2016", "2017"]
target_month_list = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"]
def get_urls(city_pinyin):
    """Build one monthly history URL per (year, month) for the given city."""
    urls = []
    for year in target_year_list:
        for month in target_month_list:
            date = year + month
            urls.append("http://lishi.tianqi.com/{}/{}.html".format(city_pinyin, date))
    return urls
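## author: chulei
## date: 20180803
def get_city_dict(file_path):
    """Read "city_name PINYIN" lines and return {lowercase pinyin: city name}."""
    city_dict = {}
    # Assumption: the city list file is saved as UTF-8
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            # split() with no argument also drops the trailing newline, which
            # replace("\r\n", "") does not do for lines read in text mode
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            city_name = parts[0]
            city_pinyin = parts[1].lower()
            ## store the pair in the dictionary
            city_dict[city_pinyin] = city_name
    return city_dict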
# =============================================================================
# main
# =============================================================================
file_path = "./city_pinyin_list.txt"
city_dict = get_city_dict(file_path)
for city in city_dict.keys():
    # encoding='gb18030' avoids garbled Chinese text; the with-block closes the file
    with open('./weather_result/{}_weather.csv'.format(city_dict[city]), 'w', encoding='gb18030') as file:
        urls = get_urls(city)
        for url in urls:
            response = requests.get(url)
            soup = BeautifulSoup(response.text, 'html.parser')
            weather_list = soup.select('div[class="tqtongji2"]')
            try:
                for weather in weather_list:
                    ul_list = weather.select('ul')
                    for i, ul in enumerate(ul_list):
                        if i == 0:
                            continue  # skip the first <ul> (presumably the header row)
                        row = ""
                        for li in ul.select('li'):
                            # li.string is None when a field (e.g. wind force) is missing;
                            # write an empty field instead of raising
                            # TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
                            row += (li.string or "") + ','
                        file.write(row + '\n')
            except Exception:
                print("except {}".format(url))
    print("Done {}".format(city))
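Before starting the full crawl, the URL scheme can be checked offline. The sketch below rebuilds the same list that get_urls produces, using itertools.product, and prints the count and the first and last URL. The helper name build_month_urls is hypothetical, and the sketch makes no network requests.

```python
from itertools import product

target_year_list = ["2013", "2014", "2015", "2016", "2017"]
target_month_list = ["{:02d}".format(m) for m in range(1, 13)]  # "01" .. "12"


def build_month_urls(city_pinyin):
    """Return one URL per (year, month), in the same order as get_urls."""
    template = "http://lishi.tianqi.com/{}/{}{}.html"
    return [template.format(city_pinyin, year, month)
            for year, month in product(target_year_list, target_month_list)]


urls = build_month_urls("beijing")
print(len(urls))  # 5 years x 12 months
print(urls[0])
print(urls[-1])
```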