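As a C programmer I had long heard how convenient Python is and how practical its scraping modules are. With nothing to do over the summer break, I decided to skip any systematic study of Python syntax and instead write a scraper for something I care about, picking up the syntax along the way and seeing for myself what makes Python convenient.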
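I. Syntax used
1. requests.get()
requests is a widely used HTTP library for scraping; among other things it can fetch the JSON data that a web page loads. The call is requests.get(url, params=..., headers=...).
url is required and locates the resource. params is optional: it is a dict of query-string parameters that requests appends to the URL, and this project does not need it because the query string is written directly into the URL. headers helps get past anti-scraping checks: it lets you set a User-Agent that looks like a browser and send a Cookie, so the site's anti-scraping mechanism does not block the request.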
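To make the role of params concrete, here is a minimal sketch that reproduces the query-string step with the standard library only, so it runs without network access. The helper name build_url is hypothetical; the base URL and parameter names are taken from the match-detail URL used later in this post, and the gameId value is a made-up placeholder. Passing the same dict as params to requests.get should give an equivalent URL.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append query-string parameters to base (hypothetical helper)."""
    return base + "?" + urlencode(params)


base = "https://lol.sw.game.qq.com/lol/api/"
params = {
    "c": "Battle",
    "a": "combatGains",
    "areaId": 18,
    "gameId": 123456,  # placeholder, not a real match ID
    "r1": "combatGains",
}
url = build_url(base, params)
print(url)
```

Building the URL from a dict keeps the varying part (gameId) separate from the fixed parts, which is easier to read than string concatenation.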
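2. re.findall()
This comes from re, the regular-expression module. I still have a lot to learn about regular expressions, but this project only uses re.findall to pull out English words (champion names) and numbers (champion IDs), which is fairly simple.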
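As a small illustration of the two patterns this project relies on, the sketch below runs re.findall on a made-up snippet (the sample string is an assumption, not copied from the real site).

```python
import re

# Made-up sample text in an "id":"name" shape, for illustration only.
sample = '"266":"Aatrox","103":"Ahri"'

ids = re.findall(r"[0-9]+", sample)       # every run of digits
names = re.findall(r"[A-Za-z]+", sample)  # every run of ASCII letters

print(ids)    # ['266', '103']
print(names)  # ['Aatrox', 'Ahri']
```

Note that re.findall returns strings, so the IDs come back as '266' rather than 266.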
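3. for loops
Python's loops are fun. To iterate over the elements of a list arr, write for elem in arr:. You can also slice the list: arr[x:] starts from index x, and arr[:x] covers only the first x elements.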
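A short sketch of the two slice forms, using a made-up list. It also shows that a slice longer than the list is safe, which matters later when the requested number of matches exceeds what was fetched.

```python
arr = [10, 20, 30, 40, 50]
x = 2

from_x = []
for elem in arr[x:]:   # starts at index x
    from_x.append(elem)

first_x = []
for elem in arr[:x]:   # only the first x elements
    first_x.append(elem)

print(from_x)     # [30, 40, 50]
print(first_x)    # [10, 20]
print(arr[:100])  # slicing past the end returns the whole list, no error
```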
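4. print
print can output numbers as well as strings, but you cannot join a string and a number with +; convert the number first with str(num).
5. input
input always returns a string, so reading a number needs a conversion too:
int(input("Enter a number: "))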
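The sketch below illustrates both conversions without needing keyboard input. The helper names describe and parse_count are hypothetical; parse_count stands in for wrapping the string that input() returns.

```python
def describe(kills, deaths, assists):
    """Join numbers into a string; each number needs str() before +."""
    return "K/D/A: " + str(kills) + "/" + str(deaths) + "/" + str(assists)


def parse_count(text):
    """Convert a string such as the one input() returns into an int."""
    return int(text.strip())


print(5)                  # print accepts a number directly
print(describe(3, 1, 7))

try:
    "kills: " + 3         # str + int is what actually fails
except TypeError as exc:
    print("TypeError:", exc)

print(parse_count(" 12\n") + 1)
```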
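II. Analyzing the LOL match-history site
Open the page Community Personal Center - League of Legends Official Site - Tencent Games (qq.com), log in, and press F12 to open the developer tools. Look under Network > JS for the JS files that hold the match data.
At first the JS list is empty: the page had already finished loading before you pressed F12, so nothing was captured. Refresh the page and a pile of JS files appears.
Among them are several whose names start with ?c=Battle. Opening them, the first is matchList, and its preview shows that it records match IDs:
So the first step is settled: fetch matchList and collect each gameId.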
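Then we use each gameId to fetch the match details one match at a time:
Open the JS file for a single match and take a look:
First, the URL. This is where gameId comes in: the URLs of different matches differ only in gameId, so we fetch each match by changing the gameId in the URL.
Further down is the Cookie:
We can copy this Cookie when writing the headers.
(Note: the Cookie changes over time. If the scraper worked a few hours ago and now fails while parsing the JSON, the Cookie has probably changed and you need to copy a fresh one.)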
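Then click Preview:
Under the top-level msg key there are three keys: gameInfo, gameStats, and participants. Let's look through each for data worth scraping:
gameInfo contains gameTypeId. Comparing match by match against 掌盟 (the companion app), I found that 19 is normal matchmaking (Summoner's Rift), 18 is ranked (Summoner's Rift), and 21 is ARAM (Howling Abyss).
Next, teamStats under gameStats:
0 is the blue side and 1 is the red side, and each entry records whether that team won. In this match teamStats[0]['win'] == 'Win', so the blue side won.
Then participants: there are ten players, 0 through 9 (obviously). Look at the first one (0):
It shows the summoner name and team. The order is fixed: 0-4 are the blue side and 5-9 are the red side.
Then look at stats:
There is a lot here. We scrape: champion, kills, deaths, assists, damage (physical, magic, true), towers destroyed, and gold.
You can scrape more if you like (wards placed, for example); there is plenty of other data in there.
That is nearly everything. Let's build it first; there is one small problem left that I only noticed after finishing = =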
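The structure described above can be sketched with a tiny hand-written dict. Everything in sample is made up for illustration: only the key names come from this post, and the losing team's 'Fail' value is an assumption. The helper names winner and side_of are hypothetical.

```python
# Hand-written stand-in for one match response (not real data).
sample = {
    "msg": {
        "gameInfo": {"gameTypeId": 21},
        "gameStats": {"teamStats": [{"win": "Win"}, {"win": "Fail"}]},
        "participants": [
            {"summonerName": "PlayerA", "championId": 266,
             "stats": {"kills": 3, "deaths": 1, "assists": 7}},
        ],
    }
}


def winner(match):
    """Return 'blue' if team 0 won, otherwise 'red'."""
    blue = match["msg"]["gameStats"]["teamStats"][0]
    return "blue" if blue["win"] == "Win" else "red"


def side_of(index):
    """Participants 0-4 are blue, 5-9 are red."""
    return "blue" if index < 5 else "red"


print(winner(sample))
print(side_of(0), side_of(5))
first = sample["msg"]["participants"][0]
print(first["summonerName"], first["stats"]["kills"])
```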
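III. Implementation
Imports:
import requests, json, re
These handle HTTP requests, JSON parsing, and regular expressions, respectively.
Fetching matchList:
Fill in url and the Cookie with the URL and Cookie found earlier:
url = '(too long to show here)'
headers = {'Cookie': '(too long to show here)', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
Then fetch it. Looking at the JS file, the first 16 characters have nothing to do with the gameId data, so we skip them:
res = requests.get(url, headers=headers)  # fetch matchList
#print(res.text)
json1 = json.loads(res.text[16:])  # drop the 16-character prefix before the JSON
games = json1['msg']['games']
gameIds = []
for game in games:
    gameId = game['gameId']
    gameIds.append(gameId)
#print(gameIds)
Print the JSON and gameIds here to check them; once they look right, move on.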
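Hard-coding the prefix length works, but it breaks silently if the prefix changes. As an alternative sketch (not what the code above does), the helper below cuts from the first '{' to the last '}' instead. The name parse_js_payload and the sample text, including its "var matchList = " prefix, are assumptions for illustration.

```python
import json


def parse_js_payload(text):
    """Extract and parse the JSON object embedded in a JS response."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        # A login page or error page typically lands here.
        raise ValueError("no JSON object found; the Cookie may have expired")
    return json.loads(text[start:end + 1])


# Made-up response body, for illustration only.
sample = 'var matchList = {"msg": {"games": [{"gameId": 111}, {"gameId": 222}]}};'
data = parse_js_payload(sample)
game_ids = [game["gameId"] for game in data["msg"]["games"]]
print(game_ids)
```

Raising a clear error here also makes the expired-Cookie case easier to recognize than a raw JSON decoding failure.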
Fetching matches:
First decide how many matches to fetch:
x = int(input("Number of matches to query: "))
Then loop over the first x entries of gameIds:
Note that the gameId in the URL changes on every iteration.
Print the match ID, match type, and result.
Next, loop over every player (participants 0-9).
Print the data we scraped.
for i in gameIds[:x]:
    url2 = 'https://lol.sw.game.qq.com/lol/api/?c=Battle&a=combatGains&areaId=18&gameId='+str(i)+'&r1=combatGains'
    headers2 = {'Cookie': '(use your own)', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
    res2 = requests.get(url2, headers=headers2)
    json2 = json.loads(res2.text[18:])  # this file has an 18-character prefix
    print("\nMatch ID: " + str(i))
    gameInfo = json2['msg']['gameInfo']
    gameType = gameInfo['gameTypeId']
    if gameType == 19: print("Normal (Summoner's Rift)")
    if gameType == 18: print("Ranked (Summoner's Rift)")
    if gameType == 21: print("ARAM (Howling Abyss)")
    win = json2['msg']['gameStats']['teamStats'][0]['win']
    # The preview shows 'Win', so compare without regard to case.
    if win.lower() == 'win':
        print("Result: blue side wins")
    else:
        print("Result: red side wins")
    participants = json2['msg']['participants']
    for idx, par in enumerate(participants):
        if idx == 0:
            print("Blue side:")
        if idx == 5:
            print("Red side:")
        summonerName = par['summonerName']
        chId = par['championId']
        print(chId)
        stats = par['stats']
        kill = stats['kills']
        death = stats['deaths']
        assist = stats['assists']
        totDam = stats['totalDamageDealtToChampions']
        phyDam = stats['physicalDamageDealtToChampions']
        magDam = stats['magicDamageDealtToChampions']
        truDam = stats['trueDamageDealtToChampions']
        damTak = stats['totalDamageTaken']
        tower = stats['towerKills']
        gold = stats['goldEarned']
        print(" Summoner name: " + str(summonerName) + "\n Gold: " + str(gold)
              + "\n K/D/A: " + str(kill) + "/" + str(death) + "/" + str(assist)
              + " Towers destroyed: " + str(tower)
              + "\n Damage to champions: " + str(totDam) + " Physical: " + str(phyDam)
              + " Magic: " + str(magDam) + " True: " + str(truDam)
              + "\n Damage taken: " + str(damTak) + "\n")
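Two small optional refinements, sketched under the assumption that only the three gameTypeId values identified earlier are known: a dict lookup with a fallback for unknown modes, and an f-string in place of repeated str() calls. The names GAME_TYPES, game_type_name, and format_kda are hypothetical, and the stats values are made up.

```python
# The three IDs come from this post; any other ID falls through to the default.
GAME_TYPES = {
    19: "Normal (Summoner's Rift)",
    18: "Ranked (Summoner's Rift)",
    21: "ARAM (Howling Abyss)",
}


def game_type_name(type_id):
    """Look up a mode name, with a fallback for IDs not in the table."""
    return GAME_TYPES.get(type_id, "Unknown mode (" + str(type_id) + ")")


def format_kda(stats):
    """Format kills/deaths/assists; f-strings convert numbers themselves."""
    return f"K/D/A: {stats['kills']}/{stats['deaths']}/{stats['assists']}"


print(game_type_name(21))
print(game_type_name(99))
print(format_kda({"kills": 3, "deaths": 1, "assists": 7}))
```

With the chain of if statements, a match of an unlisted type prints nothing at all; the fallback makes that case visible.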
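That is mostly it, but the champion printed here is a champion ID (a number) rather than a name. Honestly, no normal person knows which ID belongs to which champion, so we also need to scrape the mapping from champion ID to name.
IV. Scraping champion names
Now we need to search the pile of JS files for one that might carry champion IDs (names containing champion, hero, and so on). I looked for a long time without finding one, so I had to search elsewhere on the official site:
Game Info - League of Legends Official Site - Tencent Games (qq.com)
I finally found it on the champion info pages. Open any champion's page and you will find this:
Whoa! This is exactly the champion ID to name mapping I was after: 266 is Aatrox and 103 is Ahri.
At first, not knowing Python well, I took a clumsy route: copy the text over and use C to write out the Python code (yes, code that writes code).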
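#include <stdio.h>
char str[102400];
int main(void) {
    FILE *src = fopen("champion_src.txt", "r");
    FILE *dst = fopen("champion_dst.txt", "w");
    if (src == NULL || dst == NULL) return 1;
    char c;
    int i = 0;
    /* Keep letters and digits; replace every other character with a space. */
    while (i < (int)sizeof(str) - 1 && fscanf(src, "%c", &c) == 1) {
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
            str[i++] = c;
        }
        else str[i++] = ' ';
    }
    puts(str);
    /* Read "number name" pairs from standard input and
       write one Python assignment per pair. */
    int num;
    char name[100];
    while (scanf("%d", &num) == 1 && scanf("%99s", name) == 1) {
        fprintf(dst, "champions[%d] = '%s'\n", num, name);
    }
    fclose(src);
    fclose(dst);
    return 0;
}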
That is how I generated the Python code:
champions = ['0' for i in range(1000)]
champions[266] = 'Aatrox'
champions[103] = 'Ahri'
champions[84] = 'Akali'
champions[12] = 'Alistar'
champions[32] = 'Amumu'
champions[34] = 'Anivia'
champions[1] = 'Annie'
champions[523] = 'Aphelios'
champions[22] = 'Ashe'
champions[136] = 'AurelionSol'
champions[268] = 'Azir'
champions[432] = 'Bard'
champions[53] = 'Blitzcrank'
champions[63] = 'Brand'
champions[201] = 'Braum'
champions[51] = 'Caitlyn'
champions[164] = 'Camille'
champions[69] = 'Cassiopeia'
champions[31] = 'Chogath'
champions[42] = 'Corki'
champions[122] = 'Darius'
champions[131] = 'Diana'
champions[119] = 'Draven'
champions[36] = 'DrMundo'
champions[245] = 'Ekko'
champions[60] = 'Elise'
champions[28] = 'Evelynn'
champions[81] = 'Ezreal'
champions[9] = 'Fiddlesticks'
champions[114] = 'Fiora'
champions[105] = 'Fizz'
champions[3] = 'Galio'
champions[41] = 'Gangplank'
champions[86] = 'Garen'
champions[150] = 'Gnar'
champions[79] = 'Gragas'
champions[104] = 'Graves'
champions[120] = 'Hecarim'
champions[74] = 'Heimerdinger'
champions[420] = 'Illaoi'
champions[39] = 'Irelia'
champions[427] = 'Ivern'
champions[40] = 'Janna'
champions[59] = 'JarvanIV'
champions[24] = 'Jax'
champions[126] = 'Jayce'
champions[202] = 'Jhin'
champions[222] = 'Jinx'
champions[145] = 'Kaisa'
champions[429] = 'Kalista'
champions[43] = 'Karma'
champions[30] = 'Karthus'
champions[38] = 'Kassadin'
champions[55] = 'Katarina'
champions[10] = 'Kayle'
champions[141] = 'Kayn'
champions[85] = 'Kennen'
champions[121] = 'Khazix'
champions[203] = 'Kindred'
champions[240] = 'Kled'
champions[96] = 'KogMaw'
champions[7] = 'Leblanc'
champions[64] = 'LeeSin'
champions[89] = 'Leona'
champions[127] = 'Lissandra'
champions[236] = 'Lucian'
champions[117] = 'Lulu'
champions[99] = 'Lux'
champions[54] = 'Malphite'
champions[90] = 'Malzahar'
champions[57] = 'Maokai'
champions[11] = 'MasterYi'
champions[21] = 'MissFortune'
champions[62] = 'MonkeyKing'
champions[82] = 'Mordekaiser'
champions[25] = 'Morgana'
champions[267] = 'Nami'
champions[75] = 'Nasus'
champions[111] = 'Nautilus'
champions[518] = 'Neeko'
champions[76] = 'Nidalee'
champions[56] = 'Nocturne'
champions[20] = 'Nunu'
champions[2] = 'Olaf'
champions[61] = 'Orianna'
champions[516] = 'Ornn'
champions[80] = 'Pantheon'
champions[78] = 'Poppy'
champions[555] = 'Pyke'
champions[246] = 'Qiyana'
champions[133] = 'Quinn'
champions[497] = 'Rakan'
champions[33] = 'Rammus'
champions[421] = 'RekSai'
champions[58] = 'Renekton'
champions[107] = 'Rengar'
champions[92] = 'Riven'
champions[68] = 'Rumble'
champions[13] = 'Ryze'
champions[113] = 'Sejuani'
champions[235] = 'Senna'
champions[875] = 'Sett'
champions[35] = 'Shaco'
champions[98] = 'Shen'
champions[102] = 'Shyvana'
champions[27] = 'Singed'
champions[14] = 'Sion'
champions[15] = 'Sivir'
champions[72] = 'Skarner'
champions[37] = 'Sona'
champions[16] = 'Soraka'
champions[50] = 'Swain'
champions[517] = 'Sylas'
champions[134] = 'Syndra'
champions[223] = 'TahmKench'
champions[163] = 'Taliyah'
champions[91] = 'Talon'
champions[44] = 'Taric'
champions[17] = 'Teemo'
champions[412] = 'Thresh'
champions[18] = 'Tristana'
champions[48] = 'Trundle'
champions[23] = 'Tryndamere'
champions[4] = 'TwistedFate'
champions[29] = 'Twitch'
champions[77] = 'Udyr'
champions[6] = 'Urgot'
champions[110] = 'Varus'
champions[67] = 'Vayne'
champions[45] = 'Veigar'
champions[161] = 'Velkoz'
champions[254] = 'Vi'
champions[112] = 'Viktor'
champions[8] = 'Vladimir'
champions[106] = 'Volibear'
champions[19] = 'Warwick'
champions[498] = 'Xayah'
champions[101] = 'Xerath'
champions[5] = 'XinZhao'
champions[157] = 'Yasuo'
champions[83] = 'Yorick'
champions[350] = 'Yuumi'
champions[154] = 'Zac'
champions[238] = 'Zed'
champions[115] = 'Ziggs'
champions[26] = 'Zilean'
champions[142] = 'Zoe'
champions[143] = 'Zyra'
print(champions)
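I'm a genius, hahaha.
A note on Python list syntax:
An empty list created with arr = [] has no elements, so even arr[0] raises an IndexError. To get a pre-filled list, use a comprehension like the one above: champions = ['0' for i in range(1000)].
Then the Python code generated by the C program assigns into this list.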
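As an aside, a dict is another way to hold this kind of sparse ID-to-name table: it needs no pre-filling and no guess about the largest ID. This is a sketch of that alternative, not what the generated code above does; the 'unknown' default is an arbitrary choice.

```python
# A dict accepts any key on assignment, so no 1000-slot list is needed.
champions = {}
champions[266] = 'Aatrox'
champions[103] = 'Ahri'

known = champions.get(266, 'unknown')
missing = champions.get(9999, 'unknown')  # no IndexError for a large or absent ID

print(known)    # Aatrox
print(missing)  # unknown
```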
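But this approach is still rather crude, and I wanted to use a scraper instead:
After asking around, I learned how to do it in Python:
First fetch the JS: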
url3 = 'https://lol.qq.com/biz/hero/champion.js'
headers3 = {'Cookie': '(use your own)', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
json3 = requests.get(url3, headers=headers3)
#print(json3.text)
t = json3.text[57:]
The start of the JS file contains some words that are not champion names, so we skip the first 57 characters and read from there.
Then we use re.findall to read the numbers and the champion names separately:
championId = re.findall("[0-9]+",t)
#print(championId)
championName = re.findall("[A-Za-z]+",t)
#print(championName)
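Now the same position in championId and championName holds the ID and name of the same champion. Some extra words and numbers get captured further along, but that doesn't matter: we search from the start, and the loop stops as soon as it finds a match: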
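Relying on two separate lists staying aligned is fragile. As an alternative sketch, one pattern with two capture groups can extract each ID together with its name. The sample text and its "id":"name" layout are assumptions about the file format, and build_champion_map is a hypothetical helper name.

```python
import re


def build_champion_map(text):
    """Pair each quoted number with the quoted name that follows it."""
    pairs = re.findall(r'"(\d+)":"([A-Za-z]+)"', text)
    return {int(champion_id): name for champion_id, name in pairs}


# Made-up sample; the real file's layout may differ.
sample = 'x={"keys":{"266":"Aatrox","103":"Ahri","84":"Akali"},"version":"10.1"}'
champion_map = build_champion_map(sample)

print(champion_map[266])
print(champion_map.get(9999, "unknown"))
```

Because each ID is captured with its own name, stray words or numbers elsewhere in the text (such as the version string in the sample) cannot shift the pairing.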
id = input("")
for i in range(180):
    if id == championId[i]:
        print(championName[i])
        break
Once that tests OK, add it to the earlier code:
chId = par['championId']
for i in range(180):
    if str(chId) == championId[i]:
        print(championName[i])
        break
And that does it.
V. Complete code
import requests, json, re
# Champion ID -> name data
url3 = 'https://lol.qq.com/biz/hero/champion.js'
headers3 = {'Cookie': ' ', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
json3 = requests.get(url3, headers=headers3)
#print(json3.text)
t = json3.text[57:]
championId = re.findall("[0-9]+", t)
#print(championId)
championName = re.findall("[A-Za-z]+", t)
#print(championName)
# Match list
url = ' '
headers = {'Cookie': ' ', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
res = requests.get(url, headers=headers)  # fetch matchList
#print(res.text)
json1 = json.loads(res.text[16:])
games = json1['msg']['games']
gameIds = []
for game in games:
    gameId = game['gameId']
    gameIds.append(gameId)
#print(gameIds)
x = int(input("Number of matches to query: "))
# Details of each match
for i in gameIds[:x]:
    url2 = 'https://lol.sw.game.qq.com/lol/api/?c=Battle&a=combatGains&areaId=18&gameId='+str(i)+'&r1=combatGains'
    headers2 = {'Cookie': ' ', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'}
    res2 = requests.get(url2, headers=headers2)
    json2 = json.loads(res2.text[18:])
    print("\nMatch ID: " + str(i))
    gameInfo = json2['msg']['gameInfo']
    gameType = gameInfo['gameTypeId']
    if gameType == 19: print("Normal (Summoner's Rift)")
    if gameType == 18: print("Ranked (Summoner's Rift)")
    if gameType == 21: print("ARAM (Howling Abyss)")
    win = json2['msg']['gameStats']['teamStats'][0]['win']
    # The preview shows 'Win', so compare without regard to case.
    if win.lower() == 'win':
        print("Result: blue side wins")
    else:
        print("Result: red side wins")
    participants = json2['msg']['participants']
    for idx, par in enumerate(participants):
        if idx == 0:
            print("Blue side:")
        if idx == 5:
            print("Red side:")
        summonerName = par['summonerName']
        chId = par['championId']
        # j rather than i, so the outer loop variable is not reused
        for j in range(180):
            if str(chId) == championId[j]:
                print(championName[j])
                break
        stats = par['stats']
        kill = stats['kills']
        death = stats['deaths']
        assist = stats['assists']
        totDam = stats['totalDamageDealtToChampions']
        phyDam = stats['physicalDamageDealtToChampions']
        magDam = stats['magicDamageDealtToChampions']
        truDam = stats['trueDamageDealtToChampions']
        damTak = stats['totalDamageTaken']
        tower = stats['towerKills']
        gold = stats['goldEarned']
        print(" Summoner name: " + str(summonerName) + "\n Gold: " + str(gold)
              + "\n K/D/A: " + str(kill) + "/" + str(death) + "/" + str(assist)
              + " Towers destroyed: " + str(tower)
              + "\n Damage to champions: " + str(totDam) + " Physical: " + str(phyDam)
              + " Magic: " + str(magDam) + " True: " + str(truDam)
              + "\n Damage taken: " + str(damTak) + "\n")
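VI. Final result && summary
(Screenshot of the VS Code terminal.)
If you need more data, you can keep scraping.
Honestly, I have known Python for less than a day, but I can already feel how convenient it is: re.findall, for instance, or Python's for loops, or comparing strings directly with ==. All very handy.
I still know very little about Python, but writing this scraper was a good start.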