
How to Download Web Resources

2024-03-31 Web Development


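Purpose

Recently 机工社 (China Machine Press) announced that it was opening its Engineering Science and Technology Digital Library, free to everyone online, to help people get through a difficult time together.

I found that some of the books are shown to the user as web pages, one page at a time, which makes them hard to download in one go.

Is there a way to download them all at once?

Take one book as an example.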

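Investigation

Test 1: Chrome extension

Searching online turned up many Chrome extensions, but none of them could detect the links inside the page. The reason is that the page contains no links at all.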
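One way to confirm that a page really has no links is to count the `<a href>` elements in its saved source. The sketch below uses the standard-library `html.parser`. The names `LinkCollector` and `find_links` and the sample markup are hypothetical illustrations, not taken from the library site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_links(html_text):
    """Return all href values found in <a> tags."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.hrefs


# Hypothetical page: the chapter is opened by a script, not by an <a> tag.
sample = '<div class="chapter" onclick="openChapter(33)">Chapter 33</div>'
print(find_links(sample))
```

A page built like the sample gives an extension that scans for anchors nothing to work with, which would match the behavior described above.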

For example, take the link to one page.

That link ultimately turns into ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/13780/OEBPS/Text/chapter33.html

No wonder the extensions do not recognize it.
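The wrapping described here, a chapter address carried inside a `path` query parameter, can be sketched with `urllib.parse`. The base address `READER_BASE` and both function names are assumptions made for illustration, since the post does not show the prefix in front of `?path=`. Unlike the raw form quoted above, `urlencode` percent-encodes the inner URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical reader endpoint; the real prefix is not shown in the post.
READER_BASE = "https://example.com/reader"


def wrap_chapter_url(chapter_url, base=READER_BASE):
    """Build a '?path=' URL that wraps a chapter address."""
    return base + "?" + urlencode({"path": chapter_url})


def unwrap_chapter_url(wrapped_url):
    """Recover the chapter address from the 'path' query parameter."""
    query = parse_qs(urlsplit(wrapped_url).query)
    return query["path"][0]


chapter = ("http://www.mamicode.com/openresources/teach_ebook/"
           "uncompressed/13780/OEBPS/Text/chapter33.html")
wrapped = wrap_chapter_url(chapter)
print(wrapped)
print(unwrap_chapter_url(wrapped) == chapter)
```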

It looks like I will have to write something myself.

The simplest option is Python.

Testing the link above:

    C:\Users\cutep>python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/13780/OEBPS/Text/chapter33.html -o 33.html
    100% [................................................................................] 4000 / 4000
    Saved under 33.html

Success!
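The same kind of download can also be done inside Python instead of shelling out to `python -m wget`. This is only a sketch using the standard-library `urllib.request`, not the method used in this post. To keep the demonstration runnable offline it fetches a temporary local file through a `file://` URL, and the file names are placeholders.

```python
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download(url, file):
    """Fetch url, write the response body to file, and return its size in bytes."""
    with urllib.request.urlopen(url) as response, open(file, "wb") as out:
        shutil.copyfileobj(response, out)
    return os.path.getsize(file)


# Offline demonstration: urlopen also accepts file:// URLs, so a temporary
# local file stands in for the remote chapter page.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "chapter33.html"
    source.write_text("<html>chapter 33</html>", encoding="utf-8")
    target = Path(tmp) / "33.html"
    size = download(source.as_uri(), target)
    print(size, target.read_text(encoding="utf-8"))
```

For a real chapter, `download` would be called with the full `?path=` address in place of the `file://` URL.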

Test 2: the Python script I ended up writing

    import os
    #from selenium import webdriver
    #from urllib2 import urlopen
    import requests


    def my_system(cmd):
        # Echo the command, then run it through the Windows shell.
        print(cmd)
        os.system(cmd)


    def download(url, file):
        # Fetch one URL with the wget module and save it as `file`.
        cmd = 'python -m wget %s -o %s' % (url, file)
        my_system(cmd)


    def download_chapter(click_url, file):
        # Chapters are served through the "?path=" wrapper.
        download('?path=%s' % click_url, file)


    def get_bookname(cont):
        # The title is the first <span> after <div class="book-name">.
        s = '<div class="book-name">'
        p1 = cont.find(s)
        p1 = p1 + len(s)
        p1 = cont.find('<span>', p1)
        p1 = p1 + len('<span>')
        p2 = cont.find('</span>', p1)
        #print(p1, p2)
        name = cont[p1:p2]
        return name


    def get_value_token(cont):
        # Pull ebookId and token out of the page's input fields.
        s = '"ebookId" value="'
        p1 = cont.find(s)
        p1 = p1 + len(s)
        p2 = cont.find('"/>', p1)
        #print(p1, p2)
        ebookId = cont[p1:p2]
        s2 = 'name="token" value="'
        p3 = cont.find(s2, p2)
        p3 = p3 + len(s2)
        p4 = cont.find('"/>', p3)
        #print(p3, p4)
        token = cont[p3:p4]
        print('ebookId, token %s %s' % (ebookId, token))
        return [ebookId, token]


    def download_book(main_link):
        # Download the book's main page and read the fields needed for the chapter list.
        my_system('del main*.html')
        download(main_link, 'main.html')
        with open('main.html', 'r', encoding='utf-8') as f:
            main_cont = f.read()
        ebookId, token = get_value_token(main_cont)
        bookname = get_bookname(main_cont)
        print(bookname)
        # Skip books that already have a folder.
        if os.path.isdir(bookname):
            return
        # Download into a fresh temporary folder.
        my_system('rd/s/q my_temp')
        my_system('md my_temp')
        os.chdir('my_temp')
        my_system('cd')
        #response = requests.post('', data={'ebookId':15917,'token':"e87436c8bc7849c397a1db2f27c0ba5d"})
        # The endpoint URL is blank in the post as published.
        response = requests.post('', data={'ebookId': ebookId, 'token': token})
        resp_json = response.json()
        #print(resp_json)
        for i in resp_json['data']['data']:
            ref_link = i['ref']
            # The file name is everything after the last "/".
            file = ref_link[ref_link.rfind('/')+1:]
            print(ref_link, file)
            download_chapter(ref_link, file)
        os.chdir('..')
        my_system('cd')
        # Copy the finished chapters into a folder named after the book.
        my_system('md "%s"' % bookname)
        my_system('xcopy /c/d/e/y my_temp "%s"' % bookname)


    # The book URLs are blank in the post as published.
    #download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
    download_book('')
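The script locates `ebookId`, `token`, and the book name with chained `str.find` calls. Because `find` returns -1 when a marker is missing, those slices can quietly yield garbage. A possible alternative, offered as a sketch rather than as the script's actual method, is to use regular expressions that return `None` on a miss. The helper `get_hidden_value`, the assumption that both fields are `<input name="..." value="...">` elements, and the sample markup are illustrative. The sample ids are the ones in the script's commented-out request.

```python
import re


def get_hidden_value(cont, name):
    """Return the value of the input named `name`, or None if it is absent."""
    pattern = r'name="%s"\s+value="([^"]*)"' % re.escape(name)
    match = re.search(pattern, cont)
    return match.group(1) if match else None


def get_bookname(cont):
    """Return the text of the first <span> inside the book-name div, or None."""
    pattern = r'<div class="book-name">.*?<span>(.*?)</span>'
    match = re.search(pattern, cont, re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical page fragment shaped like the markers the script searches for.
sample = '''
<div class="book-name">
  <span>Sample Book</span>
</div>
<input type="hidden" name="ebookId" value="15917"/>
<input type="hidden" name="token" value="e87436c8bc7849c397a1db2f27c0ba5d"/>
'''

print(get_bookname(sample))
print(get_hidden_value(sample, "ebookId"), get_hidden_value(sample, "token"))
```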

Test result

    Saved under chapter51.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter52.xhtml chapter52.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter52.xhtml -o chapter52.xhtml
    100% [................................................................................] 1058 / 1058
    Saved under chapter52.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter53.xhtml chapter53.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter53.xhtml -o chapter53.xhtml
    100% [................................................................................] 4625 / 4625
    Saved under chapter53.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter54.xhtml chapter54.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter54.xhtml -o chapter54.xhtml
    100% [..................................................................................] 705 / 705
    Saved under chapter54.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter55.xhtml chapter55.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter55.xhtml -o chapter55.xhtml
    100% [................................................................................] 1814 / 1814
    Saved under chapter55.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter56.xhtml chapter56.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter56.xhtml -o chapter56.xhtml
    100% [..............................................................................] 10025 / 10025
    Saved under chapter56.xhtml
    /openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter57.xhtml chapter57.xhtml
    python -m wget ?path=http://www.mamicode.com/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter57.xhtml -o chapter57.xhtml
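Each pair in the log above is a chapter's `ref` value followed by the file name derived from it. The sketch below shows one way to derive that name with `urllib.parse` and `posixpath`, which also drops any query string. The function names and the sample response are hypothetical. Only the `data` / `data` / `ref` nesting is taken from the script.

```python
import posixpath
from urllib.parse import urlsplit


def chapter_filename(ref_link):
    """Return the last path component of a chapter link, ignoring any query."""
    return posixpath.basename(urlsplit(ref_link).path)


def chapter_files(resp_json):
    """Pair every chapter 'ref' in the response with its local file name."""
    return [(item["ref"], chapter_filename(item["ref"]))
            for item in resp_json["data"]["data"]]


# Hypothetical response with the same nesting the script reads.
sample = {"data": {"data": [
    {"ref": "/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter52.xhtml"},
    {"ref": "/openresources/teach_ebook/uncompressed/16571/OEBPS/Text/chapter53.xhtml?v=1"},
]}}

for ref, name in chapter_files(sample):
    print(ref, name)
```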

Other

What framework was the following written with?

Note: this article is recommended by Jm博客. If you repost it, please keep this link: https://www.jmwww.net/file/web/30715.html
