== Get List Of All Pages In A Web Site ==

 lynx -listonly -dump https://www.google.co.uk | awk '/http/{print $2}' | sort | uniq
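
A minimal script sketch around the same pipeline, keeping only links on the target site and saving them to a file (the SITE variable and the pages.txt filename are illustrative assumptions, not part of the original one-liner):

 #!/bin/bash
 # List unique on-site links found by lynx on the front page.
 SITE="https://www.google.co.uk"    # assumed example site
 lynx -listonly -dump "$SITE" \
   | awk '/http/{print $2}' \
   | grep "^$SITE" \
   | sort -u > pages.txt            # assumed output filename
 wc -l pages.txt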
== Reload Page In Remote Web Browser Over SSH ==

 sudo apt-get install xdotool
 DISPLAY=:0 xdotool key F5

and

 sleep 5s && DISPLAY=:0 xdotool search "Window Title" windowactivate --sync key --clearmodifiers ctrl+r
https://stackoverflow.com/questions/28132070/how-to-reload-google-chrome-tab-from-terminal
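
As a usage example, the same command can be put in a loop to refresh the remote browser periodically; a sketch only, assuming the remote X display is :0 as above (the 60-second interval is arbitrary):

 #!/bin/bash
 # Refresh the browser on display :0 once a minute until interrupted.
 while true; do
     DISPLAY=:0 xdotool key F5
     sleep 60
 done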
== Check Dead Links ==

 sudo apt install linkchecker
 linkchecker --verbose --file-output=text -r1 --no-follow-url=www.mydomain.com http://www.mydomain.com/path/to/page
 grep -B2 'Error' linkchecker-out.txt
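
A small wrapper sketch that exits non-zero when the report contains errors, handy in a cron job or CI step (the URL follows the example above; the linkchecker-out.txt report name is taken from the original grep command):

 #!/bin/bash
 # Run linkchecker and fail if any broken links were reported.
 linkchecker --verbose --file-output=text -r1 http://www.mydomain.com/path/to/page
 if grep -q 'Error' linkchecker-out.txt; then
     echo "Broken links found:" >&2
     grep -B2 'Error' linkchecker-out.txt >&2
     exit 1
 fi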
== Dump HTTP Header Using WGET ==

 wget --server-response --spider http://www.google.co.uk
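
Since wget prints the server response on stderr, a sketch like this pulls out just the HTTP status line:

 # Print only the status line(s), e.g. "HTTP/1.1 200 OK".
 wget --server-response --spider http://www.google.co.uk 2>&1 | grep 'HTTP/'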
== Download Web Page And All Files Linked ==

 wget -r -np -k http://syncapp.bittorrent.com/1.4.111/
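
The flags are: -r (recursive), -np (--no-parent, do not ascend above the starting directory) and -k (--convert-links, rewrite links for local browsing). If only certain file types are wanted, wget's --accept option restricts the download; a sketch, with the extension list as an assumption:

 # Recursively fetch only .tar.gz and .deb files linked from the page.
 wget -r -np -k --accept 'tar.gz,deb' http://syncapp.bittorrent.com/1.4.111/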
== Download An Entire Web Site ==

Quietly download an entire web site, including all assets served from a CDN such as AWS CloudFront...

 wget --no-verbose --append-output=wget.log --span-hosts --page-requisites --convert-links --random-wait -r --level 1 -E -e robots=off -U mozilla --domains cloudfront.net,domain.co.uk --no-parent https://www.domain.co.uk/
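
A reusable sketch with the site and CDN hosts as variables (domain.co.uk and cloudfront.net above are placeholders to replace with the real site and its CDN domain):

 #!/bin/bash
 # Mirror a site plus its CDN assets, one level deep, logging to wget.log.
 SITE_DOMAIN="domain.co.uk"     # placeholder: the site to mirror
 CDN_DOMAIN="cloudfront.net"    # placeholder: where its assets live
 wget --no-verbose --append-output=wget.log \
      --span-hosts --page-requisites --convert-links --random-wait \
      -r --level 1 -E -e robots=off -U mozilla \
      --domains "$CDN_DOMAIN,$SITE_DOMAIN" --no-parent \
      "https://www.$SITE_DOMAIN/"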
== Show Web Page Headers ==

 curl -I www.bbc.co.uk

Sample output...

 HTTP/1.1 200 OK
 Server: Apache
 Content-Type: text/html
 Content-Language: en-GB
 Etag: "4c572592f520bedc9a1e2c0238accaa6"
 X-PAL-Host: pal041.back.live.cwwtf.local:80
 Transfer-Encoding: chunked
 Date: Fri, 06 Mar 2015 11:30:51 GMT
 Connection: keep-alive
 Set-Cookie: BBC-UID=15d4ff99887f5eab78a60516b12f651317b26b71e7f4e4a61ad0970254e467200curl/7.35.0; expires=Tue, 05-Mar-19 11:30:51 GMT; path=/; domain=.bbc.co.uk
 X-Cache-Action: HIT
 X-Cache-Hits: 1149
 X-Cache-Age: 64
 Cache-Control: private, max-age=0, must-revalidate
 Vary: X-CDN
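
To grab just one piece of the response in a script, curl's --write-out format string avoids parsing headers by hand; a sketch using the standard %{http_code} variable:

 # Print only the numeric HTTP status code.
 curl -sI -o /dev/null -w '%{http_code}\n' www.bbc.co.uk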
Thanks to CyberCiti.