BASH Website
From Indie IT Wiki
Get List Of All Pages In A Web Site
lynx -listonly -dump https://www.google.co.uk | awk '/http/{print $2}' | sort | uniq
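The text-processing half of that pipeline can be tried without touching the network, since `lynx -listonly -dump` emits one numbered link per line. A sketch on canned lynx-style output (the URLs are made up):

```shell
# Simulate "lynx -listonly -dump" output, whose link lines look like
# "   1. https://..." so the URL is the second awk field, then apply
# the same awk | sort | uniq stage to deduplicate the URLs.
printf '%s\n' \
  '   1. https://www.google.co.uk/about' \
  '   2. https://www.google.co.uk/intl/en/policies' \
  '   3. https://www.google.co.uk/about' \
  | awk '/http/{print $2}' | sort | uniq
```

The duplicate `/about` entry collapses to one line in the sorted output.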
Reload Page In Remote Web Browser Over SSH
sudo apt-get install xdotool
DISPLAY=:0 xdotool key F5
and
sleep 5s && DISPLAY=:0 xdotool search "Window Title" windowactivate --sync key --clearmodifiers ctrl+r
https://stackoverflow.com/questions/28132070/how-to-reload-google-chrome-tab-from-terminal
Check Dead Links
sudo apt install linkchecker
linkchecker --verbose --file-output=text -r1 --no-follow-url=www.mydomain.com http://www.mydomain.com/path/to/page
grep -B2 'Error' linkchecker-out.txt
Dump HTTP Header Using WGET
wget --server-response --spider http://www.google.co.uk
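The first line of the dumped header carries the status code; a small helper can pull it out of any saved response. A sketch on canned input (`status_of` is a made-up name, not a standard tool):

```shell
# status_of: print the numeric status code from the first line of an
# HTTP response header block read on stdin (hypothetical helper).
status_of() { awk 'NR==1 {print $2}'; }

# Feed it a canned response rather than hitting the network:
printf 'HTTP/1.1 200 OK\nServer: Apache\n' | status_of   # prints 200
```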
Download Web Page And All Files Linked
wget -r -np -k http://syncapp.bittorrent.com/1.4.111/
Download An Entire Web Site
Quietly download an entire web site, including all assets served from a CDN like AWS CloudFront ...
wget --no-verbose --append-output=wget.log \
     --span-hosts --page-requisites --convert-links --random-wait \
     -r -p --level 1 -E -e robots=off -U mozilla \
     --domains cloudfront.net --domains domain.co.uk \
     --no-parent https://www.domain.co.uk/
Show Web Page Headers
curl -I www.bbc.co.uk
Sample output...
HTTP/1.1 200 OK
Server: Apache
Content-Type: text/html
Content-Language: en-GB
Etag: "4c572592f520bedc9a1e2c0238accaa6"
X-PAL-Host: pal041.back.live.cwwtf.local:80
Transfer-Encoding: chunked
Date: Fri, 06 Mar 2015 11:30:51 GMT
Connection: keep-alive
Set-Cookie: BBC-UID=15d4ff99887f5eab78a60516b12f651317b26b71e7f4e4a61ad0970254e467200curl/7.35.0; expires=Tue, 05-Mar-19 11:30:51 GMT; path=/; domain=.bbc.co.uk
X-Cache-Action: HIT
X-Cache-Hits: 1149
X-Cache-Age: 64
Cache-Control: private, max-age=0, must-revalidate
Vary: X-CDN
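Individual values can be pulled back out of a response like the one above. A sketch on canned header lines (not a live request), picking out the cache hit count:

```shell
# Extract one header's value from saved "curl -I" output.
# -F': ' splits the header name from its value.
printf '%s\n' \
  'HTTP/1.1 200 OK' \
  'X-Cache-Action: HIT' \
  'X-Cache-Hits: 1149' \
  | awk -F': ' '/^X-Cache-Hits:/{print $2}'   # prints 1149
```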
Thanks to CyberCiti.