BASH Website
From Indie IT Wiki
Get List Of All Pages In A Web Site
lynx -listonly -dump https://www.google.co.uk | awk '/http/{print $2}' | sort | uniq
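The same pipeline can be wrapped in a small script that saves the list to a file; the site URL and output filename below are placeholders.

#!/bin/bash
# List every unique URL linked from a page and save it (assumes lynx is installed).
SITE="https://www.example.co.uk"   # placeholder URL
lynx -listonly -dump "$SITE" | awk '/http/{print $2}' | sort -u > urls.txt
wc -l urls.txt                     # report how many unique links were found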
Reload Page In Remote Web Browser Over SSH
sudo apt-get install xdotool

DISPLAY=:0 xdotool key F5
https://stackoverflow.com/questions/28132070/how-to-reload-google-chrome-tab-from-terminal
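Put together, the refresh can be triggered from your own machine in one SSH command; the user and hostname below are placeholders, and it assumes the remote user is logged into the X session on display :0.

# xdotool must already be installed on the remote machine
ssh user@remote-host "DISPLAY=:0 xdotool key F5"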
Check Dead Links
sudo apt install linkchecker

linkchecker --verbose --file-output=text -r1 --no-follow-url=www.mydomain.com http://www.mydomain.com/path/to/page

grep -B2 'Error' linkchecker-out.txt
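To check several pages in one pass, the same commands can be looped over a file of URLs; urls.txt and broken-links.txt are placeholder filenames.

#!/bin/bash
# Run linkchecker against every URL listed in urls.txt and collect the errors.
while read -r url; do
    linkchecker --verbose --file-output=text -r1 "$url"
    grep -B2 'Error' linkchecker-out.txt >> broken-links.txt
done < urls.txt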
Dump HTTP Header Using WGET
wget --server-response --spider http://www.google.co.uk
Download Web Page And All Files Linked
wget -r -np -k http://syncapp.bittorrent.com/1.4.111/
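The same command with long-form options, which spell out what each flag does (same URL as above):

# --recursive follow links, --no-parent stay below the starting directory,
# --convert-links rewrite links so the saved pages work offline
wget --recursive --no-parent --convert-links http://syncapp.bittorrent.com/1.4.111/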
Download An Entire Web Site
Quietly download an entire website, including all assets served from a CDN such as AWS CloudFront ...
wget --no-verbose --append-output=wget.log --span-hosts --page-requisites --no-clobber --convert-links --random-wait -r -p --level 1 -E -e robots=off -U mozilla --domains cloudfront.net --domains domain.co.uk --no-parent https://www.domain.co.uk/
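When the run finishes, the log file named above can be scanned for anything that failed to download; the grep pattern is just a suggestion.

# look for failures recorded in wget.log
grep -iE 'error|failed' wget.log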
Show Web Page Headers
curl -I www.bbc.co.uk
Sample output...
HTTP/1.1 200 OK
Server: Apache
Content-Type: text/html
Content-Language: en-GB
Etag: "4c572592f520bedc9a1e2c0238accaa6"
X-PAL-Host: pal041.back.live.cwwtf.local:80
Transfer-Encoding: chunked
Date: Fri, 06 Mar 2015 11:30:51 GMT
Connection: keep-alive
Set-Cookie: BBC-UID=15d4ff99887f5eab78a60516b12f651317b26b71e7f4e4a61ad0970254e467200curl/7.35.0; expires=Tue, 05-Mar-19 11:30:51 GMT; path=/; domain=.bbc.co.uk
X-Cache-Action: HIT
X-Cache-Hits: 1149
X-Cache-Age: 64
Cache-Control: private, max-age=0, must-revalidate
Vary: X-CDN
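To pull out just one header from a response, pipe the same request through grep; Content-Type here is only an example.

# -s silences the progress output, -I asks for headers only
curl -sI www.bbc.co.uk | grep -i '^Content-Type'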
Thanks to CyberCiti.