wwwget man page

(Rev. February 2007)

Synopsis

wwwget [-v] [-s] [-m...] [-abs] [-post|-head|-get|-redirect] [-c name=value] [-p# prompt] [-q] [-D domain] [-F from] [-U username] [-P password] [-i input_file] [-o output_file] [-r range] [-to secs] [URL|host[:port]] [query_argument]...

Description

wwwget retrieves one or several HTTP documents and writes them directly to the standard output, without requiring Netscape or a similar browser. When only the host is specified as a command-line argument, the documents to retrieve are read from the standard input and are assumed to be relative to that host.

Options

-v
is a verbose option (displays the number of bytes transferred)
-m...
emulates Mozilla (inserts the User-Agent, etc. into the request); the default version is 3.
-s
silent mode: the HTTP header is stripped. By default this header (containing the status, Content-Type, etc.), which is terminated by a blank line, is displayed on the standard error.
-post
gets the document with the POST method. The default is the GET method.
-head
gets the document with the HEAD method (only the header). The default is the GET method.
-get
uses the default GET method to retrieve the document.
-redirect
generates a Redirect block instead of querying the remote site. Mainly useful in combination with the -q option.
-abs
translates relative anchors to absolute ones, thereby producing usable HTML files. This operation is similar to the wwwabs(1) program.
-c cookie
forwards a cookie (in the form name=value)
-p[n] prompt
defines the prompt text, i.e. a text which marks where to stop the data. A number n may be attached to the -p argument to stop at the nth occurrence of the prompt.
-q
indicates a query, i.e. the URL designates only a CGI script, and the arguments to this script are given as supplementary query_argument arguments. When no query_argument appears on the command line, the arguments are read from the input_file or the standard input, one line per argument; in this input, lines starting with a blank (or a tab) are treated as a continuation of the previous line. An example is given below.
-F from
specifies the From: string in the HTTP protocol, typically used to propagate an e-mail address. This directive is used to propagate the origin of Aladin calls to VizieR.
-D domain
specifies the domain name which is required in the WWW-Authenticate context; the domain name is specified in the WWW-Authenticate: answer from the HTTP server.
-U username
specifies the username for documents requiring an Authorization.
-P password
specifies the password for documents requiring an Authorization.
-i input_file
specifies the input file, useful in the -query mode. Default input file is stdin.
-o output_file
specifies the output file, containing the results. Default output file is stdout.
-r range_of_bytes
specifies a starting/ending point of the document to get.
-to secs
specifies a time-out in seconds between the reception of 2 packets; the default is 1200 (20min).
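
The -abs translation described above can be illustrated with a short Python sketch. This is not the wwwget or wwwabs(1) implementation; it only shows the general idea of resolving relative anchors against the URL of the retrieved document. The function name absolutize_links, the restriction to double-quoted href/src attributes, and the example.org URLs are assumptions made for the illustration.

```python
import re
from urllib.parse import urljoin

# Assumption: only double-quoted href="..." and src="..." attributes are handled.
_LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"', re.IGNORECASE)


def absolutize_links(html: str, base_url: str) -> str:
    """Return html with relative href/src targets resolved against base_url."""

    def _replace(match: "re.Match[str]") -> str:
        attr, target = match.group(1), match.group(2)
        # urljoin leaves absolute targets unchanged and resolves relative ones.
        return f'{attr}="{urljoin(base_url, target)}"'

    return _LINK_ATTR.sub(_replace, html)
```

For instance, with the hypothetical base URL http://example.org/cgi-bin/page, the anchor href="cat/I.html" becomes href="http://example.org/cgi-bin/cat/I.html", while an anchor that is already absolute is left as it is.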


When a full URL is specified, the document is located and displayed.
If only a hostname is supplied, document names are assumed to be specified in the standard input; documents specified in the standard input without a hostname are then assumed to be relative to host.
When no document or host is specified, the standard input is assumed to contain fully qualified URLs.
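
The three input modes above amount to a simple resolution rule, sketched below in Python. This is an illustration of the rule as described, not wwwget's own code: the function name resolve_targets, the use of "://" to recognise a fully qualified URL, and the http:// scheme prefix are assumptions.

```python
from typing import Iterable, List, Optional


def resolve_targets(lines: Iterable[str], host: Optional[str] = None) -> List[str]:
    """Turn document names read from the standard input into full URLs.

    host is the optional host[:port] given on the command line.
    """
    urls: List[str] = []
    for line in lines:
        name = line.strip()
        if not name:
            continue  # assumption: empty lines are skipped
        if "://" in name:
            urls.append(name)  # already a fully qualified URL
        elif host is not None:
            # Document name without hostname: taken relative to host.
            urls.append(f"http://{host}/{name.lstrip('/')}")
        else:
            raise ValueError(f"no host given and {name!r} is not a full URL")
    return urls
```

Called with host="example.org:8080" (a hypothetical host), the name /cgi-bin/a resolves to http://example.org:8080/cgi-bin/a, whereas a line that already holds a full URL passes through unchanged.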

Returned Status

wwwget returns 0 in case of success.

The code 1 is returned when there are invalid arguments, or when the host could not be contacted.

The code 2 is returned when the contacted server indicates an error (HTTP error code above 400).
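
A calling script can branch on these three codes. The Python sketch below shows one way to do so; the helper name run_and_explain and the message strings are assumptions for the example, and only the codes 0, 1 and 2 come from this page.

```python
import subprocess
from typing import List

# Meaning of the documented return codes.
STATUS_MEANING = {
    0: "success",
    1: "invalid arguments, or host could not be contacted",
    2: "server reported an HTTP error",
}


def run_and_explain(argv: List[str]) -> str:
    """Run a command (e.g. a wwwget invocation) and describe its return code."""
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    code = completed.returncode
    # Codes not listed on this page are reported as unexpected.
    return STATUS_MEANING.get(code, f"unexpected status {code}")
```

On a machine where wwwget is installed, run_and_explain(["wwwget", "-s", "http://example.org/"]) would return one of the three descriptions above; the URL is a placeholder.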

HTTP Statuses

(from http://www.faqs.org/rfcs/rfc2616.html)

Informational 1xx
100 Continue
101 Switching Protocols
Successful 2xx
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
Redirection 3xx
300 Multiple Choices
301 Moved Permanently
302 Found
303 See Other
304 Not Modified
305 Use Proxy
306 (Unused)
307 Temporary Redirect
Client Error 4xx
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request-URI Too Long
415 Unsupported Media Type
416 Requested Range Not Satisfiable
417 Expectation Failed
Server Error 5xx
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
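
The class of a status is given by its first digit, which the following Python sketch makes explicit. The function name status_class is chosen for this example; the class names are those of the list above.

```python
# Status classes from the list above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class name for a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For example, status_class(404) gives "Client Error" and status_class(206) gives "Successful".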

Examples

Get the result of a query into a reusable file:
wwwget -strip -abs http://vizier/cgi-bin?-source=HIP > HIP.html
Query vizier with arguments specified in the standard input:
wwwget -strip -q http://vizier.u-strasbg.fr/cgi-bin/asu-… << ====ENDofQuery
-source=I/239/hip_main
HIP=1..10
-out.all
====ENDofQuery

which could also be called as:
wwwget -q http://vizier.u-strasbg.fr/cgi-bin/asu-… -source=I/239/hip_main HIP=1..10 -out.all

or as
wwwget -q http://vizier.u-strasbg.fr/cgi-bin/asu-… HIP=1..10 -out.all
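
When the query arguments come from the standard input, they are given one per line, and a line starting with a blank or a tab continues the previous one. The Python sketch below illustrates that reading rule only. It is not wwwget's parser: the function name read_query_arguments, the choice to join a continuation with a single space, and the skipping of empty lines are assumptions.

```python
from typing import Iterable, List


def read_query_arguments(lines: Iterable[str]) -> List[str]:
    """Collect query arguments, one per line, honouring continuation lines."""
    arguments: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # assumption: empty lines carry no argument
        if line[0] in " \t" and arguments:
            # Continuation line: append to the previous argument
            # (assumption: joined with a single space).
            arguments[-1] += " " + line.strip()
        else:
            arguments.append(line.strip())
    return arguments
```

Feeding it the three lines -source=I/239/hip_main, HIP=1..10 and -out.all yields the same three arguments that appear on the command line in the example above.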

See Also

netscape(1) wwwabs(1)