html2wiki man page

html2wiki — convert HTML into wiki markup

Synopsis

html2wiki [options] [file]

Commonly used options:

--dialect=dialect    Dialect name, e.g. "MediaWiki" (required unless
                     the WCDIALECT environment variable is used)
--encoding=encoding  Source encoding (default is 'utf-8')
--base-uri=uri       Base URI for relative links
--wiki-uri=uri       URI fragment for wiki links
--wrap-in-html       Wrap input in <html> and </html> (enabled by default).
                     Use --no-wrap-in-html to disable.
--escape-entities    Escape HTML entities within text elements (enabled by
                     default). Use --no-escape-entities to disable.
--list               List installed dialects and exit
--options            List all recognized options (except for negations
                     such as --no-wrap-in-html)
--help               Show this message and exit

Additional options, including those corresponding to dialect
attributes, are also supported. Consult the html2wiki man page for
details.

Example:

html2wiki --dialect MediaWiki --encoding iso-8859-1 \
    --base-uri http://en.wikipedia.org/wiki/ \
    --wiki-uri http://en.wikipedia.org/wiki/ \
    input.html > output.wiki
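
The "--dialect" entry above notes that the WCDIALECT environment variable can stand in for the option. A minimal sketch, assuming WCDIALECT simply holds the dialect name; the function name "convert_with_env" is hypothetical and not part of html2wiki:

```shell
# Hypothetical wrapper (not part of html2wiki): choose the dialect through
# the environment instead of passing --dialect.
# Assumption: WCDIALECT holds a dialect name such as "MediaWiki".
convert_with_env() {
    WCDIALECT=MediaWiki html2wiki "$1"
}
```

Calling "convert_with_env input.html > output.wiki" should then be equivalent to passing "--dialect MediaWiki" explicitly.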

Description

"html2wiki" is a command-line interface to HTML::WikiConverter, which it uses to convert HTML to wiki markup.

Dialects

If the dialect you provide in "--dialect" is not installed on your system (e.g. if you specify "MediaWiki" but have not installed its dialect module, HTML::WikiConverter::MediaWiki), "html2wiki" exits with a fatal error. Use "html2wiki --list" to list the dialects available on your system. Additional dialects may be downloaded from CPAN.
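
One way to check for a dialect before converting is to ask Perl whether the dialect module loads. This is only a sketch: the helper name "dialect_installed" is hypothetical, and it assumes every dialect module follows the HTML::WikiConverter::<Dialect> naming shown for MediaWiki above.

```shell
# Hypothetical helper (not part of html2wiki): succeed only if the Perl
# module for the named dialect can be loaded.
# Assumption: dialect modules are named HTML::WikiConverter::<Dialect>.
dialect_installed() {
    # -M loads the module; -e 1 runs a trivial program. Perl exits with a
    # nonzero status if the module cannot be found.
    perl -M"HTML::WikiConverter::$1" -e 1 2>/dev/null
}
```

For instance, "dialect_installed MediaWiki && html2wiki --dialect MediaWiki input.html" would run the conversion only when the module loads.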

Options

Correspondence of options and attributes

Each of the options accepted by "html2wiki" corresponds to an HTML::WikiConverter attribute. Commonly used options described in "html2wiki --help" therefore correspond to attributes discussed in "ATTRIBUTES" in HTML::WikiConverter. That section also contains other attributes that may be used as "html2wiki" command-line options.

Mapping an attribute name to an option name

Option names are related, but not identical, to their corresponding attribute names. The only difference is that attribute names use underscores to separate words while option names use hyphens. For example, the "base_uri" attribute corresponds to the "--base-uri" command-line option.
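
The mapping rule can be sketched as a small shell helper. The function name "attr_to_option" is hypothetical and not part of html2wiki; it only illustrates the underscore-to-hyphen rule plus the leading "--".

```shell
# Hypothetical helper (not part of html2wiki): turn an attribute name
# into the matching command-line option name.
attr_to_option() {
    # Replace every underscore with a hyphen, then prefix "--".
    printf '%s\n' "--$(printf '%s' "$1" | tr '_' '-')"
}

attr_to_option base_uri   # prints --base-uri
attr_to_option wiki_uri   # prints --wiki-uri
```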

Additional options defined in dialect modules

Individual dialects may define their own attributes, and therefore make available their own command-line options to "html2wiki", in addition to the ones defined by "HTML::WikiConverter". The same rules described above apply for converting between these attribute names and their corresponding command-line option names. For example, Markdown supports an "unordered_list_style" attribute that takes a string value. To use this attribute on the command line, one would use the "--unordered-list-style" option. Consult individual dialect man pages for a list of supported attributes.

Options that are enabled by default

Attributes that take boolean values may be enabled by default; "wrap_in_html" is one example. As a result, "html2wiki" behaves by default as if "--wrap-in-html" had been specified in every invocation. To disable such an option, prefix its name with "no-", as in "--no-wrap-in-html".
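
As an illustration, a wrapper for converting an HTML fragment without the default wrapping might look like the sketch below. The function name "convert_fragment" is hypothetical; the options themselves are the ones listed in the synopsis.

```shell
# Hypothetical wrapper (not part of html2wiki): convert a file without
# wrapping its contents in <html> and </html>.
convert_fragment() {
    # --no-wrap-in-html switches off the wrap_in_html default.
    html2wiki --dialect MediaWiki --no-wrap-in-html "$1"
}
```

The same "no-" prefix applies to other default-on boolean options, such as "--no-escape-entities".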

Options that take multiple values

Some attributes (e.g., "wiki_uri" and "strip_tags") accept an array of values. To accommodate this, "html2wiki" lets such options be specified more than once on the command line. For example, to strip only comment and script elements from the HTML (the "~comment" value is quoted so that the shell does not attempt tilde expansion on it):

html2wiki --strip-tags '~comment' --strip-tags script ...
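
When the list of values is built at run time, the repeated option can be generated in a loop. The sketch below is one way to do so in POSIX sh; the function name "strip_and_convert" is hypothetical, and MediaWiki is used only as an example dialect.

```shell
# Hypothetical wrapper (not part of html2wiki).
# Usage: strip_and_convert FILE TAG [TAG...]
strip_and_convert() {
    input=$1
    shift
    # POSIX sh has no arrays, so rebuild "$@" in place: each pass appends
    # "--strip-tags TAG" and drops the original TAG from the front.
    for tag do
        set -- "$@" --strip-tags "$tag"
        shift
    done
    html2wiki --dialect MediaWiki "$@" "$input"
}
```

With this definition, "strip_and_convert input.html '~comment' script" passes "--strip-tags" once per tag, matching the command shown above.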

Input/Output

By default, input is read from STDIN, so you may pipe the output of another program into "html2wiki". For example:

curl http://example.com/input.html | html2wiki --dialect MediaWiki

You may also specify a file to read HTML from:

html2wiki --dialect MediaWiki input.html

Output is sent to STDOUT, though you may redirect it on the command line:

html2wiki --dialect MediaWiki input.html > output.wiki

Or you may pipe it into another program:

html2wiki --dialect MediaWiki input.html | less

Author

David J. Iberri, "<diberri@cpan.org>"

See Also

HTML::WikiConverter

Info

2006-07-11 perl v5.24.0 User Contributed Perl Documentation