
Misc.pm



ExSite::Misc -- miscellaneous utility functions needed by ExSite::

This package contains routines and functions that have no dependence on other ExSite code.

Hash Encoding Tools

For purposes of decoding input, these tools are deprecated. See ExSite::Input for better input management. These tools may still be used for general-purpose hash encoding, however.

EncodeHash(%hash)
Converts a hash to a URL-encoded string. If the hash contains array references, those are converted to comma-delimited lists of values. If the hash contains hash references, those are recursively converted using &EncodeHash.
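A minimal sketch of the encoding described above, written as a standalone routine rather than ExSite's own code. The names encode_hash_sketch and escape_sketch are hypothetical. The sorted key order, the "&" pair separator, and leaving commas unescaped are assumptions made for illustration. Nested hash references are not handled here.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except a small safe set.
sub escape_sketch {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~,-])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical stand-in for illustration; not ExSite's EncodeHash.
sub encode_hash_sketch {
    my (%hash) = @_;
    my @pairs;
    foreach my $key (sort keys %hash) {
        my $val = $hash{$key};
        # Array references are flattened to a comma-delimited list.
        $val = join(",", @$val) if ref $val eq "ARRAY";
        push @pairs, escape_sketch($key) . "=" . escape_sketch($val);
    }
    return join("&", @pairs);
}

print encode_hash_sketch(name => "Jane Doe", tags => ["a", "b"]), "\n";
```

Under these assumptions the sketch prints name=Jane%20Doe&tags=a,b.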

DecodeString($string)
Converts a URL-encoded string to a hash. DecodeString can also be used to parse any string with key=value pairs delimited by character X, if X is supplied as the 2nd argument.

Multiply-defined keys have their values appended to each other, separated by the value of $multi_sep (``; '' by default).
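The decoding rules above, including the handling of repeated keys, can be sketched as follows. This is an illustration only: decode_string_sketch is a hypothetical name, the "&" default delimiter is an assumption, and the separator is passed as a third argument here purely for convenience (the text above describes it as the value of $multi_sep).

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's DecodeString.
sub decode_string_sketch {
    my ($string, $delim, $multi_sep) = @_;
    $delim     = "&"  unless defined $delim;       # pair delimiter (assumed default)
    $multi_sep = "; " unless defined $multi_sep;   # joins values of repeated keys
    my %data;
    foreach my $pair (split /\Q$delim\E/, $string) {
        my ($key, $val) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $val = "" unless defined $val;
        for ($key, $val) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # decode %XX escapes
        }
        $data{$key} = exists $data{$key} ? $data{$key} . $multi_sep . $val : $val;
    }
    return %data;
}

my %query = decode_string_sketch("colour=red&size=L&colour=dark+blue");
print "$_ = $query{$_}\n" for sort keys %query;
```

Here the repeated key colour ends up with the single value "red; dark blue".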

DecodeQuery()
Decodes the QUERY_STRING. (Passes $ENV{QUERY_STRING} through &DecodeString().)

DecodePost()
Deprecated - use ExSite::Input instead.

Decodes the POST data. For URL-encoded form data, passes the data through &DecodeString(). Multipart-encoded data is more complex (and required for file uploads); in that case, the CGI:: package is used to parse the data.

DecodeInput()

Deprecated - use ExSite::Input instead.

Decodes the input to the page. Decodes either the POST or QUERY data, depending on the REQUEST_METHOD.

DecodeAttributes($string)
Decodes a string of attributes in HTML tag style, e.g.
    key1=val1 key2="quoted val2" key3='another quoted value' key4

Returns a hash or hashref of the keys => values. Only parses the key/value pairings and quotes; does not do any HTML escape handling.
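One way such a parser could work is sketched below. The routine decode_attributes_sketch is hypothetical, and storing undef for a bare key such as key4 is an assumption; the source does not say what value a bare key receives.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's DecodeAttributes.
sub decode_attributes_sketch {
    my ($string) = @_;
    my %attr;
    # Try key="val", then key='val', then key=val; otherwise accept a bare key.
    while ($string =~ /([\w-]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"']+)))?/g) {
        my ($key, @vals) = ($1, $2, $3, $4);
        my ($val) = grep { defined } @vals;
        $attr{$key} = $val;    # bare keys get undef in this sketch (assumption)
    }
    # Return a hash in list context, a hashref otherwise.
    return wantarray ? %attr : \%attr;
}

my %attr = decode_attributes_sketch(
    q{key1=val1 key2="quoted val2" key3='another quoted value' key4}
);
print "$_ => ", (defined $attr{$_} ? $attr{$_} : "(no value)"), "\n"
    for sort keys %attr;
```

As in the description above, no HTML escape handling is attempted; the sketch only separates keys, values, and quotes.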

Datahash Array Conversion Functions:

columnwise($ref)
Convert a list of datahashes into a hash of datalists.

input: an array of hash references with identical keys.

output: a hash of arrays, indexed by the same keys. For instance, ( { a=>1, b=>2 }, { a=>3, b=>4 }, { a=>5, b=>6 } )

becomes ( a => [1,3,5], b => [2,4,6] )
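The transformation can be sketched in a few lines of plain Perl. The name columnwise_sketch is hypothetical, and returning a plain hash (rather than a hash reference) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's columnwise.
sub columnwise_sketch {
    my ($rows) = @_;    # reference to an array of hash references
    my %columns;
    foreach my $row (@$rows) {
        # Append each value to the list stored under its key.
        push @{ $columns{$_} }, $row->{$_} for keys %$row;
    }
    return %columns;
}

my %cols = columnwise_sketch([ { a=>1, b=>2 }, { a=>3, b=>4 }, { a=>5, b=>6 } ]);
print "$_ => [", join(",", @{ $cols{$_} }), "]\n" for sort keys %cols;
```

With the sample data above, the sketch prints a => [1,3,5] and b => [2,4,6].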

keywise($key,$ref)
Convert a list of datahashes into a hash of datahashes.

input: a hash key to index on, and an array of hash references

output: a hash of hashes, indexed by the value found under that key in each hash. That value is assumed to be unique across all the hashes; if it is not, the last hash with a given value is used.

For example, taking ( { a=>1, b=>2 }, { a=>3, b=>4}, { a=>5, b=>6 } )

and indexing it by key ``a'', gives { 1=>{a=>1,b=>2}, 3=>{a=>3,b=>4}, 5=>{a=>5,b=>6} }.
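A minimal sketch of this indexing, including the "last one wins" behaviour for duplicate values. The name keywise_sketch is hypothetical, and returning a plain hash is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's keywise.
sub keywise_sketch {
    my ($key, $rows) = @_;
    my %index;
    # Later rows overwrite earlier rows that share the same index value.
    $index{ $_->{$key} } = $_ for @$rows;
    return %index;
}

my %by_a = keywise_sketch("a", [ { a=>1, b=>2 }, { a=>3, b=>4 }, { a=>5, b=>6 } ]);
print "$_ => b=$by_a{$_}{b}\n" for sort { $a <=> $b } keys %by_a;
```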

Text Encoding/Decoding Functions

url_escape($text,$escape_chars)

Uses URL-escaping conventions to modify $text by escaping the characters in $escape_chars. If $escape_chars is not given, escapes these characters by default:
    %+ &@"?#
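The escaping rule can be sketched as below. The name url_escape_sketch is hypothetical, and the use of uppercase hexadecimal digits in the escape codes is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's url_escape.
sub url_escape_sketch {
    my ($text, $escape_chars) = @_;
    # Default set taken from the documentation above.
    $escape_chars = q{%+ &@"?#} unless defined $escape_chars;
    # Replace each listed character with %XX, where XX is its hex code.
    $text =~ s/([\Q$escape_chars\E])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

print url_escape_sketch("a b&c"), "\n";     # default character set
print url_escape_sketch("a/b", "/"), "\n";  # caller-supplied character set
```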

url_unescape($text)
Removes URL-escape codes from $text, restoring the plaintext.

html_escape($text,$escape_chars,$use_wierd_amp_esc)
Uses HTML-escaping conventions to modify $text by escaping the characters in $escape_chars. If $escape_chars is not given, the following characters are escaped by default:

    &<>"

$use_wierd_amp_esc is a flag, which if true, causes ``&'' to be escaped with a non-standard escape code ``<!amp>''. This is sometimes necessary to prevent the HTML editor from mangling our escape codes, as some client browsers will rewrite the HTML against our wishes.

html_unescape($text,$escape_chars,$use_wierd_amp_esc)
This reverses the escape codes inserted by html_escape().

safetext($text)
Converts arbitrary data into a string consisting only of the characters a-z, A-Z, 0-9, and _. Such strings are safer for use as variable names or field IDs.

unsafetext($text)
Converts a safe string back to its original representation.

safehtml($html,@tags)

Makes an untrusted HTML string safer to inline into a web page, by disabling all but a limited set of tags, and closing any tags that were left open. By default, it uses a set of allowed tags that are useful for minor text formatting and linking, but disallows everything else, including layout tags (divs, tables, frames, etc.), I/O tags (forms, inputs, etc.), script and style tags, and other advanced features. It allows some compound tags, such as ol/ul, li, but does not check for syntactic correctness in tag nesting. Disabled tags are rendered literally, i.e. the angle brackets are shown. If the default tag list is not suitable, you can pass a list of allowed tag names.

General Text-Processing Functions

substitute($text,$subhash,$startmark,$endmark)
Performs simple text substitutions.

Replaces marker tags such as ``[[tag]]'' with the value in $subhash{"tag"}. Does so for all such tags in the text. Tags may contain word characters only (alphanumeric plus _).

$text
string to perform substitutions on

$subhash
hash of marker keys=>values

$startmark
substring denoting the start of a marker tag (``[['' default)

$endmark
substring denoting the end of a marker tag (``]]'' default)
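The substitution described above can be sketched as follows. The name substitute_sketch is hypothetical. Passing $subhash as a hash reference and replacing unknown tags with an empty string are both assumptions; the source does not say what happens to tags with no matching key.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's substitute.
sub substitute_sketch {
    my ($text, $subhash, $startmark, $endmark) = @_;
    $startmark = "[[" unless defined $startmark;
    $endmark   = "]]" unless defined $endmark;
    # Tags may contain word characters only, hence \w+.
    $text =~ s/\Q$startmark\E(\w+)\Q$endmark\E/
        defined $subhash->{$1} ? $subhash->{$1} : ""   # unknown tags vanish (assumption)
    /gex;
    return $text;
}

print substitute_sketch("Hello [[name]]!", { name => "World" }), "\n";
print substitute_sketch("Hi {name}", { name => "Al" }, "{", "}"), "\n";
```

The second call shows custom start and end markers in place of the defaults.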

html_to_plaintext($html)
Converts HTML to plaintext, simplistically. It does not attempt to preserve any formatting other than linebreaks.

latin1_to_ascii($text)
Converts Latin-1 text (ISO-8859-1 or Windows-1252 encoding) to plain ASCII. Accented characters are converted to unaccented characters; ligatures are broken out into component characters; symbols are converted to nearest lookalike, or to character sequences that are reasonable facsimiles of the original character.

randtext($len,$src)
Generates strings of random text. Generated strings will be of length $len, and composed of characters from $src. Both parameters are optional. Default length is 7-10 characters, and the default source string includes most of the printable ASCII character set.

This is useful as a secure password or passkey generator.

array2text(@data)
Converts a perl array to a text string, similar to how the array would be represented in perl code if all the references were stripped out. The array is encoded using arrayref notation [...]. The result is not guaranteed to be a perfect perl representation of the data -- the normal usage is just to provide a unique linearization of the data for use as a hash key. This is faster than linearizing using Storable or JSON, or hashing using MD5.

hash2text(%data)

Converts a perl hash to a text string, similar to how the hash would be represented in perl code if all the references were stripped out. The hash is encoded using hashref notation {...}. See notes under array2text() above.

Miscellaneous other functions...

MimeType($filename)
Guess the MIME type of a file. This routine uses the filename (in particular, the suffix) to guess the MIME type of the file.

MimeToFile($mimetype)

Guess the file extension for a MIME type. The routine returns a standard/common file extension for files with the given MIME type.

ShowHash()
Display a hash's contents in HTML format. A hash or hash reference may be passed to this routine; a block of HTML is returned. The routine descends into the hash to display its full structure. There is no protection against infinite loops if the hash becomes self-referential.

ShowList()
Display an array's contents in HTML format. The routine descends into the array to display its full structure. There is no protection against infinite loops if the array becomes self-referential.

match_hash($pattern,$hash)
Returns true (1) if the hash matches the pattern. The pattern is a hash of keys/values that must all be present in the hash with equal values; the hash may contain other keys/values, which are ignored.
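A minimal sketch of this matching rule. The name match_hash_sketch is hypothetical, and comparing values as strings (with eq) and returning 0 on a mismatch are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's match_hash.
sub match_hash_sketch {
    my ($pattern, $hash) = @_;
    foreach my $key (keys %$pattern) {
        # Every pattern key must exist in the hash with an equal value.
        return 0 unless exists $hash->{$key};
        return 0 unless $hash->{$key} eq $pattern->{$key};   # string comparison (assumed)
    }
    return 1;    # extra keys in $hash are ignored
}

my $record = { type => "page", status => "active", id => 42 };
print match_hash_sketch({ type => "page" }, $record), "\n";
print match_hash_sketch({ type => "page", status => "draft" }, $record), "\n";
```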

clean_filename($filename)
Modifies a file name to make it safe for storage on all filesystems. This essentially involves removing any special shell metacharacters from the filename.

get_file($opt)
Return contents of a file. $opt is a hash reference, which presently only accepts one key:
imgpath

adds this path value in front of any IMG SRC= attributes found in the file. (Presumably the file is HTML.)

relink(%param)
Generates a recursive link back to same URL, with modified query parameters.

To set a new parameter, give the parameter a value in %param. To clear a parameter, set its value in %param to undef.

relink() returns the new URL to link to.

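To link back to the same URL with no changes, use &relink();, not &relink; (in Perl, calling &relink; without parentheses passes the caller's current @_ to the function).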

See also Modules::BaseDCD::link(), which is better for relinking into a dynamic content module.

sizeof($dat,$max)
Returns the size of $dat, in bytes. If $dat is an array ref, returns the sum of the sizes of the elements. If $dat is a hash ref, returns the sum of the sizes of the keys and the values.

sizeof() stops counting at a size of $max, if given. This avoids wasted cycles if you're just testing that the size is less than $max.
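The counting and early-exit behaviour can be sketched as below. The name sizeof_sketch is hypothetical, and descending into nested references is an assumption. The sketch uses length(), which counts characters; that equals the byte count for the ASCII data used here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not ExSite's sizeof.
sub sizeof_sketch {
    my ($dat, $max) = @_;
    my $size = 0;
    my @todo = ($dat);
    while (@todo) {
        my $item = shift @todo;
        if    (ref $item eq "ARRAY") { push @todo, @$item; }   # sum of elements
        elsif (ref $item eq "HASH")  { push @todo, %$item; }   # keys and values
        elsif (defined $item)        { $size += length $item; }
        last if defined $max && $size >= $max;   # stop counting early
    }
    return $size;
}

print sizeof_sketch("hello"), "\n";
print sizeof_sketch({ key => "value" }), "\n";
print sizeof_sketch([ ("x" x 10) x 100 ], 50), "\n";
```

In the last call the data holds 1000 characters, but counting stops as soon as the running total reaches the limit of 50.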
