Variables, Arrays, and Functions
How It Works
The first thing this plug-in does is check whether $word has a value; if not, it returns a single-element array containing the value FALSE.
After that, two variables are declared as static. The plug-in is built so that it can be called more than once, which is likely if a section of text contains more than one unknown word. For speed, it therefore uses static variables, which retain their values between calls to the function but are not visible outside it (a variable of the same name outside the function is a separate variable). This avoids re-creating large arrays each time the function is called.
The two static variables are $count, which counts the number of times the function has been called, and $words, which contains an array of words. If $count has a value of zero, this is the first time the function has been called, so the contents of the file named in $dictionary are loaded into $dict and, as long as the load was successful, the words are split out into the array $words using the explode() function.
On subsequent calls, $count is greater than zero, so populating $words is unnecessary: it is a static variable and remembers its contents from the previous call. Note that this static value is available each time the function is called again, but it persists only for the duration of a single web request. Each new web request starts with no PHP variables defined.
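The caching behavior described above can be tried in isolation. The following is a minimal sketch, not part of the plug-in: the function name demo_load_words() and its three-word list are made up for illustration, and the list is split from a string rather than read from a file.

```php
<?php
// Hypothetical helper (not from the plug-in): shows how static
// variables keep their values between calls within one request.
function demo_load_words()
{
   static $count = 0;    // Number of calls made so far
   static $words = null; // Word list, built once and then reused

   if ($count++ == 0)
   {
      // This setup runs on the first call only
      $words = explode("\n", "spear\nspend\nspent");
   }

   return array($count, count($words));
}

list($calls1, $size1) = demo_load_words(); // First call builds $words
list($calls2, $size2) = demo_load_words(); // Second call reuses it

echo "Call $calls1 saw $size1 words; call $calls2 saw $size2 words\n";
```

On the second call the word list is still in place even though the code that builds it was skipped, which is the effect the plug-in relies on to avoid reloading its dictionary.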
FIGURE 12-7 The plug-in has chosen three possible spelling corrections for the word spenr.
Next, three arrays are prepared. The first is $possibles, which will contain a large number of made-up words that are similar to $word. Then there is $known, which will contain all the words in $possibles that also exist in the dictionary held in $words; in other words, they are real words even though they were created by an algorithm. Lastly, there's $suggested, which will be populated with all the words the plug-in wishes to return as suggested replacements for $word.
The variable $wordlen is then set to the length of $word, and the array $chars is created from the 26 letters of the alphabet by using str_split() to split up the provided string. Next, a whole collection of made-up words similar to $word has to be placed in $possibles. Four types of new words are created:
1. The set of words similar to $word but with each letter in turn removed
2. The set of words similar to $word but with each letter in turn substituted with another
3. The set of words similar to $word but with each pair of adjacent letters swapped
4. The set of words similar to $word but with a new letter inserted at each position
This is all achieved within separate for and foreach loops. For a word length of five characters, 295 variations will be created; for six, it's 349, and so on. Most of these will be meaningless gibberish, but because (we assume) the user meant to type something meaningful and probably just made a typo, some of them stand a chance of being real words, and could be what the user intended to type.
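The four kinds of variation can be sketched as a stand-alone function. The name demo_variations() is hypothetical; the loops follow the description above. For a word of n letters they produce n deletions, 26n substitutions, n - 1 swaps, and 26(n + 1) insertions, or 54n + 25 entries in all, which matches the figures of 295 and 349 quoted for five and six letters.

```php
<?php
// Hypothetical helper: builds the four kinds of variation described above.
function demo_variations($word)
{
   $possibles = array();
   $wordlen   = strlen($word);
   $chars     = str_split('abcdefghijklmnopqrstuvwxyz');

   for ($j = 0 ; $j < $wordlen ; ++$j)
   {
      // 1. Remove the letter at position $j
      $possibles[] = substr($word, 0, $j) . substr($word, $j + 1);

      // 2. Replace the letter at position $j with each letter a-z
      foreach ($chars as $letter)
         $possibles[] = substr($word, 0, $j) . $letter .
                        substr($word, $j + 1);
   }

   // 3. Swap each pair of adjacent letters
   for ($j = 0 ; $j < $wordlen - 1 ; ++$j)
      $possibles[] = substr($word, 0, $j) . $word[$j + 1] .
                     $word[$j] . substr($word, $j + 2);

   // 4. Insert each letter a-z at every position, including both ends
   for ($j = 0 ; $j < $wordlen + 1 ; ++$j)
      foreach ($chars as $letter)
         $possibles[] = substr($word, 0, $j) . $letter .
                        substr($word, $j);

   return $possibles;
}

echo count(demo_variations('spenr')) . "\n"; // 54 * 5 + 25 = 295
```

The list deliberately contains duplicates. A real word that can be reached by more than one kind of edit appears more than once, and those repeats are what the later counting step uses to rank candidates.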
To extract the good words, the array_intersect() function is called to return all words that exist in both the $possibles and $words arrays. The result is placed in $known, which becomes the set of known real words that could be what the user intended. Next, the duplicate occurrences of words in $known are counted using the array_count_values() function, which returns an array in which each key is a word and each value is the number of times that word appears. This array is then sorted in descending order of count using the arsort() function, so that the words that appeared most frequently come first. That means the most likely candidates will always be at the start of the array, with less and less likely ones further down.
A foreach statement then steps through the elements, placing each key in $temp (and discarding the value in $val), and appends it to the array $suggested.
When the loop completes, $suggested contains the list of words the plug-in thinks the user may have meant, in order of likelihood. A two-element array is then returned: the first element is the number of words returned, and the second is an array containing the words.
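The filtering and ranking steps can be followed on a small hand-made example. The arrays below are invented for illustration and are far smaller than anything the plug-in would generate; the function calls are the same ones named above.

```php
<?php
// Made-up sample data: candidate words (with repeats) and a tiny dictionary
$possibles = array('spend', 'spent', 'xpenr', 'spent',
                   'spear', 'spent', 'spend');
$words     = array('spear', 'spend', 'spent', 'spine');

// Keep only candidates found in the dictionary; repeats are preserved
$known = array_intersect($possibles, $words);

// Count how many times each surviving word occurs: word => count
$counts = array_count_values($known);

// Sort by count, highest first, keeping each word attached to its count
arsort($counts, SORT_NUMERIC);

// Collect just the words, now in order of likelihood
$suggested = array();
foreach ($counts as $temp => $val) $suggested[] = $temp;

print_r(array(count($suggested), $suggested));
```

Here spent occurs three times, spend twice, and spear once, so they are suggested in that order, while xpenr is dropped because it is not in the dictionary.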
How to Use It
When you want to offer alternate spelling suggestions to a user, just call this plug-in with the misspelled word and the path to a file of words, like this:
$word = 'spenr';
echo "Suggested spellings for '$word':<br /><ul>";
$results = PIPHP_SuggestSpelling($word, 'dictionary.txt');
if (!$results[0]) echo "No suggested spellings.";
else foreach ($results[1] as $spelling) echo "<li>$spelling</li>";
You can call the plug-in multiple times and could therefore use drop-down lists inserted within the text at the occurrence of each unrecognized word, or one of many other methods, to offer suggestions for all misspelled words found in a section of text.
Of course, to be truly interactive you ought to rewrite the function in JavaScript, and you could then offer interactive spelling management directly within a web page, but that's a plug-in for another book.
The Plug-in
function PIPHP_SuggestSpelling($word, $dictionary)
{
   if (!strlen($word)) return array(FALSE);

   static $count, $words;

   // Load the dictionary on the first call only
   if ($count++ == 0)
   {
      $dict = @file_get_contents($dictionary);
      if (!strlen($dict)) return array(FALSE);
      // Expects one word per line with \r\n line endings
      $words = explode("\r\n", $dict);
   }

   $possibles = array();
   $known     = array();
   $suggested = array();
   $wordlen   = strlen($word);
   $chars     = str_split('abcdefghijklmnopqrstuvwxyz');

   // Letters removed and letters substituted
   for ($j = 0 ; $j < $wordlen ; ++$j)
   {
      $possibles[] = substr($word, 0, $j) .
                     substr($word, $j + 1);

      foreach ($chars as $letter)
         $possibles[] = substr($word, 0, $j) .
                        $letter .
                        substr($word, $j + 1);
   }

   // Adjacent letter pairs swapped
   for ($j = 0 ; $j < $wordlen - 1 ; ++$j)
      $possibles[] = substr($word, 0, $j) .
                     $word[$j + 1] .
                     $word[$j] .
                     substr($word, $j + 2);

   // New letters inserted
   for ($j = 0 ; $j < $wordlen + 1 ; ++$j)
      foreach ($chars as $letter)
         $possibles[] = substr($word, 0, $j) .
                        $letter .
                        substr($word, $j);

   $known = array_intersect($possibles, $words);
   $known = array_count_values($known);
   arsort($known, SORT_NUMERIC);

   foreach ($known as $temp => $val)
      $suggested[] = $temp;

   return array(count($suggested), $suggested);
}
Google Translate
If there's one thing you can be sure of, it's that your web site attracts visitors from all around the world. So why not translate parts of your site for them? You could even use the first plug-in in this chapter, number 91, Get Country from IP, to determine where a user is from and then offer a translated version of your site accordingly. Figure 12-8 shows this plug-in used to translate the start of the U.S. Declaration of Independence from English into German.
About the Plug-in
This plug-in takes a string of text and converts it from one language to another. Upon success, it returns the translated text. On failure, it returns FALSE. It requires the following arguments:
• $text Text to be translated
• $lang1 Source language
• $lang2 Destination language
FIGURE 12-8 Translating the start of the U.S. Declaration of Independence into German
The source and destination languages must each be one of the following:
How It Works
This plug-in makes use of a Google API. It starts by creating the associative array $langs, in which each of the supported languages is a key whose value is the identifier Google uses to represent that language.
Next, the two arguments $lang1 and $lang2 are converted to all-lowercase strings using the strtolower() function; the root of the API URL is defined in $root; and the URL to call, based on $root, is created in $url.
Before proceeding, the values associated with the keys $lang1 and $lang2 in the array $langs are looked up. If either one is not set, as determined by the isset() function, then an unknown language was requested and so FALSE is returned.
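The lookup can be sketched as follows. The function name demo_lang_code() and the three entries in $langs are assumptions made for illustration; the plug-in's own table of supported languages and identifiers is not reproduced here.

```php
<?php
// Hypothetical helper: maps a language name to an identifier,
// or returns FALSE if the language is not in the table.
function demo_lang_code($name)
{
   // Illustrative entries only, not the plug-in's actual table
   $langs = array('english' => 'en',
                  'french'  => 'fr',
                  'german'  => 'de');

   $name = strtolower($name); // Make the lookup case-insensitive

   if (!isset($langs[$name])) return FALSE;

   return $langs[$name];
}

var_dump(demo_lang_code('German'));
var_dump(demo_lang_code('Klingon'));
```

Lowercasing the argument first means callers can pass German, german, or GERMAN and get the same result, while anything missing from the table is rejected before a request is ever sent.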
Next, the call to Google's API is made using the file_get_contents() function with an argument comprising $url, the contents of $text after encoding in URL format using the urlencode() function, and the pair of language identifiers separated by a %7C character. The @ before the function call suppresses any unwanted error messages.
If the call is unsuccessful, then the returned value in $json will be empty and so FALSE is returned. Otherwise, $json now contains a JSON (JavaScript Object Notation) format string returned by the Google API, which is then parsed using the json_decode() function, the result of which is placed in the object $result.
The translated text is then extracted from $result and returned.
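The request and response handling can be sketched without contacting any server. Everything specific below is an assumption made for illustration: the address in $root is a placeholder, the parameter names q and langpair are invented, and the JSON layout with responseData and translatedText is a guess at the kind of structure such an API might return. Only urlencode(), json_decode(), and isset() are used as described above; %7C is simply the URL-encoded form of the | character.

```php
<?php
// Placeholder endpoint and parameter names, NOT the real API address
$root = 'http://api.example.com/translate';
$text = 'Hello world';
$url  = $root . '?q=' . urlencode($text) . '&langpair=en%7Cde';

// Hypothetical helper: pulls the translation out of a JSON reply,
// returning FALSE if the reply is empty, invalid, or lacks the field.
function demo_extract_translation($json)
{
   if (!strlen($json)) return FALSE;

   $result = json_decode($json);

   // Assumed layout: {"responseData": {"translatedText": "..."}}
   if (!isset($result->responseData->translatedText)) return FALSE;

   return $result->responseData->translatedText;
}

// A hand-written sample reply standing in for a real response
$sample = '{"responseData": {"translatedText": "Hallo Welt"}}';

echo $url . "\n";
echo demo_extract_translation($sample) . "\n";
```

Checking the decoded object with isset() before reading from it means a malformed or unexpected reply produces FALSE rather than a PHP notice.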
How to Use It
Translating text with this plug-in is as simple as passing it the original text, the language that text was written in, and a language to which the text should be translated, like this:
echo '<html><head><meta http-equiv="Content-Type" ' .
     'content="text/html; charset=utf-8" /></head><body>';
$text = "We hold these truths to be self-evident, that all " .
        "men are created equal, that they are endowed by " .
        "their creator with certain unalienable rights, that " .
        "among these are life, liberty and the pursuit of " .