I am working with someone else's code. His explanation has not worked and he has not responded in over 7 weeks, so I need to ask here. I am a total newbie with a limited understanding of coding.
Breaking it apart: this is a Drupal module for pulling information from different sources, one being Wikipedia.
This is the portion of the code that pulls the raw wikitext:
function createfromweb_operator_wikipedia_execute($query) {
$result = array();
unset($_SESSION['createfromweb']['operator']);
$gurl = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
$gurl.= urlencode('site:wikipedia.org/wiki/ '.$query); // intitle, quotes, ... ?
$gjson = file_get_contents($gurl);
$_SESSION['createfromweb']['operator']['result1'] = $gjson;
$gres = json_decode($gjson);
$url = (string)$gres->responseData->results[0]->unescapedUrl; // get first google match
$rawurl = str_replace(".wikipedia.org/wiki/", ".wikipedia.org/w/index.php?title=", $url);
$rawurl.= "&action=raw&section=0";
The instructions for pulling the entire page from Wikipedia say to change the last line to:
$rawurl.="&printable=yes";
This does not work. I have also tried the following:
$rawurl.= "&action=raw";
$rawurl.= "&action&printable=yes";
All three of these modifications return a blank. The value section=0 tells the code to pull the first section of a Wikipedia page, and by manually changing that number I can move down the page, but it never captures more than that one section. I also tried making the value a variable or a range, but that broke the pull as well and returned a blank.
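To show what I mean by the section value, here is a minimal sketch of how I understand the URL is put together. The function name sketch_build_raw_url is my own and is not part of the module, and I am assuming that leaving the section parameter off is what should request the whole page.

```php
<?php
// Hypothetical helper (not part of the module): turns an article URL into a
// raw-wikitext URL. Pass a section number to limit the request to that one
// section, or leave it as null to omit the section parameter entirely.
function sketch_build_raw_url($url, $section = null) {
  // Same substitution the module performs on the first search match.
  $rawurl = str_replace(".wikipedia.org/wiki/", ".wikipedia.org/w/index.php?title=", $url);
  $rawurl .= "&action=raw";
  if ($section !== null) {
    $rawurl .= "&section=" . (int) $section;
  }
  return $rawurl;
}

// First section only, which is what the original line requests.
echo sketch_build_raw_url("http://en.wikipedia.org/wiki/Paris", 0), "\n";
// No section parameter (assumption: this should ask for the whole page).
echo sketch_build_raw_url("http://en.wikipedia.org/wiki/Paris"), "\n";
```

Pasting the two printed URLs into a browser is a quick way to check whether the blank comes from the request itself or from the processing that happens afterwards.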
The code then goes through some strip and replace functions, ending with the creation of the new page:
$result['title'] = $title;
$result['body'] = trim($body);
With the instructions for &printable=yes, the $result['body'] = trim($body); line is supposed to be changed to
$result['body']
This just returns an error on the next line of code.
I tried this assignment instead, but all it does is sidestep the trim() call and drop the limited raw text into the page:
$result['body'] = $body;
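For reference, the strip and replace steps mentioned above boil down to roughly the following. This is only a sketch: it reuses some of the patterns from the module on a made-up sample string, and the function name sketch_clean_wikitext is my own, not something the module defines.

```php
<?php
// Hypothetical helper (not part of the module): applies a subset of the
// module's replacements to a piece of wiki markup and returns the body.
function sketch_clean_wikitext($raw) {
  $raw = preg_replace("|<!--(.*?)-->|", "", $raw);               // drop HTML comments
  $raw = preg_replace("/\[\[(.*?)(\|(.*?))?\]\]/", "$1", $raw); // [[Target|label]] becomes Target
  $body = preg_replace('/{{.*?}}/ms', '', $raw);                 // drop {{templates}}
  $body = preg_replace("/'''([^']*?)'''/", "<strong>$1</strong>", $body); // bold
  $body = preg_replace("/''([^']*?)''/", "<em>$1</em>", $body);           // italics
  return trim($body);                                            // strip surrounding whitespace
}

// Made-up sample input, not taken from a real article.
$sample = "'''Paris''' is the capital of [[France|the French Republic]].{{citation needed}} <!-- note -->";
echo sketch_clean_wikitext($sample), "\n";
// prints: <strong>Paris</strong> is the capital of France.
```

As far as I can tell, trim() only removes whitespace from the two ends of the body, so I would not expect it to be the reason only one section comes back.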
Thoughts would be appreciated; I really need some guidance on this one.
Thanks for any help.
FULL CODE
<?php
// $Id: operator_wikipedia.inc,v 1.2.4.2 2009/02/17 13:38:44 brevity Exp $
/* TODO FormAPI image buttons are now supported.
FormAPI now offers the 'image_button' element type, allowing developers to
use icons or other custom images in place of traditional HTML submit buttons.
$form['my_image_button'] = array(
'#type' => 'image_button',
'#title' => t('My button'),
'#return_value' => 'my_data',
'#src' => 'my/image/path.jpg',
); */
/* TODO New user_mail_tokens() method may be useful.
user.module now provides a user_mail_tokens() function to return an array
of the tokens available for the email notification messages it sends when
accounts are created, activated, blocked, etc. Contributed modules that
wish to make use of the same tokens for their own needs are encouraged
to use this function. */
/* TODO
There is a new hook_watchdog in core. This means that contributed modules
can implement hook_watchdog to log Drupal events to custom destinations.
Two core modules are included, dblog.module (formerly known as watchdog.module),
and syslog.module. Other modules in contrib include an emaillog.module,
included in the logging_alerts module. See syslog or emaillog for an
example on how to implement hook_watchdog.
function example_watchdog($log = array()) {
if ($log['severity'] == WATCHDOG_ALERT) {
mysms_send($log['user']->uid,
$log['type'],
$log['message'],
$log['variables'],
$log['severity'],
$log['referer'],
$log['ip'],
format_date($log['timestamp']));
}
} */
/* TODO Implement the hook_theme registry. Combine all theme registry entries
into one hook_theme function in each corresponding module file.
function operator_wikipedia_theme() {
return array(
);
} */
function createfromweb_operator_wikipedia_name() {
return "some content from wikipedia article";
}
function createfromweb_operator_wikipedia_execute($query) {
$result = array();
unset($_SESSION['createfromweb']['operator']);
$gurl = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
$gurl.= urlencode('site:wikipedia.org/wiki/ '.$query); // intitle, quotes, ... ?
$gjson = file_get_contents($gurl);
$_SESSION['createfromweb']['operator']['result1'] = $gjson;
$gres = json_decode($gjson);
$url = (string)$gres->responseData->results[0]->unescapedUrl; // get first google match
$rawurl = str_replace(".wikipedia.org/wiki/", ".wikipedia.org/w/index.php?title=", $url);
/**$rawurl.= "&action=raw&section=0";*/
$rawurl.= "&action=raw";
/**$rawurl.= "&action=raw";*/
$title = trim(preg_replace("#.*?wikipedia.org/wiki/|_#", " ", $url));
drupal_set_message("getting: ".$url); // " -> ".$rawurl);
$raw = (file_get_contents($rawurl)) or drupal_set_message("could not retrieve wikidata", ERROR);
#$_SESSION['createfromweb']['operator']['result2'] = $raw;
$raw = preg_replace("|<!--(.*?)-->|","",$raw); // no html comments
$raw = preg_replace("/\[\[(.*?)(\|(.*?))?\]\]/", "$1", $raw); // no wiki/alias links
$raw = preg_replace("/\[(.*?) (.*?)\]/", "$1", $raw); // only http urls
preg_match_all('/\{((?>[^{}]+)|(?R))*\}/x', $raw, $boxes);
foreach($boxes[1] as $box) {
if (preg_match('/\{Infobox/', $box)) {
$infobox = $box;
break;
}
}
#$_SESSION['createfromweb']['operator']['infobox'] = $infobox;
$body = preg_replace('/{{.*?}}/ms','',$raw); //rest w/out boxes
$body = preg_replace("/'''([^']*?)'''/", "<strong>$1</strong>", $body);
$body = preg_replace("/''([^']*?)''/", "<em>$1</em>", $body);
$infobox = preg_replace("|<br ?/?>|","; ", $infobox); // html linebreaks to semicolons
$infobox = strip_tags($infobox);
$result = getBoxProperties($infobox);
$result['title'] = $title;
/**$result['body'] = trim($body);*/
$result['body'] = $body;
$_SESSION['createfromweb']['operator']['result'] = $result;
//$result = array($result);
return $result;
}
/**
* http://code.google.com/p/linuxpedia/
* Retrieves properties defined in an infobox as an associative array
*
* @param $box Infobox code
* @param $toLower Whether to convert all predicate keys to lowercase
* @return Associative array with predicates as keys
*/
function getBoxProperties($box, $toLower = false) {
/* Remove outside curly brackets */
$box = substr($box, 1, strlen($box) - 2);
/* Remove HTML comments */
$box = preg_replace('/<\!--[^>]*->/mU', '', $box);
/* Split triples; ignoring triples in subtemplates */
$triples = preg_split('/\| (?! [^{]*\}\} | [^[]*\]\] )/x',$box);
$a = array();
foreach ($triples as $triple) {
$predObj = explode('=', $triple, 2); // explode() replaces split(), which was removed in PHP 7
if (count($predObj) == 2 && ($pred = trim($predObj[0])) != "" && ($obj = trim($predObj[1])) != "")
{
$key = ($toLower ? strtolower($pred) : $pred);
$a[$key] = $obj;
}
}
return $a;
}
?>