I am trying to parse the meta tags (description, keywords) out of $Data. I have the HTML of the website in $Data, and I would like to put the contents of those meta tags in their own variables, such as $Description and $Keywords. This is as far as I have got, but I can't seem to figure out how to get the keywords into $Keywords and the description into $Description. Please keep in mind that the meta tags may not always appear in this order.

Does anyone know how this can be done?


$Data = "<meta name=\"keywords\" content=\"php, mysql, php templates, apache, php
manual, server, pdf, database, flash, phpbuilder, content management
system, sql, script, oracle, string, xml, regular expressions, php5,
webalizer, php tutorials, code, nusoap, classes, developers\"><meta name=\"description\" content=\"some description about php and other stuff.\">";

preg_match_all("/<meta.+?content\s*=\s*([\"']?)(.+?)\\1.*?>/is", $Data,$Matches);

print_r ($Matches);

    Function:

    function meta_content($str, $name)
    {
        $meta_str = '<meta name="' . $name . '"';
        $cnt_str = 'content="';
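        if (($meta_str_pos = strpos($str, $meta_str)) !== false) {
            // search for content=" only from the start of the matching tag
            $cnt_str_pos = strpos($str, $cnt_str, $meta_str_pos);
            if ($cnt_str_pos === false) {
                return 'n/a'; // tag found, but no content attribute after it
            }
            $cnt_pos = $cnt_str_pos + strlen($cnt_str);
            $quote_pos = strpos($str, '"', $cnt_pos);
            if ($quote_pos === false) {
                return 'n/a'; // closing quote missing
            }
            $content = substr($str, $cnt_pos, $quote_pos - $cnt_pos);
            return $content;
        } else {
            return 'n/a';
        }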
    }

    Usage:

    $data = '<meta name="keywords" content="php, mysql, php templates, 
             apache, php manual, server, pdf, database, flash, phpbuilder, 
             content management system, sql, script, oracle, string, xml, 
             regular expressions, php5, webalizer, php tutorials, code, 
             nusoap, classes, developers"><meta name="description" 
             content="some description about php and other stuff.">';
    $keywords = meta_content($data, 'keywords');
    echo $keywords . '<br />';
    $description = meta_content($data, 'description');
    echo $description . '<br />';

    Output:

    php, mysql, php templates, apache, php manual, server, pdf, database, 
    flash, phpbuilder, content management system, sql, script, oracle, 
    string, xml, regular expressions, php5, webalizer, php tutorials, code, 
    nusoap, classes, developers
    some description about php and other stuff.
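
Since the question says the order may vary, here is a rough sketch of a different technique: letting PHP's DOM extension parse the tags instead of searching the string. This is not the function above. `meta_map` is a made-up name, and the sketch assumes the DOM extension is enabled (it usually is). It should not care whether `name` or `content` comes first in a tag.

```php
<?php
// Hypothetical helper (not from the thread): maps each meta name to its content.
function meta_map($html)
{
    // Collect parser warnings internally so sloppy HTML does not print notices
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $map = array();
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        // Lower-case the name so "Keywords" and "keywords" land on the same key
        $name = strtolower(trim($tag->getAttribute('name')));
        if ($name !== '') {
            $map[$name] = trim($tag->getAttribute('content'));
        }
    }
    return $map;
}

// Shortened sample; the second tag has its attributes reversed on purpose
$data = '<meta name="keywords" content="php, mysql, apache">'
      . '<meta content="some description about php and other stuff." name="description">';

$meta = meta_map($data);
$keywords = isset($meta['keywords']) ? $meta['keywords'] : 'n/a';
$description = isset($meta['description']) ? $meta['description'] : 'n/a';
echo $keywords . "\n" . $description . "\n";
```

The `'n/a'` fallback just mirrors the function above; swap in whatever default suits you.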

      Hi all,

      Just finished this reply so you might as well have it.

      There are probably more slick ways of doing this but I'd do it in stages ...

      <?php
      $Data = "<meta name=\"keywords\" content=\"php, mysql, php templates, 
      apache, php manual, server, pdf, database, flash, phpbuilder, content management 
      system, sql, script, oracle, string, xml, regular expressions, php5, webalizer, php tutorials, code, nusoap, classes, developers\"><meta name=\"description\" content=\"some description about php and other stuff.\">"; 
      
      // Split up the tags first
      preg_match_all('/<meta([^>]*)\>/', $Data, $Matches);
      
      // Only need 1st parenthesis in regex (equiv. to \1 or $1)
      $Matches = $Matches[1];
      $meta = array(); // This will hold all the results
      
      // For each match, there will be an array of attributes
      for($i = 0; $i < count($Matches); $i++){
      	$meta[$i] = array();
      	// Match the attribute name and value
      	preg_match_all('/([^=]*)="([^"]*)"/', $Matches[$i], $name_values);

      	// For each attribute, get name and value arrays
      	$names = $name_values[1]; 	// 1st parenthesis in regex
      	$values = $name_values[2];	// 2nd parenthesis in regex
      	for($j = 0; $j < count($names); $j++){
      		$meta[$i][trim($names[$j])] = trim($values[$j]);
      	}
      }
      echo '<pre>';	print_r($meta);	echo '</pre>';
      ?>

      Paul 🙂

        $data = '<meta name="keywords" content="php, mysql, php templates, 
                 apache, php manual, server, pdf, database, flash, phpbuilder, 
                 content management system, sql, script, oracle, string, xml, 
                 regular expressions, php5, webalizer, php tutorials, code, 
                 nusoap, classes, developers"><meta name="description" 
                 content="some description about php and other stuff.">';
        
        // captures required values and variable names         
        preg_match_all('#<meta name="([^"]*)"[^"]*"([^"]*)#', $data, $out);

        // stores values in variables $keywords and $description
        // or any other meta name found.
        foreach ($out[1] as $k=>$v) $$v=$out[2][$k];
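
One thing to watch with the variable-variable approach: whatever names the page happens to use become variable names in your script. A possible variation, sketched here with the same pattern and a shortened sample string, is to pair names and contents in a single array with `array_combine` and read the two values you want from it. The `$meta` array and the `'n/a'` default are my own choices, not part of the snippet above.

```php
<?php
$data = '<meta name="keywords" content="php, mysql, apache"><meta name="description"
         content="some description about php and other stuff.">';

// Same pattern as the snippet above: group 1 is the name, group 2 the content
preg_match_all('#<meta name="([^"]*)"[^"]*"([^"]*)#', $data, $out);

// Pair each name with its content; nothing from the page becomes a variable name
$meta = $out[1] ? array_combine($out[1], $out[2]) : array();

$keywords = isset($meta['keywords']) ? $meta['keywords'] : 'n/a';
$description = isset($meta['description']) ? $meta['description'] : 'n/a';
echo $keywords . "\n" . $description . "\n";
```

Like the original pattern, this still expects `name` to come before `content` inside each tag.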