I've been going through a bunch of tutorials and looking at posts about regex, but I can't seem to figure out how to use it. Any help would be greatly appreciated.

I'm trying to extract some text from a string, here's an example of the string:

$parse_string = '[{program:"30 Year Fixed $417,000",data:["4.750%","4.808%"]},{program:"15 Year Fixed $417,000",data:["4.125%","4.290%"]}]';

This is what I'm trying with no luck:

$programs = preg_split('/^program:\"[A-Za-z0-9]"$/', $parse_value, -1, PREG_SPLIT_OFFSET_CAPTURE);

What I ultimately want is to break the string into variables so that I'd have an array of the program titles and then a multidimensional array for the data.

Thanks in advance for any help, or if you've found a good tutorial, I'd love to read it.

Thanks!

    First problem I see is that you are anchoring your regexp to the beginning ("^") and end ("$") of the string, so you need to remove those anchor characters. Also, you don't account for spaces in the contents between the quotes. A simpler way might be to just match on any non-quote:

    '/program:"[^"]*"/'
    

    (Quoting the regexp in single quotes will then allow you to use double quotes within it without having to escape them.)
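
    As a rough sketch of the difference (the shortened $sample string and the variable names are made up for illustration, and a capture group is added so the title can be read back), compare an anchored pattern with an unanchored one that allows one or more non-quote characters:

```php
<?php
// Shortened sample of the string from the question. Single quotes mean the
// inner double quotes and the "$" need no escaping.
$sample = '[{program:"30 Year Fixed $417,000",data:["4.750%","4.808%"]}]';

// Anchored with ^ and $: the entire string would have to be exactly
// program:"x", so nothing matches.
$anchored = preg_match('/^program:"[A-Za-z0-9]"$/', $sample);

// Unanchored, with one or more non-quote characters between the quotes.
$unanchored = preg_match('/program:"([^"]+)"/', $sample, $m);

// preg_split() with a pattern that never matches hands back the whole
// string as a single piece.
$pieces = preg_split('/^program:"[A-Za-z0-9]"$/', $sample);

var_dump($anchored);      // int(0)
var_dump($unanchored);    // int(1)
echo $m[1], "\n";         // 30 Year Fixed $417,000
var_dump(count($pieces)); // int(1)
```

    Note that preg_split() throws away whatever the pattern matches, so for pulling the titles out, preg_match() or preg_match_all() with a capture group is probably the better fit.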

      Where's the text coming from? It looks almost – but not quite – like JSON. If it was JSON then json_decode could be used. But for it to be JSON, "program" and "data" would need to be quoted.

      (If it's supposed to be JSON but you can't fix it, then

      json_decode(preg_replace('/program|data/', '"$1"', $parse_string));

      would turn it into a PHP structure; that assumes that (a) "program" and "data" only appear as member keys, and (b) only "program" and "data" appear as member keys).

        Thanks for the replies.

        NogDog... I tried plugging that line in, and now I'm just getting the whole string. I'm not sure what's wrong.

        Weedpacket... I really don't know anything about JSON, so I can't say for sure that it isn't. But what I'm doing is pulling mortgage rates from a table using:

        $rates = file_get_contents("http://www.loansifter.com/rates2v2.aspx?uid=32828");

        I found that there's a JavaScript variable (I'll post the whole string at the bottom of this post) on that site that sets up the table, so I'm trying to extract the variable to use in PHP. I figure that I could use a regex command to extract what I need, but I just can't get my head around regex. Every time I think I've got it, it doesn't work... very frustrating!

        Thanks again for the replies... any other suggestions would be greatly appreciated.

        Here's the full section of code that I'm trying to work with:

        <script type="text/javascript">
        var data=[{program:"30 Year Fixed $417,000",data:["4.750%","4.794%","0.08","360","$2175","$3189","CitiMortgage Correspondent","4.625%","4.710%","0.55","360","$2143","$3189","CitiMortgage Correspondent","4.500%","4.633%","1.13","360","$2112","$3189","CitiMortgage Correspondent"]},
        {program:"15 Year Fixed $417,000",data:["4.125%","4.272%","0.56","180","$3110","$3189","CitiMortgage Correspondent","4.000%","4.241%","1.20","180","$3084","$3189","CitiMortgage Correspondent","3.875%","4.215%","1.89","180","$3058","$3189","CitiMortgage Correspondent","3.750%","4.245%","2.98","180","$3032","$3189","CitiMortgage Correspondent"]},
        {program:"30 Year Fixed $729,750",data:["5.000%","5.042%","0.15","360","$2952","$3189","CitiMortgage Correspondent","4.875%","4.961%","0.66","360","$2910","$3189","CitiMortgage Correspondent","4.750%","4.890%","1.29","360","$2869","$3189","CitiMortgage Correspondent","4.625%","4.846%","2.26","360","$2827","$3189","CitiMortgage Correspondent","4.500%","4.743%","2.54","360","$2786","$3189","CitiMortgage Correspondent","4.375%","4.653%","2.99","360","$2746","$3189","CitiMortgage Correspondent"]},
        {program:"20 Year Fixed $417,000",data:["4.500%","4.592%","0.32","240","$2530","$3189","CitiMortgage Correspondent","4.375%","4.561%","1.14","240","$2503","$3189","CitiMortgage Correspondent","4.250%","4.525%","1.94","240","$2476","$3189","CitiMortgage Correspondent"]},{program:"10 Year Fixed $417,000",data:["4.125%","4.267%","0.21","120","$4073","$3189","CitiMortgage Correspondent","4.000%","4.334%","1.12","120","$4049","$3189","CitiMortgage Correspondent","3.875%","4.401%","2.04","120","$4026","$3189","CitiMortgage Correspondent","3.750%","4.504%","3.15","120","$4002","$3189","CitiMortgage Correspondent"]}]
        </script>
        

          If I understand correctly (based on your initial pattern), you wish to capture the contents of program:" ... ":

          Example:

          $html = <<<EOF
          <script type="text/javascript">
          var data=[{program:"30 Year Fixed $417,000",data:["4.750%","4.794%","0.08","360","$2175","$3189","CitiMortgage Correspondent","4.625%","4.710%","0.55","360","$2143","$3189","CitiMortgage Correspondent","4.500%","4.633%","1.13","360","$2112","$3189","CitiMortgage Correspondent"]},
          {program:"15 Year Fixed $417,000",data:["4.125%","4.272%","0.56","180","$3110","$3189","CitiMortgage Correspondent","4.000%","4.241%","1.20","180","$3084","$3189","CitiMortgage Correspondent","3.875%","4.215%","1.89","180","$3058","$3189","CitiMortgage Correspondent","3.750%","4.245%","2.98","180","$3032","$3189","CitiMortgage Correspondent"]},
          {program:"30 Year Fixed $729,750",data:["5.000%","5.042%","0.15","360","$2952","$3189","CitiMortgage Correspondent","4.875%","4.961%","0.66","360","$2910","$3189","CitiMortgage Correspondent","4.750%","4.890%","1.29","360","$2869","$3189","CitiMortgage Correspondent","4.625%","4.846%","2.26","360","$2827","$3189","CitiMortgage Correspondent","4.500%","4.743%","2.54","360","$2786","$3189","CitiMortgage Correspondent","4.375%","4.653%","2.99","360","$2746","$3189","CitiMortgage Correspondent"]},
          {program:"20 Year Fixed $417,000",data:["4.500%","4.592%","0.32","240","$2530","$3189","CitiMortgage Correspondent","4.375%","4.561%","1.14","240","$2503","$3189","CitiMortgage Correspondent","4.250%","4.525%","1.94","240","$2476","$3189","CitiMortgage Correspondent"]},{program:"10 Year Fixed $417,000",data:["4.125%","4.267%","0.21","120","$4073","$3189","CitiMortgage Correspondent","4.000%","4.334%","1.12","120","$4049","$3189","CitiMortgage Correspondent","3.875%","4.401%","2.04","120","$4026","$3189","CitiMortgage Correspondent","3.750%","4.504%","3.15","120","$4002","$3189","CitiMortgage Correspondent"]}]
          </script>
          EOF;
          
          preg_match_all('#program:"([^"]+)"#', $html, $matches);
          echo '<pre>'.print_r($matches[1], true);
          

          Output:

          Array
          (
              [0] => 30 Year Fixed $417,000
              [1] => 15 Year Fixed $417,000
              [2] => 30 Year Fixed $729,750
              [3] => 20 Year Fixed $417,000
              [4] => 10 Year Fixed $417,000
          )
          

          Is that along the lines you are looking for?

            woodeye wrote:

            Here's the full section of code that I'm trying to work with:

            It does look like almost-JSON, then; so the second paragraph of my earlier reply applies.

              Thanks again for the help. Here's what I'm finding now...
              Weedpacket... I'm not sure if I'm using the json_decode correctly, here's what I tried:

              $rates = file_get_contents("http://www.loansifter.com/rates2v2.aspx?uid=32828");
              
              $start = strpos($rates, "var");
              $stop = strpos($rates, '</script>');
              
              $parse_value = substr($rates, $start + 10, ($stop - $start - 11));
              
              $data = json_decode(preg_replace('/program|data/', '"$1"', $parse_value));	
              var_dump($data);
              

              All I get is a NULL value.

              nrg_alpha - That works perfectly to get the program values, but now I'm totally stumped on how to extract the data variable into a multi-dimensional array. I was thinking that I should first extract the sections between the { }'s and then parse out the data variable into an array, but I can't even figure out how to extract the sections between the {}'s. I tried '#{ }#' but that didn't seem to work.

              Am I reading your regex correctly?

              Here's the regex: #program:"([^"]+)"#

              I think the #'s are the delimiters, the program:" is the search value, the ()'s define the area to capture, the [^"]+ tells it to catch everything except quote marks. But what is the + for?

              So if I'm right, then shouldn't this regex return something? #data:[([^"])]#

              All I get is this error:
              "Warning: preg_match_all() [function.preg-match-all]: Compilation failed: unmatched parentheses at offset 11 in /website/parse/check.php on line 17"

              Thanks again for all the help.
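
              A note on that error, offered as a sketch rather than a definitive answer: an unescaped [ opens a character class, so in #data:[([^"])]# the first ] closes the class early and the ) that follows has no partner. Escaping the literal brackets as \[ and \] avoids this. The + means "one or more of the preceding item" (and * means "zero or more"). The shortened $sample and the $rows variable below are made up for illustration, and the pattern assumes no value contains a ] character:

```php
<?php
// Shortened sample in the same shape as the page's "var data=..." value.
$sample = '[{program:"30 Year Fixed $417,000",data:["4.750%","4.794%"]},'
        . '{program:"15 Year Fixed $417,000",data:["4.125%","4.272%"]}]';

// \[ and \] match literal brackets; ([^\]]*) captures everything up to the
// closing bracket of each data list.
preg_match_all('#data:\[([^\]]*)\]#', $sample, $matches);

$rows = array();
foreach ($matches[1] as $list) {
    // $list looks like "4.750%","4.794%": pull out each quoted value.
    preg_match_all('#"([^"]*)"#', $list, $values);
    $rows[] = $values[1];
}

print_r($rows);
```

              This should give one inner array per program, each holding that program's data values as plain strings.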

                Typo on my part; that should be a $0.

                Also, you're chopping off the '['...']' on the ends, which breaks it. So the substr() needs to be adjusted to retain them.
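
                Put together, a minimal sketch might look like the following. The shortened $parse_value is made up for illustration, and the lookarounds in the pattern are an extra precaution that isn't part of the original suggestion: they limit the replacement to "program" and "data" where they sit directly after { or , and directly before a colon.

```php
<?php
// Shortened sample; the outer [ ... ] is kept so the result is a JSON array.
$parse_value = '[{program:"30 Year Fixed $417,000",data:["4.750%","4.794%"]},'
             . '{program:"15 Year Fixed $417,000",data:["4.125%","4.272%"]}]';

// Wrap the bare keys in double quotes. In the replacement, $0 stands for the
// whole match ($1 would be empty here because it refers to a capture group).
$json = preg_replace('/(?<=[{,])(?:program|data)(?=:)/', '"$0"', $parse_value);

// Passing true returns associative arrays instead of stdClass objects.
$data = json_decode($json, true);

echo $data[0]['program'], "\n"; // 30 Year Fixed $417,000
echo $data[1]['data'][0], "\n"; // 4.125%
```

                If json_decode() returns NULL, echoing $json is a quick way to see whether the brackets survived and the keys were quoted.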

                After all that, array_chunk can be used to group elements in $data.
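
                As a sketch of the grouping step: judging by the sample, each "data" list seems to repeat in runs of seven values, ending with the lender name. That row length is an assumption read off the posted data, not something the site documents.

```php
<?php
// One program's "data" list as it would look after json_decode(..., true).
// Values are copied from the first program in the posted sample.
$data = array(
    '4.750%', '4.794%', '0.08', '360', '$2175', '$3189', 'CitiMortgage Correspondent',
    '4.625%', '4.710%', '0.55', '360', '$2143', '$3189', 'CitiMortgage Correspondent',
    '4.500%', '4.633%', '1.13', '360', '$2112', '$3189', 'CitiMortgage Correspondent',
);

// Assumption: every row is exactly 7 values long.
$rows = array_chunk($data, 7);

echo count($rows), "\n"; // 3
echo $rows[1][0], "\n";  // 4.625%
echo $rows[2][6], "\n";  // CitiMortgage Correspondent
```

                With the decoded structure, the same call would be applied to each program's 'data' element in turn.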

                  Thanks Weedpacket! That works!

                  I couldn't figure out how to make array_chunk work the way that I needed based on the results I was getting, so here's what I came up with. It probably isn't as elegant as it could be, but it gives me the results that I needed.

                  Thanks again!

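                  // Fetch the rates page.
                  $rates = file_get_contents("http://www.loansifter.com/rates2v2.aspx?uid=32828");
                  
                  // Locate the JavaScript "var data=..." assignment and the end of its script block.
                  $start = strpos($rates, "var");
                  $stop = strpos($rates, '</script>');
                  
                  // Skip past "var data=" (9 characters) so the value starts at the opening "[".
                  $parse_value = substr($rates, $start + 9, ($stop - $start - 9));
                  
                  // Quote the bare keys so the value is valid JSON, then decode it.
                  $data = json_decode(preg_replace('/program|data/', '"$0"', $parse_value));
                  
                  $final_array = array();
                  $counter = 0;
                  
                  // Copy each decoded object into a plain nested array.
                  foreach ($data as $value) {
                      foreach ($value as $key => $val) {
                          $final_array[$counter][$key] = $val;
                      }
                      $counter++;
                  }
                  print "<pre>";
                  print_r($final_array);
                  print "</pre>";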
                  