Consider this:
$data = <<<HTML
<td class="td1"><span class="one">testing...</span></td>
<td class="td3"><span class="whatever">testing again...</span></td>
<td class="td2"><span style="color:#123456">Sometext 90:80</span></td>
<td class="td3"><span style="color:#123656">SomeMoreText 17:34</span></td>
<td class="td2"><span style="color:#123456">SomeMoreText 65:01</span></td>
<td class="td3"><span style="color:#123656">SomeMoreText 02:01</span></td>
<td class="td2"><span style="color:#123456">Just a test:90</span></td>
<td class="td3"><span style="color:#123656">A simple test: expecting a simple result.</span></td>
HTML;
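$dom = new DOMDocument;
// The @ hides the warnings libxml raises for this loose HTML fragment.
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
// Spans under td.td2 or td.td3 whose text contains a colon anywhere.
$td = $xpath->query('//td[@class="td2" or @class="td3"]/span[contains(.,":")]');
foreach ($td as $val) {
    echo $val->nodeValue . "<br />\n";
}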
The idea here is to use predicates to fine-tune what I select. As written, the query takes the text of every span that contains a ":" and is a child of a td whose class is either "td2" or "td3". That fetches more than I want: it also returns "Just a test:90" and "A simple test: expecting a simple result." What I am trying to do is write a predicate that only fetches spans whose text contains [0-9]+:[0-9]+ (one or more digits on both sides of the colon), so that the returned entries are precise and need no additional screening.
It seems that putting a regex into the contains() function doesn't cut it. So I looked further into XPath and regex on Google and came across the function matches(), but I am having trouble getting that to work. Does anyone know how to write a predicate whose search criteria involve matching unknown values (such as \d+:\d+)?
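A possible direction, offered as a sketch rather than a confirmed fix: matches() belongs to XPath 2.0, while PHP's DOMXPath is built on libxml2, which implements XPath 1.0, so that is the likely reason matches() fails here. Two workarounds seem plausible. The first registers preg_match through DOMXPath::registerPhpFunctions() and calls it via php:functionString(). The second stays in plain XPath 1.0 by using translate() to turn every digit into "#" and then looking for "#:#". The sample data below is a trimmed copy of the markup above, wrapped in a table and tr (my assumption, to keep the parser happy), and the variable names $regexQuery, $translateQuery, $viaRegex and $viaTranslate are made up for illustration.

```php
<?php
// Trimmed copy of the sample markup; the <table><tr> wrapper is an assumption.
$data = <<<HTML
<table><tr>
<td class="td1"><span class="one">testing 11:22</span></td>
<td class="td2"><span style="color:#123456">Sometext 90:80</span></td>
<td class="td3"><span style="color:#123656">SomeMoreText 17:34</span></td>
<td class="td2"><span style="color:#123456">Just a test:90</span></td>
<td class="td3"><span style="color:#123656">A simple test: expecting a simple result.</span></td>
</tr></table>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Option 1: expose preg_match to XPath under the "php" namespace.
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('preg_match');

// "> 0" makes the predicate a boolean test; a bare number would be read as a position.
$regexQuery = '//td[@class="td2" or @class="td3"]'
    . '/span[php:functionString("preg_match", "/[0-9]+:[0-9]+/", .) > 0]';

// Option 2: XPath 1.0 only. Replace each digit with "#", then look for "#:#".
// Caveat: text that already contains a literal "#:#" would also match.
$translateQuery = '//td[@class="td2" or @class="td3"]'
    . '/span[contains(translate(., "0123456789", "##########"), "#:#")]';

$viaRegex = [];
foreach ($xpath->query($regexQuery) as $span) {
    $viaRegex[] = $span->nodeValue;
}

$viaTranslate = [];
foreach ($xpath->query($translateQuery) as $span) {
    $viaTranslate[] = $span->nodeValue;
}

// Both lists should hold "Sometext 90:80" and "SomeMoreText 17:34" only.
print_r($viaRegex);
print_r($viaTranslate);
```

The translate() version works because an unanchored search for [0-9]+:[0-9]+ succeeds exactly when some digit sits directly before a colon and another directly after it. The preg_match version is the one to reach for if the pattern ever grows beyond that.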