Hey.
So I've got a text file and I'm looking to parse through its contents line by line and read certain bits of data into a database.
To this end I've read the contents of the file into an array and I'm going through it line by line, identifying the patterns using preg_replace so as to filter out the bits I don't need, and assign the values of what I do need to a variable.
The basic structure is this:
File.txt contains this:
User 1: corcode (1500 points)
User 2: corcode2 (500 points)
User 3: corcode2 (600 points)
User 4: corcode2 (2400 points)
The info I'm looking for from each line is the user number, the username, and the number of points.
For every line of the file that's like this, I'm using the preg_match function as follows:
preg_match('/User (?P<user_num>\w+): (?P<username>\w+) \((?P<points>\w+) points\)/', $file[$line], $user);
This, for the most part, lets me store the values from the results: $user['user_num'], $user['username'] and $user['points'].
The problem is that usernames in the text file don't have to follow any particular format, so usernames like cor.cod.e aren't caught by \w+.
I then tried \S instead of \w in the username subpattern, but there are also usernames that contain whitespace (e.g. "cor code").
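To make the problem concrete, here is a minimal reproduction. It's only a sketch: match_username() is a hypothetical helper written for this post, the sample lines are made up to follow the file format above, and the pattern assumes the literal " (" before the points and the ")" after them are spelled out.

```php
<?php
// Hypothetical helper: tries the pattern with the given character class for the
// username and returns the captured username, or null if the line doesn't match.
function match_username(string $usernameClass, string $line): ?string
{
    $pattern = '/User (?P<user_num>\w+): (?P<username>' . $usernameClass . '+)'
             . ' \((?P<points>\w+) points\)/';

    // preg_match() returns 1 on a match and 0 when nothing matches.
    return preg_match($pattern, $line, $user) === 1 ? $user['username'] : null;
}

// Made-up sample lines in the same format as the file.
$lines = [
    'User 1: corcode (1500 points)',
    'User 2: cor.cod.e (500 points)',
    'User 3: cor code (600 points)',
];

foreach ($lines as $line) {
    echo $line, "\n";
    echo '  with \w: ', var_export(match_username('\w', $line), true), "\n";
    echo '  with \S: ', var_export(match_username('\S', $line), true), "\n";
}
```

With \w only the plain username is captured, with \S the dotted one is captured as well, and neither gets the username containing a space.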
This has me kind of stumped. It's not essential that I use named subpatterns; as long as I can get the info I need in whatever way works, I'm happy enough. But I'm pretty new to regular expressions and can't think of how I would catch all of these username patterns.
Is there an easy way that I'm overlooking to catch:
User 1: Corcode2 (1500 points)
User 2: Cor Code5 (1500 points)
User 3: Cor-code99 (1500 points)
User 4: Cor.co.de (1500 points)
?
The rest of the format never changes: the username is always prefixed by "User $no: " and suffixed by " (xxxx points)". Is there a way to catch everything in between (not including the whitespace following the ":" and before the "(")?
Also, I'm having a bit of bother with escaping metacharacters in another part of each file.
There's a list of payments, which I'm looking to break down and enter individually.
In the file, they'd be displayed as:
$6.50+$4.30+$3.20 USD
But the amounts are variable, so I was looking to use named subpatterns (payment1, payment2, payment3) to get each individual amount as a decimal figure without the dollar sign (6.50, 4.30, 3.20). I've tried backslashing the dollar and plus signs and every combination of \Q's, and I've looked up loads of tutorials, but I just can't see anything that readily applies.
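For reference, this is a rough sketch of the result I'm after, not something confirmed against the real files. Both function names are hypothetical. It assumes the pattern is written in a single-quoted PHP string, because in a double-quoted string PHP itself turns \$ into a plain $ before the regex engine ever sees it, and an unescaped $ in a regex means "end of subject". It also assumes every amount has a decimal point.

```php
<?php
// Hypothetical helper: pulls every "$<digits>.<digits>" amount out of a payments
// line, however many amounts there are.
function extract_payments(string $line): array
{
    // \$ is a literal dollar sign; the parentheses capture only the number after it.
    preg_match_all('/\$(\d+\.\d+)/', $line, $matches);

    // With the default PREG_PATTERN_ORDER, $matches[1] holds all group-1 captures.
    return $matches[1];
}

// Variant using the three named subpatterns, assuming there are always exactly
// three amounts followed by " USD". Returns null if the line doesn't fit.
function extract_three_payments(string $line): ?array
{
    $pattern = '/\$(?P<payment1>\d+\.\d+)\+\$(?P<payment2>\d+\.\d+)'
             . '\+\$(?P<payment3>\d+\.\d+) USD/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }
    return [$m['payment1'], $m['payment2'], $m['payment3']];
}

// Single quotes here too, so PHP doesn't try to interpolate the dollar signs.
print_r(extract_payments('$6.50+$4.30+$3.20 USD'));
print_r(extract_three_payments('$6.50+$4.30+$3.20 USD'));
```

The first version doesn't care how many payments are on the line, which may suit the data better if the count varies as well as the amounts.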
Any help at all is much appreciated!
Thanks in advance, corcode.