Hi,
I am looking at a download from clickbank and I notice that
it has two files: a very small one suffixed with .dtd, which I
list below, and a huge file (26 Mb) suffixed with .xml.
Here is the .dtd
<!ELEMENT Catalog ( Category ) >
<!ELEMENT Category ( Name, Site, Category ) >
<!ELEMENT Commission ( #PCDATA ) >
<!ELEMENT Description ( #PCDATA ) >
<!ELEMENT EarnedPerSale ( #PCDATA ) >
<!ELEMENT TotalEarningsPerSale ( #PCDATA ) >
<!ELEMENT TotalRebillAmt ( #PCDATA ) >
<!ELEMENT HasRecurringProducts ( #PCDATA ) >
<!ELEMENT Gravity ( #PCDATA ) >
<!ELEMENT Id ( #PCDATA ) >
<!ELEMENT Name ( #PCDATA ) >
<!ELEMENT PercentPerSale ( #PCDATA ) >
<!ELEMENT PopularityRank ( #PCDATA ) >
<!ELEMENT Referred ( #PCDATA ) >
<!ELEMENT Site ( Commission? | Description+ | EarnedPerSale? | TotalEarningsPerSale? | TotalRebillAmt? | Gravity? | Id+ | PercentPerSale? | PopularityRank+ | Referred? | Title+ | HasRecurringProducts ) >
<!ELEMENT Title ( #PCDATA ) >
And here are the first few lines of the .xml file.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE Catalog SYSTEM "marketplace_feed_v1.dtd">
<Catalog>
<Category>
<Name>Business to Business</Name>
<Site>
<Id>REGEASY</Id>
<PopularityRank>1</PopularityRank>
<Title><![CDATA[Registry Easy - #1 Converting Registry Cleaner & System Optimizer.]]></Title>
<Description><![CDATA[Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.]]></Description>
<HasRecurringProducts>false</HasRecurringProducts>
<Gravity>226.333</Gravity>
<EarnedPerSale>31.7204</EarnedPerSale>
<PercentPerSale>75.0</PercentPerSale>
<TotalEarningsPerSale>31.7204</TotalEarningsPerSale>
<TotalRebillAmt>0.0</TotalRebillAmt>
<Referred>68.0</Referred>
<Commission>75</Commission>
</Site>
<Site>
<Id>BRYXEN4</Id>
<PopularityRank>2</PopularityRank>
<Title><![CDATA[Keyword Elite 2.0: The New Generation Of Keyword Research Software!]]></Title>
<Description><![CDATA[Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.]]></Description>
<HasRecurringProducts>true</HasRecurringProducts>
<Gravity>229.6</Gravity>
<EarnedPerSale>65.1052</EarnedPerSale>
<PercentPerSale>48.0</PercentPerSale>
<TotalEarningsPerSale>74.1738</TotalEarningsPerSale>
<TotalRebillAmt>15.2186</TotalRebillAmt>
<Referred>79.0</Referred>
<Commission>50</Commission>
</Site>
Ok - so that shows the header info and the first two records of data.
Now, the second line of the header info (the DOCTYPE) refers to the .dtd file.
I could just use the info in the .dtd file to create a table
with columns (fields) as it states.
Or I could just create the table structure from looking at the first few
records in the xml file that I have shown.
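For the .dtd route, something like the following might pull the field names out of the Site declaration. This is only a rough sketch: siteFieldsFromDtd() and createTableSql() are names I made up, and making every column TEXT is just a placeholder assumption until the real types are worked out from the data.

```php
<?php
// Sketch only: siteFieldsFromDtd() is an illustrative helper, not a library call.
// It reads the child element names out of the <!ELEMENT Site ( ... ) > declaration.
function siteFieldsFromDtd($dtdText)
{
    if (!preg_match('#<!ELEMENT\s+Site\s*\((.*?)\)\s*>#s', $dtdText, $m)) {
        return array();
    }
    $fields = array();
    foreach (explode('|', $m[1]) as $part) {
        // Drop surrounding whitespace and the ?, + or * occurrence markers
        $name = rtrim(trim($part), '?+*');
        if ($name !== '') {
            $fields[] = $name;
        }
    }
    return $fields;
}

// Every column is TEXT purely as a placeholder (an assumption);
// real column types would have to be chosen by looking at the data.
function createTableSql($table, $fields)
{
    $cols = array();
    foreach ($fields as $f) {
        $cols[] = "`$f` TEXT";
    }
    return "CREATE TABLE `$table` (" . implode(', ', $cols) . ")";
}
```

Feeding it the contents of the .dtd file (for example via file_get_contents) should give the twelve field names listed for Site, which could then be passed to createTableSql('cbdb', ...).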
Once I have done that, I guess that I write a PHP script
to open the file, step through each line, and pull out the contents found between the tags.
As it finds each tag it can locate the contents and update the table.
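To make the idea concrete, here is a rough sketch of the "step through each line" part on its own, without any database code. It assumes that every element sits on a single line, exactly as in the sample above (a Description that wraps onto a second line would be missed). parseSites() and its $onSite callback are names I made up for illustration. It reads with fgets so the whole file never has to sit in memory at once.

```php
<?php
// Sketch only: assumes one element per line, as in the sample shown above.
// parseSites() and the $onSite callback are illustrative names, not an existing API.
function parseSites($path, $onSite)
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        die("could not open $path");
    }
    $category = null;   // most recent <Name> seen outside a <Site>
    $site = null;       // fields of the <Site> being read, or null between sites
    $count = 0;
    while (($line = fgets($fh)) !== false) {
        $line = trim($line);
        if ($line === '<Site>') {
            $site = array('Category' => $category);
        } elseif ($line === '</Site>') {
            if ($site !== null) {
                call_user_func($onSite, $site);   // hand over one complete record
                $count++;
            }
            $site = null;
        } elseif (preg_match('#^<(\w+)>(.*)</\1>$#', $line, $m)) {
            $value = $m[2];
            // Unwrap <![CDATA[ ... ]]> if the value is wrapped in it
            if (preg_match('#^<!\[CDATA\[(.*)\]\]>$#', $value, $c)) {
                $value = $c[1];
            }
            if ($site !== null) {
                $site[$m[1]] = $value;            // e.g. $site['Id'] = 'REGEASY'
            } elseif ($m[1] === 'Name') {
                $category = $value;               // category name for following sites
            }
        }
    }
    fclose($fh);
    return $count;   // number of <Site> records seen
}
```

The callback would be where the table update goes: it receives one array per Site, keyed by tag name, plus a 'Category' entry taken from the nearest preceding Name tag.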
So:
$CB_file = file('clickbank.xml');   // reads the whole file into an array of lines
for ($i = 0; $i < count($CB_file); $i++) {
    $arrayOfLine = explode('???', $CB_file[$i]);
    $sql = "UPDATE cbdb SET ????? = ??????";
    $result = mysql_query($sql) or die("could not update CBDB: " . mysql_error());
}
Yes, I know that I have a lot of gaps to fill in :o
But my question is: can this approach work with an
XML file of 28 Mb? And, based on the files that I have,
can you please help me fill in the gaps?
PS I have searched and read up about XML -> MySQL but I didn't like the
look of XML::DOM or SAX-based parsers, and so would prefer to try and
get something "hand made" to work for my specific files.
Thanks for any input and help.