Hello, I humbly ask for some help here (I've spent over 3 weeks trying to figure this out on my own), I am new at coding and need a little hand-holding. Here is my question, please:

I've been provided a username and password to post form data to a website.

This script does not work (reason listed below):

<?php
$ch = curl_init('http://www.website.com/index.php?username=myself&password=password');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('inputname1'=>"hammer",'inputname2'=>'nail'));
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
$data = curl_exec($ch);
curl_close($ch);
var_dump($data);
?>

Reason it doesn't work: the website's sourcecode shows a "hidden input" field with a random number generated with each visit
<input type="hidden" name="controlNumber" value="39282928">

May I please ask your help to modify my code to first "get" that random number, and then "post" it along with the other post variables?

Note: the website does not require me to have javascript turned on, and does not require me to allow cookies, either.

Thank you.

    Note: this forum does provide tags for formatting PHP code (see the FAQs).

    Sounds like you'd need to request the form page, parse the HTML response to get the controlNumber field, then make the form submission request.
    Writing that for you wouldn't be "modification", that would be writing the whole thing.
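
As a rough illustration of the "parse the HTML response" step, here is a minimal sketch that reads a hidden input's value with PHP's DOM extension. The function name `extractHiddenField` is made up for this example, and the sample markup is the one quoted in the question. A DOM lookup avoids one weakness of a pattern like `value="(.*)"`: the greedy `.*` can run on to a later quote on the same line.

```php
<?php
// Hypothetical helper: return the value of <input type="hidden" name="$name">,
// or null if the page has no such field.
function extractHiddenField(string $html, string $name): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden'
            && $input->getAttribute('name') === $name) {
            return $input->getAttribute('value');
        }
    }
    return null;
}

// Sample markup taken from the question.
$form = '<form><input type="hidden" name="controlNumber" value="39282928"></form>';
var_dump(extractHiddenField($form, 'controlNumber'));
```

The same function can be pointed at whatever the form request returns. It only locates the field; it does not check that the server will still accept the value.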

      Weedpacket, yes, thank you, I had already done that as follows:

      <?php
      $url = 'http://www.website.com/index.php?username=myself&password=password';
      $options = array(
      'http'=>array(
      'method'=>"GET",
      'header'=>"Accept-language: en\r\n"
      )
      );
      $context = stream_context_create($options);
      $file = file_get_contents($url, false, $context);
      preg_match('/controlNumber\"\svalue=\"(.*)\"/',$file,$token);
      
      //Now that we have the "$token", lets put it in the CURL post:
      
      $ch = curl_init('http://www.website.com/index.php?username=myself&password=password');
      curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
      curl_setopt($ch, CURLOPT_POST, true);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
      curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
      curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
      $data = curl_exec($ch);
      curl_close($ch);
      var_dump($data);
      ?>

      But for whatever reason, it doesn't work.

      When I used the first half of the code, and then echoed the "token," the random number displayed perfectly.

      So I know my code was retrieving the random number.

      However, when using that same random number token in the second half of my code, it would not work.

      I can only guess that the website sees the second half of my code as a "new" access, and therefore generates a brand new random number. (A new random number with each access).

        At a guess, it is storing the random value in a session, so you'll need to store the session cookie when you request the form, then send it back when you submit the form. Therefore, you'll probably have to add a COOKIEFILE to your cURL options.
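
If the session guess is right, the flow could look roughly like the sketch below: one cURL handle, a cookie jar, a GET to fetch the form, then a POST on the same handle. The function name `fetchFormThenPost` and its parameters are invented for illustration, and the field name `controlNumber` is taken from the question. It is untested against the real site.

```php
<?php
// Sketch only: GET the form, pull out the token, then POST with the same
// handle so any cookies the server set on the first request are sent back.
function fetchFormThenPost(string $url, array $fields, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies are saved when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // turns on cookie handling and loads saved cookies
    ]);

    // Step 1: GET the form page.
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('GET failed: ' . curl_error($ch));
    }
    if (!preg_match('/name="controlNumber"\s+value="([^"]*)"/', $html, $m)) {
        throw new RuntimeException('controlNumber not found in the form page');
    }

    // Step 2: switch the same handle to POST and submit the token with the other fields.
    curl_setopt_array($ch, [
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query(['controlNumber' => $m[1]] + $fields),
    ]);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('POST failed: ' . curl_error($ch));
    }
    return $response;
}

// Example call (needs the cURL extension and network access):
//   echo fetchFormThenPost('http://www.website.com/index.php?username=myself&password=password',
//                          ['inputname1' => 'hammer', 'inputname2' => 'nail'],
//                          __DIR__ . '/cookies.txt');
```

Reusing one handle matters here because a second `curl_init()` starts with no cookies in memory unless it reads the jar file back in.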

          Nogdog, thank you very much for your reply. The website worked in my web browser with cookies disabled... so I didn't think I needed cookies to work. I re-did the code using all curl -- and it still doesn't work. Does this look right? I'm not sure how to "get" the cookies and "resubmit" them, other than what I've done here. What do you think?

          <?php
          $ch = curl_init();
          curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.php?username=myself&password=password');
          curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
          curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
          curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
          
          ###COOKIE STUFF HERE:
          curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
          curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
          
          
          $data = curl_exec($ch);
          
          # GETS THE RANDOM NUMBER "TOKEN":
          preg_match('/controlNumber\"\svalue=\"(.*)\"/',$data,$clip);  
          $TOKEN = $clip[1];
          
          //Now that we have the "$token", lets put it in the CURL post:
          
          $ch = curl_init('http://www.website.com/index.php?username=myself&password=password');
          curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
          curl_setopt($ch, CURLOPT_POST, true);
          curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
          curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
          
          ###COOKIE CODE HERE:
          curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE); // Cookie management.
          curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
          
          curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
          $data = curl_exec($ch);
          curl_close($ch);
          var_dump($data);
          ?>

          Thank you!!

            Dalecosp, thank you for pointing that out. However, this still does not work. Do I have something out of order? Or duplicated where it shouldn't be (e.g. the cookiejar/cookiefile)?

             <?php
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.php?username=myself&password=password');
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
            curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
            
            ###COOKIE STUFF HERE:
            curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
            curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
            
            
            $data = curl_exec($ch);
            
            # GETS THE RANDOM NUMBER "TOKEN":
            preg_match('/controlNumber\"\svalue=\"(.*)\"/',$data,$clip);  
            $TOKEN = $clip[1];
            
            //Now that we have the "$token", lets put it in the CURL post:
            
            curl_setopt($ch, CURLOPT_POST, true);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
            
            ###COOKIE CODE HERE:
            curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE); // Cookie management.
            curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
            
            curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
            $data = curl_exec($ch);
            curl_close($ch);
            var_dump($data);
            ?>

            Thank you kindly.

              You're also setting the cookie stuff twice; I'm not sure what effect that would have. It's not necessary, and if it happens to empty the cookiejar when you call the setopt again on it, you might be, er, "tossing your cookies".

              I don't see where you've assigned the second URL for the second part of the transaction. So, do we assume it's the same URL for both the original request and the follow-up POST?

              Also, is the COOKIE_FILE constant properly defined? Can you verify that it is created and holds cookie data?
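
One way to check whether the jar holds anything is to read it back. The sketch below assumes the jar is in the tab-separated Netscape cookie-file format that libcurl writes (seven fields per line, with `#HttpOnly_` prefixing HttpOnly cookies). The function name `parseCookieJar` is hypothetical. Note that libcurl writes the jar only when the handle is cleaned up, so the file may be missing or empty while the script is still running.

```php
<?php
// Hypothetical helper: turn the text of a Netscape-format cookie jar into
// a name => value array.
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        if (strpos($line, '#HttpOnly_') === 0) {
            // HttpOnly cookies are stored with this prefix on the domain field.
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }

        // Fields: domain, subdomain flag, path, secure flag, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) === 7) {
            $cookies[$fields[5]] = $fields[6];
        }
    }
    return $cookies;
}

// Example use after the cURL handle has been released:
//   print_r(parseCookieJar(file_get_contents('cookies.txt')));
```

An empty array would suggest the server never sent a cookie, or the jar path is not writable from where the script runs.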

                Dalecosp, I deleted the repetition of the Cookiejar code, so now it's there only once at the beginning. And, yes, it is the same URL for both the original request and the followup post.

                <?php 
                $ch = curl_init(); 
                curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.php?username=myself&password=password'); 
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
                curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); 
                curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2'); 
                
                ###COOKIE STUFF HERE: 
                curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE); 
                curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE); 
                
                $data = curl_exec($ch); 
                
                # GETS THE RANDOM NUMBER "TOKEN": 
                preg_match('/controlNumber\"\svalue=\"(.*)\"/',$data,$clip);   
                $token = $clip[1];
                
                //Now that we have the "$token", lets put it in the CURL post:
                
                curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
                $data = curl_exec($ch);
                curl_close($ch);
                var_dump($data);
                ?>

                Now the code looks shorter, but still doesn't work.

                I wish I had the knowledge/experience of advanced users so that I could just look at the code and know right away what's wrong. I've stared at the code for a long time, I can't figure out what's wrong.

                  Dalecosp, oops, I forgot to address your mention of the cookie file actually creating cookies. Here's the same code, with "cookies.txt" being the cookie file:

                  <?php 
                  $ch = curl_init(); 
                  curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.php?username=myself&password=password'); 
                  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
                  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); 
                  curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2'); 
                  
                  ###COOKIE STUFF HERE: 
                  curl_setopt($ch, CURLOPT_COOKIEJAR,'cookies.txt'); 
                  curl_setopt($ch, CURLOPT_COOKIEFILE,'cookies.txt');
                  
                  $data = curl_exec($ch); 
                  
                  # GETS THE RANDOM NUMBER "TOKEN": 
                  preg_match('/controlNumber\"\svalue=\"(.*)\"/',$data,$clip);   
                  $token = $clip[1];
                  
                  //Now that we have the "$token", lets put it in the CURL post:
                  
                  curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
                  $data = curl_exec($ch);
                  curl_close($ch);
                  var_dump($data);
                  ?>

                  The code doesn't work, I have a feeling I'm missing something super-obvious...

                    In my experience, very little about automated browsing/scraping/data-interchange is "super-obvious". Have you ascertained that cookies.txt contains cookie data? Does $data contain what you think it should? Does $token appear to be a "likely-to-be-valid" token?

                    Shoot, I've seen sites that wouldn't work unless cURL lied and sent a Firefox browser UA string. I assume you don't control the server so you can't see its POV (logfiles) ....?
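
Without access to the server's logs, the next best view is what cURL itself reports about each request. A small sketch, assuming the array returned by `curl_getinfo()`: the `request_header` entry only appears when `CURLINFO_HEADER_OUT` was enabled before the request. The function name `summarizeTransfer` is made up for this example.

```php
<?php
// Hypothetical helper: format the interesting parts of a curl_getinfo() array.
function summarizeTransfer(array $info): string
{
    $lines = [
        'URL:    ' . ($info['url'] ?? '(unknown)'),
        'Status: ' . ($info['http_code'] ?? '(unknown)'),
    ];

    // Present only if CURLINFO_HEADER_OUT was set to true on the handle.
    if (!empty($info['request_header'])) {
        $lines[] = 'Sent:';
        $lines[] = rtrim($info['request_header']);
    }
    return implode("\n", $lines);
}

// Example use (needs the cURL extension and network access):
//   curl_setopt($ch, CURLINFO_HEADER_OUT, true);
//   $data = curl_exec($ch);
//   echo summarizeTransfer(curl_getinfo($ch)), "\n";
```

Printing this after both requests should show whether the second one went out as a POST and whether a Cookie header was attached.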

                      I'll be honest, I've never once used COOKIEJAR or COOKIEFILE; I've always just set the cookie header. You could try reading the return headers and setting the cookie header manually. Here's an example of sending my session cookie back to my website.

                      $ch = curl_init('https://derokorian.com');
                      curl_setopt_array($ch, [
                        CURLOPT_RETURNTRANSFER => true,
                        CURLOPT_SSL_VERIFYPEER => false,
                      
                      // We need to get headers back in our response
                        CURLOPT_HEADER => true,
                      ]);
                      $result = curl_exec($ch);
                      
                      // Check that the request was successful
                      if ($result === false) {
                        die(sprintf("cURL failed (%d): %s\n", curl_errno($ch), curl_error($ch)));
                      }
                      
                      // Get the session cookie out of the response
                      if (preg_match_all("/^Set-Cookie: DFW_SESSION=([^;]*)/mi", $result, $matches)) {
                        $sessId = $matches[1][0];
                      } else {
                        die("No session cookies found in response\n");
                      }
                      
                      // Make a new request, this time sending in the session cookie
                      $ch = curl_init('https://derokorian.com');
                      curl_setopt_array($ch, [
                        CURLOPT_RETURNTRANSFER => true,
                        CURLOPT_SSL_VERIFYPEER => false,
                      
                      // Get headers again, to prove out that the cookie was successfully sent
                        CURLOPT_HEADER => true,
                      
                      // You can either use CURLOPT_COOKIE like this...
                        CURLOPT_COOKIE => 'DFW_SESSION='.$sessId,
                      
                      // ...or use CURLOPT_HTTPHEADER like this (pick one, not both):
                        // CURLOPT_HTTPHEADER => [
                        //   'Cookie: DFW_SESSION='.$sessId
                        // ],
                      ]);
                      $result = curl_exec($ch);
                      
                      // Look at the result, as you can now see, there is no longer a set-cookie in the response, because the correct session was already sent in
                      var_dump($result);
                      

                        Derokorian, Thank you so much for pointing those things out. To answer your questions, I'm sure "$token" really contains the token because when I end the script half way through with "print $token;" my ssh window will show the randomly generated $token number. (So at least I know that "$data" and "$token" are correctly populated).

                        I'd like to try your cookie method and then come back here to let you know the results -- but I'm sorry I'm a little confused: the part in your code where you have the preg_match and "DFW_SESSION" like this:

                        // Get the session cookie out of the response 
                        if (preg_match_all("/^Set-Cookie: DFW_SESSION=([^;]*)/mi", $result, $matches)) { 
                          $sessId = $matches[1][0]; 
                        } else { 
                          die("No session cookies found in response\n"); 
                        }
                        

                        How did you know what to preg_match for? And what is "DFW-Session?" Because when I run my code and it contacts the webserver, I have no idea what the names of their session variables are...

                        I'm sorry, I have the feeling I've asked a dumb question, but would you mind please clarifying this for me? If you can clarify it, I'll go ahead and re-write my code, try it out, and post the results here.

                        Thank you, Derokorian.

                          codinghelper;11061733 wrote:

                          How did you know what to preg_match for? And what is "DFW-Session?" Because when I run my code and it contacts the webserver, I have no idea what the names of their session variables are...

                          I'm pretty sure that, since it's a regexp, it's specifically written for his application, and that his cookie contained that string. Yours would be different if you chose to use this approach.

                          I will say that I've read hundreds of thousands of pages from the WWW using PHP, cURL and the cURL COOKIEJAR.

                          Out of curiosity, in this case, what's your target system? Is it running an ASP server by any chance?

                            Dalecosp, the server being used is "PicLan-IP 2.0.0 (build 175)."

                            I've confirmed that "cookies" are not being used because no "cookies.txt" file gets placed on my desktop (when I visit other sites that DO use cookies, a file called "cookies.txt" gets placed on my desktop).

                            This is interesting: in the middle of my script when I've preg_matched the $token, I added "print $token;" and then at the end of my script I have var_dump($data) to see the entire sourcecode of the page I've CURLed. The "token" from the "print $token" is DIFFERENT than the $token in the var_dump!

                            I cannot figure out what else this page wants! It shouldn't be rocket science. An ordinary web browser simply posts the variables a webserver needs, in this case:
                            1.) 'controlNumber'=>$token,
                            2.) 'inputname1'=>"hammer",
                            -and-
                            3.) 'inputname2'=>'nail'

                            Interestingly, the URL has my username and password encoded into it which means it's a "get" request.... but since http://www.website.com/index.php?username=myself&password=password works in an ordinary web browser, it should work fine when I CURL init that same URL.

                            I can't figure out what else it needs, unless:
                            a.) My script is wrong
                            b.) My script is in the wrong order
                            c.) I have something wrongly duplicated (like, for example, I have "$data = curl_exec($ch);" listed twice.... but maybe that's okay, I don't know)

                            What are your thoughts, please?

                            Here's the script as I have it now:

                            <?php  
                            $ch = curl_init();
                            curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.php?username=myself&password=password');
                            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
                            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
                            curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible, MSIE 10, Windows NT 6.2');
                            
                            ### THIS GETS THE RANDOM NUMBER "TOKEN":
                            $data = curl_exec($ch);
                            preg_match('/controlNumber\"\svalue=\"(.*)\"/',$data,$clip);
                            $token = $clip[1];
                            
                            ### Now that we have the "$token", lets put it in the CURL post:
                            curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
                            $data = curl_exec($ch);
                            
                            ### The following var_dump should have the result I am seeking.
                            ### Unfortunately, it does not. It only shows the same
                            ### original website as if I've simply refreshed the page
                            ### as if nothing got POSTed.
                            var_dump($data);
                            ?>

                              In your latest example, you've not set CURLOPT_POST. cURL's default method is GET, as I'm sure you're aware.

                              Other thoughts I've had are "why are credentials in the GET string, and if they're supposed to be there for the initial load, should they really be there the 2nd time?"

                              I suppose there could easily be reasons for that.

                              As for the token changing ... does any JavaScript in the browser adjust the token before it's sent to the server for the 2nd request?
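
On the CURLOPT_POST point, one more detail may be worth ruling out. As far as I know, libcurl treats setting `CURLOPT_POSTFIELDS` as implying a POST, but the PHP manual says that passing an array makes the body `multipart/form-data`, whereas an ordinary browser form submits `application/x-www-form-urlencoded`. A minimal sketch of building the browser-style body as a string follows. `encodeFormBody` is a name invented here, and whether this server cares about the difference is only a guess.

```php
<?php
// Hypothetical helper: encode fields the way a plain HTML form does,
// for example "a=1&b=two+words".
function encodeFormBody(array $fields): string
{
    return http_build_query($fields, '', '&', PHP_QUERY_RFC1738);
}

$body = encodeFormBody([
    'controlNumber' => '39282928', // sample value from the question
    'inputname1'    => 'hammer',
    'inputname2'    => 'nail',
]);
echo $body, "\n";

// Example use with cURL (needs the extension and network access):
//   curl_setopt($ch, CURLOPT_POST, true);
//   curl_setopt($ch, CURLOPT_POSTFIELDS, $body); // a string, not an array
```

Setting `CURLOPT_POST` explicitly costs nothing and makes the intent obvious to anyone reading the script.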

                                Could it be that your useragent is malformed? You have an opening parenthesis but not a closing. Looks incomplete to me.

                                  codinghelper;11061733 wrote:

                                  How did you know what to preg_match for? And what is "DFW-Session?" Because when I run my code and it contacts the webserver, I have no idea what the names of their session variables are...

                                  dalecosp;11061753 wrote:

                                  I'm pretty sure that, since it's a regexp, it's specifically written for his application, and that his cookie contained that string. Yours would be different if you chose to use this approach.

                                  Yes, I was looking for my specific session name. However, you could just drop the DFW_SESSION part and pull all cookies out using the rest of the regexp. Since I know the one cookie I need from my site, that's all I looked for, but if you didn't know, you could catch them all and just concatenate them together in the new request.
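
A sketch of that "catch them all" idea, assuming the response was fetched with `CURLOPT_HEADER` enabled so the raw headers come before the body. The function name `cookieHeaderFromResponse` is invented for this example. It only reads the first header block, so a response that went through redirects would need more care.

```php
<?php
// Hypothetical helper: collect every Set-Cookie name=value pair from a raw
// response and join them into a value suitable for CURLOPT_COOKIE.
function cookieHeaderFromResponse(string $rawResponse): string
{
    // Keep only the header block (everything before the first blank line).
    $parts = preg_split('/\r?\n\r?\n/', $rawResponse, 2);
    $headers = $parts[0];

    preg_match_all(
        '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi',
        $headers,
        $matches,
        PREG_SET_ORDER
    );

    $pairs = [];
    foreach ($matches as $m) {
        $pairs[$m[1]] = $m[1] . '=' . $m[2]; // a later cookie with the same name wins
    }
    return implode('; ', $pairs);
}

// Example use on the follow-up request:
//   curl_setopt($ch, CURLOPT_COOKIE, cookieHeaderFromResponse($result));
```

An empty string back from this function is itself useful information: it means the server sent no cookies at all on that response.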

                                    unbelievable... I finally got it to work.... ALMOST! Here's the "almost working" code:

                                    <?php  
                                    $ch = curl_init();
                                    curl_setopt($ch, CURLOPT_URL,'http://www.website.com/index.html?user=username&pass=password');
                                    curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
                                    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/601.7.0 (KHTML, like Gecko) Version/9.0.1 Safari/537.81.8');
                                    $result = curl_exec($ch);
                                    preg_match('/ControlNumber\"\svalue=\"(.*)\"/',$result,$number);
                                    $token = $number[1];
                                    curl_setopt($ch, CURLOPT_POSTFIELDS, array('controlNumber'=>$token,'inputname1'=>"hammer",'inputname2'=>'nail'));
                                    $Desired_Result = curl_exec($ch);
                                    var_dump($Desired_Result);
                                    ?>

                                    PROBLEM: Using a normal web browser yields a server response with TWO pieces of data. However, using this CURL script above yields a server response with only ONE piece of data.

                                    Further, I discovered that in my web browser's "developer tool Network Tab," I can click "Copy as CURL" as well as "Copy Response."

                                    The "Copy Response" has ALL the data I want, absolutely perfect! Yet when I "Copy as CURL" and paste it in my SSH terminal, I only get a server response with just ONE bit of data...

                                    In fact, the "Copy as CURL" is real complicated, it has a zillion headers, gzip, deflate, user_agent, everything.... I would have thought this would perfectly mimic a web browser and thus provide me with a server response with all the data I require. Yet it is different.

                                    Why?

                                      One possible reason for the difference may be that the site recognised the CURL submission as a duplicate and so treated it differently from the original one.

                                      I'm starting to wonder if the site has been engineered to prevent automated access....