Why isn't CURL logging into external website?

Posted 2019-02-15 21:03

I've been experimenting over and over, but what I've got so far doesn't let me log in to Pinterest with cURL, and I can't understand why.

function pinLogin()
{   
    $login_post     = array(
        'source_url' => '/login/',
        'data' => '{
            "options":{
                "username_or_email":"email",
                "password":"password"
                },
            "context":{}}',
        'module_path' => 'App()>LoginPage()>Login()>Button(text=Log In, size=large, class_name=primary, type=submit)',
    );

    $httpheaders    = array(
       'Connection: keep-alive',
       'Pragma: no-cache',
       'Cache-Control: no-cache',
       'Content-Type: application/x-www-form-urlencoded; charset=UTF-8',
       'User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:35.0) Gecko/20100101 Firefox/35.0',
       'Accept: application/json, text/javascript, */*; q=0.01',
       'Accept-Language: en-US,en;q=0.5',
       'Accept-Encoding: gzip, deflate',
    );

    $login_header   = array(
        'X-Pinterest-AppState: active',
        'X-NEW-APP: 1',
        'X-APP-VERSION: 71854ca',
        'X-Requested-With: XMLHttpRequest',
        'Accept: application/json, text/javascript, */*; q=0.01'
    );

    // request home page to establish cookies and a session, set curl options
        $ch = curl_init('http://www.pinterest.com/');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
        curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
        curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
        curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
        curl_setopt($ch, CURLOPT_VERBOSE, 1);
        curl_setopt($ch, CURLOPT_STDERR, fopen('/tmp/debug.txt', 'w+'));
        curl_setopt($ch, CURLOPT_HEADER, 1);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $httpheaders);

        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------

    // parse the csrf token out of the cookies to set later when logging in
        list($headers, $body) = explode("\r\n\r\n", $data, 2);

        preg_match('/csrftoken=(.*?)[\b;\s]/i', $headers, $csrf_token);

        // next request the login page
        curl_setopt($ch, CURLOPT_URL, 'http://www.pinterest.com/login/');
        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------

    // perform login post    
        $login_header[] = 'X-CSRFToken: ' . $csrf_token[1];

        curl_setopt($ch, CURLOPT_URL, 'http://www.pinterest.com/resource/UserSessionResource/create/');
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $login_post);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array_merge($httpheaders, $login_header));
        curl_setopt($ch, CURLOPT_REFERER, 'http://www.pinterest.com/login/');
        curl_setopt($ch, CURLOPT_HEADER, 0);

        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------


    if (curl_getinfo($ch, CURLINFO_HTTP_CODE) != 200)
    {
        echo "Error logging in.<br />";
        var_dump(curl_getinfo($ch));

    } else {
        $response = json_decode($data, true);

        if ($response === null)
        {
            echo "Failed to decode JSON response.<br /><br />";
            var_dump($response);
        } else if ($response['resource_response']['error'] === null) {
            echo "Logged in..";
        }
        print_r($response);
    }
}

I've tried to emulate the same headers that the browser sends to Pinterest, but I'm still not able to log in for some reason. This is the request the browser sends:

https://www.pinterest.com/resource/UserSessionResource/create/

POST /resource/UserSessionResource/create/ HTTP/1.1
Host: www.pinterest.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Pinterest-AppState: active
X-CSRFToken: suv5Dm0MHGc3tWY4GTPHzgBjYSXo94xt
X-NEW-APP: 1
X-APP-VERSION: 71854ca
X-Requested-With: XMLHttpRequest
Referer: https://www.pinterest.com/login/?next=https%3A%2F%2Fwww.pinterest.com%2F%3Fusername%3DUSER&prev=https%3A%2F%2Fwww.pinterest.com%2F%3Fusername%3DUSER
Content-Length: 456
Cookie: __utma=229774877.1495817695.1423754956.1424404967.1424434787.45; __utmz=229774877.1424125793.30.5.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); csrftoken=suv5Dm0MHGc3tWY4GTPHzgBjYSXo94xt; _pinterest_sess=TWc9PSZmWTFLSWM5cGx5aEhiM0ZTdHR2R21xS2JMVlVPejZYV1lMZWZadXBtak9icVlaRjdKZGozMU5vY3k4ZXRVUjZCQS90aFI0NndIeTNWWnR5RkVHY0VtSlM1UHRIZm01UFNGY093OHk0US9GRGY5Qk1FT0JsVEZjdTVSMDA5ODdPZUhhd2tvcWJVc3hqYmlNdG9PLytMQXc9PSZ5RXRjOUdvZFI0L1hoWTVFMnlsb2lNKzRSTW89; _b="AQ1q3LoHG1dIHash9bxk4SiJLwh9Pie2j1AhDB2OYuDFJcwxnUdVLzs9hLcTSKS53mU="; _pinterest_pfob=disabled; c_dpr=1; __utmb=229774877.28.4.1424435987021; __utmc=229774877; __utmt=1; logged_out=True; fba=True; GCSCE_5B243246522C4B23F685F2EB9D5F3C78DF8A0272_S3=C=694505692171-31closf3bcmlt59aeulg2j81ej68j6hk.apps.googleusercontent.com:S=c313ffc1a154b200119a21be80be878b703de85b.BK7j4ooMbUBBATCa.2d62:I=1424435991:X=1424522391
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
source_url=/login/
&data=
{
    "options":
    {
        "username_or_email":"EMAIL@EMAIL.COM",
        "password":"PASSWORD1GOES2HERE"
    },
    "context":{}
}
&module_path=App()>LoginPage()>Login()>Button(text=Log In, size=large, class_name=primary, type=submit)

3 answers
放荡不羁爱自由
#2 · 2019-02-15 21:12

I think you should use https instead of http:

$ch = curl_init('https://www.pinterest.com/'); // <-- HERE

and comment out this line:

// $login_header[] = 'X-CSRFToken: ' . $csrf_token[1];
可以哭但决不认输i
#3 · 2019-02-15 21:19

I'm not sure why your code doesn't work, but I'm pretty sure that array_merge will mess up the numeric keys (if any), and that you're not handling the X-CSRFToken header correctly (it changes in several places, and you only check it once). Anyway, doing this without an API isn't as easy as it may look, but the code below works as of 22 February 2015. Be careful with the username/password, as I am probably not escaping them correctly (they should probably be escaped with json_encode() somehow).
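
One way to address the escaping concern is to let json_encode() build the data field and http_build_query() build the form body. The following is only a minimal sketch and has not been tested against Pinterest: build_login_body() is a made-up helper name, and the field names are simply the ones visible in the question's captured request.

```php
<?php
// Hypothetical helper (not part of the original answer): builds the
// application/x-www-form-urlencoded login body from the fields seen in
// the question's captured request.
function build_login_body($username, $password, $source_url = '/login/')
{
    $data = json_encode(array(
        'options' => array(
            'username_or_email' => $username,
            'password'          => $password,
        ),
        // json_encode(array()) would give "[]"; an empty object gives "{}"
        'context' => new stdClass(),
    ));

    // http_build_query() percent-encodes every value, so quotes, '&' and '='
    // inside the credentials cannot break the form body.
    return http_build_query(array(
        'source_url'  => $source_url,
        'data'        => $data,
        'module_path' => 'App()>LoginPage()>Login()>Button(text=Log In, size=large, class_name=primary, type=submit)',
    ));
}

// Example credentials are placeholders chosen to contain awkward characters.
$body = build_login_body('user@example.com', 'pa"ss&word');

// Decode the body again to check that the credentials survive the round trip.
parse_str($body, $fields);
$decoded = json_decode($fields['data'], true);
echo $decoded['options']['password'], "\n";
```

The returned string can be passed to CURLOPT_POSTFIELDS. Passing a string rather than an array matters here: when CURLOPT_POSTFIELDS is given an array, PHP's cURL sends the body as multipart/form-data, which does not match the Content-Type: application/x-www-form-urlencoded header used in this thread.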

EDIT: Updated the code so that you get the logged-in HTML on the last request. (This proves beyond reasonable doubt that you have in fact logged in. The way I checked it was to base64_encode() the output, then run this JavaScript in my browser: document.body.outerHTML=atob("base64"); after which I saw the same "you are logged in" screen.)
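
The check described in the EDIT could be scripted roughly as follows. This is a sketch only: $html is a placeholder standing in for whatever the last request returned, and $snippet is a name made up for this example.

```php
<?php
// $html stands in for the body returned by the last request (placeholder).
$html = '<html><body><p>placeholder page</p></body></html>';

// Base64 output only contains [A-Za-z0-9+/=], so it can be dropped into a
// double-quoted JavaScript string without further escaping.
$snippet = 'document.body.outerHTML=atob("' . base64_encode($html) . '");';

// Paste the printed line into the browser console to render the page.
echo $snippet, "\n";
```

Note that atob() returns a byte string, so pages containing non-ASCII characters may render with garbled text; for a quick "am I logged in" look that is usually good enough.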

<?php
error_reporting(E_ALL);
set_error_handler("exception_error_handler");
function exception_error_handler($errno, $errstr, $errfile, $errline ) {
    if (!(error_reporting() & $errno)) {
        // This error code is not included in error_reporting
        return;
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}


$curlh=hhb_curl_init(array(
CURLOPT_USERAGENT=>"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36"
,CURLOPT_HEADER=>true
)
);
$username="f327410@trbvm.com";
$password="f327410@trbvm.compassword";
$matches=array();
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F');//get session cookie and stuff (should be handled by curl automatically)
preg_match("/csrftoken\=([^\;]*)/",$info,$matches);
$CSRFToken=$matches[1];
curl_setopt_array($curlh,array(
CURLOPT_URL=>'https://www.pinterest.com/resource/UserSessionResource/create/'
,CURLOPT_POST=>true
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    'Accept:application/json, text/javascript, */*; q=0.01',
    'Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2',
    'Connection:keep-alive',
    //TODO: Content-Length:414
    'Content-Type:application/x-www-form-urlencoded; charset=UTF-8',
//Cookie:csrftoken=wu1TXmJFeCD1q5scixeeK8QFkHSIIXg1; _pinterest_sess=TWc9PSZIbitpRE1Ka2tuRmNXTGNHY3NXQS9reXVvNENxdytpM3BkMCswNldrOUk5WDRucEk5UldYWEIwUERlWG84YXFOT1VrdlRiVHVIMUxTMkthM3hrYTZLTkM0NWJHQzFiQzVvdUQ5Ynp1Q255OUFBOEFVOWFpSzh4NHo2SC9RcTJ5M3NiNEt3YmliTmR2YTRyb0RPMlN3elE9PSZxUWtoVkZ3c0xXYkhMNEtYQVZBWXY5ak1Ec2s9; c_dpr=1; __utmt=1; __utma=229774877.1252202543.1424620619.1424620619.1424620619.1; __utmb=229774877.5.7.1424620619; __utmc=229774877; __utmz=229774877.1424620619.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
    'Host:www.pinterest.com',
    'Origin:https://www.pinterest.com',
    'Referer:https://www.pinterest.com/',
    'X-APP-VERSION:7c24931',
    'X-CSRFToken:'.$CSRFToken,
    'X-NEW-APP:1',
    'X-Pinterest-AppState:active',
    'X-Requested-With:XMLHttpRequest',

    )
,CURLOPT_POSTFIELDS=>
'source_url='.rawurlencode('/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F').
'&data='.rawurlencode('{"options":{"username_or_email":"'.$username.'","password":"'.$password.'"},"context":{}}').
//not sure if username/password is escaped correctly.
'&module_path='.rawurlencode('App()>LoginPage()>Login()>Button(text=Logg inn, size=large, class_name=primary, type=submit)')
));
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/resource/UserSessionResource/create/');
$matches=array();
preg_match("/csrftoken\=([^\;]*)/",$info,$matches);
$CSRFToken=$matches[1];
//var_dump(__LINE__,$matches,$info);die();
//^this is interesting..
curl_setopt_array($curlh,array(
CURLOPT_URL=>"https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/"
,CURLOPT_POST=>true
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    "Origin:https://www.pinterest.com",
    "Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2",
    "Accept:application/json, text/javascript, */*; q=0.01",
    "X-Requested-With:XMLHttpRequest",
    "X-NEW-APP:1",
    "X-APP-VERSION:7c24931",
    "X-Pinterest-AppState:active",
    "Referer:https://www.pinterest.com/",
    "Connection:keep-alive",
    //TODO: Content-Length:358
    "Content-Type:application/x-www-form-urlencoded; charset=UTF-8",
    "Host:www.pinterest.com",
    "X-CSRFToken:".$CSRFToken//TODO: verify that the token has not changed.
    )
,CURLOPT_POSTFIELDS=>
'source_url='.rawurlencode('/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F').
'&data='.rawurlencode('{"options":{"action":"setting_new_window_location"},"context":{}}').
'&module_path='.rawurlencode('App()>LoginPage()>Login()>Button(text=Logg inn, size=large, class_name=primary, type=submit)')

));
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/');
//var_dump(__LINE__,$info);die();
//now we should be logged in! :D

curl_setopt_array($curlh,array(
CURLOPT_URL=>"https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/"
,CURLOPT_POST=>false
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2",
    "Connection:keep-alive",
    "Host:www.pinterest.com",
    "Referer:https://www.pinterest.com/"
    )
));
/*
//skipped: Accept-Encoding:gzip, deflate, sdch (CURLOPT_ENCODING is set instead)
//Cookie:c_dpr=1; __utmt=1; __utma=229774877.1252202543.1424620619.1424620619.1424620619.1; __utmb=229774877.5.7.1424620619; __utmc=229774877; __utmz=229774877.1424620619.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _b="AQ3m6m5qQAVDaIkyqRoJYJ9ecazmK4aobP3PczTxb/BtXObCwlC/5kusK9/Ymj2luo8="; csrftoken=EitE4BCiLq3sz0hf5lHtCx6uNvyIaalo; _pinterest_sess="TWc9PSZLclVramZrWGRUMVYzZW1ZbmxXTXFXeWpHU2ZOVFBFNmFUOXU2ZlNJWFJ0TkdzTy9TZ1RQdmxtNmxZa0JNWnliR2VRS2t6UTRZVGtSVnNySlJKRVBEUjh4K3FWR2gxYi8yS3AxTmhqWW9COWZWaGJiK1Q2Y29ydjhLSGRDb2srdTdHaVh6RU12SEZnVmxlM09UNEloQU9JKzQxRDNqOFlISHRHZ0hIVW9kTUttWlhEd1BOaTJnbHZYTDZ5enBRSGtubDJaSnNKSjlzaG9SaWsrMFZaenhLeWpVaElxbTdZOG1sa3ZGeWQ3MWNFQy81YmtHQkxsZDlBQVNEK1FTUUJEYWZqV2tUMzVDVVM4R1VXL0lCOHZ3MHhPcC81YVZjOWRnSkZoTXFVQXRLU21OK05PZmtFczNvY2ZGdVRMS2pWdXR0WG8wakJTeTdYNlRqV3NZVmtHQzBsM0VyVnhVeXIzVkRWdXlqT3Q1eFVqWUJwVkxuR1ZwY3M5YXJBU2xKQ0lua3U1UkRxYWk0c0lVR1lJcHpMOUZNQXo0ZWlRSDRlaGVSa3NUaEFnREl2Q2lvN0xQc05DNjk5emNESDdaM3YxVmFwNU9KVFhLUGJBVStLcVZjVk1pMlREa3JzcW1FWEdSMGF0cXdvTlpGaz0mYmpVenZYQk1UY0xsN0Y1ZXRzTGhLV2FyRThRPQ=="
*/

$info=hhb_curl_exec($curlh,'https://www.pinterest.com');
var_dump(__LINE__,$info);die();

/*    


Response Headers:
Accept-Ranges:bytes
Cache-Control:no-cache, no-store, must-revalidate, max-age=0
Connection:keep-alive
Content-Encoding:gzip
Content-Length:348
Content-Type:application/json; charset=utf-8
Date:Sun, 22 Feb 2015 15:57:42 GMT
Expires:Thu, 01 Jan 1970 00:00:00 GMT
Pinterest-Breed:CORGI
Pinterest-Generated-By:ngapp2-1af98e48
Pinterest-Version:7c24931
Pragma:no-cache
Server:nginx
Set-Cookie:_pinterest_pfob=disabled; Domain=.pinterest.com; expires=Wed, 21-Feb-2018 15:57:42 GMT; Max-Age=94607999; Path=/
Vary:User-Agent, Accept-Encoding






 */






function hhb_curl_init($custom_options_array = array()) {
    if(empty($custom_options_array)){
        $custom_options_array=array();
//i feel kinda bad about this.. argv[1] of curl_init wants a string(url), or NULL
//at least i want to allow NULL aswell :/
    }
    if (!is_array($custom_options_array)) {
        throw new InvalidArgumentException('$custom_options_array must be an array!');
    };
    $options_array = array(
        CURLOPT_AUTOREFERER => true,
        CURLOPT_COOKIESESSION => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FORBID_REUSE => false,
        CURLOPT_HTTPGET => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => false, //NOTE: this turns off TLS certificate verification; fine for a throwaway test, risky otherwise
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT => 11,
        CURLOPT_ENCODING=>"",
        CURLOPT_REFERER=>'example.org',
        CURLOPT_USERAGENT=>'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36'
    );
    if (!array_key_exists(CURLOPT_COOKIEFILE, $custom_options_array)) {
        //do this only conditionally because of the tmpfile() call..
        static $curl_cookiefiles_arr = array(); //workaround for https://bugs.php.net/bug.php?id=66014
        $tmp = tmpfile();
        $curl_cookiefiles_arr[] = $tmp;
        $meta = stream_get_meta_data($tmp);
        $options_array[CURLOPT_COOKIEFILE] = $meta['uri'];
    }
    //we can't use array_merge() because of how it handles integer-keys, it would/could cause corruption
    foreach($custom_options_array as $key => $val) {
        $options_array[$key] = $val;
    }
    unset($key, $val, $custom_options_array);
    $curl = curl_init();
    curl_setopt_array($curl, $options_array);
    return $curl;
}
$hhb_curl_domainCache = "";
function hhb_curl_exec($ch, $url) {
    global $hhb_curl_domainCache; //
    //$hhb_curl_domainCache=&$this->hhb_curl_domainCache;
    //$ch=&$this->curlh;
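    //PHP 8+ returns a CurlHandle object from curl_init(); older versions return a resource
    if(!($ch instanceof CurlHandle) && !(is_resource($ch) && get_resource_type($ch)==='curl'))
    {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }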
    if(!is_string($url))
    {
        throw new InvalidArgumentException('$url must be a string!');
    }
    $tmpvar = "";
    if (parse_url($url, PHP_URL_HOST) === null) {
        if (substr($url, 0, 1) !== '/') {
            $url = $hhb_curl_domainCache.'/'.$url;
        } else {
            $url = $hhb_curl_domainCache.$url;
        }
    };
    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
    if (curl_errno($ch)) {
        throw new Exception('Curl error (curl_errno='.curl_errno($ch).') on url '.var_export($url, true).': '.curl_error($ch));
        // echo 'Curl error: ' . curl_error($ch);
    }
    if ($html === '' && 204 != ($tmpvar = curl_getinfo($ch, CURLINFO_HTTP_CODE)) /*204 is "No Content": success, but no output*/ ) {
        throw new Exception('Curl returned nothing for '.var_export($url, true).' but HTTP_RESPONSE_CODE was '.var_export($tmpvar, true));
    };
    //remember that curl (usually) auto-follows the "Location: " http redirects..
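    //keep scheme and host so that relative urls stay on the same scheme (https)
    $tmpvar = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    $hhb_curl_domainCache = parse_url($tmpvar, PHP_URL_SCHEME).'://'.parse_url($tmpvar, PHP_URL_HOST);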
    return $html;
}

You can see the code running live here: http://codepad.viper-7.com/D8qk6q (for a few days, anyway, until the server deletes the code or until someone changes the password; it's a throwaway account, obviously).
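
The comment in hhb_curl_init() about array_merge() and integer keys can be illustrated with a small sketch. The keys 10 and 20 below are stand-ins for CURLOPT_* constants (which are plain integers); literals are used only so that the snippet runs without the cURL extension.

```php
<?php
// The integer keys 10 and 20 are stand-ins for CURLOPT_* constants.
$defaults = array(10 => 'default-a', 20 => 'default-b');
$custom   = array(20 => 'custom-b');

// array_merge() renumbers integer keys from 0, so the option IDs are lost
// and the override is appended instead of replacing the default.
$merged = array_merge($defaults, $custom);

// array_replace() keeps the keys and lets later arrays win.
$replaced = array_replace($defaults, $custom);

// The union operator also keeps keys, but the LEFT operand wins, so the
// overrides have to go first.
$union = $custom + $defaults;

var_dump(array_keys($merged));
var_dump($replaced);
var_dump($union);
```

This only matters for arrays keyed by option constants. For plain lists of header strings, such as the two header arrays in the question, array_merge() simply appends one list to the other, which is usually what is wanted there.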

冷血范
#4 · 2019-02-15 21:33

I'm pretty sure this is not going to work without getting a request_identifier, which is required. To explain: when you load the page, you get a unique number for that 'session', and it is compared when you log in. This is there to prevent CSRF (Cross-Site Request Forgery). If you examine the actual POST, you will notice that not only a username and password are posted, but a few more items as well.
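
In the request captured in the question, the only per-session value of this kind that is visible is the csrftoken cookie, which is echoed back in the X-CSRFToken header. A minimal sketch of picking up the most recent token from a raw header block might look like this; extract_csrf_token() is a hypothetical helper and the header block is made up for illustration.

```php
<?php
// Hypothetical helper: returns the value of the last "csrftoken" cookie set
// in a raw header block, or null if none is present.
function extract_csrf_token($rawHeaders)
{
    if (preg_match_all('/^Set-Cookie:\s*csrftoken=([^;\s]+)/mi', $rawHeaders, $m)) {
        // Several responses (e.g. after a redirect) may each set the cookie;
        // the last one is the value the server expects next.
        return end($m[1]);
    }
    return null;
}

// Made-up header block: two responses, as seen when a redirect is followed
// with CURLOPT_HEADER enabled.
$headers = "HTTP/1.1 302 Found\r\n"
         . "Set-Cookie: csrftoken=first111; Path=/\r\n"
         . "\r\n"
         . "HTTP/1.1 200 OK\r\n"
         . "Set-Cookie: _pinterest_sess=abc; Path=/\r\n"
         . "Set-Cookie: csrftoken=second222; Path=/\r\n";

$token = extract_csrf_token($headers);

// The token is then sent back as a request header on the login POST.
$loginHeaders = array('X-CSRFToken: ' . $token);
print_r($loginHeaders);
```

Re-running the extraction after every response, rather than once at the start, keeps the header in step with the cookie if the server rotates the token.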
