Google Apps Script - how to login and get data?

Introduction:
I'm pretty inexperienced, but I recently tried to access some data from a website using Google Apps Script. However, to access the data, I must be logged in to the website. There have been many posts about similar problems before, but none of them was very helpful until I came across this one: how to get the Wordpress admin page using google apps script... The accepted answer gave a method to save the cookies and send them again in a second request. I basically copied and pasted the code into my own GAS file. Since the problem in that post was logging in to Wordpress, I tried that first and it worked. I had to remove the if statement checking the response code because 200 was being returned even when I entered the correct combination. I don't know if that was just a bug in the posted code or what. Anyway, I confirmed that the second request I made returned information as if I were logged in.


Site-Specific Details: The actual website I am trying to log in to has some strange hashing method that I have not seen on any other login page. When you click the Submit button, the password changes to something really long before the browser moves to another page. The form opening tag looks like this:

<form action="/guardian/home.html" method="post" name="LoginForm" target="_top" id="LoginForm" onsubmit="doPCASLogin(this);">

      

As you can see, it has an "onsubmit" attribute, which I believe just runs "doPCASLogin(this);" when the form is submitted. I decided to play with the page by simply typing javascript into the address bar. What I found was that running a command like this (after entering my username and password):

javascript: document.forms[0].submit();

      

does not work. So I dug around and found the "doPCASLogin()" function in a javascript file called "md5.js". I believe md5 is some sort of hashing algorithm, but it doesn't really matter. The important part of "doPCASLogin()" is the following:

function doPCASLogin(form) {
   var originalpw = form.pw.value;
   var b64pw = b64_md5(originalpw);
   var hmac_md5pw = hex_hmac_md5(pskey, b64pw)
   form.pw.value = hmac_md5pw;
   form.dbpw.value = hex_hmac_md5(pskey, originalpw.toLowerCase())
   if (form.ldappassword!=null) {
     form.ldappassword.value = originalpw;
   }
}

      

There are other things in it as well, but I found they didn't matter for my login. Apparently, this just runs the password through another function multiple times, using "pskey" (stored in a hidden input, different on every page reload) as the key, and puts the results into inputs on the original form ("dbpw" and "ldappassword" are hidden inputs, and "pw" is the visible password input). Then the form is submitted. I found this other function, hex_hmac_md5(), which actually connects with a whole host of other functions to hash the password. It doesn't matter anyway, because I can just call "hex_hmac_md5()" from javascript I type in the address bar. This is the working code I came up with; I just broke the lines for readability:

javascript:
document.forms['LoginForm']['account'].value="username";
document.forms['LoginForm']['pw'].value=hex_hmac_md5(pskey, b64_md5('password'));
document.forms['LoginForm']['ldappassword'].value="password";
document.forms['LoginForm']['dbpw'].value=hex_hmac_md5(pskey, 'password');
document.forms['LoginForm'].submit();

      

Wherever you find "username" or "password" it means that I entered my username and password in those places, but obviously I deleted them. When I found that it worked, I wrote a little Chrome extension that will automatically log in when I go to the site (the login process looks weird, so Chrome doesn't remember my username and password). It was good, but it was not my ultimate goal.
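
For reference, the hashing that doPCASLogin performs could probably be reproduced in Apps Script with the built-in Utilities service instead of porting md5.js. The following is only a sketch under assumptions: the helper names (bytesToHex, hexHmacMd5, b64Md5, hashLoginFields) are made up for illustration, hex_hmac_md5 is assumed to take the key first as the calls above suggest, and b64_md5 is assumed to return Base64 without "=" padding. None of this has been checked against the site's actual md5.js.

```javascript
// Convert the signed bytes (-128..127) that Apps Script's Utilities methods
// return into a lowercase hex string.
function bytesToHex(bytes) {
  return bytes.map(function (b) {
    var v = (b < 0 ? b + 256 : b).toString(16);
    return v.length === 1 ? '0' + v : v;
  }).join('');
}

// Hypothetical helper mirroring hex_hmac_md5(key, data) from the site.
// Runs only inside Apps Script, because it relies on the Utilities service.
function hexHmacMd5(key, data) {
  var sig = Utilities.computeHmacSignature(
      Utilities.MacAlgorithm.HMAC_MD5, data, key);
  return bytesToHex(sig);
}

// Hypothetical helper mirroring b64_md5(text), ASSUMING the site's md5.js
// leaves the Base64 padding off (trailing "=" characters are stripped).
function b64Md5(text) {
  var digest = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, text);
  return Utilities.base64Encode(digest).replace(/=+$/, '');
}

// Build the three password fields the same way doPCASLogin fills them in.
function hashLoginFields(pskey, password) {
  return {
    pw: hexHmacMd5(pskey, b64Md5(password)),
    dbpw: hexHmacMd5(pskey, password.toLowerCase()),
    ldappassword: password
  };
}
```

If the resulting values differ from what the browser sends for the same pskey, the padding or key-order assumptions are the first things to check.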

Dilemma:
After discovering all this about the hashing, I tried to just insert all of these values into the HTTP payload in my GAS file, although I was skeptical that it would work. It did not, and I suspect that is because the values are just read as strings and the javascript is not actually run. This makes sense, because running the actual javascript would likely be a security issue. However, why did it work in the address bar? Just as a side note, I get a 200 response code, and it also seems like a cookie is being sent back as well, although it may not be a valid one. When I read the actual response, it is just the login page again.
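
To illustrate the suspicion: in the address bar the snippet is evaluated as JavaScript inside the page, where pskey and hex_hmac_md5 are defined, whereas a payload object simply carries whatever string it holds. A tiny sketch, using a stand-in fakeHash function and a made-up pskey value (both hypothetical, not the site's), shows the two cases side by side.

```javascript
// Stand-in for the site's hex_hmac_md5. This is NOT the real algorithm; it
// only exists so the example can run outside the login page.
function fakeHash(key, data) {
  return 'hashed(' + key + ':' + data + ')';
}

var pskey = 'abc123'; // hypothetical key value

// Quoted: the payload carries the source text itself; nothing is evaluated.
var asString = { pw: "fakeHash(pskey, 'password')" };

// Unquoted: the function runs first and the payload carries its result.
var asCall = { pw: fakeHash(pskey, 'password') };
```

In the first case the server would receive the literal text of the call, which cannot match the hash it expects.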

I also thought about trying to replicate the entire function in my own code after seeing this: How do I programmatically log into a site? but since the "pskey" is different on every page load, I think the hashing would have to be done with a new key on the second UrlFetch. So even if I copied all the functions into my GAS file, I don't think I would be able to log in successfully, because I would need to know the "pskey" that will be generated for a specific request before sending that request, which would be impossible. The only way it would work is to somehow fetch the page once, keep it, and read the key from it before sending the data, but I don't know how to do this with GAS.
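
The "fetch the page once and read the key from it" idea might look roughly like the sketch below. It assumes the key sits in an input tag whose name attribute comes before its value attribute and whose values are double-quoted; extractInputValue and fetchPskey are hypothetical helper names, and loginPageUrl is a placeholder.

```javascript
// Pull the value attribute of a named <input> out of raw HTML.
// A regular expression is a rough tool for HTML: this assumes name appears
// before value inside the tag, both are double-quoted, and the name contains
// no regex metacharacters (true for "pskey" and "contextData").
function extractInputValue(html, name) {
  var pattern = new RegExp(
      '<input[^>]*name="' + name + '"[^>]*value="([^"]*)"', 'i');
  var match = html.match(pattern);
  return match ? match[1] : null;
}

// Apps Script only: fetch the login page and read the per-load key from it.
function fetchPskey(loginPageUrl) {
  var html = UrlFetchApp.fetch(loginPageUrl).getContentText();
  return extractInputValue(html, 'pskey');
}
```

If the site ties the key to a session, any cookie set by this first response would presumably have to be sent along with the login POST as well; that is a guess about the server, not something confirmed here.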

EDIT: I found another input called "contextData" which has the same value as "pskey" when the page is loaded. However, if I watch a POST request made by the browser once with the Chrome Developer Tools, I can copy all the input values, including "contextData", and send the same request again. Using javascript in the address bar it looks like this:

javascript:
document.forms['LoginForm']['account'].value="username";
document.forms['LoginForm']['pw'].value="value in field that browser sent once";
document.forms['LoginForm']['ldappassword'].value="password";
document.forms['LoginForm']['dbpw'].value="value in field that browser sent once";
document.forms['LoginForm']['contextData'].value="value in field that browser sent once";
document.forms['LoginForm'].submit();

      

I can log into the site as many times as I want this way, no matter what "pskey" is, because I submit everything directly and no hashing is done. However, this still doesn't work for me in GAS, so I'm kind of stuck. I should note that I have checked the other hidden input fields, and I can still successfully log in with the above javascript even after clearing each of them on the form.

Questions:
- Am I correct in assuming that the code I posted was interpreted as a string?
- Why is the new code below, which I wrote recently, not working?
- For future reference, how would I use GAS to log in to a site like Google, where a randomly generated string is sent in the login form and must be sent back?

function getData() {
  var loginURL = 'login page';
  var dataURL = 'page with data';
  var loginPayload = {
     'account':'same as in previous code block',
     'pw':"same as in previous code block",
     'ldappassword':'same as in previous code block',
     'dbpw':"same as in previous code block",
     "contextData":"same as in previous code block",
  };
  var loginOptions = {'method':'post','payload':loginPayload,'followredirects':false};
  var loginResponse = UrlFetchApp.fetch(loginURL,loginOptions);

  var loginHeaders = loginResponse.getAllHeaders();
  var cookie = [loginResponse.getAllHeaders()["Set-Cookie"]];
  cookie[0] = cookie[0].split(";")[0];
  cookie = cookie.join(";");

  var dataHeaders = {'Cookie':cookie};
  var dataOptions = {'method':'get','headers':dataHeaders};
  var dataResponse = UrlFetchApp.fetch(dataURL,dataOptions);

  Logger.log(dataResponse);
}
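
Two details in the code above may be worth checking. The option is documented as followRedirects in camelCase; whether the all-lowercase spelling is honored is something I have not verified, and if it is ignored, the redirect would be followed and the cookies from the login response could be lost. Also, getAllHeaders() returns an array when several Set-Cookie headers are present, so calling split on it directly would fail. A sketch of how both could be handled, with buildCookieHeader and fetchWithLogin as hypothetical helper names:

```javascript
// Turn the Set-Cookie header(s) of a response into a single Cookie header.
// getAllHeaders() gives a string for one cookie and an array for several.
function buildCookieHeader(setCookie) {
  if (!setCookie) return '';
  var list = Array.isArray(setCookie) ? setCookie : [setCookie];
  return list.map(function (c) {
    return c.split(';')[0]; // keep only "name=value", drop Path, Expires, ...
  }).join('; ');
}

// Apps Script only: post the login form without following the redirect,
// then reuse the session cookies when requesting the data page.
function fetchWithLogin(loginUrl, dataUrl, payload) {
  var loginResponse = UrlFetchApp.fetch(loginUrl, {
    method: 'post',
    payload: payload,
    followRedirects: false // documented camelCase spelling
  });
  var cookie = buildCookieHeader(loginResponse.getAllHeaders()['Set-Cookie']);
  return UrlFetchApp.fetch(dataUrl, {
    method: 'get',
    headers: { Cookie: cookie }
  }).getContentText();
}
```

Whether this is enough for this particular site is unknown; it only addresses the cookie handling, not the pskey hashing.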

      
