Wednesday, July 8, 2015

Who bases their product on Mozilla? (I mean, OTHER than Firefox?)

Today's entry is a nameless application that's based on Mozilla's framework.  I'm not sure WHY they did it this way, but it posed some interesting challenges.  To start with, the initial application launcher didn't appear to actually DO anything before it exited.  I can only assume it did some sort of magic behind the scenes to kick off the application, and then exited.  It's very small, so I guess that's all it NEEDED to do.  What was kinda interesting is that it had the same name as the main executable.  So, if you looked at a task list, you would assume it was the launcher, and not the main application itself.

Once I found the main application, I threw it into IDA like normal.  That revealed nothing worthwhile from the outset, but running it with "Break on DLL load" was VERY informative.  The program loads the usual suspects, and then loads yet ANOTHER xxlicense.dll.  (Name changed to protect the innocent).  I exit the application, and go find xxlicense.dll.  Turns out, it's a screaming 23KB.  At least the LAST target attempted to thwart me by putting their licensing code in a HUGE .dll to make it harder to locate the pertinent bits.  Not so this time.  It's all right there in black and white.  Sorta.

So, I toss that file into IDA, it disassembles nicely, and I figure I'll just go to the exports list, find the functions that I'm interested in, and start there.  To my surprise, the exports list is almost empty.  It contains 2 entries:  DllEntryPoint, and NSModule.  I've seen this sort of thing in the past when I was tinkering with a COM object's .dll.  So what's going on here?  Well, Mozilla has its own list of methods that it uses, and IDA doesn't know about them.  (Or at least *I* don't know how to make IDA know about them.)  Either way, there are no hints about what the functions are named.  We'll have to do it the hard way.

I open the file in Hex Workshop, and examine the strings contained inside.  There are lots of good ones!  I pick a nice one like "'License file ',27h,'%s',27h,' does not exist',0".  Go back to IDA, search for it, and boom.  I now have a pointer back to where the license file is opened.  A careful study of the function shows that it simply opens the file, calls fgets in a loop to read all the contents into a buffer, and then closes the file.  Using IDA's cross reference feature, I see that this function is called from *1* place.  I go to that function, and see that it reads the license file into a buffer, and then passes the buffer to a function that processes it.

It appears that the "fields" in the license are all text, separated by bar "|" characters.  So, there is code to find the first bar, and copy from that point to the end of the license, to a new variable.  They then pass a pointer to this buffer to another function.
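The bar-splitting step can be sketched in a few lines of C. This is only an illustration, not the target's code: the function name `split_first_field` is made up, and I'm assuming the copied remainder starts AT the first bar (bar included), which is how the description above reads.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the text before the first '|' into `first`
   (capacity `cap` bytes) and return a pointer to that '|', i.e. the start
   of the remainder.  Returns NULL if there is no bar or the first field
   does not fit in `first`. */
static const char *split_first_field(const char *license, char *first, size_t cap)
{
    const char *bar = strchr(license, '|');
    if (bar == NULL)
        return NULL;

    size_t n = (size_t)(bar - license);
    if (n >= cap)
        return NULL;

    memcpy(first, license, n);
    first[n] = '\0';
    return bar;
}
```

The returned pointer is what would get copied "from that point to the end" into the new buffer.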

In the text area, there are some strings of garbage that aren't ASCII strings.  This always makes me suspicious.  In this case, it was warranted.  We have this:

.data:72AD7060                 db 0F6h         ; DATA XREF: loc_72AD1730 r
.data:72AD7061                 db 0E0h
.data:72AD7062                 db 0E6h
.data:72AD7063                 db 0F7h
.data:72AD7064                 db 0E0h
.data:72AD7065                 db 0F1h
.data:72AD7066                 db 0EEh
.data:72AD7067                 db 0E2h
.data:72AD7068                 db 0E7h
.data:72AD7069                 db 0EEh
.data:72AD706A                 db 0E0h
.data:72AD706B                 db 0FCh

The 1st character of that "string" is accessed inside the function that I'm looking at.  It's copied to a buffer, and then there's code that decrypts it.  It's simply XOR'd with 0xA5.  So, looking at the result yields "SECRETKGBKEY".  :-)  Funny guys!
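The XOR step is easy to check by hand. Here's a minimal sketch in C; the names `kEncoded` and `xor_decode` are mine, not anything from the .dll. The last 11 bytes come straight from the listing. The leading 0xF6 is inferred: it's the only byte that XORs with 0xA5 to give the 'S' at the front of the decoded string.

```c
#include <stddef.h>

/* Encoded key bytes.  0xF6 is inferred from the decoded text ('S' ^ 0xA5);
   the remaining 11 bytes are the ones shown in the listing. */
static const unsigned char kEncoded[12] = {
    0xF6,
    0xE0, 0xE6, 0xF7, 0xE0, 0xF1, 0xEE, 0xE2, 0xE7, 0xEE, 0xE0, 0xFC
};

/* XOR each of the `len` input bytes with `key` and write the result to
   `out` as a NUL-terminated string.  `out` needs room for len + 1 bytes. */
static void xor_decode(const unsigned char *in, size_t len,
                       unsigned char key, char *out)
{
    for (size_t i = 0; i < len; ++i)
        out[i] = (char)(in[i] ^ key);
    out[len] = '\0';
}
```

Calling xor_decode(kEncoded, sizeof kEncoded, 0xA5, text) with a 13 byte text buffer gives back the key string quoted above.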

They have that at the start of their buffer, and then copy the rest of the license over to it.  Once they have it all in the same place, they run a simple MD5 against it.  They save off the resultant hash, and return.  The produced hash is then compared against the 1st "field" of the license they read in from the file.  (The data before the 1st bar).  If it matches?  Code is GOOD!

There is one last thing that they do.  Inside the license text is a VERSION of the product.  "Product9" or "Product5".  Even though the hash matches, that doesn't mean that this license is for this version of the product.  So, they compare it.  (After they decrypt it that is!).  So, to keep you from just reading the strings from the .dll, and knowing what to put in the file, they make you work a LITTLE harder.

The product version string is stored with the high bit set, so that makes it also not a visible string when you look for it.  There is a loop that grabs a byte from this "string", ANDs it with 0x7F, and then compares it to a character from your license.  If they match, it goes on.  Otherwise, the return value is the same as your run of the mill strcmp.  (-1, 0, 1).  Once your product string matches, you're good to go!
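A compare loop like that might look something like the sketch below. This is my own reconstruction in C, not the disassembled routine: the name `masked_strcmp`, the argument order, and which side counts as "less" are all assumptions.

```c
/* Compare a string stored with bit 7 set on every byte (`masked`) against
   an ordinary string (`candidate`).  Each stored byte is ANDed with 0x7F
   before the compare.  Returns -1, 0 or 1, strcmp style. */
static int masked_strcmp(const unsigned char *masked, const char *candidate)
{
    for (;; ++masked, ++candidate) {
        unsigned char a = (unsigned char)(*masked & 0x7F);
        unsigned char b = (unsigned char)*candidate;

        if (a != b)
            return a < b ? -1 : 1;
        if (a == 0)          /* both strings ended at the same place */
            return 0;
    }
}
```

For example, "Product9" stored this way would be the bytes D0 F2 EF E4 F5 E3 F4 B9 followed by a terminator, and none of those show up as printable ASCII in a strings dump.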

One thing you have to be on the lookout for: it expects the LETTERS in your hash to be LOWERCASE.  Uppercase will cause the compare to fail.

So, the steps to make a license.  Take an existing license (Maybe even from another product from the same company!), change the product name to match what you find in the .DLL.  Replace the hash in the 1st field with SECRETKGBKEY.  Generate an MD5 hash on that, and replace SECRETKGBKEY with the hash.  Save, and enjoy your product!






Sunday, July 5, 2015

Embedded Linux based internet appliance keygen

A friend contacted me to tell me that an internet appliance that we both own has some extra functionality in it, if you enable it by entering a key code.  Of course you know that means that I'm interested.  The fact that said appliance is Linux based makes it more special for me, being a Linux hacker at heart.  He sends me some screenshots of the webpages involved, and I start grepping for the strings that I see in the pictures.  I find them, in a strings file.  (Since this app is localized to many different languages, this is to be expected).  I then search for references to that string ID.  And find them in a JavaScript file.  Of course it's compressed like those JS programmers do.  So, I go to my favorite online JS beautifier site, paste the code in, and voilà!  I have readable code.  A quick search with a text editor, and I'm looking at where the "Invalid code entered" box is displayed.  And, I see that it's just a response to an error code.  Some tracing back through the other JS files brings me to a call to a function.  A quick grep for that function tells me that it's not in the JS anymore.  Sounds like we're going in the right direction.

I back out of the JS directory, and find the directory where the .so libraries are stored.  A quick ls of the directory shows me a file called xxlicense.so.  (Name has been changed to protect the innocent).  I toss that into IDA Pro, and see that there are some functions here, but that they call to some centralized function not present in this file.  The list of libraries that this thing loaded was impressive!  The normal ones for glibc, and the like, but also a TON of other libraries.  Not sure if they used them all, or if they were just trying to obfuscate where the protection bits were.  So, I start loading the libs, one at a time, looking for where the functions were that it was calling.  I FINALLY find them in a very large library.  A little digging around shows me that in typical Linux style, they didn't strip the symbols from the library, so not only do I have the names for the EXPORTED functions, but I also have the names of the utility functions used by the exported functions.  A little digging reveals verifyKey.  A cursory glance through here is initially intimidating.  OpenSSL, SHA1, big nums, etc.  All the makings for a genuine nightmare.  But, since you're reading about it here, you know where this is going.

Some further study reveals that the unlock code that you enter is 20 characters long.  They have a list of allowable characters.  (No 0's, O's, 1's, l's, or anything that could be construed as being something else).

const char validCharacters[] = "BCDFGHJKMPQRTVWXY2346789";

These characters are processed one at a time, and a pseudo-summation is performed to generate a key.  The key contains some data, and a checksum of the key.  My mock up of their code looks like this:
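      counter = 0;
      do
      {
        // Shift the running total up by one base-24 digit
        BN_mul_word(pOutput, 24);
        keyCharacter = pInput[counter];

        // Find the character's index in the safe list, starting at 'B' (index 0)
        compareCharacter = 'B';
        innerLoopCounter = 0;
        while ( keyCharacter != compareCharacter )
        {
          ++innerLoopCounter;
          if ( innerLoopCounter == 24 )
          {
            // Ran off the end of the list: not an allowed character
            innerLoopCounter = -1;
            break;
          }
          compareCharacter = validCharacters[innerLoopCounter];
        }
        ++counter;

        // Add the INDEX (not the character itself) as the new low digit
        BN_add_word(pOutput, innerLoopCounter);
      }
      while ( counter != length );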

There are 24 characters in their "safe list". That's why there's a comparison of innerLoopCounter and 24. The 1st character of their list is 'B'. That's why they "pre-load" B into compareCharacter.

Now, as you see, they don't add the CHARACTER to the sum, they add the INDEX of the character in their "safe list", after multiplying the running total by 24. (So the LAST character's index never gets multiplied). Being that the code is 20 characters long, this yields a valid number in the 46-91 bit range. That's why they use the BN (Big Number) functionality from OpenSSL.

Once this is done, they then slice and dice the results. The upper 45 bits are a SHA1 hash of the lower 46 bits. And the lower 46 bits contain 31 bits of serial number, 4 bits of license count, and 11 unused bits.

Since we understand all of this, we simply need to do things in the REVERSE order to make our keygen. So, we take a serial number, and number of licenses, and shift those into place. Generate a SHA1 of that half, truncate it to 45 bits, and slap it into the TOP half.

THEN comes the interesting part. Reversing the function you see above. It took me a while to figure out exactly how to do it, since brute forcing 24^19 combinations didn't seem like something that I would like to do on a regular basis. The way that I came to understand what would be the final technique was to start with a code of 00000000000000000001. I looked at the output. Then, I moved the 1 to the left, like this: 00000000000000000010. Looked at THAT value. I did that all the way across, and examined the values. It came out as 1 * 24^n. (Where n represents the position of the 1 character in the string, counting from 0 at the right).
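The 24^n pattern described above is just what you'd expect from a base-24 number, and it can be checked without OpenSSL. The sketch below is my own plain C take on the accumulate loop, not the target's code. The names `kAlphabet`, `u128` and `code_to_value` are made up, and it leans on `unsigned __int128`, which is a GCC/Clang extension rather than standard C. For the test codes it uses 'B' (index 0) and 'C' (index 1) from the safe list as the "0" and "1" digits.

```c
#include <stddef.h>
#include <string.h>

/* The 24 character safe list quoted earlier. */
static const char kAlphabet[] = "BCDFGHJKMPQRTVWXY2346789";

/* Compiler extension (GCC/Clang), standing in for an OpenSSL BIGNUM. */
typedef unsigned __int128 u128;

/* Read `code` as a base-24 number, most significant character first.
   Returns 0 on success, or -1 if a character is not in the safe list or
   the code is longer than 27 characters (24^28 would overflow 128 bits). */
static int code_to_value(const char *code, u128 *out)
{
    u128 value = 0;

    if (strlen(code) > 27)
        return -1;

    for (size_t i = 0; code[i] != '\0'; ++i) {
        const char *hit = strchr(kAlphabet, code[i]);
        if (hit == NULL)
            return -1;
        /* Multiply first, then add the index, same order as the mock up. */
        value = value * 24 + (u128)(hit - kAlphabet);
    }

    *out = value;
    return 0;
}
```

With this, BBBBBBBBBBBBBBBBBBBC comes out as 1, BBBBBBBBBBBBBBBBBBCB as 24, and BBBBBBBBBBBBBBBBBCBB as 576, which is 24^0, 24^1, and 24^2.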

This triggered a thought. What if we do the LEFT most character first? And by DO, I mean divide the value by 24^19, use the result as the index into the character array, and then work on the remainder.

So, I whipped up code to do that. It looks like this:
  // Set the divisor to 24^19
  BN_set_word(result, 24);
  BN_set_word(remainder, 19);
  BN_exp(rollingDivisor, result, remainder, ctx);

  // Process each digit (codes are 20 digits long)
  while ( count < 20 )
  {
    // If we make it to 0, just spit out B's for the rest (or hot fire).
    if ( BN_is_zero(code) )
    {
      outputCode[count++] = 'B';
      continue;
    }

    // Divide the code by the divisor
    BN_div(result, remainder, code, rollingDivisor, ctx);

    // The remainder becomes the new code
    BN_copy(code, remainder);

    // Adjust the divisor
    BN_div_word(rollingDivisor, 24);

    // Value should be in the realm of 0-23
    if ( BN_num_bits(result) > 5)
    {
      printf("Something is broken, too many bits\n");
      goto Exit;
    }

    // The quotient fits in a single word, so read it out directly.
    index = BN_get_word(result);

    // Save the letter of the code
    outputCode[count++] = validCharacters[index];
  }
I ran it, and it worked! Some additional digging was required, as the target application also maintains an internal blacklist of codes. I'm not sure if these are codes that they've seen shared online, or what the deal is, but there's a list of 40-some codes. So, I added code to verify that the code isn't on the blacklist. And lastly, the serial number that you select at generation time has to be within some defined ranges as well. So, I added some quick code to verify that, and there you have it. An embedded Linux based internet appliance keygen.
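For anyone who wants to play with the left-most-digit-first division idea without pulling in OpenSSL, here is a rough plain C sketch of just that number-to-characters step. It is not the code from the keygen: `value_to_code`, `kAlphabet` and `u128` are names I made up, and `unsigned __int128` is a GCC/Clang extension. It only covers the generic base-24 conversion, none of the hash, blacklist, or serial range handling.

```c
#include <stddef.h>

/* The 24 character safe list quoted earlier. */
static const char kAlphabet[] = "BCDFGHJKMPQRTVWXY2346789";

/* Compiler extension (GCC/Clang), standing in for an OpenSSL BIGNUM. */
typedef unsigned __int128 u128;

/* Write `value` as a `length` character base-24 code into `out`, which
   needs length + 1 bytes.  Returns 0 on success, or -1 if the length is
   0, longer than 27, or too short to hold the value. */
static int value_to_code(u128 value, size_t length, char *out)
{
    if (length == 0 || length > 27)
        return -1;

    /* Start with the weight of the left-most digit: 24^(length - 1). */
    u128 divisor = 1;
    for (size_t i = 1; i < length; ++i)
        divisor *= 24;

    if (value / divisor >= 24)
        return -1;

    for (size_t i = 0; i < length; ++i) {
        out[i] = kAlphabet[(size_t)(value / divisor)]; /* quotient is 0-23 */
        value %= divisor;                              /* keep the remainder */
        divisor /= 24;                                 /* next digit's weight */
    }
    out[length] = '\0';
    return 0;
}
```

A value of 0 gives twenty B's, 23 gives nineteen B's followed by a 9, and 24 gives BBBBBBBBBBBBBBBBBBCB, matching the 24^n pattern from earlier.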