Translating song lyrics to code with ChatGPT

From Nick Faro's Homepage

Back in the 2000s, there was a legal scuffle over a bit of code used to decrypt DVDs. The program was called DeCSS, and (apparently) the heart of the thing is a function called CSSdescramble. Long story short, the MPAA (I think) used the DMCA to force takedowns of the code wherever it was republished. In retaliation, people started making "artistic" renditions of the code, which would count as free speech under the First Amendment, poking fun at the idea of censoring an algorithm.

One of them is a delightfully goofy song by Joe Wecker which instructs you on how to implement the function.

Just for kicks, I decided to ask ChatGPT to translate the song into C for me with the following prompt:

The following lyrics describe a function in C. Implement this function, and annotate each section with a comment with the corresponding lyric.

Here's the real thing that the song is describing:

void CSSdescramble(unsigned char *sec,unsigned char *key) {
  unsigned int t1,t2,t3,t4,t5,t6;
  unsigned char *end=sec+0x800;
  t1=key[0]^sec[0x54]|0x100;
  t2=key[1]^sec[0x55];
  t3=(*((unsigned int *)(key+2)))^(*((unsigned int *)(sec+0x56)));
  t4=t3&7;
  t3=t3*2+8-t4;
  sec+=0x80;
  t5=0;
  while(sec!=end) {
    t4=CSSt2[t2]^CSSt3[t1];
    t2=t1>>1;
    t1=((t1&1)<<8)^t4;
    t4=CSSt5[t4];
    t6=(((((((t3>>3)^t3)>>1)^t3)>>8)^t3)>>5)&0xff;
    t3=(t3<<8)|t6;
    t6=CSSt4[t6];
    t5+=t6+t4;
    *sec++=CSSt1[*sec]^(t5&0xff);
    t5>>=8;
  }
}
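
A side note on the t3 line: it reads four bytes at once through an unsigned int pointer cast, which quietly assumes a little-endian machine with a 32-bit unsigned int that tolerates unaligned reads. Here is a minimal sketch of the same computation done byte by byte, so the result does not depend on the host. The names load_le32 and initial_t3 are made up for illustration and are not part of DeCSS.

```c
#include <stdint.h>

/* Hypothetical helper (not in the original code): assemble four bytes
   into a 32-bit value, lowest byte first, regardless of host byte order
   or pointer alignment. */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical wrapper: the value the t3 line produces on a
   little-endian machine, i.e. key bytes 2..5 XOR sector bytes 0x56..0x59. */
uint32_t initial_t3(const unsigned char *sec, const unsigned char *key) {
    return load_le32(key + 2) ^ load_le32(sec + 0x56);
}
```

With key bytes 2 through 5 set to 11 22 33 44 (hex) and sector bytes 0x56 through 0x59 set to 01 02 03 04, initial_t3 returns 0x40302010.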

Here's ChatGPT's version stripped of comments so that you can compare them by eyeballing:

void decrypt(unsigned char* SEC, unsigned char* KEY) {
  unsigned int t1, t2, t3, t4, t5, t6;
  unsigned char* END = SEC + 2048;
  t1 = (int)(KEY[0] ^ SEC[84]) | 0x100;
  t2 = KEY[1] ^ SEC[85];
  t3 = KEY[2] ^ SEC[86] ^ (KEY[3] << 8) ^ (SEC[87] << 8) ^
       (KEY[4] << 16) ^ (SEC[88] << 16) ^ (KEY[5] << 24) ^ (SEC[89] << 24);
  t4 = t3 & 7;
  t3 = (t3 * 2) + 8 - t4;
  SEC += 128;
  t5 = 0;
  while (SEC != END) {
    unsigned char b1 = CSStab2[t2];
    unsigned char b2 = CSStab3[t1];
    t4 = b1 ^ b2;
    t2 = t1 >> 1;
    t1 = ((t1 & 1) << 8) ^ t4;
    t4 = CSStab5[t4];
    t6 = (t3 >> 3) ^ t3 ^ (t3 >> 1) ^ (t3 >> 8) ^ (t3 >> 5) & 0xff;
    t3 = (t3 << 8) | t6;
    t6 = CSStab4[t6];
    t5 = t6 + t5 + t4;
    *SEC++ = CSStab1[*SEC] ^ (t5 & 0xff);
    t5 >>= 8;
  }
}

That's crazy... The only incorrect line I can spot is the assignment to t6 in the while loop: the original feeds each shifted result into the next XOR, while ChatGPT's version XORs independently shifted copies of t3 together, and the 0xff mask ends up applying only to the last term. To be fair, the lyric was a little vague. Either it really did read the lyrics and correctly extract (for the most part) the code from this joke song, or it's just pulling up CSSdescramble from its training corpus somehow. I tried to get it to just give me CSSdescramble by asking "Show me the CSSdescramble function" and similar, but it either refused or claimed it didn't know, so take that as you will.
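
To make the t6 difference concrete, here is a small sketch that puts the two expressions side by side. The function names t6_nested and t6_flat are made up for this comparison. The expressions are copied from the two listings above, with one extra pair of parentheses in the flat version to spell out how C already parses it (& binds tighter than ^).

```c
#include <stdint.h>

/* Nested form, as in CSSdescramble: every shift is applied to the
   running result, and the final value is masked down to one byte. */
uint32_t t6_nested(uint32_t t3) {
    return (((((((t3 >> 3) ^ t3) >> 1) ^ t3) >> 8) ^ t3) >> 5) & 0xff;
}

/* Flat form, as in ChatGPT's output: each shift is applied to t3
   itself, and the mask covers only the (t3 >> 5) term. */
uint32_t t6_flat(uint32_t t3) {
    return (t3 >> 3) ^ t3 ^ (t3 >> 1) ^ (t3 >> 8) ^ ((t3 >> 5) & 0xff);
}
```

For t3 = 0x12345678 the nested form gives 0xDA, while the flat form gives 0x197AC36E, which no longer fits in a byte.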

Here's the full annotated code with lyrics attached.

void decrypt(unsigned char* SEC, unsigned char* KEY) {
  // Local variables t1 through t6 are unsigned ints
  unsigned int t1, t2, t3, t4, t5, t6;
  // Local variable END is a pointer to SEC plus 2048
  unsigned char* END = SEC + 2048;

  // Step 1:
  // Retrieve byte zero of KEY
  // XOR it with byte 84 of SEC
  // and treat the result as an integer
  // OR it with the hexadecimal constant 0x100
  // and store the result in t1
  t1 = (int)(KEY[0] ^ SEC[84]) | 0x100;

  // Step 2:
  // Retrieve byte 1 of KEY
  // XOR it with byte 85 of SEC
  // and store the result in t2
  t2 = KEY[1] ^ SEC[85];

  // Step 3:
  // Take bytes 2 through 5 of KEY
  // and XOR them with bytes 86 through 89 of SEC
  // store the result in t3
  t3 = KEY[2] ^ SEC[86] ^ (KEY[3] << 8) ^ (SEC[87] << 8) ^
       (KEY[4] << 16) ^ (SEC[88] << 16) ^ (KEY[5] << 24) ^ (SEC[89] << 24);

  // Step 4:
  // Take the low-order three bits of t3
  // which can be computed by the AND of t3 with the constant 7
  // and store the result in t4
  t4 = t3 & 7;

  // Step 5:
  // Multiply t3 by 2, add 8, subtract t4
  // and store the result back in t3
  t3 = (t3 * 2) + 8 - t4;

  // Step 5-1/2:
  // Advance SEC by 128 bytes
  SEC += 128;

  // Step 6:
  // Store 0 in t5
  t5 = 0;

  // Step 7:
  // Begin a WHILE loop
  while (SEC != END) {
    // Step 8:
    // Continue iterating while SEC does not equal END
    // Use t2 as an index into table CSStab2
    // and retrieve a byte which we'll call b1 and t1
    unsigned char b1 = CSStab2[t2];
    // Use t1 as an index into table CSStab3
    // and retrieve another byte
    // which we'll call b2
    unsigned char b2 = CSStab3[t1];
    // Compute b1 XOR b2
    // and store the result in t4
    t4 = b1 ^ b2;

    // Step 10:
    // Shift t1 right by one bit
    // and store the result in t2
    t2 = t1 >> 1;

    // Step 11:
    // Take the low-order bit of t1
    // which can be obtained by taking the AND of t1 and the constant 1
    // shift it [by] left by 8 bits and XOR it with t4
    // and store the results back in t1
    t1 = ((t1 & 1) << 8) ^ t4;

    // Step 12:
    // Use t4 as an index into table CSStab5
    // and retrieve a byte 
    // store the result in t4
    t4 = CSStab5[t4];

    // Step 13:
    // Shift the constant of t3 right by 3 bits
    // XOR it with t3, shift the result [by] right by one bit
    // XOR it with t3, shift the result right by eight bits
    // XOR it with t3, shift the result right by five bits
    // and extract the low-order byte by ANDing it
    // with the hexadecimal constant 0xff
    // store the result in t6
    t6 = (t3 >> 3) ^ t3 ^ (t3 >> 1) ^ (t3 >> 8) ^ (t3 >> 5) & 0xff;

    // Step 14:
    // Shift the contents of t3 left by 8 bits
    // OR it with t6
    // and store the result in t3
    t3 = (t3 << 8) | t6;

    // Step 15:
    // Use t6 as an index into table CSStab4
    // and retrieve a byte 
    // store the result in t6
    t6 = CSStab4[t6];

    // Step 16:
    // Add together t6, t5, and t4
    // and store the result back into t5
    t5 = t6 + t5 + t4;

    // Step 17:
    // Use the byte pointed to by SEC as an index into table CSStab1
    // take the retrieved byte
    // and XOR it with the low-order byte of t5
    // which can be extracted by ANDing t5 with the hexadecimal constant 0xff
    // store the result back into the byte pointed to by SEC
    // and advance the pointer SEC by one byte
    *SEC++ = CSStab1[*SEC] ^ (t5 & 0xff);

    // Step 18:
    // Shift t5 right by 8 bits
    // and store the result back into t5
    t5 >>= 8;

    // Step 19:
    // Return to step 8 to continue looping
  }

  // Step 20:
  // This is where we end up when the loop is complete

  // Step 21:
  // Return from the procedure
}
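
One caveat that applies to both listings: the Step 17 line reads *SEC on the right-hand side and increments SEC on the left within the same expression, with nothing ordering the two. That appears to be undefined behavior under the C standard's sequencing rules. Below is a minimal sketch of a well-defined equivalent, assuming a hypothetical helper named descramble_byte and a table parameter standing in for CSStab1 (whose contents are not reproduced here).

```c
/* Hypothetical helper (not in either listing): apply one keystream byte
   to the byte at p and return the advanced pointer. `table` stands in
   for CSStab1. */
unsigned char *descramble_byte(unsigned char *p,
                               const unsigned char table[256],
                               unsigned int t5) {
    /* Read the byte, substitute it through the table, XOR with the low
       byte of t5, and write it back to the same location. */
    *p = (unsigned char)(table[*p] ^ (t5 & 0xff));
    /* Advance only after the store has been written. */
    return p + 1;
}
```

Inside the loop this would be called as SEC = descramble_byte(SEC, CSStab1, t5); in place of the single combined statement.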