Thursday, June 28, 2007

[TECH] Memory corruption and format specifiers: "%s%c" is different from "%s%1s"

I have been fixing several issues over the last few weeks, and the following issue in someone's code is a clear test of whether you understand the difference between strings and chars.

At first look "%s%c" and "%s%1s" seem very similar, but unfortunately they are NOT, and the difference can create nasty runtime bugs that corrupt your variables. Suppose someone's code looks like this.

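#include <stdio.h>

void BuggyScanner(){
    char buf[MAX_SIZE];   /* MAX_SIZE is defined elsewhere; "%3s" needs at least 4 bytes */
    int a;
    char b;
    scanf("%d",&a);
    /* BUG: "%1s" stores one character plus a terminating '\0' (two bytes) into the one-byte 'b' */
    scanf("%3s%1s",buf,&b);
    printf("a=%d buf=%s b=%c\n",a,buf,b);
}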
The format specifier is a real blunder: scanf with "%1s" writes two bytes at the address of 'b', the matched character plus the terminating '\0' that is appended to every string. Since 'buf', 'a' and 'b' all live on the stack, writing one byte beyond 'b' can do really nasty stuff:
  • Corrupt the neighbouring variables, as it did in this case.
  • Potentially corrupt the return address of the function, creating a serious security bug.
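Here is a minimal sketch of two ways to avoid the overflow. The helper names parse_with_tmp and parse_with_c are made up for illustration, and I use sscanf on a string instead of scanf so the behaviour is easy to check. The first version keeps "%1s" but gives it a two-byte buffer. The second uses " %c", with a leading space because "%c" on its own does not skip whitespace the way "%1s" does.

```c
#include <stdio.h>

/* Hypothetical helper: keeps "%1s" but reads it into a 2-byte buffer,
   so the terminating '\0' has somewhere to go.
   Returns the number of fields converted (2 on success). */
int parse_with_tmp(const char *input, char buf[4], char *b)
{
    char tmp[2] = {0};                 /* one character plus '\0' */
    int n = sscanf(input, "%3s%1s", buf, tmp);
    if (n == 2)
        *b = tmp[0];                   /* copy just the character out */
    return n;
}

/* Hypothetical helper: reads a single char with "%c", which stores
   exactly one byte. The space before %c skips optional whitespace. */
int parse_with_c(const char *input, char buf[4], char *b)
{
    return sscanf(input, "%3s %c", buf, b);
}
```

Calling either helper with the input "abcd" and a 4-byte buf should leave "abc" in buf and 'd' in b, without touching anything past b.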
Be careful, guys! Vamsi.

Tuesday, June 19, 2007

[BOOKS] List of books I want to buy when I'm in India

1. The Practice of Programming (Paperback) by Brian W. Kernighan and Rob Pike
2. Programming Pearls (2nd Edition) (Paperback) by Jon Bentley
3. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology (Hardcover)

I'll keep adding to this list...
Life lately has been a lot of code in Matlab/Octave, Perl and C. Cheers! Vamsi.

Saturday, June 09, 2007

[Perl] Don't nest your regular expressions when you use the /g switch

I have been busy all these days with quite a few things. The motive of this blog is to document some of the issues I face in my day-to-day work.

while($line =~ /input .*? name="(.*?)" .*? value="(.*?)"/g){
  $param_name = $1;
  $param_value = $2;
  if($param_name =~ /blah/){
    # Do some stuff
  }else{
    #Do some stuff
  }
}

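Code like the above can go miserably wrong and is hard to debug, especially when the outer while loop has hundreds of lines of code in it. With the '/g' switch the loop steps through all the matches within the string, and that iteration relies on state that a regular expression nested inside the loop can disturb: the capture variables $1 and $2 are overwritten by any successful inner match, and the match position in $line can change if $line is matched again. So make sure that THERE ARE NO NESTED REGULAR EXPRESSIONS WHEN USING the /g switch.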