on Perl/C/Ruby strings

ethan on 2003-09-14T07:53:10

For a while I was wondering why the XS portion of my ongoing String::Ruby was so ugly and wordy. It's now over 60K, with a lot of functions taking five or six arguments. A not untypical prototype of such a function looks like:


/* Returns an array of the fields */
char ** split (
  register char *string, int len,
  register char *delim, int dlen, 
  int limit,
  int **lens, int *nelems      /* length of each field and how many we have */
);
But then it dawned on me that this is probably not so much my fault as a consequence of the difference between C strings and those of Ruby and Perl. Most str* functions from string.h treat the NUL byte as the end of the string, so they fail miserably on strings that contain one. This is no problem for Perl or Ruby, which keep track of a string's length separately.
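
To make the difference concrete, here is a minimal sketch. It is not code from String::Ruby, and count_byte is a name made up for this illustration. It counts a byte with memchr and an explicit length, so an embedded NUL is just another byte of data:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of String::Ruby): count how often the
 * byte c occurs in the first len bytes of buf. Because it relies on
 * memchr and an explicit length, embedded NUL bytes do not stop it. */
size_t count_byte(const char *buf, size_t len, int c)
{
    size_t n = 0;
    const char *p = buf;
    const char *end = buf + len;

    /* memchr returns NULL once no further match exists in [p, end) */
    while (p < end && (p = memchr(p, c, (size_t)(end - p))) != NULL) {
        n++;
        p++;    /* continue searching after the match */
    }
    return n;
}
```

For the eight bytes "ab\0cd,ef", strlen reports 2 and strchr never finds the comma, while count_byte(data, 8, ',') looks at the whole buffer.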

Consequently, for every string I pass to a function I need an additional length parameter, and I have to use the mem* functions instead of the str* family. This is the reason why I am progressing at the speed of a turtle. Had I thought about this before, I would have created a structure for the string. Too late now.
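
For what it's worth, such a structure might have looked roughly like the sketch below. Everything in it is an assumption made for illustration: the names str_view, sv_find and sv_split, the use of size_t rather than int for lengths, and the decision to return views into the original buffer instead of copies. It also does not try to mimic the finer points of Perl's or Ruby's split (trailing empty fields are kept, for instance).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a non-owning view of len bytes; NUL bytes allowed. */
typedef struct {
    const char *ptr;
    size_t      len;
} str_view;

/* Offset of the first occurrence of delim in s, or s.len if there is none. */
static size_t sv_find(str_view s, str_view delim)
{
    size_t i;

    if (delim.len == 0 || delim.len > s.len)
        return s.len;
    for (i = 0; i + delim.len <= s.len; i++)
        if (memcmp(s.ptr + i, delim.ptr, delim.len) == 0)
            return i;
    return s.len;
}

/* Split s on delim into at most limit fields (limit 0 means no limit).
 * Returns a malloc'd array of views into s and stores the field count
 * in *nfields. Returns NULL if allocation fails. The caller frees the
 * array, not the fields. */
str_view *sv_split(str_view s, str_view delim, size_t limit, size_t *nfields)
{
    size_t cap = 4, n = 0;
    str_view *out = malloc(cap * sizeof *out);

    if (out == NULL)
        return NULL;

    for (;;) {
        /* once the limit is reached, the rest of s becomes the last field */
        size_t pos = (limit != 0 && n + 1 == limit) ? s.len
                                                    : sv_find(s, delim);
        if (n == cap) {
            str_view *tmp;
            cap *= 2;
            tmp = realloc(out, cap * sizeof *tmp);
            if (tmp == NULL) {
                free(out);
                return NULL;
            }
            out = tmp;
        }
        out[n].ptr = s.ptr;
        out[n].len = pos;
        n++;
        if (pos == s.len)       /* no further delimiter: done */
            break;
        s.ptr += pos + delim.len;
        s.len -= pos + delim.len;
    }
    *nfields = n;
    return out;
}
```

Compared with the seven-argument prototype above, the string and its length travel together, and the field lengths come back inside the returned array instead of through a separate int ** parameter.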

On the other hand, it's pretty fun. C is still a wonderfully expressive and idiomatic language. It also enforces some B&D strategies on the programmer. Not by making silly rules about what you may not do, but rather by producing segfaults, which is much more convincing.