Like I mentioned in my previous article, I've been working on refactoring the temporal data handling in Drizzle. The major problem I've been dealing with is poorly or non-documented code. The lack of documentation has led me to rely on the MySQL Manual in some cases, and then additional research and lastly, my own intuition as to what was going on.
One of the earliest cases of me saying "WTF?" was when I was investigating how day numbers were calculated. Here is the original, unmodified code from MySQL (/libmysql/my_time.c:746-778). I've highlighted in blue the massive amount of comments explaining the inner workings of the function and what it is doing.
/*
Calculate nr of day since year 0 in new date-system (from 1615)
SYNOPSIS
calc_daynr()
year Year (exact 4 digit year, no year conversions)
month Month
day Day
NOTES: 0000-00-00 is a valid date, and will return 0
RETURN
Days since 0000-00-00
*/
long calc_daynr(uint year,uint month,uint day)
{
long delsum;
int temp;
DBUG_ENTER("calc_daynr");
if (year == 0 && month == 0 && day == 0)
DBUG_RETURN(0); /* Skip errors */
delsum= (long) (365L * year+ 31*(month-1) +day);
if (month <= 2)
year--;
else
delsum-= (long) (month*4+23)/10;
temp=(int) ((year/100+1)*3)/4;
DBUG_PRINT("exit",("year: %d month: %d day: %d -> daynr: %ld",
year+(month <= 2),month,day,delsum+year/4-temp));
DBUG_RETURN(delsum+(int) year/4-temp);
} /* calc_daynr */
I tend to like code that is well-commented. And by well-commented, I don't mean comments that simply repeat an if condition or state the obvious. I mean, of course, comments that elucidate non-obvious parts of the code. The non-obvious parts of the above code are, well, most of it. Let's start at the top and I'll let you know why this little bastard of a function caused me so much pain.
OK, so the function header comment says that the function "Calculate nr of day since year 0 in new date-system (from 1615)". Well, knowing even the small amount of stuff I knew about calendars before I really got into this stuff, I knew that there was no "year 0" in a calendar. So...I immediately suspected something was up. But, I made a big mistake at this point. I read the end of the comment "new date-system (from 1615)" as meaning that this function calculated the Gregorian Day Number. I thought since October 15, 1582 was close to the year 1615, that the function author meant the Gregorian calendar.
Of course, I was wrong. The function actually is intended to calculate the number of days since "year 0" of the SQL date range. Unfortunately, the SQL calendar contains years 1 - 9999. There is no "year 0". So, I now have to figure out why this function considers 00.00.00 to be a real date. I have yet to figure this out. Perhaps it is a throwback to the MySQL days of February 31st, I don't really know.
Moving on through the function, the code contains a number of lines which just baffled me given the lack of comments. For instance:
delsum-= (long) (month*4+23)/10;
What is the number 23 doing? Is it a presumed constant in a known formula? Is it an adjustment number? Who knows...
And here:
temp=(int) ((year/100+1)*3)/4;
The line above seems to be trying to calculate the number of leap days (which are any day divisible by 4 and if divisible by 100 must also be divisible by 400.) But without comments, it's tough to tell...
OK, so with a little bit of throwing my hands up, I move on to the function which takes a day number and converts it to a date...
Day Number to Date?
And, going the other way around, here is how a date is calculated from a day number (/sql/time.cc:142-185). Again, the voluminous comments are highlighted.
/* Change a daynr to year, month and day */
/* Daynr 0 is returned as date 00.00.00 */
void get_date_from_daynr(long daynr,uint *ret_year,uint *ret_month,
uint *ret_day)
{
uint year,temp,leap_day,day_of_year,days_in_year;
uchar *month_pos;
DBUG_ENTER("get_date_from_daynr");
if (daynr <= 365L || daynr >= 3652500)
{ /* Fix if wrong daynr */
*ret_year= *ret_month = *ret_day =0;
}
else
{
year= (uint) (daynr*100 / 36525L);
temp=(((year-1)/100+1)*3)/4;
day_of_year=(uint) (daynr - (long) year * 365L) - (year-1)/4 +temp;
while (day_of_year > (days_in_year= calc_days_in_year(year)))
{
day_of_year-=days_in_year;
(year)++;
}
leap_day=0;
if (days_in_year == 366)
{
if (day_of_year > 31+28)
{
day_of_year--;
if (day_of_year == 31+28)
leap_day=1; /* Handle leapyears leapday */
}
}
*ret_month=1;
for (month_pos= days_in_month ;
day_of_year > (uint) *month_pos ;
day_of_year-= *(month_pos++), (*ret_month)++)
;
*ret_year=year;
*ret_day=day_of_year+leap_day;
}
DBUG_VOID_RETURN;
}
And that's about the point in time I gave up trying to figure out the existing code.