Saturday 29 November 2014

Survey results and my own findings: timestamp vs DateTime (PHP)

Apologies for the delay in getting back to this survey ("PHP: a survey about date data handling") - although I'm sure few people would have been waiting with bated breath - I've just had a very busy week with one thing and another. Now it's Saturday and I have a moment or two to breathe before finding a pub to watch the rugby (England v Aussie and/or Wales v South Africa, which are on simultaneously).

Anyway... here are the results:

Q1: "During the lifecycle of a date value, from being a string passed from a form or URL parameter, through being used in programming logic then ultimately output in an HTML document, how do you tend to store the date data?"

Basically: do you use DateTime objects or just timestamps (which in PHP are just an integer number of seconds since Jan 1 1970; see the time() function)?

The results were:

Answer Choices                    Responses
As the original string            5 (7.81%)
Converted to a unix timestamp     24 (37.50%)
Converted to a DateTime object    30 (46.88%)
Other (please specify)            5 (7.81%)
Total                             64 responses

The "Other" responses were:

Note: just reading from the docs that this is the same as DateTime, but its methods return values, rather than modifying the object itself. Fair enough.

The date/time formats internal to the RDBMS chosen for the project.
Bleah. This doesn't make for very transportable code. Obviously code isn't always needing to be transportable, but this approach kinda paints one's self into a corner a bit. Plus also means it invalidates PHP's own internal date functions and methods.

ISO 8601
For internal usage, eg date arithmetic and calculations? Yikes. How?

We use ISO8601. That's what the database uses and it's easier for humans/programmers to work with.
I think you'll find that whilst a DB might accept input as ISO8601, internally it's using an integer. ISO formats are for humans, and for computers to parse from humans, not for internal usage. What does your function look like that adds a day to an ISO8601 string?

All of the above, as necessary.
Yes, there's always a fence-sitter answer or two. Obviously this will generally be the case for everyone, but it doesn't answer the question in a helpful way. The inference ought to have been: which do you prefer, or which tends to be the better of the two, all things being equal? I guess it is important to note that one should not have a dogmatic stock response to these things, but whilst "it depends..." is a poignant philosophical response, it's not a great answer unless one continues after the ellipsis. Which this respondent did not.

Q2: "Anything to add?"

While I prefer to try and use DateTime objects where I can, its often too easy to be lazy and just pass the raw string around, and convert it to an object if/when needed
I guess this is being honest. I think one should avoid "lazy" code though. It generally requires more effort once one comes to need to maintain it, so the laziness only helps the initial dev, and selfishly ignores the devs following in their footsteps.

depends of use case.

Even though it's less human readable in unix timestamp form, I find it's MUCH easier to work with than messing around with date objects. PHP's date objects do not have the cleanest/most useful API, and require several objects to do what you should be able to do with just one IMO....
I know what you mean, and initially agreed. However I think the class implementation is about right, now that I've used them. That said, had they implemented all of the DateTime, and the formatter and the interval within one class, I'd not be going "oh that's terrible". It'd be fine. But it's also fine the way they have implemented it.

Different situations call for different storage. I chose the storage type based on SQL query effeciency.
That's false economy/optimisation and also adding unnecessary portability considerations. You should use an approach suitable to the system in which you're doing the processing, and then at the "endpoint" convert to an appropriate interchange format.

It's often useful to have an unambiguous, absolute value that lends itself natively to arithmetic.
Like adding a day's worth of seconds to get the following day? Screw that.

We extend DateTime with a number of helpers for conversions, calculations and other date stuff.
Nice one! That's a good pragmatic approach.

Cheers for the answers!

Bottom line: people err towards DateTime objects, but not by a dramatic margin over those using timestamps.

My own answer, btw - after doing the investigation I'll be sharing below - was "I'd err towards using a DateTime but some operations might be easier just sticking with the integer value of the 'timestamp'."

Let's have a look at some comparative code. I wrote two test rigs: viaTimestamp.php and viaDateTime.php. I'll look at equivalent sections from each. Firstly, the code is designed to be called from the command line, and takes two arguments: a date in format d/m/yyyy, and an optional locale. The locale defaults to en-GB:

$raw = $argv[1];

$localeFromUser = array_key_exists(2, $argv) ? $argv[2] : "en-GB";
$locale = setlocale(LC_TIME, $localeFromUser);

echo "Raw: $raw\n";
echo "locale: $localeFromUser -> $locale\n";

This is common to both files, and outputs:

Raw: 1/12/2014
locale: de-de -> de-de

(I passed de-de as the locale).

Next, I take that $raw string and convert it to a date via the respective mechanisms:

// timestamp
$date = strtotime($raw);
echo "date: $date\n";

date: 1389484800

// DateTime
$date = DateTime::createFromFormat("d/m/Y", $raw);
printf("date: %s\n", $date->format(DateTime::RFC822));

date: Mon, 01 Dec 14 12:42:54 +0000
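
Notice the 12:42:54 in that DateTime output: createFromFormat() fills in any field the mask doesn't mention from the current clock. Here's a minimal sketch of the usual way around that, assuming midnight is what's wanted: a leading "!" in the mask resets unparsed fields to their Unix epoch values. The variable names and the explicit UTC timezone are mine, just to make the output repeatable.

```php
<?php
$utc = new DateTimeZone('UTC');

// Without "!", fields missing from the mask (here: the time) come from the current clock.
$withCurrentTime = DateTime::createFromFormat('d/m/Y', '1/12/2014', $utc);

// With a leading "!", unparsed fields are reset to the Unix epoch values, so the time is 00:00:00.
$atMidnight = DateTime::createFromFormat('!d/m/Y', '1/12/2014', $utc);

echo $withCurrentTime->format('Y-m-d H:i:s'), "\n"; // date is 2014-12-01, time varies with the clock
echo $atMidnight->format('Y-m-d H:i:s'), "\n";      // 2014-12-01 00:00:00
```

This matters as soon as two parsed dates get compared, as otherwise the time of day the script ran sneaks into the comparison.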

Next I format the date in a locale-friendly way:

// timestamp
function dateToString($date) {
    // %#d: day of the month without a leading zero (this flag is the Windows spelling)
    return strftime("%A %#d %B, %Y", $date);
}

$formattedDate = dateToString($date);
echo "Formatted date: $formattedDate\n";

// DateTime
function dateToString($date, $locale) {
    $dateFormatter = new IntlDateFormatter($locale, IntlDateFormatter::LONG, IntlDateFormatter::NONE, null, null, "EEEE d LLLL, yyyy");
    return $dateFormatter->format($date);
}

$formattedDate = dateToString($date, $localeFromUser);
echo "Formatted date: $formattedDate\n";

The strftime() function pays attention to the current locale for the request when formatting the timestamp, which is handy. It's a lot more work to format a DateTime, needing to use an IntlDateFormatter object. But this makes sense: decoupling concerns.

One thing that's bloody annoying is that PHP doesn't have one unified way of representing date/time parts in mask strings. Notice how the formats for the IntlDateFormatter are completely different from strftime()'s. What was the thinking there?! And these are not the only two: there are completely different formats and completely different characters in various different date-masking functions / methods around the place. This is really really poor language design.
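
To put the mask dialects side by side, here's a rough sketch that describes the same "weekday day month, year" layout three ways. The fixed date, the UTC timezone and the variable names are my own choices for illustration; the strftime() mask is only listed for comparison, and the IntlDateFormatter part runs only if the intl extension is loaded.

```php
<?php
$date = new DateTime('2014-12-01 00:00:00', new DateTimeZone('UTC'));

// Dialect 1: date() / DateTime::format() letters (always English, no locale awareness).
$viaFormat = $date->format('l j F, Y');

// Dialect 2: strftime() percent codes (listed for comparison only, not called here).
$strftimeMask = '%A %d %B, %Y';

// Dialect 3: ICU pattern letters, as used by IntlDateFormatter.
$icuMask = 'EEEE d LLLL, yyyy';
$viaIntl = null;
if (class_exists('IntlDateFormatter')) {
    $formatter = new IntlDateFormatter(
        'en_GB',
        IntlDateFormatter::LONG,
        IntlDateFormatter::NONE,
        'UTC',
        null,
        $icuMask
    );
    $viaIntl = $formatter->format($date);
}

echo "DateTime::format(): $viaFormat\n"; // Monday 1 December, 2014
echo "strftime() mask:    $strftimeMask\n";
echo "ICU mask:           $icuMask", ($viaIntl === null ? '' : " -> $viaIntl"), "\n";
```

Three APIs, three alphabets, one layout.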

Also now look at what these two code snippets output:


// timestamp
Formatted date: Sonntag 12 Januar, 2014

// DateTime
Formatted date: Montag 1 Dezember, 2014

Erm... come again? I passed 1/12/2014, which is clearly Dec 1. Isn't it? No? Where are you sitting? In the middle of North America, right? Well... yes. You guys can't do dates properly. A date goes d/m/y, not m/d/y dudes. Sigh. Unfortunately strtotime() seems to ass-u-me a USA format, which is odd given my machine's locale is en-GB, and the request's locale is de-DE. Neither of which use m/d/y (because they're not frickin' daft). Fortunately DateTimes are a bit forward thinking and allow one to specify the mask to use when creating a date from a string, which is kinda essential. Hence the DateTime got it right, and strtotime() got it wrong. I could not find a way of telling strtotime() that it's not being run in USA, but if there is, please let me know.

Now, I set strtotime() up to fail here by using an "ambiguous" date string; but a lot of date strings are ambiguous. I could have monkeyed around with the date string before giving it to strtotime(), but needing to do that would demonstrate from the outset that using timestamps isn't a fully-formed solution in PHP.

Normally when passing dates around the place as a string, I would use one of the ISO formats, which are unambiguous for everyone, btw.
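
A footnote on the ambiguity: the PHP manual's notes on date formats say strtotime() picks the field order from the separator rather than from any locale. Slashes mean the American m/d/y; dashes or dots mean d-m-y. Here's a small sketch of that rule next to an ISO 8601 string. The timezone is pinned to UTC and the variable names are mine, purely for a repeatable demo.

```php
<?php
date_default_timezone_set('UTC'); // keep the output independent of the machine's timezone

$inputs = [
    '1/12/2014'  => 'slashes: read as m/d/y',
    '1-12-2014'  => 'dashes: read as d-m-y',
    '2014-12-01' => 'ISO 8601: unambiguous',
];

$parsed = [];
foreach ($inputs as $raw => $note) {
    // Normalise each parsed timestamp back to Y-m-d so the interpretations can be compared.
    $parsed[$raw] = date('Y-m-d', strtotime($raw));
    printf("%-10s -> %s (%s)\n", $raw, $parsed[$raw], $note);
}
// 1/12/2014  -> 2014-01-12 (slashes: read as m/d/y)
// 1-12-2014  -> 2014-12-01 (dashes: read as d-m-y)
// 2014-12-01 -> 2014-12-01 (ISO 8601: unambiguous)
```

So swapping the separator is the "monkeying around" option; it works, but it relies on a rule that's easy to forget.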

Next I had a look at date arithmetic: adding a year, month, week, day and hour to my date:

// timestamp
$futureDate = strtotime('+1 year +1 month +1 week +1 day +1 hour', $date);
printf("Future date: %d / %s\n", $futureDate, dateToString($futureDate));

Future date: 1424394000 / Freitag 20 Februar, 2015

// DateTime
$oneYearOneMonthOneWeekOneDayOneHour = new DateInterval("P1Y1M8DT1H");
$date->add($oneYearOneMonthOneWeekOneDayOneHour); // add() modifies $date in place
printf("Future date: %s\n", dateToString($date, $localeFromUser));

Future date: Samstag 9 Januar, 2016

strtotime() also does date arithmetic. Odd. The name strtotime() says "convert a string to a time", it does not say "and maybe add or subtract some stuff to it too". So that's pretty poor language design again. PHP ought to have separate functions here. Still: it's a nicer approach than CFML's in a way, as it can add all the bits at once. In CFML we'd need to do:

futureDate = date.add("yyyy", 1).add("m", 1).add("ww", 1).add("d", 1).add("h", 1)

The DateTime approach requires an object for the interval, but that makes good sense. Notice yet another two date-part masking approaches here.
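
To make the interval approach concrete, here's a minimal sketch with a fixed starting point (midnight UTC on 1 December 2014, which is my assumption so the output is repeatable). It also shows that add() changes the object it's called on, hence the clone. The variable names are mine.

```php
<?php
$start = new DateTime('2014-12-01 00:00:00', new DateTimeZone('UTC'));

// P = period, 1Y = one year, 1M = one month, 8D = a week plus a day, T = time part, 1H = one hour.
$interval = new DateInterval('P1Y1M8DT1H');

// add() modifies the object it is called on, so work on a copy to keep $start intact.
$future = clone $start;
$future->add($interval);

echo $start->format('l j F Y H:i'), "\n";  // Monday 1 December 2014 00:00
echo $future->format('l j F Y H:i'), "\n"; // Saturday 9 January 2016 01:00
```

If the in-place modification is unwelcome, DateTimeImmutable offers the same add() method but returns a new object instead.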

Finally I look at getting the differences between dates:

// timestamp
$secondsUntilNow = $futureDate - time();
$daysUntilNow = (int) ($secondsUntilNow / (24 * 60 * 60));
printf("Difference from now (days): %s\n", $daysUntilNow);

Difference from now (days): 82

// DateTime
$daysUntilNow = $date->diff(new DateTime());
printf("Var type of date difference: %s\n", get_class($daysUntilNow));
printf("Difference from now (days): %s\n", $daysUntilNow->format("%R%a"));

Var type of date difference: DateInterval
Difference from now (days): -406

Strange but true: there seems to be no function to take a difference between two timestamps. Everything I read (in the docs, and on Stack Overflow) just suggests using DateTimes for this sort of thing. But that would be a cop-out for these contrasting demonstrations, so what we need to do is to use standard arithmetic: take the numeric difference between the two, divide by the number of seconds in a day, then round it off. Bleah. That's godawful.
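
One gotcha with this sort of arithmetic is that not every day has 86400 seconds in it. Here's a sketch under my own assumptions (the Europe/London timezone, and two midnights either side of the 30 March 2014 switch to summer time) showing that truncating and rounding give different day counts.

```php
<?php
$london = new DateTimeZone('Europe/London');

// Midnight either side of the 30 March 2014 switch to British Summer Time.
$from = (new DateTime('2014-03-29 00:00:00', $london))->getTimestamp();
$to   = (new DateTime('2014-03-31 00:00:00', $london))->getTimestamp();

$seconds = $to - $from; // 47 hours rather than 48, because the clocks went forward

$truncated = (int) ($seconds / 86400);      // drops the fraction
$rounded   = (int) round($seconds / 86400); // rounds to the nearest whole day

printf("Seconds: %d\n", $seconds);            // Seconds: 169200
printf("Truncated days: %d\n", $truncated);   // Truncated days: 1
printf("Rounded days: %d\n", $rounded);       // Rounded days: 2
```

Two calendar days apart, but an (int) cast says one. Rounding rescues this case, but it's a sign that the arithmetic is standing in for calendar logic it doesn't really know about.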

Note that with the DateTime version we are working with DateIntervals again, and those allow us to format/return the interval as whatever unit we want.
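
Here's a small sketch of that, using two fixed dates rather than the real "now" so the numbers are repeatable (the dates, the UTC timezone and the variable names are my stand-ins). It shows the signed day count via format(), plus the days and invert properties that sit on the DateInterval.

```php
<?php
$utc = new DateTimeZone('UTC');

$future = new DateTime('2016-01-09 00:00:00', $utc);
$now    = new DateTime('2014-11-29 00:00:00', $utc); // stand-in for "now"

// $a->diff($b) measures from $a to $b, so going from the future back to "now" is negative.
$interval = $future->diff($now);

echo $interval->format('%R%a'), "\n"; // -406 (%R is the sign, %a the total number of days)
echo $interval->days, "\n";           // 406 (total days, always unsigned)
echo $interval->invert, "\n";         // 1 (1 means the interval is negative)
```

Swapping the two operands flips the sign, which is usually what's wanted for a "days until" figure.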

My conclusion is very much that using timestamps seems like a very last-decade way of doing things... from before PHP had any nod to OO. DateTimes seem the better solution, even if one does need to "horse around being all OO" in places. This makes for better code than dividing seconds by days-worth of seconds and rounding things off to do a date-diff operation. That. Is. Rubbish.

I'm gonna stick with DateTimes.