You know, time kind of sucks.
I mean, I figured the other day that time doesn't really exist, but ignoring that little detail for the moment..
What does this mean: 2004-06-23 19:10:42?
It actually doesn't really mean anything by itself. The problem is that it needs to be taken in a context. Basically I need to know the Time Zone and determine if Daylight Time applies. Then I would know 'exactly' what I mean.
The thing is, users of computer programs don't think that's cool. Further, sometimes even the business doesn't care about the 'real' time, they care about the 'socially adjusted time'. If I want to know if something happened during business hours for example, and that thing is related to five different people or entities that exist in potentially different time zones and I store a 'real' time, then if I wanted to know what happened between 9am and 5pm (ridiculous working hours! i know..) then I would have to join through a billion tables to find all those 'related entities that could be used to determine a time context' and then apply some logic to determine which *one* (which might not always be clear, or correct) to apply, so I could find a 'socially adjusted' date and time. (heh, I just like the term 'socially adjusted' OK! ;)
The easiest thing to do of course, is only store the 'socially adjusted' time, and forget entirely about determining context. So, if I have users around the country doing work for me and they say they did the work at 2004-06-23 19:10:42 then I'll just take that for granted, and ignore the 'details' of when the work 'really happened'. Of course, sometimes I might also need to know when something 'really' happened (for audit or logging purposes for example) and what do I do then?
Should I store UTC time in my DB for audit / logging time, and 'socially adjusted' times for business information..?
My RDBMS only has the concept of smalldatetime and datetime. My programming platform only has the concept of DateTime and TimeSpan. I have my own API for nullable dates, nullable times, and nullable datetimes. I'm wondering if I should create another date type, so that at least by the time I'm working with data in my programming environment I can determine the 'type' of a date by it's type. That is, I'm considering implementing a MyRealTime class in my nullable type library. My MyDate and MyTime class are both 'socially adjusted' types, a MyDate has an indeterminate relative context that will apply but that is deemed generally unuseful so it can remain 'indeterminate' (basically meaning I have no idea what time zone applies to that value). A MyTime is relative to a MyDate, so by extension is also 'socially adjusted', it just models the 'period since the socially adjusted date last changed'. I encapsulate a DateTime in a MyDate (using on the the date part), and a TimeSpan in a MyTime (using only hours less than 24) and provide formatting and serialization capabilities. A MyDateTime is implemented with a DateTime and uses both the date and time parts. All temporal data that I model is restricted in range to be suitably compatible with my databases ability to persist such data. However, at the moment, MyDateTime doesn't really have any indication of it's 'type' built in. Sometimes it might mean a 'real' time, other times it might mean a 'socially adjusted' time. Perhaps that is for the best.. but I'm not sure. There are a few issues, and that's what I'm here to talk about.
Firstly, there is serialization. When I serialize a MyDate or a MyTime I can just use any sort of format I like, something like yyyy-MM-dd and HH:mm:ss. That's fine, because we don't care for being more precise with these types. However, what to do about MyDateTime? If I serialize it using RFC-822 (updated in RFC-1123) then I would likely want to store time zone information. That's fine, if I know the time zone that I'm running under. What if I have anonymous users entering data on my website telling me what time of day they did something..? Then if I serialize one of these babies using a 'real' time, then I'm basically just telling lies. I have 'no idea' that is 'the truth' about the time, so why would I want to give anyone the indication that I had any idea what I was talking about by encoding it in UTC (aka GMT) with a local time indicator? I really wouldn't know what the 'local time' was, unless I asked the user, who'd probably get confused, or get the shits, or make a mistake, etc.
Other interesting facets of dates and times are to do with locale itself. If I want to display a date or a time to user, then I might like to format it for their locale, say using their spelling for the day of the week, etc. However, during transmission between my servers, and during persistence I'd really want to know exactly what it was that I was dealing with, and I'd need an 'invariant' or 'agnostic' method of persisting that data. So, I couldn't use an accommodating 'locale' for storage in a database, or during serialization, for example.
Other problems also arise simply from the variety of potentially applicable time zones and platform persistence/precision capabilities, for example, if I have a DateTime that is at the minimum value it can't be stored in my database. Further, if I create a DateTime in my time zone (say +10) and I encode the minimum value and transmit it to another time zone (say -4) then I need to take the minimum value back by 14 hours for find a 'real' time, but I can't represent less than the minimum value, so I'll have a problem and won't be able to read the time at all. In such a situation, having time zone information actually creates problems, and such problems might not have been necessary based on the business requirements.
Then of course there is that whole 'what time is my server in?' thing to worry about. If I write logs with UTC then I need to work harder to give the user something that is useful to them, and I've got all sorts of conflicting systems working against each other. The system time, the system time APIs, the database time, the database time APIs, the users time, the users thread culture, the servers thread culture, the UI culture, etc.
At any rate, I'm not too interested in making things hard for myself, and I don't want to introduce the possibility for much error. So, what I'm really thinking about now is:
A) Do I create two classes to represent different types of time, i.e. 'real' and 'socially adjusted'.
B) Regardless of A), how will I serialize the types that I do have, in an XML document, binary format, etc.
The best answers are probably the ones that I won't have time to do, and that aren't likely to be justifiable from a business perspective (because 'doing it right' would probably mean making the user 'work harder' and that's not cool if the business doesn't care!).
I think that I'll use 'smalldatetime' in my DB to store MyDate and MyTime types. I think I'll keep the MyDateTime type and apply it's semantics as a 'real' time. If I need a 'socially adjusted' DateTime then I will use two fields, one with a MyDate and the other with a MyTime. I will serialize MyDateTime by first converting from the 'user time zone' to GMT then using the RFC-1123 format, and I will serialize MyDate with the ISO yyyy-MM-dd format, and MyTime with the HH:mm:ss format.
The 'user time zone' will be whatever time the app domain that conducts serialization/serialization knows about a 'system time'. So, in a web-based application 'user time' is the time on the web server, this will be suitable for most users, and will at least be standard for other users. If a MyDateTime is ever in any execution context that queries for its value, the value will always be a 'real time' relative to the system time in which it is queried. This can be easily accomplished by ensuring it has the appropriate value at the time that it is serialized. On the other hand, and data that is ever 'on the wire' or 'in the database' will be in GMT. So, serialization goes to GMT, and serialization comes from GMT, in each case the non-GMT time is dictated by the users environment, whether that's a web server or a workstation.
MyDate will have a ToString() property that will return a short date based on the current culture. Serialization will be a 'socially adjusted' date (in other words, no change in time zone will be catered for) using the date part of the ISO 8601 standard (yyyy-MM-dd).
MyTime will have a ToString() property that will return a short time based on the current culture. Serialization will be a 'socially adjusted' time (or 'time of day', what day? pick one!) using the time part of the ISO 8601 (HH:mm:ss). Binary serialization can use native platform serialization of TimeSpan, or could override that and serialize as a string in all cases (still thinking about potential significant performance hit, but it is attractive from the POV of consistency.. in either case, it doesn't matter how as long as it is understood that there is no conversion and that those times don't really *mean* anything unless you know or can infer something about their *real* context).
MyDateTime will have a ToString() method that will return a date and time based on the current culture. Serialization will be done by converting to GMT and then formatting as RFC-1123. The reason I will 'actually convert' the data to GMT rather than using a local time indicator (like say +10) is because my database won't be able to store this information as readily in a consistent fashion. Basically I want to always have that data literally in GMT when 'in transit or storage' and in 'user time' for the UI. So serialization will make the time-zone adjustment back to a 'real time' in the users time-zone, the time-zone will be the app-domain that does the serialization, so I can serialize between app domains with no problems and always no what I'm dealing with. Thus, all data logged in the database from a MyDateTime will be known GMT time, but will always appear as a localized time for a user (except if retrieve directly from the database, which would be unusual since this is intended only for 'audit' information not 'core business' information). The serialization/serialization of a MyDateTime will need to look in an appropriate place for a 'known time zone'. Some data will be logged in the database *by the database*, so T-SQL for logging GMT will need to be found (T-SQL has GETUTCDATE()). Some data will be created by the 'system' so the 'system' time zone will need to be known, this will be fine based on the system clock and automatically supported by the .NET framework and methods on DateTime. Users distributed across a vast geographic area interacting on a single machine (like a web interface) may need to specify an 'exact time', and but knowing their time-zone would be necessary to allow for this. We don't want to build this into the serialization process (although we probably could with some logical thread affinitive data to indicate their time-zone, and then juggle GMT, system time and user time). In the case that we need to do this, we'll do the conversion from 'user time' to 'system time' prior to serialization and serialization will convert from 'system time' to GMT time for data persistence. This should be fine for now. In the future, the MyRealTime class might be implemented to automatically juggle a 'real' time based on 'user time' separate from 'system time' during serialization/serialization, but there is no pressing need for this class.
Examples of situations where a user would enter a 'socially adjusted date and time' would be “What time did you arrive at the store?”, where as a 'real' time would be “What time are you arriving in Sydney?”. I won't have to deal much with 'real' times, apart from for logging. Basically I will know that anything in my database stored as a datetime is a GMT time and anything in a smalldatetime is either a 'socially adjusted' date, or a 'time of day'. If I'm using any temporal data in my programming environment I can know that the time that is reported to the user will be relative to the 'system time' which will be the web servers time for web users and the workstation time for standalone users.
It might be worth adding some very basic checks during login to determine if a users reported time-zone and system clock seem to be in an appropriate state. This would only be relevant if the system was going to aid in 'picking' dates for the user, which I'd say should generally be avoided, either the user tells us a date/time or we generate a value on a server with a well maintained clock.
There... that was fun! (boy I hope I don't get a HTTP failure when I post this... CTRL+A, CTRL+C)
John.
(yep, HTTP failure.. lucky, I'm on the ball! ;)
A few extra notes/updates, etc.
In case it wasn't obvious the My* prefix is bogus, my types aren't acutally called that.. :P
I'll use HH:mm for MyTime, since that is minimum the level of precision (since I store in smalldatetime).
Using the smalldatetime in all circumstance will be fine for me. So, I can actually 'encode' the semantics of the temporal data (i.e. is it 'socially adjusted' or GMT). That's handy, and worth what I lose in flexibility.
I think I'll always serialize my types as strings. This will add reletively significant bloat on the wire, but will mean that validating data that goes onto or comes off the wire will be easier, and in many cases I'll want to be serializing to text based formats anyway (i.e. XML, XML or XML, etc.).
There is still potential for confusion, since a users screen will display audit information localised to their time. This will generally be the most useful format for the user though. I hope to avoid having anything but audit information stored as 'real' time, as this will make my life easier when reporting data from the database.
I'm still a little worried about all this, working with dates across disparate systems, time-zones and cultures is always going to be a little rough, particularly when sometimes you don't actually care about the 'real' time and sometimes you do..
John.