Rob Gonda's Blog

ColdFusion Vs. SQL UUID

A few days ago I blogged about database level data integrity and promised a follow up concentrating in uuids.

A UUID stands for Universally Unique Identifier. The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else. Information labelled with UUIDs can therefore be later combined into a single database without needing to resolve name conflicts. The most widespread use of this standard is in Microsoft's Globally Unique Identifiers (GUIDs) which implement this standard (source: wikipedia).

A UUID is essentially a 16-byte (128-bit) number. In its canonical form a UUID may look like this:

    xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (8-4-4-4-12)

However, for some reason ColdFusion's UUID looks like

    xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxx (8-4-4-16)

Microsoft SQL has a native datatype called uniqueidentifier, which represents the 36-characters GUID. Many ColdFusion developers choose not to use the GUID because it cannot be implicitly validated by ColdFusion and it cannot be seamlessly moved to a different database like mysql, postgre, oracle.

The most widely adopted solution is to use a 35-character primary key and insert a ColdFusion UUID, nonetheless, how do you validate a proper uuid at the database level? What if you want the database to generate the primary key? If the key gets altered, it will fail ColdFusions implicit UUID datatype validation.

The solution is to add some constraints in the database level.

Is it really simple to generate a UUID, since all it takes it just to remove the 4th hyphen.

CREATE FUNCTION dbo.newUUID(@GUID varchar(36))
RETURNS varchar(35)
AS
BEGIN
 RETURN left(@GUID, 23) + right(@GUID,12)
END


Note that due to limitations and not being able to invoke a newID() function inside a user defined function, we need to pass the GUID. Now, that said, we can add a default value to our primary keys and let SQL Server generate them for us:

Default Value: dbo.newUUID(newid())


To validate a proper UUID is a little more complicated, since SQL has no native isUUID or isGUID function. I chose to use a regular expression, but guess what? SQL Server 2000 has no regular expression capabilities.

So step one is to create a regular expression evaluator function

CREATE FUNCTION dbo.find_regular_expression
    (
        @source varchar(5000),
        @regexp varchar(1000),
        @ignorecase bit = 0
    )
RETURNS bit
AS
    BEGIN
        DECLARE @hr integer
        DECLARE @objRegExp integer
        DECLARE @objMatches integer
        DECLARE @objMatch integer
        DECLARE @count integer
        DECLARE @results bit
       
        EXEC @hr = sp_OACreate 'VBScript.RegExp', @objRegExp OUTPUT
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END
        EXEC @hr = sp_OASetProperty @objRegExp, 'Pattern', @regexp
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END
        EXEC @hr = sp_OASetProperty @objRegExp, 'Global', false
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END
        EXEC @hr = sp_OASetProperty @objRegExp, 'IgnoreCase', @ignorecase
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END   
        EXEC @hr = sp_OAMethod @objRegExp, 'Test', @results OUTPUT, @source
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END
        EXEC @hr = sp_OADestroy @objRegExp
        IF @hr <> 0 BEGIN
            SET @results = 0
            RETURN @results
        END
    RETURN @results
    END


Now that we have this, all we need is the UUID regEx pattern and call this function.

CREATE FUNCTION dbo.isUUID (@uuid varchar(35)) 
RETURNS bit AS 
BEGIN

DECLARE @uuidRegex varchar(50)
SET @uuidRegex = '^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{16}$'

RETURN dbo.find_regular_expression(@uuid,@uuidRegex ,0)

END


Alright! now we have a isUUID function, which you can easily invoke from everywhere... open a sql script and execute

SELECT [dbo].[isUUID]('D929E4FB-537C-495F-BB3F31B8E42C0FBB')


Now that we tested it and know how it works, all we need is to add a constraint to your primary key:

Open your table in design mode, click constraints, new, and add this line:

([dbo].[isUUID]([ID]) = 1)


where ID is the name of the primary key.

So you learned how to generate a UUID, default your primary key to use one, validate a UUID regEx, and add a constraint to enforce db data integrity.

Related Blog Entries

TrackBacks
There are no trackbacks for this entry.

Trackback URL for this entry:
http://www.robgonda.com/blog/trackback.cfm?8EDCA79D-3048-7431-E4267A13B3517212

Comments
Nice post! Very useful.
# Posted By Sami Hoda | 2/4/07 8:25 PM
Rob, why wouldn't you just use SQL's UUID format throughout your app and db? CF7 can use IsValid("guid",someUUID) which will correctly validate SQL's UUID format. Perhaps I missed the point you were making - sorry.

It would be nice if CF8 introduces a new attribute in CreateUUID() to naively generate an SQL format UUID. For now I use insert("-", CreateUUID(), 23) which does the same thing.

Another difference between CF and SQL's UUIDs is that CF so obviously bases its UUID on the current time whereas SQL uses what looks like a totally random sequence of characters. There are 2 issues with that. Firstly, when you rely heavily on UUIDs and you're trying to compare records for testing/debugging purposes you usually only have to remember the first 4 or 5 chars of a SQL UUID if you want to compare one UUID with another (they are so random it's likely the first chars are unique), but a CF UUID usually has the same first 4 or 5 chars which makes a quick mental comparaison difficult.

Secondly, if your app generates lots of records then only the middle 4 or so chars will change in a CF generated UUID over a small period of time. That means if you are relying on the obscurity of a UUID (like who can correctly guess a 128 bit ID?) to protect a record from being viewed by someone fiddling with the url parameter, then it won't take long for someone to manually or programmatically change a few chars and correctly guess a UUID that exists. Of course the lesson is to never rely on obscurity, but you've got to admire the visual randomness of MS SQL's UUIDs. I suspect they hash a number to achieve it so I'm sure CF could do the same quite easily.
# Posted By Gary Fenton | 2/4/07 8:44 PM
http://www.ietf.org/rfc/rfc4122.txt

If you're going to use SQL Server, have you considered using varchar(36), along with newID() for your default, and in ColdFusion do as Gary suggests and insert a dash in the appropriate place when creating your IDs from ColdFusion?
# Posted By Calvin | 2/5/07 7:35 AM
Hi
Very like it!
I like it a lot! Good work, keep it up. Tomorrow I shall return again.
# Posted By ???? | 2/6/07 4:14 AM
This blog is running version 5.9.003. Contact Blog Owner