Home SQL partially serializable
Reply: 0

SQL partially serializable

Mark Sowul
1#
Mark Sowul Published in 2014-06-13 22:47:02Z

The high-level issue that we are trying to solve is to prevent duplicate records from a user-kicked-off ETL process, which calls a procedure for each row it is importing (this is a simplification and changing the design is not part of my question). So basically we have something like this (note that the table has about 20 columns):

PROCEDURE ImportRow ( @p1, @p2, @p3 etc )

IF NOT EXISTS
(SELECT 1
FROM TargetTable
WHERE col1 = @p1 AND col2 = @p2 AND col3 = @p3 etc)

INSERT INTO TargetTable(col1, col2, col3...)
VALUES (@p1, @p2, @p3)

So obviously we have a race condition that if two people inadvertently run the same importer at the same time, we end up with duplicate records.

We are willing to use SERIALIZABLE to ensure that IF NOT EXISTS will prevent the second transaction from doing the insert...

PROCEDURE ImportRow ( @p1, @p2, @p3 etc )

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

IF NOT EXISTS
(SELECT 1
FROM TargetTable (UPDLOCK)
WHERE col1 = @p1 AND col2 = @p2 AND col3 = @p3 etc)

INSERT INTO TargetTable(col1, col2, col3...)
VALUES (@p1, @p2, @p3)

...however, as you know, without a covering index, this will lock the entire table...

...and we can't add a covering index because we have 20 columns.

But the first two parameters/columns are basically an "importer id" and an "as-of date", so, really it would be okay to have only THAT be serializable. In other words, prevent the same importer from being run more than once, i.e. have the locking granularity be between 'entire table' and 'row-by-row'. Something like this:

PROCEDURE ImportRow ( @importer, @asOfDate, @p3, @p4, @p5 etc )

--lock down the importer
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT 1
FROM TargetTable (UPDLOCK)
WHERE importer = @importer AND asOfDate = @asOfDate

IF NOT EXISTS
(SELECT 1
FROM TargetTable (UPDLOCK)
WHERE importer = @importer AND asOfDate = @asOfDate AND col3 = @p3 AND col4 = @p4 AND col5 = @p5 etc)

INSERT INTO TargetTable(importer, asOfDate, col3, col4, col5...)
VALUES (@importer, @asOfDate, @p3, @p4, @p5)

We have a covering index on importer and asOfDate, so only that should be locked down with serializable, while the rest of the table should be free and clear. The problem is that now this obviously produces a slow, ugly result set from that first SELECT query.

So is there another way to do this "partial serializable"? I was thinking of burning a temp table, but that's wasteful.

SELECT 1
INTO #Ignored FROM TargetTable (UPDLOCK)
WHERE importer = @importer AND asOfDate = @asOfDate

The only other thing I can think of would be to put importer/asOfDate in a separate table and giving that pair an id ('importerDateId'), and then use that importerDateId in TargetTable, i.e. to normalize it a bit more, but I'd like to avoid a schema change.

You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.282961 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO