Sunday, August 9, 2009

Should I use a #temp table or a @table variable?

From: http://databases.aspfaq.com/

In a stored procedure, you often have a need for storing a set of data within the procedure, without necessarily needing that data to persist beyond the scope of the procedure. If you actually need a table structure, there are basically four ways you can "store" this data: local temporary tables (#table_name), global temporary tables (##table_name), permanent tables (table_name), and table variables (@table_name).

There is a partial list of questions and answers about table variables, including some differences between table variables and #temp tables, in KB #305977 - INF: Frequently Asked Questions - SQL Server 2000 - Table Variables.


Local Temporary Tables

CREATE TABLE #people
(
id INT,
name VARCHAR(32)
)

A temporary table is created in the system database tempdb, with a session-specific identifier appended to the name to differentiate it from similarly named #temp tables created by other sessions. The table and its data are visible only to the current scope (usually a stored procedure, or a set of nested stored procedures). The table is cleaned up automatically when the current procedure goes out of scope, but you should drop it yourself when you're done with it:

DROP TABLE #people

This will be better on resources ("release early") than if you let the system clean up *after* the current session finishes the rest of its work and goes out of scope.
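When a script is re-run in the same session (for example, while testing interactively), a leftover #people from an earlier run makes the CREATE TABLE fail. A minimal sketch of a guarded create and early drop, assuming the common idiom of resolving the temp table's name through tempdb with OBJECT_ID (the sample row is illustrative only):

```sql
-- Drop any leftover copy from an earlier run in this session.
IF OBJECT_ID('tempdb..#people') IS NOT NULL
    DROP TABLE #people

CREATE TABLE #people
(
    id INT,
    name VARCHAR(32)
)

INSERT #people (id, name) VALUES (1, 'aaron')

SELECT id, name FROM #people

-- Release the table as soon as the work is done ("release early").
DROP TABLE #people
```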

A common use of #temp tables is to summarize/compact/reorganize data coming from another stored procedure. So, take this example, which pares down the results of the system procedure sp_who2 into only the SPID, Status, and HostName of *active* processes that are *not* part of the regular operation of the system:

CREATE TABLE #sp_who3
(
SPID INT,
Status VARCHAR(32) NULL,
Login SYSNAME NULL,
HostName SYSNAME NULL,
BlkBy SYSNAME NULL,
DBName SYSNAME NULL,
Command VARCHAR(32) NULL,
CPUTime INT NULL,
DiskIO INT NULL,
LastBatch VARCHAR(14) NULL,
ProgramName VARCHAR(32) NULL,
SPID2 INT
)

INSERT #sp_who3 EXEC sp_who2 'active'

SELECT SPID, Status, HostName FROM #sp_who3
WHERE spid > 15

DROP TABLE #sp_who3
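The column list above matches the sp_who2 output in SQL Server 2000. To my knowledge, SQL Server 2005 and later return one additional trailing column (REQUESTID), and INSERT ... EXEC fails unless the target table has the same number of columns as the result set. A sketch for those versions, assuming that 13-column layout; run sp_who2 on your own server first to confirm the shape of its output:

```sql
CREATE TABLE #sp_who3
(
    SPID INT,
    Status VARCHAR(32) NULL,
    Login SYSNAME NULL,
    HostName SYSNAME NULL,
    BlkBy SYSNAME NULL,
    DBName SYSNAME NULL,
    Command VARCHAR(32) NULL,
    CPUTime INT NULL,
    DiskIO INT NULL,
    LastBatch VARCHAR(14) NULL,
    ProgramName VARCHAR(32) NULL,
    SPID2 INT,
    RequestID INT NULL  -- assumed extra column on SQL Server 2005+
)

INSERT #sp_who3 EXEC sp_who2 'active'

SELECT SPID, Status, HostName FROM #sp_who3
WHERE SPID > 15

DROP TABLE #sp_who3
```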

One of the main benefits of using a #temp table, as opposed to a permanent table, is the reduction in the amount of locking required (since the current user is the only user accessing the table), and also there is much less logging involved. (You could also increase this performance by placing tempdb on a separate drive... but that's another story.)

One minor problem with #temp tables is that, because of the session-specific identifier that is tacked onto the name, the name you give it is limited to 116 characters, including the # sign (while other table types are limited to 128). If you try a longer name, you will see this:

Server: Msg 193, Level 15, State 1, Line 1
The object or column name starting with '#' is too long. The maximum length is 116 characters.

Hopefully this won't be a limitation in your environment, because I can't imagine a table name that long being useful or manageable.

Another potential problem with #temp tables is that, if you enter a transaction and use a #temp table, and then cancel without ever issuing a ROLLBACK or COMMIT, you could be causing unnecessary locks in tempdb (for more information, see KB #159747).
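One defensive habit is to check @@TRANCOUNT on the connection after an interrupted batch and roll back anything still open. A sketch, assuming the abandoned work should be discarded rather than committed (the table name #work is illustrative):

```sql
BEGIN TRANSACTION

CREATE TABLE #work (id INT)
INSERT #work VALUES (1)

-- ... suppose the batch is cancelled here, before COMMIT or ROLLBACK ...

-- Run afterwards on the same connection: @@TRANCOUNT reports how many
-- transactions are still open in this session.
IF @@TRANCOUNT > 0
    ROLLBACK TRANSACTION

-- The rollback also undoes the CREATE TABLE, so drop only if the table survived.
IF OBJECT_ID('tempdb..#work') IS NOT NULL
    DROP TABLE #work
```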


Global Temporary Tables

CREATE TABLE ##people
(
id INT,
name VARCHAR(32)
)

Global temporary tables operate much like local temporary tables; they are created in tempdb and cause less locking and logging than permanent tables. However, they are visible to all sessions until the creating session goes out of scope (and the global ##temp table is no longer being referenced by other sessions). If a second session runs the above code while the first is still active, it will receive the following:

Server: Msg 2714, Level 16, State 6, Line 1
There is already an object named '##people' in the database.

I have yet to see a valid justification for the use of a global ##temp table. If the data needs to be available to multiple users, then it makes much more sense, at least to me, to use a permanent table. You can make a global ##temp table slightly more permanent by creating it in an autostart procedure, but I still fail to see how this is advantageous over a permanent table. With a permanent table, you can deny permissions; you cannot deny users access to a global ##temp table.


Permanent Tables

CREATE TABLE people
(
id INT,
name VARCHAR(32)
)

A permanent table is created in the local database by default; however, unlike #temp tables, you can choose to create it in another database, or even on another server, to which you have access. A permanent table persists beyond the session in which it is created, unless you explicitly drop it. (For contention and concurrency reasons, creating a "temporary" permanent table in this manner doesn't really make a lot of sense.) If you are planning to create a permanent table when the procedure runs, you should check whether it already exists, in order to avoid errors like the one mentioned above. For more information about checking for the existence of both local #temp tables and permanent tables, see Article #2458.
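A minimal sketch of such an existence check for a permanent table, assuming the table belongs to the dbo schema (the dbo prefix is an addition, not part of the original example):

```sql
-- OBJECT_ID returns NULL when no object of that name exists in the
-- current database. Note that it matches any object type, not only tables.
IF OBJECT_ID('dbo.people') IS NULL
BEGIN
    CREATE TABLE dbo.people
    (
        id INT,
        name VARCHAR(32)
    )
END
```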

As with global ##temp tables, there seems to be little reason to use a permanent table unless the data is going to persist... and if that is the case, why not create the permanent table before the stored procedure is ever run, thereby eliminating all the CREATE / DROP logic?


Table Variables

DECLARE @people TABLE
(
id INT,
name VARCHAR(32)
)

A table variable is often described as being created in memory, but according to KB #305977 it is not a memory-only structure: like a #temp table, it is created in tempdb, since it may need to hold more data than fits in memory. It can still perform slightly better than a #temp table, because there is even less locking and logging involved.
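Whether a table variable touches tempdb can be checked directly. A rough sketch, assuming SQL Server 2005 or later (where the tempdb.sys.tables catalog view is available): declare a table variable and look at the most recently created objects in tempdb.

```sql
DECLARE @people TABLE
(
    id INT,
    name VARCHAR(32)
)

-- A table variable should appear in tempdb under a system-generated name
-- beginning with '#'. This is a rough check only: temporary objects from
-- other sessions appear in the same list.
SELECT TOP 5 name, create_date
FROM tempdb.sys.tables
WHERE name LIKE '#%'
ORDER BY create_date DESC
```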

Table variables are automatically cleared when the procedure or function goes out of scope, so you don't have to remember to drop or clear the data (which can be a good thing or a bad thing; remember "release early"?). The tempdb transaction log is less impacted than with #temp tables; table variable log activity is truncated immediately, while #temp table log activity persists until the log hits a checkpoint, is manually truncated, or the server restarts.

Table variables are the only way you can use DML statements (INSERT, UPDATE, DELETE) on temporary data within a user-defined function. You can create a table variable within a UDF, and modify the data using one of the above statements. For example, you could do this:

CREATE FUNCTION dbo.example1
(
)
RETURNS INT
AS
BEGIN
DECLARE @t1 TABLE (i INT)
INSERT @t1 VALUES(1)
INSERT @t1 VALUES(2)
UPDATE @t1 SET i = i + 5
DELETE @t1 WHERE i < 7

DECLARE @max INT
SELECT @max = MAX(i) FROM @t1
RETURN @max
END
GO

However, try that with a #temp table:

CREATE FUNCTION dbo.example2
(
)
RETURNS INT
AS
BEGIN
CREATE TABLE #t1 (i INT)
INSERT #t1 VALUES(1)
INSERT #t1 VALUES(2)
UPDATE #t1 SET i = i + 5
DELETE #t1 WHERE i < 7

DECLARE @max INT
SELECT @max = MAX(i) FROM #t1
RETURN @max
END
GO

Results:

Server: Msg 2772, Level 16, State 1, Procedure example2, Line 7
Cannot access temporary tables from within a function.

Or try accessing a permanent table:

CREATE TABLE table1
(
id INT IDENTITY,
name VARCHAR(32)
)
GO

CREATE FUNCTION dbo.example3
(
)
RETURNS INT
AS
BEGIN
INSERT table1(name) VALUES('aaron')
RETURN SCOPE_IDENTITY()
END
GO

Results:

Server: Msg 443, Level 16, State 2, Procedure example3, Line 8
Invalid use of 'INSERT' within a function.

Table variables can lead to fewer stored procedure recompilations than temporary tables (see KB #243586 and KB #305977). In addition, since changes to a table variable are not rolled back with the surrounding transaction, they require less transaction log activity.
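The difference in rollback behavior is easy to see side by side. In this sketch (the names @t and #t are illustrative), the same INSERT is rolled back for both kinds of table; the #temp table loses its row, while the table variable keeps it:

```sql
DECLARE @t TABLE (i INT)
CREATE TABLE #t (i INT)

BEGIN TRANSACTION
    INSERT @t VALUES (1)
    INSERT #t VALUES (1)
ROLLBACK TRANSACTION

SELECT COUNT(*) AS table_variable_rows FROM @t  -- 1: unaffected by the rollback
SELECT COUNT(*) AS temp_table_rows FROM #t      -- 0: the insert was undone

DROP TABLE #t
```

This is convenient for capturing diagnostic rows that must survive a rollback, but it also means a table variable cannot be used where transactional consistency of the temporary data matters.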

So, why not use table variables all the time? Well, when something sounds too good to be true, it probably is. Let's visit some of the limitations of table variables (part of this list was derived from KB #305977):
  • Table variables are only allowed in SQL Server 2000+, with compatibility level set to 80 or higher.

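  • You cannot use a table variable in either of the following situations (the first restriction applies to SQL Server 2000; INSERT ... EXEC into a table variable is allowed in SQL Server 2005 and later):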

    INSERT @table EXEC sp_someProcedure

    SELECT * INTO @table FROM someTable

  • You cannot truncate a table variable.

  • Table variables cannot be altered after they have been declared.

  • You cannot explicitly add an index to a table variable; however, you can create a system index through a PRIMARY KEY constraint, and you can add as many indexes via UNIQUE constraints as you like. What the optimizer does with them is another story. One thing to note is that you cannot explicitly name your constraints, e.g.:

    DECLARE @myTable TABLE
    (
    CPK1 int,
    CPK2 int,
    CONSTRAINT myPK PRIMARY KEY (CPK1, CPK2)
    )

    -- yields:
    Server: Msg 156, Level 15, State 1, Line 6
    Incorrect syntax near the keyword 'CONSTRAINT'.

    -- yet the following works:
    DECLARE @myTable TABLE
    (
    CPK1 int,
    CPK2 int,
    PRIMARY KEY (CPK1, CPK2)
    )

  • You cannot use a user-defined function (UDF) in a CHECK CONSTRAINT, computed column, or DEFAULT CONSTRAINT.

  • You cannot use a user-defined type (UDT) in a column definition.

  • Unlike a #temp table, you cannot drop a table variable when it is no longer necessary—you just need to let it go out of scope.

  • You cannot generate a table variable's column list dynamically, e.g. you can't do this:

    SELECT * INTO @tableVariable

    -- yields:

    Server: Msg 170, Level 15, State 1, Line 1
    Line 1: Incorrect syntax near '@tableVariable'.

    You also can't build the table variable inside dynamic SQL, and expect to use it outside that scope, e.g.:

    DECLARE @colList VARCHAR(8000), @sql VARCHAR(8000)
    SET @colList = 'a INT,b INT,c INT'
    SET @sql = 'DECLARE @foo TABLE('+@colList+')'
    EXEC(@sql)
    INSERT @foo SELECT 1,2,3

    -- this last line fails:

    Server: Msg 137, Level 15, State 2, Line 5
    Must declare the variable '@foo'.

    This is because the rest of the script knows nothing about the temporary objects created within the dynamic SQL. Like other local variables, table variables declared inside of a dynamic SQL block (EXEC or sp_executeSQL) cannot be referenced from outside, and vice-versa. So you would have to write the whole set of statements to create and operate on the table variable, and perform it with a single call to EXEC or sp_executeSQL.

  • The system will not generate automatic statistics on table variables. Likewise, you cannot manually create statistics (statistics are used to help the optimizer pick the best possible query plan).

  • An INSERT into a table variable will not take advantage of parallelism.

  • The optimizer always estimates the cardinality of a table variable as 1, because the table doesn't exist at compile time.

  • Table variables must be referenced by an alias, except in the FROM clause. Consider the following two scripts:

    CREATE TABLE #foo(id INT)
    DECLARE @foo TABLE(id INT)
    INSERT #foo VALUES(1)
    INSERT #foo VALUES(2)
    INSERT #foo VALUES(3)
    INSERT @foo SELECT * FROM #foo

    SELECT id
    FROM @foo
    INNER JOIN #foo
    ON @foo.id = #foo.id

    DROP TABLE #foo

    The above fails with the following error:

    Server: Msg 137, Level 15, State 2, Line 11
    Must declare the variable '@foo'.

    This query, on the other hand, works fine:

    SELECT f.id
    FROM @foo f
    INNER JOIN #foo
    ON f.id = #foo.id


  • Table variables are not visible to called procedures in the case of nested procs. The following is legal with #temp tables:

    CREATE PROCEDURE faq_outer
    AS
    BEGIN
    CREATE TABLE #outer
    (
    letter CHAR(1)
    )

    EXEC faq_inner

    SELECT letter FROM #outer

    DROP TABLE #outer
    END
    GO

    CREATE PROCEDURE faq_inner
    AS
    BEGIN
    INSERT #outer VALUES('a')
    END
    GO


    EXEC faq_outer

    Results:

    letter
    ------
    a

    (1 row(s) affected)

    However, you cannot do this with table variables. The parser will find the error before you can even create the inner procedure:

    CREATE PROCEDURE faq_outer
    AS
    BEGIN
    DECLARE @outer TABLE
    (
    letter CHAR(1)
    )

    EXEC faq_inner

    SELECT letter FROM @outer
    END
    GO

    CREATE PROCEDURE faq_inner
    AS
    BEGIN
    INSERT @outer VALUES('a')
    END
    GO

    Results:

    Server: Msg 137, Level 15, State 2, Procedure faq_inner, Line 4
    Must declare the variable '@outer'.

    For more information about sharing data between stored procedures, please see this article by Erland Sommarskog.
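For the dynamic SQL limitation in particular, the single-call approach described above can be sketched as follows, reusing the @colList idea from the failing example. The whole lifetime of @foo sits inside one dynamic batch:

```sql
DECLARE @colList VARCHAR(8000), @sql VARCHAR(8000)
SET @colList = 'a INT, b INT, c INT'

-- Declare, populate, and query @foo within the same dynamic batch,
-- so every reference to it is in the scope where it was declared.
SET @sql = 'DECLARE @foo TABLE (' + @colList + ') '
         + 'INSERT @foo SELECT 1, 2, 3 '
         + 'SELECT a, b, c FROM @foo'

EXEC(@sql)
```

Since the string is executed as-is, @colList should only ever be built from trusted input.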

Conclusion

Like many other areas of technology, there is no "right" answer here. For data that is not meant to persist beyond the scope of the procedure, you are typically choosing between #temp tables and table variables. Your ultimate decision should depend on performance and reasonable load testing. As your data size gets larger, and/or the repeated use of the temporary data increases, you will find that the use of #temp tables makes more sense. Depending on your environment, that threshold could be anywhere. However, you will obviously need to use #temp tables if any of the above limitations represents a significant roadblock.
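As a starting point for that kind of testing, one option is to run the same workload against both structures with SET STATISTICS IO and SET STATISTICS TIME enabled, then compare the figures reported in the messages output. A sketch, assuming a throwaway test set built from a cross join of sysobjects (any sufficiently large source of rows would do, and the 50000-row cap is arbitrary):

```sql
SET NOCOUNT ON
SET STATISTICS IO ON
SET STATISTICS TIME ON

CREATE TABLE #t (id INT, name SYSNAME)
DECLARE @t TABLE (id INT, name SYSNAME)

-- Load the same rows into both structures.
INSERT #t (id, name)
SELECT TOP 50000 a.id, a.name
FROM sysobjects a CROSS JOIN sysobjects b

INSERT @t (id, name)
SELECT id, name FROM #t

-- Run the same query against each and compare reads and elapsed time.
SELECT COUNT(*) AS temp_table_matches FROM #t WHERE name LIKE 's%'
SELECT COUNT(*) AS table_variable_matches FROM @t WHERE name LIKE 's%'

DROP TABLE #t

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
```

A single run like this says little on its own; repeat it with row counts and query shapes that resemble your real workload before drawing conclusions.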

