SQL: Introduction To Partitioning

When a database table grows in size to hundreds of gigabytes or more, it can become more difficult to load new data, remove old data, and maintain indexes. Just the sheer size of the table causes simple operations to take much longer than they might otherwise. Data that must be loaded or removed can be very sizable, making INSERT and DELETE operations on the table impractical. Database software, such as is provided with SQL Server, provides table partitioning to make such operations more efficient.

If your database has a very large table, you can “partition” this large table onto separate “filegroups.”  Filegroups are special types of files that allows a user to divide data onto different disks for the purpose of optimizing query performance and maintenance operations.  For example, say you have a tall file cabinet where you keep lists of customer names. As the number of files grow, the longer it takes to locate any given file.  Database tables operate on the same principle. 

The first thing you will need to do when planning out a partitioned table is to decide how you will divide the table up.  As with our file cabinet example, we might decide to create one file cabinet where last names begin with A – F, another cabinet for files G – N, and another for files O – Z.  Likewise, our table might also be divided up using last names as a way to organize.  You can see how faster it would be to locate the last name “Jones” in one of three cabinets instead of a giant cabinet.

As business changes and data grows, partitioning can become more complicated; however, the example above does provide a basic understanding as to how partitioning can save time and resources.

SQL : “Synonym Swap” Design Pattern

Frequently, I get asked to explain a database design pattern involving “synonym-swapping.”  Though I’m getting better at clarifying this approach, it is still often difficult for the business-side folks to really understand.  I will now attempt to write a post to direct these people so that they do not have to listen to my ramblings.

First, database synonyms are a type of “alias”, or alternate name for a table, view, sequence, or other schema objects in a database.  Some people refer to them as “pointers”  because sometimes a synonym might point to one table and at another time it will point to a different table.

Synonyms are great ways of referencing one or more tables in your database.  You can reference one or more tables by a single name. For example, say you had table that contained orders for your company.  Your orders table has millions of rows and needs to be accessed 24-hours a day.  However, what if updating the data in the table takes a couple of hours?   Also, what if maintenance takes a few hours?  Your users do not have the time to wait or suffer through a performance degradation.  What will you do?

1. Create two tables that are exact duplicates of each other (“ORDERS_TABLE_A”, “ORDERS_TABLE_B”).  Then create two synonyms, “Orders_Staging”,”Orders.”  The “Orders” synonym will ALWAYS point to the table with the latest data.

2. Say “Orders” is currently pointing to “ORDERS_TABLE_A” but the data needs refreshed.  You will update “ORDERS_TABLE_B” then “swap” the synonym to point to “ORDERS_TABLE_B.”  The time it takes to toggle a synonym between two tables is almost a split second.   If something goes awry during the data refresh, then the synonyms never get swapped and the user has no interruption of service either way.
Synonyms are a very useful tool as it means that in your application does not have to worry about changing the  table names from “ORDERS_TABLE_A” to “ORDERS_TABLE_B” and vice versa every time you do a refresh.

As you can see Synonyms are very useful tools for using aliases and can help when you have processes that involve building new tables to the side of existing ones for performance reasons.

MSDN Synonyms

An Even Better Explanation

SQL: Introduction to Databases

For the sake of the more non-technical people, a database is a computer full of spreadsheets.  Anytime you’ve looked at data in a grid-format, then you’ve looked at a spreadsheet at some point.  For example, Microsoft Excel is the most popular spreadsheet software among office folks.  Why do people use databases if they can just look at data in a spreadsheet?  There are many of wonderful uses of databases, but the most necessary can be narrowed down to two:  size and reporting.


SIZE


Data gets big very quickly.  In Excel 2007, a user can enter 16,384 columns and 1,048,576 rows.  This seems like allot of data!  Your computer very well might crash if you tried to open an Excel spreadsheet containing a million rows, not to mention the amount of hard-drive space this would take!

A database is a dedicated piece of software created specifically for large amounts of data.  Usually, a database will run on a computer that contains only databases.

REPORTING

Businesses operate on data.  A business might use an Excel spreadsheet to keep track of the cost of office supplies.  Another business might have to keep track a chain of store inventories s across the country and what inventory is in each store.  They also might want to know what product sells more in the Midwest as opposed to the West Coast.  Data is useless without a means to make sense of it all.   Databases provide such means in a cost-effective and efficient manner.    With proper setup, a query into a database can search through millions of rows to find the one piece of information that you want and do it in a matter of milliseconds.

Databases can be as simple or as complex as you can imagine.  From cell phones to ATMs — all are dependent on data systems.  The world runs on data. Now you know what I am referring to when I talk about databases.

SQL Server : Use XQuery On XML Data Types

XQuery is to XML what SQL is to database tables.   XQuery is designed to query XML data – not just XML files, but anything that can appear as XML, including databases.  In SQL Server, there are five methods that you can use to speed up your xquery to a column of type “xml:”  query(), value(), exist(), modify(), and nodes().  I’ll cover the first, and most commonly used, three.
For the purpose of this example, I will use the following sample xml in field “XmlField :”
<msg>
  <head>  
    <Id>X9356G356Y</Id>
  </head>
  <body>
    <ReturnedResponse>
      <Response>
        <ReturnCode>0</ReturnCode>      
        <Message>SUCCESS</Message>
      </Response>    
      <Customer>
   <CustomerId>CID459283</CustomerId>
        <State>NV</State>
 <Name>
  <FirstLast>Name1_Name2 </FirstLast>
        </Name>
      </Customer>
    </ReturnedResponse>
  </body>
</msg>
–XPATH Examples
–.value
SELECT  A.TransDate, A.MainId,
XmlField.value(‘/msg[1]/body[1]/ReturnedResponse[1]/Customer[1]/CustomerId[1]’, ‘VARCHAR(25)’) as xCustomerId
FROM TABLE2 B WITH (NOLOCK)
INNER JOIN TABLE1 A WITH (NOLOCK) ON B.APk = A.MainId
WHERE
A.MainId = ‘HI05491882’
–.exist
SELECT  A.TransDate, A.MainId,
XmlField.value(‘/msg[1]/body[1]/ReturnedResponse[1]/Customer[1]/CustomerId[1]’, ‘VARCHAR(25)’) as xCustomerId
FROM TABLE2 B WITH (NOLOCK)
INNER JOIN TABLE1 A WITH (NOLOCK) ON B.APk = A.MainId
WHERE
A.MainId = ‘HI05491882’
AND XmlField.exist(‘/msg/body/ReturnedResponse/Customer/CustomerId’) = 1
–.query
SELECT XmlField.query(‘/msg/body/ReturnedResponse/Customer/CustomerId’)
FROM TABLE2 B WITH (NOLOCK)
INNER JOIN TABLE1 A WITH (NOLOCK) ON B.APk = A.MainId
WHERE
A.MainId = ‘HI05491882’

SQL Server: Get RowCount Of All Tables Without the Overhead

Avoid overhead with this completely awesome SQL Server query that returns the row-count of all tables without actually having to query each table.  Instead, it just pulls from the statistics:

SELECT ‘[‘ + SCHEMA_NAME(t.schema_id)+ ‘].[‘ + t.name + ‘]’ AS full_name,
SCHEMA_NAME(t.schema_id) AS schema_name,
t.name AS table_name,
i.rows
FROM sys.tables AS t INNER JOIN
sys.sysindexes AS i ON t.object_id =i.id AND i.indid < 2