SQL Server Coding Standards - Part 1
Introduction
Everyone needs standards. They promote cleaner code, better programming, and make it easier for teams of people to work together. Standards are one of the foundations upon which computer science is built. Coding standards (as I define them) are the standards which specify how the object source code is written. This includes the format, documentation, structure of the object code, etc. Having good coding standards ensures that all object source code looks similar and allows different members of a team to more quickly understand the object.
In my career I have worked in a number of different programming environments and have seen different types of standards for developing code. These have ranged from free (read non-existent here) to extremely rigid (variable names listed in a central document and specific formats for writing code). There have not been too many that I have really liked, and many of them were cumbersome. One thing that I have usually found, however, is that almost never are there DBA standards. I've been guilty of this myself, when I managed two other DBAs, where I had not formally spelled out how object code was written.
In my current job, we recently had doubled the size of the development staff and completed a large project. During the lull that followed, a couple developers were assigned the task of developing coding standards to ensure that all development efforts would look alike. The resulting work (developed for Cold Fusion source code) was a document and a style that I really liked. After viewing this document, I started to develop my own document based on this guide.
What Types of Standards Are Included?
In developing my own coding standards, I decided to include the following items as they pertain to SQL source code. This would not apply to embedded SQL code which often exists in our Cold Fusion site (as well as many web sites. Here is the list of items that are covered:
Object Headers
Variable Names
Source Code Spacing
Source Code Formatting
Query Formatting
Object Footers
The Standards
What about SQL Server development standards? I think they should exist as well. A few jobs ago, I ran into issues with two other developers in examining each other's server code. We found that we were not only using different standards for layout, but also for accessing objects in transactions. Luckily we caught this before any deadlocks occurred in our system, but it presented the need for implementing some type of standards.
Over the last few years, I have continually developed and evolved my set of coding standards. Today, they exist as a document with my current company, but I still examine them at times to see if they are truly worthy of being "standards". After all, I implemented them (after some thought) because I needed them, not because I had the best solution for each category.
There are really two types of standards: coding standards and naming standards. This article will deal with the naming standards and part 2 will discuss coding standards. Here is a current list of my standards:
Databases
Each database on a server should be named using a name that is logical and applicable to the use of the database. Since third party databases often require specific names, this specification cannot give more concrete examples of naming standards. If you are building software which may be deployed on another server, you may wish to prefix the database name with some acronym signifying your company, as in example 3.
Examples:
Sales
Dynamics
IBM_SalesAnalysis
Backup Devices (Full Backup)
Any file that contains a complete backup should be named in the following format:
<database name>_<4 digit year><month><day><hour><minute><second>
where all times are the time the backup was started. The extension for all full backup files should be ".bak". All items should include leading zeros for values whose size is less than the size of the maximum value, i.e. always include a 2 digit month.
Examples:
Sales_20011015080000.bak
Dynamics_20010908000000.bak
Backup Devices (Differential Backup)
Any file that contains a differential backup should be named in the following format:
<database name>_<4 digit year><month><day><hour><minute><second>
where all times are the time the backup was started. The extension for all full backup files should be ".dif". All items should include leading zeros for values whose size is less than the size of the maximum value, i.e. always include a 2 digit month.
Examples:
Sales_20011015083000.dif
Dynamics_20010908120000.dif
Backup Devices (Transaction Log Backup)
Any file that contains a transaction log backup should be named in the following format:
<database name>_<4 digit year><month><day><hour><minute><second>
where all times are the time the backup was started. The extension for all full backup files should be ".trn". All items should include leading zeros for values whose size is less than the size of the maximum value, i.e. always include a 2 digit month.
Examples:
Sales_20011015081500.trn
Dynamics_20010908080000.trn
Logins
All login names should follow the company standards for network login names. Currently the standard is:
<first initial>_<last name><middle initial (if needed)>
Examples:
sjones
bknight
Users
All database user names should match the login name to which it is mapped. NO User accounts should be shared among multiple logins. Use roles instead.
Examples:
sjones
bknight
Roles
All database roles should be named for the function of the role. This may be the name of the department or the job function.
Examples:
Marketing
PurchasingAgents
Tables
All tables should be named for the function of the table. For multiple word tables, the name should be in proper case for each word. No spaces should be used in table names.
Examples:
Orders
OrderLineItems
Columns
Columns used in either tables or views should follow the same naming convention as for tables. Proper case all words with no spaces inside the name.
Examples:
OrderID
ProductCode
QuantityPurchased
Views
All view names should begin with a lower case "v" and then follow the same naming conventions as for a table. Proper case all words in the name with no internal spaces. If this is a view of a single table and contains all fields, then use "v" plus the table name.
Examples:
vOrderDetails
vProduct
Indexes
All indexes should be named in the following format:
<Table name>_<index type><index number (optional)> where the table name matches the table or view to which the index is being applied. The index types are:
Primary Key - PK
Clustered Index - IDX
Nonclustered Index - NDX
Only when there is more than one nonclustered index should the index numbering be used.
Examples:
Orders_PK
Products_IDX
ProductDetails_NDX
ProductDetails_NDX2
Triggers
All triggers should contain the name of the table, an underscore followed by "tr" and the letters which represent the intention of the trigger (i for insert, u for update, d for delete). If there are more than one trigger, a numerical designation, starting with 2 should be appended to the name.
Examples:
Customers_tri
Orders_triu
Products_trd
Products_trd2
User Defined Functions (UDFs)
A user defined function should prefixed by "udf_" and then a description that logically follows the function process. The description should be proper case words with no spaces.
Examples:
udf_GetNextID
udf_SumOrderLines
Defaults
All defaults should be prefixed with "df_" and then some description of the default value. The description should be proper case with no spaces or underscores.
Examples:
df_One
df_GetDate
Conclusion
I hope that all of you out there have some type of coding standard to which you adhere. I think most of us naturally develop one over time, but if you need to work with other individuals, it makes sense to formalize your standards. Feel free to adopt my coding standards if you need them and I would be interested in hearing about any enhancements you make or disagreements you have.
As always I welcome feedback in the forum below and please vote your opinion on this article.
Coding Standards - Part 2 - Formatting SQL
Introduction
One of the areas that I feel is very important for managing and maintaining a stable environment is the use of standards. In any SQL Server environment, we deal with any number of objects or areas, but we often work with T-SQL code in our daily work. And as anyone know who has worked with others, having standards greatly improves and simplifies the ability of one person to take over for another and maintain some continuity between the individuals.
This article examines another area of standards that I have found to be helpful in a multi-person team: the formatting of code. I know that formatting is a question of style and I will probably get some arguments here, but having everyone maintain a similar format greatly speeds the ease with which one can examine and understand the code.
The Problem
How often do you get a sliver of code that is formatted as one long line of code? If you work with developers, the answer is probably quite often. Usually a developer formats T-SQL code in a manner that corresponds to the way in which they would format C++, ASP, VB, or any other code. Which may not be how you format the code.
How often do you then reformat it? I know that I have often been sent something like this:
select * from customers, orders, orderline where customers.orderid = orders.orderid
and orders.orderid = orderline.orderid and orderline.amount > 100
Now this is a simple query, but when I am attempting to troubleshoot something, I like to have it in a format that makes life easier for me. Usually the first step for me would be to format this as follows:
select
*
from customers, orders, orderline
where customers.orderid = orders.orderid
and orders.orderid = orderline.orderid
and orderline.amount > 100
I would also then take further steps that would further clarify the joins that are occurring.
Why?
Because I understand the workings of a SQL statement by looking for a certain structure and verifying the joins are correct, no cross joins, proper columns, etc. And I often need to add and remove sections for testing, it helps me to quickly follow the logic when everything is arranged in a particular structure.
Setting a coding standard is something everyone should do and, IMHO, it is as important as setting naming standards. As with any standard, it simply makes it easier to work together and more efficiently when we understand each other's code.
I am not advocating you adopt my standards, though that would be nice. At least for me if I ever come to work for your company. However, I do urge that you adopt some standard and feel free to use these are a starting point.
My Solution
Here are a list of items that I have included in standards. These primarily apply to SELECT queries, but they can easily be adapted to INSERT, UPDATE or DELETE queries.
Tabs - Sounds like a minor item, but the formatting of anything starts with tabs. They can really help of hurt someone when editing or even viewing as some editors will reformat your display based on tabs. I set a standard of 3 spaces. Enough to see a difference and still fit fairly long statements on my screen completely.
Aliases - I pretty much always use an alias. I used to do the 'a', 'b', etc. thing, essentially numbering each table sequentially as it appeared in the query. I changed that to adopt a "standard" alias for each table in a database. These are usually 1 or 2 character abbreviations, sometimes 3, that I use over and over. While it seems cumbersome, I had a database with 300+ tables and remembering the aliases was pretty easy for both me and my development team. Some examples:
Orders - o
OrderLineItems - oi
Customers - c
CustomerSuport - cs
I'm sure this will be easy for you and it also makes the reading of a query much cleaner. I hate wading through joins that look like:
and customersupport.phonenumber = customeraddress.phonenumber
It's much easier to see:
and cs.phonenumber = ca.phonenumber
Spacing - I try to encourage spacing and the use of new lines for formatting queries. I break each section into a new line (see the next item) as well as indent it. I use tabs and spaces judiciously to make each area easier to read. my general rule is to indent using tabs, except for the major section of a query. Each of those is one space down from the query type (insert, select, etc.). An example looks like:
select
*
from Customers c
where c.customerid = 5
and c.customeraddress is not null
I also include spaces around each parenthesis and comma and operator. These make the CTRL- controls handy for moving inside the query. Nothing worse then being unable to get to the > in a comparison because the column names are slammed up against the operator.
Column names - I've gone back and forth on this, but overall I've found that having each column on a separate line to be extremely convenient for commenting them in and out of queries. I also precede columns with a comma, rather than having it follow the column name for the same reason. I often find this the best technique for debugging. Formatting in this way also allows me to easily cut and paste from the column list to the GROUP BY or ORDER BY sections.
I also indent the join syntax by one tab and the ON clause by one further tab. These help with the readability of the query. Example:
select
customerid
, customername
, address
, phone
, status
from Customers c
inner join orders o
on c.customerid = o.customerid
left outer join orderlines oi
on o.orderid = oi.orderid
and oi.status = 1
where c.customerid = 5
and c.customeraddress is not null
order by
customerid
, customername
, status
Note that I cut and pasted the column list to the order by and then removed a couple columns be selecting the entire line. A small thing, but it makes my typing seem so much faster and efficient.
Sections - Each query can be broken up into sections based on the major keywords. I move each of these to a new line to make the reading easy as well as the navigation convenient. It makes more sense when looking at it, so here is an example:
select
customerid
, customername
, count(*)
from Customers c
inner join orders o
on c.customerid = o.customerid
left outer join orderlines oi
on o.orderid = oi.orderid
and oi.status = 1
where customerid = 5
and c.customeraddress is not null
and (c.active = 1
OR
c.status > 5
)
group by
customerid
, customername
having
count(*) > 1
order by
customerid
, customername
A few other items worth mentioning. I include special operators (TOP, DISTINCT, etc.) on the first line with SELECT. This keeps me from confusing them as a column of some sort. I also indent subselects in a similar manner, but starting from the point of the parenthesis. I then match up the closing parenthesis with the opening one on its own line as in the last example above.
Conclusions
I know this isn't a complete list, it is merely intended to handle the formatting of SQL. Even there, I'm sure there are things I've left out. As I wrote this I realized how much is involved and how confusing it can be to explain.
Since I am most often working with SQL, this is a high priority area. There are also standards for formatting the various object statements, including comments, etc. I'll be tackling those in another article later on.
I'd be interested to hear what other standards, if any, people have implemented and how they are working. I suspect this is often more of a "my own style" area, but I have learned to adapt to others, and I have found that having a standard truly makes work easier.
As always I welcome feedback on this article using the "Your Opinion" button below.
Steve Jones
ヾkRanch.net June 2002