Tag Archives: Sorting SQL

SQL INDEXING – Simplified

Let us talk about indexing in SQL Server:

Depending on size of the data, a table can have more than a million rows. If you want to search for a specific record there are two ways:

Go row by row and see if that is the row you want.
Example: Let us say we want to search an employee table that has 1 million rows for an employee with last name “Smith”. Remember you can have more than one record with name “Smith”. One way to do this is to go record by record and check if last name is “Smith”. The problem with this approach is that this will be very slow as you have to go row by row. May be there is only one record with last name “Smith” and that could be the last record. In this case you will have to read one million times, only to find one record. This in SQL terms is called “TABLE SCAN”.
The second approach is a better approach. What if I tell you that all names are sorted in “Alphabetical Order”. So, if you directly go to half a million record and if that name is say “Karsen” then you know “Smith” will be after “Karsen” (‘S’ comes after ‘K’). Now you can go directly to six hundred thousand record and if that name is “Tedd” then you know “Smith” is between half a million record and six hundred thousand record. So now in two attempts you have narrowed it down from one million to hundred thousand. Isn’t that great? This is what is called as indexing.

Let us now get deep into details. Indexes are created on columns in tables. The index provides a fast way to look up data based on the values within those columns. An index is made up of a set of pages (index nodes) that are organized in a B-tree structure. Figure below shows how a typical B-tree looks like.

Data is stored in the form of pages. Now let us try to find records with last name “Dull”.
Go to Root Page (level 1). With indexing now we know that any name starts with ‘Bennet’ and is before ‘Karsen’ we should go to page 1007. What this means is if we want to find out employees with name “Charlie” we should go to page 1007 and if we want to find out employees with name “Kyle’ we should go to page 1009 (“Ky” comes between “Ka” and “Sm”).

So, looking at page 1007 (level 0) we see that if name is between ‘Bennet’ and ‘Greane’ (‘D’ comes between ‘B’ and ‘G’) we need to go to page 1132. So, let us go there and find out.

Looking at page 1132 we can see record with name “Dull”. Hurray…. our search is over. We found our record. This is how indexing will work typically.

There are two types of indexes, Clustered index and Non Clustered Index. Clustered index is on primary key and data will be physically sorted. So, leaf level will have the actual data record. In a non-clustered index level will have a pointer to actual data page. This means you have to go one extra step to actually get your data. Remember indexes can make your search faster but insert, updates and delete will be slower.

Thanks for visiting my blog. Please feel free to leave suggestions/feedbacks/comments.
If you like my post please click “LIKE” button below.

Read about SQL Technical Interview Questions here: Crack the SQL Interview

Written by Amit Uppal

Email us