Search This Blog

Loading...

Tuesday, June 17, 2014

Creating a Power Pivot Data Model in Microsoft Excel 2013

I was featured on SQL Server Pro for writing an article on creating a Power Pivot Data Model in Microsoft Excel 2013. In the article, I am providing step-by-step instructions for generating complex reports in Excel using a Power Pivot Table, an add-in that allows users to gather, store, model and analyze large amounts of data.

Read the full article here.

Thursday, May 15, 2014

SQL Server 2014 – Index Enhancements

Inline Specification of Clustered and NON-Clustered Indexes

SQL Server 2014 is launched and there are many new things added in the product. DBAs were waiting for a very long time for some of those. Creating Inline Clustered and NON-Clustered Indexes is one of them.
If you are familiar with T-SQL coding, you should know that you cannot add NON clustered index while creating a table in CREATE TABLE statement. The NON-Clustered indexes were to be added after the creation of the table. Now you can add Non-Clustered indexes in the CREATE TABLE statement.
Some of you say “What’s the PLUS point in this?” Well, I have myself witnessed that when you are creating the database tables, you are more involved in the design. Generally, you have all the information regarding the data that the table is going to hold and the major user queries that are going to be executed on that table. Point is, that is the best time to decide to add indexes. You generally do this. But at times, you forget to exclusively add and run the “CREATE INDEX” statement. This results in an un-wanted delays and, sometimes, results in the crash, especially when you are reading data for reports from the tables on which you forgot to add an index. With the ability of creating Inline Indexes, the chances of such mistakes is even lower now.
If you execute the following in SQL Server 2012, it will generate an error;
Use AdventureWorks2012
Go

CREATE TABLE tblTest
(
       ID int IDENTITY (1,1) PRIMARY KEY,
       NAME nvarchar(100),
       FATHERNAME nvarchar(100) index idx_FATHERNAME
)

The error will be;
Incorrect syntax near 'index'. If this is intended as a part of a table hint, A WITH keyword and parenthesis are now required. See SQL Server Books Online for proper syntax.

But if you execute the same code in SQL Server 2014 (any edition), the command will be executed successfully.
You can also create composite indexes in the same manner. Included columns and filtered conditions are not yet supported with Inline specification of indexes.
Use AdventureWorks2012
Go

CREATE TABLE tblTest
(
       ID int IDENTITY (1,1) PRIMARY KEY,
       NAME nvarchar(100),
       FATHERNAME nvarchar(100) index idx_FATHERNAME (FATHERNAME,NAME)
)

Table variable can have an index

Yes, now we can have an index on a table variable defined in the T-SQL code. We all know the importance of the table variable. Here is the example code;

Declare @vTabletest TABLE 
(
       ID int IDENTITY (1,1) PRIMARY KEY,
       NAME nvarchar(100),
       FATHERNAME nvarchar(100) index idx_FATHERNAME
)

Insert into @vTabletest values ('A','B')

Select * from @vTabletest

Again, if you try to execute the above code in SQL Server 2012, it will generate errors.
You can also create composite indexes in the same manner. Included columns and filtered conditions are not yet supported with Inline specification of indexes.

Clustered Columnstore Indexes

SQL Server 2014 enables you to create a clustered columnstore index. In the prior versions, only non-clustered columnstore indexes were offered.
In order to create a clustered columnstore index on a table, first you will have to create a table without any index on it i.e. a heap;
CREATE TABLE tblTest 
(
       ID int ,
       NAME nvarchar(100),
       FATHERNAME nvarchar(100)
)
GO

Now, execute the command to create a column stored Index;
CREATE CLUSTERED COLUMNSTORE INDEX idx_CS_Test on tbltest

The above command converts the table into the clustered columns store index named idx_CS_test. As you can see in the above statement, you cannot specify the key list in the index specification as we used to do in the non-clustered columnstore indexes. This is because the whole table is converted into the columnstore index.

Tuesday, March 18, 2014

Optimizing SQL Azure Database

1. Dynamic Management Views
SQL Azure provides a few handy system views called dynamic management views (DMVs) that are also available in SQL Server. SQL Azure exposes a subset of the DMVs, but all those related to query execution are available.

sys.dmexecconnections 
sys.dmexecqueryplan 
sys.dmexecquerystats 
sys.dmexecrequests 
sys.dmexecsessions 
sys.dmexecsqltext 
sys.dmexectextquery_plan

2. Connection pooling
Connection pooling is an important consideration for performance. Although this statement is generally accurate in the world of database programming, it becomes critical for SQL Azure. A poorly designed application may create too many connection requests, which can end up flooding SQL Azure. If too many connection requests are established, your connections will be throttled, meaning that you can no longer connect for a period of time.

3. Indexes
Creating the right indexes can be complex; it can take a long time to fully understand indexing and fine-tune database queries. One of the most important things to remember with indexing is that its primary purpose is to help SQL Azure find the data it needs quickly.

Indexes are like smaller tables that contain a subset of the primary table. The tradeoff is that indexes consume space and must be maintained by the SQL Azure engine as the primary data changes, which can impact performance under certain scenarios.

Remember, you must have a clustered index on each table in SQL Azure Database. Windows Azure SQL Database does not support tables without clustered indexes. A table must have a clustered index. If a table is created without a clustered constraint, a clustered index must be created before an insert operation is allowed on the table.

4. Indexed views 
Indexed views are an excellent alternative when you absolutely need to JOIN data, and traditional indexing doesn't yield the performance you're looking for. Indexed views behave like tables; the data covered is materialized to disk so it can be retrieved quickly. Before jumping on indexed views, understand that they have certain limitations and that due to their nature, you may incur a performance hit through the usual Insert, Delete, and Update statements.

5. Stored procedures
You've seen various ways to tune your statements and improve execution plans. However, keep in mind that you also have stored procedures at your disposal. Stored procedures can give you an edge if you need to execute logic that requires a large volume of data. Because you know that returning lots of data turns into a performance problem in SQL Azure, you can place the business logic that needs the data in a stored procedure, and have the procedure return a status code. Because you aren't charged for CPU time, this becomes an affordable option.

Stored procedures can also be an interesting security tool, allowing you proxy the calls to underlying tables through a procedure and never allowing direct access to the tables.

There are a few restrictions in SQL Azure Stored Procedures. You cannot use the following in CREATE PROCEDURE statement;

  • FOR REPLICATION option
  • EXTERNAL NAME ( external methods)
  • ENCRYPTION option

Also, following XML commands doe not work in SQL Azure;
  • sp_xml_preparedocument
  • sp_xml_removedocument
  • OpenXML
We have the workaround to tackle with these commands but that is not in the scope of this article.

NOTE: Reference taken from http://programming4.us/

Introducing Tabular Model - Self-Service Business Intelligence

Tabular Model is a new feature in SQL Server 2012. Tabular models are in-memory databases in Analysis Services. They use state-of-the-art compression algorithm and multi-threaded query processors named as VertpPaq (xVelocity in-memory analytics engine). This enable fast access to tabular objects and data by reporting applications such as MS Excel.

Tabular Models are created in SQL Server Data Tools (SSDT). It is the upgraded version of BI developer. SSDT is also used to develop Reports, Integration Services and Multidimensional projects. Tabular Model project is deployed as Analysis Services database just like Multidimensional projects. In Tabular Model, we can add partitions for optimization purposes. Row-level security can also be implemented in Tabular Model. We can also import external data to make it a part of the model. We can add relationships, measures and calculated columns on the tabular data in the model. Hierarchies are also supported in Tabular model as in multidimensional model. This makes it more do-able and easy to go with project type in SSAS. As Tabular Model is deployed as Analysis Services database, client applications can connect to the deployed model for reporting purposes.

Analysis services provides three approaches to develop business intelligence semantic models;

1. Tabular
2. Multidimensional
3. PowerPivot

Tabular Model is based on relational structure such as tables and relationships for modeling of data. Multidimensional and data mining are based on OLAP (cubes and dimensions). PowerPivot is a self-service BI tool.

Tabular Model support data access through two modes;

  • Cached Mode In cached mode, you can integrate data from multiple sources including relational databases, data feeds, and flat text files. 

  • Direct-Query Mode In this mode, you can by-pass the In-Memory model and allow cklient applications to query directly from the relational database. In short, Tabular model is designed to query huge data with much lesser response time in different client application, especially in MS Excel. 

Comparing With Multidimensional Model
With the release of Tabular Model, there are few decisions by Managers that has to be made. Some common questions are;

Question 1: When to use Tabular Model and Multidimensional Model?
Answer: Both Tabular and Multidimensional Models are the part of Analysis Services. Bot are developed in SSDT. The answer lies within the data to be used in the model. If you have huge data, like in TBs, to process in the model, then Multidimensional Model is the best suite for you. Tabular Model are more compressed than Multidimensional but it also requires more memory to load and process the data in memory. PowerPivot Models are for even lesser data size.

Question 2: What do we mean by 'External Data' or 'External Source'?
Answer: We can import data in Tabular Model and PowerPivot from External sources. These can be text files, excel files, data feeds and even some document formats. In multidimensional, you can only load data from relational sources using OLEDB and managed providers.

Question 3: What is the core benefit of Tabular Model?
Answer:  The core benefit is that they are easy to implement as PowerPivot. You can develop more efficient and effective solution. The data processing is very fast and it is easy to understand, It uses DAX as programming language. DAX is quite similar to Excel formula language.  

Question 4: Should we convert the current Multidimensional Models to Tabular?
Answer: NO. The project types are not interchangeable. And it will not be an effective decission to convert the multidimensional model into Tabular model. It is best to use the model for the new requirements.

Analysis Services 2012 Two in One Product
Most important aspect of Analysis Services 2012 is that it is two products in one. In SQL Server 2008 R2 and before, we only had Multidimensional Model option in Analysis services. But in SQL Server 2012, you can choose between Multidimensional and Tabular Model while installing SSAS instance. You need to have separate instance for both if you want to use both, multidimensional and tabular model. In the figure below, you can see that on Feature selection page, there is only one selection for Analysis Services;


But when you pass on to the Analysis configuration page, there you will see a radio button for Server Mode selection. The options are; a. Multidimensional and Data Mining Mode b. Tabular Mode




Licensing
Analysis services 2012 is available in the following editions of SQL Server;
  • SQL Server Standard Edition 
  • SQL Server BI Edition 
  • SQL Server Enterprise Edition 
In SQL Server Standard edition, you can only have Multidimensional Mode. SQL Server BI edition and SQL Server Enterprise editions have both Multidimensional and Tabular Mode. In terms of Analysis Services functionality, these two editions are the same; the only difference between them is that SQL Server Business Intelligence edition licensing is based on buying a server license plus Client Access Licenses (CALs), whereas SQL Server Enterprise edition is licensed on a per-CPU core basis. (You can no longer license SQL Server Enterprise edition on a server-plus-CALs basis as was possible in the past.) In SQL Server Business Intelligence and SQL Server Enterprise editions, both Tabular and Multidimensional models contain all available features and can use as many cores as the operating system makes available. The upshot of this is that it could be more expensive in some situations to use Tabular than Multidimensional because Multidimensional is available in SQL Server Standard edition and Tabular is not. If you have a limited budget, already have existing Multidimensional skills, or are willing to learn them, and your data volumes mean that you do not need to use Multidimensional features such as partitioning, it might make sense to use Multidimensional and SQL Server Standard edition to save money. If you are willing to pay slightly more for SQL Server Business Intelligence edition or SQL Server Enterprise edition, however, then licensing costs should not be a consideration in your choice of model. 

A hint to Self-Service BI and Corporate BI
Over the last few years, trends and demand of BI have significantly changed. Now, users are more interested in having the hold on their data and analyze it according to their own run-time requirement on the spot. Prior to that, they had to involve an IT professional to develop complex reports for them. This is Corporate BI. This scenario still exists but for analysis work, it becomes more cost and time consuming for engaging IT professional for the analysis work. This resulted in a new Self-Service BI. There are tools available in the market like QlikView and Tableu that enables users to query their own data in a user friendly interface. Analysis Services is a corporate-BI tool by MS as you need an IT professional to design and build an Analysis Services database. There are disadvantages of self-service BI like poor-quality data, lack of integration between multiple source systems, etc. Still, self-service BI is high on demand and the reason is that most of the data users think that in most cases, the time consuming step of creating a data warehouse is un-necessary in their initial scenario. MS, as a software company could not ignore this market demand, so in 2010, it released its own serf service BI tool, PowerPivot. Tabular Model resembles a lot to PowerPivot and a refined version of PowerPivot as you get the scalability and manageability in the Tabular Model. PowerPivot is more Agile and Self-Service BI tool.

Tuesday, December 10, 2013

Comparing File Names in a directory with the list in Excel File

Here is the Script...

---- Get File Names rom you directory
Declare @vOSFiles Table (MyFileName nvarchar(1000),MyDepth int, IsFile bit)
Insert into @vOSFiles
       EXEC xp_dirtree 'G:\Atif\', 1, 1

--Select * from @vOSFiles where IsFile = 1

---- Get List of Files from Excel

----  Insert Excel File list to another Temp Table.
---- The structure o fthis table depends upon you excel file and requirement
Declare @vExcelFileList Table (MyFileName nvarchar(1000))

---- This is for xlsx files
 Insert into @vExcelFileList
       SELECT [MyFileName] FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
       'Excel 12.0 Xml;Database=G:\Budjet.xlsx;HDR=Yes;IMEX=1',
       'SELECT * FROM [Sheet2$]')


--Select *
--from @vOSFiles

---- Join the two tables on the File Names
Select a.*, case when b.MyFileName is not Null then 'Exists' Else 'Not Exists' end as FileExistance
from @vOSFiles a
Left Outer Join @vExcelFileList b on b.MyFileName = a.MyFileName
Where a.IsFile = 1



You might want to enable ad hoc query options;

---- Enable AdHoc Queries
--USE MSDB
--GO
--sp_configure 'show advanced options', 1;
--GO
--RECONFIGURE;
--GO

--sp_configure 'Ad Hoc Distributed Queries', 1;
--GO
--RECONFIGURE;
--GO

---- Add OLEDB referance for xlsx
--EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0'
--    , N'AllowInProcess', 1
--GO

--EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0'
--    , N'DynamicParameters', 1
--GO


For XLS files (earlier 2007 versions of Excel), you can run the OpenRowset query as;

SELECT * FROM  OPENROWSET('Microsoft.Jet.OLEDB.4.0'
,'Excel 4.0;Database=G:\Budjet.xls','SELECT * FROM [sheet2$]')


5% OFF All Lenovo ThinkPad's. Coupon Code:CJDEC5UK Lenovo UK Homepage www.1and1.co.uk
5% OFF All Lenovo ThinkPad's.