This is another famous argument in the world of databases specially when it comes to data transfers and transforms.
As a developer with a history of Java development, naturally I find writing SPs is more flexible and most of all, more fun.
Lets look at the facts;
I have experience in loading data from different source systems (MS SQL, Excel) to a data warehouse which is residing in MS SQL using ETL.
I also have experience in loading data from a source system (MS SQL) to a staging database in a different MS SQL instance using MS SQL and then transforming and transferring the data from the staging DB to facts and dimensions.
Definitions:
SPs – is a batch of SQL statements that can be executed one after the other in a transaction. SQL statement have create, drop, alter, update, select and delete statements which allows to achieve almost any database related operation.
ETL – allows to extract, transform and load
1. Extract – retrieve the data from different source systems
2. Transform – convert the data to a form that we need
3. Write the data to a destination database (warehouse/staging etc)
Advantages of using SPs:
1. More flexible when database queries are involved- Specially for a person with a history of programming, writing SPs will feel like home. General DML and DDL in SQL gives the opportunity to do a lot of DB operations.
2. Performance – SQL runs in the SQL engine whereas in ETL, the data needs to be brought inside the SSIS memory space before execution.
3. Any changes required is really easy compared to ETLs. You need to only find the place in SP in the database and update, where as in ETL, you have to go to the ETL (maybe in visual studio), do the change, build and deploy in the SQL server.
Advantages of using ETL:
1. When complex operations are involved, SPs might not be the best. Mostly this results in the developers calling code snippets written in Java or c#, which can be quite messy. The exact way to do is explained in the below URL in my blog:
Calling a C# code from a SQL function
However, ETL provides easy ways to manipulate the data or do any complex operations to the existing data.
2. Easy for anyone who is not familiar with the world of programming or with the system being developed to continue with the development.
3. Auditing and logging is so much easier with SSIS than using SPs.
4. ETL lets you do much more than a standard T-SQL much more easily, such as read from all sorts of different sources, error handling, scripting, deploying, debugging, logging etc etc.
The question is when to use ETL and when to use SPs.
ETL:
- When there are multiple different types of source systems
- When there are highly complex data transformations are involved
- Very graphical and easy to understand.
- Dependencies are visible
- Helps logging and auditing
- Performance can be tuned when multiple sources and destinations are involved.
- When performance becomes critical for straightforward loading.
- When source and destination systems are the same and straightforward
- When it is a small and the same team is doing the project till the end
Leave a comment