Saturday, May 21, 2005

Hand-coded v/s Tools-based ETL:

There have been rounds of debates going around on which paradigm to opt, for developing ETL for typical data warehouse project.

I would share my experience working with Handcoding the ETL and using Oracle Warehouse Builder (Tool Based ETL.)

The ETL of old days has not remain same now. Challenges ETL has faced in the recent times are:

  1. Complex business systems and Analytics requirement
  2. Volume of data
  3. Small refresh window
  4. Orchestrating and automation of Loads
  5. Monitoring capabilities and exception handling
  6. Complex data quality and integrity requirements
  7. Complex transformation
  8. Match-merge and cleansing
  9. Flexibility to the change ETL logic on the fly as the business changes
  10. Varied nature of source systems

And list goes on. So our very old hand-coded ETL does not cope up with this all.

Let me delineate my understanding of ETL. ETL is not piece of code or application, which enables you to do Extraction, Transformation and Loading. E, T and L are three primary pillars and core of typical ETL. But there are few more aspects, which fall under the purview of ETL. Today, ETL stands for Extraction, transformation, Loading, infrastructure elements (for extraction, transformation, loading ).

insert into sales values(customerid, …,…,revenue)

( select orderheader.customerid,… orderline.qty * orderline.price from orderheader@src , orderline@src where orderheader.order_id = orderline.order_id)

Above could be simple ETL doing Extraction (select..) , transformation (qty*price) and loading(insert into..) but it still lacks something to suffice above 10 challenges.

  1. Infrastructure for scheduling and automating the loads
  2. Mechanism to handle errors and exceptions and to throw out necessary alerts.
  3. Resilience and error recovery. .
  4. Abilities to switch from Set-based (insert into …select *) to row based (picking up one row at a time and loading it) loading type
  5. Workflow to orchestrate the various sub-loads (individual units)
  6. Metadata management
  7. Versioning and management of design metadata.
  8. Capturing Runtime audit logs
  9. Capabilities to leverage the specific features of target data warehouse platform.
  10. Flexibility to change the configuration at any level ranging from whole ETL project to individual object.

It’s unrealistic for a hand-coded ETL project to cover all this.

ETL tools suites these days are typically bundled with all this functionalities. Oracle Warehouse Builder is one such tool. Apart from enabling one to model the ETL flows (and not code), it also gives you all the above functionalities. Tools like OWB gives above functionalities in one integrated suite. With the new release of OWB (Paris) things would be more streamlined and all the above functionalities will be accessible through one single interface. I would be sharing my experiences on OWB Paris in subsequent blogs.

As I keep hearing from the research analysts that applications would be modeled and not coded as time goes by. This is driven by the inherent complexities in business applications.

So coming back to our prime question of which approach is better between Hand-coding and tool-based. As everything depends on context, the answer to this also depends on the context. If the ETL requirement is simple and straight without any great need for infrastructure elements like monitoring, metadata management etc. hand coding the ETL would be the wise choice. But if the ETL is complex and there is a requirement of monitoring, metadata management etc. tool-based approach would better go ahead.


Adeptia Integration said...

Extract, transform, load and manage data from single environment through best etl tools / software which provides rapid improvement in data warehousing process.

Anonymous said...

Youth is not wow gold a time of life;world of warcraft gold it is a state of mind; cheap wow gold it is not a Maple Story Accounts matter of rosy cheeks, red lips and supple knees;mesos it is a matter of the will, a quality of the imagination,wow gold kaufen a vigor of the emotions; it is the freshness wow geld of the deep springs of life.maple story mesos Youth means a tempera-mental predominance of courage over timidity, of the appetite for adventure over the love of ease. This often exists in a man of 60 more than a boy of gold farmen Nobody grows old merely by a number of years.maple story money We grow old by deserting our mesos Years may wrinkle the skin, but to give up enthusiasm wrinkles the soul. Worry, fear, self-distrust bows the heart and turns the spring back to dust. Whether 60 or 16, there is in every human being’wow powerleveling s heart the lure of wonder, the unfailing childlike appetite of what’s maple story money next and the joy of the game of living.powerlevel In the center of your heart and my heart there is a wireless station: so long as it receives messages maplestory powerleveling of beauty, hope, cheer,world of warcraft power leveling courage and power from men and from the Infinite, so long are you young. When the aerials are down, and your spirit is covered with snows of cynicism and the ice of pessimism, then you are grown old, even at 20, but as long as your aerials are up, to catch waves of optimism, there is hope you may die young at 80!