Category Icon
Testing ETL or: Some More Fun with the Informatica Repository

Posted on April 8, 2008 by Thomas
Filed Under Data Warehousing, Informatica, Tips 'n Tricks |

Database Design

Family and friends sometimes ask me (although usually not more than once) what exactly it is that I do. Sure, you’re a Business Intelligence slash data warehouse slash ETL specialist, but honestly; do you honestly expect us to know what that means? Management information? Huh? I usually answer the question by explaining that, on a very basic level, I create database tables and the fancy programs that fill them from other tables. The very basis of any BI environment in a nutshell.

Those that didn’t stop listening somewhere halfway - and understood what I was trying to tell them - are generally rather disappointed. Filling tables with data? Is that all? What’s so complicated about that?

The rest of the conversation I will spare you, but truth be told: it took me a while to come up with a satisfying answer to this often-asked question. What is it that an ETL developer does all day? Really, what is it about our job that is so damn complicated? I thought about writing an article about the necessary steps for building a proper ETL flow, or perhaps about the challenges that we face… HOWEVER… Saying it like that makes them seem like an awfully boring subjects for an article.

So: now for something completely different. Let’s put a spotlight on one of the ETL development phases - testing - and talk about how Informatica Repository SQL can make that task a little easier.

People who read my articles will know that I’m a fan of keeping my work as challenging as I can. I do this, on the one hand, by spending as much time as I can on things that are complicated (and often not entirely necessary) and, on the other hand, by spending as little time as possible on stuff that I consider standard or repititive. And some of my most shameful behaviour I bring to bear during the testing phase of ETL development.

Testing is, of course, necessary. No one’s perfect. But checking other people’s  work is still terribly boring in my book. So if am unlucky enough to find myself in that position, I generally want to be done with it as quickly as I can. Anything that can be automated, I say, should be automated.

The responsibilities of an ETL tester as I see them are three-fold. From basic to advanced:

  1. Does the mapping adhere to the development standards and naming conventions?

  2. Does mapping do what the technical design says it should do?

  3. Does the mapping work correctly in relation to other processes in your data logistical flow?

When I test my mappings, I start out with the first and from there work my way down the list. As said; simple equals boring, and rules and conventions curbing the creative process are possibly even more annoying, so I generally spare the first point only a cursorary glance. That was, until one woeful day our data warehouse project’s administrator would not approve our ETL and pointed the big Finger of Responsibility in my direction. “From now on, ye shall be doing the checking of the transformation names!”

Bugger. But, no harm done. It was quickly decided to leave the first step to my most trusted, mindless computer minion: the Informatica Server. A set of mappings containing nothing more than a bunch of queries on the Informatica Repository would give me a quick list of which mappings or workflows did not have a description added, which transformations did not adhere to our naming conventions, which sessions still had ‘bulk load’ enabled or did not have the correct commit interval, which tasks did not fail their parent if they failed, etcetera, etcetera. The usual stuff. It would do so without complaining, repetitively, and without error.

Sure it took me a few days to get all the SQL on paper. Sure it was a little complex at first. But I can tell you it was a hell of a lot more fun than all the empty-headed days of clicking that I would have had to do instead. It saved me my sanity during that particular project and, even better: it is sure to continue doing so in many more to come…

Here’s a few tasty examples:

Shows all workflows that do not create a backwards compatible logfile (Informatica 8.5 and up)
SELECT OPB_SUBJECT.SUBJ_NAME AS FOLDER
,   OPB_TASK.TASK_NAME AS WORKFLOW
,   ‘Workflow does not create a backwards compatible logfile‘ AS ERROR
FROM OPB_TASK
,   OPB_TASK_ATTR
,   OPB_SUBJECT
WHERE OPB_TASK.TASK_TYPE = 71
AND OPB_TASK_ATTR.ATTR_ID = 12
AND OPB_TASK_ATTR.ATTR_VALUE <> 1
AND OPB_TASK_ATTR.TASK_ID = OPB_TASK.TASK_ID
AND OPB_SUBJECT.SUBJ_ID = OPB_TASK.SUBJECT_ID
–  AND UPPER(OPB_SUBJECT.SUBJ_NAME) LIKE UPPER(’FOLDERNAME’)
ORDER BY 1,2

Lists all sessions which do not have a parameter value in $Source or $Target
SELECT OPB_SUBJECT.SUBJ_NAME AS FOLDER
,   OPB_TASK.TASK_NAME AS TASK
,   DECODE(OPB_TASK_ATTR.ATTR_ID
                  ,  7, ‘$Source‘, 8, ‘$Target‘) AS PARAMETER
,   OPB_TASK_ATTR.ATTR_VALUE AS CURRENT_VALUE
,   ‘Value not set using parameter‘ AS ERROR
FROM OPB_SUBJECT
,   OPB_TASK
,   OPB_TASK_ATTR
WHERE OPB_SUBJECT.SUBJ_ID = OPB_TASK.SUBJECT_ID
AND OPB_TASK.TASK_ID = OPB_TASK_ATTR.TASK_ID
AND OPB_TASK_ATTR.ATTR_ID IN (7,8)
AND OPB_TASK.TASK_TYPE IN (68)
AND UPPER(OPB_TASK_ATTR.ATTR_VALUE) NOT LIKE ‘%$%
–  AND UPPER(OPB_SUBJECT.SUBJ_NAME) LIKE ‘FOLDERNAME’
ORDER BY 1,2,3,4

Shows all uncommented transformations in mappings
SELECT
distinct   OPB_SUBJECT.SUBJ_NAME AS FOLDER
,   OPB_MAPPING.MAPPING_NAME AS MAPPING
,   OPB_WIDGET_INST.INSTANCE_NAME AS TRANSFORMATION
,   ‘Missing comment in transformation‘ AS ERROR
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 3, 
    ’SOURCE QUALIFIER transformation (user defined) should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 4, 
    ’UPDATE transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 5, 
    ’EXPRESSION transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 6, 
    ’STORED PROCEDURE transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 7, 
    ’SEQUENCE transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 8, 
    ’EXTERNAL PROCEDURE transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 9, 
    ’AGGREGATOR transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 10, 
    ’FILTER transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 11, 
    ’LOOKUP transformation (with user defined SQL) should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 12, 
    ’JOINER transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 14, 
    ’NORMALIZER transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 15, 
    ’ROUTER transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 26, 
    ’RANKER transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 50, 
    ’ADVANCED EXTERNAL PROCEDURE transformation should be commented
,   DECODE(OPB_WIDGET_INST.WIDGET_TYPE, 80, 
    ’SORTER transformation should be commented‘ , 
    ’Transformation should be commented‘ ))))))))))))))) AS DESCRIPTION
FROM    OPB_SUBJECT
,   OPB_MAPPING
,   OPB_WIDGET
,   OPB_WIDGET_INST
,  (SELECT DISTINCT OPB_WIDGET.WIDGET_ID   
    FROM OPB_WIDGET_ATTR, OPB_WIDGET, OPB_SUBJECT   
    WHERE OPB_WIDGET.WIDGET_ID = OPB_WIDGET_ATTR.WIDGET_ID     
    AND OPB_SUBJECT.SUBJ_ID = OPB_WIDGET.SUBJECT_ID     
    AND OPB_WIDGET.IS_VISIBLE = 1     
    AND OPB_WIDGET.WIDGET_TYPE IN (3,11)      
    AND OPB_WIDGET_ATTR.WIDGET_TYPE IN (3,11)     
    AND ATTR_ID = 1 
    AND ATTR_VALUE IS NOT NULL) T1
WHERE OPB_SUBJECT.SUBJ_ID = OPB_MAPPING.SUBJECT_ID  
AND OPB_MAPPING.MAPPING_ID = OPB_WIDGET_INST.MAPPING_ID  
AND OPB_WIDGET_INST.WIDGET_ID = OPB_WIDGET.WIDGET_ID  
AND OPB_WIDGET_INST.COMMENTS IS NULL  
AND OPB_WIDGET.IS_VISIBLE = 1  
AND OPB_WIDGET.IS_REUSABLE <> 1 
        – Reusables do not need extra comments  
AND OPB_WIDGET_INST.WIDGET_TYPE NOT IN (1,2,46,47)
        — Source Definitions, Targets, Mapplet Input/Output  
AND OPB_WIDGET.WIDGET_ID = T1.WIDGET_ID (+)  
        – SQ/LKP only need comments when User Defined
AND CASE WHEN OPB_WIDGET.WIDGET_TYPE IN (3,11)
        AND T1.WIDGET_ID IS NULL THEN 1 ELSE 0 END = 0  
ORDER BY 1,2,3

These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • Fark
  • Technorati
  • del.icio.us
  • Google
  • StumbleUpon
Email This Post Email This Post

Comments

8 Responses to “Testing ETL or: Some More Fun with the Informatica Repository”

  1. Remy on April 11th, 2008 7:23 am

    Some of these queries look quite familiar to me ;)

  2. Thomas on April 11th, 2008 7:29 am

    Yep, but these queries DO work! ;)

  3. Lewis Kwan on May 9th, 2008 5:03 pm

    I wonder if such a tool exists in DataStage or Oracle Warehouse Builder.

  4. Boopathi on May 12th, 2008 7:56 pm

    These are very usefull.

    Can i have a query which gives Lookupname, mapping name ,lookup table name and connection information?

    Basicaly i need all details related to lookup from metadata.

    Thanks,

  5. Boopathi on May 12th, 2008 9:27 pm

    I guess i got the data. I used this query if someone interested…

    Thanks

    Select distinct b.WIDGET_ID, a.mapping_name, b.INSTANCE_NAME Lkp_name, Decode(c.attr_id,2,c.attr_value) Table_name,
    (c.attr_id,6,c.attr_value) src_tgt
    FROM rep_all_mappings a,rep_widget_inst b, OPB_WIDGET_ATTR c
    where a.mapping_id=b.mapping_id
    and b.WIDGET_ID=c.WIDGET_ID
    and a.subject_area=’xyz’
    and b.WIDGET_TYPE=11
    and c.WIDGET_TYPE=11
    and c.ATTR_ID in (2,6)

  6. zerns on June 12th, 2008 3:06 pm

    How about queries to check the status of workflows? Do these exist or do I need to make them?

    Thanks!

    Jeff

  7. sridhar subramanian on June 17th, 2008 1:54 pm

    could anybody give me an idea where to get an article regarding the internal tables,attributes and their description???

  8. Justin on June 24th, 2008 7:01 pm

    Thomas, I have been searching for a query that will help me do impact analysis for a table change? Do you have a query that could help me with that? You would be a life saver!

Leave a Reply