If not for the new job I recently joined, I would not have heard of this tool at all. Alteryx is a new kid on the block that blends ETL, analysis and reporting into one product. The reporting part hasn’t impressed me much, but some of its ETL and analysis features just blew me away. In this post I would like to point out a few key areas where I feel it is miles ahead of SSIS or Informatica.
Installation
Just go to the site and click Download Now. You will be prompted to register, and you MUST opt in to their subscriptions. Once that is done, you get a 14-day trial of the product. It’s that simple.
Preview Data at every single stage post run
Once the workflow has been executed, you can see the data before and after each transformation. I just can’t fathom how they are even doing this. Let me show you with an example.
Here is a simple workflow that takes student data as input, checks Gender (Male/Female) and then concatenates first name and last name in both flows.
If you come from an ETL background, you will quickly latch on to the transformations and what they do, because they are so intuitive. Even an hour of video on YouTube would be enough to get up to speed on them. Notice that each of the transformations above has green button-like icons before and after it. These are its inputs and outputs (as if you hadn’t deciphered that already). Now let’s say the workflow is run. After the run, I can click on any of these buttons to see the data at that particular stage.
Let me pick the ‘Identify Gender’ transformation. Once you click on a transformation, all of its inputs and outputs are available to preview. Below is what the input data looks like. My condition was to separate the rows by Gender –
If I want to see what the ‘T’ output rows look like (i.e. Males), I just click on it –
Now let me have a look at the ‘F’ output rows –
Imagine this being the case at every stage of the transformation. It’s just incredible to be able to see how your business rules are working at every stage.
Let me repeat one more time, in case you skipped the subheading: this preview is POST-RUN. I can’t think of an equivalent in either SSIS or Informatica.
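For readers who have not used Alteryx, here is a rough Python sketch of the same idea. This is not how Alteryx is implemented; it only mimics the behaviour by keeping a snapshot of each stage's output so it can be inspected after the run. The sample rows, the stage names and the `run_workflow` helper are all hypothetical.

```python
# Hypothetical sample rows standing in for the student input file.
STUDENTS = [
    {"First Name": "Asha", "Last Name": "Rao", "Gender": "Female"},
    {"First Name": "Ben", "Last Name": "Cole", "Gender": "Male"},
    {"First Name": "Chris", "Last Name": "Diaz", "Gender": "Male"},
]


def add_full_name(rows):
    """Concatenate first and last name into a new 'Full Name' field."""
    return [
        {**row, "Full Name": f"{row['First Name']} {row['Last Name']}"}
        for row in rows
    ]


def run_workflow(rows):
    """Run every step and keep a snapshot of each stage's output."""
    stages = {"Input": list(rows)}
    # Filter step: 'T' holds rows that pass the Male check, 'F' the rest.
    stages["Identify Gender (T)"] = [r for r in rows if r["Gender"] == "Male"]
    stages["Identify Gender (F)"] = [r for r in rows if r["Gender"] != "Male"]
    stages["Full Name (T)"] = add_full_name(stages["Identify Gender (T)"])
    stages["Full Name (F)"] = add_full_name(stages["Identify Gender (F)"])
    # Union step: stack both branches back together.
    stages["Union"] = stages["Full Name (T)"] + stages["Full Name (F)"]
    return stages


stages = run_workflow(STUDENTS)

# After the run, any stage can be previewed on demand.
for row in stages["Identify Gender (T)"]:
    print(row)
```

Any other key in `stages` can be printed the same way, which is roughly what clicking an input or output button gives you after a run.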
Testing the DFT without Output
Scroll back up a little and look at the screenshot of the workflow I posted. It doesn’t contain an ‘Output’. The last transformation you see is just a UNION ALL. If you are developing a POC or just testing something out, you don’t need to create a destination and dump the data into it. Of course, Trash Destination from the SSIS stack quickly comes to mind, but that’s an add-on and not an out-of-the-box feature. I can’t think of an equivalent in Informatica, though.
Multicasting
You can multicast the output of pretty much every transformation and then branch off to do some entirely new logic.
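As a minimal illustration of what multicasting means, the Python sketch below feeds one output into two unrelated branches. The rows and both branches are made up for the example and say nothing about how Alteryx does it internally.

```python
# Hypothetical output of some upstream transformation.
rows = [
    {"Name": "Ben Cole", "Gender": "Male"},
    {"Name": "Asha Rao", "Gender": "Female"},
    {"Name": "Chris Diaz", "Gender": "Male"},
]

# Branch 1: count rows per gender.
counts = {}
for row in rows:
    counts[row["Gender"]] = counts.get(row["Gender"], 0) + 1

# Branch 2: entirely different logic on the same output.
upper_names = [row["Name"].upper() for row in rows]

print(counts)
print(upper_names)
```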
Dynamic column propagation
I have been saving the best for last. Alteryx has this incredibly advanced capability of bringing in dynamic columns just as if they had been there all along. Let’s say I make two changes to the input file from the example above –
- Added a new column, say Location, at the end
- Added a new column, Is Married, after the ‘Last Name’ column
Without any changes to the workflow, it just runs without throwing an error, and here is my output from the ‘Union’ transform –
HOLY COW! It’s just mind-blowing, ain’t it? The change to the data was pretty drastic, with new columns added not just at the end but also in between, and Alteryx just doesn’t care. It just works!
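One way to picture this behaviour is a step that touches only the fields it needs and passes everything else through by name. The Python sketch below is an analogy under that assumption, not a description of Alteryx internals; the rows, the column order and the `add_full_name` helper are hypothetical.

```python
def add_full_name(rows):
    """Add 'Full Name' while passing every other column through untouched."""
    return [
        {**row, "Full Name": f"{row['First Name']} {row['Last Name']}"}
        for row in rows
    ]


# Original layout of the input.
before = [{"First Name": "Ben", "Last Name": "Cole", "Gender": "Male"}]

# Changed layout: 'Is Married' after 'Last Name', 'Location' at the end.
after = [{
    "First Name": "Ben", "Last Name": "Cole", "Is Married": "No",
    "Gender": "Male", "Location": "Pune",
}]

# The same step runs on both layouts without any change.
print(list(add_full_name(before)[0]))
print(list(add_full_name(after)[0]))
```

Because the step looks columns up by name rather than by position, the two new columns simply ride along to the output.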
I am sure I have only seen the tip of the iceberg and there is a HUGE amount of exploration left to do. What is also fantastic about this product is the community behind it.
Their community forums and learning channels are free for anyone to ask questions and learn, much like the MSDN or Informatica communities. They run weekly challenges, which are good fun if you want to flex your muscles and give them a try. I do feel they can improve the interface as a whole. I feel a bit claustrophobic with the overbearing green color theme and design, but you get used to it.
All in all, I am loving it. Watch out for future posts where I detail how it fares on performance, error handling, configurability, looping, dynamic data parsing etc.