
Informatica Vs OWB


From the developer's standpoint, both tools are great and can produce similar results; which one works better will depend on how familiar the developer is with the tool.

From a learning standpoint, it is much easier to start working with OWB than with Informatica, for one simple reason: anyone can download the software from the OTN website and there are lots of open forums about the tool, whereas Informatica PowerCenter is not available for download unless you purchase the software.

Both tools have great user interfaces and follow the drag-and-drop-columns-from-source-to-target approach. The only prerequisite for learning OWB is knowing the basics of SQL; on this point Informatica is the easier of the two to learn.

From an administration standpoint, it is much easier to manage Informatica PowerCenter and to assign privileges using the Repository Manager. To manage OWB, the administrator may need a few years of database administration experience, since there are a few challenges during installation and over the lifecycle of the data warehouse: setting up the OWB repository, identifying the target schemas and creating the Oracle Workflow repository. By contrast, the administrator can install the main Informatica components all at once during the installation process.

 * Informatica PowerExchange seems similar to Oracle Gateways, but with connectors to PeopleSoft and Siebel in addition to the SAP connectivity that both tools offer. Support for non-Oracle databases (DB2, SQL Server, Teradata, Sybase and so on) is good in both tools.

 * One major difference is that OWB will only populate Oracle 8i, 9i or 10g data warehouses, whilst Informatica works against any major vendor (thanks Duncan for pointing that one out, one of those ’so obvious if you’re used to OWB,

 * Both tools allow you to build reusable components for transforming data, with PowerCenter's being specific to the tool whilst Oracle's are regular PL/SQL functions and procedures.

 * Informatica, like Oracle, is making a big noise about grid computing. "PC7 offers server grid capabilities, too, by which Power Center can distribute loads across heterogeneous UNIX, Windows, or Linux-based computing platforms. Although grid capabilities may seem exciting, it’s not like they match real-world need for grid computing yet, and it is recommend using them in place of other industry grid solutions."  

 * The main architectural difference between PowerCenter and OWB is that PowerCenter has its own ETL engine, which sits on top of the source and target databases and does its own data movement and transformation, whilst OWB uses SQL and the built-in ETL functions in 9i and 10g to move and transform data. Interestingly, the article observes that the Informatica approach can be slower than the approach used with OWB. "Also, be aware that ETL tools are in general a slower (if more elegant) alternative to native SQL processing (such as Oracle PL*SQL or Microsoft Transact SQL)."

 * PowerCenter's use of web services and distributed computing looks more developed than OWB's. "Power Center Web services are managed through the Web Services Hub, another component of the architecture, which supports standards such as Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL), and Universal Description, Discovery, and Integration (UDDI). The architectural components can be collocated on a single server or spread across diverse servers, which allows solution parallelism, flexibility, and scalability."

 * PowerCenter starts at around $200,000 (yikes!), although there is a "flexible pricing model." OWB is licensed as part of Oracle Developer Suite 10g (10gDS), which is around $5000 per named user, although you'll need the Enterprise Edition of the 8i, 9i or 10g database to provide the ETL functionality.
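To put the two list prices above side by side, here is a rough back-of-the-envelope sketch. It only uses the two figures quoted in the bullet; it deliberately ignores the Enterprise Edition database licence, support, hardware and any discounting, so treat the result as an illustration rather than a real cost model. The function names are made up for this example.

```python
# Figures taken from the text above; everything else is an assumption.
POWERCENTER_ENTRY_USD = 200_000   # "starts at around $200,000"
OWB_PER_NAMED_USER_USD = 5_000    # "around $5000 per named user"


def owb_licence_cost(named_users: int) -> int:
    """Licence cost of OWB for a given number of named users (list price only)."""
    return named_users * OWB_PER_NAMED_USER_USD


def break_even_named_users() -> int:
    """Smallest number of named users at which OWB licences reach the PowerCenter entry price."""
    # Ceiling division without importing math.
    return -(-POWERCENTER_ENTRY_USD // OWB_PER_NAMED_USER_USD)


if __name__ == "__main__":
    for users in (5, 10, 40):
        print(users, "named users:", owb_licence_cost(users), "USD")
    print("break-even at", break_even_named_users(), "named users")
```

On these two numbers alone, a team would need around 40 named users before the OWB licences cost as much as the PowerCenter entry price.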




Connectivity and Interoperability


OWB was designed as an ELT/ETL tool that targets Oracle databases only. Informatica is an ETL tool that is database-agnostic (it pulls from a variety of ODBC sources and loads a variety of relational targets).
OWB is very finicky about its connections (it uses database links). They become messy if you are not disciplined, and the connections become difficult to manage when migrating across dev -> test -> prod.

Who leverages what?

OWB leverages the Oracle technology (very powerful). It does its work inside the database (it truly is ELT). This is where its power lies.

Informatica does its work outside of the database (in processes spawned by the server, using parallelism etc. to increase efficiency). Hence, it still has to put the data in the database (if you are staging it first), take the data back out to prepare it for the next step in processing, and then load it to its next destination. All of this happens outside of the database address space. OWB does all of this inside the database address space (with the exception of the pull from the source).
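The difference between the two styles can be sketched in a few lines. This is only an illustration, not how either product is implemented: it uses Python's built-in sqlite3 module as a stand-in for the database, and the table names (stg_sales, fact_sales) and function names are invented for the example. The first loader does the transformation with one set-based SQL statement inside the database (the ELT style); the second pulls the rows out, aggregates them in the client process and pushes the result back (the external-engine style).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small staging table and an empty target."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE stg_sales (region TEXT, amount REAL)")
    db.execute("CREATE TABLE fact_sales (region TEXT, total REAL)")
    db.executemany(
        "INSERT INTO stg_sales VALUES (?, ?)",
        [("east", 10.0), ("east", 5.0), ("west", 7.5)],
    )
    return db


def load_in_database(db):
    """ELT style: one set-based statement; the rows never leave the database."""
    db.execute(
        "INSERT INTO fact_sales "
        "SELECT region, SUM(amount) FROM stg_sales GROUP BY region"
    )


def load_via_engine(db):
    """External-engine style: pull rows out, transform in the client, push back."""
    totals = {}
    for region, amount in db.execute("SELECT region, amount FROM stg_sales"):
        totals[region] = totals.get(region, 0.0) + amount
    db.executemany("INSERT INTO fact_sales VALUES (?, ?)", list(totals.items()))


def result(db):
    """Return the target table contents in a stable order."""
    return db.execute(
        "SELECT region, total FROM fact_sales ORDER BY region"
    ).fetchall()


elt_db, etl_db = make_db(), make_db()
load_in_database(elt_db)
load_via_engine(etl_db)
print(result(elt_db) == result(etl_db))  # both styles produce the same rows
```

Both loaders end up with the same target rows. What differs is where the work happens: in the second version every row crosses the boundary between the database and the client twice, which is the round trip described above.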

Positioning of the Informatica server(s) is important.  There can be serious contention between the Informatica server(s) and the Oracle instance for resources. 

Network throughput is a serious consideration (for both of them). For Informatica, consider a high-speed network device (hub/switch/router) with a direct connection from the source server to the target server to maximize throughput and minimize the impact on the business network. Additionally, strongly consider using a dedicated server for the Informatica server(s), separate from the database server, with a network connection scheme similar to the private source-to-target network for the databases discussed above.

Setting aside the productivity argument (advocates on both sides will argue that their tool saves $$$ through increased productivity), Informatica costs big $$$! OWB is free (unless you need the enterprise options, i.e. SCD, data profiling and cleansing, etc.). It is a great way to get your data warehouse project off the ground inexpensively... if your data warehouse is built on Oracle technology (the database).

Getting the data you need…

We are using data from a variety of sources (SCADA and other operational sources such as hand-held devices and automated meter readers, Progress database, MS SQL Server, Oracle, Sybase, lots of Access and Excel, etc.). Thinking through the architecture is very important. Gateways are too expensive for our taste. We have leveraged Oracle HSODBC heavily to connect to the various sources, and this has helped tremendously. We are currently using the open source ODBC driver for MS SQL Server and Sybase to source those systems through the database (again as database links).
Informatica does this all on its own.

From the developer's point of view:

Informatica and OWB use the same basic design approach (maps with transformations or operators) to facilitate the ETL design.

OWB stores the metadata inside the design repository (which can be in the target database or on a separate server and Oracle instance).

Informatica stores the metadata in its own repository.

One huge advantage OWB has over Informatica is that it generates all of the deployable logic as Oracle PL/SQL packages. You can actually view the source code as you develop to see what your logic is going to do.

OWB is also extensible.  You can create procedures, functions and packages and import them to extend (customize) the ETL environment.

OWB also retains the actual source definitions of the various schema objects that are imported into the repository. This is very handy when, for example, you want to examine a view definition and how it is impacting the source data. Informatica does not retain these definitions.

Enough rambling. Informatica is the Cadillac ETL tool out there.  But the question remains, how much do you need?  Especially if your target is Oracle and you can upgrade the tool and extend its functionality to meet your needs? 

How long will OWB be around? 

Tough question, lots of fog on that topic.  The Oracle technologists say:-

- OWB 12 is in development (but workflow will be deprecated in 11g and BPEL-like technology will be folded in to replace it, standardizing on Oracle's workflow technology being used in Fusion.)

- OWB and ODI will be merged and there will be an OWB with ODI technology in it to solve some of OWB's shortcomings, such as the connectivity issues.  And there will be an ODI with OWB technology for.

- GoldenGate (another Oracle-purchased ETL tool for more real-time data processing) will replace OWB and ODI.

- Oracle will buy Informatica, and OWB, ODI and GoldenGate will all go away.
