SoftDocs: DataStage Tips n Tricks

DataStage Administrator

Add & Delete projects

Issue DataStage Engine commands directly from the selected project

View or set project properties (allow users to Cleanup Resources & Clear Status File from within the Job menu of DataStage Director)

Permission (Production Manager, Developer & Operator)

--> Production Manager - full access to all areas of the project and can create and manipulate protected projects.

(If the project is created with the ‘protected’ property, only Production Managers can add or remove objects. [Production Manager is a user role.] Other users can:

Run jobs,

Set job properties,

Set job parameter default values)

--> Developer - full access to all areas of the project

--> Operator - permission to run and manage DataStage jobs
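
The role rules above can be sketched as a small lookup. This is only an illustration of the notes, not DataStage code: the `Action` names, the role strings and the `can` helper are all hypothetical, and the treatment of Operators in a protected project is an assumption.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical action names used only for this illustration."""
    RUN_JOB = auto()
    SET_JOB_PROPERTIES = auto()
    SET_PARAMETER_DEFAULTS = auto()
    EDIT_OBJECTS = auto()  # add or remove objects in the project


# What each role may do in an ordinary (unprotected) project.
NORMAL_RIGHTS = {
    "production_manager": set(Action),
    "developer": set(Action),
    "operator": {Action.RUN_JOB},
}

# In a protected project, everyone except a Production Manager is
# limited to these actions (assumption: intersected with the role's own rights).
PROTECTED_LIMIT = {
    Action.RUN_JOB,
    Action.SET_JOB_PROPERTIES,
    Action.SET_PARAMETER_DEFAULTS,
}


def can(role: str, action: Action, protected: bool) -> bool:
    """Return True if the role may perform the action on the project."""
    rights = NORMAL_RIGHTS[role]
    if protected and role != "production_manager":
        rights = rights & PROTECTED_LIMIT
    return action in rights


print(can("developer", Action.EDIT_OBJECTS, protected=True))           # False
print(can("production_manager", Action.EDIT_OBJECTS, protected=True))  # True
```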

Server side Tracing (Can trace activity on the server to help diagnose project problems)

Scheduling the jobs by using Windows NT Schedule service

Performance tuning

--> Set the memory cache size for reading and writing hashed files

--> In-process row buffering (This allows connected active stages to pass data via buffers rather than row by row)

--> Inter-process row buffering (SMP) (This enables the job to run using a separate process for each active stage, which will run simultaneously on a separate processor)
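
To see why passing buffers instead of single rows can help, here is a rough analogy in plain Python. It does not show how the DataStage engine implements row buffering; the stage functions and the buffer size are made up for the example. It only counts how many hand-offs happen between two connected stages.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, int]


def upstream_stage(n: int) -> Iterator[Row]:
    """Hypothetical active stage that produces n rows."""
    for i in range(n):
        yield {"id": i, "amount": i * 10}


def in_buffers(rows: Iterable[Row], buffer_size: int) -> Iterator[List[Row]]:
    """Group rows into buffers so they are handed over in blocks."""
    buffer: List[Row] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == buffer_size:
            yield buffer
            buffer = []
    if buffer:  # hand over the last, partly filled buffer
        yield buffer


def downstream_stage(buffers: Iterable[List[Row]]) -> Tuple[int, int]:
    """Hypothetical active stage: returns (number of hand-offs, sum of amounts)."""
    handoffs = 0
    total = 0
    for buf in buffers:
        handoffs += 1
        total += sum(row["amount"] for row in buf)
    return handoffs, total


# 10 rows passed in buffers of 4 need 3 hand-offs instead of 10.
handoffs, total = downstream_stage(in_buffers(upstream_stage(10), buffer_size=4))
print(handoffs, total)  # 3 450
```

Inter-process buffering goes one step further by running each active stage in its own process, which this single-process sketch does not attempt to show.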

DataStage Manager

Viewing & managing the contents of the Repository

Create, Delete Categories and Move items between Categories

Import & Export components and objects in the repository

Export utility creates an ASCII text file

Usage analysis tool (How modifying a particular object would affect the DataStage project as a whole)

Reporting Assistant allows you to generate reports at various levels within a project

Director

Used to validate, run, schedule and monitor jobs

Gather statistics as the job runs

Designer

Used to develop processes for extracting, cleansing, transforming, integrating and loading data

Stages (Each stage describes a particular database or process)

Three basic types of stages (Built-in Stage, Plug-in Stage and Job Sequence Stage)

i) Built-in Stage - Used for ETL process

ii) Plug-in Stage - Additional stages to perform specialized tasks that built-in stages do not support.

iii) Job Sequence Stage - To define sequences of activities to run

Three basic types of jobs (Server jobs, Parallel jobs and Mainframe jobs)

i) Server and Parallel jobs are compiled and run on the DataStage Server, but Parallel jobs also support parallel processing on SMP, MPP and cluster systems.

ii) Mainframe jobs are compiled and run on the mainframe.

Server job stages (Active or Passive)

i) Active stages provide mechanisms for combining data streams, aggregating data and converting data from one data type to another (e.g. Aggregator, Transformer, etc.)

ii) Passive stages handle access to databases for extracting or writing data (e.g. ODBC, Hashed file, Sequential file, etc.)
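
The split between passive and active stages can be pictured with a tiny pipeline. This is a plain Python analogy, not DataStage code: the function names and the sample data are invented. The reader plays the part of a passive stage (it only gives access to the data), while the transformer and aggregator play the part of active stages (they convert and combine it).

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Made-up sample data standing in for a sequential file.
SOURCE = "region,sales\nnorth,10\nsouth,5\nnorth,7\n"


def read_sequential(text: str) -> List[Dict[str, str]]:
    """Passive role: only extracts rows, every value is still a string."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Active role (like a Transformer): converts sales from text to integer."""
    return [{"region": r["region"], "sales": int(r["sales"])} for r in rows]


def aggregate(rows: List[Dict[str, object]]) -> Dict[str, int]:
    """Active role (like an Aggregator): sums sales per region."""
    totals: Dict[str, int] = defaultdict(int)
    for r in rows:
        totals[r["region"]] += r["sales"]
    return dict(totals)


totals = aggregate(transform(read_sequential(SOURCE)))
print(totals)  # {'north': 17, 'south': 5}
```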

Reusable elements (Server shared container and Parallel shared container)

Job Sequences (To specify a sequence of DataStage jobs to be executed and actions to take depending on results)
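
The idea of a job sequence (run jobs in order and react to each result) can be sketched as below. This is only an outline of the control flow in Python; it is not how sequences are built in the Designer, and the job names, the `run_sequence` helper and the `notify` action are all hypothetical.

```python
from typing import Callable, List, Tuple

# A job is a name plus a callable that returns True on success.
Job = Tuple[str, Callable[[], bool]]


def run_sequence(jobs: List[Job], on_failure: Callable[[str], str]) -> List[str]:
    """Run jobs in order; on the first failure, take the failure action and stop."""
    log: List[str] = []
    for name, job in jobs:
        ok = job()
        log.append(f"{name}: {'OK' if ok else 'FAILED'}")
        if not ok:
            log.append(on_failure(name))
            break  # later jobs are not started
    return log


def notify(job_name: str) -> str:
    """Hypothetical failure action, for example sending a notification."""
    return f"notify: {job_name} failed"


# Stand-in jobs: the second one fails, so 'load' never runs.
jobs: List[Job] = [
    ("extract", lambda: True),
    ("transform", lambda: False),
    ("load", lambda: True),
]

log = run_sequence(jobs, notify)
for line in log:
    print(line)
```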
