Dynamic Partitioning to Increase Parallelism Based on Resources Availability
Informatica PowerCenter session partition can be used to process data in parallel and achieve faster data delivery. Using Dynamic Session Partitioning capability, PowerCenter can dynamically decide the degree of parallelism. The Integration Service scales the number of session partitions at run time based on factors such as source database partitions or the number of CPUs on the node resulting significant performance improvement.
Dynamic Partitioning Methods
Partition Tutorial Series
Part I : Partition Introduction.
Part II : Partition Implementation.
Part III : Dynamic Partition.
Part I : Partition Introduction.
Part II : Partition Implementation.
Part III : Dynamic Partition.
The Integration Service can decide the number of session partitions at run time based different factors.
- Based on source partitioning : Determines the number of partitions using database partition information. The number of partitions is the maximum of the number of partitions at the source.
- Based on number of CPUs : Sets the number of partitions equal to the number of CPUs on the node that prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of nodes in the grid.
- Based on number of nodes in grid : Sets the partitions to the number of nodes in the grid running the session. If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a message in the session log.
- Based on number of partitions : Sets the partitions to a number that you define in the Number of Partitions attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
Note : Do not configure dynamic partitioning for a session that contains manual partitions. If you set dynamic partitioning and you manually partition the session, the session will be invalid.
Session Configuration With Dynamic Partitioning
If the dynamic partition is setup based on "Based on number of partitions", "Number of Partitions" property can be set up using the session parameter $DynamicPartitionCount, or enter a number greater than 1 as shown in below image.
Dynamic Partitioning with Different Partition Types
When using different partition types with dynamic partition, following rules must be considered.
Pass-through partitioning : If you change the number of partitions at a partition point, the number of partitions in each pipeline stage changes. If you use pass-through partitioning with a relational source, the session runs in one partition in the stage.
Key range partitioning : You must define a closed range of numbers or date keys to use dynamic partitioning. Dynamic partitioning does not scale partitions with key range partitioning on relational targets.
Database partitioning : When you use database partitioning, the Integration Service creates session partitions based on the source database partitions. This can be used only with Oracle and IBM DB2 sources.
Hash auto-keys, Hash user keys, Round robin : Use hash user keys, hash auto-keys, and round-robin partition types to distribute rows with dynamic partitioning. Use hash user keys and hash auto-keys partitioning when you want the Integration Service to distribute rows to the partitions by group. Use round-robin partitioning when you want the Integration Service to distribute rows evenly to partitions.
Rules and Guidelines for Dynamic Partitioning
Use the following rules and guidelines with dynamic partitioning.- Dynamic partitioning uses the same connection for each partition.
- You cannot use dynamic partitioning with XML sources and targets.
- Sessions that use SFTP fail if you enable dynamic partitioning.
- When you set dynamic partitioning to a value other than disabled, and you manually partition the session on the Mapping tab, you invalidate the session.
Using Dynamic Partitioning with Partition Types
The following rules apply to using dynamic partitioning with different partition types.
Pass-through partitioning :- If you change the number of partitions at a partition point, the number of partitions in each pipeline stage changes. If you use pass-through partitioning with a relational source, the session runs in one partition in the stage.
Key range partitioning :- You must define a closed range of numbers or date keys to use dynamic partitioning. The keys must be numeric or date datatypes. Dynamic partitioning does not scale partitions with key range partitioning on relational targets.
Database partitioning :- When you use database partitioning, the Integration Service creates session partitions based on the source database partitions. Use database partitioning with Oracle and IBM DB2 sources.
Hash auto-keys, hash user keys, or round-robin :- Use hash user keys, hash auto-keys, and round-robin partition types to distribute rows with dynamic partitioning. Use hash user keys and hash auto-keys partitioning when you want the Integration Service to distribute rows to the partitions by group. Use round-robin partitioning when you want the Integration Service to distribute rows evenly to partitions.
Hope you enjoyed this article and it is informative. Please leave us your comments and feedback.