This is another important question we often encounter during the interviews. I have been struggling a lot to understand and finally did it with a classic example.
Parallel processing comes in picture when there is a requirement to process large bulk of rows without compromising the performance which might otherwise get impacted greatly with non-parallel based processing. We would try to understand this concept with an actual example where this concept has been used.
Requirement - There are 100,000 employees in the organization and many of them have more than one assignments (EMPL_RCD) hence the total number of rows in the PER_ORG_ASGN record collectively will be 120,000. Develop a program to update the primary job indicator flag for each employee in the table PRIMARY_JOBS.
Please also see -
PeopleSoft Set Control Field Concept and Tableset Sharing
Understanding dynamic prompts in Peoplesoft
PeopleSoft 9.1 Person Model
Adding and Maintaining Person Of Interest in PeopleSoft
PeopleSoft Set Control Field
Peoplesoft Row Level Security Search Records
SQL Query for Max Effective (MAX (EFFDT) dated row from JOB table
SQL Query to find all the Direct Reports for a Manager
How to find the List of Users Assigned to a Role
Understanding Component Interface in PeopleSoft
Creating Query Report with PS Query in PeopleSoft
PeopleSoft HRMS Online Training
Implementing Parallel Processing using Temporary Record with Application Engine
Step :1 Open the App DesignerStep 3: Create a record BN_AGE50_AET of type "Derived/Work"
Step 4: Create a new Application Engine program "BN_UPD_PRIM" in app designer.
Step 5: Open the Application engine properties, go to "Temp Tables" tab and assign all three temp tables. Give the instance count as 10.
Step 7: Build the temporary records BN_EMPLID_WRK, BN_EMPLID1_WRK and BN_JOB_WRK. App designer will create the number of instances for each of these temporary tables equal to given in the Program Properties -> Temp Tab - Instance count box which is "10" in our case which means the tables for each temp record will be created as below:
BN_EMPLID_WRK - BN_EMPLID_WRK1, BN_EMPLID_WRK2... up to 10 instances
BN_EMPLID1_WRK - BN_EMPLID1_WRK1, BN_EMPLID1_WRK2.... up to 10 instances
BN_JOB_WRK - BN_JOB_WRK1, BN_JOB_WRK2.... up to 10 instances
Step 8: In the MAIN section of app engine (Or create a new section) add a Step/Action of SQL type, then add below SQL in that.
This SQL will load all the employees into temp table instances BN_EMPLID_WRK1, BN_EMPLID_WRK2... etc, Please make sure that you are using the meta sql %Table with the temp table name.
Lets say we have 100,000 employees in PS_PERSON record and each employee has just one assignment in PS_PER_ORG_ASGN. Since we have instance count for the temp record to 10 so when the program is run, The selected 100000 rows will be divided into like 100000/10 = 10000 and each chunk of 10000 rows will be inserted into one of the available instances BN_EMPLID_WRK1, BN_EMPLID_WRK2 ...etc.
Please also see -
PeopleSoft Set Control Field Concept and Tableset Sharing
Understanding dynamic prompts in Peoplesoft
PeopleSoft 9.1 Person Model
Adding and Maintaining Person Of Interest in PeopleSoft
PeopleSoft Set Control Field
Peoplesoft Row Level Security Search Records
SQL Query for Max Effective (MAX (EFFDT) dated row from JOB table
SQL Query to find all the Direct Reports for a Manager
Hiring a Person in PeopleSoft
Business Unit, Company and Regulatory Region in PeopleSoft
Process Security in PeopleSoft
Please note that we don't have to worry about which instance a particular chunk is being inserted into instead, %Table metasql does that for us. However, Its not necessary that application engine will process the rows in exactly same way explained above, its just a logical explanation which is trying to convey that the number of selected rows i.e 100,000 will be divided and loaded into appropriate temp table instances to be processed simultaneously which is nothing but parallel processing.
Once above step is processed, the temp table BN_EMPLID_WRK has been loaded with large volume of data (1000,000 rows) of employees eligible to be processed. How many rows have been loaded in which temp table instance (BN_EMPLID_WRK1, BN_EMPLID_WRK2... etc) is none of our concerns.
Step 9: Add below Step/Action to update the statistics of temp table BN_EMPLID_WRK for the indexes to improve the performance when the rows are selected from the instances of this table in further steps.
Step 10: Add below Step/Action to get the JOB data (EFFDT, EFFSEQ and EMPL_STATUS) for each employee loaded in BN_EMPLID_WRK, and insert into another temp table BN_JOB_WRK (BN_JOB_WRK1, BN_JOB_WRK2 ... upto 10 instances).
Here again as you notice, we just used %Table metasql with the table BN_JOB_WRK in the SQL which will take care as to how should be the allocation of rows for the instances BN_JOB_WRK1, BN_JOB_WRK2...etc.
Step 11: Add below Step/Action to update the statistics of temp table BN_JOB_WRK for the indexes to improve the performance when the rows are selected from the instances of this table in further steps.
Step 12: Add below Step/Action to select those employees loaded in the temp table BN_EMPLID_WRK who are all don't have more than one assignments (EMPL_RCD) for a benefit record number (BENEFIT_RCD_NBR), and insert into another temp table BN_EMPLID1_WRK (BN_EMPLID1_WRK1, BN_EMPLID1_WRK2...upto 10 instances).
Step 13: Add below Step/Action to update the statistics of temp table BN_EMPLID1_WRK for the indexes to improve the performance when the rows are selected from the instances of this table in further steps.
Step 14: Finally, mark the current EMPL_RCD as "Primary Job" in the table PS_PRIMARY_JOBS for the processed employees by joining all three temp tables.
Well, it's quite evident that the procedure to implement parallel processing appears simpler than we thought. However, if we try to compare it with the non-parallel way of addressing the same request then it would certainly help us understand the concept clearly.
Let's discuss how would it have been addressed if there was no mechanism called "Parallel Processing".
1) Record Definition - The number of records will be same but the record type for BN_EMPLID_WRK, BN_EMPLID1_WRK and BN_JOB_WRK will be "Table".
2) Since there is no temporary record created hence no need to assign the temp records in Program Properties -> Temp Tables. It won't be applicable.
3) No Need to use %Table metasql while referring to the records.
Everything else will remain exactly same. We also need to understand how different it will behave without parallel processing.
In this case, When we run the program, The system will store all the 100,000 rows into the table BN_EMPLID_WRK during the execution of Step 1 explained earlier unlike, into various temp table instances BN_EMPLID_WRK1, BN_EMPLID_WRK2..etc. So, when execution goes further then all the 100,000 rows are fetched from one table and processed but in case of parallel processing the chunk of rows i.e 10000 if fetched from each temp table instance and processed all simultaneously.
Hence, If time taken to complete the processing is 20 Mins then it will be 20/10 ~ 2 to 5 Mins if parallel processing is used.
Please also see -
PeopleSoft Functional and Technical Online Training
Understanding dynamic prompts in Peoplesoft
PeopleSoft 9.1 Person Model
Adding and Maintaining Person Of Interest in PeopleSoft
PeopleSoft Set Control Field
Peoplesoft Row Level Security Search Records
SQL Query for Max Effective (MAX (EFFDT) dated row from JOB table
SQL Query to find all the Direct Reports for a Manager
Hiring a Person in PeopleSoft
How to find the List of Users Assigned to a Role
Understanding Component Interface in PeopleSoft
FTE For Multiple Jobs in PoeopleSoft
Business Unit, Company and Regulatory Region in PeopleSoft
Hello, Wonderful article. Thank you for elaborating concept of Parallel processing in AE.
ReplyDeleteHowever, I got one question. Is Parallel or concurrent processing simply means having more than one instance of your app engine program running at the same time and each instance of the program gets it's dedicated Temp table instance.
OR
Parallel or concurrent processing could be achieved using Temp tables. For example if there are 100K rows to be processed and 10 temp table instances, each table is responsible to process 10K rows simultaneously on a single execution of application engine.
I believe your blog indicates a second concept. I tried to implement lattar in one of the application engine to process about over one Million records and when executed there is only one instance of Temp table allocated to the process. I want to know how it will process 10 temp tables (or more than one temp table) in single run of program.
As this is written it is going to dump 100,000 records into BN_EMPLID_WRK1 (actually BN_EMPLID_WRK04). It will not split up the 100,000 rows. You could create 10 programs and limit the initial load to say 10,000 rows,then each program would be processing that specific data at the same time. Or you could actually have one AE and it could determine the number of total rows to load, divide it by the number of temp tables you have, then create a SQL statement to load this data, store it in your state record and schedule each app engine to run. Either way, I do not see how what is written here is going to AUTOMATICALLY know how to divide the 100,000 rows and actually use all instances of the temp tables and process all 10 instances of the data at the same time. If I am missing anything please let me know, but I just do not see it here.
DeleteDale- you are absolutely correct.
DeleteIf I were to process 100 K rows in an application Engine, do I need to run different instances of application Engine for parallel processing of the process or data could automatically be spread out into number of Temp tables (Say 10000 rows among 10 temp tables where 10000 is arbitrary) in single run of program.
ReplyDeleteMy objective is to divide data among instances of temp tables in a single run without running many instances of the same program as our program will be scheduled and must not require any user intervention therefore no one will be there to enter run control parameters manually.
Our expected setting: One process -> one instance run of the program -> Uses 5 temp tables instances to insert huge data simultaneously in a single run using parallel processing to achieve performance. Your blog gives us a clue that we can implement that way (referring to para starting with 'Lets say....'). Are you sure your AE program when executed were using all instance of the temp tables to process data on a single run of an application engine.
@Mavarick_1979 - It can be implemented in both ways however the second one PeopleSoft delivered one and a lot easier to implement too.
ReplyDeleteDid you mention number of instance count in "Instance count " filed of App engine program properties (Explained in Step: 5)
@PeopleSoftOracleFusion - Yes, your understanding is correct.
This comment has been removed by the author.
ReplyDeleteI can not see the SQL that is referred to in Step 8, BUT, the way this is written, it will not spread the 100,000 rows across multiple temp tables. It will simply use 1 instance of the 10 tables created. If you want to spread this across 10 tables, you would need to have 10 copies of this app engine. the initial load in step 8, of each AE, would need to setup to only load a specific number or 'chunk' of data. From that point on, each program would be processing the 10k rows that it has loaded, The only value in what I see here is if this AE abended, you would be able to run it again using a different run control and it would use the next available instance of the temp tables
ReplyDeleteA very elaborated explanation thanks for sharing it...
ReplyDeleteHi Santosh,
ReplyDeleteNice explanation. Could you please explain restart-ability in app engine.
I have queued it up in my to be published list.. thanks for visiting guys I am delighted that you all are liking the blog. .
ReplyDeleteDale barthold is correct- parallel processing means you have the ability to run the same appengine working on different sets of data- if you implement this as it is elaborated here- it is only going to use one instance of the temp table. You need to segregate data at the runcontrol level and desing your insert sql for Insert into %table (temp_table) so that it fetches data based on different runcontrol parameters.
ReplyDeleteDale is correct
ReplyDeleteThis comment has been removed by the author.
ReplyDeletecan anybody please explain on the calculation of total instance count of temporary tables, if count is defined online as well as in application engine
ReplyDeleteThanks Santhosh . Very nicely explained.
ReplyDeleteerrors
ReplyDeletePeopleTools 8.50 - Application Engine
Copyright (c) 1988-2015 PeopleSoft, Inc.
All Rights Reserved
Application error: no SQL statement defined for SQL action (108,535)
Message Set Number: 108
Message Number: 535
Message Reason: Application error: no SQL statement defined for SQL action (108,535)
Process 1335 ABENDED at Step PARR_AE.MAIN.Step02 (SQL) -- RC = ? (108,524)
Message Set Number: 108
Message Number: 524
Message Reason: Process 1335 ABENDED at Step PARR_AE.MAIN.Step02 (SQL) -- RC = ? (108,524)
Process %s ABENDED at Step %s.%s.%s (Action %s) -- RC = %s
Such a wonderful work....You are Guru of Guru Sir....Thanks.
ReplyDeleteHi Blogger,
ReplyDeleteThanks a lot for sharing this knowledge.
But can u please address Dale's view point.
It seems your logic is not splitting the data and its not parallel processing.
Can you please clarify.
Thanks in advance.
Hi,
ReplyDeleteI have created a 10 instance as above mentioned .I am getting below error when I am processing.
Note:PS_DLT_OPR ID_TM1 is my Temp Table
SQL error. Stmt #: 2660 Error Position: 0 Return: 9999 - A n unknown error code (9999) wa s encountered (SQLSTATE PS999)
RUN TRUNCATE TABLE (PS_DLT_OPR ID_TM14)
Worthful Spark tutorial. Appreciate a lot for taking up the pain to write such a quality content on Spark Training. Just now I watched this similar Spark tutorial and I think this will enhance the knowledge of other visitors for sure. Thanks anyway.:-https://www.youtube.com/watch?v=8Kcu63H0d8c&
ReplyDeleteI have been reading various topics on your blog and I love the clarity in the explanation. Thank you!
ReplyDeleteThanks for sharing with us.
ReplyDeleteBusiness Setup in Dubai
Ajman Free Zone offshore formation
Ajman Free Zone
Ajman Media City Free Zone
Dubai PRO Service
DNRD Typing Dubai Immigration
PRO Services in UAE
Dubai Local Sponsor
Dubai Company Liquidation
Great read! Thank you for such useful insights. Visit here for advanced technical courses on OKTA ONLINE TRAINING
ReplyDeleteis that explained set processing or parllel processing? what is diffence between dividing and sharing temp tables inbetween same program run and multiple program runs?
ReplyDeletePročitajte moje svjedočanstvo o tome kako sam dobio brzi zajam od pouzdane tvrtke. Pozdrav, ja sam gospođica Veronica iz Hrvatske. Bio sam u financijskoj situaciji i trebao sam kupiti kuću. Pokušao sam potražiti zajam od različitih kreditnih tvrtki, privatnih i korporativnih, ali nije išlo, a većina banaka mi je odbila kredit. Ali kako bi Bog htio, upoznali su me s privatnim zajmodavcem koji mi je dao zajam od 90,000 eura i danas imam vlastitu kuću i vlasnik sam tvrtke i trenutno mi je dobro, ako morate ići na tvrtka za osiguranje zajma. bez kolaterala, bez provjere kreditne sposobnosti, samo 3% kamatna stopa i supotpisnik s boljim planovima i rasporedom otplate, kontaktirajte Davidson Albert Loan (davidsonalbertloan@gmail.com). On ne zna da to radim, ali sada sam tako sretna i odlučila sam dati ljudima više informacija o njemu i želim da ga Bog još više blagoslovi. Možete ga kontaktirati putem njegove e-pošte. davidsonalbertloan@gmail.com
ReplyDelete{WhatsApp: +38761545894}