SQL vs Native IO
Last Post 05 Oct 2011 10:00 PM by Marcia Dore. 39 Replies.
Fernando Nuñez
New Member
Posts:5
17 Mar 2006 02:07 PM
I've read in some iSeries magazines that it's better to use SQL access than native I/O in RPG programs. In fact, they show how to change the DB definitions from DDS to SQL. I've searched the internet for a comparison between SQL access and native access but haven't found anything, so I wrote a few programs to verify the performance myself. The SQL process times came out worse than native access. In my company the data volume in the files is very large and processing time is very important, so I don't know whether I should change my programs to use SQL or keep using native access. Could someone help me? Did I do anything wrong in my test? Is there any comparison between SQL and native access on V5R3?

Results (in CPU_TIME):

JOB_NAME   CPU_TIME
NATIVE         2933
SQL           20214

The test file had about 600,000 records. Test source attached.
Lynne Noll
Senior Member
Posts:6567
17 Mar 2006 04:28 PM Accepted Answer
Changing the data definitions to SQL means a faster method of extracting sets of data will be used. This will speed up data retrieval with SQL. That is what the articles are about. Individual record access is often faster with RLA than with SQL. The new engine will not be used if the database was defined with DDS, so you are not getting the fastest SQL (depending on release). SQL can also be phrased in such a way that it works, but works very slowly. These are all issues that complicate this question.

You will not get any enhancements to the old database engine: it is the new one that will speed up over time. However, I am not pushing you to change. At my shop, where the data volume is relatively low, I use a lot of SQL, but not for data definition and not that much for maintenance. SQL retrieval time seems fine to me.

I've worked for large-volume shops. They always had huge needs for historical reporting in a flexible manner, which is SQL's forte. However, in that case it can make the most sense to build special tables of denormalized historical data on their own platform for this requirement (a data warehouse), which may or may not be DB2.
Robert Clay
Veteran Member
Posts:915
17 Mar 2006 04:41 PM Accepted Answer
These are two really good links for explaining DDS-defined files versus DDL-defined files: http://www.iseriesnetwork.com/artar...topicid=50 http://www.iseriesnetwork.com/artarchive/index.cfm?fuseaction=viewarticle&CO_ContentID=20057&channel=art&subart=top&topicid=50 HTH, Robert
"Contrariwise, if it was so, it might be; and if it were so, it would be; but as it isn't, it ain't. That's logic."--Tweedledee
J Taylor
Senior Member
Posts:4007
17 Mar 2006 05:31 PM Accepted Answer
My understanding is that there are two issues here:

1. SQL vs. DDS for file definition
2. SQL vs. RLA (record level access) for data access

I have read a number of articles listing the advantages of SQL over DDS. I cannot recall any that said SQL is better than RLA for general data access.
Scott Klement
Editorial Staff Member
Posts:16313
17 Mar 2006 05:44 PM Accepted Answer
SQL is, indeed, faster than RLA for some sorts of database access, particularly when the selection criteria are complex.

OS/400 has a database access engine called CQE ("Classic Query Engine") that it has always used for database access. In recent releases there's a newer one called SQE ("SQL Query Engine"). SQE is much faster at processing queries and makes database access absolutely fly. However, IBM has been developing SQE gradually, and its functionality is not yet complete.

When you access the database, the system automatically chooses SQE or CQE for you. It selects SQE when you submit an SQL query that the SQE for your version of OS/400 supports. With each release of OS/400 (V5R2, V5R3, V5R4) SQE has supported more and more functionality, and therefore runs more and more of the SQL queries. Non-SQL access, however, always uses the CQE. Any SQL that the SQE doesn't support will also go through the CQE automatically.

That's why SQL is sometimes faster. When you happen to fit into the SQE, you get much better performance. And since SQE supports more with each release, your older programs using SQL may speed up when SQE suddenly becomes applicable.

On the other hand, if you're doing something very simple, like loading a single record based on a key, or reading a bunch of records in sequence, there's really nothing the system can do to make that any faster. It's already as fast as it can be, no matter which query engine handles it, because it's very simple. In those cases, native record level access will always be faster because there's no interpretation, analysis, or optimizing that needs to be done at runtime. It's a simple directive: go here and read this.

I get tired of the constant debating that seems to go on with this topic. SQL is faster. RPG is faster. SQL is faster, blah blah blah. That's not a sensible debate, because it oversimplifies the situation. It really depends on what you're doing!
If performance is important for your project, because the difference is significant to you, then it's worth trying both ways to see which one performs better for that particular project. If performance isn't important for your project, don't worry about which one is faster. Just pick the one that's easiest for you.
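[Editor's note: Scott's single-record case, both ways, might look roughly like this in free-form RPG. This is a sketch only; the file CUSTMAST, its key CUSTNO, and the field CUSTNAME are invented for illustration.]

```rpgle
dcl-f CUSTMAST keyed;            // keyed by CUSTNO (assumption)
dcl-s custno packed(7);
dcl-s name char(30);

// Native I/O: a direct keyed probe; nothing to parse or
// optimize at run time.
chain (custno) CUSTMAST;
if %found(CUSTMAST);
  // ... use the record ...
endif;

// SQL equivalent: same single-record result, but the statement
// must be parsed/optimized (or an open data path reused) before
// the same probe can happen.
exec sql select CUSTNAME into :name
           from CUSTMAST
          where CUSTNO = :custno;
```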
Elvis Budimlic
Veteran Member
Posts:1433
17 Mar 2006 10:37 PM Accepted Answer
Scott said it well: it depends. The only thing I'd like to add is that SQL often gets a bad rap because an RPG programmer tries it without doing any tuning to help the query optimizer with more statistics and alternative implementation methods (i.e., indexes).
B.Hauser
Veteran Member
Posts:2037
18 Mar 2006 11:54 AM Accepted Answer
Hi,
Only thing I'd like to add is that SQL often gets bad rap because RPG programmer tries it without doing any tuning to help query optimizer with more statistics and alternative implementation methods (i.e. indexes).
That's because an RPG programmer is used to determining the access path (i.e., logical file) himself. Most RPG programmers think that specifying a logical file in a SELECT statement will force the query engine to use that access path. But that's not true! If a logical file is specified in an SQL statement, the query optimizer checks the field selection, join information, and select/omit clauses defined in the logical file. First the SQL statement gets rewritten based on the physical files (or SQL tables), with the select/omit clauses as WHERE conditions. In a second step the optimal access path gets determined by checking ALL available access paths for the specified physical files (or SQL tables) and analyzing the statistics. If the access path defined in the logical file ends up being used, it's nothing more than coincidence.

And what is much more important: all SQL SELECT statements that use DDS-described logical files (as opposed to SQL-defined views) get rerouted to the Classic Query Engine (CQE) and cannot gain from the advantages of the SQL Query Engine (SQE). This rerouting may reduce performance by 10 to 15%.

One more piece of information about CQE and SQE: the SQE is not used as long as logical files with SELECT/OMIT clauses are defined over the physical files (or SQL tables) used in the SQL statements. A workaround is to set IGNORE_DERIVED_INDEX in the QAQQINI.

Birgitta
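[Editor's note: the QAQQINI workaround Birgitta mentions might be set up roughly like this; the library name MYLIB is a placeholder.]

```sql
-- First make a job-level copy of the options file (CL):
-- CRTDUPOBJ OBJ(QAQQINI) FROMLIB(QSYS) OBJTYPE(*FILE) TOLIB(MYLIB) DATA(*YES)

-- Tell the optimizer to ignore DDS select/omit logical files
-- so eligible queries can still route to SQE:
UPDATE MYLIB/QAQQINI
   SET QQVAL = '*YES'
 WHERE QQPARM = 'IGNORE_DERIVED_INDEX';

-- Then point the job at the copy (CL):
-- CHGQRYA QRYOPTLIB(MYLIB)
```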
Joe Pluta
New Member
Posts:29
19 Mar 2006 01:35 AM Accepted Answer
Let's be clear here: on single record access, I have NEVER seen a situation where SQL is faster than native I/O. If you think about it, there's almost no way SQL can be faster, since at the end of the day a single-record access is going to do a CHAIN or an UPDATE or a WRITE or a DELETE. The native I/O methods simply get there faster, with predefined keys and no parsing of any kind required.

I re-ran my basic SQL vs. native I/O benchmarks on V5R3 back in November and they still show the same basic pattern: SQL wins on complex queries (as we've all said). However, on single-record access, native I/O beats SQL. SQL only catches up once you're accessing multiple records at a time. The turning point seems to be around 10 records on reads and more like 100 records on updates.

Second, SQL performance depends very much on the actual data. You can write a query that performs very well on one set of data, and the exact same SQL can perform horribly on another set of data. I don't want to over-emphasize this point; the performance variance isn't usually to a degree that causes noticeable application degradation. But it can happen; it's important to test with real-world data early in your development cycle.

Somewhat as an aside, there's also the issue of maintenance. I won't go into that in detail, but debugging a 30-line SQL statement can be something of a challenge. Whether you want to include that in your business decision is up to you.

All that being said, I use SQL quite a bit. I'd say that about 10-15% of my production database access is now done via SQL (typically embedded SQL), with a much higher percentage during analysis and prototyping.

Joe
Elvis Budimlic
Veteran Member
Posts:1433
20 Mar 2006 05:10 PM Accepted Answer
Since the heavy hitters are chiming in on this subject, I have a somewhat related question that's been bugging me lately. In SQL, if all the stars and planets are aligned correctly, it is possible to get index-only access when all of the fields you are projecting are found in the index used for the query implementation. Intuitively, I don't think the same is possible with native I/O, but I don't know for sure. Can anyone confirm or refute this? Thanks.
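[Editor's note: for readers unfamiliar with the term, a sketch of what index-only access means; the table ORDERS and its columns are invented.]

```sql
-- If every column the query references is in the index, the
-- optimizer can answer it from the index alone, without
-- touching the table rows at all:
CREATE INDEX ORD_IX1 ON ORDERS (CUSTNO, ORDDATE);

SELECT CUSTNO, ORDDATE
  FROM ORDERS
 WHERE CUSTNO = 12345;
```

Native keyed access (CHAIN/READE), by contrast, always retrieves the full record into the program's buffer, which is consistent with Elvis's intuition.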
Stefan Jost
New Member
Posts:1
24 Mar 2006 08:02 AM Accepted Answer
SQL can be much faster, but it depends on the task and on a good design. We cannot simply replace a CHAIN with a SELECT INTO statement; in general, it is necessary to rework an existing program.

To read 600,000 records, use OPEN CURSOR (once!) and FETCH (often!). If possible, use a blocked fetch; that speeds things up further. If you have to use a statement multiple times, use PREPARE (once!) and EXECUTE (often!). That helps, but may still be slow compared to native RLA. For write/update operations, avoid single-record SQL INSERT/UPDATE. Use INSERT/UPDATE with a subselect, or UPDATE WHERE CURRENT OF in a FETCH loop. If none of this is possible, we use the native RLA operations, and no SQL.

Hope our experience helps a little.

Stefan
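[Editor's note: Stefan's open-once/fetch-blocked advice might look roughly like this in free-form RPG. The file ORDERS, its key ORDNO, and the 500-row block size are all assumptions.]

```rpgle
// Sketch: one cursor open, then blocked fetches into a DS array.
dcl-ds rows extname('ORDERS') qualified dim(500);
end-ds;
dcl-s fetched int(10);

exec sql declare c1 cursor for
  select * from ORDERS order by ORDNO;

exec sql open c1;                                    // open once
dow *on;
  exec sql fetch next from c1
           for 500 rows into :rows;                  // fetch often, blocked
  fetched = sqlerrd(3);                              // rows actually returned
  // ... process rows(1) through rows(fetched) ...
  if sqlcode <> 0;                                   // +100 = end of data
    leave;
  endif;
enddo;
exec sql close c1;
```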
Jean-François BARBIER
New Member
Posts:14
24 Mar 2006 10:25 AM Accepted Answer
Hello, I have built logical files with select/omit over SQL-created tables and indexes. Here is an example:
R MYTABLEF                 PFILE(MYTABLE)
K KEY
S STAMPSUP                  COMP(EQ *NULL)
I then use MYTABLELF in my SQL statements, in order to get rid of suppressed records. Now, I understand that this prevents my programs from using SQE. Should I include "and stampsup is null" in every SQL statement to keep dealing only with active records? What would the performance impact be, since the query would have to process all the records before returning only the active ones? It seems rather simple to build a view for SELECT statements, but what about UPDATE or DELETE statements? What can I do at the database level to improve selection of active-only records? Thanks in advance for your help. Jean-François
Elvis Budimlic
Veteran Member
Posts:1433
24 Mar 2006 03:35 PM Accepted Answer
One possibility may be to indeed specify STAMPSUP IS NULL in your queries and add STAMPSUP as a secondary key to your original index (as long as most of your queries use KEY for selection; if not, build an independent STAMPSUP index).

Another possibility, which would get you down the SQE path, is to build an index over STAMPSUP. Then create an SQL view projecting all the fields from the base file, with selection like WHERE STAMPSUP IS NULL. This view should be updatable, as it does no joins, grouping, subqueries, or any other update-incompatible function (e.g., SUBSTR and the like). Delete the select/omit LF so you hit SQE.

Test your performance and verify that it's better than what it was. It may not be, depending on what your queries really do with the table/LF/view. Make sure you run your jobs in debug so you get the query optimizer messages about which indexes are really being used, as well as which additional indexes it would like you to build.

Good luck.
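[Editor's note: Elvis's second suggestion might look roughly like this, borrowing MYTABLE and STAMPSUP from Jean-François's example; the index and view names are invented.]

```sql
-- Index over the select/omit column so the optimizer can use it:
CREATE INDEX MYTABLE_IX1 ON MYTABLE (STAMPSUP);

-- Updatable view exposing only active (non-suppressed) rows:
CREATE VIEW MYTABLE_ACTIVE AS
  SELECT * FROM MYTABLE
   WHERE STAMPSUP IS NULL;

-- Because the view has no joins, grouping, or derived columns,
-- UPDATE and DELETE work through it too, e.g.:
-- UPDATE MYTABLE_ACTIVE SET ... WHERE KEY = :key
```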
benderd
Advanced Member
Posts:605
21 Aug 2011 06:48 AM Accepted Answer
Let's be clear here: on single record access, I have NEVER seen a situation where SQL is faster than native I/O. [...] SQL only catches up once you're accessing multiple records at a time.
... I recognized this thread, coming from a link in a newer thread. I have never seen a program running in real life that reads only one record from one file. Looking deeper, we have two main patterns, batch and interactive.

Batch: many programs read one file, produce the next file, read that file, produce the next, and after 5 steps they have the final report. In these steps data is condensed, enriched, and so on. Done the SQL way, you could mostly condense this into one step, reading the originating data and producing the final report, and mostly the SQL way will outperform the RLA way.

Interactive: the program reads the main file in a loop, then keys are taken from the main file and dependent files are read by key. A user transaction typically reads about 50 records or even more from a number of files, which are related to each other by key dependencies. Done the SQL way, you would collapse all these reads into a minimal number of SQL statements, fire those against the database, and again the SQL way will outperform RLA in the majority of cases.

The benchmarks used here for comparison tell lies, as most benchmarks do: not because the numbers are faked, but because they don't reflect real life! SQL has indeed more overhead than RLA, so it can't outperform RLA at RLA's own game, and every benchmark that takes a well-optimized RLA program and translates it one-to-one to SQL will see RLA as the winner. It's the SQL way of writing programs that brings the advantage for SQL. How can this work, if SQL has more overhead? It's the granularity of data access, combined with prefetching and blocking by the database engine. The way to bring speed to SQL is to minimize the number of statements you ask of the database. That's the SQL way!

D*B

PS: Looking at programmer performance, the SQL way outperforms the RLA way by far!
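[Editor's note: benderd's interactive pattern, a header read in a loop with dependent files chained by key, collapsed into one statement, might look like this; all file and field names are invented.]

```rpgle
dcl-s fromDate date;

// Sketch: one join replaces a READE loop over the order header
// with a CHAIN to the customer master for every record.
exec sql declare c1 cursor for
  select h.ordno, h.orddate, c.custname, c.city
    from ORDHDR h
    join CUSTMAST c on c.custno = h.custno
   where h.orddate >= :fromDate
   order by h.ordno;
```

One blocked fetch from this cursor then delivers the whole transaction's worth of data in a single trip to the database engine.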
Access any Database from ADABAS to XBASE with embedded SQL in RPG. http://sourceforge.net/projects/appserver4rpg/
Zenbudda
Veteran Member
Posts:1130
22 Aug 2011 06:52 PM Accepted Answer
It seems you might have followed the link I posted in your thread in the SQL forum. I'm not a Pro VIP member, therefore I cannot get to the article that Scott wrote on this topic. His numbers are very "real" and did not seem biased for or against SQL/native I/O. However, if my memory serves me correctly, he showed (in several tests) that the smaller the result set of records, the more likely native I/O is to outperform SQL. There are lots of caveats here, though (of course there would be). Sometimes SQL is a better solution, sometimes native I/O is a better solution. I personally don't think that "pure SQL" is a bad idea. I'm just reiterating what I think I remember reading in Scott's article. Can any of the Pro VIP members attest to this? I don't remember which year/month it was. I'm certain it was 2008 or later (most likely 2009). It was an article in the actual magazine.
Marcia Dore
Veteran Member
Posts:546
24 Aug 2011 01:07 PM Accepted Answer
To FNUNEZG: Magazine articles are subjective. I think they all should have a lead-in of: this is the type of company, this is why we chose this software/method, etc. And the magazine editors should look for (and not suppress) the "oops, my company chose that software/method (because it was pretty, and the salesman brought presents) and it was a major disaster" stories. You need to do what is correct for your company, and what works best for it. Sometimes database files are large solely because no decision was ever made on how long to keep data and when to purge it.
benderd
Advanced Member
Posts:605
25 Aug 2011 08:19 AM Accepted Answer
Inspired by a question in a German forum, I have a nice example of the SQL way.

Filling a subfile with RLA: read in a loop, restarting the loop at Page Up/Page Down.

The subfile program with SQL: define a DS with DIM(999) (or greater), blocked-fetch from a static cursor, and do everything in memory from then on, including relative and absolute positioning.

D*B
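[Editor's note: a sketch of that SQL subfile load; the 999 element count comes from benderd's description, while the file ORDHDR and its key are invented.]

```rpgle
// Sketch: one blocked fetch fills the array; paging becomes
// array arithmetic instead of file repositioning.
dcl-ds sfl extname('ORDHDR') qualified dim(999);
end-ds;
dcl-s loaded int(10);

exec sql declare c1 cursor for
  select * from ORDHDR order by ORDNO;

exec sql open c1;
exec sql fetch next from c1 for 999 rows into :sfl;
loaded = sqlerrd(3);             // rows actually fetched
exec sql close c1;

// Page Up / Page Down now just pick a slice of sfl(1..loaded);
// relative and absolute positioning are plain index math.
```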
Access any Database from ADABAS to XBASE with embedded SQL in RPG. http://sourceforge.net/projects/appserver4rpg/
davesmith
Veteran Member
Posts:1257
25 Aug 2011 09:26 AM Accepted Answer
Quoting benderd's subfile example: define a DS with DIM(999) (or greater), blocked-fetch from a static cursor, and do everything in memory, including relative and absolute positioning.

Which doesn't help at all when a subfile is used to display my multi-million-row sales order file... Dave
benderd
Advanced Member
Posts:605
25 Aug 2011 09:52 AM Accepted Answer
... strange application design, where a user has to page through millions of records, but anyway ... it's possible to fetch another 1000 records over and over again, until the user is tired. D*B
Access any Database from ADABAS to XBASE with embedded SQL in RPG. http://sourceforge.net/projects/appserver4rpg/
Greg Helton
Advanced Member
Posts:384
25 Aug 2011 10:07 AM Accepted Answer
For single-record access, native I/O is faster, so if your job is going to read one record and then end, go with native. Dan Cruickshank's articles on this site and on IBM's site show when you can achieve performance improvements with SQL. Greg
benderd
Advanced Member
Posts:605
25 Aug 2011 10:32 AM Accepted Answer
... next funny application design: a subfile showing one record???
Access any Database from ADABAS to XBASE with embedded SQL in RPG. http://sourceforge.net/projects/appserver4rpg/