OraStory

Entries categorized as ‘jdbc’

Upscaling your JDBC app using Oracle object type collection

May 1, 2007 · 5 Comments

Regarding Oracle JDBC bulk array inserts, Greg Rahn wrote this week about the performance gains to be had by batching up calls to the database using the array interface.

As an addendum to his excellent points, please find attached a comparison with using an Oracle collection of an Oracle object type – forgive my Java. Using StructDescriptor, STRUCT, ArrayDescriptor and ARRAY structures is unsightly and unintuitive but they can deliver some further performance gains. If only we could wrap this approach up in some user-friendly layer then I reckon we could kick some of these iterative row-by-row ORM tools into touch.

First up, for the baseline, based on Greg’s example, this is what my batch size performance was like inserting 10000 rows into emp on my system:
[Image: jdbc-update-batching-performance.gif]

And, using an Oracle collection of Oracle object types, uploading the 10000 rows in a single INSERT … TABLE … CAST statement took 0.219 seconds (the Java class is in the inline scripts below).

Which compared very favourably.

Inline scripts:

create type to_emp as object
(EMPNO NUMBER(4)
,ENAME VARCHAR2(10)
,JOB VARCHAR2(9)
,MGR NUMBER(4)
,HIREDATE DATE
,SAL NUMBER(7,2)
,COMM NUMBER(7,2)
,DEPTNO NUMBER(2));
/

create type tt_emp as table of to_emp;
/


import java.sql.*;
import java.util.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;
import oracle.sql.STRUCT;
import oracle.sql.StructDescriptor;
import oracle.sql.ArrayDescriptor;
import oracle.sql.ARRAY;

public class bulkInsert {

    public static void main(String[] args) {
        try {

            OracleDataSource ods = new OracleDataSource();
            ods.setURL("jdbc:oracle:oci8:@ora.noldb507");
            ods.setUser("scott");
            ods.setPassword("tiger");
            OracleConnection conn = (OracleConnection) ods.getConnection();
            conn.setAutoCommit(false);

            // Tag the session so the work can be identified in the trace
            short seqnum = 0;
            String[] metric = new String[OracleConnection.END_TO_END_STATE_INDEX_MAX];

            metric[OracleConnection.END_TO_END_ACTION_INDEX] = "insertEmp";
            metric[OracleConnection.END_TO_END_MODULE_INDEX] = "bulkInsert";
            metric[OracleConnection.END_TO_END_CLIENTID_INDEX] = "myClientId";
            conn.setEndToEndMetrics(metric, seqnum);

            DatabaseMetaData meta = conn.getMetaData();

            System.out.println("JDBC driver version is " + meta.getDriverVersion());

            Statement stmt = conn.createStatement();

            stmt.execute("alter session set sql_trace=true");
            stmt.execute("truncate table emp");

            // Number of rows to insert, taken from the command line
            int numberOfEmployees = Integer.parseInt(args[0]);

            STRUCT[] structEmployee = new STRUCT[numberOfEmployees];

            StructDescriptor descEmployee = StructDescriptor.createDescriptor("SCOTT.TO_EMP", conn);

            java.sql.Timestamp now = new java.sql.Timestamp(new java.util.Date().getTime());

            // Build one TO_EMP object per row; values are in attribute order
            for (int i = 0; i < numberOfEmployees; i++) {

                Object[] empValues = {
                    i,          // EMPNO
                    "Name" + i, // ENAME
                    "Job",      // JOB
                    i,          // MGR
                    now,        // HIREDATE
                    10000 + i,  // SAL
                    100 + i,    // COMM
                    10          // DEPTNO
                };

                structEmployee[i] = new STRUCT(descEmployee, conn, empValues);
            }

            // Wrap the objects in a TT_EMP collection
            ArrayDescriptor empArrayDescriptor = ArrayDescriptor.createDescriptor("SCOTT.TT_EMP", conn);

            ARRAY empArray = new ARRAY(empArrayDescriptor, conn, structEmployee);

            OraclePreparedStatement psEmp = (OraclePreparedStatement) conn.prepareStatement(
                "insert /* insEmpBulk */ into emp (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO) select * from table (cast (? as tt_emp))");

            // The whole collection is passed as a single bind
            psEmp.setObject(1, empArray);

            long start1 = System.currentTimeMillis();

            // One execution inserts every row in the collection
            psEmp.execute();

            conn.commit();
            psEmp.close();

            long elapsedTimeMillis1 = System.currentTimeMillis() - start1;
            // Get elapsed time in seconds
            float elapsedTimeSec1 = elapsedTimeMillis1 / 1000F;

            System.out.println("elapsed seconds: " + elapsedTimeSec1);

            conn.close();

        } catch (Exception e) {
            System.err.println("Got an exception! ");
            System.err.println(e.toString());
        }
    }
}
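
As for the "user-friendly layer" wished for at the top of this post, a first step might look something like the sketch below. It is only an idea, not a tested library: the Emp class, the StructRows class and its method names are all hypothetical, and it covers just the driver-independent half, turning application objects into attribute arrays in TO_EMP order. Each resulting row would still have to be passed to the STRUCT constructor as in the class above.

```java
import java.sql.Timestamp;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class StructRows {

    // Hypothetical application-side employee object.
    static final class Emp {
        final int empno;
        final String ename;
        final String job;
        final int mgr;
        final Timestamp hireDate;
        final int sal;
        final int comm;
        final int deptno;

        Emp(int empno, String ename, String job, int mgr,
            Timestamp hireDate, int sal, int comm, int deptno) {
            this.empno = empno;
            this.ename = ename;
            this.job = job;
            this.mgr = mgr;
            this.hireDate = hireDate;
            this.sal = sal;
            this.comm = comm;
            this.deptno = deptno;
        }
    }

    // Attribute order must match TO_EMP:
    // EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
    static Object[] empAttributes(Emp e) {
        return new Object[] {
            e.empno, e.ename, e.job, e.mgr, e.hireDate, e.sal, e.comm, e.deptno
        };
    }

    // Maps each item to the attribute array for one object in the collection.
    static <T> Object[][] toAttributeRows(List<T> items, Function<T, Object[]> mapper) {
        Object[][] rows = new Object[items.size()][];
        for (int i = 0; i < items.size(); i++) {
            rows[i] = mapper.apply(items.get(i));
        }
        return rows;
    }

    public static void main(String[] args) {
        Timestamp now = new Timestamp(System.currentTimeMillis());
        List<Emp> emps = Arrays.asList(
            new Emp(1, "Name1", "Job", 1, now, 10001, 101, 10),
            new Emp(2, "Name2", "Job", 2, now, 10002, 102, 10));
        Object[][] rows = toAttributeRows(emps, StructRows::empAttributes);
        System.out.println(rows.length + " rows of " + rows[0].length + " attributes");
    }
}
```

The mapper is passed in as a function so the same helper could serve other object types; only the attribute ordering is specific to TO_EMP.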

Categories: bulk · jdbc · oracle · oracle collections

ORM

March 22, 2007 · 2 Comments

Aaaaarggghh. 

ORM – Object Relational Mapping – seems to be the bane of a lot of Oracle Developers / DBAs these days.

 I’m not talking about great Oracle features such as Oracle relational views using UDTs (User-Defined Types) and operators such as CAST, TABLE and MULTISET.

More the object/relational persistence layer type things like Hibernate, iBATIS, JDO, SDO, etc.

I get the whole thing about it saving application developers from writing so much code and therefore there is a reduction in testing and errors, and the application gets written that much quicker (in theory), etc.

But often it’s like death by one thousand little generated SQL statements where one bigger one could do the job much more efficiently.  Either that or you find the most horrific query you’ve ever seen has been generated with seemingly hundreds of tables, silly aliases, and hordes of UNIONs and ORs, etc.

Maybe one of the problems has been that the DAO layer has never been particularly cool or trendy and that most application developers have never been into writing SQL – it’s always been a bit of a chore and boring to them. But SQL isn’t difficult. You’ve just got to think in sets.

And I’m sure that this is one of those scenarios where the oft-quoted 80:20 “rule” can be applied – i.e. that an ORM tool might make sense 80% of the time, particularly when SQL experts aren’t available. Trouble is that you can turn that silly rule right around and say that the 20% of code where these ORMs don’t work very well takes up 80% of the effort.

The problem for me with this is the database becomes more like just a bucket. And a bucket which is accessed by inefficient SQL. The database was born to store data and manipulate data using set operations. More often than not with ORM, we find row-by-row operations, we don’t see efficient set operations, we don’t see bulk operations, we see dynamically generated IN lists, we see lots of OR clauses, etc.

And then, more often than not, when there’s a problem with that SQL, there’s nothing that can be done about it.

Going back to the O-R features in Oracle, these have been steadily developed since 8i. I’m a big fan of creating a layer of O-R views to create more appropriate interfaces for the application to the database entities and have used them to great success in a variety of commercial JDBC applications. And it always comes as a surprise to the application developers that it is possible to pass Oracle UDT collections back and forth. Granted, the JDBC is a little fiddly, but it’s a good way of doing efficient set/bulk operations on entities that are a little more natural to the OO world than the base relational entities. It’s a pity that ODP.NET does not yet have the same support for these Oracle Types.

Maybe one day all the application developers (or 80% of them) will be replaced by code generators that work from a few boxes and a few lines put together on something like a Visio diagram. I hope not because I consider myself an application developer/designer starting from the database going out to and including the application.

Alternatively, maybe boxes, memory and disk space get so big and fast that these inefficiencies aren’t a concern anymore (either that or the effects of the inefficiencies are magnified).

Categories: bulk · jdbc · odp.net · oracle · oracle collections · orm

Varying IN lists – part II

March 19, 2007 · 1 Comment

I have blogged before about a solution to varying IN lists being issued by an application.

The problem related to hard parsing – when these varying IN lists used literals rather than binds (resolved by the cursor_sharing parameter) – and to shared pool usage – with every distinct number of binds in an IN list requiring a shared pool entry (or parse if absent therefrom).
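
To make the shared pool point concrete, here is a minimal Java sketch of what an application building these lists ends up sending. The emp query and the class and method names are made up for illustration; the only point is that each distinct number of binds produces a different statement text, and each different text needs its own shared pool entry.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class InListSqlTexts {

    // Builds "... where empno in (?, ?, ?)" for the given number of binds.
    static String sqlForBindCount(int bindCount) {
        if (bindCount < 1) {
            throw new IllegalArgumentException("bindCount must be at least 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(bindCount, "?"));
        return "select ename from emp where empno in (" + placeholders + ")";
    }

    // Counts how many different statement texts a series of list sizes produces.
    static int distinctStatementTexts(int... listSizes) {
        Set<String> texts = new LinkedHashSet<>();
        for (int size : listSizes) {
            texts.add(sqlForBindCount(size));
        }
        return texts.size();
    }

    public static void main(String[] args) {
        System.out.println(sqlForBindCount(3));
        // Five calls, four different list sizes, so four statement texts.
        System.out.println(distinctStatementTexts(1, 2, 3, 3, 50));
    }
}
```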

Having gently guided the application developers down the path of this type of approach, I was feeling pretty pleased with progress. However, two problems have subsequently presented themselves which underline why the initial approach was taken and why the standard solutions approach is not a silver bullet in all circumstances.

One of the applications concerned uses ODP (Oracle Data Provider for .NET). This is one of those situations where my understanding (or lack thereof) leaves me feeling very vulnerable.
ODP sits on top of OCI in a way that is, at this time, beyond my comprehension in terms of driver internals.

The varying IN list “design pattern” uses a single delimited string as a single bind variable. As a result, no matter how many arguments the string contains, there is only a single statement for that base bit of SQL rather than one per distinct number of binds.
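
A rough sketch of the client side of that pattern in Java. The SQL itself is not shown here, so split_to_table below is a hypothetical database function standing in for whatever turns the delimited string back into rows; the class and method names are assumptions too.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DelimitedBind {

    // One statement text with one bind, whatever the number of arguments.
    // split_to_table is a hypothetical function, not a built-in.
    static final String SQL =
        "select ename from emp where empno in "
        + "(select column_value from table(split_to_table(?)))";

    // Joins the arguments into the single comma-delimited bind value.
    static String toDelimited(List<Integer> ids) {
        return ids.stream()
                  .map(String::valueOf)
                  .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(7369, 7499, 7521);
        System.out.println(SQL);
        System.out.println(toDelimited(ids));
    }
}
```

The value returned by toDelimited would be bound once, for example with setString(1, ...) on a prepared statement, so the statement text never changes with the size of the list.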

However, OCI has a limit of 4000 for a VARCHAR2 argument. Therefore, depending on the length of each argument in the delimited string, we are severely limited in the number of arguments that can be passed in.
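
One way an application could at least fail early is to measure the delimited string before binding it. This is a sketch only, assuming the 4000 limit is counted in bytes of the encoded string and that a value of exactly 4000 is still accepted (neither is confirmed here); the class and method names are hypothetical. How many arguments fit depends on how the driver encodes the string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Collections;

final class BindLengthGuard {

    // Assumed limit for a single VARCHAR2 bind, in bytes.
    static final int LIMIT = 4000;

    // Length of the value once encoded in the given character set.
    static int encodedLength(String value, Charset charset) {
        return value.getBytes(charset).length;
    }

    // True if the delimited string stays within the assumed limit.
    static boolean fitsInSingleBind(String delimited, Charset charset) {
        return encodedLength(delimited, charset) <= LIMIT;
    }

    // Builds a comma-delimited string of n identical 8-digit arguments.
    static String sample(int n) {
        return String.join(",", Collections.nCopies(n, "12345678"));
    }

    public static void main(String[] args) {
        System.out.println(fitsInSingleBind(sample(444), StandardCharsets.US_ASCII));
        System.out.println(fitsInSingleBind(sample(445), StandardCharsets.US_ASCII));
        System.out.println(fitsInSingleBind(sample(444), StandardCharsets.UTF_16LE));
    }
}
```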

This brings me back to an old bug raised against the current application. In the old varying IN list approach, the application was limited to something like 1000 arguments in the IN list. There are obvious question marks against the application design of any application which is dynamically building such IN lists – where are the arguments in the IN lists coming from? What is the application object to relational mapping such that this is happening? Can we present alternative structures / entities to avoid these scenarios? etc.

However, by pushing the varying IN list solution, the limit of 1000 arguments is now considered a “nice-to-have”. Because at 8 digits, the maximum number of arguments is now closer to 250!

As the length of the single bind variable reaches the 4000 limit, the error raised is “ORA-01460: unimplemented or unreasonable conversion requested”. We have tried a LONG and CLOB alternative and both are met with the same result (although an erroneous implementation of a workaround can never be ruled out).

One of the frustrations here is that the application developers would love to use a more appropriate interface. The natural choice, IMHO, is to provide some sort of array of the arguments concerned. If we were using JDBC, then using proper Oracle collections (CREATE TYPE…. AS TABLE OF) and the appropriate JDBC structures would be fantastic.

However, for some reason, ODP does not yet support the same functionality. According to the documentation, ODP.NET supports associative arrays/index-by tables/plsql tables but not SQL types. Which raises the question, why has this functionality been absent from ODP.NET for so long?

… more …

Categories: binds · drivers · jdbc · odp.net · oracle · oracle collections · varying in lists