“Simplified” development

Amongst the links that Beth (aka datageekgal) highlights this week is a forthcoming framework from Microsoft to simplify development.

From the article announcing the framework:

The goal of the ADO.Net Entity Framework is to eliminate the impedance mismatch between data models and languages, saving developers from having to deal with these. An example of such a mismatch is objects and relational stores. Developers might write an application that manipulates a CRM system with data for customers stored in 12 different tables…

With the Entity Framework, you can automate the process essentially of bringing all that data together and presenting it to the developer as a single entity so they can interact with it at a higher level of abstraction…

Sounds like Hibernate, iBATIS and other such ORM tools to me.

I’ve got two words for you – Stored Procs and/or Object Relational Views (well, two options, seven words).

That’s an abstraction that works every time.

You just need someone who knows what they’re doing on the database – and I know that’s not trendy in IT at the moment ;-)

Mind you, in the past year I’ve worked on my first two .Net projects and, compared to JDBC, the lack of support for Oracle object and collection types in the various data provider options, including ODP, is a frustration. While there is support for associative arrays, Oracle permanent object types (CREATE TYPE …. AS OBJECT / AS TABLE OF ….) are not…. yet.

But I believe that that support is coming in the 11g version of ODP.NET. I’ll try to dig out the link.

ORM

Aaaaarggghh. 

ORM – Object Relational Mapping – seems to be the bane of a lot of Oracle Developers / DBAs these days.

 I’m not talking about great Oracle features such as Oracle relational views using UDTs (User-Defined Types) and operators such as CAST, TABLE and MULTISET.

More the object/relational persistence layer type things like Hibernate, iBATIS, JDO, SDO, etc.

I get the whole thing about it saving application developers from writing so much code and therefore there is a reduction in testing and errors, and the application gets written that much quicker (in theory), etc.

But often it’s like death by one thousand little generated SQL statements where one bigger one could do the job much more efficiently. Either that or you find the most horrific query you’ve ever seen has been generated with seemingly hundreds of tables, silly aliases, and hordes of UNIONs and ORs, etc.

Maybe one of the problems has been that the DAO layer has never been particularly cool or trendy and that most application developers have never been into writing SQL – it’s always been a bit of a chore and boring to them. But SQL isn’t difficult. You’ve just got to think in sets.

And I’m sure that this is one of those scenarios where the oft-quoted 80:20 “rule” can be applied – i.e. that an ORM tool might make sense 80% of the time, particularly when SQL experts aren’t available. Trouble is that you can turn that silly rule right around and say that the 20% of code where these ORMs don’t work very well takes up 80% of the effort.

The problem for me with this is the database becomes more like just a bucket. And a bucket which is accessed by inefficient SQL. The database was born to store data and manipulate data using set operations. More often than not with ORM, we find row-by-row operations, we don’t see efficient set operations, we don’t see bulk operations, we see dynamically generated IN lists, we see lots of OR clauses, etc.

And then, more often than not, when there’s a problem with that SQL, there’s nothing that can be done about it.
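To make the row-by-row versus set-based contrast concrete, here is a minimal sketch. The invoices table and its columns are hypothetical, made up purely for illustration; the first block is only an assumption about the shape of statement traffic a persistence layer can produce, not output captured from any particular ORM.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE invoices (
  invoice_id NUMBER PRIMARY KEY,
  status     VARCHAR2(10)
);

-- Row-by-row: one UPDATE is executed for every row fetched by the cursor.
BEGIN
  FOR r IN (SELECT invoice_id FROM invoices WHERE status = 'NEW')
  LOOP
    UPDATE invoices
    SET    status = 'QUEUED'
    WHERE  invoice_id = r.invoice_id;
  END LOOP;
END;
/

-- Set-based: the same change expressed as a single statement.
UPDATE invoices
SET    status = 'QUEUED'
WHERE  status = 'NEW';
```

Both forms leave the table in the same state; the second leaves the database free to do the work as one set operation.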

Going back to the O-R features in Oracle, these have been steadily developed since 8i. I’m a big fan of creating a layer of O-R views to create more appropriate interfaces for the application to the database entities and have used them to great success in a variety of commercial JDBC applications. And it always comes as a surprise to the application developers that it is possible to pass Oracle UDT collections back and forward. Granted, the JDBC is a little fiddly, but it’s a good way of doing efficient set/bulk operations on entities that are a little more natural to the OO world than the base relational entities. It’s a pity that ODP.NET does not yet have the same support for these Oracle Types.
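As a rough idea of what such an O-R view can look like, here is a minimal sketch using CAST and MULTISET. Every name in it (customers, orders, t_order, tt_order, v_customer_orders) is hypothetical and chosen only for illustration; a real layer would be shaped around the entities the application actually needs.

```sql
-- Hypothetical base tables.
CREATE TABLE customers (
  customer_id   NUMBER PRIMARY KEY,
  customer_name VARCHAR2(100)
);

CREATE TABLE orders (
  order_id    NUMBER PRIMARY KEY,
  customer_id NUMBER REFERENCES customers,
  order_date  DATE,
  amount      NUMBER
);

-- SQL object type and collection type describing the nested entity.
CREATE OR REPLACE TYPE t_order AS OBJECT (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER
);
/

CREATE OR REPLACE TYPE tt_order AS TABLE OF t_order;
/

-- The view presents each customer with its orders as a single collection column.
CREATE OR REPLACE VIEW v_customer_orders
AS
SELECT c.customer_id,
       c.customer_name,
       CAST(MULTISET(SELECT t_order(o.order_id, o.order_date, o.amount)
                     FROM   orders o
                     WHERE  o.customer_id = c.customer_id) AS tt_order) AS order_list
FROM   customers c;

-- Un-nesting the collection again when a flat result is wanted.
SELECT v.customer_name, o.order_id, o.amount
FROM   v_customer_orders v,
       TABLE(v.order_list) o
WHERE  v.customer_id = :customer_id;
```

The application sees one customer entity with a collection attached, while the joins stay in the database where they can be tuned.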

Maybe one day all the application developers (or 80% of them) will be replaced by code generators that work from a few boxes and a few lines put together on something like a Visio diagram. I hope not because I consider myself an application developer/designer starting from the database going out to and including the application.

Alternatively, maybe boxes, memory and disk space get so big and fast that these inefficiencies aren’t a concern anymore (either that or the effects of the inefficiencies are magnified).

Varying IN lists – last bit

So, just a final say on this series of blog entries.

The first is that often you start down one workaround and then you find something that doesn’t quite work so you work around that and before you know it you’re a long way from where you’d like to be.

What started as a simple desire to reduce the impact of dynamically generated varying IN lists (some not using bind variables) from a couple of applications was severely complicated by ODP.NET’s lack of support / interoperability with Oracle UDTs and the OCI limit of 4000 characters in a VARCHAR2 bind.

As a result, the current choice is between the original situation – lots of similar SQL – and inserting the values into a GLOBAL TEMPORARY TABLE and then using a query which has a WHERE … IN subquery selecting from that GTT. More on that down below…

When I wrote previously, also under consideration was using a global packaged variable. However this was eliminated as a possibility due to at least three frustrating issues. First up was an idea to use a function that would convert an associative array (supported by ODP.NET) to a similar UDT (just using a FOR LOOP that puts the values from one into the other). Using a bit of PL/SQL to demonstrate (8i hence the lack of SYS_REFCURSOR and the need to declare a REF CURSOR in a package):


CREATE OR REPLACE TYPE tt_number AS TABLE OF NUMBER;
/


CREATE OR REPLACE PACKAGE pkg_types
AS
--
TYPE refcursor IS REF CURSOR;
TYPE aa_number IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
--
END pkg_types;
/


CREATE OR REPLACE FUNCTION f_tt_convert_error_demo (
i_associative_array IN pkg_types.aa_number
)
RETURN tt_number
AS
--
vt_number tt_number := tt_number();
--
BEGIN
--
FOR i IN i_associative_array.FIRST .. i_associative_array.LAST
LOOP
--
vt_number.EXTEND();
vt_number(i) := i_associative_array(i);
--
END LOOP;
--
RETURN vt_number;
--
END;
/


declare
v_number number;
v_cursor pkg_types.refcursor;
va_number pkg_types.aa_number;
begin
for i in 1 .. 10
loop
va_number(i) := i;
end loop;
open v_cursor for
select *
from table(cast(f_tt_convert_error_demo(va_number) as tt_number));
loop
fetch v_cursor into v_number;
exit when v_cursor%NOTFOUND;
dbms_output.put_line(v_number);
end loop;
close v_cursor;
end;

I should have known, but this cannot work due to an error “PLS-00425: in SQL, function argument and return types must be SQL type”.

Incidentally, if you try dynamic SQL, you’ll get a “PLS-00457: expressions have to be of SQL types” instead.

Secondly, just because you can do something in SQL doesn’t mean that the same statement will work in a PL/SQL routine (and vice versa from memory – it tends to be features that are relatively new). Following on from SQL above:


CREATE OR REPLACE FUNCTION f_tt_error_demo
RETURN tt_number
AS
--
vt_number tt_number := tt_number();
--
BEGIN
--
FOR i IN 1 .. 10
LOOP
--
vt_number.EXTEND();
vt_number(i) := i;
--
END LOOP;
--
RETURN vt_number;
--
END;
/


SQL> select value(t)
2 from table (cast (f_tt_error_demo as tt_number)) t
3 /
VALUE(T)
----------
1
2
3
4
5
6
7
8
9
10


SQL> declare
2 begin
3 for a in (select VALUE(t) num
4 from table (cast (f_tt_error_demo as tt_number)) t)
5 loop
6 dbms_output.put_line(a.num);
7 end loop;
8 end;
9 /
declare
*
ERROR at line 1:
ORA-06550: line 0, column 0:
PLS-00382: expression is of wrong type
ORA-06550: line 3, column 13:
PL/SQL: SQL Statement ignored
ORA-06550: line 6, column 28:
PLS-00364: loop index variable 'A' use is invalid
ORA-06550: line 6, column 7:
PL/SQL: Statement ignored

but, with dynamic sql:


SQL> declare
2 v_number number;
3 v_cursor pkg_types.refcursor;
4 begin
5 open v_cursor for
6 ' select value(t) '||
7 ' from table (cast (f_tt_error_demo as tt_number)) t ';
8 loop
9 fetch v_cursor into v_number;
10 exit when v_cursor%NOTFOUND;
11 dbms_output.put_line(v_number);
12 end loop;
13 close v_cursor;
14 end;
15 /
1
2
3
4
5
6
7
8
9
10
PL/SQL procedure successfully completed.

Thirdly, the above are distilled code examples of further errors that I came across while trying to put together a bit of example code for this blog entry to show the real problem that I had.

I actually wanted to use the function in a subquery like this:


create table error_demo
as
select rownum col1
from all_objects
where rownum < 11
/

1 select *
2 from error_demo
3 where col1 IN (select value(t) num
4* from table (cast (f_tt_error_demo as tt_number)) t)
SQL> /
COL1
----------
1
2
3
4
5
6
7
8
9
10
10 rows selected.


SQL> declare
2 v_number number;
3 v_cursor pkg_types.refcursor;
4 begin
5 open v_cursor for
6 select col1
7 from error_demo
8 where col1 IN (select *
9 from table (cast (f_tt_error_demo as tt_number)) t);
10 loop
11 fetch v_cursor into v_number;
12 exit when v_cursor%NOTFOUND;
13 dbms_output.put_line(v_number);
14 end loop;
15 close v_cursor;
16 end;
17 /
from table (cast (f_tt_error_demo as tt_number)) t);
*
ERROR at line 9:
ORA-06550: line 9, column 37:
PLS-00220: simple name required in this context
ORA-06550: line 6, column 4:
PL/SQL: SQL Statement ignored

“PLS-00220: simple name required in this context” – that’s a new one for me, first time I’ve had that error, I think.

But again, with dynamic sql:


SQL> ed
Wrote file afiedt.buf
1 declare
2 v_number number;
3 v_cursor pkg_types.refcursor;
4 begin
5 open v_cursor for
6 ' select col1 '||
7 ' from error_demo '||
8 ' where col1 IN (select * '||
9 ' from table (cast (f_tt_error_demo as tt_number)) t)';
10 loop
11 fetch v_cursor into v_number;
12 exit when v_cursor%NOTFOUND;
13 dbms_output.put_line(v_number);
14 end loop;
15 close v_cursor;
16* end;
SQL> /
1
2
3
4
5
6
7
8
9
10
PL/SQL procedure successfully completed.

So, as I mentioned way up above, these issues really have reduced it down to a choice between inserting the values in a GTT and then using that GTT in an IN subquery SELECT, or just to revert back to the IN lists as they were.

On the GTT front, we can bulk insert the values for the IN list from the application into the GTT using a simple procedure which accepts an associative array (like pkg_types.aa_number above).
One important note is that the GTT has to be created as “ON COMMIT PRESERVE ROWS”, otherwise the application will raise error “ORA-08103: object no longer exists”. This rings a bell from previous experiences using a JDBC app.

However this then requires another sort of workaround to delete the rows in the GTT as the first step in the procedure which bulk inserts the rows….
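Put together, the GTT approach might look something like the following minimal sketch. The names gtt_in_list and p_load_in_list are hypothetical; pkg_types.aa_number and error_demo are the ones defined earlier in this entry. It assumes the associative array arrives densely indexed from 1.

```sql
-- Hypothetical GTT holding the IN list values for the current session.
CREATE GLOBAL TEMPORARY TABLE gtt_in_list (
  val NUMBER
) ON COMMIT PRESERVE ROWS;

-- Hypothetical loader: clears out the previous list, then bulk inserts the new one.
CREATE OR REPLACE PROCEDURE p_load_in_list (
  i_values IN pkg_types.aa_number
)
AS
BEGIN
  -- Rows survive a commit, so remove whatever an earlier call left behind.
  DELETE FROM gtt_in_list;
  -- Single bulk insert; assumes the array is indexed 1 .. COUNT with no gaps.
  FORALL i IN 1 .. i_values.COUNT
    INSERT INTO gtt_in_list (val) VALUES (i_values(i));
END p_load_in_list;
/

-- The application query then keeps one fixed text, whatever the list length.
SELECT col1
FROM   error_demo
WHERE  col1 IN (SELECT val FROM gtt_in_list);
```

The application would call p_load_in_list with the array bind immediately before running the query, in the same session.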

So, this leaves it all in a funny place where you’ve got to weigh up the evils of all the IN lists with the inefficiency, phaffing around and the general “bad smell” of the GTT approach.

It also leaves two open questions. Firstly, should we have another look at a higher level at the application design to reconsider why these IN lists are being constructed / are required in the first place? And secondly, when will ODP.NET provide the same support for Oracle UDTs as JDBC does?

Varying IN lists – part III

… continued …

OK, so the varying in list solution is not working for binds where the length > 4000 characters and ODP.NET does not seem to support Oracle collections to the same extent as JDBC (although some results on a Google search indicate that this may have been present in a Beta release but withdrawn). What are the alternatives?

We can return to the previous situation with many similar SQL statements with different length IN lists stuffing up the shared pool.

We could use a routine just before the select in the application to insert the previous IN list argument values into a GLOBAL TEMPORARY TABLE and then change the SQL so that the IN list subquery selects from that. It’s not nice, it’s not the ideal solution but it’s a possibility. An alternative to the GTT is to use a global variable in a package. Not dissimilar to the GTT approach. Both have a nasty “smell” about them.

In both cases, the general approach is that the application would first have to insert values into the GLOBAL TEMPORARY TABLE or into the global packaged variable (either individual values or in the latter, using the ODP.NET support for associative arrays). Then we would write a query that would have a subquery to either select from the GTT or call a function which would change the associative array to a SQL type.

It sounds a bit daft whichever the method.

(Incidentally I lose track of what I’m meant to call things. Associative arrays used to be called index-by tables which used to be called PLSQL tables. These are part of the Oracle collections framework which includes SQL User Defined Types (UDTs)? The latter are what we need to use in SQL using the TABLE and CAST operators and those are what ODP.NET does not support.)

An initial runstats examination shows that the packaged variable approach is more scalable. This is mainly because a packaged variable is a memory structure whereas inserts into a GTT generate some redo and so runstats picks up on that. In terms of clock speed, the packaged variable approach seems slightly faster.

Hopefully, I will expand this a bit later with proper example code.

Varying IN lists – part II

I have blogged before about a solution to varying in lists being issued by an application.

The problem related to hard parsing – when these varying in lists used literals rather than binds (resolved by the cursor_sharing parameter) – and to shared pool usage – with every distinct number of binds in an in list requiring a shared pool entry (or parse if absent therefrom).

Having gently guided the application developers down the path of this type of approach, I was feeling pretty pleased with progress. However, two problems have subsequently presented themselves which underline why the initial approach was taken and why the standard solutions approach is not a silver bullet in all circumstances.

One of the applications concerned uses ODP (Oracle Data Provider for .NET). This is one of those situations where my understanding (or lack thereof) leaves me feeling very vulnerable.
ODP sits on top of OCI in a way that is, at this time, beyond my comprehension in terms of driver internals.

The varying in list “design pattern” uses a single delimited string as a single bind variable. As a result, no matter how many values the delimited string holds, there is only a single statement in the shared pool for that base bit of SQL.
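For anyone who has not seen the pattern, a minimal sketch follows. The function name f_str_to_tt_number is hypothetical and this is only one common way of writing such a splitter, not necessarily the code used in the application; tt_number and error_demo are the type and demo table used in the other entries in this series.

```sql
-- Collection type from this series (skip if it already exists in the schema).
CREATE OR REPLACE TYPE tt_number AS TABLE OF NUMBER;
/

-- Hypothetical splitter: turns '1,2,3' into a tt_number collection.
CREATE OR REPLACE FUNCTION f_str_to_tt_number (
  i_list      IN VARCHAR2,
  i_delimiter IN VARCHAR2 DEFAULT ','
)
RETURN tt_number
AS
  vt_number tt_number := tt_number();
  v_rest    VARCHAR2(32767) := i_list;
  v_pos     PLS_INTEGER;
BEGIN
  -- Peel one value off the front of the string on each pass.
  WHILE v_rest IS NOT NULL
  LOOP
    v_pos := INSTR(v_rest, i_delimiter);
    vt_number.EXTEND;
    IF v_pos = 0 THEN
      -- No delimiter left: the remainder is the last value.
      vt_number(vt_number.COUNT) := TO_NUMBER(TRIM(v_rest));
      v_rest := NULL;
    ELSE
      vt_number(vt_number.COUNT) := TO_NUMBER(TRIM(SUBSTR(v_rest, 1, v_pos - 1)));
      v_rest := SUBSTR(v_rest, v_pos + LENGTH(i_delimiter));
    END IF;
  END LOOP;
  RETURN vt_number;
END f_str_to_tt_number;
/

-- One statement text and one bind, however many values are in the string.
SELECT col1
FROM   error_demo
WHERE  col1 IN (SELECT *
                FROM   TABLE(CAST(f_str_to_tt_number(:in_list) AS tt_number)));
```

Because the function takes and returns SQL types only, it can be called directly from SQL.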

However, OCI has a limit of 4000 characters for a VARCHAR2 argument. Therefore, depending on the length of each argument in the delimited string, we are severely limited in the number of arguments that can be passed in.

This brings me back to an old bug raised against the current application. In the old varying IN list approach, the application was limited to something like 1000 arguments in the IN list. There are obvious question marks against the application design of any application which is dynamically building such IN lists – where are the arguments in the IN lists coming from? What is the application object to relational mapping such that this is happening? Can we present alternative structures / entities to avoid these scenarios? etc.

However, by pushing the varying IN list solution, the limit of 1000 arguments is now considered a “nice-to-have”. Because at 8 digits, the maximum number of arguments is now closer to 250!

As the length of the single bind variable reaches the 4000 limit, the error raised is “ORA-01460: unimplemented or unreasonable conversion requested”. We have tried a LONG and CLOB alternative and both are met with the same result (although an erroneous implementation of a workaround can never be ruled out).

One of the frustrations here is that the application developers would love to use a more appropriate interface. The natural choice, IMHO, is to provide some sort of array of the arguments concerned. If we were using JDBC, then using proper Oracle collections (CREATE TYPE…. AS TABLE OF) and the appropriate JDBC structures would be fantastic.

However, for some reason, ODP does not yet support the same functionality. According to the documentation, ODP.NET supports associative arrays/index-by tables/plsql tables but not SQL types. Which raises the question, why has this functionality been absent from ODP.NET for so long?

… more …
