RAC – It’s all about the node affinity

Or what happens when you don’t have node affinity.

Here’s an old, short post that was lying around unpublished.

The same SQL executed on the 14th on node 2, and on the 15th on node 1.

DT      SQL ID         Node        Rows   Execs  Elapsed  Cluster Wait Time
15-NOV  7287jpw5vtm8j     1   4,129,697   7,591   32,625             32,175
14-NOV  7287jpw5vtm8j     2  19,703,872  30,909       78                 60

Timings in seconds.

32,625 seconds is just over 9 hours, of which 32,175 seconds is cluster wait time.

Database services and application datasource config have now been reconfigured to the extent that there is node affinity and the data required for the workload on node 1 is completely segregated from the workload on node 2.
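A minimal sketch of the kind of reconfiguration involved, using database services with preferred/available instances (the database, service, and instance names here are hypothetical, and the single-letter srvctl options are the classic pre-12c syntax):

```sh
# Service for the node 1 workload: preferred instance proddb1,
# with proddb2 available only as a failover target
srvctl add service -d PRODDB -s NODE1_SVC -r proddb1 -a proddb2

# Service for the node 2 workload: the mirror image
srvctl add service -d PRODDB -s NODE2_SVC -r proddb2 -a proddb1

srvctl start service -d PRODDB -s NODE1_SVC
srvctl start service -d PRODDB -s NODE2_SVC
```

Each application datasource then connects to its own service name, so its sessions land on that service's preferred node and the two workloads stop fighting over the same blocks across the interconnect.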

If you use RAC as a black box and you have problems with performance, or with the predictability of performance, then look into whether node affinity – or a lack thereof – is your problem.

8 Responses to RAC – It’s all about the node affinity

  1. Way to go Dominic! I always configure node-bound failover services for my clients. (as an Exadata fanboy: why do you think the X3-8 exists ;-))

    • Dom Brooks says:

      I’ll have to read up on the specs of the X3-8 to see what you’re on about.

      Hopefully one day I’ll get some exposure to these machines and not just read about them.

  2. steve says:

I think you can do better. This blog post has no details whatsoever. For instance: what SQL statement was run, what were the wait events, how were the objects defined, which objects caused the waits, and so on. Was PL/SQL involved?

    • Dom Brooks says:

      Oh Steve, you don’t like it? I’m sorry to disappoint you.

      But I think you misunderstand the point of the post.

      You don’t need to see the individual wait details – “These aren’t the droids you’re looking for”

The cluster wait time column tells you all you really need to know about what the wait events were.

This is not a walkthrough of a SQL tuning exercise – where such extra information would be relevant – but a very brief real-world illustration of a common performance theme: the importance of node affinity, or rather how significant a lack of node affinity *might* be.

It was very significant to this particular application, leading to unpredictable performance with no “plan flip” or stats job interference in sight.

      The details of this particular SQL statement are irrelevant as this was a common, across-the-board issue.

      This particular SQL id was one of the more extreme examples from the real world. Straight up multi-table SQL statement, no PL/SQL.

And as such, the details of the individual waits are, for once, unimportant. Suffice it to say that they were all in the cluster wait time class, as is shown above and exposed in v$sql and dba_hist_sqlstat. This high level class rollup can be very useful when comparing recent performance trends.
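      For reference, the sort of day-by-day rollup shown in the post can be pulled from AWR with something like this (a sketch – the SQL id is the one from the post, and the grouping and column choices are illustrative):

      ```sql
      -- Elapsed and cluster wait time per day per instance for one SQL id,
      -- summed across AWR snapshots (times converted from microseconds)
      SELECT TRUNC(ss.begin_interval_time)          AS dt,
             st.sql_id,
             st.instance_number                     AS node,
             SUM(st.rows_processed_delta)           AS rows_processed,
             SUM(st.executions_delta)               AS execs,
             ROUND(SUM(st.elapsed_time_delta)/1e6)  AS elapsed_s,
             ROUND(SUM(st.clwait_delta)/1e6)        AS cluster_wait_s
      FROM   dba_hist_sqlstat st
             JOIN dba_hist_snapshot ss
               ON  ss.snap_id         = st.snap_id
               AND ss.instance_number = st.instance_number
      WHERE  st.sql_id = '7287jpw5vtm8j'
      GROUP  BY TRUNC(ss.begin_interval_time), st.sql_id, st.instance_number
      ORDER  BY dt DESC;
      ```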

For years RAC systems have been marketed as black box systems. It’s very common for organisations to mandate RAC on every single HA database architecture model. But there’s often precious little advice provided to application teams about the pitfalls of RAC, or any caveat that you have to be very wary of continually requesting the same data from all of the nodes.

      I feel that this illustration is at least ten years behind the curve. It’s what others have been warning about for years.

      But this problem is still rife…

  3. So what is your opinion of SCAN assigning a client to any node in the RAC cluster?

    • Dom Brooks says:

      Surely it’s still just a case of configuring database service names appropriately with preferred nodes, is it not? And the application using the appropriate datasource (i.e. service name) depending on the work it’s doing?
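      In other words, SCAN only routes the initial connection; it’s the service name in the connect data that determines which preferred instance the session lands on. A hypothetical tnsnames.ora-style entry (SCAN host and service name invented for illustration):

      ```sql
      -- tnsnames.ora: the client connects via the SCAN listener, but
      -- SERVICE_NAME directs the session to that service's preferred node(s)
      NODE1_SVC =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = prod-scan.example.com)(PORT = 1521))
          (CONNECT_DATA = (SERVICE_NAME = NODE1_SVC))
        )

      -- Equivalent thin-JDBC URL for an application datasource:
      -- jdbc:oracle:thin:@//prod-scan.example.com:1521/NODE1_SVC
      ```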

  4. Aníbal says:

Hi Dom, nice post. I know it’s an old one, but I’m searching for information on node affinity and ways to solve this sort of problem. Could you provide more information and details on how you solved this issue, or point me towards this type of information? Thank you.

    • Dom Brooks says:

      Where possible, configure different database services for different workloads; it depends on your application. In the case above, because of the way the application was designed and implemented, the best I could do was direct the JBoss application server internals to one node, and all of the application workload to another. The problem with that lack of granularity is that you end up with very different utilisation profiles across the nodes, which some IT departments/policies find awkward.
      The application I currently work on could have five different database services for the five different data marts which run on the same database. Then it’s a question of figuring out which services work best on which nodes.
      And you configure your database services to have preferred nodes where available.
      What we try to avoid is having any workload land on any node, particularly where there could be contention across nodes over the same blocks.
