The CPU number between 2 nodes (Linux RHEL 5, Oracle RAC 11gr2)
The CPU number between 2 nodes [message #626824] Mon, 03 November 2014 20:07
trantuananh24hg
Messages: 744
Registered: January 2007
Location: Ha Noi, Viet Nam
Senior Member
Hi all,
Good day to you.

I have run into an awkward situation on our new project, an Oracle Enterprise Grid Infrastructure deployment. Exactly 2 nodes, both Hitachi blade servers, were planned to run Oracle RAC 11gR2 on RHEL 5. One of the servers has 2 physical CPUs and 16 GB of memory.

After I had completely finished configuring and installing RAC, one of the servers suddenly had a hardware error, and the Hitachi team replaced it with a new one that has 4 CPUs instead of 2. Now I cannot open the cluster database because of the difference between the nodes.

I have no experience with this situation, so may I ask one question:

- Can I reduce the CPU count (virtual cores) seen by the database through the pfile or spfile, and then open the cluster database again? For example, set 32 instead of 64?

Thank you!

[Updated on: Mon, 03 November 2014 20:18]


Re: The CPU number between 2 nodes [message #626826 is a reply to message #626824] Mon, 03 November 2014 23:46
trantuananh24hg
I have just fixed the problem by reducing the CPU count. The smaller core count of the two nodes is 24, so I decreased the cpu_count parameter to 24 and restarted the database, and it is OK now.

However, I wonder about some components such as parallel query. As the documentation says, if cpu_count is set to a value greater than 0, dynamic detection is disabled and the database is forced to use the value that was set. But how does Oracle know which cores, belonging to which physical CPU, to use? My guess is that the virtual cores depend on the physical CPUs they are paired with.

Thank you!

[Updated on: Tue, 04 November 2014 00:32]


Re: The CPU number between 2 nodes [message #626834 is a reply to message #626826] Tue, 04 November 2014 01:11
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
I am not aware of any issues caused by different CPU count between nodes in a cluster. Are you certain that this was the problem?

What is definitely a problem is that the hardware upgrade has increased your licensing requirement, from 4 CPUs to 6 (multiplied by the core factor). This may be hundreds of thousands of dollars.
Re: The CPU number between 2 nodes [message #626835 is a reply to message #626824] Tue, 04 November 2014 01:33
trantuananh24hg
Thank you, John.

Please forgive my bad English. I was not sure which word to use for this situation; sometimes I call it a problem, other times a trouble. So please correct me.

As in my earlier post, I wonder how Oracle runs parallel processing on this server now that the CPU_COUNT parameter has been reduced. How does Oracle know exactly which physical CPU's core runs each thread? Let me take an example:

- Force parallel SQL execution on a statement, and assume 2 threads. Say one thread runs on core id 20, which belongs to physical CPU number 2. What about the remaining thread? Will it run on core id 1, which belongs to physical CPU number 0, or on core id 23, which belongs to physical CPU number 4?
- I guess that, because of the server's policy, CPUs are paired as 0-1 and 2-3 rather than 0-2 and 1-3. Am I right?

And regarding your important comment, the main problem may be the hundreds of thousands of dollars that appear after that :D

May I ask one more question: does Oracle charge based on the physical CPUs (or cores) installed in the server, or on those the database actually uses?

Thank you!

[Updated on: Tue, 04 November 2014 01:37]


Re: The CPU number between 2 nodes [message #626839 is a reply to message #626835] Tue, 04 November 2014 01:55
John Watson
I think you have fixed your problem, and it was not caused by different CPU count?

If you now have a question about parallel processing, please open another topic.

But what is more important is that it sounds as though you are operating illegally. As you want to use parallel processing, you must buy Enterprise Edition which is US$47.5k plus US$23k for RAC, multiplied by the core factor which is usually 50%. So if you have a total of 24 cores, you need to pay US$846k. Better talk to an Oracle sales droid quickly. Or a partner (my boss will be happy to give you a quote :) ) might be more inventive in keeping the price down.
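
As a worked check of the figure above, here is a minimal shell sketch of the arithmetic. It assumes the prices quoted in this post (US$47.5k for Enterprise Edition and US$23k for RAC, per licensed processor) and a core factor of 50%; the function name `license_cost` is made up for illustration, and the prices are the poster's 2014 figures, not a statement of current pricing.

```shell
#!/bin/sh
# Rough licence estimate from the figures quoted in the post.
# license_cost is a hypothetical helper; prices are the poster's numbers.
license_cost() {
    cores=$1            # total cores across the cluster
    ee_price=47500      # Enterprise Edition, US$ (from the post)
    rac_price=23000     # RAC option, US$ (from the post)
    # The 50% core factor is written as a division by 2 so the shell
    # can stay in integer arithmetic (an odd core count would truncate).
    echo $(( (ee_price + rac_price) * cores / 2 ))
}
```

Calling `license_cost 24` prints 846000, which matches the US$846k quoted for 24 cores.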
Re: The CPU number between 2 nodes [message #626847 is a reply to message #626839] Tue, 04 November 2014 02:11
trantuananh24hg
Hi, John!

As in my earlier post, I really have no experience with the CPU count differing between 2 nodes.
But here is what I actually observed:

- I cannot start the database with srvctl, nor start the instances one by one on each node; I can start only one node.
- When the Hitachi staff removed 2 physical CPUs, I could start the database, also instance by instance.
- When the Hitachi staff put the 4 physical CPUs back, I could not start the database, not even instance by instance; only one instance would start.
- So I decided to grep the number of cores on the 2nd server and found 24. On the 1st node there are 64. Following the ORA- error (I will post it as soon as possible, because I am on my way back to my office), the Metalink note recommends keeping the CPU counts equal rather than different. After adjusting the CPU_COUNT parameter to 24, I started the database normally.

In my position as a DBA, not a salesperson or manager, I cannot say whether we are operating illegally or not. Our RAC is Enterprise Edition right now, and I can confirm that the many systems I have deployed, installed, or configured, including servers, applications, and databases, are legally paid for.

Dear John, again, it seems my bad English kept you from understanding my question. Please forgive it!

Thanks so much!
Re: The CPU number between 2 nodes [message #626849 is a reply to message #626847] Tue, 04 November 2014 02:31
John Watson
Your tests do suggest that the physical CPUs make a difference, but that is really not likely. There must be something else going on. You say "I can not start database", you need to show what happens when you try to do this, with SQL*Plus and with srvctl.

Use copy/paste, and enclose it in [code] tags.
Re: The CPU number between 2 nodes [message #626871 is a reply to message #626849] Tue, 04 November 2014 04:27
trantuananh24hg
Hi John,

Please take a careful look at my switch test below.

First, my Oracle RAC opens normally with the CPU_COUNT parameter at 24, and I list the total number of processors on each server again.

Node 1:
Find the CPU_COUNT parameter
sys@ADCDB> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     24
sys@ADCDB> ! hostname
app1-pub


Find the total number of processors
sys@ADCDB> ! cat /proc/cpuinfo | grep 'processor'
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
processor       : 8
processor       : 9
processor       : 10
processor       : 11
processor       : 12
processor       : 13
processor       : 14
processor       : 15
processor       : 16
processor       : 17
processor       : 18
processor       : 19
processor       : 20
processor       : 21
processor       : 22
processor       : 23
processor       : 24
processor       : 25
processor       : 26
processor       : 27
processor       : 28
processor       : 29
processor       : 30
processor       : 31
processor       : 32
processor       : 33
processor       : 34
processor       : 35
processor       : 36
processor       : 37
processor       : 38
processor       : 39
processor       : 40
processor       : 41
processor       : 42
processor       : 43
processor       : 44
processor       : 45
processor       : 46
processor       : 47
processor       : 48
processor       : 49
processor       : 50
processor       : 51
processor       : 52
processor       : 53
processor       : 54
processor       : 55
processor       : 56
processor       : 57
processor       : 58
processor       : 59
processor       : 60
processor       : 61
processor       : 62
processor       : 63


Node 2:
Find the total number of processors
[root@app2-pub ~]# hostname
app2-pub
[root@app2-pub ~]# cat /proc/cpuinfo | grep 'processor'
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
processor       : 8
processor       : 9
processor       : 10
processor       : 11
processor       : 12
processor       : 13
processor       : 14
processor       : 15
processor       : 16
processor       : 17
processor       : 18
processor       : 19
processor       : 20
processor       : 21
processor       : 22
processor       : 23


Instance, server and status
Node 1:
sys@ADCDB> /

HOST_NAME    INSTANCE_NAME    INSTANCE_NUMBER STATUS
------------ ---------------- --------------- ------------
app1-pub     adcdb1                         1 OPEN
app2-pub     adcdb2                         2 OPEN


Then I change CPU_COUNT on node 1 from 24 to 64
Node 1:
sys@ADCDB> alter system set cpu_count=64 scope=spfile sid='adcdb1';

System altered.


After that, I restart the instance and hit the error:
[root@app1-pub ~]# su - grid
[grid@app1-pub ~]$ srvctl stop instance -d adcdb -i adcdb1
[grid@app1-pub ~]$ srvctl start instance -d adcdb -i adcdb2
PRCC-1015 : adcdb was already running on app2-pub
[grid@app1-pub ~]$ srvctl start instance -d adcdb -i adcdb1
PRCR-1013 : Failed to start resource ora.adcdb.db
PRCR-1064 : Failed to start resource ora.adcdb.db on node app1-pub
ORA-01105: mount is incompatible with mounts by other instances
ORA-01606: parameter not identical to that of another mounted instance

CRS-2674: Start of 'ora.adcdb.db' on 'app1-pub' failed


Please look at Doc ID 1561966.1.

Now I change CPU_COUNT back

idle> ! vi $HOME/initadcdb.ora

idle> create spfile from pfile='$HOME/initadcdb.ora'
  2  /

File created.

idle> exit
Disconnected
[oracle@app1-pub ~]$ su - grid
Password:
[grid@app1-pub ~]$ srvctl start instance -d adcdb -i adcdb1
[grid@app1-pub ~]$
[grid@app1-pub ~]$ exit
logout
[oracle@app1-pub ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Tue Nov 4 17:26:02 2014

Copyright (c) 1982, 2009, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

sys@ADCDB> col host_name format a12
sys@ADCDB> select host_name, instance_name, instance_number, status
  2  from gv$instance;

HOST_NAME    INSTANCE_NAME    INSTANCE_NUMBER STATUS
------------ ---------------- --------------- ------------
app1-pub     adcdb1                         1 OPEN
app2-pub     adcdb2                         2 OPEN

sys@ADCDB> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     24
sys@ADCDB> ! hostname
app1-pub
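
The workaround shown in this session, setting cpu_count to the smaller of the two nodes' processor counts, can be sketched as a small shell helper that prints the statement to run. This is only an illustration: the function names `min_count` and `cpu_count_sql` are made up, and using `sid='*'` to apply the value to every instance through a shared spfile is an assumption about the setup (the session above edited a pfile and recreated the spfile instead).

```shell
#!/bin/sh
# Hypothetical helpers: pick the smallest per-node processor count
# and print an ALTER SYSTEM statement that pins cpu_count to it.

# Smallest of the integer arguments, e.g. "min_count 64 24" prints 24.
min_count() {
    min=$1
    for n in "$@"; do
        if [ "$n" -lt "$min" ]; then
            min=$n
        fi
    done
    echo "$min"
}

# Arguments: logical processor counts, one per node.
# sid='*' is an assumption: it targets all instances in a shared spfile.
cpu_count_sql() {
    target=$(min_count "$@")
    printf "alter system set cpu_count=%s scope=spfile sid='*';\n" "$target"
}
```

For the two nodes here, `cpu_count_sql 64 24` prints the statement with cpu_count=24. The statement would be run from an instance that is up, followed by an instance restart as in the session above, and `show parameter cpu_count` on each node confirms the result.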

Re: The CPU number between 2 nodes [message #626872 is a reply to message #626871] Tue, 04 November 2014 04:39
John Watson
I think that is a known bug in 11.2.0.1, see MOS doc id 1561966.1
Re: The CPU number between 2 nodes [message #626873 is a reply to message #626872] Tue, 04 November 2014 04:55
trantuananh24hg
Hi John,

I agree that the bug will be fixed in 11.2.0.2. But I still do not understand on what basis Oracle assigns threads to cores (processors).

Thank you!
Re: The CPU number between 2 nodes [message #626875 is a reply to message #626873] Tue, 04 November 2014 05:21
John Watson
I don't know what you mean. If you are talking about parallel execution, open a fresh topic.