Home » SQL & PL/SQL » SQL & PL/SQL » Optimization of SQL (12.2)
Optimization of SQL [message #684317] |
Sun, 09 May 2021 03:10  |
 |
swasora
Messages: 1 Registered: May 2021
|
Junior Member |
|
|
Hi Team
I have the below code where it processes trillions of records and i would ideally want to break this down to a more simpler query or number of queries that will allow the repetetive condition checks in case statements to be simplified and helps in performance.Any advise will be really helpful
Here if you can see the date - interval condition for a similar let of columns are repeated while the sum of the computations is carried out.What would be the best way to tune the query so that the aaggregations are done in an optimal manner
select a.cust_id,a.cust_dt,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('6') then c.some_value else 0 end) column 1,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('7') then c.some_value else 0 end) column 2,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('8') then c.some_value else 0 end) column 3,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('5') then c.some_value else 0 end) column 4,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('34') then c.some_value else 0 end) column 5,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('56') then c.some_value else 0 end) column 6,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('76') then c.some_value else 0 end) column 7,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('23') then c.some_value else 0 end) column 8,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('54') then c.some_value else 0 end) column 9,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('78') then c.some_value else 0 end) column 10,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('99') then c.some_value else 0 end) column 11,
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('21') then c.some_value else 0 end) column 12
from
table_1 a,table_2 b,table_3 c
join conditions on a,b,c
where a.cust_dt = sysdate -1
group by 1,2;
table a = 24k. Here cust_dt is unique
Table b = 50-60 million cust_dt is repeated
Table C = 80 million (approx) cust_dt is repeated
joins between the 3 tables are on below condition
a.cust_id=b.cust_id
b.some_col1 = c.some_col1
b.cust_dt=c.cust_dt
|
|
|
Re: Optimization of SQL [message #684318 is a reply to message #684317] |
Sun, 09 May 2021 05:21   |
John Watson
Messages: 8804 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Welcome to the forum.
Please read the OraFAQ Forum Guide and How to use [code] tags and make your code easier to read.
You have not said what the problem is, can I assume that it is performance? You would like the query to run faster?
If so, it is unusual that column projections such as the CASE statements to which you refer are a significant issue. More likely the problem is likely to do with the execution plan. You had better show it, together with the real query and some idea of the data volumes. For example, we know that the query returns only one row, because you have an equality predicate on a unique column of A. But I have no idea how may rows come from B and C to be aggregated.
[Updated on: Sun, 09 May 2021 05:22] Report message to a moderator
|
|
|
Re: Optimization of SQL [message #684319 is a reply to message #684318] |
Sun, 09 May 2021 06:55  |
Solomon Yakobson
Messages: 3212 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
Code you posted is not a working code:
sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('6') then c.some_value else 0 end) column 1
Extra ), space is select list aliases: column 1 and RTFM on aggregations and how group by 1,2 would work vs order by 1,2.
Anyway, I'd try something like:
with t1 as (
select /*+ materialize */
a.cust_id a_cust_dt,
a.cust_dt
from table_1 a
where a.cust_dt = sysdate - 1
),
t2 as (
select /*+ materialize */ -- index on table_2(cust_id,cust_dt,some_col1) would increase performance
t1.cust_id,
t1.a_cust_dt,
b.cust_dt b_cust_dt,
b.cust_dt - t1.cust_dt cust_dt_diff
b.some_col1
from t1,
table_2 b
where b.cust_id = t1.cust_id
and t1.a_cust_dt >= b.cust_dt - 60
and t1.a_cust_dt < b.cust_dt
)
select -- index on table_3(cust_dt,some_col1,ea_flg) would increase performance
t2.cust_id,
t2.a_cust_dt cust_dt
sum(
case
when t2.cust_dt_diff <= 14 and c.ea_flg in ('6') then c.some_value
else 0
end
) column1,
sum(
case
when t2.cust_dt_diff <= 14 and c.ea_flg in ('7') then c.some_value
else 0
end
) column2,
sum(
case
when t2.cust_dt_diff <= 14 and c.ea_flg in ('8') then c.some_value
else 0
end
) column3,
sum(
case
when t2.cust_dt_diff <= 7 and c.ea_flg in ('5') then c.some_value
else 0
end
) column4,
sum(
case
when t2.cust_dt_diff <= 7 and c.ea_flg in ('34') then c.some_value
else 0
end
) column5,
sum(
case
when t2.cust_dt_diff <= 7 and c.ea_flg in ('56') then c.some_value
else 0
end
) column6,
sum(
case
when t2.cust_dt_diff <= 30 and c.ea_flg in ('76') then c.some_value
else 0
end
) column7,
sum(
case
when t2.cust_dt_diff <= 30 and c.ea_flg in ('23') then c.some_value
else 0
end
) column8,
sum(
case
when t2.cust_dt_diff <= 30 and c.ea_flg in ('54') then c.some_value
else 0
end
) column9,
sum(
case
when t2.cust_dt_diff <= 60 and c.ea_flg in ('78') then c.some_value
else 0
end
) column10,
sum(
case
when t2.cust_dt_diff <= 60 and c.ea_flg in ('99') then c.some_value
else 0
end
) column11,
sum(
case
when t2.cust_dt_diff <= 60 and c.ea_flg in ('21') then c.some_value
else 0
end
) column12
from t2,
table_3 c
where c.cust_dt = t2.b_cust_dt
and c.some_col1 = t2.some_col1
and c.ea_flg in (5,6,7,8,21,23,34,54,56,76,78,99)
group by t2.cust_id,
t2.a_cust_dt
/
SY.
|
|
|
Goto Forum:
Current Time: Tue Mar 28 00:17:42 CDT 2023
|