Home » Server Options » Text & interMedia » Fuzzy search
Fuzzy search [message #279164] Wed, 07 November 2007 03:16 Go to next message
mpls51
Messages: 2
Registered: November 2007
Junior Member
I have the following two tables

smc_book table:
smc_id smc_title smc_publisher smc_release_date
------------------------------------------------------------------------------------------------------------------------------------- --------
1234 "Beautiful Wonder" "Wrox Books" 1/1/1999
2356 "Master PL/SQL" "OReilly Media" 6/5/2004
5432 "Harry Potter and Goblet of Fire" "Simon & Shuster" 2/4/2001

arc_book table:
arc_id arc_title arc_publisher arc_release_date
------------------------------------------------------------------------------------------------------------------------------------- --------
1245 "Wonder, Beautiful" "Wrox" 1/1/1999
1244 "The PL-SQL, Master (RE 1983)" "Media, OReilly" 6/5/2004
4352 "Golbet of Fire, Harry Potter" "Simon and Shuster" 2/4/2001

I want to match up records between the two tables that have similar titles. So, the following would be matches

smc_id arc_id
-----------------------
1234 1245
2356 1244
5432 4352

i.e. The title "Beautiful Wonder" in smc_book table would match up with "Wonder, Beautiful" in arc_book, etc...

How would this be done in oracle.
Re: Fuzzy search [message #279560 is a reply to message #279164] Fri, 09 November 2007 01:05 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_11g> CREATE TABLE smc_book
  2    (smc_id		  NUMBER,
  3  	smc_title	  VARCHAR2 (40),
  4  	smc_publisher	  VARCHAR2 (40),
  5  	smc_release_date  DATE)
  6  /

Table created.

SCOTT@orcl_11g> SET DEFINE OFF
SCOTT@orcl_11g> INSERT ALL
  2  INTO smc_book VALUES (1234, 'Beautiful Wonder', 'Wrox Books', TO_DATE ('1/1/1999', 'DD/MM/YYYY'))
  3  INTO smc_book VALUES (2356, 'Master PL/SQL', 'OReilly Media', TO_DATE ('6/5/2004', 'DD/MM/YYYY'))
  4  INTO smc_book VALUES (5432, 'Harry Potter and Goblet of Fire', 'Simon & Shuster', TO_DATE ('2/4/2001', 'DD/MM/YYYY'))
  5  SELECT * FROM DUAL
  6  /

3 rows created.

SCOTT@orcl_11g> CREATE TABLE arc_book
  2    (arc_id		  NUMBER,
  3  	arc_title	  VARCHAR2 (40),
  4  	arc_publisher	  VARCHAR2 (40),
  5  	arc_release_date  DATE)
  6  /

Table created.

SCOTT@orcl_11g> INSERT ALL
  2  INTO arc_book VALUES (1245, 'Wonder, Beautiful', 'Wrox', TO_DATE ('1/1/1999', 'DD/MM/YYYY'))
  3  INTO arc_book VALUES (1244, 'The PL-SQL, Master (RE 1983)', 'Media, OReilly', TO_DATE ('6/5/2004', 'DD/MM/YYYY'))
  4  INTO arc_book VALUES (4352, 'Goblet of Fire, Harry Potter', 'Simon and Shuster', TO_DATE ('2/4/2001', 'DD/MM/YYYY'))
  5  SELECT * FROM DUAL
  6  /

3 rows created.

SCOTT@orcl_11g> CREATE INDEX smc_ind ON smc_book (smc_title)
  2  INDEXTYPE IS CTXSYS.CONTEXT
  3  /

Index created.

SCOTT@orcl_11g> CREATE INDEX arc_ind ON arc_book (arc_title)
  2  INDEXTYPE IS CTXSYS.CONTEXT
  3  /

Index created.

SCOTT@orcl_11g> CREATE OR REPLACE FUNCTION format_string
  2    (p_string IN VARCHAR2)
  3    RETURN VARCHAR2
  4  AS
  5    v_string  VARCHAR2 (32767) := LTRIM (RTRIM (p_string));
  6  BEGIN
  7    RETURN '{' ||
  8  	 REPLACE
  9  	   (REPLACE
 10  	     (REPLACE
 11  	       (REPLACE
 12  		 (REPLACE (v_string, '/', ' '), '-', ' '), ',', ' '), '  ', ' '), ' ', '};{')
 13  	 || '}';
 14  END format_string;
 15  /

Function created.

SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> SELECT s.smc_id, a.arc_id
  2  FROM   smc_book s, arc_book a
  3  WHERE  CONTAINS (s.smc_title, format_string (a.arc_title)) > 0
  4  OR     CONTAINS (a.arc_title, format_string (s.smc_title)) > 0
  5  /

    SMC_ID     ARC_ID
---------- ----------
      1234       1245
      2356       1244
      5432       4352

SCOTT@orcl_11g> SELECT s.smc_id, a.arc_id
  2  FROM   smc_book s, arc_book a
  3  WHERE  CONTAINS (s.smc_title, format_string (a.arc_title)) > 0
  4  UNION
  5  SELECT s.smc_id, a.arc_id
  6  FROM   smc_book s, arc_book a
  7  WHERE  CONTAINS (a.arc_title, format_string (s.smc_title)) > 0
  8  /

    SMC_ID     ARC_ID
---------- ----------
      1234       1245
      2356       1244
      5432       4352

SCOTT@orcl_11g> spool off

Re: Fuzzy search [message #279919 is a reply to message #279560] Sun, 11 November 2007 19:15 Go to previous message
mpls51
Messages: 2
Registered: November 2007
Junior Member
Thank You. This helps me immensely.
Previous Topic: Ending printjoin character issue and CONTAINS function not working (merged 2 threads)
Next Topic: Cannot create DETAIL_DATASTORE when detail table is a view?
Goto Forum:
  


Current Time: Thu Mar 28 09:56:39 CDT 2024