Rewriting parts of a PL/SQL statement string in C++

This is my use case: the input is a string representing an Oracle PL/SQL statement of arbitrary complexity. We can assume it is a single statement (not a script). Several parts of this input string must now be rewritten.

E.g., table names must be prefixed, and aggregate functions in the select list that do not have a column alias must be assigned a default one:

SELECT SUM(ABS(x.value)), 
TO_CHAR(y.ID,'111,111'),
y.some_col
FROM
tableX x,
(SELECT DISTINCT ID
FROM tableZ z
WHERE ID > 10) y
WHERE
...

      

becomes

SELECT SUM(ABS(x.value)) COL1, 
TO_CHAR(y.ID,'111,111') COL2,
y.some_col
FROM
pref.tableX x,
(SELECT DISTINCT ID, some_col
FROM pref.tableZ z
WHERE ID > 10) y
WHERE
...

      

(Disclaimer: the statement only illustrates the problem; it makes no sense as a query.)

Since aggregate functions can be nested and sub-SELECTs are a pain, I cannot use regular expressions. Well, actually I did and got 80% of the way there, but I need the remaining 20%.

The right approach, I presume, is to use grammars and parsers. I have played around with ANTLR2 for C++ (although I do not know much about grammars and parsing with them). I do not see an easy way to get at the SQL pieces:

list<string> *ssel = theAST.getSubSelectList(); // fantasy land

      

Could someone provide some pointers on how "parsing pros" would tackle this problem? EDIT: I am using Oracle 9i.



2 answers


Maybe you can use this; it converts the SELECT statement into an XML block:

declare
    cl clob;
begin
    dbms_lob.createtemporary (
        cl,
        true
    );
    sys.utl_xml.parsequery (
        user,
        'select e.deptno from emp e where deptno = 10',
        cl
    );
    dbms_output.put_line (cl);
    dbms_lob.freetemporary (cl);
end;
/ 

<QUERY>
  <SELECT>
    <SELECT_LIST>
      <SELECT_LIST_ITEM>
        <COLUMN_REF>
          <SCHEMA>MICHAEL</SCHEMA>
          <TABLE>EMP</TABLE>
          <TABLE_ALIAS>E</TABLE_ALIAS>
          <COLUMN_ALIAS>DEPTNO</COLUMN_ALIAS>
          <COLUMN>DEPTNO</COLUMN>
        </COLUMN_REF>
        ....
        ....
        ....
</QUERY>

      

See it here: http://forums.oracle.com/forums/thread.jspa?messageID=3693276

Now you only need to parse this XML block.

Edit1:

Unfortunately I don't fully understand the OP's needs, but I hope this helps. It is another way to ask for the "names" of the columns of a query, for example select count(*),max(dummy) from dual:

set serveroutput on

DECLARE
 c       NUMBER;
 d       NUMBER;
 col_cnt PLS_INTEGER;
 f       BOOLEAN;
 rec_tab dbms_sql.desc_tab;
 col_num NUMBER;

PROCEDURE print_rec(rec in dbms_sql.desc_rec) IS
BEGIN
  dbms_output.new_line;
  dbms_output.put_line('col_type = ' || rec.col_type);
  dbms_output.put_line('col_maxlen = ' || rec.col_max_len);
  dbms_output.put_line('col_name = ' || rec.col_name);
  dbms_output.put_line('col_name_len = ' || rec.col_name_len);
  dbms_output.put_line('col_schema_name= ' || rec.col_schema_name);
  dbms_output.put_line('col_schema_name_len= ' || rec.col_schema_name_len);
  dbms_output.put_line('col_precision = ' || rec.col_precision);
  dbms_output.put_line('col_scale = ' || rec.col_scale);
  dbms_output.put('col_null_ok = ');

  IF (rec.col_null_ok) THEN
    dbms_output.put_line('True');
  ELSE
    dbms_output.put_line('False');
  END IF;
END;

BEGIN
  c := dbms_sql.open_cursor; 
  dbms_sql.parse(c,'select count(*),max(dummy) from dual ',dbms_sql.NATIVE); 
  dbms_sql.describe_columns(c, col_cnt, rec_tab);

  for i in rec_tab.first..rec_tab.last loop
    print_rec(rec_tab(i));
  end loop;

  dbms_sql.close_cursor(c);
END;
/

      

(see here for more information: http://www.psoug.org/reference/dbms_sql.html )

The OP also wants to change the schema name of the tables in the query. I think the easiest way to achieve that is to query the table names from user_tables, search the SQL statement for those names and prefix them, or to run 'alter session set current_schema = ....' instead.
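On the C++ side, the "search for known table names and prefix them" idea could look roughly like the sketch below. It assumes the table names have already been fetched (for example from user_tables) and stored in upper case; the function name prefix_tables and the helpers are made up for illustration. It skips single-quoted string literals and names that are already qualified, but it is not a parser: comments, double-quoted identifiers, and columns or aliases that happen to share a table's name are not handled.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Upper-case copy: unquoted Oracle identifiers are case-insensitive.
static std::string upper(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '$' || c == '#';
}

// Hypothetical helper: prefix every unqualified word that names a known table.
// `tables` must hold upper-case names.
std::string prefix_tables(const std::string& sql,
                          const std::set<std::string>& tables,
                          const std::string& prefix) {
    std::string out;
    std::size_t i = 0;
    while (i < sql.size()) {
        char c = sql[i];
        if (c == '\'') {
            // Copy a string literal verbatim; '' inside it is an escaped quote.
            std::size_t j = i + 1;
            while (j < sql.size()) {
                if (sql[j] == '\'') {
                    if (j + 1 < sql.size() && sql[j + 1] == '\'') { j += 2; continue; }
                    break;
                }
                ++j;
            }
            std::size_t end = (j < sql.size()) ? j + 1 : j;  // include closing quote if present
            out.append(sql, i, end - i);
            i = end;
        } else if (is_ident_char(c)) {
            // Read one whole word and decide whether it is a known table name.
            std::size_t j = i;
            while (j < sql.size() && is_ident_char(sql[j])) ++j;
            std::string word = sql.substr(i, j - i);
            bool qualified = i > 0 && sql[i - 1] == '.';  // e.g. pref.tableX or x.value
            if (!qualified && tables.count(upper(word))) out += prefix;
            out += word;
            i = j;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}
```

Called with the table set {"TABLEX", "TABLEZ"} and the prefix "pref.", this turns FROM tableX x into FROM pref.tableX x and leaves the literal 'tableX' alone.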



If other coders are the source of the SQL statement strings, you could simply insist that the parts that need changing are marked with a special escape convention, e.g., writing $TABLE instead of the table name, or $TABLEPREFIX where a prefix is needed. Finding the places that need patching then reduces to substring search and replace.
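A minimal C++ sketch of that search-and-replace step, assuming the marker is spelled $TABLEPREFIX as in the example above (the function name replace_all is just an illustration, not part of any library):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace every occurrence of `placeholder` in `sql` with `value`.
std::string replace_all(std::string sql,
                        const std::string& placeholder,
                        const std::string& value) {
    assert(!placeholder.empty() && "an empty placeholder would match everywhere");
    std::size_t pos = 0;
    while ((pos = sql.find(placeholder, pos)) != std::string::npos) {
        sql.replace(pos, placeholder.size(), value);
        pos += value.size();  // continue after the inserted text so it is never rescanned
    }
    return sql;
}
```

For example, replace_all("SELECT * FROM $TABLEPREFIXtableX x", "$TABLEPREFIX", "pref.") yields SELECT * FROM pref.tableX x.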

If you really have arbitrary SQL strings and cannot get them marked up nicely, you need to parse the SQL string somehow, as you have observed. The XML solution is certainly one possible way.

Another way is to use a program transformation system. Such a tool can parse a string of the language, build an AST, carry out analyses and transformations on the AST, and then emit the revised string.

The DMS Software Reengineering Toolkit is such a system. It has a PL/SQL front-end parser, and it can use pattern-directed transformations to carry out the rewrites you need. For your example involving select items:

domain PLSQL.
rule use_explicit_column(e: expression):select_item -> select_item
   "\e" -> "\e \column\(\e\)".

      

To read the rule, you need to understand that the text inside the quotes represents abstract syntax trees in some computer language that we want to manipulate. The phrase "domain PLSQL" says "use the PLSQL parser" to process the quoted strings, which is how DMS knows how to parse them. (DMS has many language parsers to choose from.) The terms "expression" and "select_item" are grammatical constructs of the language of interest, here PLSQL; see the railroad diagrams in your PLSQL reference manual. The backslash marks escape/meta information rather than target-language syntax.



The rule says: find parsed elements that are select_items consisting solely of an expression \e, and rewrite each one into a select_item consisting of the same expression \e followed by the corresponding column name ( \column(\e) ), presumably derived from the item's position in the select list for the specific table. You would have to implement the column function so that it determines the appropriate name from the position of the select item. In this example I chose to define column so that it takes the expression of interest as its argument; the expression is actually passed as the matched tree, so the column function can work out where it sits in the select_items list by walking up the abstract syntax tree.

This rule handles only select items. You would add further rules to handle the other cases you are interested in.
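To make the position-based naming concrete without DMS, here is a rough C++ sketch. It is not an AST-based transformation: it assumes the text of the select list has already been isolated, splits it at commas that lie outside parentheses and string literals, and appends COL<n> (n being the 1-based position) to every item that ends in a closing parenthesis. That "ends in ')'" test is only a stand-in for "expression without an alias", and the names split_select_list and add_default_aliases are invented for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split a select list at commas that are outside parentheses and string literals.
std::vector<std::string> split_select_list(const std::string& list) {
    std::vector<std::string> items;
    std::string cur;
    int depth = 0;
    bool in_literal = false;
    for (char c : list) {
        if (c == '\'') {
            in_literal = !in_literal;  // an escaped '' toggles twice, so we stay inside
        } else if (!in_literal) {
            if (c == '(') ++depth;
            else if (c == ')') --depth;
            else if (c == ',' && depth == 0) {
                items.push_back(cur);  // top-level comma: the current item is complete
                cur.clear();
                continue;
            }
        }
        cur += c;
    }
    items.push_back(cur);
    return items;
}

static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Append " COL<n>" to each item that ends in ')' (heuristic for "no alias yet").
std::string add_default_aliases(const std::string& list) {
    std::vector<std::string> items = split_select_list(list);
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        std::string item = trim(items[i]);
        if (!item.empty() && item.back() == ')') item += " COL" + std::to_string(i + 1);
        if (i > 0) out += ", ";
        out += item;
    }
    return out;
}
```

Applied to the select list from the question, SUM(ABS(x.value)), TO_CHAR(y.ID,'111,111'), y.some_col, this produces SUM(ABS(x.value)) COL1, TO_CHAR(y.ID,'111,111') COL2, y.some_col. Finding where the select list starts and ends, especially inside sub-SELECTs, is exactly the part that still needs a real parser.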

What the transformation system does for you:

  • parses the language fragment of interest
  • builds ASTs
  • lets you pattern-match for places of interest (by matching against AST patterns) while writing the patterns in target-language surface syntax
  • replaces matched patterns with other patterns
  • computes arbitrary replacements (as ASTs)
  • regenerates source text from the modified ASTs.

While writing the rules is not always trivial, it is what is necessary if your problem is as stated.

The proposed XML solution is another way to build such ASTs. It does not have the nice pattern-matching properties, although you may be able to get a lot of mileage out of XSLT. What I do not know is whether the XML contains the parse tree in complete detail; the DMS parser provides this by design, as it is needed if you want to do arbitrary analysis and transformation.
