Retrieve Web-Page Content as XML format in Oracle
March 11, 2011 Leave a comment
In Oracle, there exists a package UTL_HTTP, by which you can make HTTP requests directly from database.
You can use this package if you want to read a webpage.
I will write a simple script, which will send a request to google site and retrieve its content.
declare
v_content long;
v_url varchar2(40):=’www.google.com’;
begin
v_content := utl_http.request(v_url);dbms_output.put_line( v_content );
end;
Note that this code works well if page size is not more than 2GB(Because LONG type saves maximum 2GB).
For larger pages we should use REQUEST_PIECES. By this method you are not limited to 2GB, but you can indicate maximum bytes by yourself. I will use maximum 100000 bytes.
declare
v_content utl_http.html_pieces;
v_url varchar2(25):=’www.google.com’;
begin
v_content := utl_http.request_pieces( v_url,100000);for i in 1 .. v_content .count
loop
dbms_output.put_line( v_content (i) );
–exit when ( i = 2 );
end loop;
end;
This script will return all content from the web-page. If you want to retrieve just several lines from there, uncomment “exit when ( i = 2 );” and write desired number of lines instead of 2.
Note that the page is returned as an array not as a single variable.
That is all.