Read + Write + Report
Home | Start a blog | About Orble | FAQ | Blogs | Writers | My Orble | Login

Using OPENROWSET to read large files into SQL Server

December 11th 2008 16:42
Using OPENROWSET to read large files into SQL Server


Problem
OPENROWSET is a T-SQL function that allows for reading data from many sources including using the SQL Server’s BULK import capability. One of the useful features of the BULK provider is its ability to read individual files from the file system into SQL Server, such as loading a data from a text file or a Word document into a SQL Server table. This capability is the subject of this tip.

Solution
The BULK option was added to T-SQL in SQL Server 2005 and it persists in SQL Server 2008. When using SQL Server 2000 it was possible to read and write to the file system using the sp_OA_Create and sp_OA_Method extended stored procedures. These XPs continue to work, but are disabled by default because of security concerns and it is a better practice to use more secure capabilities like OPENROWSET when they are available.


When used with the BULK provider keyword you can name a data file to read as one of three types of objects:

SINGLE_BLOB, which reads a file as varbinary(max)
SINGLE_CLOB, which reads a file as varchar(max)
SINGLE_NCLOB, which reads a file as nvarchar(max)

OPENROWSET returns a single column, named BulkColumn, as its result. Here’s an example that reads a text file:

SELECT BulkColumn
FROM OPENROWSET (BULK 'c:\temp\mytxtfile.txt', SINGLE_CLOB) MyFile

The correlation name, in this case MyFile, is required by OPENROWSET.

There are additional requirements when reading single files that must also be observed as mentioned below.

Access control is always a concern. The operating system level file operations to read the file are executed with the privileges of the account that the SQL Server data engine is using. Therefore, only files accessible to that account may be read. This includes network drives or UNC paths, which are permitted if the account has the privileges. If you want to read network files, run SQL Server as a domain user.


The BULK provider won’t convert between Unicode and plain ASCII files. It must be told which type of encoding is used in the file. If you don’t the result is error 4806 as seen here:

SELECT BulkColumn
FROM OPENROWSET (BULK 'c:\temp\SampleUnicode.txt', SINGLE_CLOB) MyFile

Msg 4806, Level 16, State 1, Line 1
SINGLE_CLOB requires a double-byte character set (DBCS) (char) input file. The file specified is Unicode.

Unicode files must be read with the SINGLE_NCLOB option shown here:

SELECT BulkColumn
FROM OPENROWSET (BULK 'c:\temp\SampleUnicode.txt', SINGLE_NCLOB) MyFile

Similarly, files with non text structures, such as Word documents are not converted. They must be converted by some other mechanism before being read or they can be read as binary files with the SINGLE_BLOB option.

OPENROWSET isn’t flexible about how you provide the name of the file. It must be a string constant. That requirement forces the use of dynamic SQL when the file name isn’t known in advance.

Here’s a stored procedure that reads any text file and returns the contents as an output variable:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROC [dbo].[ns_txt_file_read]
@os_file_name NVARCHAR(256)
,@text_file VARCHAR(MAX) OUTPUT
/* Reads a text file into @text_file
*
* Transactions: may be in a transaction but is not affected
* by the transaction.
*
* Error Handling: Errors are not trapped and are thrown to
* the caller.
*
* Example:
declare @t varchar(max)
exec ns_txt_file_read 'c:\temp\SampleTextDoc.txt', @t output
select @t as [SampleTextDoc.txt]
*
* History:
* WHEN WHO WHAT
* ---------- ---------- ----------------------------- ----------
* 2007-02-06 anovick Initial coding
***************************** ***************************** ****/
AS
DECLARE @sql NVARCHAR(MAX)
, @parmsdeclare NVARCHAR(4000)

SET NOCOUNT ON

SET @sql = 'select @text_file=(select * from openrowset (
bulk ''' @os_file_name '''
,SINGLE_CLOB) x
)'

SET @parmsdeclare = '@text_file varchar(max) OUTPUT'

EXEC sp_executesql @stmt = @sql
, @params = @parmsdeclare
, @text_file = @text_file OUTPUT

To see how it works, just execute the example script: First create a text file called "SampleTextDoc.txt" and add some text data to the file. For our example we added the following text "The quick brown fox jumped over the lazy dog.".

DECLARE @t VARCHAR(MAX)
EXEC ns_txt_file_read 'c:\temp\SampleTextDoc.txt', @t output
SELECT @t AS [SampleTextDoc.txt]

The results are:

SampleTextDoc.txt
The quick brown fox jumped over the lazy dog.
(1 row(s) affected)


The performance of reading text files is remarkably fast because the files are read sequentially. Using a 64 bit SQL Server 2008 on a development machine, reading a file of 750,000,000 bytes took only 7 seconds.


46
Vote
Add To: del.icio.us Digg Furl Spurl.net StumbleUpon Yahoo


   

   

   


Add A Comment

To create a fully formatted comment please click here.


CLICK HERE TO LOGIN | CLICK HERE TO REGISTER

Name or Orble Tag
Home Page (optional)
Comments
Bold Italic Underline Strikethrough Separator Left Center Right Separator Quote Insert Link Insert Email
Notify me of replies
Notify extra people about this comment
Is this a private comment?
List the Email Addresses or Orble Tags of the people you would like to be notified about this comment


One per line max of 30

List the Email Addresses or Orble Tags of the people you would like to be notified about this private comment thread. Only the people in this list will be able to see or reply to your comment.


One per line max of 30

Your Name
(for the email going out to the above list, it can be different to your Orble Tag)
Your Email Address
(optional)
(required for reply notification)
Submit
More Posts
4 Posts
11 Posts
1 Posts
77 Posts dating from May 2008
Email Subscription
Receive e-mail notifications of new posts on this blog:
0

Siddharth sood's Blogs

37 Vote(s)
0 Comment(s)
1 Post(s)
84 Vote(s)
0 Comment(s)
2 Post(s)
37 Vote(s)
0 Comment(s)
1 Post(s)
50 Vote(s)
0 Comment(s)
1 Post(s)
29 Vote(s)
0 Comment(s)
1 Post(s)
76 Vote(s)
0 Comment(s)
2 Post(s)
169 Vote(s)
0 Comment(s)
4 Post(s)
213 Vote(s)
2 Comment(s)
5 Post(s)
26 Vote(s)
0 Comment(s)
1 Post(s)
25 Vote(s)
0 Comment(s)
1 Post(s)
Moderated by Siddharth sood
Copyright © 2006 2007 2008 On Topic Media PTY LTD. All Rights Reserved. Design by Vimu.com.
On Topic Media ZPages: Sydney |  Melbourne |  Brisbane |  London |  Birmingham |  Leeds     [ Advertise ] [ Contact Us ] [ Privacy Policy ]