I was asked by my friend whether I could help him with reading a text file with OPENROWSET function and Ad Hoc Distributed Queries option. I stopped by to help.My frined’s first thoughts were to use MSDASQL
select * from openrowset('MSDASQL' ,'Driver={Microsoft Text Driver (*.txt; *.csv)}' ,'select * from C:\Temp\test1.CSV')
but got this weird error message:
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider “MSDASQL” for linked server “(null)” reported an error. The provider did not give any information about the error.
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider “MSDASQL” for linked server “(null)”.
I wasn’t able to find out what we were missing. Among other forum posts there were ones that said that MSDASQL is not really reliable at all and this error sometimes happens.
Workaround – better solution
I thought whether I could use a bulk operation on this. Luckily yes! I tried
SELECT * FROM OPENROWSET(BULK 'c:\temp\test.txt', FORMATFILE= 'c:\temp\test.xml') AS a WHERE col0 LIKE N'%doe%'
C:\temp\test.txt = my file for import
C:\temp\test.xml = XML schema Format file for test.txt
test.txt content:
test_content_col0;test_contenct_col1
john doe;male
jane doe;female
A FORMATFILE option stands for data schema of the file being imported. It must be used when a column count in a data file differs from the target table, columns are in a different order or a custom delimiter is used or in this case.
Format file
You may write your own schema manually or generate it with BCP utility based on the table schema.
<?xml version="1.0"?> <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <RECORD> <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="510" COLLATION="Czech_100_CI_AS"/> <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="510" COLLATION="Czech_100_CI_AS"/> </RECORD> <ROW> <COLUMN SOURCE="1" NAME="col0" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="2" NAME="col1" xsi:type="SQLNVARCHAR"/> </ROW> </BCPFORMAT>
Notice that MAX_LENGTH is a column length in Bytes.
About BCP
Bulk Copy program (BCP) is a command line utility used to import/export data. BCP became available in Microsoft SQL Server 6.0 in which it was the tool of choice for import/export of data. With Microsoft SQL Server 7.0 came first version of Data Transformation Services (DTS) since SQL 2005 Integration Services (SSIS) which offered much more flexibility and comfort then BCP. Nowadays BCP is used for large dataset loads (by default BCP ignores constraints and does not fire triggers which makes it even more faster) or just loads which does not require additional data transformation. BCP is also able to create a Format file for BULK INSERT operations. For more information check BOL, bcp Utility article.
Sample bellow demonstrates bcp tool usage
bcp Test.dbo.import format nul -c -x -f import.xml -t; -T
Test.dbo.import – schema will be created based on table import at dbo schema at Test database
Format nul -f -x – specifies the XML format file, for SQL 2000 omit the –x switch and replace xml extension with fmt – the utility will create schema in native format
-c – specifies character data
Import.xml – output file (will be placed to your command prompt actual folder (by default when you open CMD c:\users\USERSHOMEFOLDER)
-t; – specifies that semicolon is used as column delimiter
-T – specifies trusted connection to SQL Server using integrated security. Optionally you cas use –U and –P switch to log in with SQL account.
Happy importing!
Leave a Reply