In one of my projects,
we had requirement of archiving text and excel files in database. We discovered
that this can be done as BLOB fields. BLOB stands for Binary Large Objects.
This data-type is available in most of the modern databases. In SQL Server 2005
- VARCHAR (MAX), NVARCHAR (MAX) and VARBINARY (MAX) data-types provides this
functionality.
For achieving
our goal, we had to take care of following things:-
- Designing
a table structure to store file metadata and file contents.
- Writing
stored procedures to insert and retrieve data from archival table.
- Writing
.NET application code which can store files and retrieve them when needed.
Designing a table
structure to store files
First we need a
table to store our archived data. Let’s name this table as “tblFileArchival”. The structure of
table is shown below:-
The table has
following fields:-
- ArchivalID – Auto incrementing
(identity) integer field.
- ArchivalDateTime – Field for storing
date and time of file archival
- ArchivalFileName – Field for
storing name of file
- ArchivedFileSource – Field for
storing source of file. This can be an optional field for your
requirement. Hence it can be marked as Nullable if needed.
- ArchivedData – Field for storing
actual file data. Please note that data type for this field is varbinary (max).
- ArchivedFileExtn – Extension of stored
file. (Eg.- xls, txt)
- ArchivedFileIODirection – In our
application, we have to store whether the file is an input to our system
or output from our system. Hence this field was used. This may not be
required for your requirement. Hence it can be marked as Nullable if
needed.
The SQL for
creating the table is as follows:-
CREATE TABLE [dbo].[tblFileArchival](
[ArchivalID] [int] IDENTITY(1,1) NOT NULL,
[ArchivalDateTime] [datetime] NOT NULL,
[ArchivedFileName] [nvarchar](100) NOT NULL,
[ArchivedFileSource] [nvarchar](15) NOT NULL,
[ArchivedData] [varbinary](max) NOT NULL,
[ArchivedFileExtn] [nvarchar](4) NOT NULL,
[ArchivedFileIODirection] [nvarchar](6) NOT NULL,
CONSTRAINT [PK_tblFileArchival] PRIMARY KEY CLUSTERED
(
[ArchivalID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Stored Procedure to
insert data into table
We need a stored
procedure to insert data into table. This procedure will take following input
parameters:-
- Name
of file to be archived.
- Source
of file to be archived.
- Data
of file.
- Extension
of file to be archived.
- File
IO direction.
The SQL for
stored procedure is as follows:-
CREATE PROCEDURE [dbo].[uspArchiveFile]
(
@ArchivedFileName nvarchar(100),
@ArchivedFileSource nvarchar(15),
@ArchivedData varbinary(max),
@ArchivedFileExtn nvarchar(4),
@ArchivedFileIODirection nvarchar(6)
)
AS
BEGIN
INSERT INTO
[tblFileArchival]
([ArchivalDateTime]
,[ArchivedFileName]
,[ArchivedFileSource]
,[ArchivedData]
,[ArchivedFileExtn]
,[ArchivedFileIODirection])
VALUES
(GETDATE()
,@ArchivedFileName
,@ArchivedFileSource
,@ArchivedData
,@ArchivedFileExtn
,@ArchivedFileIODirection);
END
Stored Procedure to
retrieve data from table
We need a stored
procedure to retrieve archived file metadata and file data from the table. This
stored procedure will take one input parameter - “ArchivalID” of the file whose data needs to be retrieved and
will return that data.
The SQL for
stored procedure is as follows:-
CREATE PROCEDURE [dbo].[uspGetArchivedFile]
@ArchivalId int
AS
BEGIN
SELECT [ArchivalID]
,[ArchivalDateTime]
,[ArchivedFileName]
,[ArchivedFileSource]
,[ArchivedData]
,[ArchivedFileExtn]
,[ArchivedFileIODirection]
FROM tblFileArchival
WHERE [ArchivalID] = @ArchivalId
END
.NET Code to archive
file in database
To archive a
file in database in a field of type VARBINARY (MAX), we need to read the
contents of file and pass them to our insert stored procedure as an array of
bytes.
The following
code shows how to achieve the same.
FileStream fileStream
= null;
try
{
//Open the file and read its contents in a byte
array using file stream
fileStream = File.OpenRead(FilePath);
int length = (int)fileStream.Length;
byte[] byteArray = new byte[length];
fileStream.Read(byteArray, 0, length);
//Place the file details and content in database
ArchiveDataToDB(FileName,
ReceivedFrom,
byteArray,
FileExtn,
FileIODirection);
}
catch (Exception ex)
{
throw ex;
}
finally
{
if (fileStream != null)
{
fileStream.Close();
fileStream = null;
}
}
Here we have all
the required file properties in following variables:-
- FilePath
- FileName
- ReceivedFrom
- FileExtn
- FileIODirection
Before calling
the stored procedure we need the contents of file also. This is achieved using
following steps:-
- Create
a fileStream object
and open the file using it.
- Get
the length of the file using fileStream.Length.
- Create
a byte array of size fileSream.Length.
- Read
the file from start till end in this byte array using fileStream.Read method.
Now call the
function “ArchiveDataToDB” passing all file
information as input to it. The function “ArchiveDataToDB”
contains ADO.NET code which calls insert stored procedure – “uspArchiveFile” passing all input
parameters and stores the data in SQL Server. For brevity, the code for function “ArchiveDataToDB” is not given as this contains generic
ADO.NET code used for executing a stored procedure. This type of code is
generally project specific.
.NET code to
retrieve archived file from database
There is no use
of archiving a file in database if we can’t retrieve it. For regenerating the
file from archived data, we need to write custom .NET code. Following code
shows how to achieve it:-
FileStream fileStream
= null;
BinaryWriter
binaryWriter = null;
MemoryStream memStream
= null;
byte[] binary =
null;
const int chunkSize
= 100;
int
sizeToWrite = 0;
try
{
//Get archived file metadata(details) and content
from database
DataSet dsArchivedFile = GetArchivedFile(archivalId);
DataTable dtArchivedFile = dsArchivedFile.Tables[0];
//if datatable is empty, it means no such file
exists in database
if (dtArchivedFile.Rows.Count == 0)
{
return "No file exists with archival id - " +
archivalId;
}
//Get file name
and extension
string archivedFileName = dtArchivedFile.Rows[0]["ArchivedFileName"].ToString();
string archivedFileExtn = dtArchivedFile.Rows[0]["ArchivedFileExtn"].ToString();
if (!filePathToStore.EndsWith(@"\"))
{
filePathToStore += @"\";
}
//Create file path where file needs to be created
filePathToStore = filePathToStore +
archivedFileName + "." + archivedFileExtn.ToLower();
//Create the file stream
fileStream = new FileStream(filePathToStore,
FileMode.Create, FileAccess.Write);
binaryWriter = new BinaryWriter(fileStream);
//Get file contents from Database in a byte array
binary = (byte[])dtArchivedFile.Rows[0]["ArchivedData"];
sizeToWrite = chunkSize;
memStream = new MemoryStream(binary);
//using chunksize, read the byte array and write
it to binary writer
for (int i = 0; i < binary.GetUpperBound(0) - 1;
i = i + chunkSize)
{
if (i + chunkSize >= binary.Length)
{
sizeToWrite = binary.Length - i;
}
byte[] chunk = new byte[sizeToWrite];
memStream.Read(chunk, 0, sizeToWrite);
binaryWriter.Write(chunk);
binaryWriter.Flush();
}
return "File generated at " +
filePathToStore;
}
catch (Exception ex)
{
throw ex;
}
finally
{
if (fileStream != null)
{
fileStream.Close();
fileStream.Dispose();
}
if (binaryWriter != null)
{
binaryWriter.Close();
}
}
First we get all
file details including file data using the stored procedure “uspGetArchivedFile” we created
earlier. This stored procedure is executed by the function “GetArchivedFile” which takes “archivalID” as input and uses
ADO.NET code to execute the stored procedure and get file details & data in
a dataset. For brevity, the code for
function “GetArchivedFile” is
not given as this contains generic ADO.NET code used for executing a stored
procedure. This type of code is generally project specific.
To generate the
file from dataset, following steps are taken:-
- Check
if dataset is empty. If yes, we return an error message else we retrieve
various file properties like filename,
fileExtn in appropriate
variables.
- Create
file path where file needs to be stored. This is achieved by specifying
directory and appending file name and file extension to it.
- Create
a fileStream object in File
create mode and specify the path where file needs to be stored. Open the fileStream object in write
mode.
- Create
a binary writer from the file stream object.
- Get
the file contents from dataset in a byte array.
- Now
read this byte array by chunk-size of 100 and write it to binary writer.
The file will be
generated at the mentioned file path.
Conclusion
Storing file
information and data in SQL server provides a secure, reliable and cost
effective way to archive files. The file data can be easily managed. It is also
secure because only users who have access to SQL server can reach the table
where file data is stored. Since file data is stored in VARBINARY field, it is in
hexadecimal format which cannot be interpreted by naked eyes. Hence this
provides additional security for data.
No comments:
Post a Comment