Using the FileHelpers Library
Published date: Tuesday, November 22, 2011
On: Moer & Éric Moreau's web site

I had this free library on my “to-test-list” for a very long time. This morning, I decided it was time. And honestly, I should have done it a long time ago!

This component is mainly used to transform any text file (fixed length or delimited) to a structured type or a datatable. But its power doesn’t stop there!

The demo application

This demo is provided in both VB and C#. It was created using Visual Studio 2010, but all you really need is version 2.0 of the .NET Framework (so VS 2005 and up).

Figure 1: The demo application in action

Credits

I cannot take credit for this very powerful library. You should really visit the official web site to check whether a newer version has been released (the latest is dated April 10, 2007), to read the documentation, and to look at other examples.

TextFieldParser class from the .Net Framework

Do you remember my article of May 2010 in which I introduced you to the TextFieldParser class? Both components have pros and cons, and there is no clear winner. Your input file will dictate which one to use. For example, the TextFieldParser has a CommentTokens property, which is very useful if your input file has comments.

So what is the biggest advantage of the FileHelpers Library? My favorite feature is that your file becomes a strongly typed resource, giving you the correct name and data type for each field. No more typing errors in your column indexes.

Handling errors

We haven’t seen anything about the library yet, but I will already introduce one of its properties. This handy property dictates how to handle the parsing errors that may occur while reading your file. This property is ErrorMode:

engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

Three values are available:

  • ThrowException: whenever a parsing error is found, an exception is thrown and the process stops there. No rows are returned. This is the default value.
  • SaveAndContinue: whenever a parsing error is found, the row is persisted in the ErrorManager and the process continues.
  • IgnoreAndContinue: whenever a parsing error is found, the row is skipped and the process continues.

My demo here will use the SaveAndContinue value. The demo will simply list the rows in error.

I have introduced one erroneous row in the delimited input file and 2 incorrect rows in the fixed-width input file.

Processing delimited files

The first type of file we will process is a delimited text file.

The first thing you will need is a class that defines the layout of the records to be read.

This is an example of a class that fits the TestDelimited.txt file provided with my demo application:

Imports FileHelpers

<DelimitedRecord(",")>
Public Class cTemplateDelimited

    Public Id As Integer

    Public FullName As String

    Public Height As Decimal

    <FieldConverter(ConverterKind.Date, "yyyy-MM-dd")>
    Public DOB As DateTime

End Class

The first thing we see at the top is an attribute specifying that the file is delimited, along with the character used as the delimiter. In this case it is set to a comma, but it can be just about any character. Then you see the 4 fields that will be filled from each line of the input file if the parsing succeeds. As you can see, the data is parsed into the correct data type right away. You can even see that the date field will be parsed using a specific format.

Now that we have this class, we can create the code to open the file using this definition.

These 20 or so lines of code are all that is required to create a parser, open the file, display the rows in error (if any), and display the valid rows.

'Clean the content of the listbox to display the new results
Me.ListBox1.Items.Clear()
ListBox1.Items.Add("Processing delimited file")

'create the parser engine providing the type (the class) to use
Dim engine As New FileHelperEngine(GetType(cTemplateDelimited))
'Specifying the kind of error processing
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

'read the file
Dim result As cTemplateDelimited() = DirectCast(engine.ReadFile("TestDelimited.txt"), cTemplateDelimited())

'display some summary results
ListBox1.Items.Add("   Number of records: " + engine.TotalRecords.ToString)
ListBox1.Items.Add("   Successful: " + result.Length.ToString)

'report only if parsing errors are found
If engine.ErrorManager.ErrorCount > 0 Then
    ListBox1.Items.Add("   Errors: " + engine.ErrorManager.ErrorCount.ToString)
    For Each err As ErrorInfo In engine.ErrorManager.Errors
        ListBox1.Items.Add(String.Format("   Error: {0} (row {1})", err.ExceptionInfo, err.LineNumber))
    Next
End If

'loop through each valid row
For Each row As cTemplateDelimited In result
    ListBox1.Items.Add("ID: " + row.Id.ToString)
    ListBox1.Items.Add("   Name: " + row.FullName)
    ListBox1.Items.Add("   Height: " + row.Height.ToString)
    ListBox1.Items.Add("   DOB: " + row.DOB.ToString)
Next

Because the library parses the input file into a collection of strongly typed objects, I can now use row.FullName, for example, to access the parsed value.
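The strong typing can go one step further if your build of the library exposes a generic engine. The following is only a minimal sketch, assuming that FileHelperEngine(Of T) is available in the version you downloaded and reusing the cTemplateDelimited class and the TestDelimited.txt file from this article. The module name is hypothetical.

```vb
Imports FileHelpers

Module GenericEngineSketch

    Sub Main()
        'Assumption: the generic FileHelperEngine(Of T) exists in your FileHelpers build
        Dim engine As New FileHelperEngine(Of cTemplateDelimited)()
        engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

        'With the generic engine, ReadFile returns a typed array, so no DirectCast is needed
        Dim result As cTemplateDelimited() = engine.ReadFile("TestDelimited.txt")

        For Each row As cTemplateDelimited In result
            Console.WriteLine("{0}: {1}", row.Id, row.FullName)
        Next
    End Sub

End Module
```

If the generic engine is not available in your version, the GetType/DirectCast form shown earlier does the same job.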

Processing fixed-width files

Processing a fixed-width file works exactly the same way. You need to provide a class describing the record layout, like this one:

Imports FileHelpers

<FixedLengthRecord()>
Public Class cTemplateFixed

    <FieldFixedLength(5)>
    Public Id As Integer

    <FieldFixedLength(21),
     FieldTrimAttribute(TrimMode.Right)>
    Public FullName As String

    <FieldFixedLength(7), _
     FieldConverter(GetType(TwoDecimalConverter))> _
    Public Height As Decimal

    <FieldFixedLength(15),
     FieldTrimAttribute(TrimMode.Right)>
    Public Comment As String

    <FieldFixedLength(10),
     FieldConverter(ConverterKind.Date, "yyyy-MM-dd")>
    Public DOB As DateTime


    ' Custom Converter
    Friend Class TwoDecimalConverter
        Inherits ConverterBase

        Public Overrides Function StringToField(ByVal from As String) As Object
            Dim res As Decimal = Convert.ToDecimal(from)
            Return res / 100
        End Function

        Public Overrides Function FieldToString(ByVal from As Object) As String
            Dim d As Decimal = CType(from, Decimal)
            Return Math.Round(d * 100).ToString()
        End Function

    End Class

End Class

This class may look more complex but it isn’t. The first line of the class is an attribute specifying that the file is a fixed-length one. Then, each field has an attribute to specify its length. Finally, a custom converter (TwoDecimalConverter) is provided because the value in the file does not contain a decimal separator. It could contain one if you wanted to; the converter is just there to show you how easy it is to use one.
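To make the converter's arithmetic concrete, here is a small stand-alone sketch that does not depend on FileHelpers at all. It repeats the two calculations from TwoDecimalConverter in plain functions. The function names and the raw value "0018050" are made up for illustration and are not taken from the demo file.

```vb
Imports System

Module TwoDecimalSketch

    'Same arithmetic as TwoDecimalConverter.StringToField:
    'the text holds digits only, and the last two digits are the decimals
    Function ParseImpliedDecimal(ByVal raw As String) As Decimal
        Return Convert.ToDecimal(raw) / 100D
    End Function

    'Same arithmetic as TwoDecimalConverter.FieldToString:
    'shift the decimals back into the integer part and drop the separator
    Function FormatImpliedDecimal(ByVal value As Decimal) As String
        Return Math.Round(value * 100D).ToString()
    End Function

    Sub Main()
        'Hypothetical 7-character field content
        Dim raw As String = "0018050"

        Dim height As Decimal = ParseImpliedDecimal(raw)
        Console.WriteLine(height)

        'Note that the leading zeros are not restored by this calculation
        Console.WriteLine(FormatImpliedDecimal(height))
    End Sub

End Module
```

Here the 7 digits are read as the value 180.5. Going back the other way yields the digits without the leading zeros, so padding the field to its fixed width is not something the converter does.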

And now you are ready to process a file with code like this:

'Clean the content of the listbox to display the new results
Me.ListBox1.Items.Clear()
ListBox1.Items.Add("Processing fixed-width file")

'create the parser engine providing the type (the class) to use
Dim engine As New FileHelperEngine(GetType(cTemplateFixed))
'Specifying the kind of error processing
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

'read the file
Dim result As cTemplateFixed() = DirectCast(engine.ReadFile("TestFixedWidth.txt"), cTemplateFixed())

'display some summary results
ListBox1.Items.Add("   Number of records: " + engine.TotalRecords.ToString)
ListBox1.Items.Add("   Successful: " + result.Length.ToString)

'report only if parsing errors are found
If engine.ErrorManager.ErrorCount > 0 Then
    ListBox1.Items.Add("   Errors: " + engine.ErrorManager.ErrorCount.ToString)
    For Each err As ErrorInfo In engine.ErrorManager.Errors
        ListBox1.Items.Add(String.Format("   Error: {0} (row {1})", err.ExceptionInfo, err.LineNumber))
    Next
End If

'loop through each valid row
For Each row As cTemplateFixed In result
    ListBox1.Items.Add("ID: " + row.Id.ToString)
    ListBox1.Items.Add("   Name: " + row.FullName)
    ListBox1.Items.Add("   Height: " + row.Height.ToString)
    ListBox1.Items.Add("   DOB: " + row.DOB.ToString)
    ListBox1.Items.Add("   Comment: " + row.Comment)
Next

The only differences here, really, are the name of the class to use as a template and the name of the file to process. Everything else is exactly the same.

Creating a file

Using the very same classes we created as templates, we can just as easily output files. Check this snippet of code:

'Requires Imports System.IO at the top of the file (for the File class)
If File.Exists("temp.txt") Then File.Delete("temp.txt")

'Create a FileHelpers engine
Dim engine As New FileHelperEngine(GetType(cTemplateFixed))

'Create a list of objects to persist
Dim arrObjects As New List(Of cTemplateFixed)

'fill the list of objects
Dim item As New cTemplateFixed
item.Id = 1001
item.FullName = "Eric Moreau"
item.Height = 12345.67D
item.Comment = "Very tall"
item.DOB = New Date(1901, 2, 3)
arrObjects.Add(item)

'write the file
engine.WriteFile("temp.txt", arrObjects)

MessageBox.Show("File created")

This code starts by creating an instance of the engine exactly like we did before. We then create and fill a list of objects, and finally use the engine to write the content of the collection to a file.

And much more!

This article only scratches the surface. If you look at the downloadable sample, you will see that the class definition of the delimited example also has attributes to skip empty lines, to skip the first line, to ignore comments …
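As a rough idea of what such a class definition can look like, here is a sketch. It assumes the attributes are named IgnoreFirst, IgnoreEmptyLines and IgnoreCommentedLines, as I recall them from the FileHelpers documentation. The class name, the number of skipped lines and the "//" comment marker are assumptions of mine, so the downloadable sample may well differ.

```vb
Imports System
Imports FileHelpers

'Hypothetical variant of the delimited template with record-level options.
'IgnoreFirst(1) skips one header line, IgnoreEmptyLines skips blank lines,
'and IgnoreCommentedLines("//") skips lines starting with the assumed marker.
<DelimitedRecord(","),
 IgnoreFirst(1),
 IgnoreEmptyLines(),
 IgnoreCommentedLines("//")>
Public Class cTemplateDelimitedWithOptions

    Public Id As Integer

    Public FullName As String

    Public Height As Decimal

    <FieldConverter(ConverterKind.Date, "yyyy-MM-dd")>
    Public DOB As DateTime

End Class
```

The engine code does not change: the options live entirely in the class definition.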

I really invite you to explore the documentation to discover all the methods, properties, and attributes (there are not that many) so you can judge the full value of this library.

Conclusion

Did I tell you it is free (for commercial and non-commercial use)? It is. There is no reason not to use it whenever you have text files to process.

I have shown you here how to retrieve the file content as a collection of objects. There are also other methods that read from a stream or a string, or that return a DataTable if this is what you need.
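Here is a short sketch of two of these alternatives. It assumes the engine exposes ReadString and ReadFileAsDT with the signatures used below, which is how I remember them from the documentation, and it reuses the cTemplateDelimited class and TestDelimited.txt file from this article. The module name and the sample line passed to ReadString are made up.

```vb
Imports System
Imports System.Data
Imports FileHelpers

Module OtherReadersSketch

    Sub Main()
        Dim engine As New FileHelperEngine(GetType(cTemplateDelimited))

        'Parse records from an in-memory string instead of a file
        '(the content below is invented sample data)
        Dim fromString As Object() = engine.ReadString("1,John Doe,1.80,1970-01-31")
        Console.WriteLine("Records from string: " & fromString.Length.ToString())

        'Parse the file straight into a DataTable
        Dim table As DataTable = engine.ReadFileAsDT("TestDelimited.txt")
        Console.WriteLine("Rows in DataTable: " & table.Rows.Count.ToString())
    End Sub

End Module
```

The DataTable form is handy when the result is bound directly to a grid, while the object form keeps the field names and types checked at compile time.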

I doubt this library will ever get in the way of processing your files! Just give it a try.

