System.Text.RegularExpressions.Regex.Replace error in C# for SSIS

Question

I am using the code below in the script of an SSIS package, written in C#, and it produces an error.

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
    using System.Text.RegularExpressions;

    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {

        public override void PreExecute()
        {
            base.PreExecute();
        }
        public override void PostExecute()
        {
            base.PostExecute();
        }
        string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";
        string replacewith = "";
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            Regex reg = new Regex(toreplace);
            Row.NaN = reg.Replace(Row.Na, replacewith);


        }

    }

Error

The best overloaded method match for 'System.Text.RegularExpressions.Regex.Replace(string, System.Text.RegularExpressions.MatchEvaluator)' has some invalid arguments

Here Na is the input column and NaN is the output column, which is varchar. The input column contains special characters.

Exceptions:

System.ArgumentNullException
System.ArgumentOutOfRangeException

This is the code in BufferWrapper in the SSIS package:

/* THIS IS AUTO-GENERATED CODE THAT WILL BE OVERWRITTEN! DO NOT EDIT!
*  Microsoft SQL Server Integration Services buffer wrappers
*  This module defines classes for accessing data flow buffers
*  THIS IS AUTO-GENERATED CODE THAT WILL BE OVERWRITTEN! DO NOT EDIT! */



    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    public class Input0Buffer: ScriptBuffer

    {
        public Input0Buffer(PipelineBuffer Buffer, int[] BufferColumnIndexes, OutputNameMap OutputMap)
            : base(Buffer, BufferColumnIndexes, OutputMap)
        {
        }

        public BlobColumn Na
        {
            get
            {
                return (BlobColumn)Buffer[BufferColumnIndexes[0]];
            }
        }
        public bool Na_IsNull
        {
            get
            {
                return IsNull(0);
            }
        }

        public Int32 NaN
        {
            set
            {
                this[1] = value;
            }
        }
        public bool NaN_IsNull
        {
            set
            {
                if (value)
                {
                    SetNull(1);
                }
                else
                {
                    throw new InvalidOperationException("IsNull property cannot be set to False. Assign a value to the column instead.");
                }
            }
        }

        new public bool NextRow()
        {
            return base.NextRow();
        }

        new public bool EndOfRowset()
        {
            return base.EndOfRowset();
        }

    }

[Screenshots omitted: data flow; script input columns; the actual script]

c# regex ssis

user2588812 13 Aug 13 at 13:16

1 answer

billinkc · Accepted Answer · 2013-08-14T03:03:58+0000

Your code is mostly fine. You are not testing for the possibility that the column Na is NULL. It may be that your source data is not nullable, in which case there is no need to test.

You can improve performance by declaring the Regex at the member level and instantiating it in your PreExecute method, but that is only a performance matter. It does not affect the error message you are receiving.
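As a rough illustration of that performance suggestion, here is a minimal sketch that can be run outside SSIS. The class `CleanerSketch` and its `ProcessValue` method are hypothetical stand-ins for `ScriptMain` and `Input0_ProcessInputRow`, and `RegexOptions.Compiled` is an optional extra that the original answer does not mention.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for ScriptMain so the idea can be tried outside SSIS.
public class CleanerSketch
{
    private const string ToReplace = "[~!@#$%^&*()_+`{};':,./<>?]";
    private const string ReplaceWith = "";

    // Member-level field: one Regex instance is shared by every row.
    private Regex reg;

    // Plays the role of PreExecute(): runs once, before any row is processed.
    public void PreExecute()
    {
        // RegexOptions.Compiled is an assumption here, not part of the original code.
        reg = new Regex(ToReplace, RegexOptions.Compiled);
    }

    // Plays the role of the per-row work done in Input0_ProcessInputRow.
    public string ProcessValue(string value)
    {
        return reg.Replace(value, ReplaceWith);
    }
}
```

In the real component the same field and `PreExecute` body would sit inside `ScriptMain`, so the constructor call no longer runs once per row.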

You can see my package and the expected results below. I sent 4 rows down: one with a NULL value, one that should not change, and two that need to change.

My data flow

I have updated the data flow to match the steps you are using in your chameleon question.

[Screenshot omitted: data flow]

My source query

I am generating 2 columns of data and 4 rows. The column Na, which matches your original question, is of type varchar. The Agency_Names column is cast to the deprecated text data type to match your later updates.

SELECT 
    D.Na
,   CAST(D.Na AS text) AS Agency_Names
FROM
(
SELECT 'Hello world' AS Na
UNION ALL SELECT 'man~ana'
UNION ALL SELECT 'p@$$word!'
UNION ALL SELECT NULL
) D (Na);
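To make the expected outcome for those four rows concrete, the following standalone sketch applies the character class from the question to the same values. `SampleRows` and `Clean` are hypothetical names used only for illustration; nothing here depends on SSIS.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SampleRows
{
    // Same character class as in the question.
    private static readonly Regex Reg = new Regex("[~!@#$%^&*()_+`{};':,./<>?]");

    // Returns null for a null input, mirroring how the NULL row is passed through.
    public static string Clean(string value)
    {
        return value == null ? null : Reg.Replace(value, "");
    }

    // Cleans each of the four sample values from the source query.
    public static string[] CleanAll()
    {
        string[] rows = { "Hello world", "man~ana", "p@$$word!", null };
        string[] cleaned = new string[rows.Length];
        for (int i = 0; i < rows.Length; i++)
        {
            cleaned[i] = Clean(rows[i]);
        }
        return cleaned;
    }
}
```

Under these assumptions, "Hello world" passes through unchanged, "man~ana" becomes "manana", "p@$$word!" becomes "pword", and the NULL row stays NULL.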

Data Conversion

I added a Data Conversion transformation after my OLE DB Source. Mirroring what you did, I converted my Agency_Names column to the data type string [DT_STR] with a length of 50 and aliased it as "Copy of Agency_Names".

[Screenshot omitted: data conversion]

Metadata

At this point, I verify that the metadata for my data flow is of type DT_STR or DT_WSTR, which are the only valid inputs for the upcoming regular expression call. I confirm that Copy of Agency_Names is the expected data type.

[Screenshot omitted: data flow metadata]

Script Task

I set the usage type to ReadOnly for the columns Na and Copy of Agency_Names, and aliased the latter as "AgencyNames".

[Screenshot omitted: script input columns]

I added 2 output columns: NaN, which matches your original question, and a new one named AgencyNamesCleaned. Both are configured as DT_STR, code page 1252, length 50.

[Screenshot omitted: script output columns]

This is the script I used.

public class ScriptMain : UserComponent
{

    string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";
    string replacewith = "";


    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        Regex reg = new Regex(toreplace);

        // Test for nulls otherwise Replace will blow up
        if (!Row.Na_IsNull)
        {
            Row.NaN = reg.Replace(Row.Na, replacewith);
        }
        else
        {
            Row.NaN_IsNull = true;
        }

        if (!Row.AgencyNames_IsNull)
        {
            Row.AgencyNamesCleaned = reg.Replace(Row.AgencyNames, replacewith);
        }
        else
        {
            Row.AgencyNamesCleaned_IsNull = true;
        }
    }

}

Root cause analysis

I think your main problem might be that the column Na

you have is not row compatible. Commentary from Sriram . If I look at the autogenerated code for the column Na

, in my example I see

    public String Na
    {
        get
        {
            return Buffer.GetString(BufferColumnIndexes[0]);
        }
    }
    public bool Na_IsNull
    {
        get
        {
            return IsNull(0);
        }
    }

Your source system has supplied metadata that makes SSIS treat this column as binary (a BlobColumn). It could be NTEXT/TEXT or n/varchar(max) in the host. You need to do something to make it a compatible operand for the regex. I would clean up the column type in the source, but if that is not an option, use a Data Conversion transformation to turn it into a DT_STR/DT_WSTR type.
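If changing the column type upstream is not possible, another route (not the one used in this answer) is to decode the blob bytes into a string inside the script before calling the regex. The sketch below is self-contained and only shows the decoding step. `BlobCleaner` is a hypothetical helper, and the choice of encoding is an assumption that has to match the actual column.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class BlobCleaner
{
    private static readonly Regex Reg = new Regex("[~!@#$%^&*()_+`{};':,./<>?]");

    // blobBytes: raw bytes of the text/ntext column.
    // encoding:  must match the column. Assumption: Encoding.Unicode for DT_NTEXT,
    //            the column's code page encoding for DT_TEXT.
    public static string Clean(byte[] blobBytes, Encoding encoding)
    {
        if (blobBytes == null)
        {
            return null;
        }

        // Decode the bytes to a string so Regex.Replace(string, string) applies.
        string text = encoding.GetString(blobBytes);
        return Reg.Replace(text, "");
    }
}
```

Inside a script component, the bytes would presumably come from something like `Row.Na.GetBlobData(0, (int)Row.Na.Length)`, guarded by `Row.Na_IsNull`. Treat that call shape as an assumption to verify against your SSIS version.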

Wrap-up

In the data viewer attached to my first image, you can see that NaN and AgencyNamesCleaned have had the offending characters stripped correctly. You may also notice that my Script Task does not have a red X attached to it, as yours does. The red X means the script is in an invalid state.

Because you created the Copy of Agency_Names column in the Data Conversion component as DT_TEXT, connected it to the script component, and then changed the data type in the Data Conversion component, the red X on your script might be resolved by having the component refresh its metadata. Open the script and rebuild it (Ctrl+Shift+B) for good measure.

There should be no error underlines on reg.Replace(... in your code. If there are, some other aspect of your problem has not been described. My best advice at that point would be to recreate the proof-of-concept package as I described it. If that works, it becomes an exercise in finding the difference between what works and what does not.
