Splitting up a String or Text in SQL

Posted on Friday, September 9th, 2011 at 8:15 pm in

Here is a way to split up a block of text and return a result set based on a delimeter. A table function is used to do the ‘heavy lifting’ and, utilizing some newer features of SQL Server (2005 and 2008), you can even join to it providing for even greater flexability.

This situation I was facing involved the display of text within a PowerBuilder Datawindow. When the data for a specific report datawindow was retrieved the display was sometimes cut off or overlapped other portions of the report layout (the header usually). This only seemed to happen when the text column was included on the report and the data within the column exceeded the number of lines you would expect to print on a single page. Since the text data was originally input by users, it almost always included carriage return / linefeed (CRLF) line terminators within it. The easiest way to correct this was to change the data source of the datawindow to parse the text and split it up based on these CRLF sequences. This way I would get a result set of many lines, each with a portion of the text, rather than one with the entire text block (which was what PowerBuilder was having trouble sizing/positioning correctly).

The table function was inspired by the example found here.

CREATE FUNCTION [dbo].[uft_splitnotes] (@StringArray NVARCHAR(MAX),
                                        @Delimiter   NVARCHAR(30))
RETURNS @Results TABLE 
(
  SeqNo INT IDENTITY(1, 1),
  Item  NVARCHAR(MAX)
)
AS
  BEGIN
      DECLARE @Next INT
      DECLARE @lenStringArray INT
      DECLARE @lenDelimiter INT
      DECLARE @ii INT

      --initialise everything
      IF UPPER(@Delimiter) = 'SPACE'
        BEGIN
            SET @Delimiter = ' '
        END
      IF UPPER(@Delimiter) = '[CR]'
        BEGIN
            SET @Delimiter = CHAR(13)
        END
      IF UPPER(@Delimiter) = '[LF]'
        BEGIN
            SET @Delimiter = CHAR(10)
        END
      IF UPPER(@Delimiter) = '[CRLF]'
        BEGIN
            SET @Delimiter = CHAR(13) + CHAR(10)
        END
      IF UPPER(@Delimiter) = 'CHAR(13)'
        BEGIN
            SET @Delimiter = CHAR(13)
        END
      IF UPPER(@Delimiter) = 'CHAR(10)'
        BEGIN
            SET @Delimiter = CHAR(10)
        END
      IF @Delimiter = 'CHAR(13) + CHAR(10)'
        BEGIN
            SET @Delimiter = CHAR(13) + CHAR(10)
        END

      SELECT @ii = 1,
             @lenStringArray = LEN(REPLACE(@StringArray, ' ', '|')),
             @lenDelimiter = LEN(REPLACE(@Delimiter, ' ', '|'))
      --notice we have to be cautious about LEN with trailing spaces!
      --while there is more of the string…
      WHILE @ii <= @lenStringArray
        BEGIN--find the next occurrence of the delimiter in the stringarray
            SELECT @Next = CHARINDEX(@Delimiter, @StringArray + @Delimiter, @ii)

            INSERT INTO @Results
                        (Item)
            SELECT SUBSTRING(@StringArray, @ii, @Next - @ii)
            --note that we can get all the items from the list by appeending a
            --delimiter to the final string
            SELECT @ii = @Next + @lenDelimiter
        END
      RETURN
  END

Now since I wanted to split the string based on embedded CRLF sequences, I set up a series of IF statements to allow for various ways to designate these. I’m sure there are better ways to do this part.

Sample output:

Although this was written in SQL Server 2008, I’m sure it can be replicated in other flavors of SQL.

Top