.NET performance optimization - concatenating strings with ValueStringBuilder

Preface

The tip I want to share this time applies to string concatenation. We often run into situations where many short strings need to be joined together. Using String.Concat, that is, the += operator, is not recommended here.
At present, the most commonly recommended solution is to build these strings with StringBuilder. Is there anything faster that also uses less memory? There is: the ValueStringBuilder I want to introduce today.

ValueStringBuilder

ValueStringBuilder is not a public API, but it is widely used inside .NET. Because it is a value type, the builder itself is not allocated on the heap and puts no pressure on the GC.
Microsoft's ValueStringBuilder can be used in two ways. The first is to hand it an existing block of memory in which to build the string. That memory can be stack space, heap space, or even unmanaged heap space, which is very GC friendly and can greatly reduce GC pressure under high concurrency.

// Constructor: pass in a Span<char> to use as the buffer
public ValueStringBuilder(Span<char> initialBuffer);

// Usage (three alternatives, pick one):
// 1. Stack space
var vsb = new ValueStringBuilder(stackalloc char[512]);
// 2. Ordinary heap array
var vsb = new ValueStringBuilder(new char[512]);
// 3. Unmanaged heap (requires an unsafe context)
var length = 512;
var ptr = NativeMemory.Alloc((nuint)(length * sizeof(char)));
var span = new Span<char>(ptr, length);
var vsb = new ValueStringBuilder(span);
.....
NativeMemory.Free(ptr); // Unmanaged memory is not collected by the GC, so be sure to Free it

The second way is to specify a capacity. The builder then rents its buffer from the default ArrayPool<char> object pool. Because a pool is used, this is also relatively GC friendly. Note that a rented array must be returned to the pool.

// Pass in the estimated capacity
public ValueStringBuilder(int initialCapacity)  
{  
    // Get buffer from object pool
    _arrayToReturnToPool = ArrayPool<char>.Shared.Rent(initialCapacity);  
    ......
}
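
The rent-and-return discipline behind this constructor can be tried with the public ArrayPool<char> API alone. The following is a minimal sketch, not the runtime's code; PoolDemo and JoinDigits are hypothetical names used only for illustration.

```csharp
using System;
using System.Buffers;

Console.WriteLine(PoolDemo.JoinDigits(5)); // prints "01234"

static class PoolDemo
{
    // Hypothetical helper: builds "0123..." from the first `count` digits in a rented buffer.
    public static string JoinDigits(int count)
    {
        if (count < 0 || count > 10) throw new ArgumentOutOfRangeException(nameof(count));

        // Rent may return an array larger than requested, so track the used length ourselves.
        char[] rented = ArrayPool<char>.Shared.Rent(count);
        try
        {
            for (int i = 0; i < count; i++)
            {
                rented[i] = (char)('0' + i);
            }
            return new string(rented, 0, count);
        }
        finally
        {
            // Always hand the array back, even if an exception is thrown.
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

The try/finally plays the role that Dispose plays for ValueStringBuilder: whatever happens while building, the array goes back to the pool.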

Let's compare the performance of +=, StringBuilder and ValueStringBuilder.

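// Namespaces needed by the benchmark below
using System;
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;

// A simple class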
public class SomeClass  
{  
    public int Value1; public int Value2; public float Value3;  
    public double Value4; public string? Value5; public decimal Value6;  
    public DateTime Value7; public TimeOnly Value8; public DateOnly Value9;  
    public int[]? Value10;  
}
// Benchmark class
[MemoryDiagnoser]  
[HtmlExporter]  
[Orderer(SummaryOrderPolicy.FastestToSlowest)]  
public class StringBuilderBenchmark  
{  
    private static readonly SomeClass Data;  
    static StringBuilderBenchmark()  
    {  
        var baseTime = DateTime.Now;  
        Data = new SomeClass  
        {  
            Value1 = 100, Value2 = 200, Value3 = 333,  
            Value4 = 400, Value5 = string.Join('-', Enumerable.Range(0, 10000).Select(i => i.ToString())),  
            Value6 = 655, Value7 = baseTime.AddHours(12),  
            Value8 = TimeOnly.MinValue, Value9 = DateOnly.MaxValue,  
            Value10 = Enumerable.Range(0, 5).ToArray()  
        };  
    }

    // Use the familiar StringBuilder
    [Benchmark(Baseline = true)]  
    public string StringBuilder()  
    {  
        var data = Data;  
        var sb = new StringBuilder();  
        sb.Append("Value1:"); sb.Append(data.Value1);  
        if (data.Value2 > 10)  
        {  
            sb.Append(" ,Value2:"); sb.Append(data.Value2);  
        }  
        sb.Append(" ,Value3:"); sb.Append(data.Value3);  
        sb.Append(" ,Value4:"); sb.Append(data.Value4);  
        sb.Append(" ,Value5:"); sb.Append(data.Value5);  
        if (data.Value6 > 20)  
        {  
            sb.Append(" ,Value6:"); sb.AppendFormat("{0:F2}", data.Value6);  
        }  
        sb.Append(" ,Value7:"); sb.AppendFormat("{0:yyyy-MM-dd HH:mm:ss}", data.Value7);  
        sb.Append(" ,Value8:"); sb.AppendFormat("{0:HH:mm:ss}", data.Value8);  
        sb.Append(" ,Value9:"); sb.AppendFormat("{0:yyyy-MM-dd}", data.Value9);  
        sb.Append(" ,Value10:");  
        if (data.Value10 is null or {Length: 0}) return sb.ToString();  
        for (int i = 0; i < data.Value10.Length; i++)  
        {  
            sb.Append(data.Value10[i]);  
        }  
  
        return sb.ToString();  
    }

    // StringBuilder with an initial capacity
    [Benchmark]  
    public string StringBuilderCapacity()  
    {  
        var data = Data;  
        var sb = new StringBuilder(20480);  
        sb.Append("Value1:"); sb.Append(data.Value1);  
        if (data.Value2 > 10)  
        {  
            sb.Append(" ,Value2:"); sb.Append(data.Value2);  
        }  
        sb.Append(" ,Value3:"); sb.Append(data.Value3);  
        sb.Append(" ,Value4:"); sb.Append(data.Value4);  
        sb.Append(" ,Value5:"); sb.Append(data.Value5);  
        if (data.Value6 > 20)  
        {  
            sb.Append(" ,Value6:"); sb.AppendFormat("{0:F2}", data.Value6);  
        }  
        sb.Append(" ,Value7:"); sb.AppendFormat("{0:yyyy-MM-dd HH:mm:ss}", data.Value7);  
        sb.Append(" ,Value8:"); sb.AppendFormat("{0:HH:mm:ss}", data.Value8);  
        sb.Append(" ,Value9:"); sb.AppendFormat("{0:yyyy-MM-dd}", data.Value9);  
        sb.Append(" ,Value10:");  
        if (data.Value10 is null or {Length: 0}) return sb.ToString();  
        for (int i = 0; i < data.Value10.Length; i++)  
        {  
            sb.Append(data.Value10[i]);  
        }  
  
        return sb.ToString();  
    }  

    // Concatenate strings directly with +=
    [Benchmark]  
    public string StringConcat()  
    {  
        var str = "";  
        var data = Data;  
        str += ("Value1:"); str += (data.Value1);  
        if (data.Value2 > 10)  
        {  
            str += " ,Value2:"; str += data.Value2;  
        }  
        str += " ,Value3:"; str += (data.Value3);  
        str += " ,Value4:"; str += (data.Value4);  
        str += " ,Value5:"; str += (data.Value5);  
        if (data.Value6 > 20)  
        {  
            str += " ,Value6:"; str += data.Value6.ToString("F2");  
        }  
        str += " ,Value7:"; str += data.Value7.ToString("yyyy-MM-dd HH:mm:ss");  
        str += " ,Value8:"; str += data.Value8.ToString("HH:mm:ss");  
        str += " ,Value9:"; str += data.Value9.ToString("yyyy-MM-dd");  
        str += " ,Value10:";  
        if (data.Value10 is not null && data.Value10.Length > 0)  
        {  
            for (int i = 0; i < data.Value10.Length; i++)  
            {  
                str += (data.Value10[i]);  
            }     
        }  
  
        return str;  
    }  
  
    // ValueStringBuilder with a stack-allocated buffer
    [Benchmark]  
    public string ValueStringBuilderOnStack()  
    {  
        var data = Data;  
        Span<char> buffer = stackalloc char[20480];  
        var sb = new ValueStringBuilder(buffer);  
        sb.Append("Value1:"); sb.AppendSpanFormattable(data.Value1);  
        if (data.Value2 > 10)  
        {  
            sb.Append(" ,Value2:"); sb.AppendSpanFormattable(data.Value2);  
        }  
        sb.Append(" ,Value3:"); sb.AppendSpanFormattable(data.Value3);  
        sb.Append(" ,Value4:"); sb.AppendSpanFormattable(data.Value4);  
        sb.Append(" ,Value5:"); sb.Append(data.Value5);  
        if (data.Value6 > 20)  
        {  
            sb.Append(" ,Value6:"); sb.AppendSpanFormattable(data.Value6, "F2");  
        }  
        sb.Append(" ,Value7:"); sb.AppendSpanFormattable(data.Value7, "yyyy-MM-dd HH:mm:ss");  
        sb.Append(" ,Value8:"); sb.AppendSpanFormattable(data.Value8, "HH:mm:ss");  
        sb.Append(" ,Value9:"); sb.AppendSpanFormattable(data.Value9, "yyyy-MM-dd");  
        sb.Append(" ,Value10:");  
        if (data.Value10 is not null && data.Value10.Length > 0)  
        {  
            for (int i = 0; i < data.Value10.Length; i++)  
            {  
                sb.AppendSpanFormattable(data.Value10[i]);  
            }     
        }  
  
        return sb.ToString();  
    }
    // ValueStringBuilder with a buffer rented from the ArrayPool (heap)
    [Benchmark]  
    public string ValueStringBuilderOnHeap()  
    {  
        var data = Data;  
        var sb = new ValueStringBuilder(20480);  
        sb.Append("Value1:"); sb.AppendSpanFormattable(data.Value1);  
        if (data.Value2 > 10)  
        {  
            sb.Append(" ,Value2:"); sb.AppendSpanFormattable(data.Value2);  
        }  
        sb.Append(" ,Value3:"); sb.AppendSpanFormattable(data.Value3);  
        sb.Append(" ,Value4:"); sb.AppendSpanFormattable(data.Value4);  
        sb.Append(" ,Value5:"); sb.Append(data.Value5);  
        if (data.Value6 > 20)  
        {  
            sb.Append(" ,Value6:"); sb.AppendSpanFormattable(data.Value6, "F2");  
        }  
        sb.Append(" ,Value7:"); sb.AppendSpanFormattable(data.Value7, "yyyy-MM-dd HH:mm:ss");  
        sb.Append(" ,Value8:"); sb.AppendSpanFormattable(data.Value8, "HH:mm:ss");  
        sb.Append(" ,Value9:"); sb.AppendSpanFormattable(data.Value9, "yyyy-MM-dd");  
        sb.Append(" ,Value10:");  
        if (data.Value10 is not null && data.Value10.Length > 0)  
        {  
            for (int i = 0; i < data.Value10.Length; i++)  
            {  
                sb.AppendSpanFormattable(data.Value10[i]);  
            }     
        }
  
        return sb.ToString();  
    }
      
}

The results are as follows.

From the results above, we can draw the following conclusions.

  • StringConcat is the slowest and is not recommended in any case.
  • StringBuilder is 6.5 times faster than StringConcat, which makes it the generally recommended method.
  • Setting the initial capacity of StringBuilder is 25% faster than using StringBuilder without it. As I argued in my earlier post "You should set the initial size for the collection type", setting the initial size is absolutely recommended.
  • The ValueStringBuilder allocated on the stack is 50% faster than StringBuilder and 25% faster than StringBuilder with the initial capacity set. It also triggers the fewest GCs.
  • The ValueStringBuilder allocated on the heap is 55% faster than StringBuilder, and it triggers slightly more GCs than the stack-allocated version.

From these conclusions we can see that ValueStringBuilder performs very well. Even against a StringBuilder whose initial capacity is already set, the stack-allocated version is still 25% faster.

Source code analysis

The source code of ValueStringBuilder is not long. Let's pick out some of the important methods. Part of the source code is shown below.

// Declared as a ref struct, so an instance can only live on the stack
public ref struct ValueStringBuilder
{
    // If the buffer was rented from the ArrayPool, keep a reference to it
    // so that it can be returned on Dispose
    private char[]? _arrayToReturnToPool;
    // The working buffer: the caller-supplied span or the rented array
    private Span<char> _chars;
    // Current string length
    private int _pos;

    // External incoming buffer
    public ValueStringBuilder(Span<char> initialBuffer)
    {
        // A caller-supplied buffer is not rented from the pool, so there is nothing to return
        _arrayToReturnToPool = null;
        _chars = initialBuffer;
        _pos = 0;
    }

    public ValueStringBuilder(int initialCapacity)
    {
        // When only a capacity is passed in, the buffer is rented from the ArrayPool
        _arrayToReturnToPool = ArrayPool<char>.Shared.Rent(initialCapacity);
        _chars = _arrayToReturnToPool;
        _pos = 0;
    }

    // The length of the string being built; it is readable and writable,
    // so to reuse a ValueStringBuilder, just set Length to 0
    public int Length
    {
        get => _pos;
        set
        {
            Debug.Assert(value >= 0);
            Debug.Assert(value <= _chars.Length);
            _pos = value;
        }
    }

    ......

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Append(char c)
    {
        // Appending a character is very cheap: it is written directly to the next Span position
        int pos = _pos;
        if ((uint) pos < (uint) _chars.Length)
        {
            _chars[pos] = c;
            _pos = pos + 1;
        }
        else
        {
            // If the buffer is full, take the slow path that grows it first
            GrowAndAppend(c);
        }
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Append(string? s)
    {
        if (s == null)
        {
            return;
        }

        // Appending a string is just as cheap on the fast path
        int pos = _pos;
        // Fast path: a single-character string that fits is written directly
        if (s.Length == 1 && (uint) pos < (uint) _chars.Length)
        {
            _chars[pos] = s[0];
            _pos = pos + 1;
        }
        else
        {
            // Longer strings, or a full buffer, take the slower path
            AppendSlow(s);
        }
    }

    private void AppendSlow(string s)
    {
        // If the remaining space is too small, grow the buffer first,
        // then copy with Span.CopyTo, which is quite efficient
        int pos = _pos;
        if (pos > _chars.Length - s.Length)
        {
            Grow(s.Length);
        }

        s
#if !NETCOREAPP
                .AsSpan()
#endif
            .CopyTo(_chars.Slice(pos));
        _pos += s.Length;
    }

    // Special handling for values that need to be formatted
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void AppendSpanFormattable<T>(T value, string? format = null, IFormatProvider? provider = null)
        where T : ISpanFormattable
    {
        // ISpanFormattable formats straight into the buffer, without an intermediate string
        if (value.TryFormat(_chars.Slice(_pos), out int charsWritten, format, provider))
        {
            _pos += charsWritten;
        }
        else
        {
            Append(value.ToString(format, provider));
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private void GrowAndAppend(char c)
    {
        // Grow the buffer by at least one character, then append
        Grow(1);
        Append(c);
    }

    // Capacity expansion method
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void Grow(int additionalCapacityBeyondPos)
    {
        Debug.Assert(additionalCapacityBeyondPos > 0);
        Debug.Assert(_pos > _chars.Length - additionalCapacityBeyondPos,
            "Grow called incorrectly, no resize is needed.");

        // The capacity at least doubles; the new buffer is always rented from the object pool
        char[] poolArray = ArrayPool<char>.Shared.Rent((int) Math.Max((uint) (_pos + additionalCapacityBeyondPos),
            (uint) _chars.Length * 2));

        _chars.Slice(0, _pos).CopyTo(poolArray);

        char[]? toReturn = _arrayToReturnToPool;
        _chars = _arrayToReturnToPool = poolArray;
        if (toReturn != null)
        {
            // If the old buffer came from the object pool, it must be returned
            ArrayPool<char>.Shared.Return(toReturn);
        }
    }

    // Returns the rented buffer, if any, to the pool
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Dispose()
    {
        char[]? toReturn = _arrayToReturnToPool;
        this = default; // For safety, reset this instance before returning the buffer
        if (toReturn != null)
        {
            // Be sure to return the array to the object pool
            ArrayPool<char>.Shared.Return(toReturn);
        }
    }
}
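
The size computation inside Grow can be isolated as a small pure function to see how it behaves. This is a sketch that copies only the Math.Max expression from the source above; GrowthDemo and NextCapacity are hypothetical names, and the size of the array the pool actually hands back may be larger than the requested value.

```csharp
using System;

Console.WriteLine(GrowthDemo.NextCapacity(pos: 10, additional: 1, currentLength: 10));   // prints 20
Console.WriteLine(GrowthDemo.NextCapacity(pos: 10, additional: 100, currentLength: 10)); // prints 110

static class GrowthDemo
{
    // Requested size: double the current buffer, or more if the pending append needs it.
    public static int NextCapacity(int pos, int additional, int currentLength)
        => (int)Math.Max((uint)(pos + additional), (uint)currentLength * 2);
}
```

Small appends double the buffer, while one large append jumps straight to the size it needs, so a single long string does not trigger a chain of doublings.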

From the source code above, we can summarize several characteristics of ValueStringBuilder:

  • Compared with StringBuilder, the implementation is very simple.
  • Everything is geared toward high performance: Span is used throughout, the hot paths are marked for aggressive inlining, and buffers come from the object pool.
  • Memory consumption is very low. It is a struct, and because it is a ref struct it can never be boxed or allocated on the heap.
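
The TryFormat path that AppendSpanFormattable relies on can be tried with public APIs (.NET 6 or later). This is a minimal sketch under that assumption; FormatDemo and AppendFormatted are hypothetical names, and unlike the real builder the sketch throws instead of growing when the buffer is too small.

```csharp
using System;
using System.Globalization;

Span<char> buffer = stackalloc char[32];
int pos = 0;

pos += FormatDemo.AppendFormatted(buffer.Slice(pos), 42);
pos += FormatDemo.AppendFormatted(buffer.Slice(pos), 6.5m, "F2");
Console.WriteLine(buffer.Slice(0, pos).ToString()); // prints "426.50"

static class FormatDemo
{
    // Hypothetical helper: formats the value straight into destination
    // and returns the number of characters written.
    public static int AppendFormatted<T>(Span<char> destination, T value, ReadOnlySpan<char> format = default)
        where T : ISpanFormattable
    {
        if (!value.TryFormat(destination, out int written, format, CultureInfo.InvariantCulture))
        {
            throw new InvalidOperationException("Destination buffer is too small.");
        }
        return written;
    }
}
```

Because T is constrained to ISpanFormattable, value types such as int and decimal are formatted without boxing and without creating a temporary string.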

Applicable scenarios

ValueStringBuilder is a high-performance way to build strings, and it suits several different scenarios.
1. Very high-frequency string concatenation where the strings are short. Here you can use a ValueStringBuilder with a stack-allocated buffer.
ASP.NET Core is well known for its performance, and the internal UrlBuilder library it depends on uses this approach. Because memory allocated on the stack is reclaimed when the current method returns, it causes no GC pressure.

2. Very high-frequency string concatenation where the string length cannot be controlled. Here, use the ValueStringBuilder that rents a buffer of the specified capacity from the ArrayPool. The .NET BCL does this in many places, for example in the ToString implementation of dynamic methods. Although renting from the pool is not as efficient as allocating on the stack, it still reduces memory usage and GC pressure.

3. Very high-frequency string concatenation where the string length is known in advance. Here you can combine stack allocation with ArrayPool allocation, as the regular expression parsing class does: if the string is short, stack space is used, and if it is long, the ArrayPool is used.
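
The combined approach in point 3 can be sketched with public APIs only. This is an illustration under stated assumptions, not the regular expression parser's code: HybridDemo, Repeat and the 256-character StackLimit threshold are all hypothetical choices made for the example.

```csharp
using System;
using System.Buffers;

Console.WriteLine(HybridDemo.Repeat("ab", 3));          // prints "ababab"
Console.WriteLine(HybridDemo.Repeat("ab", 500).Length); // prints 1000

static class HybridDemo
{
    private const int StackLimit = 256; // assumed threshold in chars, chosen for illustration

    public static string Repeat(string value, int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        int length = checked(value.Length * count);

        // Short results use the stack; long ones rent from the pool.
        char[]? rented = null;
        Span<char> buffer = length <= StackLimit
            ? stackalloc char[StackLimit]
            : (rented = ArrayPool<char>.Shared.Rent(length));
        try
        {
            for (int i = 0; i < count; i++)
            {
                value.AsSpan().CopyTo(buffer.Slice(i * value.Length));
            }
            return buffer.Slice(0, length).ToString();
        }
        finally
        {
            // Only a rented array has to be returned; stack memory frees itself.
            if (rented is not null) ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

The stackalloc size is a constant rather than the computed length, so the stack usage of the method stays bounded no matter what the caller passes in.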

Caveats

1. ValueStringBuilder cannot be used across an await in an async method. Because ValueStringBuilder is a ref struct, it can only live on the stack, while async/await is compiled into a state machine that splits the method at each await and keeps its locals on the heap. A ValueStringBuilder therefore cannot be carried across an await, and the compiler rejects code that tries.

2. A ValueStringBuilder backed by a stack buffer cannot be used as a return value: the stack memory is released when the method returns, so the builder would point to an invalid address. The compiler rejects this as well.

3. If you want to pass a ValueStringBuilder to another method, you must pass it by ref. Otherwise the struct is copied and each copy keeps its own position. The compiler does not warn about this, so you have to be very careful.

4. If you allocate on the stack, it is safer to keep the buffer within 5 KB. I will explain why in a later post.
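
Points 1 and 3 can be sketched together. ValueStringBuilder itself is internal, so this example uses TinyBuilder, a hypothetical stand-in with only the members needed here; Demo and its methods are likewise invented for illustration.

```csharp
using System;
using System.Threading.Tasks;

Console.WriteLine(await Demo.DescribeAsync(7));     // prints "id=7"
Console.WriteLine(Demo.LengthAfterByValueAppend()); // prints 0

// Hypothetical stand-in for ValueStringBuilder; it never grows its buffer.
ref struct TinyBuilder
{
    private readonly Span<char> _chars;
    private int _pos;

    public TinyBuilder(Span<char> buffer) { _chars = buffer; _pos = 0; }
    public int Length => _pos;

    public void Append(ReadOnlySpan<char> s)
    {
        s.CopyTo(_chars.Slice(_pos)); // throws if the buffer is too small
        _pos += s.Length;
    }

    public override string ToString() => _chars.Slice(0, _pos).ToString();
}

static class Demo
{
    // Point 1: keep the builder inside a synchronous helper so it never crosses an await.
    public static async Task<string> DescribeAsync(int id)
    {
        await Task.Yield(); // stands in for real asynchronous work
        return BuildLabel(id);
    }

    public static string BuildLabel(int id)
    {
        var builder = new TinyBuilder(stackalloc char[16]);
        AppendByRef(ref builder, "id=");
        Span<char> digits = stackalloc char[11]; // enough for any int
        id.TryFormat(digits, out int written);
        AppendByRef(ref builder, digits.Slice(0, written));
        return builder.ToString();
    }

    // Point 3: passing by value mutates a copy, so the caller's position is unchanged.
    public static int LengthAfterByValueAppend()
    {
        var builder = new TinyBuilder(stackalloc char[16]);
        AppendByValue(builder, "lost");
        return builder.Length;
    }

    private static void AppendByValue(TinyBuilder b, ReadOnlySpan<char> s) => b.Append(s);
    private static void AppendByRef(ref TinyBuilder b, ReadOnlySpan<char> s) => b.Append(s);
}
```

The by-value call still writes into the shared buffer, but the caller's position never moves, so the next append silently overwrites those characters. That is why the mistake is easy to miss.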

Summary

Today I shared ValueStringBuilder, a high-performance struct for string concatenation that allocates almost no memory of its own. It is recommended for most scenarios, but be very careful with the caveats mentioned above. If your scenario does not meet those conditions, you can still use the efficient StringBuilder to concatenate strings.

Source code link of this article: https://github.com/InCerryGit/BlogCode-Use-ValueStringBuilder

Tags: C# .NET Optimize

Posted by pythian on Wed, 11 May 2022 04:29:27 +0300