Optimization of open source software – Practice of parallel programming in code validation algorithms

0

The code validation algorithm is used to validate the code format of a data sequence. A typical example is the UTF-8 code validation algorithm. The UTF-8 validation algorithm is used to check if the data sequence conforms to UTF-8 encoding standards. For example, the UTF-8 validation algorithm can be used to validate the encoding of UTF-8 encoded emails, web pages, and application data. This document uses the optimization case of the UTF-8 validation algorithm in the Go open source community as an example to describe the general methods for analyzing and optimizing the algorithm.

1. GB UTF-8 encoding validation algorithm

Go implements the UTF-8 encoding validation algorithm to verify UTF-8 encoded data. The algorithm is designed based on the variable length UTF-8 encoding function. UTF-8 encoding uses 1 to 4 bytes to encode each character. ASCII encoding is the scenario when UTF-8 code length is 1 byte. Based on this feature, the UTF-8 validation algorithm uses a character to determine if the character is ASCII encoded. If not, it checks if the character is of another type of UTF-8 encoding. The sample code is as follows:

// The UTF-8 validation algorithm before optimization uses a byte to check whether it is encoded by ASCII. If not, it checks whether the character is of another UTF-8 encoding type.
func Valid(p []byte) bool {
 n := len(p)
 for i := 0; i < n; {
  // Check whether the byte character is encoded by ASCII. When there are a large number of ASCII coded characters, the calculation efficiency drops.
  pi := p[i]
  if pi < RuneSelf {
   i++
   continue
  }

  // Check whether the byte character is of another UTF8 encoding type.
  x := first[pi]
  if x == xx {
   return false // Illegal starter byte.
  }
  size := int(x & 7)
  if i+size > n {
   return false // Short or invalid.
  }
  accept := acceptRanges[x>>4]
  if c := p[i+1]; c < accept.lo || accept.hi < c {
   return false
  } else if size == 2 {
  } else if c := p[i+2]; c < locb || hicb < c {
   return false
  } else if size == 3 {
  } else if c := p[i+3]; c < locb || hicb < c {
   return false
  }
  i += size
 }
 return true
}

2. Scenario and problem analysis

After analyzing the UTF-8 validation algorithm before Go language optimization, it turns out that the algorithm is also used to verify ASCII characters. It’s also common to see a lot of ASCII-encoded characters in a long English email. The following figure shows an English email containing 1705 ASCII characters.

When the UTF-8 validation algorithm is used to verify the encoding of English email content, 1705 comparisons are required, as shown in the following figure.

picture

Therefore, in a scenario where a large amount of ASCII-encoded data needs to be validated, the UTF-8 encoding validation algorithm before optimization uses single-character comparison to check the encoding until the data set is checked cyclically. The execution of the algorithm takes time and the performance needs to be improved.

3. Optimization solution and implementation

To improve the UTF-8 validation algorithm, parallel programming can be applied to check multiple ASCII-encoded characters at once, thereby reducing the number of comparisons, speeding up validation, and improving the performance of the algorithm. The Go language UTF-8 validation algorithm uses the algorithm optimization solution based on the idea of ​​parallel programming. Eight ASCII-encoded characters can be checked at once, dramatically improving performance. The following figure shows the code comparison before and after the optimization of the algorithm.

picture

The optimized code is analyzed as follows:

// The optimized code, based on the parallel programming idea, checks whether eight bytes are ASCII encoded characters at a time.
func Valid(p []byte) bool {
 // Check whether eight bytes are ASCII encoded characters in each epoch.
 for len(p) >= 8 {
  // Two uint32 variables, first32 and second32, are used to store 4-byte data, respectively.
  first32 := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24
  second32 := uint32(p[4]) | uint32(p[5])<<8 | uint32(p[6])<<16 | uint32(p[7])<<24
  // The result of the AND operation between any ASCII encoded character and 0x80 is 0. The 8-byte ASCII character check is implemented through the AND operation between (first32|second32) and 0x80808080.
  if (first32|second32)&0x80808080 != 0 {
   break
  }
  p = p[8:]
 }

 // Check whether a byte is encoded in UTF-8 mode each time.
 ......
}

In order to further analyze the parallel optimization technique, the disassembly method is used to obtain the assembly code before and after the optimization of the algorithm. The assembly code before optimization is as follows:

picture

The assembly code after optimization is as follows:

picture

According to the analysis, before optimization, a unit8 variable is used to store data for encoding validation. Corresponding to the assembly code, the MOVBU The instruction is used to get one byte of data to the R4 register, and one byte data encoding is enabled at a time. In the optimized code, two unit32 variables are used to store data for encoding validation. In the corresponding assembly code, the MOVWU The instruction is used to get a piece of data W (4 bytes) to the R3 and R4 registers, respectively, and the 8-byte data encoding is enabled at a time. This optimization method increases the amount of data stored in the register in the assembly instruction by using more types of unit32 variables to store data in the Go language code, and implements parallel validation of more encoded data during the operation of register comparison.

4. Result of optimization

Use the Go to benchmark to test the performance of the algorithm before and after optimization, and use the Go to the test bench to compare performance test results before and after optimization. The following table lists the test results:

Test item Test cases Time per operation before optimization Time per operation after optimization Comparison
BenchmarkValidTenASCIIChars-8 10 byte slice 15.8 ns/op 8.00ns/op -49.37%
BenchmarkValidStringTenASCIIChars-8 A string of 10 characters 12.8 ns/op 8.04ns/operation -37.19%

To note: -8 indicates the GOMAXPROCS value when the function is running, and ns/op indicates the average time in nanoseconds to execute the function each time.

According to the performance test result, once the UTF-8 encoding validation algorithm is optimized, the average time consumed to validate ASCII encoding is reduced and the performance is improved by up to 49%.

5. Summary

The Go Language UTF-8 Verification Algorithm Optimization Case analyzes the performance issues of the algorithm in a specific scenario, provides an optimization solution based on the idea of ​​parallel programming, and verifies the result of optimization. This is an algorithm optimization practice worth learning. The idea of ​​parallel programming in the case can not only be used to optimize data code validation, but also be applied to other scenarios such as data encoding, decoding, compression, and storage.

Share.

Comments are closed.