Go Reflection: Performance Bottlenecks and Zero-Copy Optimization

烯八 · 3 days ago
Original: https://www.yt-blog.top/38912/
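If you write Go, you cannot really avoid reflection: parsing tags, getting field offsets, fetching type information. ORMs, serialization, and config binding all depend on it.
But the standard reflect package is genuinely slow. Parsing a single field or tag costs tens to hundreds of nanoseconds, and once the calls pile up it becomes a performance bottleneck.
Many people know that "reflection is slow" without knowing where the time goes. Today we analyze it at the runtime level and work out a zero-copy optimization along the way.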
1. Starting from the internals

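To understand the performance problems of reflection, you first need to know what Go does underneath.
Since Go 1.14, the memory layout of a few core runtime types has not changed. This is the key point.
Go's reflect package is built on top of the runtime-level abi package.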
reflect/type.go
    // TypeOf returns the reflection [Type] that represents the dynamic type of i.
    // If i is a nil interface value, TypeOf returns nil.
    func TypeOf(i any) Type {
        return toType(abi.TypeOf(i))
    }
reflect.Type itself is just an interface. The toType() call in the code above hands the underlying *abi.Type back as a *reflect.rtype (through toRType).
    // rtype is the common implementation of most values.
    // It is embedded in other struct types.
    type rtype struct {
        t abi.Type
    }

    func toRType(t *abi.Type) *rtype {
        return (*rtype)(unsafe.Pointer(t))
    }
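So what you end up holding is an abi.Type instance. reflect.rtype only wraps it to offer a friendlier interface. The wrapper could just as well be some other purpose-built struct; in essence all of them are views over abi.Type.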
internal/abi/type.go
    // Type is the runtime representation of a Go type.
    //
    // Be careful about accessing this type at build time, as the version
    // of this type in the compiler/linker may not have the same layout
    // as the version in the target binary, due to pointer width
    // differences and any experiments. Use cmd/compile/internal/rttype
    // or the functions in compiletype.go to access this type instead.
    // (TODO: this admonition applies to every type in this package.
    // Put it in some shared location?)
    type Type struct {
        Size_       uintptr
        PtrBytes    uintptr // number of (prefix) bytes in the type that can contain pointers
        Hash        uint32  // hash of type; avoids computation in hash tables
        TFlag       TFlag   // extra type information flags
        Align_      uint8   // alignment of variable with this type
        FieldAlign_ uint8   // alignment of struct field with this type
        Kind_       Kind    // enumeration for C
        // function for comparing objects of this type
        // (ptr to object A, ptr to object B) -> ==?
        Equal func(unsafe.Pointer, unsafe.Pointer) bool
        // GCData stores the GC type data for the garbage collector.
        // Normally, GCData points to a bitmask that describes the
        // ptr/nonptr fields of the type. The bitmask will have at
        // least PtrBytes/ptrSize bits.
        // If the TFlagGCMaskOnDemand bit is set, GCData is instead a
        // **byte and the pointer to the bitmask is one dereference away.
        // The runtime will build the bitmask if needed.
        // (See runtime/type.go:getGCMask.)
        // Note: multiple types may have the same value of GCData,
        // including when TFlagGCMaskOnDemand is set. The types will, of course,
        // have the same pointer layout (but not necessarily the same size).
        GCData    *byte
        Str       NameOff // string form
        PtrToThis TypeOff // type for pointer to this type, may be zero
    }
In practice, the metadata for a struct type is an extension of the struct above, and it is defined in the same file.
internal/abi/type.go
    type StructField struct {
        Name   Name    // name is always non-empty
        Typ    *Type   // type of field
        Offset uintptr // byte offset of field
    }

    type StructType struct {
        Type
        PkgPath Name
        Fields  []StructField
    }
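One more point: for types known at compile time, the struct metadata stored in these low-level types is written by the compiler into the program's read-only data. Its address is fixed, the GC never reclaims it, and it cannot be modified at run time. That is what makes reading the underlying memory directly safe.
Given that, we can locate the target fields precisely through fixed offsets, without parsing the whole underlying struct. Defining a few empty mirror types to act as type annotations is enough.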
2. Where the bottlenecks are

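reflect.TypeOf() itself is nothing more than a pointer conversion: no copying, no computation, so it is fast. The real cost comes from the two stages that follow, and because nothing is cached, that cost is paid again on every call.
2.1 The Field method performs a pointless allocation

When you call reflect.Type.Field(i), the rtype is converted to a *structType, and the target field's information is read from its Fields slice.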
reflect/type.go
    // Struct field
    type structField = abi.StructField // note: this is the unexported reflect.structField, not the reflect.StructField you normally use

    // structType represents a struct type.
    type structType struct {
        abi.StructType
    }

    func (t *rtype) Field(i int) StructField {
        if t.Kind() != Struct {
            panic("reflect: Field of non-struct type " + t.String())
        }
        tt := (*structType)(unsafe.Pointer(t))
        return tt.Field(i)
    }

    // Field returns the i'th struct field.
    func (t *structType) Field(i int) (f StructField) {
        if i < 0 || i >= len(t.Fields) {
            panic("reflect: Field index out of bounds")
        }
        p := &t.Fields[i]
        f.Type = toType(p.Typ)
        f.Name = p.Name.Name()
        f.Anonymous = p.Embedded()
        if !p.Name.IsExported() {
            f.PkgPath = t.PkgPath.Name()
        }
        if tag := p.Name.Tag(); tag != "" {
            f.Tag = StructTag(tag)
        }
        f.Offset = p.Offset
        // We can't safely use this optimization on js or wasi,
        // which do not appear to support read-only data.
        if i < 256 && runtime.GOOS != "js" && runtime.GOOS != "wasip1" {
            staticuint64s := getStaticuint64s()
            p := unsafe.Pointer(&(*staticuint64s)[i])
            if unsafe.Sizeof(int(0)) == 4 && goarch.BigEndian {
                p = unsafe.Add(p, 4)
            }
            f.Index = unsafe.Slice((*int)(p), 1)
        } else {
            // NOTE(rsc): This is the only allocation in the interface
            // presented by a reflect.Type. It would be nice to avoid,
            // but we need to make sure that misbehaving clients of
            // reflect cannot affect other uses of reflect.
            // One possibility is CL 5371098, but we postponed that
            // ugliness until there is a demonstrated
            // need for the performance. This is issue 2320.
            f.Index = []int{i}
        }
        return
    }
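What is wrong with the code above? Look at the line f.Index = []int{i}. It allocates a one-element slice whose only content is the i you passed in yourself, which is entirely unnecessary; the allocation exists purely for compatibility. In the version quoted here, the staticuint64s fast path avoids it for i < 256 (except on js and wasip1), but releases without that fast path allocate on every call.
For the discussion, see golang/go · Issue #68380.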
2.2 String copying when reading tags

When a field is fetched as described above, the Tag member of StructField has type StructTag, which is really just a string.
reflect/type.go
    // A StructTag is the tag string in a struct field.
    //
    // By convention, tag strings are a concatenation of
    // optionally space-separated key:"value" pairs.
    // Each key is a non-empty string consisting of non-control
    // characters other than space (U+0020 ' '), quote (U+0022 '"'),
    // and colon (U+003A ':').  Each value is quoted using U+0022 '"'
    // characters and Go string literal syntax.
    type StructTag string

    // Get returns the value associated with key in the tag string.
    // If there is no such key in the tag, Get returns the empty string.
    // If the tag does not have the conventional format, the value
    // returned by Get is unspecified. To determine whether a tag is
    // explicitly set to the empty string, use [StructTag.Lookup].
    func (tag StructTag) Get(key string) string {
        v, _ := tag.Lookup(key)
        return v
    }

    // Lookup returns the value associated with key in the tag string.
    // If the key is present in the tag the value (which may be empty)
    // is returned. Otherwise the returned value will be the empty string.
    // The ok return value reports whether the value was explicitly set in
    // the tag string. If the tag does not have the conventional format,
    // the value returned by Lookup is unspecified.
    func (tag StructTag) Lookup(key string) (value string, ok bool) {
        // When modifying this code, also update the validateStructTag code
        // in cmd/vet/structtag.go.
        for tag != "" {
            // Skip leading space.
            i := 0
            for i < len(tag) && tag[i] == ' ' {
                i++
            }
            tag = tag[i:]
            if tag == "" {
                break
            }
            // Scan to colon. A space, a quote or a control character is a syntax error.
            // Strictly speaking, control chars include the range [0x7f, 0x9f], not just
            // [0x00, 0x1f], but in practice, we ignore the multi-byte control characters
            // as it is simpler to inspect the tag's bytes than the tag's runes.
            i = 0
            for i < len(tag) && tag[i] > ' ' && tag[i] != ':' && tag[i] != '"' && tag[i] != 0x7f {
                i++
            }
            if i == 0 || i+1 >= len(tag) || tag[i] != ':' || tag[i+1] != '"' {
                break
            }
            name := string(tag[:i])
            tag = tag[i+1:]
            // Scan quoted string to find value.
            i = 1
            for i < len(tag) && tag[i] != '"' {
                if tag[i] == '\\' {
                    i++
                }
                i++
            }
            if i >= len(tag) {
                break
            }
            qvalue := string(tag[:i+1])
            tag = tag[i+1:]
            if key == name {
                value, err := strconv.Unquote(qvalue)
                if err != nil {
                    break
                }
                return value, true
            }
        }
        return "", false
    }
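Here tag[:i] and tag[i+1:] are plain slicing operations: they only rewrite the string header (pointer and length) on the stack and copy no bytes. String conversions are a different matter in general. Whenever a conversion has to isolate the result from mutable bytes, as in the []byte to string case, Go copies the data to preserve memory safety, and the safe API offers no way around that copy. In Lookup itself the conversions are between StructTag and string, which share an underlying type, so the per-call cost is mostly the scan being repeated on every call, plus whatever strconv.Unquote allocates for values that contain escape sequences.
The mainstream way around the copy is what the official strings.Builder does in its String() method: since it does not need to isolate the original data from the new string, it uses unsafe.String(unsafe.SliceData(b.buf), len(b.buf)).
The resulting string and buf point to the same memory, so no extra copy is triggered. The string still references the backing array, so the GC will not reclaim it; memory safety then depends on the caller never modifying those bytes afterwards, which unsafe itself does not enforce.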
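The difference between the copying conversion and the unsafe.String approach can be observed by comparing data pointers. This is a minimal sketch that needs Go 1.20 or later for unsafe.String, unsafe.StringData, and unsafe.SliceData; `bytesToString` and `sameMemory` are helper names invented for the example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString returns a string that shares b's backing array.
// The caller must not modify b afterwards, because the rest of the
// program assumes that strings are immutable.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// sameMemory reports whether s starts at the first byte of b.
func sameMemory(s string, b []byte) bool {
	return unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	buf := []byte(`json:"id"`)
	copied := string(buf)        // copies the bytes into new storage
	shared := bytesToString(buf) // reuses buf's memory
	fmt.Println(sameMemory(copied, buf), sameMemory(shared, buf)) // false true
}
```

The shared string is only safe while buf is treated as read-only, which is why the technique suits type metadata that never changes at run time.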
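3. The idea behind the zero-copy optimization

Given the bottlenecks above, and the fact that the underlying type structures are fixed in Go 1.14+, the zero-copy approach is fairly simple:

  • Skip the wrapper layer of the reflect package and talk to the runtime layer directly, treating memory as read-only throughout and never copying unless necessary;
  • Define a few empty mirror types as type annotations, without filling in any fields, and use the fixed memory offsets of Go 1.14+ to locate the target fields precisely;
  • Unpack the reflect.Type interface to obtain the underlying raw memory address, then read data directly at fixed offsets through unsafe;
  • Keep a global cache of struct metadata so that each struct is parsed only once, avoiding repeated work on hot paths.
The core logic of this scheme is the same as what Go does internally. All offsets are preset from the fixed layout of Go 1.14+; if a particular release differs, the offsets are the only thing that should need adjusting, which keeps the compatibility risk small.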
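4. Implementation

After all that analysis, reflection is slow mainly for two reasons:

  • The Field method creates a pointless []int{i} slice (for compatibility)
  • Tag.Get triggers a memory copy of the string
Below is the complete zero-copy implementation:
4.1 Core definitions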

    //go:build go1.14
    // +build go1.14

    package zerorefl

    import (
        "reflect"
        "strconv"
        "unsafe"
    )

    const (
        // abiTypeSize is the size of the abi.Type struct.
        // Fixed at 48 bytes in Go 1.14+ (on 64-bit platforms).
        abiTypeSize = 48
    )

    // Empty mirror type: used only as a type annotation, no fields needed.
    type rtype struct{}

    type structType struct {
        PkgPath Name
        Fields  []structField
    }

    type structField struct {
        Name   Name
        Typ    *rtype
        Offset uintptr
    }

    // Name type, same as runtime.Name
    //
    //go:linkname Name runtime.Name
    type Name struct {
        Bytes *byte
    }

    // The methods below are the implementations of runtime.Name.

    //go:linkname Name_Name runtime.(*Name).Name
    //go:inline
    func (n *Name) Name() string {
        if n.Bytes == nil {
            return ""
        }
        i, l := n.ReadVarint(1)
        return unsafe.String(n.DataChecked(1+i, "non-empty string"), l)
    }

    //go:linkname Name_Tag runtime.(*Name).Tag
    //go:inline
    func (n *Name) Tag() string {
        if !n.HasTag() {
            return ""
        }
        i, l := n.ReadVarint(1)
        i2, l2 := n.ReadVarint(1 + i + l)
        return unsafe.String(n.DataChecked(1+i+l+i2, "non-empty string"), l2)
    }

    //go:linkname Name_IsExported runtime.(*Name).IsExported
    //go:inline
    func (n *Name) IsExported() bool {
        return (*n.Bytes)&(1