54

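I have the following code, which should convert a rune to a string and print it. However, I get undefined characters when it is printed. I can't figure out where the bug is: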

package main

import (
    "fmt"
    "strconv"
    "strings"
    "text/scanner"
)

func main() {
    var b scanner.Scanner
    const a = `a`
    b.Init(strings.NewReader(a))
    c := b.Scan()
    fmt.Println(strconv.QuoteRune(c))
}
4

3 Answers

51

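That's because you used Scanner.Scan() to read a rune, but it does something else. Scanner.Scan() reads tokens, or the runes of special tokens, as controlled by the Scanner.Mode bitmask. For tokens it returns special constants from the text/scanner package, not the rune that was read.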

To read a single rune, use Scanner.Next() instead:

c := b.Next()
fmt.Println(c, string(c), strconv.QuoteRune(c))

Output:

97 a 'a'
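To read every rune rather than just the first, Next() can be called in a loop until it returns scanner.EOF. The following is only a sketch: readRunes is a made-up helper name, and the sample input is borrowed from another answer in this thread.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// readRunes is a hypothetical helper (not part of text/scanner): it collects
// every rune that Scanner.Next yields before it reports scanner.EOF.
func readRunes(src string) []rune {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))

	var out []rune
	for r := s.Next(); r != scanner.EOF; r = s.Next() {
		out = append(out, r)
	}
	return out
}

func main() {
	// Print each rune as a number and as a quoted character.
	for _, r := range readRunes("aഐb") {
		fmt.Printf("%d %q\n", r, r)
	}
}
```

Unlike Scan(), Next() does no tokenizing at all, so it works the same for input that is not valid Go source.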

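If you just want to convert a single rune to a string, use a simple type conversion. rune is an alias for int32, and the spec says this about converting integers to string:

Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer.

So: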

r := rune('a')
fmt.Println(r, string(r))

Output:

97 a
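One thing that is easy to mix up here: the conversion gives you the character for that code point, not the number written out in decimal. A small sketch of the difference, where asCharacter and asDecimal are names invented for this example:

```go
package main

import (
	"fmt"
	"strconv"
)

// asCharacter returns the UTF-8 encoding of the code point r
// (hypothetical helper, it just wraps the built-in conversion).
func asCharacter(r rune) string { return string(r) }

// asDecimal returns the numeric value of r as decimal text
// (hypothetical helper around strconv.Itoa).
func asDecimal(r rune) string { return strconv.Itoa(int(r)) }

func main() {
	r := rune(97)
	fmt.Println(asCharacter(r)) // a
	fmt.Println(asDecimal(r))   // 97
}
```

If you want the digits, use strconv.Itoa (or fmt.Sprint); if you want the character, string(r) is the right tool.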

Also, to loop over the runes of a string value, you can simply use the for ... range construct:

for i, r := range "abc" {
    fmt.Printf("%d - %c (%v)\n", i, r, r)
}

Output:

0 - a (97)
1 - b (98)
2 - c (99)
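Note that the index in that loop is a byte offset, which only happens to match the rune position for ASCII input. A quick sketch with a string that contains a 3-byte character (runeOffsets is a hypothetical helper written for this illustration):

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper: it records the index that range
// yields for each rune of s, i.e. the byte offset where that rune starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	fmt.Println(runeOffsets("abc"))  // [0 1 2]
	fmt.Println(runeOffsets("aഐbc")) // [0 1 4 5]
}
```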

Or you can simply convert a string value to []rune:

fmt.Println([]rune("abc")) // Output: [97 98 99]
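The conversion is also a handy way to see that len() of a string counts bytes, not runes. A minimal sketch (byteAndRuneCount is a name made up here; utf8.RuneCountInString gives the rune count without allocating a slice):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper that returns the length of s
// in bytes and in runes.
func byteAndRuneCount(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "aഐbc"
	nBytes, nRunes := byteAndRuneCount(s)
	fmt.Println(nBytes, nRunes, len([]rune(s))) // 6 4 4
}
```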

There is also utf8.DecodeRuneInString().
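For completeness, here is roughly how utf8.DecodeRuneInString() can be used to walk a string by hand. It returns the first rune and its width in bytes, so you slice the width off and repeat. decodeAll is a hypothetical helper, not something from the standard library:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeAll is a hypothetical helper: it decodes s one rune at a time and
// returns the runes together with the number of bytes each one occupied.
func decodeAll(s string) (runes []rune, sizes []int) {
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		runes = append(runes, r)
		sizes = append(sizes, size)
		s = s[size:] // drop the bytes just decoded
	}
	return runes, sizes
}

func main() {
	runes, sizes := decodeAll("aഐ")
	for i, r := range runes {
		fmt.Printf("%c takes %d byte(s)\n", r, sizes[i])
	}
}
```

For invalid UTF-8 the function returns utf8.RuneError with a width of 1, so the loop still makes progress.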

Try the examples on the Go Playground.

Notes:

Your original code (using Scanner.Scan()) works like this:

  1. You call Scanner.Init(), which sets the Mode (b.Mode) to scanner.GoTokens.
  2. Calling Scanner.Scan() on the input (from "a") returns scanner.Ident, because "a" is a valid Go identifier:

    c := b.Scan()
    if c == scanner.Ident {
        fmt.Println("Identifier:", b.TokenText())
    }
    
    // Output: "Identifier: a"
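This also explains the "undefined character" in the question. scanner.Ident is a negative constant, and strconv.QuoteRune is documented to treat a value that is not a valid Unicode code point as the replacement character U+FFFD. A self-contained sketch to see it, where scanFirst is a helper name invented for this example:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/scanner"
)

// scanFirst is a hypothetical helper: it scans the first token of src and
// returns the value Scan reported together with the token text.
func scanFirst(src string) (tok rune, text string) {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	tok = s.Scan()
	return tok, s.TokenText()
}

func main() {
	tok, text := scanFirst("a")

	// tok is the token class (scanner.Ident), not the rune 'a'.
	fmt.Println(tok, scanner.TokenString(tok), text) // -2 Ident a

	// Quoting the negative token value yields the replacement character.
	fmt.Println(strconv.QuoteRune(tok))
}
```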
    
answered 2016-08-31T09:31:10.833
4

I know I'm a bit late to the party, but here is a []rune to string function:

func runesToString(runes []rune) (outString string) {
    // don't need index so _
    for _, v := range runes {
        outString += string(v)
    }
    return
}

Yes, there is a named return, but I think that's OK in this case since it reduces the number of lines and the function is short.
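Worth knowing as well: Go can already convert a []rune to a string directly with string(runes), and if you do want an explicit loop, strings.Builder avoids building a new string on every iteration. A small sketch comparing the two (runesToStringBuilder is a name chosen for this example only):

```go
package main

import (
	"fmt"
	"strings"
)

// runesToStringBuilder is a hypothetical loop-based variant that appends
// each rune to a strings.Builder instead of concatenating strings.
func runesToStringBuilder(runes []rune) string {
	var sb strings.Builder
	for _, r := range runes {
		sb.WriteRune(r)
	}
	return sb.String()
}

func main() {
	runes := []rune{'a', 'ഐ', 'b', 'c'}

	fmt.Println(string(runes))               // built-in conversion
	fmt.Println(runesToStringBuilder(runes)) // explicit loop, same result
}
```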

answered 2017-09-03T08:26:54.473
1

Since I came to this question while searching for runes, strings and chars, I thought this might help newbies like me:

// str := "aഐbc"
// testString(str)
func testString(oneString string) {

    // string to byte slice: no sweat, just convert it.
    // A string is a read-only sequence of bytes; the conversion copies them.
    var twoByteArr []byte = []byte(oneString)

    // string to rune slice: no sweat either.
    // The conversion decodes the UTF-8 bytes into one rune per code point.
    var threeRuneSlice []rune = []rune(oneString)

    // Hmm! A string seems to have a dual personality: it is stored as bytes,
    // but it can be read as a sequence of runes. Read on.

    // A rune slice can be converted back to a string: no sweat,
    // each rune is encoded as UTF-8 again.
    var thirdString string = string(threeRuneSlice)

    // There is a catch, and it is in printing "characters" using a for loop and range.

    fmt.Println("Chars in oneString")
    for i, r := range oneString {
        fmt.Printf(" %d  %v  %c ", i, r, r) // you may not get index 0,1,2,3 here
        // since range over a string yields byte offsets: https://blog.golang.org/strings
    }

    fmt.Println("\nChars in threeRuneSlice")
    for i, r := range threeRuneSlice {
        fmt.Printf(" %d  %v  %c ", i, r, r) // i = 0,1,2,3, perfect!!
        // A rune is an int32 and a byte is a uint8. In a string, each rune
        // (a Unicode code point, what we think of as the REAL CHARACTER)
        // is encoded as a set of 1 to 4 bytes.
    }

    fmt.Println("\nValues in oneString ")
    for j := 0; j < len(oneString); j++ {
        fmt.Printf(" %d %v ", j, oneString[j]) // No, you cannot get characters if you iterate through a string this way,
        // as you are going over bytes here, not runes
    }
    fmt.Println("\nValues in twoByteArr")
    for j := 0; j < len(twoByteArr); j++ {
        fmt.Printf(" %d=%v ", j, twoByteArr[j]) // same as above
    }

    fmt.Printf("\none - %s, two %s, three %s\n", oneString, twoByteArr, thirdString)
}

There is also a somewhat more pointless demo at https://play.golang.org/p/tagRBVG8k7V, adapted from https://groups.google.com/g/golang-nuts/c/84GCvDBhpbg/m/Tt6089MPFQAJ

It shows that a "character" is encoded in 1 to at most 4 bytes, depending on the Unicode code point.
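If you just want to check the width of a particular code point, utf8.RuneLen reports how many bytes its UTF-8 encoding needs. A minimal sketch, with sample characters picked only for illustration and encodedLen being a made-up wrapper name:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen is a hypothetical wrapper around utf8.RuneLen: it returns the
// number of bytes needed to encode r in UTF-8.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample code point for each possible encoded width.
	for _, r := range []rune{'a', 'é', 'ഐ', '😀'} {
		fmt.Printf("%c U+%04X %d byte(s)\n", r, r, encodedLen(r))
	}
}
```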

answered 2021-03-07T14:58:27.117