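I have a string in Lua and want to iterate over its individual characters. But none of the code I've tried works, and the official manual only shows how to find and replace substrings :(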
str = "abcd"
for char in str do -- error
print( char )
end
for i = 1, str:len() do
print( str[ i ] ) -- nil
end
In Lua 5.1, you can iterate over the characters of a string in a couple of ways.
The basic loop would be:
for i = 1, #str do
local c = str:sub(i,i)
-- do something with c
end
But it may be more efficient to use a pattern with string.gmatch() to get an iterator over the characters:
for c in str:gmatch"." do
-- do something with c
end
Or even use string.gsub() to call a function for each character:
str:gsub(".", function(c)
-- do something with c
end)
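In all of the above, I've taken advantage of the fact that the string module is set as the metatable for all string values, so its functions can be called as members using the : notation. I've also used the # operator (new to 5.1, IIRC) to get the string length.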
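To make the metatable point concrete, here is a small sketch. It assumes a stock Lua interpreter in which the string metatable has not been altered; it also shows why `str[i]` from the question yields nil rather than a character.

```lua
local s = "abcd"

-- Method-call syntax and the plain library call invoke the same function.
assert(s:sub(2, 2) == string.sub(s, 2, 2))
assert(#s == s:len())

-- Indexing a string value is routed through its metatable to the `string`
-- table, so s[1] looks up string[1], which does not exist.
local mt = getmetatable(s)
print(mt.__index == string)  --> true
print(s[1])                  --> nil
print(s.len == string.len)   --> true
```

In other words, `s[1]` is not an error, it is simply a failed lookup in the string library table.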
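The best answer for your application depends on a lot of factors, and benchmarks are your friend if performance is going to matter.
You might want to evaluate why you need to iterate over the characters, and look at one of the regular expression modules that have been bound to Lua, or for a modern approach look into Roberto's lpeg module, which implements Parsing Expression Grammars for Lua.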
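Depending on the task at hand, it might be easier to use string.byte. It is also the fastest way, because it avoids creating new substrings, which is fairly expensive in Lua: each new string has to be hashed and checked against the strings that are already known. You can pre-calculate the codes of the symbols you are looking for with the same string.byte to keep the code readable and portable.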
local str = "ab/cd/ef"
local target = string.byte("/")
for idx = 1, #str do
if str:byte(idx) == target then
print("Target found at:", idx)
end
end
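The same idea extends to several symbols at once. The following is only a sketch: `find_symbols` is a hypothetical helper name, not part of the standard library, and it assumes single-byte symbols.

```lua
-- Hypothetical helper: returns the positions of every byte in `s` that
-- matches one of the single-byte characters listed in `symbols`.
local function find_symbols(s, symbols)
  -- Pre-calculate the byte codes once, as a lookup set.
  local wanted = {}
  for i = 1, #symbols do
    wanted[symbols:byte(i)] = true
  end

  local positions = {}
  for i = 1, #s do
    if wanted[s:byte(i)] then
      positions[#positions + 1] = i
    end
  end
  return positions
end

local hits = find_symbols("ab/cd;ef", "/;")
print(table.concat(hits, ","))  --> 3,6
```

Comparing numeric codes against a set keeps the inner loop free of substring creation, which is the point made above.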
If you're using Lua 5, try:
for i = 1, string.len(str) do
print( string.sub(str, i, i) )
end
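There are already a lot of good approaches in the other answers. If speed is what you are primarily looking for, you should definitely consider doing the job through Lua's C API, which is many times faster than raw Lua code. When working with preloaded chunks (e.g. via the load function), the difference is not as big, but it is still considerable.
As for pure Lua solutions, let me share this small benchmark I made. It covers every answer provided so far and adds a few optimizations. Still, the basic thing to consider is:
How many times will you need to iterate over the characters in the string?
Here is the full code: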
-- Setup locals
local str = "Hello World!"
local attempts = 5000000
local reuses = 10 -- For the second part of benchmark: Table values are reused 10 times. Change this according to your needs.
local x, c, elapsed, tbl
-- "Localize" funcs to minimize lookup overhead
local stringbyte, stringchar, stringsub, stringgsub, stringgmatch = string.byte, string.char, string.sub, string.gsub, string.gmatch
print("-----------------------")
print("Raw speed:")
print("-----------------------")
-- Version 1 - string.sub in loop
x = os.clock()
for j = 1, attempts do
for i = 1, #str do
c = stringsub(str, i, i)
end
end
elapsed = os.clock() - x
print(string.format("V1: elapsed time: %.3f", elapsed))
-- Version 2 - string.gmatch loop
x = os.clock()
for j = 1, attempts do
for c in stringgmatch(str, ".") do end
end
elapsed = os.clock() - x
print(string.format("V2: elapsed time: %.3f", elapsed))
-- Version 3 - string.gsub callback
x = os.clock()
for j = 1, attempts do
stringgsub(str, ".", function(c) end)
end
elapsed = os.clock() - x
print(string.format("V3: elapsed time: %.3f", elapsed))
-- For version 4
local str2table = function(str)
local ret = {}
for i = 1, #str do
ret[i] = stringsub(str, i, i) -- Note: This is a lot faster than using table.insert
end
return ret
end
-- Version 4 - function str2table
x = os.clock()
for j = 1, attempts do
tbl = str2table(str)
for i = 1, #tbl do -- Note: This type of loop is a lot faster than "pairs" loop.
c = tbl[i]
end
end
elapsed = os.clock() - x
print(string.format("V4: elapsed time: %.3f", elapsed))
-- Version 5 - string.byte
x = os.clock()
for j = 1, attempts do
tbl = {stringbyte(str, 1, #str)} -- Note: This is about 15% faster than calling string.byte for every character.
for i = 1, #tbl do
c = tbl[i] -- Note: produces char codes instead of chars.
end
end
elapsed = os.clock() - x
print(string.format("V5: elapsed time: %.3f", elapsed))
-- Version 5b - string.byte + conversion back to chars
x = os.clock()
for j = 1, attempts do
tbl = {stringbyte(str, 1, #str)} -- Note: This is about 15% faster than calling string.byte for every character.
for i = 1, #tbl do
c = stringchar(tbl[i])
end
end
elapsed = os.clock() - x
print(string.format("V5b: elapsed time: %.3f", elapsed))
print("-----------------------")
print("Creating cache table ("..reuses.." reuses):")
print("-----------------------")
-- Version 1 - string.sub in loop
x = os.clock()
for k = 1, attempts do
tbl = {}
for i = 1, #str do
tbl[i] = stringsub(str, i, i) -- Note: This is a lot faster than using table.insert
end
for j = 1, reuses do
for i = 1, #tbl do
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V1: elapsed time: %.3f", elapsed))
-- Version 2 - string.gmatch loop
x = os.clock()
for k = 1, attempts do
tbl = {}
local tblc = 1 -- Note: This is faster than table.insert
for c in stringgmatch(str, ".") do
tbl[tblc] = c
tblc = tblc + 1
end
for j = 1, reuses do
for i = 1, #tbl do
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V2: elapsed time: %.3f", elapsed))
-- Version 3 - string.gsub callback
x = os.clock()
for k = 1, attempts do
tbl = {}
local tblc = 1 -- Note: This is faster than table.insert
stringgsub(str, ".", function(c)
tbl[tblc] = c
tblc = tblc + 1
end)
for j = 1, reuses do
for i = 1, #tbl do
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V3: elapsed time: %.3f", elapsed))
-- Version 4 - str2table func before loop
x = os.clock()
for k = 1, attempts do
tbl = str2table(str)
for j = 1, reuses do
for i = 1, #tbl do -- Note: This type of loop is a lot faster than "pairs" loop.
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V4: elapsed time: %.3f", elapsed))
-- Version 5 - string.byte to create table
x = os.clock()
for k = 1, attempts do
tbl = {stringbyte(str,1,#str)}
for j = 1, reuses do
for i = 1, #tbl do
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V5: elapsed time: %.3f", elapsed))
-- Version 5b - string.byte to create table + string.char loop to convert bytes to chars
x = os.clock()
for k = 1, attempts do
tbl = {stringbyte(str, 1, #str)}
for i = 1, #tbl do
tbl[i] = stringchar(tbl[i])
end
for j = 1, reuses do
for i = 1, #tbl do
c = tbl[i]
end
end
end
elapsed = os.clock() - x
print(string.format("V5b: elapsed time: %.3f", elapsed))
Example output (Lua 5.3.4, Windows):
-----------------------
Raw speed:
-----------------------
V1: elapsed time: 3.713
V2: elapsed time: 5.089
V3: elapsed time: 5.222
V4: elapsed time: 4.066
V5: elapsed time: 2.627
V5b: elapsed time: 3.627
-----------------------
Creating cache table (10 reuses):
-----------------------
V1: elapsed time: 20.381
V2: elapsed time: 23.913
V3: elapsed time: 25.221
V4: elapsed time: 20.551
V5: elapsed time: 13.473
V5b: elapsed time: 18.046
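Result:
In my case, string.byte and string.sub were the fastest in terms of raw speed. When using a cache table and reusing it 10 times per loop, the string.byte version was the fastest even when converting the char codes back to chars (which isn't always necessary and depends on the usage).
As you have probably noticed, I've made some assumptions based on my previous benchmarks and applied them to the code: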
tbl[idx] = value is faster than table.insert(tbl, value).
for i = 1, #tbl is faster than for k, v in pairs(tbl).
Hope it helps.
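If you want to check the first assumption on your own machine, a minimal timing sketch could look like this. The iteration count `N` and the helper names are arbitrary choices made for this illustration, and the timings will vary by Lua version and platform.

```lua
local N = 100000  -- arbitrary iteration count chosen for this sketch

-- Fill a table by direct index assignment.
local function fill_by_index()
  local t = {}
  for i = 1, N do t[i] = i end
  return t
end

-- Fill a table with table.insert.
local function fill_by_insert()
  local t = {}
  for i = 1, N do table.insert(t, i) end
  return t
end

-- Run fn once and return the elapsed CPU time plus fn's result.
local function timed(fn)
  local start = os.clock()
  local result = fn()
  return os.clock() - start, result
end

local t_index, a = timed(fill_by_index)
local t_insert, b = timed(fill_by_insert)
print(string.format("index:  %.4f s", t_index))
print(string.format("insert: %.4f s", t_insert))
```

Both functions build the same table, so any difference in the printed times comes from the way the elements are appended.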
Iteratively construct a string and return this string as a table with load()...
itab=function(char)
local result
for i=1,#char do
if i==1 then
result=string.format('%s','{')
end
result=result..string.format('\'%s\'',char:sub(i,i))
if i~=#char then
result=result..string.format('%s',',')
end
if i==#char then
result=result..string.format('%s','}')
end
end
return load('return '..result)()
end
dump=function(dump)
for key,value in pairs(dump) do
io.write(string.format("%s=%s=%s\n",key,type(value),value))
end
end
res=itab('KOYAANISQATSI')
dump(res)
This outputs...
1=string=K
2=string=O
3=string=Y
4=string=A
5=string=A
6=string=N
7=string=I
8=string=S
9=string=Q
10=string=A
11=string=T
12=string=S
13=string=I
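One caveat with building source text this way: a character such as a single quote or a backslash would break the generated code. A possible variation, sketched here under the assumption that `string.format("%q", ...)` is acceptable for quoting, is shown below. The name `itab_quoted` is made up for this example, and `loadstring or load` is used so the sketch runs on both Lua 5.1 and later versions.

```lua
-- Hypothetical variant of itab that quotes each character with %q so that
-- quotes and backslashes survive the round trip through load().
local function itab_quoted(s)
  local parts = {}
  for i = 1, #s do
    parts[i] = string.format("%q", s:sub(i, i))
  end
  -- Lua 5.1 compiles strings with loadstring; 5.2+ accepts strings in load.
  local loader = loadstring or load
  return loader("return {" .. table.concat(parts, ",") .. "}")()
end

local t = itab_quoted("it's")
print(#t)    --> 4
print(t[3])  --> '
```

An empty input also works here, because the generated chunk is then simply `return {}`.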
Everyone here suggests a less optimal method.
This would be the best:
function chars(str)
local strc = {} -- local, so the helper does not leak a global
for i = 1, #str do
table.insert(strc, string.sub(str, i, i))
end
return strc
end
str = "Hello world!"
char = chars(str)
print("Char 2: "..char[2]) -- prints the char 'e'
print("-------------------\n")
for i = 1, #str do -- testing printing all the chars
if (char[i] == " ") then
print("Char "..i..": [[space]]")
else
print("Char "..i..": "..char[i])
end
end