
I exported two fields, name and header, from the database using:

SELECT name, header 
INTO OUTFILE '/var/lib/mysql-files/myfile.txt'
FIELDS TERMINATED BY '<xx>' 
LINES TERMINATED BY '\n'
FROM mytable;

One record has this header value:

{'Date': 'Fri, 19 Apr 2019 07:23:14 GMT', 'Server': 'Apache', 'Vary': 'Qualys-Scan', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains;preload', 'Set-Cookie': 'ASP.NET_SessionId=ivoa5bhet0s2ygkylmimvkie; path=/; secure; HttpOnly; SameSite=strict, SC_ANALYTICS_GLOBAL_COOKIE=12f133ea5080403692b4ce458fd1a540; expires=Thu, 19 Apr 2029 07:23:14 GMT; path=/; secure; HttpOnly; SameSite=strict, SC_ANALYTICS_SESSION_COOKIE=336B597E7A534D6393C57DF11E047484|1|ivoa5bhet0s2ygkylmimvkie; path=/; secure; HttpOnly; SameSite=strict, incap_ses_885_270026=cDp/VlO1AHgshF9F6SZIDGJ3uVwAAAAAg7DwpecyehBCyhXgoYO5GA==; path=/; Domain=.zurich.co.uk, ___utmvmykuNyVY=dlNaoEsuXSO; path=/; Max-Age=900, ___utmvaykuNyVY=nWJx01KvGT; path=/; Max-Age=900, ___utmvbykuNyVY=JZy XEtOwalQ: PtR; path=/; Max-Age=900', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Cache-Control': 'private', 'Content-Type': 'text/html; charset=utf-8', 'Keep-Alive': 'timeout=5, max=10', 'Connection': 'Keep-Alive', 'X-Iinfo': '8-3925806-3925807 NNNN CT(73 151 0) RT(1555658593583 5) q(0 0 3 0) r(6 6) U5', 'X-CDN': 'Incapsula', 'Content-Encoding': 'gzip', 'Transfer-Encoding': 'chunked'}

It is exported as:

https://z.co.uk<xx>{'Date': 'Fri, 19 Apr 2019 07:23:14 GMT', 'Server': 'Apache', 'Vary': 'Qualys-Scan', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains;preload', 'Set-Cookie': 'ASP.NET_SessionId=ivoa5bhet0s2ygkylmimvkie; path=/; secure; HttpOnly; SameSite=strict, SC_ANALYTICS_GLOBAL_COOKIE=12f133ea5080403692b4ce458fd1a540; expires=Thu, 19 Apr 2029 07:23:14 GMT; path=/; secure; HttpOnly; SameSite=strict, SC_ANALYTICS_SESSION_COOKIE=336B597E7A534D6393C57DF11E047484|1|ivoa5bhet0s2ygkylmimvkie; path=/; secure; HttpOnly; SameSite=strict, incap_ses_885_270026=cDp/VlO1AHgshF9F6SZIDGJ3uVwAAAAAg7DwpecyehBCyhXgoYO5GA==; path=/; Domain=.zurich.co.uk, ___utmvmykuNyVY=dlNaoEsuXSO; path=/; Max-Age=900, ___utmvaykuNyVY=nWJx01KvGT; path=/; Max-Age=900, ___utmvbykuNyVY=JZy

and then, on a new line (note that it starts with a tab, which is why Stack Overflow displays it as code):

XEtOwalQ: PtR; path=/; Max-Age=900', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Cache-Control': 'private', 'Content-Type': 'text/html; charset=utf-8', 'Keep-Alive': 'timeout=5, max=10', 'Connection': 'Keep-Alive', 'X-Iinfo': '8-3925806-3925807 NNNN CT(73 151 0) RT(1555658593583 5) q(0 0 3 0) r(6 6) U5', 'X-CDN': 'Incapsula', 'Content-Encoding': 'gzip', 'Transfer-Encoding': 'chunked'}

Why does this happen, and how can I avoid it?

This is a big problem for me, and it occurs in many other records (but not all of them).

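I need to read the file line by line with Python. Python treats the truncated line as two lines instead of one, so those lines no longer match the format my Python code expects, and I get an index-out-of-range error.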
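To make the failure concrete, here is a minimal sketch of how a split record can break a line-by-line parser. It assumes each line is split on the `<xx>` delimiter and the second field is then indexed; the sample strings are shortened, hypothetical stand-ins for the real export, and `split_record` is an illustrative name rather than code from the actual script.

```python
# Hypothetical, shortened stand-ins for two physical lines of the export.
physical_lines = [
    "https://z.co.uk<xx>{'Set-Cookie': '___utmvbykuNyVY=JZy",
    "\tXEtOwalQ: PtR; path=/; Max-Age=900'}",
]


def split_record(line):
    """Split one physical line into its fields on the '<xx>' delimiter."""
    return line.split("<xx>")


first = split_record(physical_lines[0])
second = split_record(physical_lines[1])

# The continuation line contains no delimiter, so it yields a single field,
# and indexing the header field (position 1) raises IndexError.
try:
    header = second[1]
except IndexError:
    header = None
```

The first physical line still splits into two fields, but its header is incomplete; the second one has no `<xx>` at all, which is where the indexing fails.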


1 Answer


Something like this (untested):

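# Read the whole export and drop the line terminators.
with open('/var/lib/mysql-files/myfile.txt') as f:
    lines = f.read().splitlines()

i = 0
lines2 = []
while i < len(lines):
    l = lines[i]
    # A line that opens the header dict but does not close it was split by a
    # newline inside the value: append the following lines until '}' shows up.
    # The bounds check avoids an IndexError if the file ends mid-record.
    if '{' in l:
        while '}' not in l and i + 1 < len(lines):
            i += 1
            l += ' ' + lines[i]
    lines2.append(l + '\n')
    i += 1

# Write the repaired records, one per line.
with open('/var/lib/mysql-files/fixed.txt', 'w') as f:
    f.writelines(lines2)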
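
The same idea can be wrapped in a function that works on a list of strings, which makes it easier to try on a small sample before touching the real file. This is only a sketch: `join_split_records` is a hypothetical name, the sample data is made up, and it assumes that every complete record contains both `{` and `}` and that neither brace occurs inside a header value.

```python
def join_split_records(lines):
    """Merge physical lines until each record's '{' has a matching '}'.

    Assumption: a complete record contains both '{' and '}', and neither
    brace occurs inside a header value.
    """
    records = []
    buffer = None  # holds a record that has been opened but not yet closed
    for line in lines:
        if buffer is not None:
            # Continuation of a split record: glue it on with a space.
            buffer += " " + line
            if "}" in line:
                records.append(buffer)
                buffer = None
        elif "{" in line and "}" not in line:
            buffer = line
        else:
            records.append(line)
    if buffer is not None:
        # The input ended in the middle of a record; keep what is there.
        records.append(buffer)
    return records


# Made-up sample: the second record is split across two physical lines.
sample = [
    "a<xx>{'k': 'v'}",
    "b<xx>{'k': 'first",
    "\tsecond'}",
]
fixed = join_split_records(sample)
```

After the repair, each element of `fixed` splits into exactly two fields on `<xx>`, so indexing the header field no longer fails.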
answered 2019-05-11T13:54:54.697