
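I have a JSON-formatted file of cities and provinces. I want to insert it into my database table, but the data is structured as JSON. I need to strip out the numbers and add the structure needed to insert it into my table correctly.

Here is a sample of the text in my file.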

"246":"Bangued","278":"Boliney","287":"Bucay","309":"Bucloc","314":"Daguioman","319":"Danglas","327":"Dolores","343":"LaPaz","356":"Lacub","363":"Lagangilang","381":"Lagayan","387":"Langiden","394":"Licuan~Baay","406":"Luba","415":"Malibcong","428":"Manabo","440":"Penarrubia","450":"Pidigan","466":"Pilar","486":"Sallapadan","496":"San Isidro","506":"San Juan","526":"San Quintin","533":"Tayum","545":"Tineg","556":"Tubo","567":"Villaviciosa"

I want to turn it into:

(NULL, "1", "Bangued"),
(NULL, "1", "Boliney"),
(NULL, "1", "Bucay"),
(NULL, "1", "Bucloc"),
(NULL, "1", "Daguioman"),
(NULL, "1", "Danglas"),
(NULL, "1", "Dolores"),
(NULL, "1", "La Paz"),
(NULL, "1", "Lacub"),
(NULL, "1", "Lagangilang"),
(NULL, "1", "Lagayan"),
(NULL, "1", "Langiden"),
(NULL, "1", "Licuan~Baay"),
(NULL, "1", "Luba"),
(NULL, "1", "Malibcong"),
(NULL, "1", "Manabo"),
(NULL, "1", "Penarrubia"),
(NULL, "1", "Pidigan"),
(NULL, "1", "Pilar"),
(NULL, "1", "Sallapadan"),
(NULL, "1", "San Isidro"),
(NULL, "1", "San Juan"),
(NULL, "1", "San Quintin"),
(NULL, "1", "Tayum"),
(NULL, "1", "Tineg"),
(NULL, "1", "Tubo"),
(NULL, "1", "Villaviciosa"),

Can you tell me how to do this with a regular expression?


3 Answers

sed 's/"[0-9]*":\("[a-zA-Z ~]*"\),/(NULL, "1", \1),\
/g'

That is a backslash followed by a newline.
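As a usage sketch, the same idea can be wrapped in a shell function and run on a short sample. This variant makes two assumptions that are not part of the command above: it accepts any name that contains no double quote (rather than only letters, spaces, and ~), and it treats the trailing comma as optional so that the last entry is converted too. The function name to_tuples is hypothetical.

```shell
# Hypothetical helper: turn "id":"name" pairs on stdin into SQL value tuples.
to_tuples() {
  # [^"]*    : any name that contains no double quote (assumption)
  # ,\{0,1\} : optional trailing comma, so the final pair is matched as well
  # The replacement ends with backslash-newline, which inserts a line break,
  # so the second line of the sed script must start in column 0.
  sed 's/"[0-9]*":\("[^"]*"\),\{0,1\}/(NULL, "1", \1),\
/g'
}

printf '%s\n' '"246":"Bangued","343":"LaPaz","496":"San Isidro"' | to_tuples
```

Because a line break is inserted after every tuple, including the last one, each input line leaves one empty line at the end of its output.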

answered 2013-05-25T17:42:54.263

This combination of tr and sed works for your sample data:

tr ',' '\n' < yourfile.txt | tr ':' ',' | sed 's/^/\(NULL, /g;s/$/\),/g;s/"[0-9]*"/"1" /g'

There is a catch, though: if any of the strings contains a : or a , the output will be garbled.
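One way around that catch is to extract whole "id":"name" pairs instead of splitting on , and :. The sketch below assumes that names never contain a double quote and that your grep supports -o (GNU and BSD grep do, but it is not part of POSIX). The function name pairs_to_tuples and the second sample entry are made up for illustration.

```shell
# Hypothetical helper: extract each pair first, then rewrite it as a tuple.
pairs_to_tuples() {
  # grep -o prints every "digits":"name" match on a line of its own,
  # so commas and colons inside a name are left alone
  grep -o '"[0-9]*":"[^"]*"' |
    # swap the leading "digits": for the tuple prefix, then close the tuple
    sed 's/^"[0-9]*":/(NULL, "1", /; s/$/),/'
}

printf '%s\n' '"1":"San Juan","2":"Made, up: name"' | pairs_to_tuples
```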

answered 2013-05-25T17:43:48.410

Since you tagged this sublimetext2 and mentioned that you are using that editor for this, here is a Sublime answer.

You can do this in under a minute without using a regular expression. Just use the editor's built-in features.

Initial selection

So you start with one long line of entries that looks like this:

"246":"Bangued","278":"Boliney","287":"Bucay","309":"Bucloc","314":"Daguioman","319":"Danglas","327":"Dolores","343":"LaPaz","356":"Lacub","363":"Lagangilang","381":"Lagayan","387":"Langiden","394":"Licuan~Baay","406":"Luba","415":"Malibcong","428":"Manabo","440":"Penarrubia","450":"Pidigan","466":"Pilar","486":"Sallapadan","496":"San Isidro","506":"San Juan","526":"San Quintin","533":"Tayum","545":"Tineg","556":"Tubo","567":"Villaviciosa"

Since everything here is formatted the same way, you can use that to your advantage. Highlight one colon. You should get something like this (I selected the first colon, but in this case it doesn't matter which one you start with):

"246":"Bangued","278":"Boliney","287":"Bucay","309":"Bucloc","314":"Daguioman","319":"Danglas","327":"Dolores","343":"LaPaz","356":"Lacub","363":"Lagangilang","381":"Lagayan","387":"Langiden","394":"Licuan~Baay","406":"Luba","415":"Malibcong","428":"Manabo","440":"Penarrubia","450":"Pidigan","466":"Pilar","486":"Sallapadan","496":"San Isidro","506":"San Juan","526":"San Quintin","533":"Tayum","545":"Tineg","556":"Tubo","567":"Villaviciosa"

Now press Alt+F3. This will select every colon there. So you end up with something like this:

"246":"Bangued","278":"Boliney","287":"Bucay","309":"Bucloc","314":"Daguioman","319":"Danglas","327":"Dolores","343":"LaPaz","356":"Lacub","363":"Lagangilang","381":"Lagayan","387":"Langiden","394":"Licuan~Baay","406":"Luba","415":"Malibcong","428":"Manabo","440":"Penarrubia","450":"Pidigan","466":"Pilar","486":"Sallapadan","496":"San Isidro","506":"San Juan","526":"San Quintin","533":"Tayum","545":"Tineg","556":"Tubo","567":"Villaviciosa"

Out with the old

Now you want to take all of those numbers and turn them into (NULL, "1", so press the right arrow → in order to get to the end of each selection.

Then shift-select each number. You can do this in a couple of ways.

  • Ctrl+Shift+← will go by word boundaries
  • Shift+← will go by character boundaries

You may have to play with it to find what works best for you. In any case, you want to end up with each number, together with its quotes and the colon after it, selected:

"246":"Bangued","278":"Boliney","287":"Bucay","309":"Bucloc","314":"Daguioman","319":"Danglas","327":"Dolores","343":"LaPaz","356":"Lacub","363":"Lagangilang","381":"Lagayan","387":"Langiden","394":"Licuan~Baay","406":"Luba","415":"Malibcong","428":"Manabo","440":"Penarrubia","450":"Pidigan","466":"Pilar","486":"Sallapadan","496":"San Isidro","506":"San Juan","526":"San Quintin","533":"Tayum","545":"Tineg","556":"Tubo","567":"Villaviciosa"

In with the new

Now just start typing (NULL, "1", and it will replace every selected instance with that.

Note: your editor may be set to insert the closing parenthesis automatically; delete it for now.

So you end up with:

(NULL, "1", "Bangued",(NULL, "1", "Boliney",(NULL, "1", "Bucay",(NULL, "1", "Bucloc",(NULL, "1", "Daguioman",(NULL, "1", "Danglas",(NULL, "1", "Dolores",(NULL, "1", "LaPaz",(NULL, "1", "Lacub",(NULL, "1", "Lagangilang",(NULL, "1", "Lagayan",(NULL, "1", "Langiden",(NULL, "1", "Licuan~Baay",(NULL, "1", "Luba",(NULL, "1", "Malibcong",(NULL, "1", "Manabo",(NULL, "1", "Penarrubia",(NULL, "1", "Pidigan",(NULL, "1", "Pilar",(NULL, "1", "Sallapadan",(NULL, "1", "San Isidro",(NULL, "1", "San Juan",(NULL, "1", "San Quintin",(NULL, "1", "Tayum",(NULL, "1", "Tineg",(NULL, "1", "Tubo",(NULL, "1", "Villaviciosa"

Closing the parentheses

Now, since some of your provinces contain word separators, tildes, and spaces, you have to handle the rest from the other side. Highlight one ,( and press Alt+F3 again to select all of them (I chose ,( rather than just ( in order to exclude the first ().

(NULL, "1", "Bangued",(NULL, "1", "Boliney",(NULL, "1", "Bucay",(NULL, "1", "Bucloc",(NULL, "1", "Daguioman",(NULL, "1", "Danglas",(NULL, "1", "Dolores",(NULL, "1", "LaPaz",(NULL, "1", "Lacub",(NULL, "1", "Lagangilang",(NULL, "1", "Lagayan",(NULL, "1", "Langiden",(NULL, "1", "Licuan~Baay",(NULL, "1", "Luba",(NULL, "1", "Malibcong",(NULL, "1", "Manabo",(NULL, "1", "Penarrubia",(NULL, "1", "Pidigan",(NULL, "1", "Pilar",(NULL, "1", "Sallapadan",(NULL, "1", "San Isidro",(NULL, "1", "San Juan",(NULL, "1", "San Quintin",(NULL, "1", "Tayum",(NULL, "1", "Tineg",(NULL, "1", "Tubo",(NULL, "1", "Villaviciosa"

Press ← in order to go to the beginning of each selection. Now close your parentheses by typing ). So you end up with this:

(NULL, "1", "Bangued"),(NULL, "1", "Boliney"),(NULL, "1", "Bucay"),(NULL, "1", "Bucloc"),(NULL, "1", "Daguioman"),(NULL, "1", "Danglas"),(NULL, "1", "Dolores"),(NULL, "1", "LaPaz"),(NULL, "1", "Lacub"),(NULL, "1", "Lagangilang"),(NULL, "1", "Lagayan"),(NULL, "1", "Langiden"),(NULL, "1", "Licuan~Baay"),(NULL, "1", "Luba"),(NULL, "1", "Malibcong"),(NULL, "1", "Manabo"),(NULL, "1", "Penarrubia"),(NULL, "1", "Pidigan"),(NULL, "1", "Pilar"),(NULL, "1", "Sallapadan"),(NULL, "1", "San Isidro"),(NULL, "1", "San Juan"),(NULL, "1", "San Quintin"),(NULL, "1", "Tayum"),(NULL, "1", "Tineg"),(NULL, "1", "Tubo"),(NULL, "1", "Villaviciosa"

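Inserting newlines

The cursors are still at each closing parenthesis, but you want the newline after the comma. Move one character to the right with → (you want to get beyond your comma there) and press Enter or Return to enter a newline. Your lines now look like this: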

(NULL, "1", "Bangued"),
(NULL, "1", "Boliney"),
(NULL, "1", "Bucay"),
(NULL, "1", "Bucloc"),
(NULL, "1", "Daguioman"),
(NULL, "1", "Danglas"),
(NULL, "1", "Dolores"),
(NULL, "1", "LaPaz"),
(NULL, "1", "Lacub"),
(NULL, "1", "Lagangilang"),
(NULL, "1", "Lagayan"),
(NULL, "1", "Langiden"),
(NULL, "1", "Licuan~Baay"),
(NULL, "1", "Luba"),
(NULL, "1", "Malibcong"),
(NULL, "1", "Manabo"),
(NULL, "1", "Penarrubia"),
(NULL, "1", "Pidigan"),
(NULL, "1", "Pilar"),
(NULL, "1", "Sallapadan"),
(NULL, "1", "San Isidro"),
(NULL, "1", "San Juan"),
(NULL, "1", "San Quintin"),
(NULL, "1", "Tayum"),
(NULL, "1", "Tineg"),
(NULL, "1", "Tubo"),
(NULL, "1", "Villaviciosa"

The last step is to finish off the final line. It never got its closing parenthesis and comma, so go to the last line and add them.
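For readers who want to check the result without the editor at hand, the multi-selection steps above roughly correspond to three substitutions. This is only a sketch of the equivalent text transformation, not of what the editor does internally. It uses a pattern to find the numbers where the walkthrough used selections, and the function name editor_steps is hypothetical.

```shell
# Hypothetical helper mirroring the editor walkthrough as three substitutions.
editor_steps() {
  # 1. replace every "digits": with the tuple prefix
  # 2. turn every ,( into ),<newline>( to close each tuple and break the line
  # 3. append ), to the end of the text to finish the last tuple
  sed -e 's/"[0-9]*":/(NULL, "1", /g' \
      -e 's/,(/),\
(/g' \
      -e 's/$/),/'
}

printf '%s\n' '"246":"Bangued","394":"Licuan~Baay","506":"San Juan"' | editor_steps
```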

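This took about half an hour to write up, but it really takes less than a minute to do. It is hard to explain in text alone, but the key point is that you can solve this without asking someone else how to write the regex, and get it done faster. Multiple selection takes some getting used to, but it is ultimately worth it for the time you save.

answered 2013-05-28T16:57:46.473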