
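While trying to create a trimmed-down input to feed into my actual code, I need to produce a file that keeps only the records where a key's value "contains" a given string, matched case-insensitively. That is, I want to build a regular expression that implements "contains camfrog or tubemate or soundcloud".

Sample JSON input: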

{"appid":"537c6d4a9c4846b8bc44ebdf78ab8e2d","app_name":"TubeMate YouTube Downloader","publisher_id":"1690d6387fcc441091a2f2d73f89709d"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone","publisher_id":"085d0268a9674ce885a2f185ec895246"}
{"appid":"agltb3B1Yi1pbmNyDAsSA0FwcBih9tMUDA","app_name":"TuneIn Radio - iPad","publisher_id":"agltb3B1Yi1pbmNyEAsSB0FjY291bnQYsv-PFAw"}
{"appid":"537c6d4a9c4846b8bc44ebdf78ab8e2d","app_name":"TubeMate YouTube Downloader","publisher_id":"1690d6387fcc441091a2f2d73f89709d"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone","publisher_id":"085d0268a9674ce885a2f185ec895246"}
{"appid":"92255b8b662148e59973b8eca128adde","app_name":"SubwaySimulator3D","publisher_id":"0d78f4d244ec4309b4aa06cdfb871341"}
{"appid":"agltb3B1Yi1pbmNyDAsSA0FwcBjq_6EUDA","app_name":"TuneIn Radio","publisher_id":"agltb3B1Yi1pbmNyEAsSB0FjY291bnQYsv-PFAw"}
{"appid":"f7cc119ca9e1426c8d162d2d37c8558f","app_name":"Android Skout New","publisher_id":"agltb3B1Yi1pbmNyEAsSB0FjY291bnQY7cCnEgw"}
{"appid":"agltb3B1Yi1pbmNyDAsSA0FwcBim6MAVDA","app_name":"Draw Something Android","publisher_id":"agltb3B1Yi1pbmNyEAsSB0FjY291bnQYgYC-FQw"}

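From this JSON input, I need to filter the apps whose name is "like" Camfrog (it could be CAMFROG, camfrog, etc., so the regex must be case-insensitive). Beyond that, I also need the output for a list of app_names, for example "Camfrog", "Tubemate", "soundcloud", and so on. I looked through the jq manual at http://stedolan.github.io/jq/manual/ but could not build the expression.

This is what I tried: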

 </home/ekta/Prototype1/sample.dat jq -c '{app_name:.app_name} |
 match(["Camfrog", "ig"])'  
 map(select(.app.name like "%Camfrog%" ))

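But I get "match is not defined" and a compile error. How can I do this in jq?

Fallback: I could load the file into pandas as a DataFrame and run the regex there, but since my file contains a lot of data I do not really need, I would like to filter it quickly in jq.

Sample output after filtering the apps (I need all the keys and values from the original input):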

{"appid":"537c6d4a9c4846b8bc44ebdf78ab8e2d","app_name":"TubeMate YouTube Downloader","publisher_id":"1690d6387fcc441091a2f2d73f89709d"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone","publisher_id":"085d0268a9674ce885a2f185ec895246"}
{"appid":"537c6d4a9c4846b8bc44ebdf78ab8e2d","app_name":"TubeMate YouTube Downloader","publisher_id":"1690d6387fcc441091a2f2d73f89709d"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone","publisher_id":"085d0268a9674ce885a2f185ec895246"}

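PPS: I would much appreciate it if you could "teach me to fish" rather than just build the regex that should match.

Follow-up question:

Also, when I try to test the examples from the jq manual, for example: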

echo [{"foo": 1, "bar": 2}, {"foo": 1, "bar": 3}, {"foo": 4, "bar": 5}] | jq 'unique(.foo)'

I get: error: too many arguments to unique (expected 0 but got 1) unique(.foo) 1 compile error

whereas the jq manual gives this example:

jq 'unique(.foo)'
Input [{"foo": 1, "bar": 2}, {"foo": 1, "bar": 3}, {"foo": 4, "bar": 5}]
Output    [{"foo": 1, "bar": 2}, {"foo": 4, "bar": 5}]

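How else should I pass the input here?

The way I actually build my dictionary is </home/ekta/SamplePrototype.dat jq -c '{appid:.app.id,app_name:.app.name,publisher_id:.app.publisher_id}', but I would like to test what is in the jq manual. Can you point out what I am doing wrong here?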


1 Answer


Here is what worked for me, using our good old friend grep (and egrep):

$ <sample.dat jq -c '{appid:.appid,app_name:.app_name}' | egrep -i "camfrog|draw something"
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone"}
{"appid":"agltb3B1Yi1pbmNyDAsSA0FwcBim6MAVDA","app_name":"Draw Something Android"}
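
For comparison, the same filter can probably be done inside jq itself. The following is only a sketch, assuming jq 1.5 or later built with regex support (the "match is not defined" error in the question is consistent with an older jq that lacks the regex functions). The temporary file and its four records are a cut-down copy of the sample input, used purely for illustration. `select(...)` passes the whole object through when the condition holds, and `test(regex; "i")` does a case-insensitive regex match on `.app_name` only.

```shell
# Illustrative input: a few records copied from the sample data, one object per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"appid":"537c6d4a9c4846b8bc44ebdf78ab8e2d","app_name":"TubeMate YouTube Downloader","publisher_id":"1690d6387fcc441091a2f2d73f89709d"}
{"appid":"f8022204aaa7478a88fca1a417ddb125","app_name":"Camfrog Android Smartphone","publisher_id":"085d0268a9674ce885a2f185ec895246"}
{"appid":"agltb3B1Yi1pbmNyDAsSA0FwcBih9tMUDA","app_name":"TuneIn Radio - iPad","publisher_id":"agltb3B1Yi1pbmNyEAsSB0FjY291bnQYsv-PFAw"}
{"appid":"92255b8b662148e59973b8eca128adde","app_name":"SubwaySimulator3D","publisher_id":"0d78f4d244ec4309b4aa06cdfb871341"}
EOF

# Keep every object whose app_name contains one of the names, ignoring case.
# Assumes jq >= 1.5; older versions do not define test/match.
filtered=$(jq -c 'select(.app_name | test("camfrog|tubemate|soundcloud"; "i"))' "$sample")
printf '%s\n' "$filtered"

rm -f "$sample"
```

Unlike grepping the whole line, this only inspects `app_name`, so a match inside `appid` or `publisher_id` cannot slip through, and all original keys are kept in the output.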
answered 2014-08-23T17:40:05.647
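
On the follow-up about `unique(.foo)`: the error suggests the installed jq only knows the zero-argument `unique`, while the online manual described a newer version. A minimal sketch, assuming jq 1.5 or later, where the one-argument form is spelled `unique_by`. Note that the JSON is wrapped in single quotes so the shell does not strip the double quotes before jq sees them.

```shell
# Input from the manual example, single-quoted so the shell leaves the JSON intact.
input='[{"foo": 1, "bar": 2}, {"foo": 1, "bar": 3}, {"foo": 4, "bar": 5}]'

# unique_by(.foo) keeps one element per distinct value of .foo (assumes jq >= 1.5).
deduped=$(printf '%s' "$input" | jq -c 'unique_by(.foo)')
printf '%s\n' "$deduped"
```

Running `jq --version` is a quick way to check which release is installed before trying examples from the manual.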