ruby - How to remove Google tracking parameters (UTM) from a URL?

Question

I have a bunch of URLs that I want to clean up. They all contain UTM parameters, which are unnecessary, or even harmful, in this case. Example:

http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29

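All of the possible parameters start with utm_. How can I easily remove them with a Ruby script/construct without destroying other, potentially "good", URL parameters?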

score 12 · Accepted Answer

You can apply a regular expression to the URLs to clean them up. Something like this should do the trick:

url = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'
url.gsub(/&?utm_.+?(&|$)/, '') # => "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
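
One caveat with the pattern above: when a utm_ parameter sits between two ordinary parameters, the separators on both sides are consumed and the neighbours end up glued together (for example, ?a=1&utm_source=x&b=2 becomes ?a=1b=2). A minimal sketch of a stricter variant follows. It uses a lookbehind so the separator in front of the parameter is kept, then tidies any dangling ? or &. The method name strip_utm_regex is hypothetical and not part of the original answer.

```ruby
# Hypothetical helper, not from the original answer.
def strip_utm_regex(url)
  url
    .gsub(/(?<=[?&])utm_[^&#]*&?/, '') # drop each utm_ pair plus the separator after it
    .sub(/[?&](?=#|\z)/, '')           # remove a ? or & left dangling at the end of the query
end

puts strip_utm_regex('http://example.com/?a=1&utm_source=x&b=2')
puts strip_utm_regex('http://example.com/?utm_source=x&utm_medium=y')
```

This is still plain string matching, so it only looks at the text after a ? or & and does not validate the URL itself.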

score 11 · Accepted Answer

This uses the URI library to take the query string apart and modify it (no regular expressions):

require 'uri'
str = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'

uri = URI.parse(str)
# Split the query into [key, value] pairs and drop every key that starts with utm_
clean_key_vals = URI.decode_www_form(uri.query).reject { |k, _| k.start_with?('utm_') }
# Re-encode the remaining pairs and assign them back to the URI
uri.query = URI.encode_www_form(clean_key_vals)
p uri.to_s #=> "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
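
The snippet above assumes the URL has a query string: uri.query is nil otherwise, and URI.decode_www_form raises when given nil. It can also leave a trailing ? when every parameter is a utm_ one, because the query then becomes an empty string. A minimal sketch that wraps the same approach in a method and covers both cases follows. The method name strip_utm is hypothetical.

```ruby
require 'uri'

# Hypothetical helper, not from the original answer.
def strip_utm(url)
  uri = URI.parse(url)
  return url if uri.query.nil? # no query string, nothing to clean

  kept = URI.decode_www_form(uri.query).reject { |key, _| key.start_with?('utm_') }
  # Assigning nil removes the query entirely, which avoids a dangling "?"
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_utm('http://example.com/post?utm_source=feed&id=7')
puts strip_utm('http://example.com/post?utm_source=feed')
puts strip_utm('http://example.com/post')
```

Because the remaining parameters are decoded and then re-encoded, their escaping may be normalised (for example %20 may come back as +), so the kept part of the query is not guaranteed to be byte-for-byte identical to the input.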
