I have always used rawurlencode to store user-entered data in my MySQL databases. The main reason is that I find it makes storing foreign characters very simple. I then use rawurldecode to retrieve and display the data.

I read somewhere that rawurlencode was not meant for this purpose. Are there any disadvantages to what I'm doing?


So let's say I have a German address with many characters like umlauts. What is the simplest way to store it in a MySQL database with no risk of it coming out wrong, while keeping it searchable by a search script? So far rawurlencode has been excellent for our system. Perhaps the practice could be improved by encoding only foreign letters and not common characters like spaces, which I totally agree is a waste of space.

3 Answers

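Of course there are.

Let's start with the practical side: for a large class of characters you are spending 3 bytes of storage for each byte of data. The description of rawurlencode (and of course the RFC) says that these characters are

all non-alphanumeric characters except -_.~

This means there is a grand total of 26 + 26 + 10 (alphanumerics) + 4 (special exceptions) = 66 characters for which you do not waste space.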
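To put a number on that, here is a minimal sketch in Python rather than PHP. It assumes that urllib.parse.quote with safe="" behaves like rawurlencode (on Python 3.7+ both leave only letters, digits and -_.~ unescaped); the sample address and the helper name raw_url_encode are made up for illustration.

```python
from urllib.parse import quote


def raw_url_encode(text: str) -> str:
    # Assumed stand-in for PHP's rawurlencode: safe="" also escapes "/".
    return quote(text, safe="")


# "Königstraße 5", written with escapes so the source file encoding cannot matter.
address = "K\u00f6nigstra\u00dfe 5"

encoded = raw_url_encode(address)

raw_bytes = len(address.encode("utf-8"))  # bytes needed to store the text as UTF-8
encoded_bytes = len(encoded)              # the encoded form is pure ASCII, 1 byte per character

print(encoded)
print(raw_bytes, "bytes as UTF-8 vs", encoded_bytes, "bytes URL-encoded")
```

Each umlaut takes 2 bytes in UTF-8 but 6 once it is percent-encoded, and even the plain space grows from 1 byte to 3.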

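Then there is also the logical disadvantage: you are not storing the data itself, but a representation of the data tailored to URLs. Unless the data is itself a URL, you should not do this.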

answered 2012-04-20T11:03:30.610

  • I am using a fork to eat soup.
  • I am using banknotes to light the coals for a BBQ.
  • I am using a kettle to boil eggs.
  • I am using a microscope to hammer in nails.

Are there any disadvantages to what I'm doing?

YES

You are using a tool for something other than its purpose. That is always a disadvantage.

A sane human being always uses the tool that is intended for the job, not some randomly picked one, especially when there is no shortage of the right tools.

URL encoding is not intended to be used with a database, as one can tell from the name. That alone is reason enough for a sane developer. Take a look around and find the proper tool.

There is a thing called "common sense", widely used in everyday life but for some reason always absent in the PHP world.
Common sense can warn us: if we are using the wrong tool, it may spoil the work. Sooner or later it will spoil it. There is no need to ask for specific details; it is a general rule. We learn this rule at about the age of 5.

Why not use it while playing with web things too?

Why not ask yourself a question:

What's wrong with storing foreign characters at all?

urlencode makes storing foreign characters very simple

Did you encounter any hardships without urlencode?

Although I feel that common sense should be enough to answer the question, people always look for the "omen", the proof. Here you are:

A database's job is not limited to just storing and retrieving data. A plain text file can handle such a primitive task as well.
Data manipulation is what we use databases for.
The most widely used operations are sorting and filtering.

  • A database is intelligent enough to sort and filter data case-insensitively, which is a very handy feature. But of course that works only if the characters are stored as is, not as some random codes.
  • Sorting text may also use an ordering other than plain binary order in the character table. Some umlaut characters sit in other parts of the table, but the database collation will put them in the right place. Again, that works only if the characters are stored as is, not as some random codes.
  • Sometimes we have to manipulate data that is already stored in the database. Say, cut a piece from a string and compare it with an entered value. How is that supposed to be done with urlencoded data?
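
A rough illustration of the first and third points, sketched in Python instead of PHP and MySQL. It assumes urllib.parse.quote with safe="" as a stand-in for rawurlencode and str.casefold() as a stand-in for a case-insensitive collation; the sample value is invented.

```python
from urllib.parse import quote, unquote

stored_plain = "\u00d6sterreich"               # "Österreich", stored as is
stored_encoded = quote(stored_plain, safe="")  # the same value, stored URL-encoded
search = "\u00f6sterreich"                     # the user types it in lowercase

# Case-insensitive comparison works on the real characters...
plain_match = stored_plain.casefold() == search.casefold()

# ...but not on the escapes: upper- and lowercase umlauts get different hex codes.
encoded_match = stored_encoded.casefold() == quote(search, safe="").casefold()

# Cutting the first three characters: real text vs. half of an escaped character.
prefix_plain = stored_plain[:3]
prefix_encoded = stored_encoded[:3]

print(plain_match, encoded_match)
print(repr(prefix_plain), repr(prefix_encoded), repr(unquote(prefix_encoded)))
```

The substring of the encoded value cuts through the middle of a multi-byte character, so it cannot even be decoded back to valid text.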
answered 2012-04-20T11:28:14.873

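Disadvantages I can think of:

  • Wasted disk space.
  • Wasted CPU cycles encoding and decoding on every read and every write.
  • Extra complexity (you cannot even inspect the data with a MySQL client).
  • Full-text search cannot be used.
  • URL encoding is not necessarily unique (there are at least two RFCs). It may not lead to data loss, but it can lead to duplicate data (e.g. a unique index in which two rows actually contain the same data).
  • You may accidentally encode non-string data such as dates: 2012-04-20%2013%3A23%3A00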
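
The uniqueness point can be sketched in Python (used here instead of PHP; urllib.parse.quote and quote_plus are assumed to be rough counterparts of rawurlencode and urlencode, and the sample value is invented):

```python
from urllib.parse import quote, quote_plus, unquote_plus

city = "M\u00fcnchen Hbf"  # "München Hbf"

rfc3986_form = quote(city, safe="")        # space becomes %20
plus_form = quote_plus(city)               # space becomes +
lowercase_hex_form = "M%c3%bcnchen%20Hbf"  # hex digits in escapes are case-insensitive

# Three different strings, so a unique index would accept all of them...
encoded_variants = {rfc3986_form, plus_form, lowercase_hex_form}

# ...yet they all decode to one and the same value.
decoded_values = {unquote_plus(v) for v in encoded_variants}

print(sorted(encoded_variants))
print(decoded_values)
```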

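But the main consideration is that this technique is completely arbitrary and unnecessary, since MySQL has no problem storing the complete Unicode catalogue. You could just as well decide to swap e's and o's in all strings: Holle, werdl! Your application would run fine, but it would not provide any added value.

Update: As Your Common Sense points out, a basic SQL clause like ORDER BY is no longer usable. It is not that international characters will be ignored; you will basically get an arbitrary sort order based on the ASCII codes of % and the hex digits. If you cannot run SELECT * FROM city ORDER BY city_name reliably, you have rendered your database useless.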
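
A small sketch of that sort-order effect, in Python rather than SQL. It assumes urllib.parse.quote with safe="" as a stand-in for rawurlencode and plain byte-wise sorting as a stand-in for ORDER BY on the encoded column; the city list is invented.

```python
from urllib.parse import quote, unquote

cities = ["Kassel", "Kyoto", "K\u00f6ln"]  # the last one is "Köln"

# What the column would contain if every value went through URL encoding.
encoded_column = [quote(name, safe="") for name in cities]

# Sorting the stored strings: "%" (0x25) sorts before every letter.
sorted_encoded = sorted(encoded_column)
decoded_order = [unquote(value) for value in sorted_encoded]

print(sorted_encoded)
print(decoded_order)
```

Köln jumps to the front only because its second stored character is a percent sign; a German collation applied to the real characters would be expected to place it between Kassel and Kyoto.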

answered 2012-04-20T11:19:53.910