问题标签 [non-breaking-characters]

For questions regarding programming in ECMAScript (JavaScript/JS) and its various dialects/implementations (excluding ActionScript). Note JavaScript is NOT the same as Java! Please include all relevant tags on your question; e.g., [node.js], [jquery], [json], [reactjs], [angular], [ember.js], [vue.js], [typescript], [svelte], etc.

0 投票
1 回答
50 浏览

java - 使用节点访问者时,如何获得两个节点之间的不间断空格?

我尝试解析以下 HTML 源代码:

我实现了接口org.jsoup.select.NodeVisitor。但是,它似乎跳过了 和 之间的</a>内容<a。禁用漂亮的打印并不能解决我的问题。

您可以运行第一个 JUnit 测试来重现此错误: https ://github.com/gouessej/HtmlFlow/blob/patch-1/src/test/java/htmlflow/flowifier/test/TestFlowifier.java 它转换 HTML 源我的主页的代码转换为 Java 源代码,它将这个 Java 源代码转换回 HTML,并将生成的 HTML 源代码与原始源代码进行比较。

PS:实际上TextNode.getWholeText()返回\n而不是&nbsp;&nbsp;&nbsp;\n.

0 投票
3 回答
884 浏览

javascript - JavaScript:用普通空格替换不间断空格和特殊空格字符

我试图调试在字符串中搜索的问题,它归结为以下有趣的代码。

两者似乎都相等"item ""item "但事实并非如此!

通过将其粘贴到 Python 解释器上进一步调查后,我发现第一个"item "具有不同类型的空间,如"item\xc2\xa0". 我认为这是一个不间断的空间

现在,匹配这些字符串的一种可能解决方案是\xc2\xa0用空格替换,但是有没有更好的方法将所有特殊空格字符转换为普通空格?

0 投票
1 回答
125 浏览

google-apps-script - 用空间谷歌脚本替换 u00A0

我需要u00AO用空格替换字符的所有实例。我想这样做而不循环

它运行但不会用此文本中的空格替换 u00AO 字符

另外,我认为 u00AO 与 chr(160) 相同,这也不起作用

在VBA中我只是这样做

或这个

0 投票
1 回答
139 浏览

r - 删除 R 中的不间断空格字符

我有几列和 50K 加观察的数据框。我们将其命名为 df1。其中一个变量是 PLATES(此处表示为“y”),其中包含城市中公共汽车的车牌号。我想将此数据框与另一个(df2)匹配,其中我也有板块数据。我只想保留匹配的记录。在查看来自 CSV 文件的 df1 中的数据时,我意识到对于 y,有几个观察值在车牌号之前有符号,这些符号对应于不间断空格。我如何摆脱它,以便在我进行匹配时它不是问题。这里有一些代码可以帮助说明。假设您有 5 个车牌号:

但经过进一步检查

您会看到以下内容

我试过这个,从这篇文章https://blog.tonytsai.name/blog/2017-12-04-detecting-non-break-space-in-r/,但没有奏效

非常感谢您对如何处理此问题的帮助。谢谢!

0 投票
1 回答
284 浏览

html - 非破坏空间(160)和正常空间(32)的宽度不同

在此处输入图像描述

第一行的宽度不等于第二行,原因显然是   和正常空间的宽度不同。字体是 Lato,这仅适用于某些字符。这是一个拉托错误吗?或者有什么办法可以解决这个问题?字母间距或字间距 CSS 不走运

在此处输入图像描述

黑色文本中的间距是不间断空格( 160 )。黑色文本顶部的红色文本间距是正常间距 (32)。字体是 Lato,这只发生在某些字符上( f, r, v, w, y )

0 投票
1 回答
193 浏览

unicode - 在同一个单词中使用 Word Joiner (U+2060) 是否正确?

0 投票
0 回答
33 浏览

sql - 在 Teradata SQL 助手中搜索符号

我需要实现以下请求:

SELECT *

FROM TABLE_1

WHERE COLUMN_1 LIKE ANY ('%[1]%', '%[2]%')

目标是识别行中任何位置带有不间断空格单引号 (')的记录。

问题是需要什么代码而不是 [1] 和 [2],

其中 [1] - 不间断空格的代码,[2] - 单引号的代码。

0 投票
1 回答
107 浏览

python - 熊猫 read_html 问题  

我正在使用 pandas read_html 读取 html 文件,但遇到了不间断空格的问题。我在结果数据框的一列中有数据,该列应包含类似“ABCDEF G”的字符串(F 和 G 之间的三个空格)。相反,我得到“ABCDEF G”(F 和 G 之间的一个空格)。当我检查 html 文件时,它显示“ABCDEF G”,因此出于某种原因,这三个不间断空格仅更改为一个空格。html 中的所有单个不间断空格都可以正常工作。有没有办法解决这个问题,所以它保留了 F 和 G 之间的三个空格?

0 投票
0 回答
149 浏览

powershell - 在 Get-ChildItem 之后找不到 Powershell Get-Content 文件

我目前正在尝试使用 PowerShell 从索引服务器导出中读取元数据,以便能够使用此信息标记 SharePoint 文档库。

首先我搜索当前目录中的所有 XML 文件:

之后我从每个 XML 文件中获取数据内容:

执行脚本时,我收到错误消息“指定路径中没有对象,或者它被参数“-Include”或“-Exclude”过滤

由于这只发生在 .msg.xml 文件中,我认为这可能是由于文件名中的不间断空格或类似字符造成的。 尽管如此,令我感到惊讶的是,这些信息似乎在命令Get-ChildItemGet-Content.

关于如何解决这个问题的任何建议?

0 投票
0 回答
83 浏览

java - Encoding issue causing difference in strings

I have a legacy service which returns a XML string from the database. Now for one particular scenario this service returns a string which has the character &#160; in it. I recently shifted this service to a new Windows 10 machine and when I wrote this string to a file, the XML file became un-parseable. On opening the file present on the old Machine in the new machine I saw the file was UTF-8 encoded and on my new machine the file was being written in ANSI. So I started writing the file in UTF-8. The file became parseable now, and was exactly the same as the file on the old machine. But now the issue is that the service is still sending the XML string with the character &#160; in it. But I have started writing the file in UTF-8 and thus the local file has the character "Â " and the String which the service sends has the char xA0. And the logic now compares these two strings and finds a difference, when actually the only difference is in the encoding of these files. Now I am pretty sure that the encoding I want to write the files is in UTF-8, because the files are identical for both machines but how do I convert the String sent by the service such that it is in UTF-8. So that the difference is found only when there is any actual difference. This encoding thing is really confusing for me. Please help me understand what is actually happening here.

Another thing to note here is that the XML file on the old Windows 7 machine shows the encoding ANSI but when I copy that file on my new Windows 10 machine the encoding shows as UTF-8. I check the encoding using notepad(I open the save dialogue). Can someone please help me understand that there was some kind of issue on Windows 7 which was fixed in Windows 10, which is the reason behind the encoding difference between the 2 machines for the same file.

I already asked a question regarding this. I answered my own question as I did solve the parsing issue by writing the file in UTF-8 encoding.

I already tried using below:

retVal is the string sent by the service. When comparing retVal and the string written to the file, I still get a difference.

This is the code I use to get the string from the service:

So I tried to fix the encoding at the source, but unfortunately that did not work either. This is my new nodeToString method.

On comparing strRepeatString and the local file saved using UTF-8 encoding(code can be found in the answer of the question ) I am still getting the difference of the char Â.