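The problem I am trying to solve: given a string that may already contain line breaks, insert additional line breaks so that no line exceeds a set number of characters. Where possible, it should also keep words intact.
Is there a library in Java or Scala that does what I need?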
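The java.text package has a BreakIterator class that can tell you where a line break may be inserted, but it is somewhat complicated to use. A regular expression like the following does about 80% of the job: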
str += "\n"; // Needed to handle last line correctly
// insert line break after max 50 chars on a line
str = str.replaceAll("(.{1,50})\\s+", "$1\n");
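For comparison, the BreakIterator route mentioned above can be sketched roughly as follows. This is only an illustration under a few assumptions: the class and method names (LineWrapSketch, wrap, wrapLine) are made up for this example, existing line breaks are handled by splitting on "\n" first, and a word longer than the limit is split by force.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class LineWrapSketch {

    /** Wraps a single line (containing no '\n') into pieces of at most maxLength characters. */
    static List<String> wrapLine(String line, int maxLength) {
        List<String> out = new ArrayList<>();
        BreakIterator breaks = BreakIterator.getLineInstance(Locale.ROOT);
        breaks.setText(line);
        StringBuilder current = new StringBuilder();
        int start = breaks.first();
        for (int end = breaks.next(); end != BreakIterator.DONE; start = end, end = breaks.next()) {
            String segment = line.substring(start, end);    // text up to the next break opportunity
            String word = segment.replaceAll("\\s+$", "");  // same text without trailing whitespace
            if (current.length() > 0 && current.length() + word.length() > maxLength) {
                // The word does not fit any more: finish the current line.
                out.add(current.toString().replaceAll("\\s+$", ""));
                current.setLength(0);
            }
            // A single word longer than the limit is split by force (current is empty here).
            while (word.length() > maxLength) {
                out.add(segment.substring(0, maxLength));
                segment = segment.substring(maxLength);
                word = word.substring(maxLength);
            }
            current.append(segment);
        }
        out.add(current.toString().replaceAll("\\s+$", ""));
        return out;
    }

    /** Wraps every existing line of the text; existing line breaks are preserved. */
    static String wrap(String text, int maxLength) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n", -1)) {          // -1 keeps trailing empty lines
            lines.addAll(wrapLine(line, maxLength));
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        System.out.println(wrap("Hi there, I'm testing to see if this\nalgorithm is going to work or not.", 20));
    }
}
```

Unlike the regular expression, this sketch also splits words that are longer than the limit, at the cost of noticeably more code.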
The Apache Commons Lang library has a WordUtils class with a wrap method, which wraps a long line of text into several lines of a given length, breaking at word boundaries.
public static String addReturns(String s, int maxLength)
{
    String newString = "";
    int ind = 0;
    while(ind < s.length())
    {
        String temp = s.substring(ind, Math.min(s.length(), ind+maxLength));
        int lastSpace = temp.lastIndexOf(" ");
        int firstNewline = temp.indexOf("\n");
        if(firstNewline>-1)
        {
            newString += temp.substring(0, firstNewline + 1);
            ind += firstNewline + 1;
        }
        else if(lastSpace>-1)
        {
            newString += temp.substring(0, lastSpace + 1) + "\n";
            ind += lastSpace + 1;
        }
        else
        {
            newString += temp + "\n";
            ind += maxLength;
        }
    }
    return newString;
}
If you don't want to use regular expressions, the method above will do the job.
System.out.println(addReturns("Hi there, I'm testing to see if this\nalgorithm is going to work or not. Let's see. ThisIsAReallyLongWordThatShouldGetSplitUp", 20));
Output:
Hi there, I'm
testing to see if
this
algorithm is going
to work or not.
Let's see.
ThisIsAReallyLongWor
dThatShouldGetSplitU
p
In case anyone is interested, my final solution uses Apache Commons WordUtils. Thanks to Joni for pointing WordUtils out to me.
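// Needs WordUtils on the classpath, e.g. org.apache.commons.lang3.text.WordUtils (Commons Lang 3)
// or org.apache.commons.text.WordUtils (Commons Text).
// Note: eolMarker is passed to String.split, so it is interpreted as a regular expression.
private static String wrappify(String source, int lineLength, String eolMarker){
    String[] lines = source.split(eolMarker);
    StringBuffer wrappedStr = new StringBuffer();
    for (String line : lines) {
        if(line.length() <= lineLength){
            // Short enough already: keep the line unchanged
            wrappedStr.append(line + eolMarker);
        }else{
            // Too long: let WordUtils wrap it, splitting over-long words as well
            wrappedStr.append(WordUtils.wrap(line, lineLength, eolMarker, true) + eolMarker);
        }
    }
    // Drop the trailing eolMarker added by the last iteration
    return wrappedStr.replace(wrappedStr.lastIndexOf(eolMarker), wrappedStr.length(), "").toString();
}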
I think you can start from something like this. Note that you have to handle the special case where a single word is longer than MAX_LINE_LENGTH.
package com.ekse.nothing;
public class LimitColumnSize {
private static String DATAS = "It was 1998 and the dot-com boom was in full effect. I was making websites as a 22 year old freelance programmer in NYC. I charged my first client $1,400. My second client paid $5,400. The next paid $24,000. I remember the exact amounts — they were the largest checks I’d seen up til that point.\n"
+ "Then I wrote a proposal for $340,000 to help an online grocery store with their website. I had 5 full time engineers at that point (all working from my apartment) but it was still a ton of dough. The client approved, but wanted me to sign a contract — everything had been handshakes up til then.\n"
+ "No prob. Sent the contract to my lawyer. She marked it up, sent it to the client. Then the client marked it up and sent it back to my lawyer. And so on, back and forth for almost a month. I was inexperienced and believed that this is just how business was done."
+ "Annoyed by my lawyering, the client eventually gave up and hired someone else.";
    private static final int MAX_LINE_LENGTH = 80;
    private static final char[] BREAKING_CHAR = {' ', ',', ';', '!', '?', ')', ']', '}'}; // Probably some others

    public static void main(String[] args) {
        String current = DATAS;
        StringBuilder result = new StringBuilder();
        while (!current.isEmpty()) {
            int newline = current.indexOf('\n');
            if (newline >= 0 && newline <= MAX_LINE_LENGTH) {
                // An existing line break ends a line that already fits: keep it as is.
                result.append(current, 0, newline + 1);
                current = current.substring(newline + 1);
            } else if (current.length() <= MAX_LINE_LENGTH) {
                // The remaining text fits on one line.
                result.append(current);
                current = "";
            } else {
                // Scan backwards for the last breaking character that still fits on the line.
                int cut = MAX_LINE_LENGTH; // Fallback: hard split of a word longer than the limit
                for (int i = MAX_LINE_LENGTH - 1; i >= 0; i--) {
                    if (isBreakingChar(current.charAt(i))) {
                        cut = i + 1; // The breaking character stays at the end of the line
                        break;
                    }
                }
                result.append(current, 0, cut).append('\n');
                current = current.substring(cut);
            }
        }
        System.out.println(result);
    }

    private static boolean isBreakingChar(char c) {
        for (int i = 0; i < BREAKING_CHAR.length; ++i) {
            if (c == BREAKING_CHAR[i]) {
                return true;
            }
        }
        return false;
    }
}