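When you modify the DNA string $query, you need to distinguish between the position of a nucleotide and the plain string offset.
At first the two are identical, but the more you add to the string, the further they drift apart.
Things get easier if you wrap the string in an object of its own that takes care of the HTML tags and does not count them (including the whitespace around the tags):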
$query = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC";
// wrap line: 11 lines à 50 nucleotides
$seq = chunk_split($query, 50, "<br />\n");
// get help from sequence object
$sequence = new AmendSequence($seq);
// some HTML comments for demonstration purposes
$sequence->insertAt(0, "<!-- first line -->\n");
$sequence->insertAt(50, "<!-- second line -->\n", TRUE); # TRUE: Place after <br />
$sequence->insertAt(75, "<!-- inside second line -->");
$sequence->insertAt(550, "<!-- at end -->", TRUE); # TRUE: Place after <br />
// colorize
$color = '<span style="color: hsl(0,100%,25%);">';
$sequence->insertAt(49, $color);
$sequence->insertAt(50, '</span>');
printf("Sequence with %d nucleotids:\n", count($sequence)); # count gives length w/o amends
echo $sequence, "\n"; # that prints your sequence with all amends
This produces the following output:
Sequence with 550 nucleotids:
<!-- first line -->
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA<span style="color: hsl(0,100%,25%);">A</span><br />
<!-- second line -->
AAAAAAAAAAAAAAAAAAAAAAAAA<!-- inside second line -->AAAAAAAAAAAAAAAAAAAAAAAAA<br />
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA<br />
AAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB<br />
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB<br />
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB<br />
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBCCCCCC<br />
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br />
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br />
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br />
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br />
<!-- at end -->
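To make the gap between nucleotide position and string offset concrete, here is a minimal standalone sketch. It is not part of the class; the two-line sample string is made up for illustration, and the tag pattern is the same one the class uses as its TAG constant.

```php
<?php
// Two short lines, wrapped with "<br />\n" like the sequence above.
$html = "ACGT<br />\nTGCA<br />\n";

// Split on anything that looks like a tag (plus surrounding whitespace)
// and keep the string offset of every remaining run of nucleotides.
$segments = preg_split(
    '/\s*<[^>]*>\s*/',
    $html,
    -1,
    PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY
);

$position = 0; // nucleotide position, counted without any HTML
foreach ($segments as [$text, $offset]) {
    printf("nucleotide position %d is at string offset %d (%s)\n", $position, $offset, $text);
    $position += strlen($text);
}
// prints:
// nucleotide position 0 is at string offset 0 (ACGT)
// nucleotide position 4 is at string offset 11 (TGCA)
```

The second run starts at nucleotide position 4 but at string offset 11, because the seven characters of "<br />\n" sit in between and are not counted as nucleotides.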
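So the real magic here is the insertAt method, which accepts the offset position of a nucleotide and calculates the string offset for it. All of this is wrapped in a class of its own, which may be a bit much to start with, but the real action is splitting the string that contains both DNA and HTML into DNA-only segments and obtaining their actual string offsets. The full source code: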
<?php
/**
 * @link http://stackoverflow.com/questions/10446162/how-to-wordwrap-with-different-length-string-modification
 */

/**
 * Treat text with "tags" as text without tags when insertAt() is called.
 */
class AmendSequence implements IteratorAggregate, Countable
{
    /**
     * regex pattern for a tag
     */
    const TAG = '\s*<[^>]*>\s*';

    /**
     * @var string
     */
    private $query;

    /**
     * @param string $query
     */
    public function __construct($query = '')
    {
        $this->setQuery($query);
    }

    /**
     * @param int $offset
     * @param string $string
     * @param bool $after (optional) prefer after next tag instead of before
     */
    public function insertAt($offset, $string, $after = FALSE)
    {
        $offset = $this->translate($offset, $after);
        $this->query = substr_replace($this->query, $string, $offset, 0);
    }

    /**
     * Translate virtual offset to string offset
     * @param int $virtualOffset
     * @param bool $after prefer after next tag instead of before
     * @return int
     * @throws InvalidArgumentException
     */
    public function translate($virtualOffset, $after)
    {
        if ($virtualOffset < 0) {
            throw new InvalidArgumentException(sprintf('Offset can not be lower than 0, is %d.', $virtualOffset));
        }
        $virtualCurrent = 0;
        foreach ($this as $segment) {
            list(, $current, $length) = $segment;
            $delta = ($virtualOffset - $virtualCurrent) - $length;
            if ($delta < 0 || ($delta === 0 && !$after)) {
                return $current + $length + $delta;
            }
            $virtualCurrent += $length;
        }
        if ($virtualCurrent === $virtualOffset && $after) {
            return strlen($this->query);
        }
        throw new InvalidArgumentException(sprintf('Offset can not be larger than %d, is %d.', $virtualCurrent, $virtualOffset));
    }

    /**
     * @return array list of segments, each array(string, string offset, length)
     */
    public function getSegments()
    {
        $segments = preg_split('/' . self::TAG . '/', $this->query, 0, PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY);
        foreach ($segments as &$segment) {
            $segment[2] = strlen($segment[0]);
        }
        return $segments;
    }

    /**
     * @return string the sequence without the amends
     */
    public function getSequence()
    {
        $buffer = '';
        foreach ($this as $segment) {
            $buffer .= $segment[0];
        }
        return $buffer;
    }

    /**
     * @return string
     */
    public function getQuery()
    {
        return $this->query;
    }

    /**
     * @param string $query
     */
    public function setQuery($query)
    {
        $this->query = (string)$query;
    }

    /**
     * Retrieve an external iterator
     * @link http://php.net/manual/en/iteratoraggregate.getiterator.php
     * @return Traversable An instance of an object implementing Iterator or Traversable
     */
    public function getIterator(): Traversable
    {
        // the declared return type avoids the deprecation notice on PHP 8.1+
        return new ArrayIterator($this->getSegments());
    }

    /**
     * @link http://php.net/manual/en/countable.count.php
     * @return int The custom count as an integer.
     */
    public function count(): int
    {
        $length = 0;
        foreach ($this as $segment) {
            $length += $segment[2];
        }
        return $length;
    }

    /**
     * @return string
     */
    public function __toString()
    {
        return $this->query;
    }
}
$query = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC";
// wrap line: 11 lines à 50 nucleotides
$seq = chunk_split($query, 50, "<br />\n");
// get help from sequence object
$sequence = new AmendSequence($seq);
// some HTML comments for demonstration purposes
$sequence->insertAt(0, "<!-- first line -->\n");
$sequence->insertAt(50, "<!-- second line -->\n", TRUE); # TRUE: Place after <br />
$sequence->insertAt(75, "<!-- inside second line -->");
$sequence->insertAt(550, "<!-- at end -->", TRUE); # TRUE: Place after <br />
// colorize
$color = '<span style="color: hsl(0,100%,25%);">';
$sequence->insertAt(49, $color);
$sequence->insertAt(50, '</span>');
printf("Sequence with %d nucleotids:\n", count($sequence)); # count gives length w/o amends
echo $sequence, "\n"; # that prints your sequence with all amends
echo $sequence->getSequence(); # without the amends
Now feel free to colorize as many parts as you like, regardless of whether the sequence already contains other HTML.
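As a rough sketch of coloring several parts in one go, here is a function-based variant that does not need the class. The helper names nucleotideToStringOffset() and colorizeRanges() are hypothetical, the offset translation follows the same idea as translate() above, and the ranges are assumed to be half-open [from, to), non-empty and non-overlapping.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of AmendSequence.

/**
 * Map a nucleotide position to a string offset, ignoring tags and the
 * whitespace around them. With $after = true, a position that falls on a
 * tag boundary resolves to the far side of the tag(s).
 */
function nucleotideToStringOffset(string $html, int $position, bool $after = false): int
{
    if ($position < 0) {
        throw new InvalidArgumentException("Position $position is negative.");
    }
    $segments = preg_split(
        '/\s*<[^>]*>\s*/',
        $html,
        -1,
        PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    $seen = 0; // nucleotides counted so far
    foreach ($segments as [$text, $start]) {
        $length = strlen($text);
        $delta = $position - $seen - $length;
        if ($delta < 0 || ($delta === 0 && !$after)) {
            return $start + $length + $delta;
        }
        $seen += $length;
    }
    if ($seen === $position && $after) {
        return strlen($html);
    }
    throw new InvalidArgumentException("Position $position is outside the sequence.");
}

/**
 * Wrap every [from, to) nucleotide range in a colored span.
 */
function colorizeRanges(string $html, array $ranges, string $color): string
{
    $open = '<span style="color: ' . $color . ';">';
    foreach ($ranges as [$from, $to]) {
        // The closing tag goes before any tag found at $to, the opening tag
        // after any tag found at $from, so the span hugs the nucleotides.
        $html = substr_replace($html, '</span>', nucleotideToStringOffset($html, $to), 0);
        $html = substr_replace($html, $open, nucleotideToStringOffset($html, $from, true), 0);
    }
    return $html;
}

$html = "ACGTACGT<br />\nTTGGCCAA<br />\n";
echo colorizeRanges($html, [[2, 4], [10, 12]], 'hsl(0,100%,25%)');
// prints:
// AC<span style="color: hsl(0,100%,25%);">GT</span>ACGT<br />
// TT<span style="color: hsl(0,100%,25%);">GG</span>CCAA<br />
```

Because every call recomputes the segments from the current string, the second range is still addressed by nucleotide position even though the first span has already shifted all string offsets after it.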