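I am trying to extract specific fields from a long string of text. The text is:

Rating: Explicit Score: 17 Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair nipples no_bra nopan nude sword_art_online yuuki_asuna User: openui

I want to extract them as

  1. Rating: Explicit
  2. Score: 17
  3. Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair sword_art_online yuuki_asuna
  4. User: openui

The code I tried extracts only the field labels: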

$imageTitle = "Rating: Explicit Score: 17 Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair nipples no_bra nopan nude sword_art_online yuuki_asuna User: openui";
preg_match_all("/[a-z]{1,}\:\s/i", $imageTitle, $matches);
var_dump($matches);

I then tried adding (.*), but that returned the entire text. The following version extracts only one word per field:

preg_match_all("/[a-z]{1,}\:\s[a-z0-9]{1,}/i", $imageTitle, $matches);
//Output
array (size=1)
  0 => 
    array (size=4)
      0 => string 'Rating: Explicit' (length=16)
      1 => string 'Score: 17' (length=9)
      2 => string 'Tags: apron' (length=11)
      3 => string 'User: openui' (length=12)

How can I extract the rest of each value? If possible, I would also like the result as array keys and values.

1 Answer

preg_match_all with a lazy match and a lookahead should work:

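// Keep the subject on one line: "." does not match a newline without the s flag.
$s = 'Rating: Explicit Score: 17 Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair sword_art_online yuuki_asuna User: openui';

// Lazily capture each "Label: value" chunk up to the next " Label:" or the end of the string.
if (preg_match_all('#\s*(.+?(?=((^|\s)[A-Z][a-z]*:\s*|$)))#i', $s, $arr)) {
    print_r($arr[1]);
}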

Output:

Array
(
    [0] => Rating: Explicit
    [1] => Score: 17
    [2] => Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair sword_art_online yuuki_asuna
    [3] => User: openui
)
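
The question also asks for the result as array keys and values. One possible way to get that is sketched below. It assumes that labels consist only of letters and that no value contains a word immediately followed by a colon. The pattern and the names `$pattern`, `$fields`, and `$tags` are illustrative and not part of the code above.

```php
<?php
// Sketch only: $pattern, $fields and $tags are illustrative names.
$s = 'Rating: Explicit Score: 17 Tags: apron blonde_hair brown_eyes itaru_chokusha kirigaya_kazuto long_hair sword_art_online yuuki_asuna User: openui';

// Group 1 is the label (letters before the colon); group 2 is the value,
// matched lazily up to the next " Label:" or the end of the string.
$pattern = '/([A-Za-z]+):\s*(.*?)(?=\s+[A-Za-z]+:|$)/';

$fields = array();
if (preg_match_all($pattern, $s, $m, PREG_SET_ORDER)) {
    foreach ($m as $match) {
        $fields[$match[1]] = $match[2]; // e.g. $fields['Score'] = '17'
    }
}

// Optionally split the space-separated tag list into its own array.
$tags = isset($fields['Tags'])
    ? preg_split('/\s+/', $fields['Tags'], -1, PREG_SPLIT_NO_EMPTY)
    : array();

print_r($fields);
print_r($tags);
```

With this approach each field can be read directly, for example `$fields['User']`, and the individual tags are available in `$tags`.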
answered 2013-08-06T18:39:26.570