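Here is my problem: I am trying to separate news items by category. I have the following txt file, which contains all the news items, each one delimited by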

 <item></item>

This is a set of 4 news items; my actual file has thousands.

 <item>
 Title: News from Washington
 Author: John Doe
 Category: New Laws
 Body: News content...
 </item>

 <item>
 Title: News from Texas
 Author: General Lee
 Category: Road Accidents
 Body: News content/
 </item>

 <item>
 Title: News from Georgia
 Author: Marcus Smith
 Category: Street Food
 Body: News content
 </item>

 <item>
 Title: News from Illinois
 Author: Robert Simpson
 Category: School Projects
 Body: News content
 </item>

I have the following code:

//I get the content from the news file:
 $news = file_get_contents("news.txt");

//Then I create the following variables to get each set of news from the news variable:
 $regexp = '@<item>(.*?)</item>@msi';

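From here, what I want is a news file that contains only the items whose category is "Street Food", leaving out the news in every other category.

For example, the result I would get from the sample above would be a file containing only this item: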

 <item>
 Title: News from Georgia
 Author: Marcus Smith
 Category: Street Food
 Body: News content
 </item>

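I tried using preg_match_all and a foreach loop to get the news for one specific category, but had no luck.

Do you have any suggestions for doing this? Or could you give me a good example?

Thanks in advance!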

2 Answers

You can try:

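$final = array();
$filename = "log.txt";
// The file has no single root element, so it is not well-formed XML;
// read it as text and pull out each <item> block with a regex instead.
$news = file_get_contents($filename);
preg_match_all('@<item>(.*?)</item>@msi', $news, $matches);

foreach ( $matches[1] as $item ) {
    $item = trim($item);
    $content = array();
    foreach ( explode("\n", $item) as $info ) {
        // Limit to 2 parts so colons inside the value are preserved
        list($title, $data) = explode(":", $info, 2);
        $content[trim($title)] = $data;
    }
    // Group the parsed items by category
    $final[trim($content['Category'])][] = $content;
}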


#Remove Street Food
unset($final['Street Food']);

#Output The Rest 
var_dump($final);

Output

    array
  'New Laws' => 
    array
      0 => 
        array
          'Title' => string ' News from Washington' (length=21)
          'Author' => string ' John Doe' (length=9)
          'Category' => string ' New Laws' (length=9)
          'Body' => string ' News content...' (length=16)
  'Road Accidents' => 
    array
      0 => 
        array
          'Title' => string ' News from Texas' (length=16)
          'Author' => string ' General Lee' (length=12)
          'Category' => string ' Road Accidents' (length=15)
          'Body' => string ' News content/' (length=14)
  'School Projects' => 
    array
      0 => 
        array
          'Title' => string ' News from Illinois' (length=19)
          'Author' => string ' Robert Simpson' (length=15)
          'Category' => string ' School Projects' (length=16)
          'Body' => string ' News content' (length=13)
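
The snippet above drops the Street Food group, whereas the question asks for a file that contains only that category. Below is a minimal sketch of the inverse, working directly on the raw text. The helper name `filterItemsByCategory` and the output file name `street_food.txt` are made up for illustration, and the inline sample stands in for the real news file.

```php
<?php
// Hypothetical helper: return only the <item> blocks whose Category line
// equals $category (case-insensitive, surrounding whitespace ignored).
function filterItemsByCategory(string $news, string $category): array
{
    // $matches[0] holds whole <item>...</item> blocks, $matches[1] their contents.
    preg_match_all('@<item>(.*?)</item>@si', $news, $matches);

    $kept = array();
    foreach ($matches[1] as $i => $inner) {
        // Look for a line of the form "Category: <value>" inside the block.
        if (preg_match('/^\s*Category:\s*(.*?)\s*$/mi', $inner, $m)
            && strcasecmp($m[1], $category) === 0) {
            $kept[] = $matches[0][$i];
        }
    }
    return $kept;
}

// Sample data in the question's format.
$news = <<<TXT
<item>
Title: News from Texas
Author: General Lee
Category: Road Accidents
Body: News content/
</item>

<item>
Title: News from Georgia
Author: Marcus Smith
Category: Street Food
Body: News content
</item>
TXT;

$streetFood = filterItemsByCategory($news, 'Street Food');
echo implode("\n\n", $streetFood), "\n";

// To write the result to a file instead (file name is an assumption):
// file_put_contents('street_food.txt', implode("\n\n", $streetFood) . "\n");
```

Keeping the matched blocks verbatim means the output file has the same layout as the input, so it can be fed back through the same code later.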

You can also rewrite the data as XML using the following:

#Rewrite the array to the new XML format
rewriteToXML($final,"log.xml");

This writes the following to log.xml:

<?xml version="1.0"?>
<items>
    <item>
        <Title> News from Washington</Title>
        <Author> John Doe</Author>
        <Category> New Laws</Category>
        <Body> News content...</Body>
    </item>
    <item>
        <Title> News from Texas</Title>
        <Author> General Lee</Author>
        <Category> Road Accidents</Category>
        <Body> News content/</Body>
    </item>
    <item>
        <Title> News from Illinois</Title>
        <Author> Robert Simpson</Author>
        <Category> School Projects</Category>
        <Body> News content</Body>
    </item>
</items>

Reading the new format is easier:

$final = array();
$filename = "log.xml";
$news = simplexml_load_file($filename);

foreach ( $news as $item ) {
    #Check if not Street Food
    if(trim($item->Category) != 'Street Food')
            $final[trim($item->Category)][] = (array) $item;
}

#Output The Rest
var_dump($final);

The rewrite function:

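function rewriteToXML($array, $fileName = null) {
    $xml = new SimpleXMLElement("<items />");
    foreach ( $array as $category => $items ) {
        foreach ( $items as $fields ) {
            // One <item> per news entry, even when several share a category
            $child = $xml->addChild("item");
            foreach ( $fields as $title => $data ) {
                // addChild() does not escape "&", so escape the value first
                $child->addChild($title, htmlspecialchars($data));
            }
        }
    }
    // Write to $fileName, or return the XML string when no file name is given
    return $fileName === null ? $xml->asXML() : $xml->asXML($fileName);
}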
answered 2012-10-12T20:29:39.353

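If this is an XML file, I would use SimpleXML rather than regular expressions. You can then query the SimpleXML document with XPath (SimpleXMLElement::xpath()); SimpleXML does not support XQuery.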

http://php.net/manual/en/book.simplexml.php
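
As a rough illustration of that idea: SimpleXML's query support is XPath, available through `SimpleXMLElement::xpath()`. The sketch below assumes the news has already been wrapped in a single root element (here `<items>`, with one child element per field), which the original txt file is not. The inline sample string stands in for such a file.

```php
<?php
// Assumed input: well-formed XML with a single <items> root element.
$xmlString = <<<XML
<items>
    <item>
        <Title>News from Texas</Title>
        <Category>Road Accidents</Category>
    </item>
    <item>
        <Title>News from Georgia</Title>
        <Category> Street Food</Category>
    </item>
</items>
XML;

$xml = simplexml_load_string($xmlString);

// normalize-space() trims stray whitespace around the category text.
$matches = $xml->xpath('//item[normalize-space(Category) = "Street Food"]');

foreach ($matches as $item) {
    echo trim((string) $item->Title), "\n";
}
```

For a file on disk, `simplexml_load_file()` can replace `simplexml_load_string()`; the XPath expression stays the same.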

answered 2012-10-12T20:15:31.343