For the past 4 days I have been trying to convert this XML file to CSV using the field layout shown below.

Part of the XML file:

<!-- language: lang-xml -->

<ponudba podjetje="SO d.o.o." velja_od="23.09.2012 @ 12:30:48">
    <artikel koda="LS593EAR" naziv="HP ENVY 17-2199e" kategorija="Prenosniki" podkategorija="Hewlett Packard (HP)" v_akciji="ne" kosovnost="več">
    <opis>
    HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)
    </opis>
    <opis_detail>
    HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)<br/><table> <col width="25%" /> <col /> <tbody> <tr> <th>Procesor</th> <td>Intel® Core™ i7-2630QM / 2.00 GHz / Quad-Core</td> </tr> <tr> <th>Delovni pomnilnik</th> <td>8 GB DDR3</td> </tr> <tr> <th>Trdi disk</th> <td>1 TB (1000 GB) / 5400 / SATA</td> </tr> <tr> <th>LCD zaslon</th> <td>43,9 cm (17,3'') Full HD HP Ultra BrightView Infinity Display (1920x1080)</td> </tr> <tr> <th>Grafična kartica</th> <td>AMD Radeon™ HD 6850 Graphics</td> </tr> <tr> <th>Optična enota</th> <td>SuperMulti DVD-RW Double Layer</td> </tr> <tr> <th>USB 2.0</th> <td>2x</td> </tr> <tr> <th>USB 3.0</th> <td>1x</td> </tr>    <tr> <th>eSATA</th> <td>da</td> </tr> <tr> <th>HDMI</th> <td>da</td> </tr> <tr> <th>WiFi</th> <td>da</td> </tr> <tr> <th>Bluetooth</th> <td>da</td> </tr> <tr> <th>WWAN</th> <td>ne</td> </tr> <tr> <th>Spletna kamera</th> <td>da</td> </tr> <tr> <th>Card Reader</th> <td>da</td> </tr> <tr> <th>Express Card</th> <td>ne</td> </tr> <tr> <th>TV kartica</th> <td>ne</td> </tr> <tr> <th>Finger Print</th> <td>ne</td> </tr> <tr> <th>Vhodne naprave</th> <td>brez</td> </tr>     <tr> <th>Operacijski sistem</th> <td>Microsoft Windows 7 Home Premium (64 bit)</td> </tr> <tr> <th>Država uvoza</th> <td>Italijanska tipkovnica (priložene SLO nalepke)</td> </tr>  <tr> <th>Stanje modela</th> <td>HP Renew</td> </tr>     </tbody> </table>
    </opis_detail>
    <garancija_v_mesecih>12</garancija_v_mesecih>
    <cena_v_EUR>1.049,00</cena_v_EUR>
    <proizvajalec>HP</proizvajalec>
    <stanje>na zalogi</stanje>
    <url_foto_artikla>
    http://www.so-doo.si/media/catalog/product/cache/1/image/265x/9df78eab33525d08d6e5fb8d27136e95/c/0/c02034964.jpg.hri_4.jpg
    </url_foto_artikla>
    <vec_fotk_artikla>
    <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034982.jpg.hri_4.jpg"/>
    <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034991.jpg.hri_4.jpg"/>
    </vec_fotk_artikla>
    <teza_artikla_v_kg>2.9000</teza_artikla_v_kg>
    </artikel>

This is the CSV file I want: a header row with all fields, followed by all of the data from the XML, not just some of it :(

<!-- language: lang-csv -->

koda    naziv   kategorija  podkategorija   v_akciji    kosovnost   opis    opis_detail garancija_v_mesecih cena_v_EUR  proizvajalec    stanje  password    url_foto_artikla    vec_fotk_artikla

I tried this:

// The order here determines the order in the output CSV file
$columns = array(
    'koda',
    'naziv',
    'kategorija',
    'podkategorija',
    'v_akciji',
    'kosovnost'
);

// This will be used later on to correctly sort in the attribute values
// Note: the third parameter of "array_fill" determines what value to use
// in case a node lacks an attribute
$csv_blueprint = array_combine(
    $columns,
    array_fill(0, count($columns), '')
);

$data = array($columns);
$filexml = 'so_feed.xml';

if ( !file_exists($filexml) ) {
    // Do some error routine
} else {
    $xml = simplexml_load_file($filexml);
    $artikel = $xml->artikel;

    if ( !count($artikel) ) {
        // Stop processing 'cause there's nothing to do
    } else {
        foreach ( $artikel as $item ) {
            // Clone the row blueprint to leave the original unspoiled
            $row = $csv_blueprint;

I also tried this:

$xml = simplexml_load_file($filexml);
//$artikel = $xml->artikel;
$ponudbas = $xml->ponudba;
...
    foreach ( $ponudbas as $ponudba ) {
        // Clone the row blueprint to leave the original unspoiled
        $row = $csv_blueprint;

But neither approach parses all of the data in the XML. I don't know what to do :(

1 Answer

If your XML is exactly as you pasted it, it is not a well-formed XML document: the closing </ponudba> tag is missing at the end.

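Another thing to consider is that in XML the data lives inside elements, and in your case two elements use a pair of apostrophes ('') as an inch mark (17''). In some special cases this may cause parsing errors. If you really want to keep them, it may be better to put that data in a CDATA block so those special characters are escaped.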

Edit: I just noticed that your XML contains HTML elements inside XML elements. Wrapping that kind of content in a CDATA block is recommended.

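If it is easier for you, you can simply convert the XML to JSON and decode it directly into a PHP array (passing TRUE as the second argument to json_decode returns associative arrays rather than objects):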

$json = json_encode($xml);
$data = json_decode($json, TRUE);
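To see what that round trip gives you, here is a minimal sketch. The XML string is a hypothetical, trimmed-down sample modeled on your feed, not the real file. The point it illustrates is that attributes end up under an `@attributes` key, while child elements become ordinary keys:

```php
<?php
// Hypothetical, trimmed-down sample modeled on the feed in the question.
$sample = <<<XML
<ponudba podjetje="SO d.o.o.">
    <artikel koda="LS593EAR" naziv="HP ENVY 17-2199e">
        <proizvajalec>HP</proizvajalec>
        <stanje>na zalogi</stanje>
    </artikel>
</ponudba>
XML;

$xml  = simplexml_load_string($sample);
$data = json_decode(json_encode($xml), true);

// With a single <artikel>, 'artikel' is one associative array;
// with several <artikel> elements it becomes a list of such arrays.
$artikel = $data['artikel'];

// Attributes are grouped under the '@attributes' key.
echo $artikel['@attributes']['koda'], "\n";

// Child elements that contain only text become plain string values.
echo $artikel['proizvajalec'], "\n";
```

Keep in mind that the shape of `$data['artikel']` changes depending on whether the feed has one or many `<artikel>` elements, so real code should probably check for that before looping.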

If you want to write the result to a CSV file, you should consider using fputcsv (http://php.net/manual/fr/function.fputcsv.php)
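As a rough sketch of how fputcsv could be combined with SimpleXML here: the sample XML is a hypothetical, shortened version of your feed, the column lists cover only a few of your fields, and joining the photo URLs with `|` is just one possible choice for fitting several values into one cell. The sketch writes to an in-memory stream so it can be run as-is.

```php
<?php
// Hypothetical, trimmed-down sample modeled on the feed in the question.
$sample = <<<XML
<ponudba>
    <artikel koda="LS593EAR" naziv="HP ENVY 17-2199e" kategorija="Prenosniki">
        <opis>HP ENVY 17-2199el, Intel Core i7-2630QM</opis>
        <cena_v_EUR>1.049,00</cena_v_EUR>
        <vec_fotk_artikla>
            <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034982.jpg.hri_4.jpg"/>
            <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034991.jpg.hri_4.jpg"/>
        </vec_fotk_artikla>
    </artikel>
</ponudba>
XML;

// Columns read from attributes of <artikel> and from its child elements.
$attributeColumns = array('koda', 'naziv', 'kategorija');
$elementColumns   = array('opis', 'cena_v_EUR');

$xml = simplexml_load_string($sample);

// In-memory stream for the demo; use fopen('so_feed.csv', 'w') for a real file.
$out = fopen('php://memory', 'w+');

// Header row.
$header = array_merge($attributeColumns, $elementColumns, array('vec_fotk_artikla'));
fputcsv($out, $header, ',', '"', '\\');

foreach ($xml->artikel as $artikel) {
    $row = array();
    foreach ($attributeColumns as $name) {
        $row[] = (string) $artikel[$name];        // attribute access
    }
    foreach ($elementColumns as $name) {
        $row[] = trim((string) $artikel->$name);  // child element text
    }
    // Collapse all photo URLs into a single cell.
    $photos = array();
    foreach ($artikel->vec_fotk_artikla->slika as $slika) {
        $photos[] = (string) $slika['href'];
    }
    $row[] = implode('|', $photos);

    fputcsv($out, $row, ',', '"', '\\');
}

rewind($out);
$csv = stream_get_contents($out);
fclose($out);
echo $csv;
```

The key detail is that attributes are read with array syntax (`$artikel['koda']`) while child elements are read with property syntax (`$artikel->opis`); fputcsv takes care of quoting values such as `1.049,00` that contain the separator.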

Edit 2: try a simple test.

Use:

$file='file.xml';
$xml = simplexml_load_file($file);

foreach ($xml->artikel as $art)
{    
    echo $art->opis_detail;
}

This will output only:

HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)

Now, if you wrap that node's content in a CDATA section in the XML:

<opis_detail><![CDATA[HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)<br/><table> <col width="25%" /> <col /> <tbody> <tr> <th>Procesor</th> <td>Intel® Core™ i7-2630QM / 2.00 GHz / Quad-Core</td> </tr> <tr> <th>Delovni pomnilnik</th> <td>8 GB DDR3</td> </tr> <tr> <th>Trdi disk</th> <td>1 TB (1000 GB) / 5400 / SATA</td> </tr> <tr> <th>LCD zaslon</th> <td>43,9 cm (17,3'') Full HD HP Ultra BrightView Infinity Display (1920x1080)</td> </tr> <tr> <th>Grafična kartica</th> <td>AMD Radeon™ HD 6850 Graphics</td> </tr> <tr> <th>Optična enota</th> <td>SuperMulti DVD-RW Double Layer</td> </tr> <tr> <th>USB 2.0</th> <td>2x</td> </tr> <tr> <th>USB 3.0</th> <td>1x</td> </tr>    <tr> <th>eSATA</th> <td>da</td> </tr> <tr> <th>HDMI</th> <td>da</td> </tr> <tr> <th>WiFi</th> <td>da</td> </tr> <tr> <th>Bluetooth</th> <td>da</td> </tr> <tr> <th>WWAN</th> <td>ne</td> </tr> <tr> <th>Spletna kamera</th> <td>da</td> </tr> <tr> <th>Card Reader</th> <td>da</td> </tr> <tr> <th>Express Card</th> <td>ne</td> </tr> <tr> <th>TV kartica</th> <td>ne</td> </tr> <tr> <th>Finger Print</th> <td>ne</td> </tr> <tr> <th>Vhodne naprave</th> <td>brez</td> </tr>     <tr> <th>Operacijski sistem</th> <td>Microsoft Windows 7 Home Premium (64 bit)</td> </tr> <tr> <th>Država uvoza</th> <td>Italijanska tipkovnica (priložene SLO nalepke)</td> </tr>  <tr> <th>Stanje modela</th> <td>HP Renew</td> </tr>     </tbody> </table>]]>
    </opis_detail>

it will now output:

HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)
Procesor    Intel® Core™ i7-2630QM / 2.00 GHz / Quad-Core
Delovni pomnilnik   8 GB DDR3
Trdi disk   1 TB (1000 GB) / 5400 / SATA
LCD zaslon  43,9 cm (17,3'') Full HD HP Ultra BrightView Infinity Display (1920x1080)
Grafična kartica    AMD Radeon™ HD 6850 Graphics
Optična enota   SuperMulti DVD-RW Double Layer
USB 2.0 2x
USB 3.0 1x
eSATA   da
HDMI    da
WiFi    da
Bluetooth   da
WWAN    ne
Spletna kamera  da
Card Reader da
Express Card    ne
TV kartica  ne
Finger Print    ne
Vhodne naprave  brez
Operacijski sistem  Microsoft Windows 7 Home Premium (64 bit)
Država uvoza   Italijanska tipkovnica (priložene SLO nalepke)
Stanje modela   HP Renew

I think this is the data that was missing, isn't it?
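The difference can be reproduced in isolation with a small sketch. Both XML strings below are hypothetical, heavily shortened stand-ins for your `<opis_detail>` element; the only thing that differs between them is the CDATA wrapper.

```php
<?php
// Hypothetical, shortened <opis_detail> contents for illustration.
// Version 1: the HTML is parsed as real child elements of <opis_detail>.
$plainXml = '<artikel><opis_detail>HP ENVY 17-2199el<br/>'
          . '<table><tr><th>WiFi</th><td>da</td></tr></table>'
          . '</opis_detail></artikel>';

// Version 2: the same content wrapped in CDATA, so it stays plain text.
$cdataXml = '<artikel><opis_detail><![CDATA[HP ENVY 17-2199el<br/>'
          . '<table><tr><th>WiFi</th><td>da</td></tr></table>'
          . ']]></opis_detail></artikel>';

// Casting to string returns only the element's own text, not the text
// of nested child elements such as <table>.
$plain = (string) simplexml_load_string($plainXml)->opis_detail;

// With CDATA there are no child elements, so the markup is part of the text.
$cdata = (string) simplexml_load_string($cdataXml)->opis_detail;

echo $plain, "\n";
echo $cdata, "\n";
```

The first echo prints just the leading text, while the second also contains the `<br/>` and `<table>` markup. If you want that value in a CSV cell without the tags, passing it through strip_tags() would be one option.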

Answered 2012-09-25T15:10:19.970